Jawed:
> The mystery is why did Nvidia bother with GDDR6X in the first place.
Clocks were meant to be 20-30% higher?
3090-based Quadro?
There's a new 3dmark ray tracing feature test, just in time.
> The 3DMark DirectX Raytracing feature test is available now. You'll need Windows 10, 64-bit with the May 2020 Update (version 2004) and a graphics card with drivers that support DirectX Raytracing Tier 1.1 to run the test.
According to Dave Oldcorn, this is the preferred mode for AMD.
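For anyone who wants to check what their own setup reports, the tier is queryable through the standard D3D12 feature-support API. A minimal sketch (my own, not anything from the benchmark):

```cpp
#include <d3d12.h>
#include <wrl/client.h>
#include <cstdio>

using Microsoft::WRL::ComPtr;

int main()
{
    // Create a D3D12 device on the default adapter.
    ComPtr<ID3D12Device> device;
    if (FAILED(D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_12_0,
                                 IID_PPV_ARGS(&device))))
        return 1;

    // OPTIONS5 carries the supported DXR tier.
    D3D12_FEATURE_DATA_D3D12_OPTIONS5 options5 = {};
    if (SUCCEEDED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS5,
                                              &options5, sizeof(options5))))
    {
        if (options5.RaytracingTier >= D3D12_RAYTRACING_TIER_1_1)
            std::puts("DXR Tier 1.1 supported (adds inline ray tracing / RayQuery)");
        else if (options5.RaytracingTier == D3D12_RAYTRACING_TIER_1_0)
            std::puts("DXR Tier 1.0 only");
        else
            std::puts("No DXR support");
    }
    return 0;
}
```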
> There's a new 3dmark ray tracing feature test, just in time.
What does it test: reflections, GI, shadows, path tracing?
How many rays per pixel?
Does it use denoisers?
How large is the scene?
Hopefully it's not completely synthetic and unrelated to real-world performance in games, like the other feature tests.
https://benchmarks.ul.com/news/new-3dmark-test-measures-pure-raytracing-performance
It's a "synthetic" test, I guess. It's designed to isolate ray tracing performance. Seems to be path traced.
> It's a "synthetic" test, I guess. It's designed to isolate ray tracing performance. Seems to be path traced.
Nice! Even if it's not perfect it's a useful data point.
https://s3.amazonaws.com/download-aws.futuremark.com/3dmark-technical-guide.pdf
Just finished reading this feature test section.
So it's noisy DOF with 12 rays (the default setting) that are relatively coherent (thanks to CPU-side sorting, which is impossible for other effects), tons of instancing in the video above, and likely a relatively shallow BVH due to the heavy instancing.
To be honest, this doesn't look like anything representative of real games, such as Minecraft RTX, where there are 0.5-1 rays on average for a given effect, the BVH occupies up to several gigabytes of memory, and the rays are incoherent.
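To illustrate what coherent DOF rays look like: with a thin-lens camera model, all samples for a pixel start from jittered points on the lens aperture but converge on the same focal-plane point, so the bundle stays tight and sortable. A toy CPU-side sketch of that idea (my own illustration, not UL's actual implementation; generateDofRays and its parameters are invented for the example):

```cpp
#include <cmath>
#include <cstdlib>

struct Vec3 { float x, y, z; };
struct Ray  { Vec3 origin, dir; };

static Vec3 add(Vec3 a, Vec3 b)  { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 sub(Vec3 a, Vec3 b)  { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 mul(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
static Vec3 normalize(Vec3 v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return mul(v, 1.0f / len);
}

// Uniform random point on the unit disk (rejection sampling).
static void sampleDisk(float& dx, float& dy) {
    do {
        dx = 2.0f * std::rand() / RAND_MAX - 1.0f;
        dy = 2.0f * std::rand() / RAND_MAX - 1.0f;
    } while (dx * dx + dy * dy > 1.0f);
}

// Thin-lens DOF: every sample for this pixel aims at the same point on
// the focal plane, so the n rays form a tight, coherent bundle.
void generateDofRays(Vec3 camPos, Vec3 camRight, Vec3 camUp, Vec3 pixelDir,
                     float aperture, float focusDist, int n, Ray* out) {
    Vec3 focusPoint = add(camPos, mul(pixelDir, focusDist));  // shared target
    for (int i = 0; i < n; ++i) {
        float dx, dy;
        sampleDisk(dx, dy);
        // Jitter the ray origin across the aperture disk.
        Vec3 origin = add(camPos, add(mul(camRight, dx * aperture),
                                      mul(camUp, dy * aperture)));
        out[i] = { origin, normalize(sub(focusPoint, origin)) };
    }
}
```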
> True, but synthetic tests can have their uses as long as you don't make the assumption that they'll represent game performance.
Sure, but RT is a complex thing, it's a whole pipeline with millions of nuances.
> Nice! Even if it's not perfect it's a useful data point.
Sure, but the bench looks like something that should be limited by ray-triangle intersection performance alone.
The scene is completely static, so there's no need to rebuild the BVH for dynamic geometry, which is an essential part of RT.
Also, scene complexity looks relatively low, you could pack the scene into a few AABBs, while in reality RDNA2 has 4 ray/box intersection blocks per CU for a reason: ray/AABB tests should dominate execution time, since scene complexity in real games is high (see the slab-test sketch after this post).
The rays here are coherent too, while secondary rays are almost always quite divergent.
This test looks completely useless unless someone adds the same DOF implementation to real games (which still looks like a waste of performance, since rasterisation would be much faster for this effect).
I wish they had simply added a number of knobs, such as BVH depth, ray type (primary, secondary), ray divergence, and the number of skinned and static models in the scene; that would have made for a much better synthetic test.
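For reference, the per-node box test that BVH traversal performs is the classic slab test below (a generic textbook version, not AMD's hardware implementation). Each ray runs it at every visited node before any triangle is touched, which is why complex scenes with deep trees are dominated by box tests:

```cpp
#include <algorithm>

struct Vec3 { float x, y, z; };

// Slab test: intersect a ray against an axis-aligned bounding box.
// invDir holds the precomputed reciprocals of the ray direction.
bool rayAabb(Vec3 orig, Vec3 invDir, Vec3 bmin, Vec3 bmax, float tMax) {
    float tx1 = (bmin.x - orig.x) * invDir.x, tx2 = (bmax.x - orig.x) * invDir.x;
    float ty1 = (bmin.y - orig.y) * invDir.y, ty2 = (bmax.y - orig.y) * invDir.y;
    float tz1 = (bmin.z - orig.z) * invDir.z, tz2 = (bmax.z - orig.z) * invDir.z;

    // Entry point is the latest of the per-axis entries; exit point is
    // the earliest of the per-axis exits.
    float tNear = std::max({std::min(tx1, tx2), std::min(ty1, ty2), std::min(tz1, tz2)});
    float tFar  = std::min({std::max(tx1, tx2), std::max(ty1, ty2), std::max(tz1, tz2)});

    return tNear <= tFar && tFar >= 0.0f && tNear <= tMax;
}
```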
> It's not exclusive to RDNA, it goes back all the way to the first GCN.
Is there any indication of which window sizes (PCIe BAR size) are supported by GCN, and whether these are reconfigurable after booting?
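Not a direct answer, but on Linux you can at least inspect a card's current BAR windows by parsing sysfs. A quick sketch (the bus address is a placeholder; substitute your GPU's):

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Print the size of each PCI BAR for one device. Each line of the sysfs
// "resource" file is "0x<start> 0x<end> 0x<flags>"; the first six lines
// correspond to BAR0-BAR5, and unused BARs read as all zeros.
int main() {
    std::ifstream res("/sys/bus/pci/devices/0000:03:00.0/resource");
    std::string line;
    for (int bar = 0; bar < 6 && std::getline(res, line); ++bar) {
        unsigned long long start, end, flags;
        if (std::sscanf(line.c_str(), "%llx %llx %llx", &start, &end, &flags) != 3)
            continue;
        if (end > start)
            std::printf("BAR %d: %llu MiB\n", bar, (end - start + 1) >> 20);
    }
    return 0;
}
```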
> all modern x86 CPUs have automatic cache coherence via snooping built into their PCIe controllers. You can see in the Vulkan GpuInfo database that any system memory heaps (those without the DEVICE_LOCAL bit) all have the HOST_COHERENT bit set, meaning any GPU writes to system memory are automatically coherent with the CPU.
> Going the other way, CPU access to GPU memory on AMD GPUs is always considered coherent, but not automatically in hardware. Instead it's because the kernel mode driver explicitly flushes/invalidates the GPU's "host data path" caches every time a command buffer is submitted from user space.
This type of driver-assisted coherence is just a fallback which incurs significant overhead. Truly heterogeneous Unified Memory Architecture (UMA) is only possible with AMD APUs or supercomputer systems like the NVIDIA DGX-2 and the upcoming HPE/Cray El Capitan, since they use proprietary protocols (Infinity Fabric / NVLink) that support hardware cache coherence with atomic memory access.
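The HOST_COHERENT observation is easy to verify on your own machine with the standard Vulkan API. A minimal sketch that prints the property flags of each memory type (note the flags live on memory types, which reference heaps):

```cpp
#include <vulkan/vulkan.h>
#include <cstdio>

int main() {
    // Minimal instance; no validation layers or extensions needed.
    VkApplicationInfo app = { VK_STRUCTURE_TYPE_APPLICATION_INFO };
    app.apiVersion = VK_API_VERSION_1_0;
    VkInstanceCreateInfo ici = { VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO };
    ici.pApplicationInfo = &app;

    VkInstance instance;
    if (vkCreateInstance(&ici, nullptr, &instance) != VK_SUCCESS) return 1;

    // Grab the first physical device.
    uint32_t count = 1;
    VkPhysicalDevice gpu = VK_NULL_HANDLE;
    vkEnumeratePhysicalDevices(instance, &count, &gpu);
    if (count == 0) return 1;

    VkPhysicalDeviceMemoryProperties mem;
    vkGetPhysicalDeviceMemoryProperties(gpu, &mem);

    for (uint32_t i = 0; i < mem.memoryTypeCount; ++i) {
        VkMemoryPropertyFlags f = mem.memoryTypes[i].propertyFlags;
        std::printf("type %u (heap %u):%s%s\n", i, mem.memoryTypes[i].heapIndex,
                    (f & VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT)  ? " DEVICE_LOCAL"  : "",
                    (f & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT) ? " HOST_COHERENT" : "");
    }

    vkDestroyInstance(instance, nullptr);
    return 0;
}
```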
But isn't that an artistic representation of the die, rather than a real photo of the die itself?
> But isn't that an artistic representation of the die, rather than a real photo of the die itself?
The general layout is correct, but the fine logic details are glossed over.