Jawed:
> The mystery is why did Nvidia bother with GDDR6X in the first place.
Clocks were meant to be 20-30% higher?
3090-based Quadro?
There's a new 3dmark ray tracing feature test, just in time.
> The 3DMark DirectX Raytracing feature test is available now. You'll need Windows 10, 64-bit with the May 2020 Update (version 2004) and a graphics card with drivers that support DirectX Raytracing Tier 1.1 to run the test.
According to Dave Oldcorn, this is the preferred mode for AMD.
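For anyone who wants to check what their own setup reports, the tier is queryable through the standard D3D12 feature-support API. A minimal sketch (my own, not anything from the benchmark):

```cpp
#include <d3d12.h>
#include <wrl/client.h>
#include <cstdio>

using Microsoft::WRL::ComPtr;

int main()
{
    // Create a D3D12 device on the default adapter.
    ComPtr<ID3D12Device> device;
    if (FAILED(D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_12_0,
                                 IID_PPV_ARGS(&device))))
        return 1;

    // OPTIONS5 carries the supported DXR tier.
    D3D12_FEATURE_DATA_D3D12_OPTIONS5 options5 = {};
    if (SUCCEEDED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS5,
                                              &options5, sizeof(options5))))
    {
        if (options5.RaytracingTier >= D3D12_RAYTRACING_TIER_1_1)
            std::puts("DXR Tier 1.1 supported (adds inline ray tracing / RayQuery)");
        else if (options5.RaytracingTier == D3D12_RAYTRACING_TIER_1_0)
            std::puts("DXR Tier 1.0 only");
        else
            std::puts("No DXR support");
    }
    return 0;
}
```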
> There's a new 3dmark ray tracing feature test, just in time.
What does it test: reflections, GI, shadows, path tracing?
How many rays per pixel?
Does it use denoisers?
How large is the scene?
Hopefully it's not completely synthetic and unrelated to real-world performance in games, like the other feature tests.
https://benchmarks.ul.com/news/new-3dmark-test-measures-pure-raytracing-performance
It's a "synthetic" test, I guess. It's designed to isolate ray tracing performance. Seems to be path traced.
> It's a "synthetic" test, I guess. It's designed to isolate ray tracing performance. Seems to be path traced.
Nice! Even if it's not perfect it's a useful data point.
https://s3.amazonaws.com/download-aws.futuremark.com/3dmark-technical-guide.pdf
Just finished reading this feature test section.
So it's noisy DOF with 12 rays (the default setting) that are relatively coherent (thanks to CPU-side sorting, which is impossible for other effects), tons of instancing in the video above, and likely a relatively shallow BVH due to the heavy instancing.
To be honest, this doesn't look like anything representative of real games, such as Minecraft RTX, where there are 0.5-1 rays on average for a given effect, the BVH occupies up to several gigabytes of memory, and the rays are incoherent.
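To illustrate what coherent DOF rays look like: with a thin-lens camera model, all samples for a pixel start from jittered points on the lens aperture but converge on the same focal-plane point, so the bundle stays tight and sortable. A toy CPU-side sketch of that idea (my own illustration, not UL's actual implementation; generateDofRays and its parameters are invented for the example):

```cpp
#include <cmath>
#include <cstdlib>

struct Vec3 { float x, y, z; };
struct Ray  { Vec3 origin, dir; };

static Vec3 add(Vec3 a, Vec3 b)  { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 sub(Vec3 a, Vec3 b)  { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 mul(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
static Vec3 normalize(Vec3 v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return mul(v, 1.0f / len);
}

// Uniform random point on the unit disk (rejection sampling).
static void sampleDisk(float& dx, float& dy) {
    do {
        dx = 2.0f * std::rand() / RAND_MAX - 1.0f;
        dy = 2.0f * std::rand() / RAND_MAX - 1.0f;
    } while (dx * dx + dy * dy > 1.0f);
}

// Thin-lens DOF: every sample for this pixel aims at the same point on
// the focal plane, so the n rays form a tight, coherent bundle.
void generateDofRays(Vec3 camPos, Vec3 camRight, Vec3 camUp, Vec3 pixelDir,
                     float aperture, float focusDist, int n, Ray* out) {
    Vec3 focusPoint = add(camPos, mul(pixelDir, focusDist));  // shared target
    for (int i = 0; i < n; ++i) {
        float dx, dy;
        sampleDisk(dx, dy);
        // Jitter the ray origin across the aperture disk.
        Vec3 origin = add(camPos, add(mul(camRight, dx * aperture),
                                      mul(camUp, dy * aperture)));
        out[i] = { origin, normalize(sub(focusPoint, origin)) };
    }
}
```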
> True, but synthetic tests can have their uses as long as you don't make the assumption that they'll represent game performance.
Sure, but RT is a complex thing, it's a whole pipeline with millions of nuances.
> Nice! Even if it's not perfect it's a useful data point.
Sure, but the bench looks like something that should be limited by ray-triangle intersection performance alone.
The scene is completely static, so there's no need to rebuild the BVH for dynamic geometry, which is an essential part of RT.
Also, scene complexity looks relatively low, you could pack the scene into a few AABBs, while in reality RDNA2 has 4 ray/box intersection blocks per CU for a reason: ray/AABB tests should dominate execution time, since scene complexity in real games is high (see the slab-test sketch after this post).
The rays here are coherent too, while secondary rays are almost always quite divergent.
This test looks completely useless unless someone adds the same DOF implementation to real games (which still looks like a waste of performance, since rasterisation would be much faster for this effect).
I wish they had simply added a number of knobs, such as BVH depth, ray type (primary, secondary), ray divergence, and the number of skinned and static models in the scene; that would have made for a much better synthetic test.
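For reference, the per-node box test that BVH traversal performs is the classic slab test below (a generic textbook version, not AMD's hardware implementation). Each ray runs it at every visited node before any triangle is touched, which is why complex scenes with deep trees are dominated by box tests:

```cpp
#include <algorithm>

struct Vec3 { float x, y, z; };

// Slab test: intersect a ray against an axis-aligned bounding box.
// invDir holds the precomputed reciprocals of the ray direction.
bool rayAabb(Vec3 orig, Vec3 invDir, Vec3 bmin, Vec3 bmax, float tMax) {
    float tx1 = (bmin.x - orig.x) * invDir.x, tx2 = (bmax.x - orig.x) * invDir.x;
    float ty1 = (bmin.y - orig.y) * invDir.y, ty2 = (bmax.y - orig.y) * invDir.y;
    float tz1 = (bmin.z - orig.z) * invDir.z, tz2 = (bmax.z - orig.z) * invDir.z;

    // Entry point is the latest of the per-axis entries; exit point is
    // the earliest of the per-axis exits.
    float tNear = std::max({std::min(tx1, tx2), std::min(ty1, ty2), std::min(tz1, tz2)});
    float tFar  = std::min({std::max(tx1, tx2), std::max(ty1, ty2), std::max(tz1, tz2)});

    return tNear <= tFar && tFar >= 0.0f && tNear <= tMax;
}
```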
> It's not exclusive to RDNA, it goes back all the way to the first GCN.
Is there any indication of which window sizes (PCIe BAR size) are supported by GCN, and whether these are reconfigurable after booting?
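Not a direct answer, but on Linux you can at least inspect a card's current BAR windows by parsing sysfs. A quick sketch (the bus address is a placeholder; substitute your GPU's):

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Print the size of each PCI BAR for one device. Each line of the sysfs
// "resource" file is "0x<start> 0x<end> 0x<flags>"; the first six lines
// correspond to BAR0-BAR5, and unused BARs read as all zeros.
int main() {
    std::ifstream res("/sys/bus/pci/devices/0000:03:00.0/resource");
    std::string line;
    for (int bar = 0; bar < 6 && std::getline(res, line); ++bar) {
        unsigned long long start, end, flags;
        if (std::sscanf(line.c_str(), "%llx %llx %llx", &start, &end, &flags) != 3)
            continue;
        if (end > start)
            std::printf("BAR %d: %llu MiB\n", bar, (end - start + 1) >> 20);
    }
    return 0;
}
```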
> all modern x86 CPUs have automatic cache coherence via snooping built into their PCIe controllers. You can see in the Vulkan GpuInfo database that any system memory heaps (those without the DEVICE_LOCAL bit) all have the HOST_COHERENT bit set, meaning any GPU writes to system memory are automatically coherent with the CPU.
> Going the other way, CPU access to GPU memory on AMD GPUs is always considered coherent, but not automatically in hardware. Instead it's because the kernel mode driver explicitly flushes/invalidates the GPU's "host data path" caches every time a command buffer is submitted from user space.
This type of driver-assisted coherence is just a fallback which incurs significant overhead. Truly heterogeneous Unified Memory Architecture (UMA) is only possible with AMD APUs or supercomputer systems like the NVIDIA DGX-2 and the upcoming HPE/Cray El Capitan, since they use proprietary protocols (Infinity Fabric / NVLink) that support hardware cache coherence with atomic memory access.
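The HOST_COHERENT observation is easy to verify on your own machine with the standard Vulkan API. A minimal sketch that prints the property flags of each memory type (note the flags live on memory types, which reference heaps):

```cpp
#include <vulkan/vulkan.h>
#include <cstdio>

int main() {
    // Minimal instance; no validation layers or extensions needed.
    VkApplicationInfo app = { VK_STRUCTURE_TYPE_APPLICATION_INFO };
    app.apiVersion = VK_API_VERSION_1_0;
    VkInstanceCreateInfo ici = { VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO };
    ici.pApplicationInfo = &app;

    VkInstance instance;
    if (vkCreateInstance(&ici, nullptr, &instance) != VK_SUCCESS) return 1;

    // Grab the first physical device.
    uint32_t count = 1;
    VkPhysicalDevice gpu = VK_NULL_HANDLE;
    vkEnumeratePhysicalDevices(instance, &count, &gpu);
    if (count == 0) return 1;

    VkPhysicalDeviceMemoryProperties mem;
    vkGetPhysicalDeviceMemoryProperties(gpu, &mem);

    for (uint32_t i = 0; i < mem.memoryTypeCount; ++i) {
        VkMemoryPropertyFlags f = mem.memoryTypes[i].propertyFlags;
        std::printf("type %u (heap %u):%s%s\n", i, mem.memoryTypes[i].heapIndex,
                    (f & VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT)  ? " DEVICE_LOCAL"  : "",
                    (f & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT) ? " HOST_COHERENT" : "");
    }

    vkDestroyInstance(instance, nullptr);
    return 0;
}
```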
But isn't that an artistic representation of the die, rather than a real photo of the die itself?
> But isn't that an artistic representation of the die, rather than a real photo of the die itself?
The general layout is correct, but the fine logic details are glossed over.