GPU Ray Tracing Performance Comparisons [2021-2022]

Can we stop being total assholes to one another over the need to worship an arbitrary platform of choice?

Remain civil or bans will be handed out like talk-show giveaways. Don't force the mods to get involved.
 
Agreed with your post, nicely written. I wasn't debating software vs hardware, because obviously software is better if possible, but we don't have 200 TF GPUs just yet. It's like PS2 vs GF4 Ti4600....
Again, what I didn't agree with is that 'console RT is better than PC RT', which is the first time I've heard someone claim this, btw, notwithstanding what the actual results show.
They aren't.

They're claiming the console RT API is better than DXR at the moment because they have more control over RT on consoles, and it provides the flexibility to generate the results they want without extravagant workarounds to optimize for the hardware.
 
I don't think people are against customization at the lowest possible level. I think the debate is around the perspective of what needs to arrive first: faster generic speed at the cost of customization, or slower generic speed with customization.
Faster generic speed? All the stars aligned with Ampere versus Turing when it comes to ray tracing performance. NVidia seemed to indicate that raw ray tracing performance in Ampere is 2x+ that of Turing and I believe there were some non-gaming benchmarks that demonstrated this.

But in gaming, using just the DXR tests from this page:

GeForce 466.77 Driver Performance Analysis – Using Ampere and Turing (babeltechreviews.com)

in the 7 games tested at 1440p native resolution (no DLSS) with "maximum" settings for 3080 versus 2080Ti (two non-reference cards at stock settings):
  • 39% faster on average
  • 41% for 1% lows
  • 42% for 0.2% lows
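A side note on the arithmetic: benchmark roundups typically average per-game FPS ratios like the ones above, and the geometric mean is the statistically sound way to average ratios. A minimal sketch, using hypothetical placeholder ratios (not the article's per-game data):

```python
# Averaging per-game speedup ratios. The numbers below are hypothetical
# placeholders, not the babeltechreviews data.
from math import prod

def geomean(ratios):
    """Geometric mean: the standard way to average performance ratios."""
    return prod(ratios) ** (1.0 / len(ratios))

# hypothetical 3080-vs-2080Ti FPS ratios for 7 games
ratios = [1.35, 1.42, 1.38, 1.44, 1.36, 1.41, 1.39]

arith = sum(ratios) / len(ratios)
geo = geomean(ratios)
print(f"arithmetic mean: {arith:.3f}, geometric mean: {geo:.3f}")
```

For ratios this close together the two means barely differ, but the geometric mean avoids overweighting outlier games.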
So, where is generic gaming performance going to come from? The next 40% is looking like it's going to be very difficult because bandwidth and compute have hit the wall.

In my opinion this is analogous to geometry shading versus mesh shading. It's a slow motion car crash that in the worst case is going to take 10 years to play out.

It's worth remembering that AMD with a compute-SIMD "slow" approach has equalled NVidia's dedicated-MIMD in Turing. With Ampere, NVidia gained 40% on Turing in games, but it seems likely there are no major gains to be had from "better MIMD" in Lovelace.

I don't know if there was ever an in-depth analysis of how NVidia gained 2x+ raw ray tracing performance in Ampere. Links? Such an analysis would be the most productive baseline for a discussion of where NVidia can go from here, hence "a roadmap of generic speed".
 
It's worth remembering that AMD with a compute-SIMD "slow" approach has equalled NVidia's dedicated-MIMD in Turing.
Nope, once you push the RT workload upwards, Turing gets even faster relative to RDNA2, as demonstrated in Minecraft, Quake 2, Call of Duty Cold War, Cyberpunk, Control, etc.

I don't know if there was ever an in-depth analysis of how NVidia gained 2x+ raw ray tracing performance in Ampere. Links?
Here, almost a 2x increase from the 2080Ti to the 3090 in many pro RT apps.
https://techgage.com/article/mid-2021-gpu-rendering-performance/1/
https://techgage.com/article/mid-2021-gpu-rendering-performance/2/
 
I don't know if there was ever an in-depth analysis of how NVidia gained 2x+ raw ray tracing performance in Ampere. Links? Such an analysis would be the most productive baseline for a discussion of where NVidia can go from here, hence "a roadmap of generic speed".

Double the FP32 throughput, better L1 cache, double the triangle intersection rate, 50% more bandwidth with GDDR6X, and 50% more transistors.
 
Nope, once you push the RT workload upwards, Turing gets even faster relative to RDNA2, as demonstrated in Minecraft, Quake 2, Call of Duty Cold War, Cyberpunk, Control, etc.
Why is 6900XT nearly twice as fast as 3090 here:

[Image: TweakTown.com benchmark chart]

with no DLSS/CAS? 10.6fps versus 6.6fps.

Double the FP32 throughput, better L1 cache, double the triangle intersection rate, 50% more bandwidth with GDDR6X, and 50% more transistors.
Out of that list, only "triangle intersection rate" (did you mean ray traversal rate, or is this a specific part of ray traversal that you're referring to?) and bandwidth are on topic for Ampere's increased performance over Turing.

Is the rate a side-effect solely of more ray tracing cores?
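For reference on what a "triangle intersection" unit actually evaluates: each test is a ray-vs-triangle hit computation that RT cores run in fixed-function hardware. The classic Möller–Trumbore algorithm is a common formulation; the sketch below is an illustrative pure-Python version, not a claim about NVidia's actual hardware implementation:

```python
# Möller-Trumbore ray-triangle intersection: the kind of test that a
# "triangle intersection" unit performs once per ray/triangle pair.
def ray_triangle(orig, d, v0, v1, v2, eps=1e-9):
    """Return the hit distance t along ray orig + t*d, or None on a miss."""
    def sub(a, b): return (a[0]-b[0], a[1]-b[1], a[2]-b[2])
    def cross(a, b): return (a[1]*b[2]-a[2]*b[1],
                             a[2]*b[0]-a[0]*b[2],
                             a[0]*b[1]-a[1]*b[0])
    def dot(a, b): return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

    e1, e2 = sub(v1, v0), sub(v2, v0)
    h = cross(d, e2)
    a = dot(e1, h)
    if abs(a) < eps:                 # ray parallel to the triangle plane
        return None
    f = 1.0 / a
    s = sub(orig, v0)
    u = f * dot(s, h)
    if u < 0.0 or u > 1.0:           # outside the triangle (barycentric u)
        return None
    q = cross(s, e1)
    v = f * dot(d, q)
    if v < 0.0 or u + v > 1.0:       # outside the triangle (barycentric v)
        return None
    t = f * dot(e2, q)
    return t if t > eps else None    # only accept hits in front of the origin

# a ray fired down +z at a triangle lying in the z=0 plane
hit = ray_triangle((0.2, 0.2, -1.0), (0.0, 0.0, 1.0),
                   (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
print(hit)  # 1.0 -- the ray reaches the z=0 plane one unit from its origin
```

"Doubling the intersection rate" then means the hardware can retire twice as many of these tests per clock, independent of how many RT cores there are.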
 
It's worth remembering that AMD with a compute-SIMD "slow" approach has equalled NVidia's dedicated-MIMD in Turing. With Ampere, NVidia gained 40% on Turing in games, but it seems likely there are no major gains to be had from "better MIMD" in Lovelace.

I don't know if there was ever an in-depth analysis of how NVidia gained 2x+ raw ray tracing performance in Ampere. Links? Such an analysis would be the most productive baseline for a discussion of where NVidia can go from here, hence "a roadmap of generic speed".

RTX 2080 Ti = 68 RT cores.
RTX 3080 = 68 RT cores.

Those cores are "not equal".
Is this what you are looking for?
Ampere vs Turing real time ray tracing overhead – Coreteks
 
Why is 6900XT nearly twice as fast as 3090 here:

[Image: TweakTown.com benchmark chart]

with no DLSS/CAS? 10.6fps versus 6.6fps.


Out of that list, only "triangle intersection rate" (did you mean ray traversal rate, or is this a specific part of ray traversal that you're referring to?) and bandwidth are on topic for Ampere's increased performance over Turing.

Is the rate a side-effect solely of more ray tracing cores?

This?
[attached image]

Talk about finding a corner case of unplayable architecture limits.
 
I don't know if there was ever an in-depth analysis of how NVidia gained 2x+ raw ray tracing performance in Ampere. Links? Such an analysis would be the most productive baseline for a discussion of where NVidia can go from here, hence "a roadmap of generic speed".
Those were quoted with regard to professional apps; the closest game, at 1.8x, was Q2 RTX:
[attached image]

And regarding where the performance comes from, in addition to the 2x intersection testing rate claimed by Nvidia, there's this:
[attached image]
 
Turing RT core:
[attached image: Turing RT core diagram]

Ampere RT core:
[attached image: Ampere RT core diagram]

This makes for this:
[Images: NVIDIA GeForce RTX 30 Tech Session slides]


So they did some real work between Gen1 and Gen2.
 
So they did some real work between Gen1 and Gen2.
I almost forgot about the added motion blur (MB) support. It's an unexpected improvement over Turing.
I assume they target offline rendering applications, not games? Does DXR even support it?
And if offline is the target, I think one serious limitation there was the limited levels of instancing. So are there improvements on that front too, and are all those things exposed via OptiX, maybe?
 
Faster generic speed? All the stars aligned with Ampere versus Turing when it comes to ray tracing performance. NVidia seemed to indicate that raw ray tracing performance in Ampere is 2x+ that of Turing and I believe there were some non-gaming benchmarks that demonstrated this.
For gaming, Nvidia only ever said to expect up to 2x 2080 performance.
 
I can't post a link yet, but you can see the pure RT performance in GPSnoopy's RayTracingInVulkan demo (you can Google it).
The reality is that the 6900XT is slower than the 2080Ti. The 6900XT is only faster if the scene has a very shallow BVH depth and no triangle geometry.
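For context on why BVH depth matters: traversal is a stack-driven walk that tests a box at every level before it ever reaches a triangle, so a deeper tree means more box tests per ray. A toy 1D sketch of that loop (a didactic simplification, not the demo's or any GPU's actual code):

```python
# Toy stack-based BVH over 1D intervals: each node is (lo, hi, left, right),
# with left/right set to None for leaves. Deeper trees -> more nodes visited
# per query before a leaf is reached.
def build(intervals):
    """Build a balanced 1D BVH over sorted, disjoint intervals."""
    if len(intervals) == 1:
        lo, hi = intervals[0]
        return (lo, hi, None, None)
    mid = len(intervals) // 2
    l, r = build(intervals[:mid]), build(intervals[mid:])
    return (min(l[0], r[0]), max(l[1], r[1]), l, r)

def traverse(node, x):
    """Point query: return (hit, nodes_visited), visiting every overlapping node."""
    stack, visited, hit = [node], 0, False
    while stack:
        lo, hi, left, right = stack.pop()
        visited += 1
        if lo <= x <= hi:            # box test at every level
            if left is None:
                hit = True           # reached a leaf that contains x
            else:
                stack.append(left)
                stack.append(right)
    return hit, visited

leaves = [(i, i + 0.5) for i in range(8)]  # 8 leaves -> 3 internal levels
bvh = build(leaves)
print(traverse(bvh, 3.25))  # hit deep in the tree costs several box tests
```

With only a handful of big boxes and no triangles at the leaves, almost all of this per-level work disappears, which fits the "shallow BVH, no triangle geometry" caveat above.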
 
I almost forgot about the added motion blur (MB) support. It's an unexpected improvement over Turing.
I assume they target offline rendering applications, not games? Does DXR even support it?
And if offline is the target, I think one serious limitation there was the limited levels of instancing. So are there improvements on that front too, and are all those things exposed via OptiX, maybe?

I don't think it is for gaming yet...3rd gen will be quite telling for where they think RT is going.
 
I think it could make sense for games too, even while we're still in the hybrid era. Primary (rasterized) stuff can use screen-space fakes, but shadows and reflections don't work well with those.
And this could fix it. It may already be practical in games which do not have many RT effects.

The question is whether this requires a BVH refit to extend boxes with motion, or whether RT cores can do this on the fly, avoiding that problem.
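The "extend boxes with motion" option in the question above amounts to making each node's bounds conservative over the whole shutter interval: for linear motion, the union of a primitive's box at t0 and at t1 covers every in-between position. A minimal sketch of that refit idea (illustrative only; per the slides, Ampere's RT core instead interpolates bounds in hardware):

```python
# Conservative motion bounds via a refit: union the AABB at shutter open (t0)
# with the AABB at shutter close (t1). Valid for linear motion, since every
# interpolated box lies inside the union of the endpoint boxes.
def union_aabb(a, b):
    """Union of two AABBs given as (min_xyz, max_xyz) tuples."""
    amin, amax = a
    bmin, bmax = b
    return (tuple(min(x, y) for x, y in zip(amin, bmin)),
            tuple(max(x, y) for x, y in zip(amax, bmax)))

def motion_bounds(bounds_t0, bounds_t1):
    """Bounds covering the whole shutter interval, assuming linear motion."""
    return union_aabb(bounds_t0, bounds_t1)

box_t0 = ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
box_t1 = ((2.0, 0.0, 0.0), (3.0, 1.0, 1.0))  # moved +2 in x over the shutter
print(motion_bounds(box_t0, box_t1))  # ((0.0, 0.0, 0.0), (3.0, 1.0, 1.0))
```

The cost of the refit route is visible in the example: fast-moving primitives get fat boxes that overlap lots of rays, which is exactly what on-the-fly interpolation in the RT core would avoid.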
 