GPU Ray Tracing Performance Comparisons [2021-2022]

It's worth remembering that AMD, with a compute-SIMD "slow" approach, has equalled NVidia's dedicated-MIMD approach in Turing. With Ampere, NVidia gained 40% on Turing in games, but it seems likely there are no major gains to be had from "better MIMD" in Lovelace.

Are you drawing that conclusion based on mixed workloads where RT is just one consideration? For reference, the 3080 and 2080 Ti have the same number of RT “cores”, yet the former is 85% faster in OptiX.
 
This?

Talk about finding a border case of unplayable architecture limits.
Ah, you must be new around here.

That's what DavidGraham suggested:

Nope, once you push RT workload upwards

That's what we do here at B3D: discuss what happens at the limits. If you don't like that, there's plenty of other forums.

Faulty testing for sure.
You have facts to base that conclusion on?

Is Metro Exodus: Enhanced Edition at 8K with Ultra ray tracing and no DLSS at 10.8 fps:

[TweakTown.com benchmark image]

faulty testing?
 
You have facts to base that conclusion on?

Is Metro Exodus: Enhanced Edition at 8K with Ultra ray tracing and no DLSS at 10.8 fps:

[TweakTown.com benchmark image]

faulty testing?
Yeah, unlikely.
I think 8K broke it. Bottleneck differences may be breaking the camel's back here, so to speak, for the Nvidia card. Might be useful to use Nvidia Nsight here to see what's happening at 8K. Worthwhile to explore if @Clukos or another member has some time to try it. I'd be curious to see what's happening here.

Would also be curious to see an equivalent AMD one for the 6900XT. But I don't know the name of that tool and whether it's free to download and use.
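For a bit of back-of-the-envelope context on why 8K is such a brutal RT workload (purely illustrative arithmetic, assuming one primary ray per pixel and ignoring secondary rays entirely):

# Rough, illustrative arithmetic only: one primary ray per pixel,
# ignoring reflection/shadow rays, denoising and shading costs.
resolutions = {"4K": (3840, 2160), "8K": (7680, 4320)}

for name, (w, h) in resolutions.items():
    pixels = w * h
    print(f"{name}: {pixels / 1e6:.1f} M pixels -> ~{pixels / 1e6:.1f} M primary rays per frame")

# 8K has 4x the pixels of 4K, so every per-pixel cost (ray count, G-buffer,
# denoiser working set, framebuffer bandwidth) scales by roughly 4x too,
# which is exactly the kind of regime where a different bottleneck can appear.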
 
Yeah, or they used ReBAR as their default setup; at high resolutions it had a negative effect on 3090s in Cyberpunk.

Weird, though from everything on here I get the impression Nvidia has somehow dropped the ball on their driver support, or some combination of such. ReBAR works great on AMD, and there's no "this driver works best for this game but it's not the newest driver" there. So... while it might be "detrimental" at some point, one has to start laying some blame on Nvidia for the underlying problems here.

If they'd clean up whatever issues they're having, no one would have to bring up driver versions or what exact setup settings you have for what exact games. It would certainly make life easier for everyone as well.
 
Weird, though from everything on here I get the impression Nvidia has somehow dropped the ball on their driver support, or some combination of such. ReBAR works great on AMD, and there's no "this driver works best for this game but it's not the newest driver" there. So... while it might be "detrimental" at some point, one has to start laying some blame on Nvidia for the underlying problems here.

If they'd clean up whatever issues they're having, no one would have to bring up driver versions or what exact setup settings you have for what exact games. It would certainly make life easier for everyone as well.
ReBAR sometimes also results in performance losses on AMD cards. For any "pure" testing they should just disable it.
 
Ah, you must be new around here.

That's what DavidGraham suggested:



That's what we do here at B3D: discuss what happens at the limits. If you don't like that, there's plenty of other forums.


You have facts to base that conclusion on?

Is Metro Exodus: Enhanced Edition at 8K with Ultra ray tracing and no DLSS at 10.8 fps:

[TweakTown.com benchmark image]

faulty testing?

All you show is that AMD is better at 8K in unplayable settings, which will tell you nothing outside this specific border case.
When native 8K RT becomes a real option, it will not be with current SKU designs or even 1st and 2nd gen RT cores.

But of course, if you want to frame AMD's lesser RT solution as "better"... this is all you've really got, but then you have to ignore all the real-world gaming that doesn't suit that agenda.


 
You have facts to base that conclusion on?

The small gap between the 3090 and 6900XT with RT on in TweakTown's testing.

In contrast to that, Computerbase shows that at 4K with Ultra RT, the 3080 is 84% faster than the 6800XT; the 3090 is likely even faster.
https://www.computerbase.de/2021-03...berpunk-2077-raytracing-und-dlss-in-3840-2160

PCGH shows the 3090 to be 92% faster than the 6900XT at 4K with Ultra RT.
https://www.pcgameshardware.de/Cybe...als/Update-120-Benchmarks-Raytracing-1369667/

Frankly, TweakTown hasn't been a reliable source of testing for quite a long time.
 
I can't post a link yet, but you can see the pure RT performance in GPSnoopy's RayTracingInVulkan demo (you can Google it).
The reality is that the 6900XT is slower than the 2080Ti. The 6900XT is only faster if the scene has very shallow BVH depth and no triangle geometry.
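To make the "shallow BVH" point a bit more concrete, here is a toy estimate (the per-ray node-visit model and the overlap factor are assumptions for illustration, not measured numbers): deeper BVHs mean more traversal steps per ray, and on RDNA2 those steps are issued from shader code rather than a dedicated traversal unit.

import math

# Toy model, illustration only: assume each ray visits roughly
# K_OVERLAP * log2(primitive_count) BVH nodes before finding its closest hit.
# Real traversal counts depend heavily on the scene and the BVH builder.
K_OVERLAP = 1.5

def node_visits_per_ray(primitive_count, k=K_OVERLAP):
    return k * math.log2(primitive_count)

rays_per_frame = 3840 * 2160  # one primary ray per pixel at 4K

for prims in (1_000, 100_000, 10_000_000):
    visits = node_visits_per_ray(prims)
    total = visits * rays_per_frame
    print(f"{prims:>10,} primitives: ~{visits:4.1f} node visits/ray, "
          f"~{total / 1e9:.2f} G node visits/frame")

# A procedural-sphere demo with a handful of primitives needs a fraction of the
# traversal work of a triangle-heavy game scene, which is where shader-based
# traversal starts to hurt.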
Not just that: here is PCGH's testing of the ART Mark RT demo, where the 2080Ti remains faster than the 6900XT and the 3090 is more than twice as fast.

[PCGH ART Mark benchmark chart]

https://www.pcgameshardware.de/Rayt...cials/ART-Mark-Raytracing-Benchmarks-1371125/

And here is the Boundary game demo, with the same story.

[Tom's Hardware Boundary benchmark chart]
https://www.tomshardware.com/reviews/amd-radeon-rx-6900-xt-review/3

Control, same story.

[Tom's Hardware Control benchmark chart]

https://www.tomshardware.com/reviews/amd-radeon-rx-6900-xt-review/3

COD Cold War too.

[Tom's Hardware COD Cold War benchmark chart]
 
Not just that: here is PCGH's testing of the ART Mark RT demo, where the 2080Ti remains faster than the 6900XT and the 3090 is more than twice as fast.

[PCGH ART Mark benchmark chart]

https://www.pcgameshardware.de/Rayt...cials/ART-Mark-Raytracing-Benchmarks-1371125/

And here is the Boundary game demo, with the same story.

[Tom's Hardware Boundary benchmark chart]
https://www.tomshardware.com/reviews/amd-radeon-rx-6900-xt-review/3

COD Cold War, same story.

[Tom's Hardware COD Cold War benchmark chart]

https://www.tomshardware.com/reviews/amd-radeon-rx-6900-xt-review/3

ART Mark is interesting there, clearly leveraging some advantage of Ampere's RT implementation over Turing that we don't typically see in games. I'd love to understand more about what the difference is there and why we're not seeing it in games.
 
ART Mark is interesting there, clearly leveraging some advantage of Ampere's RT implementation over Turing that we don't typically see in games. I'd love to understand more about what the difference is there and why we're not seeing it in games.

I guess in games other limiting factors are more frequently involved? Like, maybe you have 1.6-2x performance on the pure RT part / RT core work, but the shading part after that is not 2x, so we never see the RT gain?
 
I guess in games other limiting factors are more frequently involved? Like, maybe you have 1.6-2x performance on the pure RT part / RT core work, but the shading part after that is not 2x, so we never see the RT gain?

But we usually see the performance ratio between similar GPUs (say 2080Ti and 3070) remain pretty much the same with RT off and on. I'd expect the 3070 to get relatively faster with RT on if that part of the rendering process is faster on that card.

I'm wondering if there's some feature of Ampere RT that just isn't being used by games right now that the benchmark is using. If true, then I'd like to understand the likelihood of seeing that in future games.
 
I'm wondering if there's some feature of Ampere RT that just isn't being used by games right now that the benchmark is using.
Pretty sure it's not. Games just have too much other work going on on the GPU, including rasterization and compute, but also non-accelerated RT tasks like BVH build/refit due to animation and streaming, denoising, shading, ray generation, and optimizations like ray binning.
So what we see in game benchmarks depends more on the overall improvement of Ampere over Turing, and the RT cores - even if they are 4x faster - are just one factor of many. (IIRC, up to 4x was communicated, but I could be wrong.)

Personally I wonder much more about offline results. HW acceleration often shows only a net win of 2x. That's suspiciously small. I think it's a result of 'missing optimizations on all ends', e.g. using really complex materials, rebuilding the whole BVH each frame, huge uploads each frame, etc.
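To put a number on the "one factor of many" point, a quick Amdahl-style sketch (the frame-time split and the 4x figure are assumptions for illustration, not measurements):

# Illustrative Amdahl-style estimate: if only the HW-accelerated traversal part
# of the frame gets faster, the overall frame-time gain is capped by everything else.
def overall_speedup(rt_fraction, rt_speedup):
    # rt_fraction: share of frame time spent in traversal/intersection
    # rt_speedup: how much faster that portion runs on the newer architecture
    return 1.0 / ((1.0 - rt_fraction) + rt_fraction / rt_speedup)

for rt_fraction in (0.2, 0.4, 0.6):
    print(f"RT share {rt_fraction:.0%}: 4x faster RT cores -> "
          f"{overall_speedup(rt_fraction, 4.0):.2f}x overall")

# e.g. with 40% of the frame in traversal, a 4x traversal speedup only buys
# ~1.43x overall, which is roughly why game benchmarks barely move.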
 
ART Mark settings from PCGH:
[PCGH settings screenshot]
To highlight RT core improvement, we would want to turn TAA off and increase those settings to the max, even if the resulting FPS ends up 'unplayable'.
 
ART Mark is interesting there, clearly leveraging some advantage of Ampere's RT implementation over Turing that we don't typically see in games.
This benchmark features perfect mirror reflections and lighting, this lighting is also visible in reflections, so pretty sure this benchmark mostly tests shading performance rather than tracing.
Perfect mirror rays are so cheap that Crytek were able to trace them efficiently in SW on last-gen consoles; obviously, HW RT is still the way to go on PC for the best quality and performance.
Ampere has 2x FP32 SIMDs and 2x L1/texture bandwidth, so no wonder it works way better with shading heavy workloads. Even if there is some divergence due to materials (though all the shiny balls in this demo seem to be using the same materials), more SIMDs would mean better performance on scenes with lots of divergence.
I remember there were in-game Quake II RTX breakdowns of lighting, BVH and other passes somewhere here, and all compute-limited passes were close to 2x faster on the 3090 vs the 6900 XT.
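On the divergence point, a toy Monte Carlo sketch of what material divergence does to per-wave SIMD utilization (assumes a 32-wide wave, uniformly random material assignment and equal-cost material shaders, so the numbers are only illustrative):

import random

# Toy Monte Carlo, illustration only: a 32-wide SIMD wave shades ray hits whose
# materials are assigned uniformly at random. With divergence, the wave runs
# each distinct material's shader serially, so lane utilization drops as the
# material mix grows.
WAVE = 32

def avg_utilization(num_materials, trials=20_000):
    total = 0.0
    for _ in range(trials):
        mats = [random.randrange(num_materials) for _ in range(WAVE)]
        distinct = len(set(mats))
        total += 1.0 / distinct  # 32 useful lanes over distinct * 32 lane-slots
    return total / trials

for n in (1, 2, 4, 8, 16):
    print(f"{n:2d} materials: ~{avg_utilization(n):.0%} average SIMD utilization")

# The per-wave penalty is vendor-agnostic; the argument above is simply that a
# GPU with more SIMD/FP32 throughput can absorb the serialization better.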
 
This benchmark features perfect mirror reflections and lighting, this lighting is also visible in reflections, so pretty sure this benchmark mostly tests shading performance rather than tracing.
Seems it even uses shadow maps for shadows (impression from looking at the settings).
Perfect mirror rays are so cheap
I assume the 50 bounces mean reflections of reflections, so that's no longer cheap. Divergence will also increase with each bounce, so it's not a bad test for HW RT.
But idk if paths terminate after hitting some diffuse surface, and how long the average path really is.
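On the "how long is the average path" question, a small sketch (assuming, purely for illustration, that each hit continues as a mirror reflection with some probability p_mirror and terminates otherwise, with the benchmark's 50-bounce cap):

# Toy estimate: each hit continues as a mirror reflection with probability
# p_mirror, otherwise the path terminates (diffuse hit or miss), capped at
# 50 bounces. Gives the expected number of rays traced per pixel.
MAX_BOUNCES = 50

def expected_rays_per_pixel(p_mirror):
    rays = 1.0      # primary ray
    survive = 1.0
    for _ in range(MAX_BOUNCES):
        survive *= p_mirror  # probability the path is still bouncing
        rays += survive
    return rays

for p in (0.3, 0.6, 0.9):
    print(f"p_mirror = {p:.1f}: ~{expected_rays_per_pixel(p):.1f} rays per pixel on average")

# Unless most of the screen is covered by mirrors, the average path is far
# shorter than the 50-bounce cap.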
 
I assume the 50 bounces mean reflections of reflections, so that's no longer cheap. Divergence will also increase with each bounce, so it's not a bad test for HW RT.
It seems this benchmark is built on UE4, so one can easily check with Unreal Unlocker where this benchmark spends most of its time via the "stat GPU" command. Nsight profiling would be even more revealing.
 
This benchmark features perfect mirror reflections and lighting, this lighting is also visible in reflections, so pretty sure this benchmark mostly tests shading performance rather than tracing.
Perfect mirror rays are so cheap that Crytek were able to trace them efficiently in SW on last-gen consoles; obviously, HW RT is still the way to go on PC for the best quality and performance.
Ampere has 2x FP32 SIMDs and 2x L1/texture bandwidth, so no wonder it works way better with shading heavy workloads. Even if there is some divergence due to materials (though all the shiny balls in this demo seem to be using the same materials), more SIMDs would mean better performance on scenes with lots of divergence.
I remember there were in-game Quake II RTX breakdowns of lighting, BVH and other passes somewhere here, and all compute-limited passes were close to 2x faster on the 3090 vs the 6900 XT.
[Quake II RTX per-pass GPU timing chart]


https://forum.beyond3d.com/posts/2185240
 
Wow, seems AMD's BVH build needs some work. Also, the overall loss on denoising is unexpected to me.

Related to RT core benchmarks, Q2 RTX has (or had?) a mode with all mirror surfaces and 10 bounces. It would be ideal because denoising is off.
 