GPU Ray Tracing Performance Comparisons [2021-2022]

Why can't anyone be bothered to normalize their fps*? It's not exactly rocket science; you can pull the numbers into Excel in five minutes at most and end up with much more valuable data.

*Yes, I realize they have tables normalized to a geo mean of fps, but that's just so...

edit: I feel I need to spell it out once. With geo means, you potentially (numbers very much exaggerated) do this:
Game A
Card n0 120 fps
Card n1 240 fps
Card n2 390 fps
Game B
Card n0 60 fps
Card n1 30 fps
Card n2 10 fps (for example too little VRAM)
Your geo mean results:
Card n0 avg. 90 fps
Card n1 avg. 135 fps
Card n2 avg. 200 fps

Your (flawed) conclusion: Card n0 is the slowest, card n1 is mediocre, and card n2 is uber!!! Not only do you completely miss the fact that you can barely play Game B on card n1 and almost not at all on card n2, your conclusion points in the opposite direction!


I think you mean arithmetic mean (which is flawed)?

Using geometric mean, your example would be like:

Card n0 mean: 84.85 (sqrt(120*60))
Card n1 mean: 84.85 (sqrt(240*30))
Card n2 mean: 62.45 (sqrt(390*10))
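
Here's a quick sanity check of those numbers as a minimal Python sketch (using only the made-up fps values from the example, nothing measured): the "avg." figures in the quoted post are arithmetic means, and an actual geometric mean tells a rather different story.

Code:
# Minimal sketch with the made-up example fps values (not measurements).
from statistics import geometric_mean

fps = {
    "n0": [120, 60],   # Game A, Game B
    "n1": [240, 30],
    "n2": [390, 10],
}

for card, values in fps.items():
    arith = sum(values) / len(values)   # what the quoted post labels "geo mean"
    geo = geometric_mean(values)        # an actual geometric mean
    print(f"{card}: arithmetic {arith:.2f}, geometric {geo:.2f}")

# n0: arithmetic 90.00, geometric 84.85
# n1: arithmetic 135.00, geometric 84.85
# n2: arithmetic 200.00, geometric 62.45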
 

With normalized numbers, you arrive at
Game A
Card n0 120 fps = 30.8%
Card n1 240 fps = 61.5%
Card n2 390 fps = 100%
Game B
Card n0 60 fps = 100%
Card n1 30 fps = 50%
Card n2 10 fps = 16.7%
You could just average the normalized values:
Card n0 avg. 65.4%
Card n1 avg. 55.8%
Card n2 avg. 58.4%

Or, if you want a fixed reference point, you could normalize those percentages again, so people can readily see how good 58.4% really is without needing an external anchor.
Card n0 normalized index 100%
Card n1 normalized index 85.3%
Card n2 normalized index 89.3%

At least you're not presenting a false winner. Still, you don't see that there are games which just don't run well on cards n1 and n2 with the chosen settings. So please provide your raw fps/ms/whatever it is, so people can verify your conclusion (and point out any errors you may have made!) and weigh it differently to arrive at their own conclusion ("Game F doesn't interest me" / "I don't have a 4K display, so I'll disregard this").
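
For anyone who wants to play with it, here is a rough Python sketch of that normalize-to-fastest, average, then re-index procedure, again using only the made-up example numbers (the variable names are mine; the last digits differ slightly from the percentages above because the post rounds 16.7% before averaging).

Code:
# Rough sketch of the normalize -> average -> re-index idea described above.
# The fps values are the made-up example numbers, not measurements.
fps = {
    "n0": [120, 60],   # Game A, Game B
    "n1": [240, 30],
    "n2": [390, 10],
}
games = list(zip(*fps.values()))   # regroup the values per game

# Step 1: express every result as a percentage of the fastest card in that game.
normalized = {
    card: [100 * v / max(game) for v, game in zip(values, games)]
    for card, values in fps.items()
}

# Step 2: average the per-game percentages for each card.
averages = {card: sum(p) / len(p) for card, p in normalized.items()}

# Step 3: re-index so the best average reads as 100%.
best = max(averages.values())
index = {card: 100 * avg / best for card, avg in averages.items()}

print(normalized)   # n0 ~ [30.8, 100.0], n1 ~ [61.5, 50.0], n2 ~ [100.0, 16.7]
print(averages)     # n0 ~ 65.4, n1 ~ 55.8, n2 ~ 58.3
print(index)        # n0 = 100.0, n1 ~ 85.3, n2 ~ 89.2
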
Normalising then averaging the FPS is still wrong. E.g.

Game A
Card n0 120 fps
Card n1 60 fps
Game B
Card n0 60 fps
Card n1 120 fps
"Normalise":
Game A
Card n0 100%
Card n1 50%
Game B
Card n0 100%
Card n1 200%
Average:
Card n0 100%
Card n1 125%

So n1 is "faster"? But you can easily tell from looking at the raw FPS it should be a tie. And if we set n1 to 100% when normalising then we would conclude that n0 is faster. If the conclusion changes depending on which card you set to 100%, then it's obviously a flawed way to do things.

As pcchen points out, actually using the geometric mean rather than arithmetic mean fixes this.

If you want to give greater emphasis to low FPS titles, you could use a harmonic mean. Arithmetic will give extra weight to high FPS, geometric will be balanced, harmonic will weight low FPS more.
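
To make the baseline issue concrete, here is a minimal Python sketch on the two-card tie example above (made-up numbers only): normalize each game against a chosen baseline card, then summarize with all three means.

Code:
# Minimal sketch: per-game ratios against a chosen baseline card, summarized
# with arithmetic, geometric and harmonic means. Made-up fps values from the
# tie example above, not measurements.
from statistics import mean, geometric_mean, harmonic_mean

fps = {"n0": [120, 60], "n1": [60, 120]}   # (Game A, Game B) per card

for baseline in fps:
    for card in fps:
        if card == baseline:
            continue
        ratios = [a / b for a, b in zip(fps[card], fps[baseline])]
        print(f"{card} relative to {baseline}: "
              f"arith {mean(ratios):.2f}  geo {geometric_mean(ratios):.2f}  "
              f"harm {harmonic_mean(ratios):.2f}")

# n1 relative to n0: arith 1.25  geo 1.00  harm 0.80
# n0 relative to n1: arith 1.25  geo 1.00  harm 0.80
# The arithmetic mean calls whichever card is not the baseline 25% "faster",
# so the verdict flips with the baseline; the geometric mean says tie either way.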
 
Also, it is not so much about whether a game is playable or not, but about gauging the performance of each GPU. I see this a lot in the mobile space as well, where, say, an A15 device scores 17 fps vs. an SD8 Gen 1 scoring just 5 fps (for example); it's about gauging the relative performance of what the hardware is capable of.
 
An alternative that might be interesting to look at is comparing clock speeds under RT vs. non-RT workloads. I wonder if the clock rates end up lower under RT workloads.
Furmark with RT is needed :mrgreen:

I believe Furmark is stressful because of the ROP workload.

Anecdotally, I've seen power consumption ranging from 350 to 480 watts on a 3090 Ti, varying simply with the game. So it's proving extremely difficult to assess ray tracing performance in general terms.

It'll be interesting to see how much performance in Unreal Engine 5 games depends upon the ray tracing hardware, since that engine is going to dominate the next 10 years of gaming.
 
Not sure if this is the best thread:
https://gpuopen.com/hiprt/

HIP RT is a ray tracing library for HIP, making it easy to write ray-tracing applications in HIP. The APIs and library are designed to be minimal, lower level, and simple to use and integrate into any existing HIP applications.

Although there are other ray tracing APIs which introduce many new things, we designed HIP RT in a slightly different way so you do not need to learn many new kernel types.
 
Anyone see any Ryzen 5800X3D benchmarks that include ray tracing? Really curious if the huge cache helps with all the BVH stuff that seems to make DXR games crush CPUs.
 
Was curious, so I did a simple interpretation of the Computerbase data for RT vs. non-RT at 720p.

The difference in the percentage gains of the 5800X3D over the 5800X, RT vs. non-RT:

Code:
Game                        Avg FPS    1% low
Battlefield 2042              2.15%    -0.20%
Cyberpunk 2077                6.32%     5.35%
Dying Light 2                 1.15%    -7.16%
Far Cry 6                     4.06%     0.14%
Ghostwire                   -10.81%    -3.29%
Guardians of the Galaxy      -3.44%    -4.45%
Resident Evil Village       -25.46%   -22.39%

To clarify: a negative % here means the 5800X3D's gain over the 5800X was smaller in the RT test than in the non-RT test; a positive % means the gain was larger with RT.
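
To spell out how a number like that is derived, here is a small Python sketch of the metric; the fps figures in it are placeholders I picked purely to illustrate the formula, not the Computerbase results.

Code:
# Sketch of the "difference of percent gains" metric described above.
# All fps values below are invented placeholders, not the Computerbase data.
def gain_pct(x3d_fps, x_fps):
    # Percent gain of the 5800X3D over the 5800X in one test.
    return 100 * (x3d_fps / x_fps - 1)

def rt_vs_raster_delta(x3d_rt, x_rt, x3d_raster, x_raster):
    # Positive: the X3D gains more with RT on; negative: it gains less.
    return gain_pct(x3d_rt, x_rt) - gain_pct(x3d_raster, x_raster)

# Hypothetical example: +4% gain with RT on, +10% with RT off -> -6.0
print(rt_vs_raster_delta(x3d_rt=104, x_rt=100, x3d_raster=110, x_raster=100))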
 
So in the RT titles the cache improved performance over the base 5800X, just not by as much as with RT off for most titles. That's not really clear cut.
 
In UE5, Hardware Lumen is only 7% slower than Software Lumen on NVIDIA GPUs, while offering significantly better visual quality and more effects. The situation on AMD GPUs is a little different: Hardware Lumen is 17% slower than Software Lumen, with lots of missing reflections compared to NVIDIA GPUs.


 
What's with the missing RT reflections on AMD? Are they running some overly aggressive optimization? I think there was a similar issue in Watch Dogs.
 
One of the devs wrote in one of the threads what each setting does. Nice to see stuff like this



Off
everything off

Low
reflections - low
reflection bounces - 1
translucent reflections - off
mesh caustics - off
water caustics - off
DDGI - off

Medium
reflections - low
reflection bounces - 1
translucent reflections - low
mesh caustics - low
water caustics - low
DDGI - on

High
reflections - medium
reflection bounces - 1
translucent reflections - medium
mesh caustics - medium
water caustics - medium
DDGI - on

Ultra
reflections - high
reflection bounces - 2
translucent reflections - high
mesh caustics - high
water caustics - high
DDGI - on

https://steamcommunity.com/app/1016800/discussions/0/3267933887513218928/
 

You wouldn't get the numbers in that normalise-then-average example with how CarstenS was doing it. He normalized all results such that, for each game, 100% represented the highest FPS.

So for Game B, using his method, it would be:

Card n0 - 50%
Card n1 - 100%

Which would lead to the average being

Card n0 - 75%
Card n1 - 75%

Then you normalize those percentages to get

Card n0 - 100%
Card n1 - 100%

Which is representative of the relative averaged performance of the two hypothetical cards.
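
A compact Python sketch of that procedure on the same hypothetical two-card numbers, just to show it does land on the tie:

Code:
# Per-game "normalize to the fastest card" method described above, applied to
# the hypothetical 120/60 vs 60/120 example. Not real benchmark data.
fps = {"n0": [120, 60], "n1": [60, 120]}   # (Game A, Game B)
games = list(zip(*fps.values()))

normalized = {c: [100 * v / max(g) for v, g in zip(vals, games)]
              for c, vals in fps.items()}                        # n0: [100, 50], n1: [50, 100]
averages = {c: sum(p) / len(p) for c, p in normalized.items()}   # both 75.0
best = max(averages.values())
index = {c: 100 * a / best for c, a in averages.items()}         # both 100.0, i.e. a tie
print(index)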

Regards,
SB
 
The following paper is pretty much compulsory reading for any researcher or practitioner of computer architecture. It's a very readable paper, and it's relevant to this performance discussion. I'm sure many of this thread's participants have read it already, but I would strongly encourage those that haven't to do so.

https://dl.acm.org/doi/pdf/10.1145/63039.63043

The author Jim Smith has near-legendary status in the field and is responsible for some seminal work in precise exceptions, vector architectures and many other features of modern processors that we take for granted today.
 