GPU Ray Tracing Performance Comparisons [2021-2022]

Why can't anyone be bothered to normalize their fps*? It's not exactly rocket science; you can pull the numbers into Excel in five minutes at most and end up with much more valuable data.

*Yes, I realize they have tables normalized to a geo mean of fps, but that's just so...

edit: I feel I need to spell it out once. With geo means, you potentially (numbers very much exaggerated) do this:
Game A
Card n0 120 fps
Card n1 240 fps
Card n2 390 fps
Game B
Card n0 60 fps
Card n1 30 fps
Card n2 10 fps (for example too little VRAM)
Your geo mean results:
Card n0 avg. 90 fps
Card n1 avg. 135 fps
Card n2 avg. 200 fps

Your (flawed) conclusion: Card n0 is the slowest, card n1 is mediocre, and card n2 is uber!!! Not only do you completely miss the fact that you can barely play Game B on card n1 and almost not at all on card n2, your conclusion points in the opposite direction!


I think you mean arithmetic mean (which is flawed)?

Using geometric mean, your example would be like:

Card n0 mean: 84.85 (sqrt(120*60))
Card n1 mean: 84.85 (sqrt(240*30))
Card n2 mean: 62.45 (sqrt(390*10))
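
Here's a quick sanity check of those numbers as a minimal Python sketch (using only the made-up fps values from the example, nothing measured): the "avg." figures in the quoted post are arithmetic means, and an actual geometric mean tells a rather different story.

Code:
# Minimal sketch with the made-up example fps values (not measurements).
from statistics import geometric_mean

fps = {
    "n0": [120, 60],   # Game A, Game B
    "n1": [240, 30],
    "n2": [390, 10],
}

for card, values in fps.items():
    arith = sum(values) / len(values)   # what the quoted post labels "geo mean"
    geo = geometric_mean(values)        # an actual geometric mean
    print(f"{card}: arithmetic {arith:.2f}, geometric {geo:.2f}")

# n0: arithmetic 90.00, geometric 84.85
# n1: arithmetic 135.00, geometric 84.85
# n2: arithmetic 200.00, geometric 62.45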
 

With normalized numbers, you arrive at
Game A
Card n0 120 fps = 30.8%
Card n1 240 fps = 61.5%
Card n2 390 fps = 100%
Game B
Card n0 60 fps = 100%
Card n1 30 fps = 50%
Card n2 10 fps = 16.7%
You could just average the normalized values:
Card n0 avg. 65.4%
Card n1 avg. 55.8%
Card n2 avg. 58.4%

Or, if you want a fixed reference point, you could normalize those percentages again, so people can readily see how good 58.4% really is without needing an external anchor.
Card n0 normalized index 100%
Card n1 normalized index 85.3%
Card n2 normalized index 89.3%

At least you're not presenting a false winner. Still, you don't see that there are games which just don't run well on cards n1 and n2 with the chosen settings. So please provide your raw fps/ms/whatever it is, so people can verify your conclusion (and point out any errors you may have made!) and weigh it differently to arrive at their own conclusion ("Game F doesn't interest me" / "I don't have a 4K display, so I'll disregard this").
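
For anyone who wants to play with it, here is a rough Python sketch of that normalize-to-fastest, average, then re-index procedure, again using only the made-up example numbers (the variable names are mine; the last digits differ slightly from the percentages above because the post rounds 16.7% before averaging).

Code:
# Rough sketch of the normalize -> average -> re-index idea described above.
# The fps values are the made-up example numbers, not measurements.
fps = {
    "n0": [120, 60],   # Game A, Game B
    "n1": [240, 30],
    "n2": [390, 10],
}
games = list(zip(*fps.values()))   # regroup the values per game

# Step 1: express every result as a percentage of the fastest card in that game.
normalized = {
    card: [100 * v / max(game) for v, game in zip(values, games)]
    for card, values in fps.items()
}

# Step 2: average the per-game percentages for each card.
averages = {card: sum(p) / len(p) for card, p in normalized.items()}

# Step 3: re-index so the best average reads as 100%.
best = max(averages.values())
index = {card: 100 * avg / best for card, avg in averages.items()}

print(normalized)   # n0 ~ [30.8, 100.0], n1 ~ [61.5, 50.0], n2 ~ [100.0, 16.7]
print(averages)     # n0 ~ 65.4, n1 ~ 55.8, n2 ~ 58.3
print(index)        # n0 = 100.0, n1 ~ 85.3, n2 ~ 89.2
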
Normalising then averaging the FPS is still wrong. E.g.

Game A
Card n0 120 fps
Card n1 60 fps
Game B
Card n0 60 fps
Card n1 120 fps
"Normalise":
Game A
Card n0 100%
Card n1 50%
Game B
Card n0 100%
Card n1 200%
Average:
Card n0 100%
Card n1 125%

So n1 is "faster"? But you can easily tell from looking at the raw FPS it should be a tie. And if we set n1 to 100% when normalising then we would conclude that n0 is faster. If the conclusion changes depending on which card you set to 100%, then it's obviously a flawed way to do things.

As pcchen points out, actually using the geometric mean rather than arithmetic mean fixes this.

If you want to give greater emphasis to low FPS titles, you could use a harmonic mean. Arithmetic will give extra weight to high FPS, geometric will be balanced, harmonic will weight low FPS more.
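
To make the baseline issue concrete, here is a minimal Python sketch on the two-card tie example above (made-up numbers only): normalize each game against a chosen baseline card, then summarize with all three means.

Code:
# Minimal sketch: per-game ratios against a chosen baseline card, summarized
# with arithmetic, geometric and harmonic means. Made-up fps values from the
# tie example above, not measurements.
from statistics import mean, geometric_mean, harmonic_mean

fps = {"n0": [120, 60], "n1": [60, 120]}   # (Game A, Game B) per card

for baseline in fps:
    for card in fps:
        if card == baseline:
            continue
        ratios = [a / b for a, b in zip(fps[card], fps[baseline])]
        print(f"{card} relative to {baseline}: "
              f"arith {mean(ratios):.2f}  geo {geometric_mean(ratios):.2f}  "
              f"harm {harmonic_mean(ratios):.2f}")

# n1 relative to n0: arith 1.25  geo 1.00  harm 0.80
# n0 relative to n1: arith 1.25  geo 1.00  harm 0.80
# The arithmetic mean calls whichever card is not the baseline 25% "faster",
# so the verdict flips with the baseline; the geometric mean says tie either way.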
 
Also, it is not so much about whether a game is playable or not, but about gauging the performance of each GPU. I see this a lot in the mobile space as well, where, say, an A15 device scores 17 fps vs. an SD8 Gen 1 scoring just 5 fps (for example); it's about gauging the relative performance of what the hardware is capable of.
 
An alternative that might be interesting to look at is comparing clock speeds under RT vs. non-RT workloads. I wonder if the clock rates end up lower under RT workloads.
Furmark with RT is needed :mrgreen:

I believe Furmark is stressful because of the ROP workload.

Anecdotally, I've seen power consumption ranging from 350 to 480 watts on a 3090 Ti, varying simply with the game. So it's proving extremely difficult to assess ray tracing performance in general terms.

It'll be interesting to see how much performance in Unreal Engine 5 games depends upon the ray tracing hardware, since that engine is going to dominate the next 10 years of gaming.
 
Not sure if this is the best thread:
https://gpuopen.com/hiprt/

HIP RT is a ray tracing library for HIP, making it easy to write ray-tracing applications in HIP. The APIs and library are designed to be minimal, lower level, and simple to use and integrate into any existing HIP applications.

Although there are other ray tracing APIs which introduce many new things, we designed HIP RT in a slightly different way so you do not need to learn many new kernel types.
 
Anyone see any Ryzen 5800X3D benchmarks that include ray tracing? Really curious if the huge cache helps with all the BVH stuff that seems to make DXR games crush CPUs.
 
Was curious, so I did a simple interpretation of the Computerbase data for RT vs. non-RT at 720p.

The difference in the percentage gains of the 5800X3D over the 5800X, RT vs. non-RT:

Code:
Game                        Avg FPS    1% low
Battlefield 2042              2.15%    -0.20%
Cyberpunk 2077                6.32%     5.35%
Dying Light 2                 1.15%    -7.16%
Far Cry 6                     4.06%     0.14%
Ghostwire                   -10.81%    -3.29%
Guardians of the Galaxy      -3.44%    -4.45%
Resident Evil Village       -25.46%   -22.39%

To clarify: a negative % here means the 5800X3D's gain over the 5800X was smaller in the RT test than in the non-RT test; a positive % means the gain was larger with RT.
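
To spell out how a number like that is derived, here is a small Python sketch of the metric; the fps figures in it are placeholders I picked purely to illustrate the formula, not the Computerbase results.

Code:
# Sketch of the "difference of percent gains" metric described above.
# All fps values below are invented placeholders, not the Computerbase data.
def gain_pct(x3d_fps, x_fps):
    # Percent gain of the 5800X3D over the 5800X in one test.
    return 100 * (x3d_fps / x_fps - 1)

def rt_vs_raster_delta(x3d_rt, x_rt, x3d_raster, x_raster):
    # Positive: the X3D gains more with RT on; negative: it gains less.
    return gain_pct(x3d_rt, x_rt) - gain_pct(x3d_raster, x_raster)

# Hypothetical example: +4% gain with RT on, +10% with RT off -> -6.0
print(rt_vs_raster_delta(x3d_rt=104, x_rt=100, x3d_raster=110, x_raster=100))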
 
So in the RT titles the cache improved performance over the base 5800X, just not by as much as with RT off for most titles. That's not really clear cut.
 
In UE5, Hardware Lumen is only 7% slower than Software Lumen on NVIDIA GPUs, while offering significantly better visual quality and more effects. The situation on AMD GPUs is a little different: Hardware Lumen is 17% slower than Software Lumen, with lots of missing reflections compared to NVIDIA GPUs.


 
What's with the missing RT reflections on AMD? Are they running some overly aggressive optimization? I think there was a similar issue in Watch Dogs.
 
One of the devs wrote in one of the threads what each setting does. Nice to see stuff like this



Off
everything off

Low
reflections - low
reflection bounces - 1
translucent reflections - off
mesh caustics - off
water caustics - off
DDGI - off

Medium
reflections - low
reflection bounces - 1
translucent reflections - low
mesh caustics - low
water caustics - low
DDGI - on

High
reflections - medium
reflection bounces - 1
translucent reflections - medium
mesh caustics - medium
water caustics - medium
DDGI - on

Ultra
reflections - high
reflection bounces - 2
translucent reflections - high
mesh caustics - high
water caustics - high
DDGI - on

https://steamcommunity.com/app/1016800/discussions/0/3267933887513218928/
 

You wouldn't get the numbers in that normalise-then-average example with how CarstenS was doing it. He normalized all results such that, for each game, 100% represented the highest FPS.

So for Game B, using his method, it would be:

Card n0 - 50%
Card n1 - 100%

Which would lead to the average being

Card n0 - 75%
Card n1 - 75%

Then you normalize those percentages to get

Card n0 - 100%
Card n1 - 100%

Which is representative of the relative averaged performance of the two hypothetical cards.
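
A compact Python sketch of that procedure on the same hypothetical two-card numbers, just to show it does land on the tie:

Code:
# Per-game "normalize to the fastest card" method described above, applied to
# the hypothetical 120/60 vs 60/120 example. Not real benchmark data.
fps = {"n0": [120, 60], "n1": [60, 120]}   # (Game A, Game B)
games = list(zip(*fps.values()))

normalized = {c: [100 * v / max(g) for v, g in zip(vals, games)]
              for c, vals in fps.items()}                        # n0: [100, 50], n1: [50, 100]
averages = {c: sum(p) / len(p) for c, p in normalized.items()}   # both 75.0
best = max(averages.values())
index = {c: 100 * a / best for c, a in averages.items()}         # both 100.0, i.e. a tie
print(index)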

Regards,
SB
 
The following paper is pretty much compulsory reading for any researcher or practitioner of computer architecture. It's a very readable paper, and it's relevant to this performance discussion. I'm sure many of this thread's participants have read it already, but I would strongly encourage those that haven't to do so.

https://dl.acm.org/doi/pdf/10.1145/63039.63043

The author Jim Smith has near-legendary status in the field and is responsible for some seminal work in precise exceptions, vector architectures and many other features of modern processors that we take for granted today.
 