GPU Ray Tracing Performance Comparisons [2021-2022]

Looking at the Portal RTX developer menu, the 'SER is Supported' box is ticked but greyed out???

Surely it wouldn't be ticked at all with my 3060 Ti?
 
For those interested, I've tested Witcher 3 (outside on horseback at the start of the game):

  • i3-12100F
  • 16GB RAM
  • RTX 3060 Ti

Game settings

  • Resolution: 2560x1440
  • Upscaling used: DLSS balanced
  • Graphics preset: all set to High
  • RT: Everything turned on

My fps at the above settings is ~45, with some dips to around 40 and some scenes hitting 50+.

I tried turning off each individual RT effect (but left RTGI always on) to try to get to 60fps, but no use. Turning off RT shadows, reflections and AO only gives me an extra 5-7fps, so it's hardly worth it as it doesn't get me to 60fps.

Turning off all RT effects and leaving all the other settings the same (including leaving DLSS balanced on) gets me 110-130fps.

EDIT:

Absolute shit that you can't turn each RT effect on/off individually; you have to have RTGI enabled to even be able to toggle the other RT effects, which is piss poor.

I think RT shadows add more to the game than RTGI does. There's so much foliage that's unshadowed that it makes a massive difference just having shadows on it; otherwise it has this weird, unnatural glow.
 
In UE4.26 ArchViz mode with RT on, the 4090 is 2.6x to 3x faster than the 7900 XTX, the 4080 is ~2x faster, and the 3090 Ti is ~50% faster.
Interestingly, the 2080 Ti is 30% to 50% faster than the 6900 XT, while the 7900 XTX is 30% faster than the 2080 Ti.
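
Chaining those ratios as a rough sanity check: the 4090 works out to about 1.3 × 2.6 ≈ 3.4x up to 1.3 × 3.0 ≈ 3.9x a 2080 Ti in this workload.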

[Charts: UE4.26 ArchViz RT benchmark results]


 
@davis.anthony It's a common thing with ray tracing in games. I haven't been able to figure it out, but it's pretty consistent. The first ray-traced game I played was Control, and I noticed I didn't seem to be GPU limited in the most intensive scenes. I actually tried setting the game to some absurdly low resolution with DLSS on top of it; there was no way I could be GPU limited, because the ray count would be so low at the resolution I was attempting. My frame rate changed when I adjusted the clock speed of my CPU, which proved it was a CPU limit.

I'm not sure if every ray-traced game is like that, but I think pretty much all of them are. There's some bottleneck that's CPU limited, either in the driver, the API or the engine's implementation.

OK, so 640x480 with the low preset and DLSS Ultra Performance, so an internal res of less than 320x240 (DLSS has no issue keeping up at these frame rates). Ryzen 3600X with an RTX 3080. Memory is 3200 with timings from an XMP profile (not great).
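
(For reference, DLSS Ultra Performance renders at one-third scale per axis, so the 640x480 output is reconstructed from roughly 213x160 internally.)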

At stock CPU clock and no RT we get 142 fps.


At 3 GHz CPU clock and no RT we get 119 fps, so yes, confirmed CPU limited.


Then at stock clock and full RT we get 89 fps, with only 42% GPU use. Are we CPU limited?


At 3 GHz CPU clock and full RT we get... 74 fps. Looks like we're CPU limited.


OK, maybe that CPU clock change was severe, so let's try 3950 MHz, about a 10% lower clock. I lose what looks like 6-9 fps depending on when I capture the images.
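
Summarising the runs above (assuming the 3950 MHz test was also with full RT, i.e. the stock 89 fps minus the observed 6-9 fps):

  CPU clock | RT  | FPS
  stock     | off | 142
  3.0 GHz   | off | 119
  stock     | on  | 89 (42% GPU use)
  3.95 GHz  | on  | ~80-83
  3.0 GHz   | on  | 74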

 
My GPU is 99% maxed out in every RT game I own except Spiderman.
 
You can definitely be maxed out if you're using ultra settings, 4K (maybe 1440p), native res, DLSS Quality, or some combination of those. But I bet if you experiment with the settings like I did with Control, you'll find there's some CPU limit you can't overcome by lowering settings, resolution, etc. Set the resolution to something absurd like 640x480 or lower, set all of the non-RT settings to low, and then compare with RT on/off. Your GPU utilisation will probably be very low and your frame rate will take a hit anyway.

In terms of gameplay you may never encounter it, especially if you have a good CPU. If you don't have a great CPU, or you're specifically trying to game at high frame rates by adjusting settings, you might see it. I pretty much always aim for 120 fps, so I test most games with RT to see if I can get the settings to bring me to that level.
 
RTGI does wonders in big cities/interiors. But in a big city I'm CPU limited all the way... (by that I mean the devs screwed up badly).
 
Same here. When RTGI is on, my Ryzen 3700X can only go up to 34fps at 1440p, and 30fps at 4K. Changing the Ultra+ settings to Ultra, High, Medium etc. does nothing at all to improve performance. It's a pity, but in part I'm glad: I just click on the RT Ultra setting and that's it. On my 1440p monitor I get a 30fps experience with "DLSS" Auto (I hacked the game to make it think it's DLSS, but it's using XeSS), and 30fps with dips at 4K. LFC and VRR make it a bit more palatable, and the lighting is breathtaking. I've completed the game a few times already; I just prefer to play with RT on all the time and enjoy what CD Projekt has created.
 
I'm not surprised you're CPU limited with that CPU.
 
It's still a decent CPU. And I bet that, if it's like mine, in a big city the GPU usage drops a lot, but you only have 1 or 2 threads at a decent load on the CPU, fighting for 27-32 fps... It's ridiculous...

This is the two-locked-thread (main thread and render thread) phenomenon, with multiple worker threads on top. The two locked threads synchronise with the workers, and tons of time is lost on worker threads waiting for answers while the two locked threads are busy. The majority of game engines basically use this because it is very easy to implement. It was OK up to 4 cores, but now we have 8 cores with SMT, and it's bad. Very few engines work another way, like job-based or fiber-based threading or ECS. The only ones I know of are Bungie's Tiger engine, id Tech, the Naughty Dog engine, the Decima engine, Bluepoint's engine and Unity ECS. There are probably other engines doing it, maybe Frostbite for example, but I've never seen a presentation covering anything outside that engine's rendering side.
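
To make the shape of the problem concrete, here's a minimal, hypothetical C++ sketch (not any real engine's code): a game thread and a render thread running one frame apart and meeting at a per-frame sync point. All the worker threads have to be fed from these two, which is why they become the cap.

Code:
// Hypothetical sketch of the "two locked threads" layout: a game thread
// and a render thread run one frame apart and synchronise once per frame.
// Worker threads (omitted) would be fed tasks by these two and then block
// waiting for the next batch, which is where the idle time comes from.
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <thread>

std::mutex m;
std::condition_variable cv;
bool frame_ready = false; // set by the game thread, consumed by the render thread
bool quit = false;

void game_thread() {
    for (int frame = 0; frame < 5; ++frame) {
        // ... run gameplay and build the command list for this frame ...
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [] { return !frame_ready; }); // sync point: render must drain
        frame_ready = true;                         // hand the frame over
        cv.notify_all();
    }
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [] { return !frame_ready; });     // wait for the last frame to drain
    quit = true;
    cv.notify_all();
}

void render_thread() {
    for (;;) {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [] { return frame_ready || quit; });
        if (quit) return;
        // ... translate the command list into graphics API calls ...
        std::printf("rendered a frame\n");
        frame_ready = false;
        cv.notify_all(); // unblock the game thread for the next frame
    }
}

int main() {
    std::thread game(game_thread), render(render_thread);
    game.join();
    render.join();
}

However many cores the machine has, frame time in this layout is bounded by the slower of those two threads plus the sync overhead.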

There are some smaller studios using great multithreaded engines too, like this studio in Paris: http://www.pastagames.com/. Some of their programmers were following the Naughty Dog way of doing things, and one of their engine programmers was eventually hired by Guerrilla Games. I'm sure there are more examples than this. It reminds me of the ND hiring tip: there are some great programmers in small studios. That's how ND hired Christophe Balestra, who learned to program as part of the Atari ST demo-making scene.


We have a fiber-based job system at Pastagames. I started coding it just after seeing this Naughty Dog GDC talk back in 2015. And so far, we're really happy with it.

In this article, I’ll focus more on how we use it than on how we implemented it ‒ our implementation is very close to what Christian Gyrling presented in his talk.
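
For contrast with the two-locked-thread sketch above, here's a heavily reduced sketch of the job-system idea from that talk, approximated with an ordinary thread pool instead of real fibers (a genuine fiber implementation swaps stacks so a waiting job yields its worker rather than blocking it). The JobSystem class and run() helper are hypothetical names; only the atomic wait-for-counter idiom comes from the presentation.

Code:
// Hypothetical, heavily reduced job-system sketch in the spirit of the
// Naughty Dog talk, using a plain thread pool instead of real fibers.
// Work is submitted against an atomic counter; "waiting" means waiting
// for that counter to reach zero (a real fiber system would run other
// jobs instead of spinning here).
#include <atomic>
#include <condition_variable>
#include <cstdio>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class JobSystem {
public:
    explicit JobSystem(unsigned n) {
        for (unsigned i = 0; i < n; ++i)
            workers.emplace_back([this] { worker_loop(); });
    }
    ~JobSystem() {
        { std::lock_guard<std::mutex> lock(m); stopping = true; }
        cv.notify_all();
        for (auto& w : workers) w.join();
    }
    // Queue `fn`; `counter` is decremented when the job finishes.
    void run(std::atomic<int>& counter, std::function<void()> fn) {
        counter.fetch_add(1);
        {
            std::lock_guard<std::mutex> lock(m);
            jobs.push([fn = std::move(fn), &counter] { fn(); counter.fetch_sub(1); });
        }
        cv.notify_one();
    }
private:
    void worker_loop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(m);
                cv.wait(lock, [this] { return stopping || !jobs.empty(); });
                if (stopping && jobs.empty()) return;
                job = std::move(jobs.front());
                jobs.pop();
            }
            job(); // runs on whichever core is free, no fixed "owner" thread
        }
    }
    std::queue<std::function<void()>> jobs;
    std::mutex m;
    std::condition_variable cv;
    bool stopping = false;
    std::vector<std::thread> workers;
};

int main() {
    JobSystem js(4); // four workers for the demo
    std::atomic<int> counter{0};
    for (int i = 0; i < 8; ++i)
        js.run(counter, [i] { std::printf("processed chunk %d\n", i); });
    while (counter.load() != 0) std::this_thread::yield(); // wait-for-counter
    std::printf("frame work complete\n");
}

Because any core just grabs the next job, there is no fixed main/render pair to saturate; an 8-core/16-thread CPU can actually stay busy instead of running two hot threads and many mostly idle workers.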
 
My 12100F quad-core performs better in RT than the 3700X.

That's normal; the game engine is not well multithreaded. It will be interesting to redo the test when fully exclusive PS5 games from ND (their next single-player game) and GG (Horizon 3) release on PC. Those studios' engines are very well multithreaded, and I wouldn't be surprised if the 3700X performed better than the 12100F quad-core on them, with or without RT.

This gen was so fucked up by stock problems and COVID that we won't see real exclusives from some studios before 2024/2025.

EDIT: A great test would be Doom Eternal: it scales well from 4 to 16 cores, so we could see if the 12100F beats the 3700X there. id Tech's CPU-side engine technology is great.

And I don't mention Spider-Man 2 because the Insomniac engine has the same multithreading problem. Bungie's engine is very good, but I'm not sure they will target PS5/Xbox Series as the minimum specification, knowing the type of games they make. Bluepoint's engine is good, but doing remakes they won't push the CPU very far; for example, they didn't know what to do with the CPU power in Demon's Souls Remake. The rumor is Bluepoint has two games, an original title and a remake. If they release the original game first, it will be interesting to check.

EDIT2: And developers didn't choose this two-locked-thread design because they are stupid or lazy. It exists because of dual-core CPUs. It worked very well on that architecture, but once CPUs gained cores, moving away from it meant totally refactoring the engine's CPU architecture while still shipping games on time. The easier path was to keep the two locked threads and add worker threads. It's not ideal, but it's better than doing nothing, and it means the team can ship on time.
 
A picture speaks a thousand words. The review also puts its IPC at 27% stronger than the 3700X's, which is why it's the better gaming CPU, and by far the better gaming CPU when RT is turned on.

Doom Eternal is no problem, as it's right up a 3900X's arse.

[Chart: Doom Eternal CPU benchmark, 1280x720]

[Chart: average gaming performance summary]

And then in 12 months' time, when 12th-gen i7s are cheap, I'll chuck one of those in my motherboard and get a huge performance boost.

And on another note, please work on your sentence structure. Reading and trying to understand that reply was much more difficult than it needed to be, and even then I'm not 100% sure what you were trying to put across.
 
The second graph is not useful; I wanted to see Doom Eternal specifically. Is this with RT? But in the end @Rootax has a point: the game engine could do better on a 3700X if it used a better multithreading architecture.
 
A fewer-core CPU can absolutely outperform a higher-core-count CPU. The CPU with fewer cores but higher IPC can handle more work per core per clock. Added features also help performance, quite greatly in some cases. That i3 may have half the core count, but it can do maybe double the amount of math on each core, at much higher clocks, with more cache, a more advanced architecture and hardware-block assistance.

Obviously, if a game is coded to take advantage of an 8-core CPU (or 6 in the PS5's case), then it will easily perform better on a CPU with more cores, and vice versa.
So no, you can't just compare against a native PS5 AAA game, because that game will be optimised for the roughly 6.5 slowly clocked cores the PS5 makes available to games.

However, I'm sure that i3 will hold its own most of the time, if not outperform the PS5's CPU, in most multiplatform games (it already does at the moment).
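
To put illustrative numbers on that (a textbook Amdahl's-law sketch, not measured data): if 60% of a frame's CPU work parallelises, 4 cores give a speedup of 1/(0.4 + 0.6/4) ≈ 1.8x over one core, while 8 cores give 1/(0.4 + 0.6/8) ≈ 2.1x. That's only about 16% more from doubling the cores, which a 27% per-core IPC advantage more than cancels out.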
 