GPU Ray Tracing Performance Comparisons [2021-2022]

This is seriously something I just don't understand. Imagination/PowerVR did it all years before NVIDIA, yet somehow NVIDIA doing it later was the miracle?

Caustic did it in 2006 or thereabouts, then Imagination acquired them later.
 
Rys is from a lot of IHVs.
Only 2, just 2 goes at each 😬
2009 was the announcement of CausticOne, the first (FPGA-based) incarnation of the basic ideas that made it into PowerVR Wizard (and then on into Photon in more modern form after the product reboot more recently).
 
Yeah, now I see my post was completely irrelevant. I rushed to reply in a tab that had been open for days on my phone, without refreshing the page first. A mod might as well delete it...
 
So Caustic Graphics came out with CausticOne in 2009 and was acquired by Imagination in 2010. (Wiki)

After the acquisition by Imagination Technologies, an initiative began to integrate the Caustic ray tracing coherence gathering, fixed function geometry testers and a new streaming BVH generator into an existing Series 6 PowerVR GPU architecture. This allowed GLSL shaders which previously had to run on a host CPU to instead be executed on-chip using the existing PowerVR shading clusters and still gain benefit from the ray reordering being performed by the Caustic traversal unit.[14]
Edit: PowerVR IP is licensed to the companies currently using it, and I imagine that includes the RT patents.
PowerVR accelerators are not manufactured by PowerVR, but instead their integrated circuit designs and patents are licensed to other companies, such as Texas Instruments, Intel, NEC, BlackBerry, Renesas, Samsung, STMicroelectronics, Freescale, Apple (Imagination Technologies signed a new multi-year, multi-lease agreement under which Apple has access to a wider range of Imagination’s intellectual property in exchange for license fees on January 2, 2020.[1] The re-kindling of the partnership between the two companies comes as Apple's licences to Imagination graphics IP expire at the end of 2019.[2]), NXP Semiconductors (formerly Philips Semiconductors), and many others.
 
SaarCOR was done at some university if I recall. I'm not sure if it ever saw a commercial release, but I recall it being one of the earliest ray tracing accelerators.
 
University of Saarland - quelle surprise :)

Realtime ray tracing in 2002.

Just dipping into the text there are some very sobering numbers:

Absolute performance measurements for the Saar-COR chip with 1, 2 and 4 RTCs, 272 KB cache, and 1 GB/s memory bandwidth. 4 RTCs have only half the floating-point performance of a GeForce3 and there is an almost linear relation between performance and the number of RTCs.

A closer analysis shows, that the Cruiser scene with 3.5 million triangles is limited by the memory bandwidth for triangle fetching. Table 3 gives performance measurements of the CruiserGL scene for different sized intersection-caches in combination with 1 and 2 GB/s bandwidth to main-memory. This shows that with a bandwidth of 2 GB/s and an int-cache of 288 KB the performance again scales linearly in the number of RTCs. Achieving linear speed-up with 4 RTCs in BQD-2-scene is harder: we need to enlarge all caches four times to roughly 1 MB together with a 2 GB/s bandwidth to main-memory.

In that table, 4 RTCs with 2GB/s of bandwidth combined with a 576KB int-cache delivers 114fps...

In contrast to these complex models, the Quake3 scene shows perfect linear scaling. Using the standard cache and a bandwidth of only 250 MB/s linear scaling is achieved even up to 16 RTCs. The floating-point performance of the GeForce3 equals the floating-point performance of a full SaarCOR chip with 8 RTCs and full shading. Rendering the Quake3 scene with 8 RTCs achieves 235 fps

Just let that all sink in.

Now, to be fair, it's not "path tracing", but this is real time 1024x768:

[Image: b3da050.png]


Just browsing the titles here from 2002:


is fun.

SAAR-Cor was a big deal to some of us :) And of course, Ingo Wald:


is a very important person :)
 
Native 4K ray tracing test with max settings. The 4090 is faster by:

Crysis Remastered:
3090: 97%
6900XT: 4.7X

Chernobylite:
3090: 87%
6900XT: 3.8X

Battlefield V:
3090: 70%
6900XT: 3.7X

Hitman 3:
3090: 90%
6900XT: 3.6X

Cyberpunk 2077:
3090: 90%
6900XT: 3.5X

Guardians of the Galaxy:
3090: 70%
6900XT: 3.3X

Dying Light 2:
3090: 87%
6900XT: 2.9X

F1 2022:
3090: 86%
6900XT: 2.6X

Metro Exodus:
3090: 75%
6900XT: 2.6X

Watch Dogs Legion:
3090: 80%
6900XT: 2.6X

Ghostwire Tokyo:
3090: 75%
6900XT: 2.5X

Doom Eternal:
3090: 65%
6900XT: 2.4X

Deathloop:
3090: 70%
6900XT: 2.3X

Resident Evil Village:
3090: 84%
6900XT: 2.15X

Shadow of the Tomb Raider:
3090: 90%
6900XT: 2.2X
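
The list mixes two notations, so here is a rough sketch of putting both columns on one scale and summarising them. It assumes "97%" means 97% faster (1.97x as fast) and "4.7X" means 4.7x as fast; the helper name `to_ratio` and the `results` table are made up for illustration, and the figures are simply copied from the list.

```python
from statistics import geometric_mean

def to_ratio(figure: str) -> float:
    """Turn '97%' (percent faster) or '4.7X' (times as fast) into a plain speed ratio."""
    text = figure.strip().upper()
    if text.endswith("%"):
        return 1.0 + float(text[:-1]) / 100.0   # 97% faster -> 1.97x as fast
    if text.endswith("X"):
        return float(text[:-1])                  # assumed reading: 4.7X -> 4.7x as fast
    raise ValueError(f"unrecognised figure: {figure!r}")

# (game, 4090 lead over 3090, 4090 lead over 6900XT), copied from the list
results = [
    ("Crysis Remastered", "97%", "4.7X"),
    ("Chernobylite", "87%", "3.8X"),
    ("Battlefield V", "70%", "3.7X"),
    ("Hitman 3", "90%", "3.6X"),
    ("Cyberpunk 2077", "90%", "3.5X"),
    ("Guardians of the Galaxy", "70%", "3.3X"),
    ("Dying Light 2", "87%", "2.9X"),
    ("F1 2022", "86%", "2.6X"),
    ("Metro Exodus", "75%", "2.6X"),
    ("Watch Dogs Legion", "80%", "2.6X"),
    ("Ghostwire Tokyo", "75%", "2.5X"),
    ("Doom Eternal", "65%", "2.4X"),
    ("Deathloop", "70%", "2.3X"),
    ("Resident Evil Village", "84%", "2.15X"),
    ("Shadow of the Tomb Raider", "90%", "2.2X"),
]

vs_3090 = [to_ratio(a) for _, a, _ in results]
vs_6900xt = [to_ratio(b) for _, _, b in results]

# Geometric mean is the usual way to average ratios across games
print(f"games: {len(results)}")
print(f"geomean 4090 vs 3090:   {geometric_mean(vs_3090):.2f}x")
print(f"geomean 4090 vs 6900XT: {geometric_mean(vs_6900xt):.2f}x")
```

If "4.7X faster" was meant literally (5.7x as fast), the second branch of `to_ratio` would need a `+ 1.0`.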

 
Wow!!! AMD have got some serious catching up to do with their RT performance in RDNA3.
 
The PC release of Sony's Sackboy surprises with ray tracing support. At 1440p the RTX 4090 is over 2x faster than a 3080 Ti, which should put it around 1.95x over the 3090 Ti:
 
Really dislike this trend of destroying RT performance on any GPU that is not a 4090 for little to no visual gain. Just based on the screenshots GameGPU provided the RT implementation is laughable and not worth the extreme performance cost.
When are we finally getting good optimized RT in games again?
 
The 4090 is also 3.9X faster than the 6900XT at 4K! Meanwhile, the 3080 Ti is 75% faster than the 6900XT at 4K and 65% faster at 1440p.
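
As a quick sanity check, the two 4K figures can be chained through the 6900XT to get an implied 4090 vs 3080 Ti lead. This is only a sketch: it assumes "3.9X faster" means 3.9x as fast and "75% faster" means 1.75x as fast, and the table and `lead` helper are names invented for the example.

```python
# 4K speeds relative to the 6900XT, as quoted in this post
# (assumed readings: "3.9X faster" -> 3.9x as fast, "75% faster" -> 1.75x as fast)
relative_to_6900xt_4k = {
    "RTX 4090": 3.9,
    "RTX 3080 Ti": 1.75,
    "RX 6900 XT": 1.0,
}

def lead(faster: str, slower: str, table: dict) -> float:
    """Speed of `faster` as a multiple of `slower`, given speeds against a shared baseline."""
    return table[faster] / table[slower]

implied = lead("RTX 4090", "RTX 3080 Ti", relative_to_6900xt_4k)
print(f"implied 4090 vs 3080 Ti at 4K: {implied:.2f}x")
```

That lands a bit above 2x, which is at least consistent with the "over 2x" quoted for 1440p.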

When are we finally getting good optimized RT in games again?
Welcome to the world of UE4 RT reflections: they are applied everywhere, even if the materials are not really reflective. The game also has an "Ultra" reflections mode that enables full resolution for the ray traced reflections.
 
It's harder when the game is designed around raster limitations. Whenever a game has good artistic control of all the raster hacks, RT becomes increasingly hard to justify for the cost. You need to go all in like Control, Cyberpunk, Metro etc. for it to start making sense. Judging by those screenshots it seems like a pretty minor addition in Sackboy.
 
This is similar to what RT boost mode does in Crysis Remastered, although in Crysis's case the developers admit they did it for fun and it is disabled by default.
 