GPU Ray Tracing Performance Comparisons [2021] *spawn*

Discussion in 'Architecture and Products' started by DavidGraham, Mar 29, 2021.

  1. Clukos

    Clukos Bloodborne 2 when? Veteran

    It’s likely they added DLSS a while back and nobody bothered to update it.
     
    DegustatoR likes this.
  2. Clukos

    Clukos Bloodborne 2 when? Veteran

    I think the 3070 and the 2080 Ti are performing worse for different reasons. The 3070 is most likely just running out of VRAM with RT enabled; the 2080 Ti just isn't as good as Ampere at a lot of async compute.
    ^ Eternal uses async compute to hide much of the cost of building the BVH (Control does something similar afaik); maybe the 2080 Ti is slower on that path. Should be easy enough to get a frame capture with Nsight on a 2080 Ti and compare :)
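    The async-compute point above can be sketched with a toy timing model. All numbers are made up for illustration, and `overlap_efficiency` is an invented parameter standing in for how well a given architecture co-issues compute alongside graphics work (the speculation being that Ampere does this better than Turing); nothing here is measured data or engine code.

    ```python
    # Toy model: how running BVH builds on an async compute queue can hide
    # their cost. All timings are hypothetical milliseconds.

    def frame_time_serial(gfx_ms, bvh_ms):
        """BVH build runs on the graphics queue, fully serialized."""
        return gfx_ms + bvh_ms

    def frame_time_async(gfx_ms, bvh_ms, overlap_efficiency):
        """BVH build runs on an async compute queue; only the fraction that
        actually overlaps with graphics work is hidden."""
        hidden = bvh_ms * overlap_efficiency
        return gfx_ms + (bvh_ms - hidden)

    gfx, bvh = 8.0, 2.0
    print(frame_time_serial(gfx, bvh))                # 10.0
    print(round(frame_time_async(gfx, bvh, 0.9), 2))  # 8.2  ("Ampere-like")
    print(round(frame_time_async(gfx, bvh, 0.5), 2))  # 9.0  ("Turing-like")
    ```

    If the 2080 Ti overlaps the build less effectively, more of the BVH cost lands on the frame time even though its raw RT throughput is fine, which would match the benchmark gap. A Nsight capture would show this as gaps (or lack thereof) on the async compute queue timeline.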
     
    Last edited: Jun 30, 2021
    chris1515, T2098, Lightman and 4 others like this.
  3. DavidGraham

    DavidGraham Veteran

    CPU limitations.
    Ultra Nightmare settings consume a lot of VRAM, the 3060 is a 12GB card.
     
    Kyyla and PSman1700 like this.
  4. DegustatoR

    DegustatoR Veteran

  5. DavidGraham

    DavidGraham Veteran

    PSman1700 likes this.
  6. DegustatoR

    DegustatoR Veteran

    There are some weird results there between resolutions and the 8GB cards.
    The 3070 is 44% slower than the 3070 Ti at 1080p, for example, and 50% slower at 1440p. No idea why.
    Both the 3070 and the 3060 Ti are also slower than the 3060 - except at 4K, where they are ahead. Which seems a bit weird if it's due to a lack of VRAM for the UQ streaming buffer.

    They should've probably tested them (or all GPUs) with the Nightmare texture streaming buffer.
     
    Putas likes this.
  7. techuse

    techuse Veteran

    Does the ultra nightmare vram heavy setting in Doom actually change the image in any way? Does it improve performance? Why is it there?
     
  8. DavidGraham

    DavidGraham Veteran

    There are 3 texture settings: Ultra, Nightmare and Ultra Nightmare.

    They have no visual quality difference; Ultra Nightmare just caches a massive amount of textures to minimize the possibility of any texture pop-in during fast motion.

    Ultra Nightmare is not usable on GPUs with 8GB or less.

    Ultra Nightmare + RT is impossible on 8GB GPUs, and if forced it results in severe fps degradation. The 3070, for example, goes from 100+ fps to just 30fps.
     
    Plano, T2098, Frenetic Pony and 9 others like this.
  9. Dictator

    Dictator Regular

    It does absolutely nothing for average static image quality; it just changes the amount of textures cached in VRAM, like @DavidGraham says, hence why it is called "Texture Pool Size" and not "Texture Resolution" or "Texture Quality". A texture on Ultra Nightmare looks exactly the same as it does on Low. Going lower, even down to "Low", only increases the chance that rapid camera movement or perhaps a camera teleport leads to a lower-res mip being shown for a few frames. That is it.
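    The behavior described above can be modeled as a simple LRU-style residency cache. This is an assumption about the general technique, not id Tech code: the pool size never changes which mips exist, only how likely a texture is to be resident when the camera lands on it, with a low-res mip shown briefly on a miss.

    ```python
    # Toy model of a "Texture Pool Size" setting: identical top-quality mips
    # regardless of pool size, but a smaller pool means more residency
    # misses, each showing a low-res fallback mip for a moment.
    from collections import OrderedDict

    class TexturePool:
        def __init__(self, capacity):
            self.capacity = capacity        # number of resident textures
            self.resident = OrderedDict()   # texture_id -> mip, in LRU order

        def sample(self, texture_id):
            """Return the mip level shown this frame: 0 = full quality."""
            if texture_id in self.resident:
                self.resident.move_to_end(texture_id)
                return 0                    # resident: always full quality
            # Miss: show a low-res mip this frame, then make it resident.
            if len(self.resident) >= self.capacity:
                self.resident.popitem(last=False)  # evict least recently used
            self.resident[texture_id] = 0
            return 3                        # temporary low-res fallback

    big, small = TexturePool(capacity=8), TexturePool(capacity=2)
    scene = [1, 2, 3, 1, 2, 3, 1, 2, 3]    # camera revisiting three textures
    print([big.sample(t) for t in scene])    # [3, 3, 3, 0, 0, 0, 0, 0, 0]
    print([small.sample(t) for t in scene])  # [3, 3, 3, 3, 3, 3, 3, 3, 3]
    ```

    Same textures, same final quality; the small pool just thrashes, so the fallback mip gets shown far more often. That is the entire visible difference between the settings.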
     
  10. pharma

    pharma Veteran

     
    PSman1700 likes this.
  11. Dampf

    Dampf Regular

    Makes me wonder why this is a setting in the first place if it's that useless. It's just going to confuse people.

    Why not leave it at medium or high for all cards, or let the engine choose automatically based on the GPU's VRAM, graphics settings and active background programs?

    This is one thing that definitely should improve in the next id Tech game. Their fixed memory allocation system also has some drawbacks, like we have seen with that strange DLSS memory allocation bug.
     
    DegustatoR and pjbliverpool like this.
  12. DegustatoR

    DegustatoR Veteran

    They should've just made it separate from quality presets.
     
    PSman1700 likes this.
  13. DavidGraham

    DavidGraham Veteran

    That was before Moore's law ended; now we are in a new reality. Chips are getting bigger again.
    RDNA2 chips are expensive even though they are smaller than Turing. A 6900 XT is $1,000, and in most RT workloads it's no better than Turing.
    Turing was more expensive, yes, but considering it was the only future-proof arch with DX12U support, it paid dividends for its userbase, as opposed to the cheap dead-end RDNA1 GPUs.
    Turing is definitely more capable RT-wise than RDNA2. Period. Current workloads (gaming/professional) are proof enough of that.
     
    PSman1700 likes this.
  14. JoeJ

    JoeJ Veteran

    But they don't have to. Keep them small and achieve visual progress with better software :D
    Yep, even RDNA was expensive. You got the same TF at half the price by sticking with GCN. So AMD also contributed to my somewhat exaggerated, depressed view of a healthy PC platform.
    In RT games I mostly see it ahead of Turing, even if RT performance in isolation is worse. So it's good enough in practice, and the higher flexibility may pay off if DXR evolves quickly (which I doubt).
     
  15. Frenetic Pony

    Frenetic Pony Regular

    It's not really "useless"; watching textures pop in is never pleasant. It's just that people benchmarking games don't really know what VRAM pools like this do - heck, most users don't either. You up the setting and it looks no different from a standstill, so what gives?

    I don't imagine any next-gen id Tech games are going to bother with such large pool sizes though. We've got SSDs and decompression engines and whatnot now; no need to pre-stream and cache in VRAM anymore, assuming your streaming system is up to snuff. Besides, you need that VRAM for radiance caches, acceleration structures and so on.
     
    Dictator, Lightman and trinibwoy like this.
  16. CarstenS

    CarstenS Legend Subscriber

    Most of the current RT workloads are developed with a focus on what Nvidia hardware can and cannot do. So I think it's not (yet) the right time for such an absolute statement as your "Turing is definitely more capable RT wise than RDNA2. Period."

    For example, Doom Eternal benchmarks say otherwise. Don't be blinded by the better Ampere; RDNA2 can be quite competitive with Turing, even in a 550-vs-1100 comparison of the RX 6800 vs. the 2080 Ti.
    (anecdotal) proof:
    upload_2021-7-3_10-2-49.png
    source: https://www.pcgameshardware.de/Doom...ng-RTX-Update-DLSS-Benchmarks-Review-1374898/

    edit:
    anecdotal proof #2 (the 6700 XT is a bit ahead at 1080p, a bit behind at 3840×2160; the 2070S is a 539€ card, the 6700 XT 480€)
    upload_2021-7-3_10-15-48.png
    source: https://www.computerbase.de/2021-06...mm-lego-builders-journey-2560-1440-raytracing
     
    Last edited: Jul 3, 2021
  17. troyan

    troyan Regular

    RDNA2 needs around 50% more transistors than Turing to deliver the same performance in heavy RT workloads. From a technical standpoint, AMD's implementation is worse than Turing's.
     
    Last edited: Jul 3, 2021
    DavidGraham and PSman1700 like this.
  18. PSman1700

    PSman1700 Legend

    Not to forget: at the cost of the normal rendering budget.
     
  19. CarstenS

    CarstenS Legend Subscriber

    Yet another metric. Well, for that to have any meaning, you'd have to compare the full configs of Navi 21 and TU102. As it stands, the 6800 uses 60 out of 80 execution units (and 75% of the ROPs; 100% of the memory configuration, for completeness' sake), while TU102 uses 68 out of 72 (92% of memory and ROPs).
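    The enabled-unit fractions quoted above are quick to check. One assumption here: the "92% of memory" figure is read as the 2080 Ti using 11 of TU102's 12 memory channels (352-bit of a 384-bit bus).

    ```python
    # Enabled-unit fractions for the cut-down parts in this comparison.
    # Unit counts are from the post; memory channel split is an assumption.
    def pct(enabled, total):
        return round(100 * enabled / total, 1)

    print(pct(60, 80))   # RX 6800 CUs on Navi 21       -> 75.0
    print(pct(68, 72))   # 2080 Ti SMs on TU102         -> 94.4
    print(pct(11, 12))   # 2080 Ti memory channels      -> 91.7
    ```

    Which is the point: neither card exposes its full die, and by quite different fractions, so per-transistor comparisons between the shipped SKUs are muddy.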

    And not all x-tors go into raytracing:

    proof:
    upload_2021-7-3_11-58-36.png

    source: https://www.pcgameshardware.de/Graf...s/Rangliste-GPU-Grafikchip-Benchmark-1174201/
     
  20. troyan

    troyan Regular

    Use a 6700 XT. Same number of transistors as the RTX 2080 Ti. The 6800 has the same full memory configuration as the other Navi 21 GPUs.
     
    PSman1700 likes this.