GPU Ray Tracing Performance Comparisons [2021] *spawn*

Discussion in 'Architecture and Products' started by DavidGraham, Mar 29, 2021.

  1. trinibwoy

    trinibwoy Meh Legend

    Are you drawing that conclusion based on mixed workloads where RT is just one consideration? For reference the 3080 and 2080 Ti have the same number of RT “cores” yet the former is 85% faster in Optix.
     
    PSman1700 and HLJ like this.
  2. Jawed

    Jawed Legend

    Ah, you must be new around here.

    That's what DavidGraham suggested:

    That's what we do here at B3D: discuss what happens at the limits. If you don't like that, there's plenty of other forums.

    You have facts to base that conclusion on?

    Is Metro Exodus: Enhanced Edition at 8K with Ultra ray tracing and no DLSS at 10.8fps:

    [IMG] (source: TweakTown.com)

    faulty testing?
     
  3. iroboto

    iroboto Daft Funk Legend Subscriber

    yea unlikely.
    I think 8K broke it. Bottleneck differences may be the straw breaking the camel's back, so to speak, for the Nvidia card here. It might be useful to use Nvidia Nsight to see what's happening at 8K. Worthwhile to explore if @Clukos or another member has some time to try it. I'd be curious to see what's happening here.

    I would also be curious to see an equivalent AMD capture for the 6900XT, but I don't know the name of that tool or whether it's free to download and use.
     
    PSman1700 likes this.
  4. pharma

    pharma Veteran

    Yeah, or they used ReBAR as their default setup; at high resolutions it had a negative effect on 3090s in Cyberpunk.
     
    PSman1700 likes this.
  5. Frenetic Pony

    Frenetic Pony Regular

    Weird, though from everything on here I get the impression Nvidia has somehow dropped the ball on their driver support, or some combination of such. ReBAR works great on AMD and there's no "this driver works best for this game but it's not the newest driver" there. So... while it might be "detrimental", at some point one has to start laying some blame on Nvidia for the underlying problems here.

    If they'd clean up whatever issues they're having, no one would have to bring up driver versions or what exact settings you have for what exact games. It would certainly make life easier for everyone as well.
     
  6. pharma

    pharma Veteran

    ReBAR sometimes also results in performance losses on AMD cards. For any "pure" testing they should just disable it.
     
    PSman1700 likes this.
  7. HLJ

    HLJ Regular

    All you show is that AMD is better at 8K in unplayable settings, which will tell you nothing outside this specific edge case.
    When native 8K RT becomes a real option, it will not be with the current SKU design or even 1st and 2nd gen RT cores.

    But of course, if you want to pitch AMD's lesser RT solution as "better", this is all you've really got, but then you have to ignore all the real-world gaming that doesn't suit that agenda.


    [​IMG]
     
    Last edited: Jul 11, 2021
    PSman1700 likes this.
  8. DavidGraham

    DavidGraham Veteran

    The small gap between the 3090 and 6900XT with RT on in TweakTown's testing.

    In contrast to that, Computerbase shows that at 4K with Ultra RT the 3080 is 84% faster than the 6800XT; the 3090 stands to be even faster.
    https://www.computerbase.de/2021-03...berpunk-2077-raytracing-und-dlss-in-3840-2160

    PCGH shows the 3090 to be 92% faster than 6900XT @4K Ultra RT.
    https://www.pcgameshardware.de/Cybe...als/Update-120-Benchmarks-Raytracing-1369667/

    Frankly, TweakTown has not been a reliable source of testing for quite a long time.
     
    Last edited: Jul 11, 2021
    PSman1700 likes this.
  9. DavidGraham

    DavidGraham Veteran

    Not just that: here is PCGH's testing of the ART Mark RT demo. The 2080Ti remains faster than the 6900XT, and the 3090 is more than twice as fast.

    [​IMG]
    https://www.pcgameshardware.de/Rayt...cials/ART-Mark-Raytracing-Benchmarks-1371125/

    And here is the Boundary game demo, with the same story.

    [​IMG].
    https://www.tomshardware.com/reviews/amd-radeon-rx-6900-xt-review/3

    Control, same story.

    [​IMG]
    https://www.tomshardware.com/reviews/amd-radeon-rx-6900-xt-review/3

    COD Cold War too.

    [​IMG]
     
    Last edited: Jul 11, 2021
    pharma and PSman1700 like this.
  10. pjbliverpool

    pjbliverpool B3D Scallywag Legend

    ART Mark is interesting there, clearly leveraging some advantage of Ampere's RT implementation over Turing that we don't typically see in games. I'd love to understand more about what the difference is there and why we're not seeing it in games.
     
    pharma, PSman1700 and DavidGraham like this.
  11. Rootax

    Rootax Veteran

    I guess in games other limiting factors are more frequently involved? Like, maybe you get 1.6-2x the performance on the pure RT part (the RT core work), but the shading part after that is not 2x faster, so we never see the RT gain?
     
  12. pjbliverpool

    pjbliverpool B3D Scallywag Legend

    But we usually see the performance ratio between similar GPUs (say 2080Ti and 3070) remain pretty much the same with RT off and on. I'd expect the 3070 to get relatively faster with RT on if that part of the rendering process is faster on that card.
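
One way to make that comparison concrete is to convert fps into frame time and look at the extra milliseconds that enabling RT adds on each card. The sketch below is only an illustration: the fps figures are hypothetical placeholders, not measurements from this thread, and the function name is made up for the example.

```python
def rt_cost_ms(fps_rt_off: float, fps_rt_on: float) -> float:
    """Extra frame time in milliseconds that enabling RT adds."""
    if fps_rt_off <= 0 or fps_rt_on <= 0:
        raise ValueError("fps values must be positive")
    return 1000.0 / fps_rt_on - 1000.0 / fps_rt_off


# Hypothetical numbers for two cards with identical RT-off performance.
# These are NOT benchmark results; they only illustrate the arithmetic.
gpu_a = {"off": 100.0, "on": 50.0}
gpu_b = {"off": 100.0, "on": 62.5}

for name, fps in (("GPU A", gpu_a), ("GPU B", gpu_b)):
    cost = rt_cost_ms(fps["off"], fps["on"])
    ratio = fps["on"] / fps["off"]
    print(f"{name}: RT adds {cost:.1f} ms per frame, RT-on/RT-off fps ratio {ratio:.3f}")
```

If the RT part really were cheaper on one card, it should show up as a smaller added frame time and a higher RT-on/RT-off ratio, as GPU B does in this made-up case. A ratio that stays the same on both cards suggests the added work scales like the rest of the frame.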

    I'm wondering if there's some feature of Ampere RT that just isn't being used by games right now that the benchmark is using. If true, then I'd like to understand the likelihood of seeing that in future games.
     
    PSman1700 likes this.
  13. JoeJ

    JoeJ Veteran

    Pretty sure it's not. Games just have too much other work going on on the GPU, including rasterization and compute, but also non-accelerated RT tasks like BVH build/refit due to animation and streaming, denoising, shading, ray generation, and optimizations like ray binning.
    So what we see in game benchmarks depends more on the overall improvement of Ampere over Turing, and the RT cores, even if they are 4x faster, are just one factor of many. (IIRC, up to 4x was communicated, but I could be wrong.)
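
The argument above is essentially Amdahl's law: speeding up only one slice of the frame caps the overall gain. As a rough sketch, assuming the frame splits cleanly into an RT-core part and everything else (a simplification, since real GPU work overlaps), and taking the 4x figure from the recollection above rather than from any confirmed spec:

```python
def overall_speedup(rt_fraction: float, rt_speedup: float) -> float:
    """Amdahl's law: whole-frame speedup when only the RT-core share gets faster.

    rt_fraction: share of the original frame time spent in RT-core work (0..1).
    rt_speedup:  how much faster that share becomes (e.g. 4.0 for 4x).
    """
    if not 0.0 <= rt_fraction <= 1.0:
        raise ValueError("rt_fraction must be between 0 and 1")
    if rt_speedup <= 0.0:
        raise ValueError("rt_speedup must be positive")
    return 1.0 / ((1.0 - rt_fraction) + rt_fraction / rt_speedup)


# The fractions below are illustrative assumptions, not profiled values.
for fraction in (0.10, 0.25, 0.50, 0.90):
    print(f"RT share {fraction:.0%}: overall {overall_speedup(fraction, 4.0):.2f}x")
```

Even at a 50% RT share, a 4x faster RT core only gives 1.6x overall under this model, which fits with synthetic tests showing bigger gaps than games.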

    Personally I wonder much more about offline results. HW acceleration often shows a net win of only 2x. That's suspiciously small. I think it's a result of 'missing optimizations on all ends', e.g. using really complex materials, rebuilding every BVH each frame, huge uploads each frame, etc.
     
    pjbliverpool likes this.
  14. JoeJ

    JoeJ Veteran

    ART Mark setup from PCGH:
    [IMG]
    To highlight the RT core improvement, we would want to turn TAA off and increase those settings to the max, even if the resulting FPS ends up 'unplayable'.
     
    pjbliverpool likes this.
  15. JoeJ

    JoeJ Veteran

    Oh sorry, they include results with high settings:
    [​IMG]
    vs.
    [​IMG]
    Showing almost 3 x improvement. Makes sense.
     
    T2098 and pjbliverpool like this.
  16. OlegSH

    OlegSH Regular

    This benchmark features perfect mirror reflections and lighting, and this lighting is also visible in reflections, so I'm pretty sure this benchmark mostly tests shading performance rather than tracing.
    Perfect mirror rays are so cheap that Crytek were able to trace them efficiently in SW on last-gen consoles; obviously, HW RT is still the way to go on PC for the best quality and performance.
    Ampere has 2x FP32 SIMDs and 2x L1/texture bandwidth, so no wonder it works way better with shading-heavy workloads. Even if there is some divergence due to materials (though all the shiny balls in this demo seem to be using the same materials), more SIMDs would mean better performance on scenes with lots of divergence.
    I remember there were in-game Quake II RTX breakdowns of lighting, BVH and other passes somewhere here, and all compute-limited passes were close to 2x faster on the 3090 vs the 6900 XT.
     
    pjbliverpool and PSman1700 like this.
  17. JoeJ

    JoeJ Veteran

    Seems it even uses shadow maps (SM) for shadows (impression from looking at the settings).
    I assume the 50 bounces mean reflections of reflections, so that's no longer cheap. Divergence will also increase with each bounce, so it's not a bad test for HW RT.
    But idk if paths terminate after hitting some diffuse surface, and how long the average path really is.
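
To get a feel for how much the average path length could matter, here is a toy model. It assumes (and this is purely an assumption, nothing in the thread confirms how the benchmark works) that a path continues only when it hits a mirror surface, with a fixed independent probability per hit, and stops at the bounce limit.

```python
def expected_rays_per_path(mirror_prob: float, max_bounces: int) -> float:
    """Expected rays traced per pixel path in a toy mirror-only model.

    The primary ray is always traced. After each hit, the path spawns one
    reflection ray with probability mirror_prob (hypothetical parameter),
    otherwise it terminates. At most max_bounces reflection rays are traced.
    """
    if not 0.0 <= mirror_prob <= 1.0:
        raise ValueError("mirror_prob must be between 0 and 1")
    if max_bounces < 0:
        raise ValueError("max_bounces must be non-negative")
    expected = 0.0
    alive = 1.0  # probability that the path reaches this ray at all
    for _ in range(max_bounces + 1):  # primary ray plus reflection rays
        expected += alive
        alive *= mirror_prob
    return expected


# Illustrative probabilities only; the real scene statistics are unknown.
for p in (0.3, 0.6, 0.9, 1.0):
    print(f"mirror hit probability {p:.1f}: {expected_rays_per_path(p, 50):.2f} rays per path")
```

Under this model the 50-bounce limit only matters when nearly every hit is a mirror; with lower probabilities most paths end after a handful of rays, so the average path length could be far below 50.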
     
  18. OlegSH

    OlegSH Regular

    It seems this benchmark is built on UE4, so one can easily check where it spends most of its time by using the Universal Unreal Engine 4 Unlocker to run the "stat GPU" console command. Nsight profiling would be even more revealing.
     
    PSman1700, pharma and Rootax like this.
  19. [​IMG]

    https://forum.beyond3d.com/posts/2185240
     
  20. JoeJ

    JoeJ Veteran

    Wow, seems AMD's BVH build needs some work. Also, the overall loss on denoising is unexpected to me.

    Related to RT core benchmarks, Q2 RTX has (or had?) a mode with all mirror surfaces and 10 bounces. That would be ideal because denoising is off.
     