GPU Ray Tracing Performance Comparisons [2021] *spawn*

Discussion in 'Architecture and Products' started by DavidGraham, Mar 29, 2021.

  1. BRiT

    BRiT (>• •)>⌐■-■ (⌐■-■)
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    20,502
    Likes Received:
    24,397
    Can we stop being total assholes to one another because of the need to worship an arbitrary platform of choice?

    Remain civil or bans will be handed out like talk show give-aways. Don't force the mods to get involved.
     
  2. iroboto

    iroboto Daft Funk
    Legend Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    14,833
    Likes Received:
    18,632
    Location:
    The North
    They aren't.

    They're claiming the console RT API is better than DXR at the moment because they have more control over RT on consoles, and it gives them the flexibility to generate the results they want without extravagant workarounds to optimize for the hardware.
     
  3. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,708
    Likes Received:
    2,132
    Location:
    London
    Faster generic speed? All the stars aligned with Ampere versus Turing when it comes to ray tracing performance. NVidia seemed to indicate that raw ray tracing performance in Ampere is 2x+ that of Turing and I believe there were some non-gaming benchmarks that demonstrated this.

    But in gaming, using just the DXR tests from this page:

    GeForce 466.77 Driver Performance Analysis – Using Ampere and Turing (babeltechreviews.com)

    in the 7 games tested at 1440p native resolution (no DLSS) with "maximum" settings for 3080 versus 2080Ti (two non-reference cards at stock settings):
    • 39% faster on average
    • 41% for 1% lows
    • 42% for 0.2% lows
    So, where is generic gaming performance going to come from? The next 40% is looking like it's going to be very difficult because bandwidth and compute have hit the wall.

    In my opinion this is analogous to geometry shading versus mesh shading. It's a slow motion car crash that in the worst case is going to take 10 years to play out.

    It's worth remembering that AMD with a compute-SIMD "slow" approach has equalled NVidia's dedicated-MIMD in Turing. With Ampere, NVidia gained 40% on Turing in games, but it seems likely there are no major gains to be had from "better MIMD" in Lovelace.

    I don't know if there was ever an in-depth analysis of how NVidia gained 2x+ raw ray tracing performance in Ampere. Links? Such an analysis would be the most productive baseline for a discussion of where NVidia can go from here, hence "a roadmap of generic speed".
     
    Rootax likes this.
  4. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,976
    Likes Received:
    5,210
    Nope, once you push the RT workload upwards, Turing pulls even further ahead of RDNA2, as demonstrated in Minecraft, Quake 2, Call of Duty: Cold War, Cyberpunk, Control, etc.

    Here, an almost 2x increase from 2080Ti to 3090 in many pro RT apps:
    https://techgage.com/article/mid-2021-gpu-rendering-performance/1/
    https://techgage.com/article/mid-2021-gpu-rendering-performance/2/
     
  5. troyan

    Regular

    Joined:
    Sep 1, 2015
    Messages:
    603
    Likes Received:
    1,122
    Double FP32 throughput, better L1 Cache, double triangle intersection, 50% more bandwidth with GDDR6X and 50% more transistors.
     
    chris1515 and PSman1700 like this.
  6. PSman1700

    Legend

    Joined:
    Mar 22, 2019
    Messages:
    7,118
    Likes Received:
    3,088
    Hence the use of only upscaled reflections in past-previous generation games.
     
  7. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,708
    Likes Received:
    2,132
    Location:
    London
    Why is 6900XT nearly twice as fast as 3090 here:

    [Image: TweakTown.com benchmark chart]

    with no DLSS/CAS? 10.6fps versus 6.6fps.
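As a quick sanity check on the two frame rates quoted above, here is a minimal arithmetic sketch. The function name `speedup` is purely illustrative, and the only inputs are the 10.6 fps and 6.6 fps figures from the post; nothing here comes from the chart itself.

```python
def speedup(fps_fast: float, fps_slow: float) -> float:
    """Return how many times faster the first frame rate is than the second."""
    return fps_fast / fps_slow


# Figures quoted in the post: 6900XT at 10.6 fps, 3090 at 6.6 fps.
ratio = speedup(10.6, 6.6)             # about 1.61x
percent_faster = (ratio - 1.0) * 100   # about 61% faster

# The same gap expressed as frame time in milliseconds.
frame_ms_6900xt = 1000.0 / 10.6
frame_ms_3090 = 1000.0 / 6.6
```

On these numbers the gap is closer to 1.6x than 2x, and both cards are far below a playable frame rate in this test.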

    Out of that list only "triangle intersection rate" (did you mean ray traversal rate, or is this a specific part of ray traversal that you're referring to?) and bandwidth are on topic for Ampere's increased performance over Turing.

    Is the rate a side-effect solely of more ray tracing cores?
     
  8. HLJ

    HLJ
    Regular

    Joined:
    Aug 26, 2020
    Messages:
    529
    Likes Received:
    869
    RTX 2080 Ti = 68 RT cores.
    RTX 3080 = 68 RT cores.

    Those cores are "not equal".
    Is this what you are looking for?
    Ampere vs Turing real time ray tracing overhead – Coreteks
     
    Lightman, Jawed and PSman1700 like this.
  9. HLJ

    HLJ
    Regular

    Joined:
    Aug 26, 2020
    Messages:
    529
    Likes Received:
    869
    This?
    upload_2021-7-10_15-2-54.png

    Talk about finding a border case of unplayable architecture limits.
     
    pharma, Rootax and PSman1700 like this.
  10. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,976
    Likes Received:
    5,210
    Faulty testing for sure.
     
    PSman1700 likes this.
  11. CarstenS

    Legend Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,800
    Likes Received:
    3,920
    Location:
    Germany
    Those were quoted with respect to professional apps; the closest game, at 1.8x, was Q2 RTX:
    upload_2021-7-10_16-17-19.png

    And regarding where the performance comes from in addition to 2x the intersection testing rate claimed by Nvidia, there's this:
    upload_2021-7-10_16-20-21.png
     
    Lightman, Jawed, PSman1700 and 4 others like this.
  12. HLJ

    HLJ
    Regular

    Joined:
    Aug 26, 2020
    Messages:
    529
    Likes Received:
    869
    Turing RT core:
    upload_2021-7-10_16-29-39.png

    Ampere RT core:
    upload_2021-7-10_16-30-23.png

    This makes for this:
    [​IMG]

    [​IMG]

    [​IMG]

    So they did some real work between Gen1 and Gen2.
     
    PSman1700, pjbliverpool, JoeJ and 2 others like this.
  13. HLJ

    HLJ
    Regular

    Joined:
    Aug 26, 2020
    Messages:
    529
    Likes Received:
    869
    This also tells something:
    upload_2021-7-10_16-41-17.png
     
    Lightman, PSman1700, pharma and 2 others like this.
  14. Davros

    Legend

    Joined:
    Jun 7, 2004
    Messages:
    17,879
    Likes Received:
    5,330
    Motion blur - Nvidia reintroduce the T-Buffer
     
  15. JoeJ

    Veteran

    Joined:
    Apr 1, 2018
    Messages:
    1,523
    Likes Received:
    1,772
    I almost forgot about the added motion blur (MB) support. It's an unexpected improvement over Turing.
    I assume they target offline rendering applications, not games? Does DXR even support it?
    And if offline is the target, I think one serious limitation here was the limited number of instancing levels. So are there improvements there too, and maybe all of these things are exposed through OptiX?
     
  16. Subtlesnake

    Regular

    Joined:
    Mar 18, 2005
    Messages:
    347
    Likes Received:
    126
    For gaming, Nvidia only ever said to expect up to 2x 2080 performance.
     
    Jawed likes this.
  17. TopSpoiler

    Newcomer

    Joined:
    Aug 18, 2020
    Messages:
    74
    Likes Received:
    176
    I can't post a link yet, but you can see the pure RT performance in GPSnoopy's RayTracingInVulkan demo (you can google it).
    The reality is that the 6900XT is slower than the 2080Ti. The 6900XT is only faster if the scene has very shallow BVH depth and no triangle geometry.
     
    DavidGraham and PSman1700 like this.
  18. HLJ

    HLJ
    Regular

    Joined:
    Aug 26, 2020
    Messages:
    529
    Likes Received:
    869
    I don't think it is for gaming yet...3rd gen will be quite telling for where they think RT is going.
     
    PSman1700 likes this.
  19. JoeJ

    Veteran

    Joined:
    Apr 1, 2018
    Messages:
    1,523
    Likes Received:
    1,772
    Found a video, so Blender can use it:
     
  20. JoeJ

    Veteran

    Joined:
    Apr 1, 2018
    Messages:
    1,523
    Likes Received:
    1,772
    I think it could make sense for games too, even while we are still in the hybrid era. Primary (rasterized) stuff can use screen-space (SS) fakes for motion blur, but those don't work well for shadows and reflections.
    And this could fix it. Maybe already practical in games which do not have many RT effects.

    Question is if this requires BVH refit to extend boxes with motion, or if RT Cores can do this on the fly avoiding such problem.
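The two options in that question can be sketched in a few lines. This is only an illustration under stated assumptions: linear motion between shutter open (t=0) and shutter close (t=1), axis-aligned boxes, and hypothetical names (`AABB`, `swept_box`, `box_at_time`, `ray_hits`). It says nothing about what the RT cores actually do internally.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class AABB:
    lo: Vec3
    hi: Vec3


def swept_box(a: AABB, b: AABB) -> AABB:
    """Option 1 (refit): one conservative box covering the whole motion."""
    lo = tuple(min(x, y) for x, y in zip(a.lo, b.lo))
    hi = tuple(max(x, y) for x, y in zip(a.hi, b.hi))
    return AABB(lo, hi)


def box_at_time(a: AABB, b: AABB, t: float) -> AABB:
    """Option 2 (on the fly): interpolate the box to the ray's time stamp."""
    lo = tuple((1.0 - t) * x + t * y for x, y in zip(a.lo, b.lo))
    hi = tuple((1.0 - t) * x + t * y for x, y in zip(a.hi, b.hi))
    return AABB(lo, hi)


def ray_hits(box: AABB, origin: Vec3, direction: Vec3) -> bool:
    """Standard slab test for a ray starting at origin (t >= 0)."""
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box.lo, box.hi):
        if d == 0.0:
            # Ray is parallel to this slab: it must start inside it.
            if o < lo or o > hi:
                return False
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True


# A unit box that moves 4 units along x during the shutter interval.
start = AABB((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
end = AABB((4.0, 0.0, 0.0), (5.0, 1.0, 1.0))

# A ray travelling along +z through x = 2.5, the middle of the motion path.
origin, direction = (2.5, 0.5, -1.0), (0.0, 0.0, 1.0)

hit_swept = ray_hits(swept_box(start, end), origin, direction)
hit_at_t0 = ray_hits(box_at_time(start, end, 0.0), origin, direction)
hit_at_half = ray_hits(box_at_time(start, end, 0.5), origin, direction)
```

The swept box reports a hit for every ray time, including times when the object is nowhere near the ray, so fast-moving geometry inflates traversal cost. The interpolated box only reports a hit when the ray's time stamp puts the object in its path, at the price of storing two boxes per node and doing the interpolation during traversal.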
     
    PSman1700, pharma and HLJ like this.