No DX12 Software is Suitable for Benchmarking *spawn*

Discussion in 'Architecture and Products' started by trinibwoy, Jun 3, 2016.

  1. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    9,792
    Likes Received:
    3,959
    Location:
    Finland
    Well, there's the RTX 30 series being slower than the RTX 20 series and the GTX 1660s, and the latter getting twice the FPS of the RX 6000 series in traditional rendering, which should raise an eyebrow or two.
     
  2. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    8,546
    Likes Received:
    2,890
    Location:
    Guess...
    All this tells me is that Turing put more emphasis on traditional geometry performance than Ampere does. Which makes sense given it's an older architecture.

    Clearly the most important metric in this mesh shader test is the mesh shader performance.
     
  3. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,533
    Likes Received:
    492
    Location:
    Varna, Bulgaria
    The perf scaling between the SKUs definitely underlines how AMD and NV have been implementing geometry processing. The RDNA cards are grouped in a narrow range, regardless of their salvage status, while NV hardware is going up and down predictably with the SM count.
     
  4. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    2,214
    Likes Received:
    1,617
    Location:
    msk.ru/spb.ru
    There are several weird results there.
    1. All RDNA2 GPUs are showing the same results basically with MS off and on hinting at the fact that they are limited by some common part in both cases - L3 cache maybe? I'd expect some gains here with future drivers, at least between 6800 and 6800XT+
    2. Turing is faster than Ampere in the "traditional" pipeline. It's an interesting result which may point to Ampere being balanced differently in its geometry processing. Or maybe it's the drivers too.
    3. All Turing chips with MS off are awfully close to each other, with the exception of the 1650S, which begs the question of why it is behaving so differently here. It gains +330% with MS on while the 1660 gets only +75%.
    4. The same can be said about Ampere, with the 3060Ti being on par with the 3090 when MS are off. This doesn't look right from what we know of these chips' specs. Then again, maybe we are seeing the FF setup limitations here which MS should help to avoid?
    So? It doesn't mean that these results are invalid. It begs the questions I've asked above though.

    If this is actually how cards will stack up in games with MS as well, then I wonder if 2060 level will be where games stop with MS utilization, as this seems to be in line with what consoles should be able to achieve. (Then again, with PS5 not supporting MS at all, it's a moot point anyway.)
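As a quick sanity check on the gains quoted in point 3, a "+N%" improvement can be turned into a plain speedup multiplier. This is only an arithmetic sketch; the function name `gain_to_multiplier` is made up for illustration, and the two percentages are the ones quoted in the post.

```python
def gain_to_multiplier(percent_gain: float) -> float:
    """Convert a '+N%' improvement into an 'N times as fast' multiplier."""
    return 1.0 + percent_gain / 100.0


# Gains quoted in the post for mesh shaders on vs. off.
gtx_1650s = gain_to_multiplier(330)  # +330% means 4.3 times as fast
gtx_1660 = gain_to_multiplier(75)    # +75% means 1.75 times as fast

print(f"1650S: {gtx_1650s:.2f}x, 1660: {gtx_1660:.2f}x")
print(f"relative benefit: {gtx_1650s / gtx_1660:.2f}x")
```

Put this way, the 1650S benefits roughly two and a half times as much from mesh shaders as the 1660 does, which is why it stands out.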
     
  5. Scott_Arm

    Legend

    Joined:
    Jun 16, 2004
    Messages:
    14,761
    Likes Received:
    6,896
    It's kind of curious though, isn't it? At first I thought Turing was better because it might clock higher and the raster units operate per clock. But the 1660 besting Ampere? Either this data aggregate is not reliable or they've stripped back the fixed-function raster units a bit, which would be odd considering it'll take a long time for the industry to transition away from the old vertex shader pipeline.
     
  6. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,032
    Likes Received:
    15,781
    Location:
    The North
    Indeed. Mesh shaders only leverage the CUs/SMs for their processing.
     
  7. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    2,214
    Likes Received:
    1,617
    Location:
    msk.ru/spb.ru
    But why is 6800=6900XT then?
     
    NightAntilli and BRiT like this.
  8. Scott_Arm

    Legend

    Joined:
    Jun 16, 2004
    Messages:
    14,761
    Likes Received:
    6,896
    The AMD results look weird. They're not scaling with the number of CUs, which you'd think they would. The whole point of the mesh shader pipeline is to take advantage of the width of the GPU and do things in parallel.
     
    #1388 Scott_Arm, Feb 11, 2021
    Last edited: Feb 11, 2021
    NightAntilli and BRiT like this.
  9. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,032
    Likes Received:
    15,781
    Location:
    The North
    I don't know. Typically it's mesh shader controller -> compute. I'm unsure if there is a bottleneck on the controller/command processor itself, with the benchmark pushing it beyond what it's designed to handle, and thus we see these results. At least, that's the best I can spitball if these results are correct.
     
    #1389 iroboto, Feb 11, 2021
    Last edited: Feb 11, 2021
  10. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    13,032
    Likes Received:
    15,781
    Location:
    The North
    Better answer here:

    [IMG]
     
    DegustatoR, DavidGraham, BRiT and 3 others like this.
  11. Man from Atlantis

    Regular

    Joined:
    Jul 31, 2010
    Messages:
    920
    Likes Received:
    734
    The 2080Ti is a great card. People were selling it for as low as $450 for fear of value loss before the Ampere launch.
     
  12. Scott_Arm

    Legend

    Joined:
    Jun 16, 2004
    Messages:
    14,761
    Likes Received:
    6,896
    Looking at the results for the 3080, the benefits are kind of absurd.

    15.2 ms per frame to 1.66 ms per frame.
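To put those two frame times in more familiar terms, here is a small arithmetic sketch converting them to frames per second and a speedup factor. The helper name `fps` is just illustrative; the two frame times are the ones quoted above.

```python
def fps(frame_time_ms: float) -> float:
    """Frames per second for a given frame time in milliseconds."""
    return 1000.0 / frame_time_ms


off_ms = 15.2  # quoted 3080 frame time, mesh shaders off
on_ms = 1.66   # quoted 3080 frame time, mesh shaders on

speedup = off_ms / on_ms
print(f"off: {fps(off_ms):.1f} FPS, on: {fps(on_ms):.1f} FPS")
print(f"speedup: {speedup:.1f}x")
# off: 65.8 FPS, on: 602.4 FPS
# speedup: 9.2x
```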
     
  13. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    2,214
    Likes Received:
    1,617
    Location:
    msk.ru/spb.ru
    Yeah, I suspected as much. RDNA2 seems to be really dependent on driver optimizations.
     
    DavidGraham and iroboto like this.
  14. Dampf

    Newcomer

    Joined:
    Nov 21, 2020
    Messages:
    67
    Likes Received:
    142
    VRAM. The 1650 is the only 4 GB card, and I measured around 5 GB of memory allocation on my system when the test was not using mesh shaders.

    Once the test enabled mesh shading, VRAM allocation dropped dramatically to around 1 GB, so the 1650 is no longer VRAM bottlenecked. It gets a huge speedup from that and from mesh shaders, which results in the 400% difference. At least that's my theory, but it's pretty rock solid.
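The theory above boils down to comparing the measured allocation against the card's capacity. A minimal sketch of that comparison follows; the function name `vram_overcommit_gb` is hypothetical, and the allocation figures are the approximate ones from the post, measured on a different system than the 4 GB card, so the real numbers on that card may differ.

```python
def vram_overcommit_gb(allocated_gb: float, capacity_gb: float) -> float:
    """How much of an allocation does not fit in local VRAM (0 if it all fits)."""
    return max(0.0, allocated_gb - capacity_gb)


CAPACITY_GB = 4.0  # the 4 GB card from the post

# Approximate allocations reported in the post.
spill_ms_off = vram_overcommit_gb(5.0, CAPACITY_GB)  # mesh shaders off
spill_ms_on = vram_overcommit_gb(1.0, CAPACITY_GB)   # mesh shaders on

print(f"MS off: {spill_ms_off:.1f} GB over capacity")
print(f"MS on:  {spill_ms_on:.1f} GB over capacity")
```

Anything over capacity has to live in system memory and be fetched over PCIe, which would explain a disproportionate penalty with mesh shaders off on that one card.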
     
  15. JoeJ

    Veteran Newcomer

    Joined:
    Apr 1, 2018
    Messages:
    1,139
    Likes Received:
    1,291
    It could be that occlusion culling is only implemented with mesh shaders on.
    Do we know anything about how they do it?
     
  16. Krteq

    Newcomer

    Joined:
    May 5, 2020
    Messages:
    58
    Likes Received:
    100
    iroboto likes this.
  17. Scott_Arm

    Legend

    Joined:
    Jun 16, 2004
    Messages:
    14,761
    Likes Received:
    6,896
    I would guess they don't do any sophisticated occlusion culling with mesh shaders off (compute shader based etc). They're probably comparing input assembler -> vertex shader -> rasterizer to amplification shader -> mesh shader -> rasterizer. So it's not the gain we'd expect to see in games that are already leveraging more advanced forms of frustum and occlusion culling.
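For readers unfamiliar with what culling in the amplification stage means, here is a rough sketch of per-meshlet frustum culling using bounding spheres. It is written in Python purely for illustration (a real implementation would be shader code), it is not taken from the benchmark, and the box-shaped "frustum", the meshlet list, and the name `sphere_visible` are all made up for the example.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]
# Plane as (nx, ny, nz, d); a point p is inside when dot(n, p) + d >= 0.
Plane = Tuple[float, float, float, float]


def sphere_visible(center: Vec3, radius: float, planes: Sequence[Plane]) -> bool:
    """Return False only if the sphere lies entirely outside some plane."""
    for nx, ny, nz, d in planes:
        signed_dist = nx * center[0] + ny * center[1] + nz * center[2] + d
        if signed_dist < -radius:
            return False
    return True


# Hypothetical box-shaped view volume: -1 <= x, y <= 1 and 0 <= z <= 10.
planes = [
    (1, 0, 0, 1), (-1, 0, 0, 1),   # left, right
    (0, 1, 0, 1), (0, -1, 0, 1),   # bottom, top
    (0, 0, 1, 0), (0, 0, -1, 10),  # near, far
]

# Hypothetical meshlet bounding spheres: (center, radius).
meshlets = [
    ((0.0, 0.0, 5.0), 0.5),   # fully inside
    ((5.0, 0.0, 5.0), 0.5),   # far off to the right
    ((0.0, 0.0, -3.0), 0.5),  # behind the near plane
    ((1.2, 0.0, 5.0), 0.5),   # straddles the right plane
]

visible = [m for m in meshlets if sphere_visible(m[0], m[1], planes)]
print(f"{len(visible)} of {len(meshlets)} meshlets survive culling")
```

Only the surviving meshlets would be handed on to mesh shader groups, so the rasterizer never sees geometry from the rejected ones. A vertex shader path without an equivalent pre-pass processes every vertex regardless.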
     
    chris1515 and 3dcgi like this.
  18. DegustatoR

    Veteran

    Joined:
    Mar 12, 2002
    Messages:
    2,214
    Likes Received:
    1,617
    Location:
    msk.ru/spb.ru
  19. Scott_Arm

    Legend

    Joined:
    Jun 16, 2004
    Messages:
    14,761
    Likes Received:
    6,896
    That looks more reasonable. I imagine it will scale with the number of CUs as well, instead of being the same result across the product line.
     
  20. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,286
    Likes Received:
    1,551
    Location:
    London
    Notice that NVidia prefers the smallest possible workgroup size (32) and AMD prefers the largest possible (128/256 - 256 if on console). Makes a huge difference on AMD.
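To make the workgroup-size point concrete, the sketch below counts how many groups a dispatch needs and how many hardware waves each group occupies. The total thread count is an arbitrary illustrative number and `dispatch_stats` is a hypothetical helper. The wave sizes assume 32-wide warps on NVIDIA and wave32 on RDNA (which can also run wave64).

```python
import math
from typing import Tuple


def dispatch_stats(total_threads: int, group_size: int, wave_size: int) -> Tuple[int, int]:
    """Return (number of groups, waves per group) for a 1D dispatch."""
    groups = math.ceil(total_threads / group_size)
    waves_per_group = math.ceil(group_size / wave_size)
    return groups, waves_per_group


TOTAL_THREADS = 1_000_000  # illustrative workload, not from the benchmark

for group_size in (32, 128, 256):
    groups, waves = dispatch_stats(TOTAL_THREADS, group_size, wave_size=32)
    print(f"group size {group_size:3d}: {groups:6d} groups, {waves} wave(s) each")
```

With a group size of 32 the same work is split into eight times as many groups as with 256, so any fixed per-group cost is paid far more often. That is one plausible reading of why the preferred sizes differ so much between the vendors.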
     
    Lightman and BRiT like this.
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.