No DX12 Software is Suitable for Benchmarking *spawn*

Discussion in 'Architecture and Products' started by trinibwoy, Jun 3, 2016.

  1. Digidi

    Regular Newcomer

    Joined:
    Sep 1, 2015
    Messages:
    392
    Likes Received:
    222
The mesh shader off test for AMD is strange. I thought AMD now converts everything to primitive shaders. Because of that, and because we know the throughput of Navi 21 is 80% higher than Navi 10's, I thought we would also see big gains in the mesh shader off test. :runaway:
     
    Remij and BRiT like this.
  2. techuse

    Regular Newcomer

    Joined:
    Feb 19, 2013
    Messages:
    742
    Likes Received:
    439
    The off results are very odd for all GPUs. Hopefully some explanation is uncovered.
     
  3. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    11,207
    Likes Received:
    1,780
    Location:
    New York
    The most likely explanation is that the benchmark was quickly thrown together with no optimization. With mesh shader off the card is seemingly idle for long periods of time with none of the functional units doing any work.

    With mesh shader on the viewport culling hardware and caches are seeing the most action. The VPC does "viewport transform, frustum culling, and perspective correction of attributes".
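For illustration, the fixed-function work named in that VPC description can be sketched numerically. This is not the hardware's actual logic, just a plain-Python model of the three steps (D3D clip-space conventions assumed):

```python
# Illustrative sketch (not the actual VPC hardware): frustum cull,
# perspective divide, and viewport transform over clip-space vertices.
def vpc(clip_verts, width, height):
    """clip_verts: iterable of (x, y, z, w) clip-space vertices.
    Returns screen-space (x, y) for vertices inside the frustum."""
    out = []
    for x, y, z, w in clip_verts:
        # Frustum cull: keep -w <= x <= w, -w <= y <= w, 0 <= z <= w
        if not (-w <= x <= w and -w <= y <= w and 0 <= z <= w):
            continue
        nx, ny = x / w, y / w                    # perspective divide
        sx = (nx * 0.5 + 0.5) * width            # viewport transform
        sy = (1.0 - (ny * 0.5 + 0.5)) * height   # y flipped for screen space
        out.append((sx, sy))
    return out
```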

[images: per-unit GPU utilization captures, mesh shaders off and on]
     
    Lightman, Krteq, Jawed and 3 others like this.
  4. Scott_Arm

    Legend

    Joined:
    Jun 16, 2004
    Messages:
    14,761
    Likes Received:
    6,895
    @trinibwoy That's pretty much what you'd expect though, isn't it? I'm assuming the non mesh-shader is kind of written worst case where you load in a bunch of vertex data, run your vertex shader and then let the fixed raster units do all of the culling. The vertex shader pipeline is not good for taking advantage of the parallelism of gpus because of the way vertices are indexed and output by the input assembler, or something like that. The mesh shader pipeline also removes the need to write out some index buffers to pass between shader stages. This test is probably written so it's not really doing much other than processing geometry and the mesh shaders do it way faster. AAA games would do more culling with compute shaders to get some similar benefits, but I don't know how you'd compare.
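As a rough illustration of the index-buffer point above: meshlets replace one big global index buffer with per-meshlet unique-vertex lists plus tiny local indices. A hedged Python sketch of building such meshlets (limits and names invented, not 3DMark's code):

```python
# Illustrative sketch: split a global triangle index buffer into meshlets,
# each with its own unique-vertex list and small local indices.
# The max_verts/max_tris limits here are made up.
def build_meshlets(indices, max_verts=64, max_tris=126):
    meshlets, verts, local, tris = [], [], {}, []
    for t in range(0, len(indices), 3):
        tri = indices[t:t + 3]
        new = [v for v in tri if v not in local]
        # Flush the current meshlet when its limits would be exceeded
        if len(verts) + len(new) > max_verts or len(tris) >= max_tris:
            meshlets.append({"vertices": verts, "triangles": tris})
            verts, local, tris = [], {}, []
            new = tri
        for v in new:
            if v not in local:
                local[v] = len(verts)   # assign a small local index
                verts.append(v)
        tris.append(tuple(local[v] for v in tri))
    if tris:
        meshlets.append({"vertices": verts, "triangles": tris})
    return meshlets
```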
     
  5. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    11,207
    Likes Received:
    1,780
    Location:
    New York
    I would still expect to see a lot of vertex attribute fetch and fixed function culling activity with mesh shaders off. Instead there's nothing.
     
    chris1515, Scott_Arm and BRiT like this.
  6. chris1515

    Legend Regular

    Joined:
    Jul 24, 2005
    Messages:
    6,114
    Likes Received:
    6,398
    Location:
    Barcelona Spain
From @Locuza on era:

[image: slide from Timur Kristóf of Valve, presented at XDC, September 2020]
     
    Krteq likes this.
  7. Dampf

    Newcomer

    Joined:
    Nov 21, 2020
    Messages:
    67
    Likes Received:
    142
    How should mesh shading be possible on RDNA1? It's a GPU hardware feature only available on current gen cards and Turing. RDNA1 uses primitive shaders AFAIK.

    Maybe he means running mesh shaders in software, similar to Raytracing on Pascal cards?
     
    BRiT likes this.
  8. OlegSH

    Regular Newcomer

    Joined:
    Jan 10, 2010
    Messages:
    522
    Likes Received:
    759
    https://support.benchmarks.ul.com/e...erview-of-the-3dmark-mesh-shader-feature-test

    "
    The test runs in two passes. The first pass uses the traditional geometry pipeline to provide a performance baseline. It uses compute shaders for LOD selection and meshlet culling. This reference implementation illustrates the performance overhead of the traditional approach.

    The second pass uses the mesh shader pipeline. An amplification shader identifies meshlets that are visible to the camera and discards all others. The LOD system selects the correct LOD for groups of meshlets in the amplification shader. This allows for a more granular approach to LOD selection compared with selecting the LOD only at the object level. The visible meshlets are passed to the mesh shaders, which means the engine can ignore meshlets that are not visible to the camera.
    "

The "traditional" baseline test is essentially just compute shaders with some odd bottleneck along the way; the traditional geometry pipeline is basically used as a passthrough to triangle setup and scan conversion for static geometry.
The VPC probably does a minimal amount of work here because culling has already been done before it in compute shaders, and the bottleneck is very likely somewhere in the CS too, or in the CS <--> traditional pipeline interop.

As for vertex attribute fetch, it's probably done in the vertex shader, the same way it's implemented in the Anvil Next engine.
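For illustration, the amplification-shader pass described in the quote amounts to per-meshlet visibility testing plus per-group LOD selection. A rough Python sketch of that idea (the real test uses HLSL; every name and threshold below is invented):

```python
# Illustrative sketch of amplification-shader-style meshlet culling:
# test each meshlet's bounding sphere against the frustum, and pick
# an LOD from camera distance. Names and thresholds are made up.
import math

def select_visible_meshlets(meshlets, frustum_planes, camera_pos, lod_distances):
    """meshlets: list of dicts with 'center' (x, y, z) and 'radius'.
    frustum_planes: (a, b, c, d) tuples with inward-facing normals.
    Returns (meshlet_index, lod) pairs for visible meshlets."""
    visible = []
    for i, m in enumerate(meshlets):
        cx, cy, cz = m["center"]
        # Sphere-frustum test: reject if fully outside any plane
        if all(a * cx + b * cy + c * cz + d >= -m["radius"]
               for a, b, c, d in frustum_planes):
            dist = math.dist(m["center"], camera_pos)
            lod = sum(dist > t for t in lod_distances)  # coarser when farther
            visible.append((i, lod))
    return visible
```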
     
  9. Scott_Arm

    Legend

    Joined:
    Jun 16, 2004
    Messages:
    14,761
    Likes Received:
    6,895
    So the non mesh shader one is more sophisticated than I thought.
     
  10. CarstenS

    Legend Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,362
    Likes Received:
    3,101
    Location:
    Germany
    I think a really important part of this article is the following: "Do note that we chose the best or close to best scores that were available at the time of writing. It is by no means an accurate representation of each architecture performance. It is only meant to provide a basic understanding of how fast each DirectX12 architecture might be."

IOW, they used UL database results, meaning totally non-comparable systems, limitations, drivers and maybe even settings. As nice as a quick overview is to have, I would not put much stock in this batch of results when trying to understand the behaviours of different architectures.
     
    Lightman, 3dcgi, Krteq and 2 others like this.
  11. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,510
    Likes Received:
    4,128
  12. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    11,207
    Likes Received:
    1,780
    Location:
    New York
  13. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    17,708
    Likes Received:
    7,710
    This quote from the article seems more appropriate for most people. :p

    Regards,
    SB
     
    Rodéric and BRiT like this.
  14. Lurkmass

    Regular Newcomer

    Joined:
    Mar 3, 2020
    Messages:
    309
    Likes Received:
    348
Actually, it can potentially be hardware accelerated on RDNA ...

It's important to remember that mesh shaders are just a software-defined stage in the API's model of the graphics pipeline. Primitive shaders on RDNA/RDNA2, on the other hand, are a real hardware-defined stage in the graphics pipeline. An implementation of mesh shaders on RDNA doesn't necessarily have to be a 1:1 mapping between software and hardware, so a portion of mesh shaders as a SW stage can be handled by RDNA's own native HW stage while the rest is done in software (emulation) ...

    On PS5 GNM, they expose similar functionality with respect to mesh shaders but we don't see them attempting to advertise this capability at all ...
     
  15. Rys

    Rys Graphics @ AMD
    Moderator Veteran Alpha

    Joined:
    Oct 9, 2003
    Messages:
    4,174
    Likes Received:
    1,545
    Location:
    Beyond3D HQ
It’s been in the works for a long time, but the focus was on the perf of the mesh shader path, especially as one of the first uses of it on PC. The “off” case leans heavily on ExecuteIndirect, used in a non-idiomatic way to drive the GPU, so it’s not as interesting. Nice EI test, but not how a production renderer would do anything.
     
    Newguy, Lightman, Krteq and 10 others like this.
  16. Digidi

    Regular Newcomer

    Joined:
    Sep 1, 2015
    Messages:
    392
    Likes Received:
    222
Isn't that a bit strange? Normally you wouldn't compare the best case from example A with the worst case from example B; you'd want the best conditions you can get for both cases?
     
    #1416 Digidi, Feb 13, 2021
    Last edited: Feb 13, 2021
  17. Rodéric

    Rodéric a.k.a. Ingenu
    Moderator Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,060
    Likes Received:
    955
    Location:
    Planet Earth.
    The most interesting comparison to me would be between standard rendering and mesh shader rendering, to see how much performance improves between what we currently use and what we'll use next...
     
  18. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,510
    Likes Received:
    4,128
The problem now lies in the non-mesh-shader path: it doesn't represent the best possible use of resources for the traditional approach, so the speed-ups with the mesh path are unrealistic?
     
  19. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    8,546
    Likes Received:
    2,890
    Location:
    Guess...
    With UE5 seemingly bypassing mesh shaders in favour of a pure compute path and achieving higher performance for the most part (with obviously stunning results), I'm wondering how much use we'll see of mesh shaders this generation. The same question would apply to the RT hardware vs Lumen which seemingly does something quite similar with less performance hit and without having to rely on the RT hardware.
     
  20. Dampf

    Newcomer

    Joined:
    Nov 21, 2020
    Messages:
    67
    Likes Received:
    142
Devs should just ditch vertex shading and go all in on mesh shading for games in development now. It would transform games: better performance, much finer LOD granularity, much finer geometric detail. The potential is huge! A big step towards CGI-like graphics.

I would like to see games using mesh shading at the end of 2021 or in 2022. The only issue is the userbase, which due to Covid may be smaller than expected. Still, it should get better in mid 2021.

UE5 is not skipping mesh shaders. The demo was running on PS5, so it was using primitive shaders as the HW acceleration path for Nanite. On PC and Xbox that path will very likely be mesh shaders.
     