How is Driver/Game optimization done?

Discussion in '3D Hardware, Software & Output Devices' started by pMax, Nov 21, 2014.

  1. pMax

    Regular Newcomer

    Joined:
    May 14, 2013
    Messages:
    327
    Likes Received:
    22
    Location:
    out of the games
    I was just wondering, given the recent NVIDIA GameWorks news and the like:

    * Shaders are usually precompiled to bytecode (DX ones, say), not to native GPU code, so there should be no need for GPU code reversing, right? Or are solutions like GameWorks likely to ship natively precompiled shaders?
    * Is most of the work done with GPU profiling, to understand the call order/resource usage and re-optimize those patterns?
    * Is tweaking usually done in the shader compiler, or is it more targeted at the draw/indirect call batch flow?
     
  2. Davros

    Legend

    Joined:
    Jun 7, 2004
    Messages:
    16,168
    Likes Received:
    3,392
    Humus would be the ideal guy to answer that; I'd love him to do an article on what devrel does.
    I'd also love to know the full details of the AMD/Rage debacle.
     
  3. 3dcgi

    Veteran Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    2,446
    Likes Received:
    302
    Shaders are compiled to DX assembly by the API, and that's what an IHV's shader compiler receives. Since GameWorks works with non-NVIDIA hardware, it must still compile to DX asm.
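
    A minimal sketch of that round trip, assuming the d3dcompiler API (the HLSL source here is just a toy example): compile HLSL to DXBC, then disassemble the DXBC to see the DX asm the driver's compiler is handed.

    Code:
    // Compile an HLSL pixel shader to DX bytecode (DXBC), then disassemble it.
    // The DXBC blob is the vendor-neutral form the IHV driver receives.
    #include <d3dcompiler.h>
    #include <cstdio>
    #include <cstring>
    #pragma comment(lib, "d3dcompiler.lib")

    int main()
    {
        const char* src =
            "float4 main(float4 col : COLOR) : SV_Target\n"
            "{ return col * 0.5f; }\n";

        ID3DBlob* bytecode = nullptr;
        ID3DBlob* errors = nullptr;

        // HLSL -> DXBC; "ps_5_0" selects the SM5 bytecode target.
        if (FAILED(D3DCompile(src, strlen(src), nullptr, nullptr, nullptr,
                              "main", "ps_5_0", 0, 0, &bytecode, &errors)))
        {
            if (errors) printf("%s\n", (const char*)errors->GetBufferPointer());
            return 1;
        }

        // DXBC -> DX asm text, i.e. what an IHV backend starts from.
        ID3DBlob* disasm = nullptr;
        if (SUCCEEDED(D3DDisassemble(bytecode->GetBufferPointer(),
                                     bytecode->GetBufferSize(), 0, nullptr,
                                     &disasm)))
            printf("%s\n", (const char*)disasm->GetBufferPointer());

        if (disasm) disasm->Release();
        bytecode->Release();
        return 0;
    }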
     
  4. Dominik D

    Regular

    Joined:
    Mar 23, 2007
    Messages:
    782
    Likes Received:
    22
    Location:
    Wroclaw, Poland
    Part of your question was tackled here:
    https://forum.beyond3d.com/threads/driver-optimizations-and-the-api.56087/

    DX bytecode is a simple C-to-something-like-asm representation with some of the obvious optimizations already performed (constant folding and such). It's feasible that you could manipulate the input shader to get a better bytecode representation for the driver compiler to consume, if you know what's better for that compiler. There are some issues with the bytecode itself (some of the original context is missing, vector instructions), so if your tool has enough know-how about the target it can do some crazy shit with the input (or simply construct better bytecode, jettisoning the official compiler altogether). And that's just shaders. There are many more optimizations you can perform if you can measure your app from top to bottom.
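
    A small illustration of the "manipulate the input shader" idea, again assuming the d3dcompiler API (the dump() helper and the toy shaders are made up for this sketch): the official compiler cannot legally fold sin(x)*sin(x) + cos(x)*cos(x) into 1 under IEEE floating-point rules, but a rewriter that knows the math is safe in context can hand the driver far simpler bytecode.

    Code:
    // Two mathematically equivalent shaders that compile to very different
    // bytecode: the official compiler must keep the sincos math, while a
    // context-aware rewrite collapses the whole thing to a constant.
    #include <d3dcompiler.h>
    #include <cstdio>
    #include <cstring>
    #pragma comment(lib, "d3dcompiler.lib")

    static void dump(const char* src)  // compile + print the DX asm
    {
        ID3DBlob *code = nullptr, *disasm = nullptr;
        if (FAILED(D3DCompile(src, strlen(src), nullptr, nullptr, nullptr,
                              "main", "ps_5_0", 0, 0, &code, nullptr)))
            return;
        if (SUCCEEDED(D3DDisassemble(code->GetBufferPointer(),
                                     code->GetBufferSize(), 0, nullptr,
                                     &disasm)))
            printf("%s---\n", (const char*)disasm->GetBufferPointer());
        if (disasm) disasm->Release();
        code->Release();
    }

    int main()
    {
        // IEEE semantics stop the compiler from applying sin^2 + cos^2 = 1,
        // so the disassembly keeps the trig instructions.
        dump("float4 main(float4 c : COLOR) : SV_Target"
             "{ return sin(c)*sin(c) + cos(c)*cos(c); }");
        // The rewrite a tool with more context could substitute.
        dump("float4 main(float4 c : COLOR) : SV_Target"
             "{ return float4(1, 1, 1, 1); }");
        return 0;
    }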
     
  5. pMax

    Regular Newcomer

    Joined:
    May 14, 2013
    Messages:
    327
    Likes Received:
    22
    Location:
    out of the games
    Hmm... yeah, but I haven't seen any IDA Pro module for DX bytecode, nor for AMD/NVIDIA assembly, so I was wondering what kind of reversing (if any) is done on these platforms, and why.
     
  6. Dominik D

    Regular

    Joined:
    Mar 23, 2007
    Messages:
    782
    Likes Received:
    22
    Location:
    Wroclaw, Poland
    DX bytecode is documented, and you can get that data out easily. There are a bunch of DDI-intercepting tools that can be used to look into what's happening, and DXGK components expose ETW logs that can be easily consumed (and are, by tools like GPUView). Some (most?) of the architectures are sufficiently documented to construct a body of knowledge about the inner workings of their ALUs. Most of the pieces are out there. :)
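
    For instance, the DXBC container layout is publicly known ("DXBC" magic, a 16-byte checksum, a version word, the total size, a chunk count, then a chunk-offset table), so pulling the pieces out takes a few dozen lines. A standalone sketch of such a chunk lister, with field offsets per that known layout:

    Code:
    // List the chunks (RDEF, ISGN, OSGN, SHEX, ...) inside a DXBC blob.
    #include <cstdint>
    #include <cstdio>
    #include <cstring>
    #include <vector>

    static void listChunks(const std::vector<uint8_t>& blob)
    {
        // Header: "DXBC" magic, 16-byte checksum, uint32 version, uint32
        // total size, uint32 chunk count, then chunkCount uint32 offsets.
        if (blob.size() < 32 || memcmp(blob.data(), "DXBC", 4) != 0) {
            printf("not a DXBC container\n");
            return;
        }
        uint32_t chunkCount;
        memcpy(&chunkCount, blob.data() + 28, 4);
        if (32 + 4ull * chunkCount > blob.size()) return;  // malformed blob

        for (uint32_t i = 0; i < chunkCount; ++i) {
            uint32_t offset, size;
            memcpy(&offset, blob.data() + 32 + 4 * i, 4);  // offset table
            if (offset + 8 > blob.size()) break;           // malformed blob
            memcpy(&size, blob.data() + offset + 4, 4);    // payload size
            // Each chunk starts with a 4-byte FourCC tag.
            printf("%.4s  %u bytes\n",
                   (const char*)(blob.data() + offset), size);
        }
    }

    int main(int argc, char** argv)
    {
        if (argc < 2) { printf("usage: %s shader.dxbc\n", argv[0]); return 1; }
        FILE* f = fopen(argv[1], "rb");
        if (!f) return 1;
        std::vector<uint8_t> blob;
        uint8_t buf[4096];
        for (size_t n; (n = fread(buf, 1, sizeof buf, f)) > 0; )
            blob.insert(blob.end(), buf, buf + n);
        fclose(f);
        listChunks(blob);
        return 0;
    }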
     