Tile-based Rasterization in Nvidia GPUs

Discussion in 'Architecture and Products' started by dkanter, Aug 1, 2016.

  1. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    Any time you put constants as literals in your code, the compiler is likely to "look for patterns", e.g. with loop unrolling. To hide constants from the compiler, pass them as arguments to the shader instead.
     
  2. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
    The driver/compiler could move "(input.VertexID / 3) % 7" and the color lookup code to the pixel shader to reduce the vertex attribute count. But this optimization makes the pixel shader more complex. It's hard to justify, as the driver/compiler has no knowledge about your average vertex and pixel counts (or overdraw).

    If the compiler was really clever, it could notice the modulo operator on the SV_VertexID and transform the code to use geometry instancing instead (extract the color lookup table as an instance vertex buffer). If an attribute depends only on instance vertex buffer data, it doesn't need to be replicated per vertex.

    I would guess that the shader compiler does some simple analysis on each output parameter (it needs to do that already for other optimizations). A 1e-16 increment per vertex tells the compiler that each output is unique. If there's any kind of reuse caching, this attribute would be excluded. Indexing into a constant array by a modulo operator (%7), however, results in at most 7 different values, so caching is highly efficient. This kind of vertex output caching is already needed for indexed geometry. It's not hard to believe that Nvidia might have extended it to support some other easily detected safe cases as well.
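
    To make the "at most 7 different values" point concrete, here is a minimal Python sketch (not shader code, and not anything known about Nvidia's compiler). The color table values and the helper names are made up for illustration; only the index expression mirrors "(input.VertexID / 3) % 7" from the discussion.

```python
# Hypothetical 7-entry color table; the actual values in the demo are not known here.
COLOR_TABLE = [
    (1.0, 0.0, 0.0, 1.0),
    (0.0, 1.0, 0.0, 1.0),
    (0.0, 0.0, 1.0, 1.0),
    (1.0, 1.0, 0.0, 1.0),
    (1.0, 0.0, 1.0, 1.0),
    (0.0, 1.0, 1.0, 1.0),
    (1.0, 1.0, 1.0, 1.0),
]


def color_for_vertex(vertex_id):
    """Mirror (input.VertexID / 3) % 7 using integer division."""
    return COLOR_TABLE[(vertex_id // 3) % 7]


def unique_for_vertex(vertex_id):
    """An output that changes by a tiny step per vertex, so no two vertices match."""
    return vertex_id * 1e-16


def distinct_outputs(fn, vertex_count):
    """Count how many different output values fn produces over a vertex range."""
    return len({fn(v) for v in range(vertex_count)})


if __name__ == "__main__":
    print(distinct_outputs(color_for_vertex, 3000))   # bounded by the table size
    print(distinct_outputs(unique_for_vertex, 3000))  # one value per vertex
```

    The table lookup can never produce more values than the table holds, however many vertices are drawn, while the per-vertex increment defeats any reuse.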
     
    Heinrich04 likes this.
  3. Andrew Lauritzen

    Moderator Veteran

    Joined:
    May 21, 2004
    Messages:
    2,526
    Likes Received:
    454
    Location:
    British Columbia, Canada
    Awesome, yes that's what I was sorta guessing was happening. The imbalance you noted via the "4:3:3:3" pattern or similar is confirmed nicely by there being different patterns on different 970s. Great analysis!

    The fine-grained 16x16 hashing is similar to what all GPUs do (and have done for quite some time) - it's also what mistakenly trips some people up into saying there's nothing special going on here (there clearly is :)). But how that interacts with the coarse-grained tiling with uneven loads is the neat bit. Obviously in practice the loads will tend to be a lot less even in the first place due to geometry variation, but it's easier to analyse the simple case first.

    The fine grained hashing is typically static (although sometimes software programmable/tweakable) on most GPUs. I don't know for sure if this is the case on Maxwell but I wouldn't be surprised either.
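
    As a rough illustration of what a static screen-space hash over 16x16 tiles means, here is a minimal Python sketch. The hash function "(tx + ty) % num_units" and the unit count of 4 are assumptions chosen for simplicity; the real mapping used by any GPU is not described in this thread.

```python
TILE = 16  # tile edge in pixels, as in the 16x16 tiles discussed above


def tile_of(x, y):
    """Return the (column, row) of the 16x16 tile containing pixel (x, y)."""
    return (x // TILE, y // TILE)


def unit_for_pixel(x, y, num_units=4):
    """Map a pixel to a processing unit with a fixed (static) hash of its tile.

    The hash here is a made-up diagonal interleave, not a real GPU's mapping.
    """
    tx, ty = tile_of(x, y)
    return (tx + ty) % num_units


def tiles_per_unit(width, height, num_units=4):
    """Count how many tiles of a width x height region land on each unit."""
    counts = [0] * num_units
    for y in range(0, height, TILE):
        for x in range(0, width, TILE):
            counts[unit_for_pixel(x, y, num_units)] += 1
    return counts


if __name__ == "__main__":
    print(tiles_per_unit(64, 64))
```

    With uniform full-screen coverage such a hash spreads tiles evenly; the interesting behavior in this thread comes from how that fine-grained split interacts with the coarse tiling and with units of unequal size.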
     
    CSI PC, homerdog, Lightman and 2 others like this.
  4. Philip

    Joined:
    Aug 3, 2016
    Messages:
    5
    Likes Received:
    15
    After a bit more testing: I think the significant output here is just output.Color.a - the compiler's analysis is smart enough to realise that all 7 possible values for it are equal, so it doesn't have to be stored in memory and can be replaced with a constant in the pixel shader. If I simply change one of the 7 alpha constants to a different number, the behaviour changes (and I have to reduce "num floats per vertex" by 1 to get back to the same behaviour as before).
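
    A minimal Python sketch of the kind of analysis described here, assuming a hypothetical 7-entry color table (the demo's real values are not shown in this thread): any channel that is identical across all entries can be replaced by a constant and dropped from the per-vertex outputs.

```python
# Hypothetical lookup table: RGB varies, alpha is 1.0 in every entry.
COLORS = [
    (1.0, 0.0, 0.0, 1.0),
    (0.0, 1.0, 0.0, 1.0),
    (0.0, 0.0, 1.0, 1.0),
    (1.0, 1.0, 0.0, 1.0),
    (1.0, 0.0, 1.0, 1.0),
    (0.0, 1.0, 1.0, 1.0),
    (1.0, 1.0, 1.0, 1.0),
]


def constant_channels(table):
    """Return {channel_index: value} for channels equal across every table entry."""
    first = table[0]
    return {
        i: first[i]
        for i in range(len(first))
        if all(row[i] == first[i] for row in table)
    }


if __name__ == "__main__":
    # Alpha (index 3) is the only channel that never changes.
    print(constant_channels(COLORS))

    # Changing a single alpha value removes the opportunity entirely.
    modified = [COLORS[0][:3] + (0.5,)] + COLORS[1:]
    print(constant_channels(modified))
```

    Each constant channel found this way is one float that no longer needs to be passed per vertex, which matches having to adjust "num floats per vertex" by 1.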

    So ignore everything I said about compression, the compiler was just being smarter than me :(

    (But since speculation is fun: it looks like it actually takes a few frames for the compiler to discover that alpha is constant. E.g. I set it up so when "num floats per vertex" is 20 and "num pixels" is 50%, it's drawn to half the screen; and when it's 22, it's drawn to the entire screen and starts a second batch of triangles. Then, whenever I move the slider from 20 to 21 (and it recompiles the shaders), it very briefly flashes an image that looks the same as 22, before settling down to the same as 20. But if I change it so alpha is not constant, and it can't do that optimisation, there is no flicker any more - 21 always looks the same as 22. I guess that means the analysis is expensive enough that it's only done when recompiling the shader in some background thread or something? Anyway, not really relevant to this topic, just something that makes the demo application confuse me.)
     
  5. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
    Yeah. Modern compilers can move compile-time constants (and math) between VS<->PS. It's a good trick for reducing vertex attribute count.

    I have seen similar behavior from Nvidia's shader compiler. I often modify & recompile shaders at runtime. After a recompile, the affected shader is slower for a few frames. I believe Nvidia is quickly compiling an unoptimized shader first and then applying heavy optimizations as a background compile process. Automatic PGO is a possibility, if they collect counter stats from shaders. Wouldn't surprise me, as they have researched PGO a lot for Denver.
     
    Heinrich04 and homerdog like this.
  6. itaru

    Newcomer

    Joined:
    May 27, 2007
    Messages:
    156
    Likes Received:
    15
    I tried compiling the tile-check software that was introduced earlier in this thread.
     

    Attached Files:

    • tile.zip
      File size:
      121.2 KB
      Views:
      13
  7. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    16,139
    Likes Received:
    5,074
    Well, for people interested in what tiled rendering on a 1070 looks like when it's glitching out:

    http://imgur.com/a/hzx7Y

    That's near the jumping puzzle in Silverwastes in Guild Wars 2. Sometimes it glitches even worse than that in that area.

    Regards,
    SB
     
    pTmdfx likes this.
  8. Infinisearch

    Veteran Regular

    Joined:
    Jul 22, 2004
    Messages:
    739
    Likes Received:
    139
    Location:
    USA
    Why are you attributing the glitch to tiled rendering?
     
  9. pTmdfx

    Newcomer

    Joined:
    May 27, 2014
    Messages:
    249
    Likes Received:
    129
    Look at the surrounding area of the mini-map at the bottom-right corner. You can see tiles there.
     
  10. MDolenc

    Regular

    Joined:
    May 26, 2002
    Messages:
    690
    Likes Received:
    425
    Location:
    Slovenia
    So because it's tile shaped it's a glitch with tiled rendering?
     
    sebbbi, Razor1 and Putas like this.
  11. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    16,139
    Likes Received:
    5,074
    I suppose it could just be bad NVidia drivers doing something that just happens to have a tile shape. It doesn't happen on AMD or Intel hardware that I've used. I wish I had access to an older Fermi or Kepler based NVidia GPU so I could see if it does the same thing on those chips.

    Regards,
    SB
     
  12. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
    Yeah. All modern GPUs perform several rasterization related things in smallish rectangular tiles. Also acceleration structures (such as hiZ) are tile based. I am guessing this is just a driver bug. Some timing issue (missing synchronization) in this special case. Could be anything really.
     
    Heinrich4 and Andrew Lauritzen like this.
  13. Silent_Buddha

    Legend

    Joined:
    Mar 13, 2007
    Messages:
    16,139
    Likes Received:
    5,074
    Yeah, I'm guessing it's probably just bad drivers then, or something with Pascal. There are other places in the game that exhibit the same tiled artifacting. I'm guessing it has something to do with the lighting in the game.

    In another location

    http://imgur.com/a/3eX0Y

    It's very annoying, as the tiles are constantly flickering and changing based on where you are looking. It also only happens in certain locations in the game.


    Regards,
    SB
     
  14. lanek

    Veteran

    Joined:
    Mar 7, 2012
    Messages:
    2,469
    Likes Received:
    315
    Location:
    Switzerland
    Guild Wars 2 player here... I have played GW 1 and 2 and Lineage 2 for a long time. This is a long-standing bug; sometimes it appears and then disappears. It is due to the engine they are using. When it appears, just alt-tab out and come back to the game, and it should disappear. It is effectively due to rasterization, but not in the sense you think.
     
  15. AlBran

    AlBran Ferro-Fibrous
    Moderator Legend

    Joined:
    Feb 29, 2004
    Messages:
    20,719
    Likes Received:
    5,815
    Location:
    ಠ_ಠ
    DX9 problem? :p
     
  16. Andrew Lauritzen

    Moderator Veteran

    Joined:
    May 21, 2004
    Messages:
    2,526
    Likes Received:
    454
    Location:
    British Columbia, Canada
    Yeah sebbbi is most likely right here - this is some sort of timing/race in the hash/ROP part of the pipeline. All GPUs do a screen space hash into "small tiles", not just NVIDIA ones. So this is unrelated to the large tile stuff that this thread is about.
     
    Silent_Buddha, sebbbi, pharma and 2 others like this.
  17. pharma

    Veteran Regular

    Joined:
    Mar 29, 2004
    Messages:
    2,928
    Likes Received:
    1,626
    On NVIDIA's Tile-Based Rendering


    https://www.techpowerup.com/231129/on-nvidias-tile-based-rendering
     