Road to Anti-Aliasing in BRE

Discussion in 'Rendering Technology and APIs' started by Nicolas Bertoa, Sep 12, 2017.

  1. Nicolas Bertoa

    Joined:
    May 20, 2017
    Messages:
    7
    Likes Received:
    13
    I want to implement anti-aliasing in BRE, but first I want to explore what aliasing is, what causes it, and which techniques mitigate it. That is why I am going to write a series of articles about rasterization, aliasing, anti-aliasing, and how I am going to implement it in BRE.

    Article #1: Rasterization

    All suggestions and improvements are very welcome! I will update this post with new articles.
     
  2. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,293
    Location:
    Helsinki, Finland
    Interesting results you got from the instancing tests on your Kepler (GTX 680) GPU:
    https://nbertoa.wordpress.com/2016/02/02/instancing-vs-geometry-shader-vs-vertex-shader/
    https://nbertoa.wordpress.com/2016/02/04/instancing-vs-geometry-shader-vs-vertex-shader-round-2/

    Geometry shaders usually don't score wins, but in this special case your instance vertex count is tiny: only 8 vertices. Kepler doesn't seem to pack vertices from multiple instances into a single warp. As a warp is 32 threads and each instance is only 8 vertices, you are likely seeing only 25% utilization of vertex waves. There could be some other bottlenecks as well. The bottleneck disappears when you bump up the vertex count to 44 per instance, which is still a very low vertex count per object by today's standards. In practice most objects have more vertices than that.
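
    The 25% figure follows from simple lane arithmetic. Below is a minimal sketch of that arithmetic, assuming a deliberately simplified model in which every instance starts on a fresh warp and no vertices from different instances share one. The function name `warp_utilization` is made up for illustration, and real hardware batching is more involved than this.

```python
import math

def warp_utilization(vertices_per_instance: int, warp_size: int = 32) -> float:
    """Fraction of warp lanes doing vertex work when instances are NOT packed.

    Simplified model (an assumption, not measured hardware behavior):
    each instance occupies ceil(vertices / warp_size) whole warps on its own.
    """
    warps_needed = math.ceil(vertices_per_instance / warp_size)
    return vertices_per_instance / (warps_needed * warp_size)

print(warp_utilization(8))   # 0.25   -> the 8-vertex case from the post
print(warp_utilization(44))  # 0.6875 -> two warps, 44 of 64 lanes busy
```

    Under this model the 44-vertex case is still not fully utilized, which fits the remark that other bottlenecks may be involved as well.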

    On an AMD card (GCN) you would see different results, since AMD packs multiple instances into each wave (64 threads). AMD GCN1-3 (Radeon 7000 series, 200 series, 300 series) also have poor strip rendering performance. A geometry shader outputs strips and strip cuts, which is bad for AMD's architecture. Polaris and Vega have improved strip rendering performance, but I would still expect instancing to beat geometry shaders, even at very low vertex counts.

    There is a workaround for the instance packing inefficiency. You create a vertex buffer with N copies of the same object, for example N=4. Then you use SV_InstanceId and SV_VertexId to calculate the actual instance id, and do a custom fetch of the instance data from a buffer. Use a constant buffer for instance data if your instance count is small, since Nvidia and Intel have special hardware (and special on-chip memory) for fetching and storing constants. Draw calls with a huge number of instances need to use Buffer<T>, StructuredBuffer<T> or ByteAddressBuffer for instance data.
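
    The id calculation in this workaround can be sketched on the CPU. The Python below only emulates the index math a vertex shader would do; it is not shader code. It assumes the vertex id runs from 0 to N * verts_per_object - 1 within each hardware instance (for an indexed draw, the indices of copy k would need to be offset by k * verts_per_object). All function and parameter names are hypothetical.

```python
import math

def remap_ids(sv_instance_id, sv_vertex_id, verts_per_object, copies):
    """Map hardware ids to (actual instance id, vertex id inside the object)."""
    # Which of the N copies in the vertex buffer this vertex belongs to,
    # and which vertex of the original object it is.
    copy_index, local_vertex = divmod(sv_vertex_id, verts_per_object)
    actual_instance = sv_instance_id * copies + copy_index
    return actual_instance, local_vertex

def draw_instance_count(total_objects, copies):
    """Hardware instances to draw when each one carries `copies` objects."""
    return math.ceil(total_objects / copies)

def simulate_draw(total_objects, verts_per_object, copies):
    """Enumerate every (object, vertex) pair the emulated draw would shade."""
    visited = []
    for sv_instance_id in range(draw_instance_count(total_objects, copies)):
        for sv_vertex_id in range(verts_per_object * copies):
            inst, vert = remap_ids(sv_instance_id, sv_vertex_id,
                                   verts_per_object, copies)
            # Padding copies in the last hardware instance have no instance
            # data; a real shader would have to discard them somehow.
            if inst < total_objects:
                visited.append((inst, vert))
    return visited

# 10 objects of 8 vertices, N=4: 3 hardware instances of 32 vertices each.
print(draw_instance_count(10, 4))     # 3
print(remap_ids(2, 17, 8, 4))         # (10, 1)
```

    With 8-vertex objects and N=4, each hardware instance carries 32 vertices, which matches one 32-thread warp. The returned instance id is what would index the constant buffer or StructuredBuffer<T> holding the per-instance data.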

    There are also various tricks you can use to avoid instancing completely and reduce the vertex data size. These are very helpful when rendering lots of instances with tiny vertex counts:
    Thread: https://forum.beyond3d.com/threads/programmable-vertex-fetching-and-index-buffering.57591/
    Post about emulating multidraw with index packing: https://forum.beyond3d.com/posts/1900656/
     
    #2 sebbbi, Sep 12, 2017
    Last edited: Sep 12, 2017
  3. Nicolas Bertoa

    Thanks, sebbbi, for reading those articles and for the explanation of the different vendors and architectures. I did not know those details.

    I will check the links that you mentioned.
     
