Three new DirectX 11 demos

Discussion in 'Rendering Technology and APIs' started by Andrew Lauritzen, Jul 31, 2010.

  1. Andrew Lauritzen

    Moderator Veteran

    Joined:
    May 21, 2004
    Messages:
    2,526
    Likes Received:
    454
    Location:
    British Columbia, Canada
    Yeah agreed it really does cut down on the light bleeding significantly, and in the places where it matters the most (near the viewer). It's funny to see someone be like "is that light bleeding?" and as they walk up to it it fades away ;) Pretty happy with how the two techniques help one another here.

    Yeah I'm crossing my fingers :) Let me know if it works out.

    Makes sense. Indeed it will be possible to "miss" shadow samples then but doing some sort of reasonable default (in shadow, out of shadow, or just clamp to edge) should work ok.

    Ah yes very clever - should be able to get around the frustum culling stall that way!

    Yep indeed - it seems like a really useful data structure to generate regardless. I use the "first few levels" of it in the deferred shading demo as well for light volume culling :)
     
  2. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
    I did some analysis for SDSM for our game content, and unfortunately my analysis doesn't look that good for us.

    We have around 2 kilometer view distance, and almost always (during gameplay) you will see parts of the horizon, so the z-max reduction doesn't help. The z-near reduction helps a bit, since usually we have around 3 meters of air before our character (3rd person viewport). I guess we are lucky, since we do not have a weapon glued to the screen almost clipping the near plane (like all first-person shooters have).

    The light space xy bounds of each cascade do not help that much either, since we do not have that many big blockers narrowing the view. I did a shader that plots all our visible screen pixels to the light space (selects the cascade and the texture coordinates exactly by the same math as our lighting shader). The nearest 2 cascades are mostly filled to the borders. We choose the cascade by selecting the first cascade that results in texture coordinates in [0,1] range (multiple transformations). This is most likely the biggest reason why we sample almost 100% of the cascade area (for the first cascades). Selecting the PSSM cascade by z-distance would likely result in much bigger areas of unused space in each cascade, but I am unsure if that would be a good idea.
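
    A minimal sketch of the "first cascade whose texture coordinates land in [0,1]" rule described above. Modelling each cascade as an axis-aligned light-space rectangle, and the names `to_texcoords` and `select_cascade`, are assumptions for illustration only; the real lighting shader uses one transformation per cascade.

```python
# Hypothetical sketch: cascades are light-space rectangles ((min_x, min_y), (max_x, max_y)),
# ordered from the nearest (smallest) to the farthest (largest).

def to_texcoords(point, cascade):
    """Map a light-space (x, y) point into the cascade's texture space."""
    (min_x, min_y), (max_x, max_y) = cascade
    u = (point[0] - min_x) / (max_x - min_x)
    v = (point[1] - min_y) / (max_y - min_y)
    return u, v

def select_cascade(point, cascades):
    """Return (index, (u, v)) for the first cascade containing the point, else None."""
    for index, cascade in enumerate(cascades):
        u, v = to_texcoords(point, cascade)
        if 0.0 <= u <= 1.0 and 0.0 <= v <= 1.0:
            return index, (u, v)
    return None  # the point misses every cascade

# Illustrative cascade rectangles (made-up sizes).
cascades = [
    ((-5.0, -5.0), (5.0, 5.0)),
    ((-20.0, -20.0), (20.0, 20.0)),
    ((-80.0, -80.0), (80.0, 80.0)),
]
```

    Because every pixel falls into the tightest cascade that contains it, the inner cascades end up sampled nearly edge to edge, which fits the almost fully covered plot.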

    PSSM shadow cascades (left), and the sampled pixels plotted (right). As you can see, the light shader already seems to sample almost the whole PSSM (the rectangle cannot be tightened for any cascade other than the last):
    http://img513.imageshack.us/img513/2918/shadowsampling.jpg
     
  3. Andrew Lauritzen

    Moderator Veteran

    Joined:
    May 21, 2004
    Messages:
    2,526
    Likes Received:
    454
    Location:
    British Columbia, Canada
    Yeah the z-max reduction is actually the least important part in any case. Near is what matters (critically!) for the logarithmic distribution.
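
    To see why near matters so much, here is a rough sketch of a purely logarithmic split scheme, where boundary i sits at near * (far / near)^(i / count). The function name `log_splits` and the example distances are illustrative assumptions, not values from the demo.

```python
def log_splits(near, far, count):
    """Logarithmic partition boundaries between near and far (count partitions)."""
    return [near * (far / near) ** (i / count) for i in range(count + 1)]

# Same far plane, two different near planes (illustrative numbers, in meters).
loose = log_splits(0.5, 2000.0, 4)   # near plane close to the camera
tight = log_splits(3.0, 2000.0, 4)   # near plane tightened to visible geometry

for name, splits in (("loose", loose), ("tight", tight)):
    print(name, [round(s, 2) for s in splits])
```

    With the loose near plane the first partition ends at roughly 4 m, so most of it covers empty air in front of the character; with the tight near plane the same partition stretches to roughly 15 m of actual geometry.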

    Right :) I'd handle the gun case separately in practice as it isn't really a part of "scene" per se.

    Yeah for these to help much you need either occlusion or empty space. Both of these are pretty common in a lot of scenes, but if you have a scene where your light is pretty high in the sky (vs on the horizon) and a sparse scene you can definitely have a case where it won't matter much. That said, the good news about this case is that it's also pretty easy to just parameterize analytically (which is incidentally why the simple scene-independent solutions work pretty well). Judging from your image, the best additional thing you could do is apply a warping to each shadow partition (log PSMs ideal of course, but liPSM or something is more practical). Of course warping the partitions complicates consistent edge softening assuming that you are prefiltering your shadow maps.

    So curiously are there not cases where someone gets near large/semi-large objects (relative to the size of the player)? These cases are usually the ones that cause standard cascades to exhibit artifacts but indeed the ones that SDSM can take good advantage of by tightening up both the z-range and partitions.

    The last nice advantage of SDSMs is of course avoiding tweaking of partition ranges/PSSM lambdas/etc. This is nice as it frees up some artist time and avoids problems late in production when cameras/scenes change after partition ranges have been set up. That said if you're a third person game with a fixed camera distance and a fairly sparse scene this might not be an issue either.

    In any case sounds like it might not be a good fit for your scene, but the positive way to look at that is that your scene is already well-fit by standard CSMs :) I'm still curious what the additional cost of the SDSM analysis would be on a console though so if you do happen to implement it do let me know!

    Thanks for the update.
     
  4. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
    But if you are using a deferred renderer, doing the gun in a separate pass is not that elegant. You basically have to forward render it, and make sure that the deferred rendering & lighting is not applied to those pixels (render the gun first to the stencil buffer for example, or a lot of performance is wasted). You really want the gun to receive shadows as well, since it would look really weird if the gun shines fully bright when the surrounding geometry is shadowed (you enter a small building in the terrain for example -- in our case the roofs also have some cracks letting the sunlight in, but only partially).

    I was thinking about some kind of trapezoid warp, but unfortunately it's not a linear transform (it cannot simply be put in the light matrix & its inverse), so it requires some extra math on both sides of the equation, and like you said causes other issues as well.

    Yes, there's some blocker geometry, but unfortunately in the most common case you see pretty far, since forest and trees are the most common view blockers, and their leaves always leave some gaps that allow you to see really far. In most games, you have linear paths through the levels, allowing the developer to put lots of huge view/movement blockers along the path. In our case, we have a world that allows you to move everywhere, as we have an in-game level editor that allows you to fly over any geometry. Unfortunately for us technology guys, our artists tend to like levels that are located on top of some big (but thin) structures (a nice vertigo feel), and this kind of setting gives a huge visible view distance... but at the same time, the real gameplay happens very near the camera (so the near plane cannot be moved that far).

    The terrain itself is of course a good view blocker, but it rarely cuts the z-max that much, since the horizon is often visible in the camera view. However it cuts the light space cascade z-max considerably, and that should improve the EVSM quality nicely.

    I agree this is one of the key advantages in SDSM.

    SDSM will help a lot in some scenes, but for the worst case scenes it only helps a bit (we unfortunately have a lot of worst case scenes - and have no control over it, since the content is player-created). SDSM is still a very good technique. It saves some artist work, and improves the quality and performance a bit even in the worst case scenarios. And naturally we develop our technology for the long run, so there will be future games that gain more from SDSM than our current one. I am pretty sure we will use the z-min / z-max system (it's really fast to generate on GPU, and the tight near plane alone improves the quality nicely), but the per-cascade bounds do not seem that useful for us right now (but I'd really like to have the tighter z-bound for the EVSM). But let's see how it goes.
     
  5. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
    SDSM Update (part 1):

    The SDSM near-z and far-z search alone improved the shadow map quality nicely in our game view. Our PSSM near plane was set to 0.5 meters, and now during gameplay the SDSM near plane fluctuates between 3 and 5 meters. The far plane is sometimes as close as 100 meters (buildings etc. blocking the view), but usually hovers around 1.5 kilometers.

    But the most impressive improvement was in the editor. When you fly around in the air and look at the scenery below, the near plane can be pushed as far as 100-500 meters away, giving everything almost pixel perfect shadows (we have four cascades 512x512 pixels each, rendered with hardware 4xMSAA and EVSM filtering).

    I'll post the performance improvements on the console platform once the algorithm runs fully on the console as well. I implemented the recursive downsampling method on PC (DX11) first (using our virtual texture small z-buffer as my depth data source). It seems that using the last frame's data is enough and no graphical glitches are visible (we render at vsync-locked 60 fps), so there's no need to cause a lock stall.
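
    For readers who want the idea in code, a CPU-side sketch of a recursive min/max downsample follows. It is plain Python over a small list-of-lists depth grid; the function names are hypothetical, and a real implementation would run as GPU passes over the depth buffer and would typically skip sky/far-plane pixels, which this sketch does not handle.

```python
def downsample_min_max(min_grid, max_grid):
    """Halve the grids in each dimension, keeping the min and max of every 2x2 block."""
    rows, cols = len(min_grid), len(min_grid[0])
    out_min, out_max = [], []
    for r in range(0, rows, 2):
        row_min, row_max = [], []
        for c in range(0, cols, 2):
            # Clamp the block to the grid edge so odd sizes still work.
            block = [(rr, cc)
                     for rr in range(r, min(r + 2, rows))
                     for cc in range(c, min(c + 2, cols))]
            row_min.append(min(min_grid[rr][cc] for rr, cc in block))
            row_max.append(max(max_grid[rr][cc] for rr, cc in block))
        out_min.append(row_min)
        out_max.append(row_max)
    return out_min, out_max

def depth_bounds(depth):
    """Reduce a 2D grid of view-space depths to (z_min, z_max)."""
    min_grid, max_grid = depth, depth
    while len(min_grid) > 1 or len(min_grid[0]) > 1:
        min_grid, max_grid = downsample_min_max(min_grid, max_grid)
    return min_grid[0][0], max_grid[0][0]

# Illustrative depths in meters (made-up values).
depth = [
    [3.2, 4.0, 150.0],
    [3.5, 12.0, 1500.0],
    [5.0, 90.0, 700.0],
]
z_min, z_max = depth_bounds(depth)
```

    The intermediate grids are not thrown away in practice: each level is a coarser min/max view of the depth buffer, which is what makes the same pass reusable for other culling work.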

    Thanks for the good algorithm. Our artists are really happy already :)
     
  6. Andrew Lauritzen

    Moderator Veteran

    Joined:
    May 21, 2004
    Messages:
    2,526
    Likes Received:
    454
    Location:
    British Columbia, Canada
    Agreed although that kind of implies that it has to be at a realistic place in the "scene" (to project its coordinates back into world space for shadow/lighting work) rather than just "plastered to the near plane" :) And indeed if it's at a realistic place that is really near the viewer but you want proper shadows on it... well there's no avoiding dedicating a high-resolution cascade to that range really :S

    Yeah it gets a bit ugly which is why I normally avoid warps but if you do have resolution issues (particularly near the viewer) it might be worth looking at.

    Makes sense - gameplay/design comes first of course :) Terrain visibility (assuming a simple height map) seems possible to exploit analytically actually if it gives much benefit (rather than in image space like SDSM) - have you played with that at all or would the delta not be worth it?

    Indeed. It of course shouldn't ever do *worse* than standard CSMs but the cost of generating the per-partition tight bounds may not be justified for scenes with low occlusion/empty space.


    Nice! Yeah another nice thing I like about using SDSMs with nice filtering is that you never completely "waste" any resolution. When using PCF you can get excessive resolution (higher than screen space) that is simply dropped by PCF and you still get aliased shadow edges. Proper filtering though uses these additional samples to effectively super-sample the shadow map edge via mipmapping - looks great in practice and adding hardware MSAA (with the associated nice sampling patterns) helps a ton too!

    Cool yeah I always suspected that using a really fuzzy scene description (previous frame, downsampled, etc) would be sufficient since the min/max actually drops the vast majority of data from the scene and the likelihood of hitting a case with a significant number of important "missed" pixels is very low and can probably be fully accounted for in practice by just dilating the min/max results slightly (which we already do for the partitions to support filtering near edges). Great to hear it does indeed work in practice and it definitely solves the stall problem I imagine for all practical cases!
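
    A tiny sketch of the dilation idea, assuming the bounds come from the previous frame and are padded by a relative margin before being used. The function name, the margin value and the clip distance are hypothetical choices, not values from the demo or the paper.

```python
def dilate_bounds(z_min, z_max, margin=0.05, near_clip=0.1):
    """Pad last frame's (z_min, z_max) so slightly stale data still covers the scene."""
    padded_min = max(near_clip, z_min * (1.0 - margin))  # never go in front of the camera clip
    padded_max = z_max * (1.0 + margin)
    return padded_min, padded_max

# Bounds measured on the previous frame (illustrative values, in meters).
previous_frame = (4.0, 1000.0)
padded = dilate_bounds(*previous_frame, margin=0.25, near_clip=0.5)
```

    The margin trades a little shadow resolution for robustness against pixels that appear between the analyzed frame and the frame that consumes the result.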

    No problem, I'm really happy that it's working well for you! :)
     
  7. sebbbi

    Veteran

    Joined:
    Nov 14, 2007
    Messages:
    2,924
    Likes Received:
    5,288
    Location:
    Helsinki, Finland
    The SDSM recursive min/max downsample takes 0.04 milliseconds on the console hardware (from a 320x168 buffer). The frame rate boost is much higher than the downsample cost, since the tighter cascades have fewer objects to render. We also use the z-max to do crude occlusion culling for all our rendered objects (cut the view cone by it). As a side effect, the recursive min/max downsample also generates a hierarchical z-buffer for higher quality CPU occlusion culling, but we haven't implemented that yet.
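
    A rough sketch of the crude z-max culling mentioned above: any object whose nearest point along the view direction lies beyond the measured z-max cannot contribute a visible pixel. Treating objects as bounding spheres described by (name, view-space center depth, radius) is an assumption made for this example.

```python
def cull_by_z_max(objects, z_max):
    """Keep objects whose nearest view-space depth is within the measured z_max."""
    return [name for name, center_depth, radius in objects
            if center_depth - radius <= z_max]

# Illustrative objects: (name, view-space depth of center, bounding radius) in meters.
objects = [
    ("player", 4.0, 1.0),
    ("building", 95.0, 20.0),
    ("far_tower", 600.0, 30.0),
]
visible = cull_by_z_max(objects, z_max=100.0)
```

    Since the z-max may come from a previous, downsampled frame, a conservative implementation would probably pad it before culling.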
     
  8. Andrew Lauritzen

    Moderator Veteran

    Joined:
    May 21, 2004
    Messages:
    2,526
    Likes Received:
    454
    Location:
    British Columbia, Canada
    That's awesome news! It seems clearer and clearer that this sort of data structure gleaned from the visible samples in the scene is useful in a number of important ways. Neat to see you making use of it for occlusion culling and other things as well.

    I'm eager to see your game in action :)
     
  9. MfA

    MfA
    Legend

    Joined:
    Feb 6, 2002
    Messages:
    6,806
    Likes Received:
    473
    Would be nice if they had led by example, Fermi doesn't do full coherence and Larrabee doesn't do full speed scatters. They talk the talk, but they don't walk the walk.
     
  10. Andrew Lauritzen

    Moderator Veteran

    Joined:
    May 21, 2004
    Messages:
    2,526
    Likes Received:
    454
    Location:
    British Columbia, Canada
    If you read back the original comment was *about* Larrabee's cache architecture, thus we're not talking about full speed single-cycle x86-style-cache-coherent scatters. That's not necessarily viable, but not clearly necessary either. A more relaxed coherence model and/or some performance degradation for complex scatters are probably both acceptable in the long run. Fermi basically already has the latter with its "coherent" write-combining on scatters to global memory, they just market it as an "optimization" to the base case rather than falling off the fast path. So arguably Fermi doesn't do "full speed" scatters either in the general case, but that's not really saying anything interesting...

    To put it another way, you could easily make an architecture that did coherent scatters to 16 different cache lines at "full speed", but that's more of a statement about how inefficient your cache-aligned write is than anything :)
     
  11. MfA

    MfA
    Legend

    Joined:
    Feb 6, 2002
    Messages:
    6,806
    Likes Received:
    473
    Banked caches are at full utilization both with ideal scatters (they hit all banks) and cache aligned linear writes (they still just hit all banks). Of course the overhead for coherency increases a bit.
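
    A toy model of that point, assuming a cache split into word-interleaved banks that each serve one access per cycle. The bank count, word size and function name are illustrative assumptions, and coherency overhead is ignored entirely.

```python
from collections import Counter

def cycles_needed(addresses, num_banks=16, word_bytes=4):
    """Cycles to serve all accesses if each bank handles one access per cycle."""
    per_bank = Counter((address // word_bytes) % num_banks for address in addresses)
    return max(per_bank.values())

# 16 lanes writing 4-byte words (byte addresses).
linear = [i * 4 for i in range(16)]                # one aligned cache line
ideal_scatter = [(i * 17) * 4 for i in range(16)]  # 16 different lines, 16 different banks
worst_scatter = [(i * 16) * 4 for i in range(16)]  # 16 different lines, all in one bank
```

    In this model the aligned linear write and the ideal scatter both finish in a single cycle, while the conflicting scatter serializes on one bank.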
     
  12. picosec

    Newcomer

    Joined:
    Mar 7, 2003
    Messages:
    10
    Likes Received:
    0
    If anyone is implementing SDSM on Xbox 360, you may want to check out the screen extent query (D3DQUERYTYPE_SCREENEXTENT). It will return the minimum Z and maximum Z for the queried rendering, with some limitations.

    I'm hoping it will prove to be useful. Of course, I am somewhat biased since I pushed for including Z in the screen extent query (when I was at MS) because of how much having tight Z bounds improved the quality of regular shadow maps.
     
  13. Andrew Lauritzen

    Moderator Veteran

    Joined:
    May 21, 2004
    Messages:
    2,526
    Likes Received:
    454
    Location:
    British Columbia, Canada
    Ah interesting! I specifically have asked people for this sort of feature in the past (both for x/y/z) but it has never come to PC. Awesome to hear that there's something like it on 360 and indeed depending on the limitations it might actually be useful for SDSM!

    [Edit] It is good to take occlusion into account though which requires rendering the full depth buffer before analyzing anything and thus wouldn't be compatible with this technique. That said, you could use this for just the camera near value, which is actually the most important (far has significantly less effect).
     
    #53 Andrew Lauritzen, Dec 4, 2010
    Last edited by a moderator: Dec 4, 2010
  14. Andrew Lauritzen

    Moderator Veteran

    Joined:
    May 21, 2004
    Messages:
    2,526
    Likes Received:
    454
    Location:
    British Columbia, Canada
    Just a quick update - the SDSM paper (author preprint for I3D 2011) is now linked from the SDSM web page. Let me know if you guys have questions or comments, or if anyone is planning to attend I3D this year!
     
  15. AlBran

    AlBran Ferro-Fibrous
    Moderator Legend

    Joined:
    Feb 29, 2004
    Messages:
    20,717
    Likes Received:
    5,813
    Location:
    ಠ_ಠ
    Great stuff. :cool: Wonder when we'll see more devs using this on current gen consoles. They're certainly here for a while...
     
  16. homerdog

    homerdog donator of the year
    Legend Veteran Subscriber

    Joined:
    Jul 25, 2008
    Messages:
    6,153
    Likes Received:
    928
    Location:
    still camping with a mauler
    This is a DX11 demo right? How would it apply to current consoles?
     
  17. AlBran

    AlBran Ferro-Fibrous
    Moderator Legend

    Joined:
    Feb 29, 2004
    Messages:
    20,717
    Likes Received:
    5,813
    Location:
    ಠ_ಠ
    Read up in the thread. :p
     
  18. Andrew Lauritzen

    Moderator Veteran

    Joined:
    May 21, 2004
    Messages:
    2,526
    Likes Received:
    454
    Location:
    British Columbia, Canada
    Yeah it works fine on consoles (it's very simple); I'm just too lazy to implement a DX9 version :)
     
  19. homerdog

    homerdog donator of the year
    Legend Veteran Subscriber

    Joined:
    Jul 25, 2008
    Messages:
    6,153
    Likes Received:
    928
    Location:
    still camping with a mauler
    Oh great :oops:
     