Hi all!
I've been doing some more research involving Variance Shadow Maps over the past few months and the results have been very positive. Of course there's always more work to be done, but I'm pretty happy with the implementation at this point, so I figured that it's time to release it and get feedback.
The original Variance Shadow Maps paper (you can find it here) and implementation had a few things that I wanted to clean up. In particular:
- Light bleeding could occur due to the sometimes-loose upper bound Chebyshev approximation employed.
- High-precision hardware filtering (mipmapping, trilinear, anisotropic) was required.
- Blurring the shadow map can become expensive, even with the separable O(n) filter.
- The filter width could not be changed dynamically per-pixel, which is desirable both for standard texture filtering, and plausible soft shadows (with contact hardening and so forth).
The new implementation addresses all of these issues by modifying the approximation function slightly and using summed-area tables.
It's worth noting first that light bleeding cannot be eliminated without over-darkening some regions, due to the tight upper bound that Chebyshev's inequality provides. Put simply, VSM is really the best that you can do with N (in this case, two) pieces of information.
Still, we are willing to accept some over-darkening in certain non-objectionable regions if we can get rid of light bleeding. A simple way is to just clip off the tail of Chebyshev's Inequality. This can be trivially implemented by taking the result of evaluating the inequality and passing it through a simple function like linstep or smoothstep.
The threshold here is artist-editable, and a threshold to completely eliminate all light bleeding can be computed from the ratio of overlapping occluder distances from the light's perspective. The worst case occurs where there is one occluder very close to the light whose penumbra is cast onto two overlapping surfaces that are very close to each other, but both far from the light. I can draw some pictures if that last paragraph doesn't make much sense.
In practice this technique works very well and is exposed in the demo with a corresponding threshold slider for people to play with and see the effect. (The demo uses a simple linstep, but smoothstep looks good as well, although it naturally changes the falloff function).
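For anyone who wants to see the tail clipping spelled out, here is a minimal CPU-side sketch in Python (the demo itself is shader code and its source isn't posted yet, so this is not the demo's code). The function names, the min_variance clamp and the sample numbers are all assumptions made for illustration.

```python
def linstep(lo, hi, v):
    """Linear ramp from 0 at lo to 1 at hi, clamped to [0, 1]."""
    return min(max((v - lo) / (hi - lo), 0.0), 1.0)

def chebyshev_upper_bound(mean, mean_sq, t, min_variance=1e-6):
    """One-sided Chebyshev bound on the lit fraction, from the two stored moments.

    mean and mean_sq are the filtered depth and depth^2; t is the receiver depth.
    min_variance is a hypothetical clamp to avoid dividing by ~0.
    """
    if t <= mean:
        return 1.0  # receiver is not behind the mean occluder depth: treat as lit
    variance = max(mean_sq - mean * mean, min_variance)
    d = t - mean
    return variance / (variance + d * d)

def reduce_light_bleeding(p_max, threshold):
    """Clip the tail: anything at or below threshold becomes 0, the rest is rescaled."""
    return linstep(threshold, 1.0, p_max)

# Filter region covering two occluders at depths 0.2 and 0.8 in equal parts,
# with a receiver behind both of them at depth 0.9.
mean = (0.2 + 0.8) / 2
mean_sq = (0.2 ** 2 + 0.8 ** 2) / 2
p_max = chebyshev_upper_bound(mean, mean_sq, 0.9)
print(p_max, reduce_light_bleeding(p_max, 0.4))
```

The receiver here is fully occluded, yet the raw bound is well above zero; that is the light bleeding. Raising the threshold pushes it to zero at the cost of darkening real penumbrae that fall below the same value.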
To solve all of the remaining problems, I implemented summed-area tables. This allows one to sample arbitrary rectangular regions of the shadow map at constant cost with no dynamic branching.
The demo uses hardware derivatives to compute a rectangular filter region (the same way the hardware does it for mipmapping... actually a little bit more accurate), and optionally clamps the minimum filter size to get a softer shadow.
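The constant-cost rectangle lookup can be sketched on the CPU like this. It is a simplified illustration, not the demo's implementation: it assumes integer-aligned rectangles and plain Python lists, and the names build_sat and box_average are made up for the example. Handling fractional rectangle corners (which needs interpolated lookups) is left out.

```python
def build_sat(img):
    """Return an (h+1) x (w+1) summed-area table with a zero first row and column."""
    h, w = len(img), len(img[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += img[y][x]
            # Sum of everything above this row, plus this row up to column x.
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat

def box_average(sat, x0, y0, x1, y1):
    """Average over texels x0 <= x < x1, y0 <= y < y1 using exactly four lookups."""
    total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
    return total / ((x1 - x0) * (y1 - y0))

# For VSM, build one table per moment and filter both over the same rectangle.
depth = [[0.2, 0.2, 0.8, 0.8] for _ in range(4)]
sat_d = build_sat(depth)
sat_d2 = build_sat([[d * d for d in row] for row in depth])
mean = box_average(sat_d, 1, 0, 3, 4)
mean_sq = box_average(sat_d2, 1, 0, 3, 4)
print(mean, mean_sq)
```

The number of lookups in box_average does not depend on the rectangle size, which is where the constant filtering cost comes from.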
There are several shadowing techniques in the demo which I'll outline briefly here:
- Shadow Map is a standard shadow mapping implementation. Cringe at the ugly aliased shadows!
- PCF implements percentage closer filtering to sample the filter rectangle. The results are good, but heavy biasing is required (causing "peter-panning" of the shadow) and the cost grows as O(n^2) in the filter width n, so performance drops quickly as the filter widens (either via softness, or viewing from shallow angles, etc).
- Hardware VSM uses hardware texture filtering (mipmapping, trilinear, anisotropic) and sets the texture Max LOD to soften the shadow. Since mipmapping is a rather coarse approximation for magnification, boxy artifacts are clearly visible when "softening".
- Summed-Area VSM uses summed area tables to do the filtering, not relying on any hardware texture filtering. This implementation provides excellent quality and softening and does exactly 16 texture reads regardless of the filter area.
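To make the O(n^2) cost of PCF concrete, here is a small Python sketch of a brute-force PCF lookup that also counts its reads. It is only an illustration under simple assumptions (a square window, clamp addressing, a hypothetical pcf function and bias parameter), not the demo's shader.

```python
def pcf(depth_map, cx, cy, radius, receiver_depth, bias=0.0):
    """Return (lit fraction, number of reads) over a (2*radius+1)^2 window."""
    h, w = len(depth_map), len(depth_map[0])
    lit = 0
    reads = 0
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            # Clamp to the edge of the map, as a clamp-addressed texture would.
            sx = min(max(x, 0), w - 1)
            sy = min(max(y, 0), h - 1)
            reads += 1
            # One depth comparison per texel; the results are averaged afterwards.
            if receiver_depth - bias <= depth_map[sy][sx]:
                lit += 1
    return lit / reads, reads

depth_map = [[0.2, 0.2, 0.8, 0.8] for _ in range(4)]
for radius in (1, 2, 3):
    visibility, reads = pcf(depth_map, 1, 1, radius, 0.5)
    print(radius, reads, visibility)
```

Every texel in the window needs its own read and comparison, so doubling the filter width roughly quadruples the work, whereas the summed-area lookup stays at a fixed number of reads.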
Anyways enough talk, grab the demo here: Summed-Area Variance Shadow Maps (January 30, 2007)
Note that I will soon release the source for the demo for people to play around with, but I've submitted this work for potential inclusion in GPU Gems 3, and I'm waiting to hear back on that before I post code.
Please note the requirements (as detailed in the included Readme):
- Any reasonably modern CPU/RAM
- Windows XP or Windows XP x64 Edition
- A shader model 3.0 capable video card
  NVIDIA GeForce 8 series card highly recommended
- DirectX Redist December 2006
  Available free from http://www.microsoft.com (search for the above)
- Visual C++ 2005 Redistributable Package
  Available free from http://www.microsoft.com (search for the above)
Here are some performance results from a GeForce 8800GTX at 1600x1200, 4xMSAA with the view from the first screenshot below. As you can see, the crossover beyond which SAVSM becomes more efficient than PCF is as low as 2x2! Note that at shallow angles (like the second screenshot below), SAVSM is always faster by a large factor, since many pixels will require a large filter region due to derivatives alone. Indeed, even in the best (and rare) case for PCF, where the entire shadow map is magnified for every pixel, the crossover is still only at 3x3.
For those of you who don't meet the requirements - or are just too lazy to download the demo - some screenshots follow. Click the image for the high-resolution, uncompressed version.
Car:
Car (shallow angle - note the nice filtering):
Commando:
Commando again:
Spheres (hard):
Spheres (soft):
For those of you still reading, here are a few other notes and future work:
- One big problem with SAVSM is numeric stability, since both summed-area tables and variance shadow maps eat precision for breakfast. However the error is unbiased, meaning that simply increasing the minimum filter width ("softness") will get rid of it. Once doubles are supported on GPUs, there won't be an issue, but currently large shadow maps can cause numeric trouble. The demo uses several methods to greatly improve precision, and there are plenty more ways. In any case smaller shadow maps work extremely well and look great when filtered properly.
- Enabling multi-sampling while rendering the variance shadow map works fairly well, but it becomes insignificant once the "Softness" is even a few notches up. It also hurts numeric stability a bit and is thus disabled in the current demo. It was a good idea, but summed-area tables are better.
- I've played a bit with combining SAVSMs with "Percentage Closer Soft Shadows" and the results are quite promising (constant-time, efficient plausible soft shadows!). However there are quite a few details and boundary conditions to sort out in order to make it robust... I simply do not have time right now. Hopefully someone will get the time to combine this technique with PCSS, or one of the more recent rear-projection algorithms.
- DirectX 10 should improve the speed of summed-area table generation quite a bit (although it is already pretty fast at <2ms for a 512x512 shadow map on an 8800GTX). Once my new machine with Vista arrives in a few weeks I'm going to port the demo, and I'll post the results if there are significant changes.
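On the numeric stability note above, here is a rough Python illustration of why single-precision summed-area tables lose accuracy as they grow. It is a 1D simplification with float32 emulated through the struct module. The bias of 0.5 is just one possible mitigation, picked here as an assumption; the post does not say which methods the demo uses. The alternating input is deliberately contrived so that the centered sums stay near zero, and real depth data will gain less.

```python
import struct

def f32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def prefix_sums_f32(values, bias=0.0):
    """Running sums kept in float32, with bias subtracted from every value first."""
    sums = [0.0]
    for v in values:
        sums.append(f32(sums[-1] + f32(f32(v) - bias)))
    return sums

def recover_last(sums, bias=0.0):
    """Recover the final input as a difference of adjacent sums (a 1-texel filter)."""
    return f32(f32(sums[-1] - sums[-2]) + bias)

n = 50000
values = [0.3, 0.7] * (n // 2)  # contrived input with mean 0.5

# Plain sums grow to about n/2, so the low bits of each new value are rounded away.
err_plain = abs(recover_last(prefix_sums_f32(values)) - f32(0.7))
# Centered sums stay small, so the same difference keeps its precision.
err_centered = abs(recover_last(prefix_sums_f32(values, 0.5), 0.5) - f32(0.7))
print(err_plain, err_centered)
```

The error in the plain version scales with the magnitude of the running sum, which is why larger shadow maps are the ones that run into trouble.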
Anyways sorry for the long post... I got rambling. Please feel free to ask any questions if I didn't make anything clear. Rest assured if my article is accepted into GPU Gems 3 I will cover variance shadow maps, summed-area tables and all of the details that I've only hinted at here in depth.
Enjoy!
Andrew Lauritzen
University of Waterloo / RapidMind Inc.