Layered Variance Shadow Maps

Andrew Lauritzen

Hey all,

Just wanted to post a quick link to my latest paper entitled "Layered Variance Shadow Maps", to be published/presented at Graphics Interface 2008 this year. The paper and accompanying video can be grabbed from the following URL:
http://www.punkuser.net/lvsm/

I apologize for the lack of a spiffy web page yet, but I've been working hard on completing my Masters thesis.

In any case, the idea is kind of neat - namely that you can apply monotonic warps to the depth distribution and plug them into Chebyshev's inequality to get different upper bounds on the visibility function, some better than others. The concept of "layered" VSMs is to exploit this fact and effectively use a piecewise reconstruction of the visibility function to reduce or eliminate light bleeding. The advantage of the chosen representation is that it scales up the quality of VSMs arbitrarily, at the primary cost of more storage. It also means that only a single sample is needed to evaluate the visibility function at one point (i.e. when shading a pixel), unlike basis-function approaches like Convolution Shadow Maps that require all of the basis coefficients to reconstruct the visibility (and thus scale up poorly). Another side advantage is that since each of the pieces represents a smaller depth range, you don't need as much precision, so the technique can be applied to older hardware that may not support fp32 filtering.
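For anyone who wants the gist without opening the paper, here is a rough C++-style sketch of how a single layer is evaluated; the function names, the minVariance clamp and the simple linear rescale into the layer's depth range are my own illustration rather than the paper's reference code:

```cpp
// Sketch of single-layer (L)VSM evaluation. Each layer stores the first two
// moments of the depth distribution after warping depth into that layer's
// [0,1] sub-range; shading reads only the layer containing the receiver.
#include <algorithm>

struct Moments { float m1, m2; };  // filtered E[x] and E[x^2]

// One-tailed Chebyshev inequality: upper bound on the fraction of occluders
// that lie beyond the (warped) receiver depth t.
float ChebyshevUpperBound(Moments m, float t, float minVariance)
{
    if (t <= m.m1) return 1.0f;  // receiver not beyond the mean: fully lit by the bound
    float variance = std::max(m.m2 - m.m1 * m.m1, minVariance);
    float d = t - m.m1;
    return variance / (variance + d * d);
}

// LVSM lookup: rescale the receiver depth into the layer that contains it
// (layerMin/layerMax partition the light-space depth range), then evaluate
// the bound against that single layer's filtered moments.
float ShadowLVSM(Moments layerMoments, float receiverDepth,
                 float layerMin, float layerMax)
{
    float t = std::clamp((receiverDepth - layerMin) / (layerMax - layerMin),
                         0.0f, 1.0f);
    return ChebyshevUpperBound(layerMoments, t, 1e-4f);
}
```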

The other thing contained in the paper is some preliminary work on "Exponential Variance Shadow Maps", as inspired by Marco Salvi's work. The idea is that an exponential warping of the depth distribution is another application of the above idea. In particular it does a great job of eliminating light bleeding while still not suffering from leakage near casters (since we're still using two moments and Chebyshev's inequality, as usual). The other advantage that I mentioned in another thread is that you can also use the "dual" of the exponential warp to eliminate the "non-planar receiver" artifacts that stock Exponential Shadow Maps suffer from. The EVSM results are preliminary (as I said, I'm trying to graduate here, even though research is more fun), but extremely promising.
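Again purely as a sketch of the idea described above (the warp constant c, the minVariance value and the helper names are placeholders of mine, not the paper's code): depth gets warped with e^(cx) and -e^(-cx), two moments are stored per warp, and the shading pass takes the tighter of the two Chebyshev bounds.

```cpp
// Sketch of the EVSM idea: two exponential warps of depth, two moments each,
// and the minimum of the two Chebyshev upper bounds at lookup time.
#include <algorithm>
#include <cmath>

struct Moments { float m1, m2; };

float ChebyshevUpperBound(Moments m, float t, float minVariance)  // same helper as in the LVSM sketch
{
    if (t <= m.m1) return 1.0f;
    float variance = std::max(m.m2 - m.m1 * m.m1, minVariance);
    float d = t - m.m1;
    return variance / (variance + d * d);
}

// What the shadow map render pass writes for an occluder at depth x in [0,1].
void StoreEVSM(float x, float c, float out[4])
{
    float pos =  std::exp( c * x);   // positive warp
    float neg = -std::exp(-c * x);   // negative ("dual") warp, also monotonic
    out[0] = pos;  out[1] = pos * pos;
    out[2] = neg;  out[3] = neg * neg;
}

// Shadow lookup: warp the receiver depth the same way and take the minimum
// (i.e. the tighter) of the two upper bounds.
float ShadowEVSM(const float filtered[4], float receiverDepth, float c)
{
    Moments posM = { filtered[0], filtered[1] };
    Moments negM = { filtered[2], filtered[3] };
    float posT =  std::exp( c * receiverDepth);
    float negT = -std::exp(-c * receiverDepth);
    return std::min(ChebyshevUpperBound(posM, posT, 1e-4f),
                    ChebyshevUpperBound(negM, negT, 1e-4f));
}
```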

Anyways, enjoy the paper - I'd be interested to hear any feedback. My current take is that the most useful application of the layered stuff is just to create a few uniform subdivisions, particularly on hardware without fp32 filtering support. Indeed a 2-layer LVSM using 16-bit per component storage fits nicely into a single texture and uses the same amount of memory as a standard VSM, but filters on a much wider range of hardware and provides a bit of light bleeding reduction "for free". The EVSM stuff is probably the most useful moving forward, although you *really* need fp32 filtering for it to shine, and 4xfp32 shadow maps are probably going to be a bit abusive for a few more years. It's pretty clear to me that the benefits will be worth it down the road though (people said VSMs were impractical not even two years ago, and they're being used increasingly now). Interestingly, layers can be used with other warps too, so there might be some other fun to be had there, particularly in the offline rendering world where huge filters are common.
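(To spell out the storage arithmetic on that 2-layer point: two layers times two moments is four 16-bit channels, i.e. 64 bits per texel, the same footprint as a standard two-moment fp32 VSM.)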

Anyways this will be my last paper for now. I'm finishing off my thesis in the next few weeks (a summary of my shadows work mostly), and then I'm off to work at Intel with the ex-Neoptica guys. I'm very much looking forward to the work and the cool opportunities to make a difference in real-time graphics that I expect it will afford in the coming few years :)

Cheers!
Andrew Lauritzen

PS: Sorry there's no demo yet... particularly embarrassing after Humus posted a cool one today. It needs some cleanup though before I post it publicly and I'm not sure when I'll have the time. The video should give you a good idea of how it works though.

PPS: The "_web" version of the paper is the same one with compressed images. Thus if you just want to browse through the paper and results, grab it. If you want to zoom in and see the uncompressed pixels of the 1920x1200 images, grab the big one ;) Also the WMV is the high quality video. The DivX one is of lower quality for submission into the ACM Digital Library.

[Edit] This thread needs more pictures for people who would otherwise not bother to check out the paper... scroll down to find them!
 
Impressive, nice work once again! :) I haven't dived into the paper properly yet; can you see any opportunity to save video memory in a clever way for EVSM somehow? It really seems very good, but 256MiB for a 4096x4096 shadowmap (or four 2048x2048 cascaded shadow maps) is a bit excessive! I guess if video memory really was a bottleneck you could only handle two of the cascaded shadow maps at a time with a bit of hacking, hmm... I wonder how that'd affect performance.
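(For what it's worth, the arithmetic behind that figure, assuming 4 fp32 channels per texel for EVSM: 4096 x 4096 texels x 16 bytes = 256 MiB, and four 2048 x 2048 maps in the same format come to exactly the same total.)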

Anyways this will be my last paper. I'm finishing off my thesis in the next few weeks (a summary of my shadows work mostly), and then I'm off to work at Intel with the ex-Neoptica guys. I'm very much looking forward to the work and the cool opportunities to make a difference in real-time graphics that I expect it will afford in the coming few years :)
I didn't have the opportunity to congratulate you on that yet so - grats! :) And remember: Intel better have a logarithmic shadow map (+ EVSM or other cool algorithm) demo ready on announcement day, or I'm going to pretend Larrabee will fail dismally just out of spite... :( Just kidding, enjoy!
 
Really great work. I've been looking for this paper for several days now. Let me study it first. Thank you. :p
 
I haven't dived into the paper properly yet; can you see any opportunity to save video memory in a clever way for EVSM somehow? It really seems very good, but 256MiB for a 4096x4096 shadowmap (or four 2048x2048 cascaded shadow maps) is a bit excessive! I guess if video memory really was a bottleneck you could only handle two of the cascaded shadow maps at a time with a bit of hacking, hmm... I wonder how that'd affect performance.
Yeah it's interesting, and there are a few ways to approach it. Unfortunately I don't see an obvious way to use lower-precision textures, because unlike standard VSM, which only really needs the mantissa ([0,1] range or similar), EVSM makes use of the full float range. You could of course do something similar to what Marco suggests with ESMs and effectively do the filtering manually in log space, but you lose hardware acceleration, aniso, etc. with that approach. I've been meaning to see how EVSMs work with SATs, but I'm really concerned about precision problems in that approach. I dunno, though... it might work out okay. I'd have to run the fp math to see for sure.

The obvious way to use less memory is to just not bother with the second "dual" warp (the -e^(-cx) one). Depending on your scene, or if you use a small "c", you may not notice any of the "non-planar" artifacts. In that case you'd be using exactly the same amount of memory as VSM, with the only difference being the warping of the depth function. It seems like using an exp() warp would make sense in most cases with VSM as long as you're using fp32 filtering.

I briefly tried to think of ways that we could avoid storing both of the second moments (for the pos/neg exp warps) as well, but I didn't come up with much. Indeed it seems like storing those separately is the key to the whole thing, resulting in two bounds, one of which will usually have a small variance in cases where light bleeding would normally occur.

Anyways as I mentioned in the paper, it's preliminary work, but I decided to include it since it's kind of along the same lines as LVSM and "warped VSMs" in general. I'd be really excited if someone decided to follow-up the work and do a "proper" analysis of the various trade-offs between storing one moment and using Chernoff bounds (i.e. ESM) and storing two moments and using a Chebyshev bound, potentially with pos/neg warps (i.e. EVSM). The recent results seem to suggest that there's a lot of low-hanging fruit there, but I just don't have time to pursue it :(.

Now to be fair on the memory front, I rarely find that you need much more than 4x 1024 cascaded shadow maps (and this is at 1920x1200), particularly when you're using shadow MSAA, which makes a *huge* difference. All of the images from my PSVSM thread a while back were using 4x 1024 I believe, and the difference between that and 3x 1024 is not large either.

Still, using less memory is always better ;) That said, EVSM still compares favourably in terms of quality, performance *and* storage to other competitive methods, as I demonstrate in the paper. In particular, performance-wise EVSM is not much slower than pure VSM on G80. I'll also make the point that if you're using deferred rendering, you only need one shadow map (or one cascade), so the per-light memory usage isn't nearly as much of an issue.

I didn't have the opportunity to congratulate you on that yet so - grats! :) And remember: Intel better have a logarithmic shadow map (+ EVSM or other cool algorithm) demo ready on announcement day, or I'm going to pretend Larrabee will fail dismally just out of spite... :( Just kidding, enjoy!
Haha, I'll see what I can do, but to be honest I'm almost ready to swear off shadows, at least for a while. When I started doing research into them a few years ago there weren't a lot of good real-time solutions to shadow filtering. With the recent explosion of work in the area (for instance, GI 2008 will have two papers on shadow filtering and several more on soft shadows!), I feel that we're finally at the point where we have a number of workable solutions. In particular I think ESM/VSM are quite usable right now depending on your scene, and I suspect EVSM will be a reasonable way to go in the future... unless someone comes up with something better in the meantime, which I of course hope they will!

Besides, why would we want to do shadow maps any more when we can do ray traced shadows!!! (I'm trying to get in the mood ;)).

TBH I may not be able to resist combining some of Aaron Lefohn's resolution-matched shadow map stuff with EVSM or similar, which would result in a totally kick-ass implementation IMHO. In any case there seems to be no shortage of experience or brains in that group, so I'm really looking forward to seeing what they/we come up with across the board :)
 
Does this mean that you passed up an opportunity to work for the leading maker of GPUs in favor of x86, and did anyone ever accuse you of having your parents do your science project for you in elementary school? :)
 
Does this mean that you passed up an opportunity to work for the leading maker of GPUs in favor of x86
Larrabee's target market is the exact same as NVIDIA's so I'm not sure how x86 or not is really a factor; plus, *for a software person*, it may be more attractive to toy with more flexible hardware even if it means *potentially* lower performance. All IMO and may have nothing to do with Andrew's thought process.
 
I was joking about the x86 and his science project. Surely performance and programmability are paramount. Personally I'd want to work for team JHH, but that's me.
 
...and did anyone ever accuse you of having your parents do your science project for you in elementary school? :)
Haha, I remember that the judges were initially a bit skeptical, but when I explained everything to them about how it worked and showed them the code they came around :) While my dad did indeed introduce me to programming when I was young, he does mostly database stuff and wasn't involved much in that project. Like I said though, it really sounds more impressive than it is... a few pages of code, really. You could probably do it in a single shader now!

All IMO and may have nothing to do with Andrew's thought process.
My thought process was simple actually: I wanted to work with the Neoptica guys :) Now that their team works for Intel, that's where I'd like to work, but if they had still been independent, I would have applied there.

Additionally, Larrabee is an exciting architecture for software people, as we have the opportunity to do things that we've had our hands tied on in the past. I don't have any insider information yet, but I'd expect Intel to make something fairly competitive, due in part to their massive size and resource base, and also because I have a lot of confidence in the team(s) there. If Matt Pharr, Craig Kolb, Aaron Lefohn, Paul Lalonde, etc. etc. are excited about it, then so am I :D

As some background though, I applied and spoke to quite a number of places in the games and graphics industries and ended up with a lot of options. After deciding that I'd probably be happier in graphics for now, Intel was an obvious option. NVIDIA Research would have been another that I would be interested in, but I wasn't able to get in touch with any of them in time... something to consider down the road in any case. Note that another important factor was being able to stay in Canada - I'll be working on the University of Victoria campus in BC! In addition to being a beautiful place to live, it works well for my wife who teaches outdoor education and we also have family in Victoria. So there were a number of other related factors, but even regardless of those, I'm excited about working with the team at Intel in particular.

Anyways this is all extremely off-topic, but I wanted to give a bit of background. I still like NVIDIA/AMD and wish them the best... there just happens to be a great team of guys at Intel that I want to work with. I'm not the type to get overly competitive - I just want to make the best graphics that I can using the tools that I'm given. At Intel, I even have a chance to affect some of those "tool" decisions too ;)

But back on topic... LVSM/EVSM ftw ;)
 
My thought process was simple actually: I wanted to work with the Neoptica guys :)
I certainly can't blame you on that one!
Additionally, Larrabee is an exciting architecture for software people, as we have the opportunity to do things that we've had our hands tied on in the past.
This probably isn't the smartest thing to say publicly, but I find SGX to be much more exciting than Larrabee. There might be a few implementation details missing to make it more appealing for exotic algorithms and GPGPU, but the core principles are much more exciting than Larrabee which is yet another incarnation of "hey, let's add a SIMD unit and extend the ISA!" - like I hadn't seen THAT trick before... Of course it's not a bad trick! ;) I'm just not convinced it *has* to be the way forward.
But back on topic... LVSM/EVSM ftw ;)
Oops, I should STFU here and read the paper instead I guess! :)
 
Pictures for the lazy:

[Image] Left: tough scene for VSMs. Right: Same thing rendered with layered VSMs.

[Image] Pretty LVSM picture with no visible light bleeding.

[Image] Pretty EVSM picture with few (no?) visible shadowing artifacts (!).

Enjoy!
 
This probably isn't the smartest thing to say publicly, but I find SGX to be much more exciting than Larrabee. There might be a few implementation details missing to make it more appealing for exotic algorithms and GPGPU, but the core principles are much more exciting than Larrabee which is yet another incarnation of "hey, let's add a SIMD unit and extend the ISA!" - like I hadn't seen THAT trick before... Of course it's not a bad trick! ;) I'm just not convinced it *has* to be the way forward.
I don't know much about SGX myself (good summary links?), but I'm glad to hear that other people are trying to innovate in this space, as I feel that it's really important moving forward. As I said, I'm generally not very competitive or brand-loyal, so if they come up with something neat, I'm happy :) Still I do think there's something to be said when an ~80000 person company decides to focus resources on something. Like I said I still don't know the Larrabee details yet in any case... all of my information comes from B3D ;)
 
I don't know much about SGX myself (good summary links?), but I'm glad to hear that other people are trying to innovate in this space, as I feel that it's really important moving forward.
Yeah, it'd be naive to think G8x/R6xx-like architectures are going to remain standard for long. Heck, graphics is fairly unique in that not only does the market move quickly, but key architectural paradigms also move very quickly (at least compared to most industries and other semiconductor companies).

Just clarifying myself wrt SGX: from a software and architectural perspective, I find it very appealing. From a hardware or consumer POV, I don't have a clue because I don't have any real data on actual implementations of the architecture - but neither do I have any on Larrabee, so I still thought that specific comparison was fair game. Comparing it to competing handheld architectures would be a lot less fair on the other hand unless I had actual data to back it up, which I don't. Certainly overhead is an issue with a MIMD architecture and I'd be very curious to see how they handled it and how it compares in terms of perf/mm² and perf/watt.
Still I do think there's something to be said when an ~80000 person company decides to focus resources on something.
Indeed, although it's not like all 80000 were working on Larrabee! ;) I'd certainly be very curious to see how the Larrabee R&D budget stacks up to NVIDIA's NV60 budget or AMD's R800/R900/? budget.
 
Indeed, although it's not like all 80000 were working on Larrabee! ;) I'd certainly be very curious to see how the Larrabee R&D budget stacks up to NVIDIA's NV60 budget or AMD's R800/R900/? budget.
Hehe certainly, and I don't have that answer. I was told, however, that Intel is really serious about this project, particularly since they're committing valuable fabs to it.

Anyways, LVSMs ;)
 
Andrew Lauritzen, congrats on the job. Enjoyed the paper, still going to take a few re-reads to completely digest. LVSMs might be very interesting in the context of precomputed "mega-texture style" shadowmaps for static geometry.

This probably isn't the smartest thing to say publicly, but I find SGX to be much more exciting than Larrabee.

Arun, seems to me that you are just going to have to write up your SGX article, to clue the rest of us into what you are seeing there. I'm sure I'm not the only one here who would appreciate reading it.
 
Andrew Lauritzen, congrats on the job. Enjoyed the paper, still going to take a few re-reads to completely digest. LVSMs might be very interesting in the context of precomputed "mega-texture style" shadowmaps for static geometry.
Thanks :) Yeah I have not considered much how they fit in with static geometry shadows, but at the very least you could pre-position the layers in that case - either automatically or manually - to maximize their utility. For static geometry I'd also consider playing with spatially varying warps... it seems to me that if you keep them fairly continuous and not too "aggressive" you wouldn't mess up filtering significantly. Interestingly this sort of spatially varying warp works perfectly well with the Lloyd relaxation algorithm... you just end up reading/using a higher mipmap level when you do the sum reduction rather than the 1x1 level. Lots of future possibilities for sure, and I'm eager to see what people come up with!

Arun, seems to me that you are just going to have to write up your SGX article, to clue the rest of us into what you are seeing there. I'm sure I'm not the only one here who would appreciate reading it.
Yeah I second that! I know almost nothing about it, but your hints make it sound interesting. A B3D tech article is exactly what is needed :)
 
Can't we just say "MIMD scalar" and leave it at that? :p

An article on SGX would require cooperation with ImgTec, and I'm not sure they're up for spilling the beans (for many reasons). We'll do some digging and try, though, if there's enough interest.
 
I think the combination of the Img docs and some TI docs *plus patents* could give an interesting article that'd go even a bit beyond "hay guys VLIW MIMD Scalar with TBDR, Load Balancing and PPP functionality and many levels of ALU precision and programmable blending!" - but that'd be a fair bit of work obviously. And honestly I think it'd be a better idea to do that after an article about architecture in general, which would be a lot of work in itself... hmmm... I'll definitely try starting work on the latter soon at least I guess, heh! :)
 
Great work Andy!
Have you experimented with the log filtering so far? Maybe there's some hope of getting decent quality EVSM in 8 bytes per texel...
 