Some presentations from GDC08

nAo · Feb 25, 2008

All presentations from the "Core Techniques & Algorithms in Shader Programming" tutorial day are already available online.
Since most of the people involved in this full-day tutorial work on consoles, I thought it was a better idea to post this link here rather than in the 3D technology sub-forum.

Core Techniques & Algorithms in Shader Programming | GDC 2008

AlNom · Feb 25, 2008

Winnings. Thanks nAo.

Btw, the High Dense Foliage presentation is a link to the page instead of the presentation.

Crossbar · Feb 25, 2008

@nAo

"PS3 Engineer"

You definitely wear that title with dignity, someone obviously didn't skip his math classes.

Edit: Very informative blog by the way, with qualified comments as well.

betan · Feb 25, 2008

Nice, thanks.

Marco, are you the one who convinced LucasArts to lead on PS3 from now on?

nAo · Feb 25, 2008

Stick to the topic guys!

_phil_ · Feb 25, 2008

http://www.coretechniques.info/2008/TileTrees.zip

one link is not pointing where it should

nAo · Feb 25, 2008

_phil_ said:
http://www.coretechniques.info/2008/TileTrees.zip

one link is not pointing where it should

Umh... what's wrong with this one? It works fine for me.

Arwin · Feb 25, 2008

nAo said:
Umh... what's wrong with this one? It works fine for me.

Highly Dense Foliage Visualization on Multi CPU/SPU Console Architectures by Mark McCubbins 45 minutes (Download) -> Download link points back to the same page

nAo · Feb 25, 2008

Yes, but the tile tree link reported by _phil_ works fine for me

Kanyamagufa · Feb 25, 2008

It's just the main page's link that's acting wonky.

Great presentations though, thanks Marco

Hope you're enjoying yourself at Lucas

Shifty Geezer · Feb 25, 2008

Arwin said:
Highly Dense Foliage Visualization on Multi CPU/SPU Console Architectures by Mark McCubbins 45 minutes (Download) -> Download link points back to the same page

And there's no file uploaded for it. http://www.coretechniques.info/2008/

Nice thread nAo! Nice to get some real meat from the conference.

ShootMyMonkey · Feb 25, 2008

Hmm... I'd been wondering what you were up to with your method, and this is almost what I figured, given the way you were talking about eliminating variance from the picture. I had something similar, but never really beat down the path of e^x as the function to plug into the Markov inequality. I was heading down the path of Taylor series because I was so concerned with occluder and receiver "shapes." Overlooking the simple...

On a side note, it looks like the way you describe it, there's not really any particular need to pre-filter. One of the disconcerting things I was always running into, at least on 360, was that pre-filtering was invariably slower than filtering on-demand with regular old PCF. Considering that a separable filter on a 1024 shadow map puts you through more than twice the fillrate of one pass on 720p (assume Z-prepass and all), this isn't really all too surprising. And sure enough, it comes out almost even if I do 4 pre-filter passes with a 512 shadow map vs. PCF. Granted, I can believe we're pretty fill-limited on the games we're doing here at Crystal, but it always sucks when you're doing something and just two separable filter passes on a 1024 map cuts your framerate down by 1/3 over small-filter PCF.
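The fill numbers quoted above are easy to check. What follows is a rough sketch that assumes fill cost is simply the number of pixels written per pass (bandwidth, texture fetches and render-target format are ignored), and the helper names are made up for illustration.

```python
# Back-of-the-envelope pixel counts for the pre-filter vs. 720p comparison.
# Assumption: "fill" is approximated as pixels written per full-surface pass.

def pixels(width, height):
    return width * height

def prefilter_fill(shadow_map_size, passes):
    """Pixels written when every pass touches the whole square shadow map."""
    return passes * pixels(shadow_map_size, shadow_map_size)

FRAME_720P = pixels(1280, 720)                    # one full-screen pass

separable_1024 = prefilter_fill(1024, passes=2)   # horizontal + vertical blur
four_pass_512 = prefilter_fill(512, passes=4)

ratio_1024 = separable_1024 / FRAME_720P          # a bit more than 2x
ratio_512 = four_pass_512 / FRAME_720P            # close to 1x

print(f"2 passes @ 1024: {ratio_1024:.2f}x a 720p pass")
print(f"4 passes @ 512:  {ratio_512:.2f}x a 720p pass")
```

Under that assumption the two separable passes on a 1024 map cost a little over twice one 720p pass, and four passes on a 512 map land near parity, which matches the observation in the post.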

nAo · Feb 25, 2008

The nice thing about using an exponential function is that if you expand it as a Taylor series it's easy to see that you are computing all the (infinitely many) positive moments of your depth distribution, each times a constant (1/n!).
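A small numeric check of that expansion, as a sketch only: the sample depths and the scale factor `c` are made-up values, and with a scale factor each moment picks up a `c**n` on top of the 1/n! mentioned above.

```python
import math

def raw_moment(samples, n):
    """n-th raw moment E[z^n] of the depth samples."""
    return sum(z ** n for z in samples) / len(samples)

def mean_exp(samples, c):
    """E[exp(c*z)], i.e. what a box filter over exp(c*z) stores."""
    return sum(math.exp(c * z) for z in samples) / len(samples)

def mean_exp_from_moments(samples, c, terms=30):
    """Truncated Taylor series: sum over n of c^n / n! * E[z^n]."""
    return sum(c ** n / math.factorial(n) * raw_moment(samples, n)
               for n in range(terms))

depths = [0.20, 0.25, 0.70, 0.90]   # hypothetical depths inside one filter kernel
c = 4.0                             # hypothetical exponent scale

direct = mean_exp(depths, c)
series = mean_exp_from_moments(depths, c)
print(direct, series)
```

The two numbers agree to within floating-point noise, which is just the statement that the filtered exponential is a fixed weighted sum of all the positive moments.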

Pre-filtering and big shadow maps are a big issue, but we 'just' need to get better at distributing shadow map resolution where it's really needed (more and more smaller charts).
Pre-filtering can also be done in constant time (with respect to the filter width) on a CPU (CELL for example), just wish we had a more advanced/flexible programming model on GPUs.
Can't wait to have CUDA-like shaders in a 3D API.

BTW, if pre-filtering big shadow maps takes too much time, make the first pre-filtering pass a 4:1 downsample (in log space if you work with the 'special' ESM that has no depth range issues).
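One way to read the log-space downsample, as a sketch under assumptions: the map is taken to store `c * depth` (the log of the exponential) rather than the exponential itself, "4:1" is taken to mean a 2x2 box, and the function names are invented here. Averaging in log space then means computing log(mean(exp(v))), with the largest value factored out so that `exp()` never overflows.

```python
import math

def log_space_average(values):
    """log(mean(exp(v))) with the max factored out to avoid overflow."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))

def downsample_4_to_1(log_map):
    """Halve width and height of a 2D list of log-space values.

    Assumes both dimensions are even; each output texel is the log-space
    average of one 2x2 block of input texels.
    """
    out = []
    for y in range(0, len(log_map), 2):
        row = []
        for x in range(0, len(log_map[0]), 2):
            block = [log_map[y][x], log_map[y][x + 1],
                     log_map[y + 1][x], log_map[y + 1][x + 1]]
            row.append(log_space_average(block))
        out.append(row)
    return out

# Values this large would overflow a plain exp(), but are fine in log space.
big = [[1000.0, 1000.0],
       [1000.0, 1000.0]]
print(downsample_4_to_1(big))
```

Later separable passes can then run on a quarter of the texels.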

TimothyFarrar · Feb 26, 2008

Marco, great presentation, only wish I could have seen it in person.

nAo said:
Can't wait to have CUDA-like shaders in a 3D API.

Cannot wait for CUDA/PTX to have the R/W surface cache implemented...

nAo · Feb 26, 2008

TimothyFarrar said:
Marco, great presentation, only wish I could have seen it in person.

Thank you Timothy, I am a big fan of your awesome awesome blog, hope to meet you somewhere at some point in the future

TimothyFarrar said:
Cannot wait for CUDA/PTX to have the R/W surface cache implemented...

Then wait for Larrabee..!

corysama · Feb 27, 2008

nAo said:
Can't wait to have CUDA-like shaders in a 3D API.

I think Chas Boyd of the DirectX team is picking up your psychic cry for help. During his "DirectX Futures" talk he specifically mentioned adding software-managed cache to enable more efficient image processing operations.

IIRC, it is possible to context-switch a G80 between 3D mode and CUDA mode quickly... Depending on how quickly, perhaps we could do:
3D) Render shadow maps
CUDA) Filter shadow maps
3D) Render scene image
CUDA) Filter scene image

Andrew Lauritzen · Feb 27, 2008

ShootMyMonkey said:
I was heading down the path of Taylor series because I was so concerned with occluder and receiver "shapes." Overlooking the simple...

The Convolution Shadow Maps paper mentions heading down this path to some extent (at least to represent the visibility function) but they ultimately abandoned it seemingly due to a lack of translational invariance.

ShootMyMonkey said:
On a side note, it looks like the way you describe it, there's not really any particular need to pre-filter.

Not sure what you mean here... you never *have* to pre-filter; it's just an accelerator. In particular, mipmaps are a form of pre-filtering and become a huge win when PCF would otherwise have to sample the entire shadow map per pixel.

In any case, ESM is no different in this manner to my knowledge.

On the blur/convolution side, as you have noted for "small" blurs (I'd guess maybe up to 4x4-ish) it may not be worth incurring the extra pass/fill overhead of blurring the whole shadow map. You can experiment with using a non-separable blur to reduce the blur to single-pass, or even do the whole thing in the kernel during lookup (as with PCF). The latter however will only be a win if you have many more shadow map pixels than framebuffer pixels, which implies a poor shadow map projection. The other disadvantage of not pre-blurring is that you really want to generate mipmaps based on the blurred shadow map or else they won't work properly.

It's important not to downplay the importance of mipmapping/aniso in generating high-quality shadows. I consider separable blurs to be a nice "side-effect" of linear filtering, but the real win is the application of linear filtering acceleration structures like mipmaps or summed-area tables. You really have to be comparing to "proper" PCF in which you project the filter area using derivatives and evaluate the whole region (no matter how big) using dynamic branching. With that implementation I think that you'll find the linear/hardware filtering methods to be a substantial win.
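To make the "acceleration structure" point concrete, here is a minimal summed-area table sketch. It is an illustration only, written on the CPU with plain lists, and it only makes sense for data that can be filtered linearly (VSM moments or ESM exponentials, not raw depth). The function names are made up.

```python
def build_sat(img):
    """Summed-area table with a zero border: sat[y][x] = sum of img[0:y][0:x]."""
    h, w = len(img), len(img[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat

def box_average(sat, x0, y0, x1, y1):
    """Average of img over [x0, x1) x [y0, y1) using four table lookups."""
    total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
    return total / ((x1 - x0) * (y1 - y0))

img = [[1.0, 2.0],
       [3.0, 4.0]]
sat = build_sat(img)
print(box_average(sat, 0, 0, 2, 2))   # whole image
print(box_average(sat, 1, 0, 2, 2))   # right column only
```

The lookup cost is the same four reads whether the filter region covers four texels or the whole map, which is the contrast with brute-force PCF over a large projected filter area.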

ShootMyMonkey said:
And sure enough, it comes out almost even if I do 4 pre-filter passes with a 512 shadow map vs. PCF.

I'd try just using non-separable blurs or similar to reduce the number of passes first. The tradeoff between memory/fill/computation is easily managed with simple convolutions like blurs.

As mentioned you can control it even more explicitly using a software cache, but in my experience you can usually convince even the most stubborn hardware-managed caches to do something reasonable. RSX/G70 (and maybe X1x00 - I'm not sure) can be a bit of a pain this way due to how they handle quad "segments", but you can always play with the register usage in your shader to avoid thrashing the L2.

Mintmaster · Feb 27, 2008

nAo said:
The nice thing about using an exponential function is that if you expand it as a Taylor series it's easy to see that you are computing all the (infinitely many) positive moments of your depth distribution, each times a constant (1/n!).

I suppose, but since all the moments are combined into one sum it's not the most sensible way of looking at things, IMO.

I also played around with exponents of the distance term a while ago, but didn't look at prefiltering because I think correct anisotropic filtering and mipmapping are more important. Philosophically, I guess I'm in the same boat as Andy.

I looked at it in a different way, in that filtering exponents lets you get an approximation of the max (or min) distance in any arbitrary per pixel filter kernel. ln space doesn't work, though, because that would lose this ability, but it's a great way to put a clamp on the light bleeding of VSM. Now instead of entire edges potentially having bleeding, it's only the "shadow intersections" that can have problems.

Of course, there are precision problems limiting how good this estimate of the max really is. I mapped e^(ax+b) to the full FP32 range and for really small differences in distance the max estimate degenerates into a linear filter.
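A sketch of that max estimate and of how it falls back to a plain average, under assumptions: depths are in [0, 1], the scale `a` is kept below about 88 so that exp(a) would still fit an FP32 range, the offset `b` is dropped, and the function name is invented. The estimate is ln(mean(exp(a*d))) / a.

```python
import math

def soft_max_estimate(depths, a):
    """ln(mean(exp(a*d))) / a: lies between mean(depths) and max(depths)."""
    mean_exp = sum(math.exp(a * d) for d in depths) / len(depths)
    return math.log(mean_exp) / a

spread = [0.30, 0.31, 0.80]     # hypothetical kernel with one far sample
close = [0.500, 0.501]          # hypothetical kernel with tiny depth differences

low = soft_max_estimate(spread, 1.0)     # small scale: close to the mean
high = soft_max_estimate(spread, 80.0)   # large scale: close to the max
tiny = soft_max_estimate(close, 80.0)    # tiny differences: back to the mean

print(low, high, tiny)
```

With a large scale and well-separated depths the estimate sits within ln(n)/a of the true max; when the depths differ by much less than 1/a the exponential is nearly linear over that range, so the result is essentially the ordinary average.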

In the end, I think the most bulletproof solution for shadow maps is Andy's suggestion of using VSM to see what the variance is, and if it's high enough fall back to many-sample PCF. It's much better than other methods of skipping samples in non-penumbra regions.
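One possible reading of that fallback, sketched on the CPU. The thread does not spell out the details, so the threshold value, the function names and the use of the same sample list for both paths are all assumptions; in a real renderer the moments would come from a filtered map.

```python
def moments(depths):
    """First and second raw moments of the depths in the filter region."""
    n = len(depths)
    m1 = sum(depths) / n
    m2 = sum(d * d for d in depths) / n
    return m1, m2

def chebyshev_upper_bound(m1, m2, receiver):
    """One-sided Chebyshev bound used by VSM as the visibility estimate."""
    if receiver <= m1:
        return 1.0
    variance = max(m2 - m1 * m1, 0.0)
    d = receiver - m1
    return variance / (variance + d * d)

def pcf(depths, receiver):
    """Fraction of samples that do not occlude the receiver."""
    return sum(1.0 for d in depths if receiver <= d) / len(depths)

def shadow(depths, receiver, variance_threshold=1e-3):
    """Use the VSM estimate when variance is low, many-sample PCF otherwise."""
    m1, m2 = moments(depths)
    variance = max(m2 - m1 * m1, 0.0)
    if variance > variance_threshold:      # hypothetical threshold
        return pcf(depths, receiver)
    return chebyshev_upper_bound(m1, m2, receiver)

flat = [0.5, 0.5, 0.5, 0.5]     # single surface: low variance, cheap path
mixed = [0.2, 0.2, 0.8, 0.8]    # depth discontinuity: high variance, PCF path
print(shadow(flat, 0.4), shadow(flat, 0.6), shadow(mixed, 0.8))
```

The expensive path only runs where the kernel straddles a depth discontinuity, which is where the penumbra lives.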

nAo · Feb 27, 2008

Mintmaster said:
I suppose, but since all the moments are combined into one sum it's not the most sensible way of looking at things, IMO.

Don't get me wrong, I'm not saying that since all those moments are somehow in that single number we are using them all.

Mintmaster said:
I also played around with exponents of the distance term a while ago, but didn't look at prefiltering because I think correct anisotropic filtering and mipmapping are more important. Philosophically, I guess I'm in the same boat as Andy.

While it's true that if you filter in ln space anisotropic filtering is not correct, you can still generate your mip maps in ln space, which means that anisotropic filtering can't deviate that much from the 'correct' answer. In practice what I've noticed in my tests is some minor overdarkening.

Mintmaster said:
I looked at it in a different way, in that filtering exponents lets you get an approximation of the max (or min) distance in any arbitrary per pixel filter kernel. ln space doesn't work, though, because that would lose this ability, but it's a great way to put a clamp on the light bleeding of VSM.

What is not working in ln space?

Mintmaster said:
Now instead of entire edges potentially having bleeding, it's only the "shadow intersections" that can have problems.

Actually, shadow intersections are fine in ESM as long as your receiver lies on a plane (from the light POV) within the filtering region.
It's easy to prove that ESM in this particular case converges to PCF, no matter what your depth complexity is.
Now I'm only using the positive moments of the depth distribution, the challenge is to also use the negative ones..
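The planar case can be illustrated numerically. This sketch makes simplifying assumptions: the receiver plane faces the light so its depth is constant across the kernel, every texel stores either the receiver's own depth or something closer to the light, and the names and values are made up. The ESM term is the filtered exp(c * (O - R)).

```python
import math

def esm_visibility(occluder_depths, receiver, c):
    """Box-filtered exp(c * (O - R)), clamped to 1."""
    mean = sum(math.exp(c * (o - receiver)) for o in occluder_depths)
    mean /= len(occluder_depths)
    return min(mean, 1.0)

def pcf_visibility(occluder_depths, receiver):
    """Fraction of texels whose stored depth does not occlude the receiver."""
    lit = sum(1.0 for o in occluder_depths if o >= receiver)
    return lit / len(occluder_depths)

receiver = 0.9
# Two occluder layers at different depths plus two unoccluded texels
# that store the receiver plane itself.
kernel = [0.2, 0.5, 0.9, 0.9]

reference = pcf_visibility(kernel, receiver)
soft = esm_visibility(kernel, receiver, c=5.0)     # low c: over-bright
sharp = esm_visibility(kernel, receiver, c=80.0)   # high c: matches PCF

print(reference, soft, sharp)
```

Occluded texels contribute exp of a large negative number and vanish, unoccluded ones contribute exactly 1, so the filtered value tends to the PCF fraction regardless of how many occluder layers are in the kernel.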

Mintmaster · Feb 27, 2008

nAo said:
While it's true that if you filter in ln space anisotropic filtering is not correct, you can still generate your mip maps in ln space, which means that anisotropic filtering can't deviate that much from the 'correct' answer. In practice what I've noticed in my tests is some minor overdarkening.

Interesting. What I was thinking was that when you linearly filter ln values, you're no longer doing a weighted average of the original exponents and instead doing a weighted geometric average, so that would mess up your edge filtering. However, since you're prefiltering, it's probably not a big deal now that I think about it.

nAo said:
What is not working in ln space?

The estimated max function. I was talking about my use of the exponent, not ESM. The thing is that I didn't think of being able to use exp(O-R) to recover the depth test, and was instead thinking of linear ways to stop bleeding from the otherwise nice VSM.

nAo said:
Actually, shadow intersections are fine in ESM as long as your receiver lies on a plane (from the light POV) within the filtering region.

True, but when that's not true the problem is similar, though located in a different area. One of the test cases I like for these fancy shadow techniques is two layers of objects over a ground plane. VSM shows bleeding on the ground plane, and ESM shows it on the second object.
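A toy version of that two-layer test for the ESM side, as a sketch only: the depths, the kernel contents and the exponent are invented, and the receiver is a point on the second object that sits in the first object's shadow near the second object's silhouette, so the kernel also picks up ground-plane texels that are further from the light than the receiver.

```python
import math

def esm_visibility(occluder_depths, receiver, c):
    """Box-filtered exp(c * (O - R)), clamped to 1."""
    mean = sum(math.exp(c * (o - receiver)) for o in occluder_depths)
    mean /= len(occluder_depths)
    return min(mean, 1.0)

def pcf_visibility(occluder_depths, receiver):
    """Fraction of texels whose stored depth does not occlude the receiver."""
    lit = sum(1.0 for o in occluder_depths if o >= receiver)
    return lit / len(occluder_depths)

FIRST_OBJECT, SECOND_OBJECT, GROUND = 0.2, 0.5, 0.9   # hypothetical depths

# Three texels see the first object, one sees past both objects to the ground.
kernel = [FIRST_OBJECT, FIRST_OBJECT, FIRST_OBJECT, GROUND]

expected = pcf_visibility(kernel, SECOND_OBJECT)
bleeding = esm_visibility(kernel, SECOND_OBJECT, c=80.0)

print(expected, bleeding)
```

The single ground texel contributes exp of a large positive number and swamps the average, so the clamped ESM result reads as fully lit where PCF reports mostly shadow.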

nAo said:
Now I'm only using the positive moments of the depth distribution, the challenge is to also use the negative ones..

I don't see how the negative moments help you (at least from the perspective of a negative exponential). They give you information about the closest samples in your filter kernel.

The way I see it, the positive moments give you information about the weight of the furthest samples, and when they're all on the same plane as your receiver, then you can actually retrieve that weight and it equals visibility.

One thing I tried is using regular distance, a positive exponential, and a negative exponential. Then I can get the average, max, and min for an arbitrary kernel, and can get a nice gradient by linearly interpolating. The results are mostly good, but when there are three or more distinct groups of distance values in the kernel, somewhere in the scene there's an artifact. It's better than VSM, but not perfect.

For ESM, I think a more interesting direction for improvement is clipping the furthest values in each texel's kernel when prefiltering. It could introduce other artifacts, but maybe they're less objectionable.
