Speed: assembly vs. HLSL shaders

darkblu said:
actually that's quite a significant advantage, IMHO :)
Yes, I agree that's very important, and that's partially why stencil shadowing is so successful. The other main reason is that it runs on nearly all hardware. But with newer hardware and floating-point textures as render targets, I think there are few issues left. So my expectation is that all next-generation DirectX 9 games will use shadow mapping. I could be very wrong too...
apropos, i believe we can divide the shadow-volume algorithm into two rather distinct parts, which can be considered independently of each other:

* shadow volume generation
* shadow "casting" from the shadow volume

whereas the shadow volume generation is a rather esoteric, thus highly tricky problem, the shadow casting part of the business is very nice, precisely defined, and mighty cool, i may add.
Yes, indeed it is mighty cool. The first time I read about stencil shadowing I couldn't believe it allowed self-shadowing and did all other things correctly, but I slowly started to realize that it's indeed very nice. But I can't help the feeling that it's a detour and it's actually shadowing the scene instead of lighting it. But that's probably just the way I look at it.
 
The shadows that Doom3 uses look OK to me. Take a look at my pics. Depth buffers have problems all of their own. Asm is always faster than HLSL given the same instruction set. However, one needs to write asm optimized for each individual hardware set, and that can get tiring. I think it's a good tradeoff to use HLSL over asm on fast hardware. To my knowledge, the APIs expose all the hardware through asm, while HLSL lacks some instructions. I think D3D HLSL compiles to version 2.0 asm, not 3.0, if I'm not mistaken. But you don't need shaders for Doom3 lighting, only ARB_texture_env_combine + crossbar (or register combiners) and two texture units. This low requirement makes Doom3 lighting accessible to low-end hardware. The speeds are good given low-poly scenes and shadow/light optimizations.
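As a concrete illustration of that two-texture-unit path, here's an untested sketch (function and texture names are mine, and glActiveTextureARB is assumed to be loaded through the usual extension mechanism) of a Doom3-style per-pixel diffuse pass with ARB_texture_env_combine: unit 0 does a DOT3 between the normal map and a tangent-space light vector packed into the primary color, and unit 1 modulates the result with the diffuse texture.

```cpp
// Hedged sketch only: Doom3-style diffuse bump lighting on two texture units
// using GL_ARB_texture_env_combine, no shaders involved. Assumes the
// tangent-space light vector has been range-compressed into the per-vertex
// primary color, and that normalMap/diffuseMap are ordinary 2D textures.
#include <GL/gl.h>
#include <GL/glext.h>

void setupDoom3StyleDiffusePass(GLuint normalMap, GLuint diffuseMap)
{
    // Unit 0: N dot L via DOT3 between the normal map and the primary color.
    glActiveTextureARB(GL_TEXTURE0_ARB);
    glEnable(GL_TEXTURE_2D);
    glBindTexture(GL_TEXTURE_2D, normalMap);
    glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_COMBINE_ARB);
    glTexEnvi(GL_TEXTURE_ENV, GL_COMBINE_RGB_ARB,  GL_DOT3_RGB_ARB);
    glTexEnvi(GL_TEXTURE_ENV, GL_SOURCE0_RGB_ARB,  GL_TEXTURE);
    glTexEnvi(GL_TEXTURE_ENV, GL_SOURCE1_RGB_ARB,  GL_PRIMARY_COLOR_ARB);

    // Unit 1: modulate the N dot L result with the diffuse texture.
    glActiveTextureARB(GL_TEXTURE1_ARB);
    glEnable(GL_TEXTURE_2D);
    glBindTexture(GL_TEXTURE_2D, diffuseMap);
    glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_COMBINE_ARB);
    glTexEnvi(GL_TEXTURE_ENV, GL_COMBINE_RGB_ARB,  GL_MODULATE);
    glTexEnvi(GL_TEXTURE_ENV, GL_SOURCE0_RGB_ARB,  GL_PREVIOUS_ARB);
    glTexEnvi(GL_TEXTURE_ENV, GL_SOURCE1_RGB_ARB,  GL_TEXTURE);
}
```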

http://forged3d.tripod.com
 
Neither shadow volumes nor shadow maps are physically correct by any means. The only way to truly do that would be to trace each individual photon of energy emitted from a light source all the way to its final destination. Most off-line renderers don't even do that for everything and end up hacking it. There's a rather entertaining video presentation on nVidia's developer site about the hacks and tricks off-line renderers do (though it talks about a lot more than that).

But, from the perspective of strictly direct illumination (which isn't physically correct at all), both methods are actually physically correct for either point or directional lights (ignoring any precision problems). But any extension you make of them for area/volume/soft light sources is going to end up being nothing more than aesthetically correct, or at best analytically correct (doing the proper penumbra/umbra calcs, but that still isn't physically correct).

Mostly, that STALKER statement is pure PR ;)
 
Both techniques (shadow volume and shadow maps) are equally correct, but they are far from the physically correct way of calculating shadows.

Shadow maps clearly have many advantages over shadow volumes (apart from the precision issues and the point-light issues, as Nick said) and they will certainly prevail. Shadow maps solve the shadow-determination problem in image space (which is why the final image suffers from aliasing), unlike shadow volumes, which solve the problem analytically in object space.

Shadow volumes offer a superior solution (analytical solutions are almost always better in quality than sampling) but they don't scale well with scene complexity. How do you efficiently calculate and render the shadow volumes for a very complex scene (say, one containing 3 million particles)? In contrast, it's much easier to render that complex scene from the light-source position and use the resulting z-image as a shadow map. It's much like the debate between the Z-buffer and the other analytical object-space HSR methods.
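To make the image-space comparison concrete, here is a minimal C++ sketch of the lookup a shadow map boils down to (all names and conventions are assumptions, e.g. a square map and light-space depth in [0,1]): render the scene from the light, keep the z-image, then at shading time project each point into that image and compare depths.

```cpp
// Minimal sketch of a shadow-map depth comparison (assumed conventions:
// row-major matrices, square map, light-space depth stored in [0,1]).
#include <algorithm>

struct Vec3 { float x, y, z; };
struct Mat4 { float m[4][4]; };

// Apply a 4x4 matrix to a point and do the perspective divide.
static Vec3 project(const Mat4& M, const Vec3& p)
{
    float x = M.m[0][0]*p.x + M.m[0][1]*p.y + M.m[0][2]*p.z + M.m[0][3];
    float y = M.m[1][0]*p.x + M.m[1][1]*p.y + M.m[1][2]*p.z + M.m[1][3];
    float z = M.m[2][0]*p.x + M.m[2][1]*p.y + M.m[2][2]*p.z + M.m[2][3];
    float w = M.m[3][0]*p.x + M.m[3][1]*p.y + M.m[3][2]*p.z + M.m[3][3];
    return { x / w, y / w, z / w };
}

// shadowMap holds the z-image rendered from the light's point of view.
bool inShadow(const Vec3& worldPos, const Mat4& lightViewProj,
              const float* shadowMap, int size, float bias = 0.002f)
{
    Vec3 l = project(lightViewProj, worldPos);                        // light clip space
    int u = std::min(size - 1, std::max(0, int((l.x * 0.5f + 0.5f) * size)));
    int v = std::min(size - 1, std::max(0, int((l.y * 0.5f + 0.5f) * size)));
    float nearestOccluder = shadowMap[v * size + u];                  // depth the light saw
    return l.z > nearestOccluder + bias;                              // farther => shadowed
}
```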

For those interested, the physically correct way of calculating shadows involves complex integral equations of unknown functions and the only way to calculate them correctly in the general case is to perform Monte Carlo integration. This technique is far from real-time, because it traces many rays from the illuminated point to the light source surface. Anyone interested can see some examples of physically correct lighting using Monte Carlo methods with my RenderMan compliant renderer at www.di.uoa.gr/~stud1313/gallery.php
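As a toy illustration of that idea (not taken from the renderer linked above; the helper functions are assumed to exist in whatever ray tracer you plug it into), the visibility term of an area light can be estimated by averaging random shadow rays toward the light's surface:

```cpp
// Toy sketch of Monte Carlo shadow estimation for an area light.
// samplePointOnLight() and occluded() are assumed to be provided by the
// surrounding ray tracer; they are only declared here.
struct Vec3 { float x, y, z; };

Vec3 samplePointOnLight();                          // uniform random point on the emitter
bool occluded(const Vec3& from, const Vec3& to);    // shadow-ray intersection test

float lightVisibility(const Vec3& shadedPoint, int numSamples = 256)
{
    int unblocked = 0;
    for (int i = 0; i < numSamples; ++i) {
        Vec3 lightPoint = samplePointOnLight();
        if (!occluded(shadedPoint, lightPoint))
            ++unblocked;
    }
    // 1.0 = fully lit, 0.0 = umbra, anything in between = penumbra.
    return float(unblocked) / float(numSamples);
}
```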

Nick,
The shadow maps can be pre-computed during content creation when the lights AND the geometry are static. But a similar thing applies to shadow volumes too. And for acceptable results, the shadow maps must have a fairly high resolution.
 
There will always be issues with uniform sampling, there is no way to avoid it (apart from using ridiculous resolutions for the shadow map).
 
Nick said:
I hope you agree that shadow volumes actually model the shadow, which is a bit of a wrong approach because shadow isn't something physical like light. That doesn't take away from the fact that the output is correct. There are so many things that are 'physically incorrect' yet produce exactly what we want, so don't take it as something that is necessarily negative.
Shadow volumes just model the border between light and shadow. They model light every bit as much as they model shadow.

Additionally, there are certain techniques that are essentially "switch on" techniques for shadow volumes, where the shadow volume generation is done in the vertex shader. This is done in 3DMark03.
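For reference, the per-vertex work such a "switch on" extrusion does is roughly the following (written here as C++ rather than shader code, purely as a sketch; it assumes a pre-built volume mesh where every vertex carries the normal of the face it belongs to):

```cpp
// Sketch of vertex-level shadow-volume extrusion (assumed setup: each vertex
// of the volume mesh stores the normal of the face it belongs to, and a large
// finite extrusionDistance stands in for "infinity").
#include <cmath>

struct Vec3 { float x, y, z; };

static float dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

Vec3 extrudeVertex(const Vec3& position, const Vec3& faceNormal,
                   const Vec3& lightPos, float extrusionDistance)
{
    Vec3 toVertex = { position.x - lightPos.x,
                      position.y - lightPos.y,
                      position.z - lightPos.z };              // light -> vertex
    if (dot(faceNormal, toVertex) <= 0.0f)
        return position;                                      // face sees the light: keep it
    float len = std::sqrt(dot(toVertex, toVertex));
    return { position.x + toVertex.x / len * extrusionDistance,
             position.y + toVertex.y / len * extrusionDistance,
             position.z + toVertex.z / len * extrusionDistance };
}
```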

Anyway, shadow volume's primary problem is simply that the algorithm doesn't scale well with scene complexity.

From what you've been saying, it feels to me like you've been turned off shadow volumes because you've seen too many programs that use them to add shadow, rather than to mark places that are not to be lit.
 
One more thing..

RussSchultz said:
As JPAANA stated, the only HLSLs that are compiled at runtime are OpenGL GSLANG and Cg. The DX9 HLSL compiler, while called at runtime, is statically linked into the application, therefore the output is essentially static. The IHV has no ability to insert their own compiler, and the driver has no ability to optimize based on the HLSL. It's required to work off of the DX9 shader assembly.

Actually, you can do a bit of pseudo run-time optimizations with HLSL. The SDK update is supposed to come with a compiler profile for least register usage along with the least-instruction profile. You can detect what card is running when the user launches your software and use that to choose between the two. I know it's not quite what you mean, but that will probably be 90% as good 90% of the time.
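A hedged sketch of what that launch-time choice might look like with D3DX (I'm assuming the "least register usage" profile refers to the ps_2_a target from the SDK update, and the vendor-ID check is just an illustration):

```cpp
// Illustrative only: pick a compile target per card at launch and compile
// the HLSL source with it. Assumes D3DX9 with the ps_2_0 / ps_2_a targets.
#include <d3dx9.h>

LPD3DXBUFFER CompileForCard(const char* source, UINT length,
                            const D3DADAPTER_IDENTIFIER9& adapter)
{
    // Assumption: ps_2_a is the "least register usage" profile, so prefer it
    // on NVIDIA adapters; everything else gets plain ps_2_0.
    const char* profile = (adapter.VendorId == 0x10DE) ? "ps_2_a" : "ps_2_0";

    LPD3DXBUFFER shader = NULL;
    LPD3DXBUFFER errors = NULL;
    HRESULT hr = D3DXCompileShader(source, length,
                                   NULL, NULL,             // no macros, no includes
                                   "main", profile,
                                   0, &shader, &errors, NULL);
    if (FAILED(hr) && errors)
        OutputDebugStringA((const char*)errors->GetBufferPointer());
    if (errors) errors->Release();
    return SUCCEEDED(hr) ? shader : NULL;                  // feed to CreatePixelShader
}
```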
 
Chalnoth said:
Anyway, shadow volume's primary problem is simply that the algorithm doesn't scale well with scene complexity.

For me, that's not really that much of a problem. The deal-killer, for me, is the fact that it's based purely around geometry, so it won't handle things such as leaves and other alpha-blended textures. For purely indoor games that isn't a problem, but the second you move outside to a non-post-apocalyptic world, it becomes a really bad problem. It isn't too rare to find a tree that is only a little over 1k triangles using the alpha-mask trick but turns into close to 100k triangles when you actually model everything out to get the geometric equivalent. Multiply that by the dozens of trees that are probably going to be around and, well... it's not pretty.
 
In that case one can use shadow maps (not depth maps), like in Doom3, to handle fences and walkways with holes in them. You render the geometry in black into a white texture, then use GL's automatic texture coordinate generation to projectively map this shadow map onto the scene. At least that's the theory; I haven't tried it yet, so I'm not sure how it works in practice. I hope to implement it in the future.
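For the record, here's an untested sketch of that texgen setup (the light matrix layout and all names are assumptions): bind the white-with-black-occluders texture, then generate S/T/R/Q eye-linearly from the light's bias * projection * view matrix so the map gets projected back onto the receivers.

```cpp
// Untested sketch: projective mapping of a Doom3-style shadow texture via
// eye-linear texgen. lightMatrix is assumed to be bias * lightProj * lightView
// in OpenGL column-major layout, and the camera's modelview must be current
// when the planes are specified (GL folds in its inverse at that moment).
#include <GL/gl.h>

void setupProjectiveShadowTexture(GLuint shadowTex, const float lightMatrix[16])
{
    const float* m = lightMatrix;
    GLfloat sPlane[4] = { m[0], m[4], m[8],  m[12] };   // row 0
    GLfloat tPlane[4] = { m[1], m[5], m[9],  m[13] };   // row 1
    GLfloat rPlane[4] = { m[2], m[6], m[10], m[14] };   // row 2
    GLfloat qPlane[4] = { m[3], m[7], m[11], m[15] };   // row 3

    glBindTexture(GL_TEXTURE_2D, shadowTex);
    glEnable(GL_TEXTURE_2D);

    glTexGeni(GL_S, GL_TEXTURE_GEN_MODE, GL_EYE_LINEAR);
    glTexGeni(GL_T, GL_TEXTURE_GEN_MODE, GL_EYE_LINEAR);
    glTexGeni(GL_R, GL_TEXTURE_GEN_MODE, GL_EYE_LINEAR);
    glTexGeni(GL_Q, GL_TEXTURE_GEN_MODE, GL_EYE_LINEAR);
    glTexGenfv(GL_S, GL_EYE_PLANE, sPlane);
    glTexGenfv(GL_T, GL_EYE_PLANE, tPlane);
    glTexGenfv(GL_R, GL_EYE_PLANE, rPlane);
    glTexGenfv(GL_Q, GL_EYE_PLANE, qPlane);
    glEnable(GL_TEXTURE_GEN_S);
    glEnable(GL_TEXTURE_GEN_T);
    glEnable(GL_TEXTURE_GEN_R);
    glEnable(GL_TEXTURE_GEN_Q);
}
```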
 
Nick said:
SvP said:
- I haven't seen STALKER screenshots yet, but technically, generating shadow maps takes much less time than rendering a frame (lower resolution, no texture mapping, no multipass, early-z/z-pyramid). So it's not minutes or hours but milliseconds.

So, besides the precision issues and the problems with omnidirectional lights, shadow mapping is really the 'physically correct' way of doing shadows. A worst case scenario for shadow volumes, prison bars, is not an issue for shadow mapping.


Here are some links to screen shots.

http://www.gamershell.com/imagefolio/gallery/FPS/STALKER/Stalker77.jpg

http://www.gamershell.com/imagefolio/gallery/FPS/STALKER/Stalker85.jpg

http://www.gamershell.com/imagefolio/gallery/FPS/STALKER/Stalker84.jpg

http://www.gamershell.com/imagefolio/gallery/FPS/STALKER/Stalker82.jpg

http://www.gamershell.com/imagefolio/gallery/FPS/STALKER/Stalker81.jpg

http://www.gamershell.com/imagefolio/gallery/FPS/STALKER/Stalker78.jpg

http://www.gamershell.com/imagefolio/gallery/FPS/STALKER/Stalker86.jpg


If you like those check out the trailer they released a couple of weeks ago.


High Resolution STALKER demo trailer

Low Resolution STALKER demo trailer
 