Carmack to use shadow maps in the next game?

Scali - (just curious) have you explored the tradeoffs? A CPU-based approach should significantly lower the rasterization/fill-rate
requirements...

What tradeoffs would those be, exactly?
If you mean the scissor test, you can construct the scissor rectangle from the lightsource, you don't need to process the mesh.

If you mean the zpass test for shadows that don't intersect the near-plane, you can construct a bounding volume for the shadow from the lightsource position and the bounding volume of the object, and test intersection with the near-plane like that.
You can also render the volume without near and far caps this way, if you simply remove the original triangles from the shadowvolume's mesh, and only include the edge quads.
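A minimal sketch of that near-plane test, assuming a point light and a bounding sphere around the occluder (all types and names here are illustrative, not from any actual engine):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3  sub(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }

// Near plane as dot(n, p) + d == 0, with unit normal n pointing into the frustum.
struct Plane { Vec3 n; float d; };

// Conservative test: can the shadow volume of a sphere-bounded occluder,
// extruded away from point light L, touch the near plane? If it can't,
// the cheaper zpass method is safe; otherwise fall back to zfail.
bool shadowMayTouchNearPlane(Vec3 L, Vec3 center, float radius, Plane nearPlane)
{
    float dc = dot(nearPlane.n, center) + nearPlane.d; // signed distance of sphere centre to plane
    if (std::fabs(dc) <= radius)
        return true;                                   // the occluder itself straddles the plane

    // Extrusion directions are (s - L) for points s in the sphere; their
    // projections on the plane normal span [dd - radius, dd + radius].
    float dd = dot(nearPlane.n, sub(center, L));
    if (dc > 0.0f)
        return dd - radius < 0.0f;  // in front: some ray may head back through the plane
    else
        return dd + radius > 0.0f;  // behind: some ray may cross forward through it
}
```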

Unless there is anything else, it seems to me like a GPU-based approach can generate almost the exact same shadowvolumes as a CPU-based approach, with the same fillrate requirements.
 
I think he's talking about extracting the silhouette from the source model, which must be done by the CPU no matter what. This means that you must keep to low-res models to be able to do it in real-time.

But it is not required to explicitly extract the silhouette at all for shadowvolumes. Just doing extrusion of backfaces on a per-vertex basis, with degenerate edge-quads, is enough to generate a valid shadowvolume.
So that is no excuse. You can avoid processing the mesh on the CPU altogether and offload it to the GPU, where the processing is almost free.
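As a rough sketch of the per-vertex work involved (written as plain C++ standing in for a vertex program; the names are mine, not from any real API):

```cpp
struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };

static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3  sub(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }

// Each vertex carries the normal of its owning face. Vertices of faces
// pointing away from the light are pushed to infinity along the light ray
// (w = 0); the degenerate edge quads then stretch into the volume's sides.
Vec4 extrudeVertex(Vec3 pos, Vec3 faceNormal, Vec3 lightPos)
{
    Vec3 ray = sub(pos, lightPos);             // from the light through the vertex
    if (dot(faceNormal, ray) > 0.0f)           // back-facing with respect to the light
        return { ray.x, ray.y, ray.z, 0.0f };  // point at infinity along the ray
    return { pos.x, pos.y, pos.z, 1.0f };      // front-facing: leave in place
}
```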
You will always have relatively low-poly models anyway for real-time stuff like games; shadowmaps make no difference in that, especially not when using cubemaps, where you have to render the same scene multiple times per light.
I wouldn't be surprised if ultimately shadowvolumes scale better than shadowmaps in terms of polycount, since shadowvolumes require you to render 'the scene' twice for each light (once for the shadowvolumes, once for the lit scene), while cubemaps require you to render the scene six times for each light, just to get the shadowmap.
 
Silhouette extraction can be done on the GPU, but since we can't create geometry on the GPU at this time, you have to attach 2 degenerate triangles to each edge of a 2-manifold mesh.
A 2-manifold mesh with N triangles has 3N/2 edges and about N/2 vertices... the 'new' mesh will have N + (2*3N/2) = 4N triangles and about 3N vertices. I don't like that ;)
To extract the shadow volume you need a normal for each triangle, and AFAIK you can't compute that normal on the GPU unless you store extra data in each vertex.
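For what it's worth, a sketch of that preprocessing step, just to make the construction concrete (illustrative code; the winding of the stitched triangles would need checking against your extrusion convention):

```cpp
#include <cmath>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

struct Vec3 { float x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}
static Vec3 normalize(Vec3 v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return { v.x / len, v.y / len, v.z / len };
}

struct ShadowVertex { Vec3 pos; Vec3 faceNormal; };

// Build the augmented shadowvolume mesh: every triangle gets its own three
// vertices (each carrying the face normal), and every shared edge of the
// closed 2-manifold gets two degenerate triangles stitching the faces together.
std::vector<uint32_t> buildShadowMesh(const std::vector<Vec3>& pos,
                                      const std::vector<uint32_t>& tri, // 3 indices per face
                                      std::vector<ShadowVertex>& verts)
{
    std::vector<uint32_t> idx;
    for (size_t t = 0; t + 2 < tri.size(); t += 3) {
        Vec3 a = pos[tri[t]], b = pos[tri[t + 1]], c = pos[tri[t + 2]];
        Vec3 n = normalize(cross(sub(b, a), sub(c, a)));
        for (int i = 0; i < 3; ++i) {          // duplicate each corner with its face normal
            verts.push_back({ pos[tri[t + i]], n });
            idx.push_back(uint32_t(verts.size() - 1));
        }
    }
    // Each directed edge (v0 -> v1) appears once; its reverse belongs to the
    // neighbouring face. When we meet the reverse, emit the two degenerate
    // triangles (zero area now, a side quad of the volume after extrusion).
    std::map<std::pair<uint32_t, uint32_t>, std::pair<uint32_t, uint32_t>> open;
    for (size_t t = 0; t + 2 < tri.size(); t += 3) {
        for (int i = 0; i < 3; ++i) {
            uint32_t v0 = tri[t + i], v1 = tri[t + (i + 1) % 3];
            uint32_t c0 = uint32_t(t + i), c1 = uint32_t(t + (i + 1) % 3);
            auto it = open.find({ v1, v0 });
            if (it == open.end()) {
                open[{ v0, v1 }] = { c0, c1 };
            } else {
                auto [d0, d1] = it->second;    // corner copies from the other face
                idx.insert(idx.end(), { c0, c1, d0 });
                idx.insert(idx.end(), { c0, d0, d1 });
                open.erase(it);
            }
        }
    }
    return idx;
}
```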
Moreover, with directional lights one can further improve silhouette extraction on the CPU with a nice trick that avoids extruding (and modifying) any point on the mesh..
At this time, even if silhouette extraction can be done on the GPU, I wouldn't advocate it, imho.

ciao,
Marco
 
After listening to Carmack's keynote speech I'm convinced he must read these forums :) Perhaps some people will pick up on that (especially the last part where he mentions soft shadows and the sound system).
 
Silhouette extraction can be done on the GPU, but since we can't create geometry on the GPU at this time, you have to attach 2 degenerate triangles to each edge of a 2-manifold mesh.

NVIDIA describes a way to use 1 degenerate triangle and exploit the fact that a w=0 projection can generate the same shape as a quad.
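For reference, the standard homogeneous form of "extrude to infinity" that this trick builds on (my notation, not NVIDIA's exact formulation): for a point light $L$ and a vertex $v$, both with $w = 1$,

$$v' = \left(v_x - L_x,\; v_y - L_y,\; v_z - L_z,\; 0\right),$$

which is the limit of $v + t\,(v - L)$ as $t \to \infty$, so the rasterizer clips and projects it like any other homogeneous point.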

A 2-manifold mesh with N triangles has 3N/2 edges and about N/2 vertices... the 'new' mesh will have N + (2*3N/2) = 4N triangles and about 3N vertices. I don't like that
To extract the shadow volume you need a normal for each triangle, and AFAIK you can't compute that normal on the GPU unless you store extra data in each vertex.

Yes, you need extra geometry, but the geometry is reasonably compact.
For non-skinned models, the position vector and the plane equation would suffice (or even just the normal, but that would take some more per-vertex processing). So that would be 28 (or 24) bytes per vertex, plus a set of indices. (You don't have to store indices for the caps if you don't mind rendering in two calls instead of one: you can arrange the edge quads so that the vertex order forms the original mesh, so rendering without an indexbuffer would generate the caps, and the indices would create the rest of the volume. You lose some efficiency though, also in the vertexcaching.)
For skinned models you'd need to store the three positions of the triangle for each vertex, plus the blend weight (and possibly the index), so that would be 36+(B-1)*4 (or 40+(B-1)*8) bytes per vertex for B bones.
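To make those byte counts concrete, here's how the two non-skinned layouts could look (illustrative field names; assumes 4-byte floats and no padding):

```cpp
// 28-byte variant: position plus the owning triangle's plane equation.
struct ShadowVertexPlane {
    float pos[3];    // 12 bytes: object-space position
    float plane[4];  // 16 bytes: plane equation (normal + distance) of the face
};

// 24-byte variant: position plus face normal only (needs extra shader work).
struct ShadowVertexNormal {
    float pos[3];    // 12 bytes
    float normal[3]; // 12 bytes
};

static_assert(sizeof(ShadowVertexPlane)  == 28, "expected tightly packed layout");
static_assert(sizeof(ShadowVertexNormal) == 24, "expected tightly packed layout");
```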

So the amount of extra geometry shouldn't be all that high (less than 10 MB for the average game?)... especially if you subtract the amount of geometry that a CPU-based approach would send over per frame, on average (you would need to allocate buffers for that too). And the advantage of course is that the data here is static (this should also allow the driver to 'swap out' unused geometry from videomem to mainmem, or you could do it manually).
Also, you could use lower-resolution meshes as shadowcasters if you like (extruding frontfaces instead of backfaces will move the self-shadowing bugs to the back of the object, which is unlit anyway, so they won't be noticeable).
And if we compare to shadowmaps, well... if you need 6 maps per lightsource, you will quickly run into higher memory demands than with shadowvolumes.

At this time, even if silhouette extraction can be done on the GPU, I wouldn't advocate it, imho.

With the results I've seen, compared to what Doom3 does, I would certainly pick the GPU-based method on all R3x0 cards and up (most DX8-generation cards don't have enough shaderpower to beat the CPU, and on DX7 cards the method would only work with shader emulation, which is definitely a lot slower than a proper CPU-based approach).

I haven't decided yet whether I would favour shadowmaps over shadowvolumes on R3x0 and up, though. My decision at the time to go with shadowvolumes was because of the better compatibility with DX7 hardware, and shadowmaps still have some open issues that keep them from being a robust solution.
And my experiment with cubemap shadowmaps was disappointing: it wasn't faster than the same scene with shadowvolumes, but much lower quality. But on R3x0 you could increase resolution and accuracy, so that would leave mainly the speed and robustness issues.
 
Scali - I mean culling of shadow volume geometry which is completely contained inside other shadow volume geometry (using for example a shadow BSP tree).

Serge
 
Maybe another reason Carmack favors shadow maps now is that it's easier to do soft shadows with them. With shadow volumes you need N times the fillrate for N sampling points, whereas with shadow maps it's just N texture samples.
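A minimal software sketch of that "N texture samples" idea, i.e. percentage-closer filtering (hypothetical names; a real implementation would live in a pixel shader):

```cpp
#include <algorithm>
#include <vector>

// Light-space depth map with clamped texel fetches.
struct DepthMap {
    int w, h;
    std::vector<float> depth;                  // one light-space depth per texel
    float at(int x, int y) const {
        x = std::clamp(x, 0, w - 1);
        y = std::clamp(y, 0, h - 1);
        return depth[y * w + x];
    }
};

// Soft shadow factor in [0,1]: 0 = fully shadowed, 1 = fully lit. Compares
// the receiver's light-space depth against a 3x3 neighbourhood (N = 9
// samples) and averages the pass/fail results, softening the edge.
float pcf(const DepthMap& map, float u, float v, float receiverDepth, float bias)
{
    int cx = int(u * map.w), cy = int(v * map.h);
    float lit = 0.0f;
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx)
            if (receiverDepth - bias <= map.at(cx + dx, cy + dy))
                lit += 1.0f;
    return lit / 9.0f;
}
```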
 
BTW, for those of you that listened to his entire QuakeCon04 talk, how many instances of "Aa-ee" (not "I"... kinda like "y'know" instances!) did he say? :)

Oh hey, does anyone know if his baby has indeed been born?
 
Reverend said:
BTW, for those of you that listened to his entire QuakeCon04 talk, how many instances of "Aa-ee" (not "I"... kinda like "y'know" instances!) did he say? :)

Oh hey, does anyone know if his baby has indeed been born?


Wow, I hear you, Rev. It sounded like he has Tourette's or some other affliction.

Anyhow, it's irrelevant if the genius is within...
 
Well, when considering which technique will be faster, I tend to consider that one technique has tremendous performance implications in some very simple scenarios that you may want to have in a game, while the other doesn't.

As a quick example, just imagine a scene like Far Cry, outside. There's a whole lot of foliage to look through. If you were to attempt to draw the shadows with stencil shadow volumes, the sum of the fillrate requirements would be tremendous. There's also the potential for a fair bit of geometry limitation here, as many of these objects could have quite small silhouette polygons.

By comparison, I see shadow maps as having no significant problems with the complexity of the scene, allowing artists much more free rein over the types of scenes they can design with acceptable performance. There may, of course, be cases where the technique is slower, but it's not nearly as much slower as those cases where shadow volumes just break down utterly.
 
Reverend said:
BTW, for those of you that listened to his entire QuakeCon04 talk, how many instances of "Aa-ee" (not "I"... kinda like "y'know" instances!) did he say? :)

I tend to disregard stuff like that about people. But my girlfriend found it annoying to listen to.
 
Scali - I mean culling of shadow volume geometry which is completely contained inside other shadow volume geometry (using for example a shadow BSP tree).

Does Doom3 even do this? You can't use a BSP tree on skinned geometry anyway.
 
Well, when considering which technique will be faster, I tend to consider that one technique has tremendous performance implications in some very simple scenarios that you may want to have in a game, while the other doesn't.

There are scenarios where shadowmaps have very bad performance while shadowvolumes would be fine, as well.
For example, enabling shadows on each plasma cell that you fire, will probably be worse with shadowmaps.

And then there is still the problem of shadowmaps not being robust. They simply don't work in some cases.

So you are only considering the fact that foliage is a bad case for shadowvolumes. But you cannot base a final decision on that.
If your game doesn't have foliage, why even care?
And you need to look at the disadvantages of shadowmaps as well.
I find the lack of robustness a VERY big issue at this point.

And of course you could always combine the two... using a shadowmap for things like foliage, and shadowvolumes where shadowmaps would look bad or won't work.
 
shadow volumes aren't robust at all either: they don't work at all for any geometry with alpha-transparency maps (or colour-key maps).

if you have any outdoor scene with some trees, you either get very bad shadows, as the tree is low-geometry with alpha-transparency for the leaves, or an immense shadow volume with gigantic overdraw if the trees are actual polygon meshes.

shadow volumes only work well for low-detail scenes, with not much geometry, and not much structure in the geometry. shadowmaps aren't bothered by that.

and most of the other issues are mostly solved as well, with the perspective, the frustum, and the other shadowmap algos.. there are quite a few new approaches, and they work rather well..

shadow volumes have much less stable performance characteristics, and can kill your gpu if you don't have infinite overdraw performance.. think of a grid at the ceiling and a light above shining through.. you see the grid shadows on the floor. in case of shadowmapping, there will be no overdraw at all (not more than the scene has, that is). with shadowvolumes, you get another full screen-size overdraw per gridline (actually, two), just to determine there isn't any visible shadow in front of you, except on the floor..
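A back-of-envelope estimate of that grid scenario (my arithmetic, assuming each bar's volume covers the whole screen): with $G$ bars on a $W \times H$ screen,

$$\text{stencil fill} \approx 2\,G\,W\,H \ \text{pixels per frame},$$

e.g. $G = 20$ at $1024 \times 768$ already means roughly 31 million stencil updates, before any lighting is drawn.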

shadow volumes perform really badly. and they don't work well on natural geometry at all
 
I don't know much about it, so I am probably about to say something quite stupid. So I would appreciate it if you could tell me what this method is already called and what the largest problems with it are.

Looking at shadows and such, why not just render the scene from the viewpoint of each light source, taking only the transparency layer into account and rendering only the z-value? That way, you get a z-buffer full of intensity values that you can scale down and save as a texture.

If you do that for each light, you get a set of texture maps that tell you exactly how much each pixel is to be illuminated. And when you use texture filtering, you can have soft edges as well.

The obvious problem with that method seems to me, that you have to calculate the position of the texel for each map and look it up, but isn't that what shadow maps do as well?

Another problem would be reflective surfaces like mirrors, those would require you to render the scene even more times.

It would probably be a lot nicer if you had a mathematical method, like using the vector of the normal map, but you might be able to use that to limit the number of intensity textures you have to look up.

Anyway, shoot!
 
shadow volumes aren't robust at all either: they don't work at all for any geometry with alpha-transparency maps (or colour-key maps).

Texturemaps are not geometry.
Shadowvolumes are robust for all geometry.
But indeed, when using normalmaps or opacitymaps to simulate geometry, these are not taken into account with shadowvolumes.

shadow volumes only work well for low-detail scenes, with not much geometry, and not much structure in the geometry. shadowmaps aren't bothered by that.

As I stated before in this thread, when using cubemap shadows, you have to process a lot more geometry per frame on average. So as polycount scales, cubemaps will perform worse than shadowvolumes. So it is certainly not true that the shadowmap method is not affected by the amount of geometry.

and most of the other issues are mostly solved as well, with the perspective, the frustum, and the other shadowmap algos.. there are quite a few new approaches, and they work rather well..

'rather well' is not the same as robust. There are still problems where you either need too much resolution in the shadowmap to be useful for realtime, or cases that simply cannot be handled, because of bias problems or such.
Shadowmaps work in many situations, but not in all situations. This is one of the main reasons why they aren't being used much yet.

shadow volumes perform really badly. and they don't work well on natural geometry at all

But Doom3 proves that shadowvolumes are fast and robust enough to do an entire game with. I have yet to see a game which fully relies on shadowmaps.

I find the arguments for shadowmaps similar to the ones used for raytracing. It is more elegant, and in theory it would be a better solution, but there are some practical problems that have yet to be overcome, which makes the 'less elegant' solution more practical at this time.

Of course I would use shadowmaps if all problems were solved, and of course I would prefer raytracing over triangle rasterizing if it could be done at the same speed (and get at least the same image quality), but at this time these problems are not solved, so I choose the best alternative for the case at hand.
 
So, I guess my question was really stupid and wouldn't work at all? Not even when you render the lightmaps at a very low resolution, so you don't have to scale them?

Anyway, I was expecting that. But I would really like to hear why it is stupid and wouldn't work.
 
Scali said:
As I stated before in this thread, when using cubemap shadows, you have to process a lot more geometry per frame on average. So as polycount scales, cubemaps will perform worse than shadowvolumes. So it is certainly not true that the shadowmap method is not affected by the amount of geometry.
I don't quite get this argument. Rendering a cubemap is no more geometry-limited than rendering a single shadow map; you just have to render more maps. The nice thing about shadow maps is that their performance characteristics mirror normal rendering. Yes, they take less fillrate to render, so they will be more geometry-limited than normal rendering, but you don't have the potentially disastrous scenarios like you do for shadow volumes.

In the plasma gun example, with each particle casting a light, shadow maps will be no slower for all of those lights than they would be for just one. So you don't have a catastrophic performance collapse here. You just have a situation that may not be completely optimal.

'rather well' is not the same as robust. There are still problems where you either need too much resolution in the shadowmap to be useful for realtime, or cases that simply cannot be handled, because of bias problems or such.
Shadowmaps work in many situations, but not in all situations. This is one of the main reasons why they aren't being used much yet.
And I claim that these problems will have to be dealt with, in some fashion, for shadows to become ubiquitous in future games, as they are bound to become. The fact remains that the performance benefits of shadow maps over shadow volumes are too great to deny.

But Doom3 proves that shadowvolumes are fast and robust enough to do an entire game with. I have yet to see a game which fully relies on shadowmaps.
Except that it does it with low geometry counts and a limited type of game environment.
 