Carmack to use shadow maps in the next game?

So, what do you think of my method? Are there any problems with it? If you assume that the other lighting methods use the same number of passes (as it seems they do), it would be just as fast, totally dynamic, and produce very nice, realistic lighting, as far as I can see.

But in that case it would probably be in use right now, so what is the showstopper?
 
Just because it works for your data set, don't assume that it's a rule that applies to mine or Carmack's engine.

What if my dataset is equivalent to the one used by Carmack's engine?
I never said it was a rule; I just said that on modern GPUs, polycount will scale higher when you offload the processing to the GPU.

If Carmack had done more shadow volume work on the GPU, the game would be much worse for me personally (I play it on a 3.0GHz CPU with a relatively crappy GPU; it's completely GPU-bound now, with CPU silhouette extraction).

So I trust Carmack made the right choice, for his engine and game. Works in my favor at least.

Just because it works in your favour doesn't mean it works in everyone's favour. Besides, I never said to abandon CPU-based paths for PCs with GPUs that are not faster than their CPUs, so it would still work in your favour if there were an accelerated GPU path. But then it would work in my favour as well. Right now Doom3 is the only game in the world that is unplayable on my PC, and I am convinced it will be the only one for quite a while (with the exception of other games based on the engine).
I trust Carmack made the wrong choice. He could add a GPU-accelerated path in less than a day, I know I did. Since his game was delayed over a year anyway, one day more or less wouldn't matter. So I find it inexcusable that he failed to implement such a path.
 
Scali said:
Two maps are no different than one or ten in their performance implications. I really don't see how this is a particularly bad case for shadow maps. In fact, I would tend to think that shadow volumes would tend to be a bit worse in terms of performance implications for multiple lights.

If we assume vertex-limited scenes, obviously the method that requires the least passes will be the fastest.
So then two maps will be twice as slow as one map, and ten maps will be ten times as slow as one map.
If you need 6 maps per light, you render the scene 6 times for the shadowmaps, while you would render it once for the shadowvolumes. You do the math.
Really, it is not that hard to understand.
You're simplifying the problem: a single shadow volume will be faster than 6 shadow maps, but not by as massive an amount as it looks at first.

Assuming you're vertex-limited, with stencil shadows (especially with GPU silhouettes) each pass over the geometry will cost a lot more than a single shadow map pass.
The high fill-rate of stencil shadows causes pipeline bubbles; shadow map generation usually doesn't (simple vertex and pixel shaders, no massive extrusion), so you often get a much higher vertex rate per pass for shadow maps.

Now it's not likely to compensate for a pass count 6 times higher, but it's likely to cover 2-3 times (dual paraboloids, perhaps?).

Also, shadow maps (non-perspective) can have their generation cost amortized over frames. Stencil volumes can't; this is another factor that makes shadow maps a good choice.
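
To illustrate the amortization idea, a minimal C++ sketch; Light, ShadowMap and renderDepthFromLight are made-up placeholders, not any real engine's API:

Code:
#include <vector>

struct ShadowMap { /* cached depth texture for one light */ };

struct Light {
    bool moved;          // the light itself moved this frame
    bool castersMoved;   // a caster inside its volume moved
    ShadowMap map;       // cached map from previous frames
};

void renderDepthFromLight(Light& l, ShadowMap& map);  // placeholder

void updateShadowMaps(std::vector<Light>& lights) {
    for (Light& l : lights) {
        // Only regenerate when stale: a static light with static
        // casters reuses last frame's map for free.
        if (l.moved || l.castersMoved) {
            renderDepthFromLight(l, l.map);
            l.moved = l.castersMoved = false;
        }
    }
    // Stencil volumes have no such shortcut: the volume geometry has to
    // be re-rendered into the stencil buffer every single frame.
}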
 
DeanoC, can you tell me what your view is of using my proposed projected z-buffer illumination maps to do the lighting of a rendered scene?
 
Scali said:
If we assume vertex-limited scenes, obviously the method that requires the least passes will be the fastest.
So then two maps will be twice as slow as one map, and ten maps will be ten times as slow as one map.
If you need 6 maps per light, you render the scene 6 times for the shadowmaps, while you would render it once for the shadowvolumes. You do the math.
Really, it is not that hard to understand.
First of all, this has nothing to do with the number of lights, which is what I was replying about.

Secondly, this has nothing to do with the number of passes. Rendering a cubemap can hardly be thought of as multipass rendering, as you're rendering entirely different geometry for each of the six faces, and therefore won't typically need to share all that much geometry between the various faces.

So yes, obviously shadow maps increase the geometry load on the GPU, due to the simple fact that they require little fillrate compared to normal rendering, but about the same amount of geometry power. All this does is reduce the amount of geometry you can realistically have in the scene before it starts to get significantly geometry-limited. This performance characteristic alone doesn't really dramatically change the types of scenes you can render.
 
So.

- Static shadow maps require you to calculate them up front and don't account for moving objects, so they are used for global illumination only.
Passes: none extra.

- Dynamic shadow cube maps require 6 maps to be calculated, but have soft edges.
Passes: 1 * 6 per light.

- Shadow volumes and stencil shadows require calculating new geometry and an extra z-buffer pass and have hard edges.
Passes: 1 to 6 per light.

- Projected textures are used on top of other shadowing methods.
Passes: 1 per light (not omnidirectional).

- Projected z-buffer illumination maps are shadow volumes done like dynamic shadow cube maps and have soft edges.
Passes: 1 to 6 per light.

Correct?

Edit: This is all without the final rendering pass.

Edit2: Corrected the passes for static shadow maps.
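
Edit3: A rough sketch of the pass arithmetic above (a hypothetical helper, not a real API; "passes" meaning how many times the scene geometry is processed for shadow generation, excluding the final rendering pass):

Code:
// Hypothetical helper: geometry passes spent on shadow generation.
int shadowGenPasses(int numLights, int facesPerLight /* 1 to 6 */) {
    return numLights * facesPerLight;
}

// For example, 4 omnidirectional lights with full cube maps:
//   shadowGenPasses(4, 6) == 24 geometry passes before lighting,
// versus shadowGenPasses(4, 1) == 4 for one shadow volume per light.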
 
Shadow volumes require one pass per light for shadows, and one pass per light for lighting.

Shadow maps require one pass per light for shadows for "directional" lighting (such as a flashlight or sunlight). A cubemap (6 textures rendered) is required for lighting that can point in any direction. I don't think 6 passes is an accurate depiction of what is being done here, though, as each of the 6 textures that make up a cubemap renders entirely different stuff.

Also, one of the really nice things about texture-based approaches is that it becomes easier to lump the lighting together into a single pass. For example, with shadow maps, you merely have to have an additional texture for each shadow that is calculated for the surface. This will result in fewer lighting calculations, fewer texture reads (for example, only need to read in the normal map and color textures once), and fewer framebuffer accesses.

Projected textures, similarly, would simply add to the length of the shader program for that surface, and therefore won't add another pass.
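
Something like this, as a CPU-style sketch (every name below is a placeholder, just to show where the texture reads end up):

Code:
#include <vector>

struct Vec3  { float x, y, z; };
struct Color {
    float r, g, b;
    Color& operator+=(const Color& o) { r += o.r; g += o.g; b += o.b; return *this; }
};
struct Pixel;  // a fragment being shaded (placeholder)
struct Light;  // a light with an associated shadow map (placeholder)

Vec3  sampleNormalMap(const Pixel& p);                  // placeholder
Color sampleColorMap(const Pixel& p);                   // placeholder
float sampleShadowMap(const Light& l, const Pixel& p);  // 1 = lit, 0 = shadowed
Color computeLighting(const Vec3& n, const Color& albedo,
                      const Light& l, const Pixel& p);  // placeholder
inline Color operator*(float s, const Color& c) { return { s*c.r, s*c.g, s*c.b }; }

Color shadePixel(const Pixel& p, const std::vector<Light>& lights) {
    Vec3  n      = sampleNormalMap(p);  // normal map read once
    Color albedo = sampleColorMap(p);   // color texture read once
    Color result = {0, 0, 0};
    for (const Light& l : lights) {
        float lit = sampleShadowMap(l, p);  // one shadow map read per light
        result += lit * computeLighting(n, albedo, l, p);
    }
    return result;  // one framebuffer write covers all the lights
}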
 
Secondly, this has nothing to do with the number of passes.
Rendering a cubemap can hardly be thought of as multipass rendering, as you're rendering entirely different geometry for each of the six faces, and therefore won't typically need to share all that much geometry between the various faces.

Obviously I wasn't talking about passes in terms of multipass rendering, but in terms of the number of times the scene geometry has to be processed. I don't see why you are trying to pull this out of context; I think it was clear what I meant. I never actually used the word 'multipass' anyway.

While it is true that not every pass has to render every vertex, you get extra workload from the sheer number of render calls, the frustum culling, and the rendering of meshes that sit inside multiple frusta (the worst case being inside all 6 of them).
Which all adds to the fact that increasing the amount of geometry in a scene will adversely affect performance.
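
Roughly this, per light (a sketch only; every type and function is a made-up placeholder):

Code:
#include <vector>

struct PointLight;                        // placeholder
struct Bounds;                            // placeholder bounding volume
struct Object  { const Bounds* bounds; };
struct Frustum { /* six planes of one 90-degree cube face */ };

Frustum faceFrustum(const PointLight& l, int face);      // placeholder
bool    intersects(const Frustum& f, const Bounds* b);   // placeholder
void    drawDepthOnly(const Object& o, const Frustum& f);// placeholder

void renderCubeShadowMap(const PointLight& light,
                         const std::vector<Object>& objects) {
    for (int face = 0; face < 6; ++face) {
        Frustum f = faceFrustum(light, face);
        for (const Object& obj : objects) {
            // Every object is frustum-tested 6 times per light, and an
            // object straddling several face frusta is drawn once per
            // face it touches, up to 6 times in the worst case.
            if (intersects(f, obj.bounds))
                drawDepthOnly(obj, f);
        }
    }
}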

All this does is reduce the amount of geometry you can realistically have in the scene before it starts to get significantly geometry-limited. This performance characteristic alone doesn't really dramatically change the types of scenes you can render.

Not the types no, but I believe the original argument was solely about geometric complexity.
 
Chalnoth said:
Shadow volumes require one pass per light for shadows, and one pass per light for lighting.

As far as I understand, that depends on the specific way you implement them, so I added the one extra z-buffer pass bit.

Shadow maps require one pass per light for shadows for "directional" lighting (such as a flashlight or sunlight). A cubemap (6 textures rendered) is required for lighting that can point in any direction. I don't think 6 passes is an accurate depiction of what is being done here, though, as each of the 6 textures that make up a cubemap renders entirely different stuff.

OK. We should split this one into multiple cases: one pass, but calculating 1 to 6 maps, much like the other cases; you would have to break them down into multiple passes or use branching to get the benefit.

Also, one of the really nice things about texture-based approaches is that it becomes easier to lump the lighting together into a single pass. For example, with shadow maps, you merely have to have an additional texture for each shadow that is calculated for the surface. This will result in fewer lighting calculations, fewer texture reads (for example, only need to read in the normal map and color textures once), and fewer framebuffer accesses.

That's why they all have that final rendering pass, with only the number of maps determining its length. Early-out would be optimal here, I think.

Projected textures, similarly, would simply add to the length of the shader program for that surface, and therefore won't add another pass.

As would the projected z-buffer illumination maps.

Btw, that name is way too long. Who can think of a better one?
 
Scali said:
Secondly, this has nothing to do with the number of passes.
Rendering a cubemap can hardly be thought of as multipass rendering, as you're rendering entirely different geometry for each of the six faces, and therefore won't typically need to share all that much geometry between the various faces.

Obviously I wasn't talking about passes in terms of multipass rendering, but in terms of the number of times the scene geometry has to be processed. I don't see why you are trying to pull this out of context; I think it was clear what I meant. I never actually used the word 'multipass' anyway.

While it is true that not every pass has to render every vertex, you get extra workload from the sheer number of render calls, the frustum culling, and the rendering of meshes that sit inside multiple frusta (the worst case being inside all 6 of them).
Which all adds to the fact that increasing the amount of geometry in a scene will adversely affect performance.

All this does is reduce the amount of geometry you can realistically have in the scene before it starts to get significantly geometry-limited. This performance characteristic alone doesn't really dramatically change the types of scenes you can render.

Not the types no, but I believe the original argument was solely about geometric complexity.

Wouldn't that depend heavily on the viewpoint and the depth of the scene used for the partial rendering? And the resolution as well, if it is feasible to switch to a lower one for the maps?

Would it be better to upload all the geometry first and just transform all of it multiple times, if at all possible, instead of uploading multiple, smaller scenes?
 
Scali said:
Which all adds to the fact that increasing the amount of geometry in a scene will adversely affect performance.
Fine, but I still don't think it's nearly as severe a problem as stencil shadow volumes. Stencil shadow volumes really break down if you have a lot of objects, for example. Shadow buffers don't really care that much whether you have a lot of objects or a few complex objects.
 
Fine, but I still don't think it's nearly as severe a problem as stencil shadow volumes. Stencil shadow volumes really break down if you have a lot of objects, for example.

Do they? In terms of fillrate, perhaps, but that would depend on the shape, size and arrangement of these objects and the light sources, which we weren't considering anyway. In terms of geometry processing, I am still of the opinion that cubemaps require more processing (6 frusta per light per object, and the possibility of objects being in multiple frusta, or worse: brute-forcing everything against every map).

Shadow buffers don't really care that much whether you have a lot of objects or a few complex objects.

As stated above, I do not agree here.
 
Wouldn't that depend heavily on the viewpoint and the depth of the scene used for the partial rendering?

Careful there. Shadows are view-independent, and be aware of the fact that objects that are not inside the viewing frustum can still cast shadows on objects that are.

And the resolution as well, if it is feasible to switch to a lower one for the maps?

The resolution of the shadowmap only affects the fillrate requirements (and the quality, of course). The amount of geometry that has to be processed doesn't change. Since we were considering high-poly scenes (not fillrate-limited), changing the resolution of the shadowmap would have little or no effect on the rendering speed.

Would it be better to upload all the geometry first and just transform all of it multiple times, if at all possible, instead of uploading multiple, smaller scenes?

Ideally you always want static geometry. In most cases it is faster to run a vertex shader multiple times than to have the CPU upload new geometry.
In a lot of cases, only things like matrices and other shader constants change every frame, and the geometry is left untouched.
Doom3 doesn't do this, which I consider the reason for its high CPU demands and relatively low polycount. Sadly some people think Carmack is a god, so they think I'm crazy.
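
The pattern I mean, sketched in C++ (the API names are invented for illustration; this is not D3D or OpenGL):

Code:
#include <string>
#include <vector>

struct Matrix4 { float m[16]; };
struct VertexBuffer;                      // lives in video memory
struct StaticMesh { VertexBuffer* vb; };  // filled once at load time

void setShaderConstant(const std::string& name, const Matrix4& m);  // invented
void draw(VertexBuffer* vb);                                        // invented

void renderFrame(const StaticMesh& mesh,
                 const std::vector<Matrix4>& passTransforms) {
    // Per frame, only the matrices change; the geometry is never touched.
    for (const Matrix4& m : passTransforms) {
        setShaderConstant("worldViewProj", m);  // tiny per-pass upload
        draw(mesh.vb);                          // GPU re-runs the vertex shader
    }
    // The alternative (the CPU rebuilding and re-uploading the vertex
    // data every pass) is exactly what keeps a renderer CPU-bound.
}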
 
Scali said:
Wouldn't that depend heavily on the viewpoint and the depth of the scene used for the partial rendering?

Careful there. Shadows are view-independent, and be aware of the fact that objects that are not inside the viewing frustum can still cast shadows on objects that are.

Yes, but that is only a problem if you use geometry, isn't it? Of course you have to use more maps with other methods, but that only makes a difference for the last render pass.

And the resolution as well, if it is feasible to switch to a lower one for the maps?

The resolution of the shadowmap only affects the fillrate requirements (and the quality, of course). The amount of geometry that has to be processed doesn't change. Since we were considering high-poly scenes (not fillrate-limited), changing the resolution of the shadowmap would have little or no effect on the rendering speed.

True, but a lot of large maps would have a large memory requirement.

Would it be better to upload all the geometry first and just transform all of it multiple times, if at all possible, instead of uploading multiple, smaller scenes?

Ideally you always want static geometry. In most cases it is faster to run a vertex shader multiple times than to have the CPU upload new geometry.
In a lot of cases, only things like matrices and other shader constants change every frame, and the geometry is left untouched.
Doom3 doesn't do this, which I consider the reason for its high CPU demands and relatively low polycount. Sadly some people think Carmack is a god, so they think I'm crazy.

Well, I don't. :D

If I did, I would never have had the guts to throw this new method in (although I was expecting everyone to shoot me down fast ;) ).

And that's how it works, isn't it? People like us, discussing really interesting new things on forums like this. Or would all the developers do their magic just by themselves? I don't believe that for a second.
 
Wasn't it mentioned in the leaked DX10 slides that cubemaps would be rendered in a single pass on DX10-level hardware? If so, would this make shadow maps for point lights more attractive?
 
akira888 said:
Wasn't it mentioned in the leaked DX10 slides that cubemaps would be rendered in a single pass on DX10-level hardware? If so, would this make shadow maps for point lights more attractive?
Either you misunderstood what is required for rendering shadowmap information to a cubemap, or I'm misunderstanding you. :) Anyway, rendering to a cubemap takes six "passes" because one has to render each direction. The only way this could be done in a single shot is if there were some special "render-to-cubemap" functionality that would rasterize for all directions, and not just within a <180° FOV.
 
Ostsol said:
akira888 said:
Wasn't it mentioned in the leaked DX10 slides that cubemaps would be rendered in a single pass on DX10-level hardware? If so, would this make shadow maps for point lights more attractive?
Either you misunderstood what is required for rendering shadowmap information to a cubemap, or I'm misunderstanding you. :) Anyway, rendering to a cubemap takes six "passes" because one has to render each direction. The only way this could be done in a single shot is if there were some special "render-to-cubemap" functionality that would rasterize for all directions, and not just within a <180° FOV.

No, he's right; Microsoft have mentioned this in DX10 talks.

The functionality is basically a render target array, although I have no real detail on how it's supposed to work. I'd assume that you still need to do all 6 projections and set some sort of mask to indicate which targets are live for a given tri; I can't see any other way to implement it.
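
Conceptually something like this, although I stress it's pure speculation (all names invented):

Code:
struct PointLight;                              // placeholder
struct Triangle { /* three vertices */ };       // placeholder
struct Frustum  { /* one 90-degree face */ };   // placeholder

Frustum  faceFrustum(const PointLight& l, int face);     // invented
Triangle project(const Triangle& t, const Frustum& f);   // invented
bool     offscreen(const Triangle& t);                   // invented
void     rasterizeToSlice(int face, const Triangle& t);  // target array slice

void submitTriangle(const Triangle& tri, const PointLight& light) {
    // Still all 6 projections per triangle; the mask just tells the
    // hardware which slices of the render target array are live.
    unsigned liveMask = 0;
    for (int face = 0; face < 6; ++face)
        if (!offscreen(project(tri, faceFrustum(light, face))))
            liveMask |= 1u << face;
    for (int face = 0; face < 6; ++face)
        if (liveMask & (1u << face))
            rasterizeToSlice(face, project(tri, faceFrustum(light, face)));
}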

FWIW, IMO both shadow maps and shadow volumes have their problems; neither one solves the general problem particularly well. I've used both in various projects, and I'd use shadow maps as a first choice (they suck slightly less, IMO), but my games aren't set in a constrained Doom-like environment.
 
So, that would require 1 to 6 passes for each light for shadow (cube) maps as well, with a future DX10 solution to speed it up a bit? Sounds logical.

Can anyone try my proposed solution and tell me if it works as expected? I would really love that.
 
And that's how it works, isn't it? People like us, discussing really interesting new things on forums like this. Or would all the developers do their magic just by themselves? I don't believe that for a second.

That's the way I work anyway. I read papers, interviews, talk to developers via IM or email, forums, or whatever... and somehow connect the fragments of info into a complete 3d engine.
My stencil shadowing method has ideas borrowed from various places: zfail/zpass and scissoring from Carmack himself, GPU extraction and skinning from a 3DMark03/NVIDIA paper, and lower-resolution shadow geometry with extruded frontfaces instead of backfaces (to form caps) from, I believe, a Flipcode IOTD... The bounding volume idea was my own, although I'm sure others use that as well... And perhaps something else that I forgot :p

Can anyone try my proposed solution and tell me if it works as expected? I would really love that.

Well, to me it sounds like it's the same as regular shadowmapping, except that you want to use the z-value as a light intensity. Normally the z-value is only used to determine whether a pixel is the closest one to the light or not (if it isn't the closest, there must be one in front of it, so this one is in shadow).
But I suppose your idea behind this is that the intensity can now be filtered, which would give soft shadows. To be honest, I'm not sure how well that would work. I suppose you'd have to implement it and see.
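
For reference, the standard shadowmap test looks something like this (placeholder names, and a hypothetical BIAS constant against self-shadowing artifacts):

Code:
struct Vec3 { float x, y, z; };
struct Light;  // placeholder

Vec3  transformToLightSpace(const Vec3& worldPos, const Light& l);  // placeholder
float sampleShadowMap(const Light& l, float u, float v);            // placeholder

const float BIAS = 0.001f;  // small tunable offset against "shadow acne"

bool inShadow(const Vec3& worldPos, const Light& light) {
    Vec3  ls      = transformToLightSpace(worldPos, light);
    float storedZ = sampleShadowMap(light, ls.x, ls.y);
    // The stored z is only *compared*: if something nearer to the light
    // was recorded at this texel, this point lies behind it, so shadowed.
    return ls.z > storedZ + BIAS;
}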
 
Scali said:
Well, to me it sounds like it's the same as regular shadowmapping, except that you want to use the z-value as a light intensity. Normally the z-value is only used to determine whether a pixel is the closest one to the light or not (if it isn't the closest, there must be one in front of it, so this one is in shadow).
But I suppose your idea behind this is that the intensity can now be filtered, which would give soft shadows. To be honest, I'm not sure how well that would work. I suppose you'd have to implement it and see.
Actually, I don't see how this method would work at all, since you never see a shadow from the POV of the light source. So of what use would an "intensity value" be?
 