John Carmack said: So the solution that I'm looking at for outdoor lighting on there is a multi-level sort of mipmap... cropped mipmaps of shadow buffers, where you have your 1k by 1k shadow buffer which renders only, say, the two thousand units nearest you, and it's cropped exactly to cover that area dynamically. Then you start scaling by powers of two out from there until you've covered the entire world, which may require five or six shadow buffers depending on how big your outdoor area is...
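A rough sketch of the cascade arithmetic he is describing, assuming a 1k x 1k buffer per level and the 2000-unit base range from the quote (the names and world size here are invented for illustration, not id's code):

```cpp
#include <cstdio>

// Each level is a 1k x 1k shadow buffer cropped to a square region around
// the viewer; every level doubles the range covered, so a handful of
// buffers spans the whole world.
struct Cascade {
    float range;      // world units covered by this level
    float texelSize;  // world units per shadow-map texel
};

int buildCascades(Cascade out[], float baseRange, float worldSize, int maxLevels) {
    int n = 0;
    float range = baseRange;                // e.g. the 2000 nearest units
    while (n < maxLevels) {
        out[n].range = range;
        out[n].texelSize = range / 1024.0f; // 1k x 1k buffer at every level
        ++n;
        if (range >= worldSize) break;      // entire world covered
        range *= 2.0f;                      // power-of-two step outward
    }
    return n;
}

int main() {
    Cascade c[8];
    int n = buildCascades(c, 2000.0f, 50000.0f, 8); // six levels for this world
    for (int i = 0; i < n; ++i)
        std::printf("level %d: %.0f units, %.2f units/texel\n",
                    i, c[i].range, c[i].texelSize);
}
```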
***
...I actually dynamically scale the resolution of every single light that's drawn based upon how big it is on screen, and you can throw other parameters into the heuristic that you use for deciding that. Because of the way I select out the areas of the screen that are going to receive shadow calculations (I use stencil buffer tests for that, so all the work with stencil buffers and the algorithms for that is still having some pay-off in the new engine, even though we're not directly using them for shadowing), I don't require clamping or even power-of-two sizes on the shadow buffers...
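As a toy illustration of that kind of screen-size heuristic (every constant here is invented; the quote doesn't give id's actual formula), note that nothing in it forces the result to a power of two:

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical heuristic: roughly one shadow texel per covered screen
// pixel, clamped to sane bounds. Because the receiving pixels are masked
// per-pixel (he selects them with stencil tests), the buffer needs no
// clamping and no power-of-two padding; any size that covers the light works.
int shadowBufferResolution(float lightScreenRadiusPixels) {
    const int kMin = 64, kMax = 2048;  // invented bounds
    int res = static_cast<int>(std::ceil(2.0f * lightScreenRadiusPixels));
    return std::clamp(res, kMin, kMax);  // e.g. radius 300 -> a 600x600 buffer
}
```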
Reverend said: I was discussing with Tim about the importance of vertex shaders and then about shadows. Tim had told me quite some time back that vertex shaders should be fairly simple ones that set up vectors for more complex pixel shaders (and that is his opinion, given the current industry -- software and hardware -- progress).

John Carmack said: That has been my position as well.

Reverend said: However, with the current choice of "favourite" shadowing being a toss-up between shadow buffers and shadow volumes (hacked, when necessary, maybe for soft-shadowing), shouldn't more emphasis be placed (mostly by the IHVs) on vertex shader performance and features? It, of course, depends on how shadowing shakes out in time to come: stencil volumes are fill-costly due to the shadow geometry, while buffers need sufficiently high resolution to avoid aliasing. There's also the influence of multisampling on developers. How do you see vertex shading features and performance shaping up when it comes to the consideration of shadowing?

John Carmack said: I have both methods running in the new codebase, because the driver paths for shadow buffers aren't optimized enough to make a determination about final performance yet. The biggest hit in vertex shader performance right now is the fact that you may need to evaluate them multiple times for multiple passes. Allowing shaders to write back to vertex buffers would be a bigger improvement than doubling vertex shader performance.
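To make that cost argument concrete, here is a minimal model (toy numbers, not a benchmark or engine code) of why write-back can beat a twice-as-fast vertex unit once you have several lighting passes:

```cpp
#include <cstdio>

// With N lighting passes the full vertex shader (skinning + transform)
// re-runs every pass. Doubling vertex-unit speed halves that cost; writing
// the skinned result back to a vertex buffer removes the redundancy entirely.
int main() {
    const long verts = 100000, passes = 5;
    long skinToday     = verts * passes;  // re-skinned in every pass
    long skinDoubledHW = skinToday / 2;   // "twice as fast" vertex hardware
    long skinWriteBack = verts;           // skinned once, reused per pass
    std::printf("skinning work today:        %ld vertex-skins\n", skinToday);
    std::printf("with 2x vertex hardware:    %ld (effective)\n", skinDoubledHW);
    std::printf("with write-back to buffers: %ld\n", skinWriteBack);
}
```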
Reverend said: Can anyone give me an example of situations where we would need to "evaluate vertex shaders multiple times for multiple passes"?
V3 said:
John Carmack said: Allowing shaders to write back to vertex buffers would be a bigger improvement than doubling vertex shader performance.
I'd like to see something like that. The Playstation 2 could do it; the PC should be able to do it as well, if they're going the multiple-passes route.
V3 said: Really? That's cool. Which OGL extension can be used for this?
BTW, you didn't mean Render to Vertex Array, did you?
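For what it's worth, one way "render to vertex array" was exposed on PC OpenGL is via pixel buffer objects: render the processed vertices into a float color buffer, copy them into a buffer object, and re-bind that same buffer as a vertex array. A minimal sketch, assuming ARB_pixel_buffer_object and ARB_vertex_buffer_object are available and the extension entry points are already resolved:

```cpp
#include <GL/gl.h>
#include <GL/glext.h>  // GL_PIXEL_PACK_BUFFER_ARB, GL_ARRAY_BUFFER_ARB

// Assumes 'buf' was created with glGenBuffersARB and the framebuffer
// already holds one float xyz position per pixel; error handling omitted.
void framebufferToVertexArray(GLuint buf, int w, int h) {
    // 1. Pack the rendered positions straight into the buffer object
    //    (the last argument is an offset into the PBO, not a CPU pointer).
    glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, buf);
    glReadPixels(0, 0, w, h, GL_RGB, GL_FLOAT, (void*)0);
    glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, 0);

    // 2. Re-bind the very same buffer as a vertex array source, so the
    //    data never makes a round trip through system memory.
    glBindBufferARB(GL_ARRAY_BUFFER_ARB, buf);
    glVertexPointer(3, GL_FLOAT, 0, (void*)0);
    glEnableClientState(GL_VERTEX_ARRAY);
    // ... draw with the processed vertices, then unbind ...
}
```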
Reverend said: More from John:

John Carmack said: I have both methods running in the new codebase, because the driver paths for shadow buffers aren't optimized enough to make a determination about final performance yet.

Reverend said: It's not currently a hardware (NV4x, R4xx) issue and you think driver improvements can be made? I was also talking to a <snipped game developer> about shadow buffers and how drivers *plus* hardware limitations appear to be maxed out already with regards to the latest from both NVIDIA and ATI... what sort of things in the drivers could be improved?

John Carmack said: Drivers might be maxed out on D3D, but they certainly aren't on OpenGL.

Okay... so what are the differences between OGL and D3D with regard to shadow buffers? Is this simply a matter of IHV priorities (driver development for OGL versus MS/D3D), or are there some important differences between the two APIs as far as shadow buffers are concerned? I'm no driver developer/engineer... any input from the IHV personnel / driver developers here?
Reverend said: Can anyone give me an example of situations where we would need to "evaluate vertex shaders multiple times for multiple passes"?
Wouldn't Doom 3 or 3DMark03's Proxycon be examples? For each pass, the vertices have to be transformed again, since the results are not stored in memory. In the case of Proxycon, skinning must also occur. Doom 3 performs skinning in software, leaving only transformation to the vertex shader -- an extremely light operation.
Ostsol said:What Carmack is suggesting is actually quite attractive. One could set up a target vertex buffer where the vertex shader would write skinned and/or transformed vertices to, reducing the load on vertex processing for subsequent passes and therefore making higher polygon counts and/or more lights much more practical.
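A sketch of the frame structure that idea implies: pay for skinning once, then let every pass read the pre-skinned buffer, so the per-pass vertex shader does transformation only. Every name here is a hypothetical stand-in, not anything from the engine:

```cpp
#include <vector>

struct Vec3  { float x, y, z; };
struct Light { Vec3 pos; };

std::vector<Vec3> skinMesh(const std::vector<Vec3>& bindPose) {
    return bindPose;  // placeholder for the real bone-weight blend
}

void drawPass(const Light&, const std::vector<Vec3>&) {
    // submit the pre-skinned buffer; per-pass vertex work is transform only
}

void renderFrame(const std::vector<Vec3>& bindPose,
                 const std::vector<Light>& lights) {
    std::vector<Vec3> skinned = skinMesh(bindPose);  // paid once per frame
    for (const Light& l : lights) {
        drawPass(l, skinned);  // shadow pass reuses the skinned vertices
        drawPass(l, skinned);  // lighting pass reuses them again
    }
}
```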
John Carmack said: With shadow buffers, in the new versions that I've been working with, a few things have changed since the time of the original Doom 3 specifications. One is that we have fragment programs now, so we can do pretty sophisticated filtering, and that turns out to be the critical thing. Even if you take the built-in hardware percentage-closer filtering [PCF] and you render obscenely high resolution shadow maps (2000x2000 or more), it still doesn't look good. In general, it's easy to make them look much worse than the stencil shadow volumes when you're at that basic, hardware-only level of filtering. You end up with all the problems you have with biases and pixel-grain issues, and it's just not all that great. However, when you start to add a bit of randomized jitter to the samples (you have to take quite a few samples to make it look decent), it changes the picture completely. Four randomized samples is probably going to be our baseline spec for normal shipping quality on the next game. That looks pretty good. If you look at broader soft shadows, there's a little bit of fizzly pixel jitter as things jump around, but the randomized stuff does look a lot better than any kind of static allocation. It should be a good enough level, and the nice thing is that because the shadow sampling calculation is completely separated from the other aspects of the rendering engine, you can toss in basically as many samples as you want. In my current research I've got a zero-sample version, which is the hardware PCF, for comparison; a single sample that's randomly jittered; four samples as kind of the baseline; and also a sixteen-sample version which can give you very nice, high quality soft shadows. I'll probably toss in even higher ones, like a 25- or 64-sample version, which will mostly be used for offline rendering work: if people want to have something render and they don't mind it running at a few frames a second, you can get literally film-quality shadowing effects out of this, just by changing the number of samples. This winds up being very close to the algorithm that Pixar has used for a great many of the RenderMan-based movies, and it's now just running on the GPUs in real time at the lower sample levels.
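A CPU-side reference of the sampling scheme described above, with a plain depth comparison per tap. This is an illustration of the idea, not id's shader; the jitter source and the edge clamping are placeholder choices:

```cpp
#include <cstdlib>
#include <vector>

// Take a few randomly jittered taps around the projected shadow-map
// coordinate, depth-compare each one, and average the pass/fail results.
// Raising kSamples (4 -> 16 -> 64) trades speed for softer, cleaner shadows.
struct DepthMap {
    int w, h;
    std::vector<float> depth;                  // light-space depth per texel
    float at(int x, int y) const {
        x = x < 0 ? 0 : (x >= w ? w - 1 : x);  // clamp to edge
        y = y < 0 ? 0 : (y >= h ? h - 1 : y);
        return depth[y * w + x];
    }
};

float shadowFactor(const DepthMap& map, float u, float v,
                   float receiverDepth, int kSamples, float radiusTexels) {
    float lit = 0.0f;
    for (int i = 0; i < kSamples; ++i) {
        // Random jitter inside a square; a real shader would read a
        // precomputed jitter texture instead of calling rand().
        float ju = (std::rand() / (float)RAND_MAX - 0.5f) * 2.0f * radiusTexels;
        float jv = (std::rand() / (float)RAND_MAX - 0.5f) * 2.0f * radiusTexels;
        int x = (int)(u * map.w + ju);
        int y = (int)(v * map.h + jv);
        if (receiverDepth <= map.at(x, y))     // bias assumed applied already
            lit += 1.0f;
    }
    return lit / kSamples;  // 0 = fully shadowed, 1 = fully lit
}
```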
Zeross said: Mordenkainen> are you sure he is not using PCF? I remember him saying that PCF was not enough to solve the aliasing problem by itself, but that combining PCF with multiple jittered samples offered pretty good results: 4 PCF samples would be close to 16 non-filtered samples, and so on. I don't see a good reason for not using PCF, but it has been a long time since I listened to his keynote, so maybe I'm wrong.
John Carmack said: However, at this point right now, the shadow buffer solution is quite a bit slower than the existing stencil shadow solution.

Some of that is due to hardware API issues. Right now I'm using the OpenGL p-buffer and render-to-texture interface, which is a GOD AWFUL interface; it has far too much inheritance from bad design decisions back in the SGI days, and I've had some days where it's the closest I'd ever been to switching over to D3D, because the APIs were just that appallingly bad. Both ATI and Nvidia have their preferred direction for doing efficient render-to-texture, because the problem with the existing APIs is that not only are they crummy, bad APIs, they also have a pretty high performance overhead: they require you to switch OpenGL rendering contexts, and for shadow buffers that's something that has to happen hundreds of times per frame, which is a pretty big performance hit right now. So both ATI and Nvidia have their preferred solutions to this, and as usual they're not agreeing on exactly what should be done, over stupid, petty little things. I've read both the specs, and I could work with either one; they both do the job, and the differences are just silly syntactic things. I have a hard time understanding why they can't just get together and agree on one of them. I am doing my current work on Nvidia-based hardware, so it's likely I will be using their extension.
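To see where that overhead comes from, here is roughly what the p-buffer path forces per shadow buffer under WGL_ARB_pbuffer / WGL_ARB_render_texture (a sketch with all setup omitted; the two context switches per light are the cost being complained about, repeated hundreds of times a frame):

```cpp
#include <windows.h>
#include <GL/gl.h>
#include "wglext.h"  // HPBUFFERARB, WGL_FRONT_LEFT_ARB, entry point typedefs

// Extension entry points assumed resolved at startup via wglGetProcAddress.
extern PFNWGLBINDTEXIMAGEARBPROC    wglBindTexImageARB;
extern PFNWGLRELEASETEXIMAGEARBPROC wglReleaseTexImageARB;

void renderShadowBuffer(HDC pbufferDC, HGLRC pbufferRC,
                        HDC mainDC, HGLRC mainRC,
                        HPBUFFERARB pbuffer, GLuint shadowTex) {
    wglReleaseTexImageARB(pbuffer, WGL_FRONT_LEFT_ARB); // unbind before drawing
    wglMakeCurrent(pbufferDC, pbufferRC);  // context switch #1 (expensive)
    // ... render this light's depth into the pbuffer here ...
    wglMakeCurrent(mainDC, mainRC);        // context switch #2 (expensive)
    glBindTexture(GL_TEXTURE_2D, shadowTex);
    wglBindTexImageARB(pbuffer, WGL_FRONT_LEFT_ARB);    // sample it as a texture
}
```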
Reverend said: Can anyone give me an example of situations where we would need to "evaluate vertex shaders multiple times for multiple passes"?
Well, if you're doing the lighting in more than one pass, much of the vertex shader's work will be the same between passes (if not all of it).