Deferred normals from depth.

Graham

Hello :-)
I've been thinking on this one for quite a while.

Basically, I had this idea floating about - that for a deferred renderer, you could get away without using MRT at all. Simply render depth + diffuse colour.

Now, whether it's actually practical is totally beside the point (although I could see use on the 360 - tiling advantages... less bandwidth hit).
My 'ultimate' idea revolves around not using normal maps at all - go all out with displacement maps, and let the depth dictate the lighting. The 360 rears its head here again. However, one point crops up - I'll get to it in a moment.


Anyway, I was mucking about, and actually got the damned thing to work. Which is always a nice thing. And to an extent it works better than I expected.


So. The whole idea is that you can get the information you need for your light pass - i.e. position + normal - all from the depth buffer.

The first challenge was getting position. Naturally, being me, I didn't look for the answer but worked it out myself. It's a pretty simple quadratic equation problem. But enough of that.

The other bit was getting a normal.
And this is where things get fun, and SM3.0 only :)

So basically I finally have a use for ddx() and ddy() ;-). ddx and ddy are partial derivatives between adjacent pixels in the 2x2 quad, horizontally and vertically, so you can work out how much a value changes compared to the neighbouring pixel. This, as it turns out, is nice and easy for working out the change in surface position along the x and y axes in screen space. Cross those two deltas, and you have a normal :)


Anyway. Here is some shader code:

Code:
float4x4 wvpMatrixINV; // inverse world view projection
float near = 50.0; // near clip
float far = 5000.0; // far clip
float2 window = float2(800,600);
float2 windowHalf = window * 0.5;

...

float writtenDepth = ....; // load depth from the depth buffer

//quadratic to get real depth (aka w)
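// For reference: with a standard D3D projection, the stored value is
//   writtenDepth = far * (z - near) / (z * (far - near))
// and solving that for z (the view-space depth, aka w) gives
//   z = near * far / (far - writtenDepth * (far - near))
// which is what the next line computes, just rearranged.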
float depth = -near / (-1+writtenDepth * ((far-near)/far));

float2 screen = (vpos.xy-windowHalf)/windowHalf; // screen pixel point

float4 worldPoint = float4(screen, writtenDepth, 1) * depth; // get the vertex position; * depth accounts for the /w perspective divide
float4 worldPointInv = mul(worldPoint,wvpMatrixINV); // mul by inverse of the world view projection matrix

//yay
float3 normal = normalize(cross(ddx(worldPointInv.xyz), ddy(worldPointInv.xyz)));

//dance
Output.Colour = float4(normal*0.5+0.5,0);

And here is a not-so-pretty pictar.

[image: depth_normals.png]



There are lots of ways this could be modified of course - store height in the alpha channel of your colour map to jitter things around. Could work.
Of course the thing I didn't think of was that it's all flat shaded (*slap*) - Soo... Er.... MORE TESSELLATION!

I'm not actually too sure if you could realistically see a performance boost using this method. Although it does, in effect, provide a fairly potent form of normal map compression. :p
I suppose in bandwidth restricted environments, there could well be a boost.

Thoughts? Questions? Links to papers already demonstrating this? :)


* ohh, and yes that is a quake3 map in an XNA application. I'm slowly porting an old .net Q3 renderer I upgraded a year or so back... pics: 1, 2, 3 (that's not lightmapped btw :))
 
The problem with using ddx and ddy to determine normals in screen space is that ddx and ddy only work on a 2x2 pixel granularity. E.g. the horizontal rate of change you obtain for pixel n (with n even) is the same as pixel (n+1). This will cause continuity issues when lower-frequency normal maps are used which can result in banding artefacts.
Another issue is that an adjacent pixel may belong to a completely different object so computing a normal from those depths will be mathematically incorrect (and once again can create visual issues on polygon edges due to lighting). This issue is actually similar to the issue pertaining to the use of MSAA with deferred shading.
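One (more expensive) way around both problems is to skip ddx/ddy altogether and fetch the neighbouring depths yourself, then keep the smaller position delta on each axis so you don't straddle an edge. Very rough, untested sketch - names reused from Graham's snippet, position reconstruction as in his post:

Code:
sampler depthMap;
float near;
float far;
float4x4 wvpMatrixINV;

// reconstruct position from the depth buffer at an arbitrary uv,
// same maths as in the original post
float3 positionAt(float2 uv)
{
    float d = tex2D(depthMap, uv).r;                    // stored depth
    float w = -near / (-1 + d * ((far - near) / far));  // view-space depth (aka w)
    float2 clipXY = uv * float2(2, -2) + float2(-1, 1); // uv -> clip-space xy
    return mul(float4(clipXY, d, 1) * w, wvpMatrixINV).xyz;
}

float3 normalFromNeighbours(float2 uv, float2 texelSize)
{
    float3 p = positionAt(uv);

    // position deltas to the horizontal and vertical neighbours
    float3 dxPos = positionAt(uv + float2(texelSize.x, 0)) - p;
    float3 dxNeg = p - positionAt(uv - float2(texelSize.x, 0));
    float3 dyPos = positionAt(uv + float2(0, texelSize.y)) - p;
    float3 dyNeg = p - positionAt(uv - float2(0, texelSize.y));

    // keep the smaller delta on each axis, so we (mostly) avoid crossing an edge
    float3 dx = dot(dxPos, dxPos) < dot(dxNeg, dxNeg) ? dxPos : dxNeg;
    float3 dy = dot(dyPos, dyPos) < dot(dyNeg, dyNeg) ? dyPos : dyNeg;

    return normalize(cross(dx, dy));
}

That's four extra depth fetches per pixel though, so whether it's a win is another question.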
 
Yup.
I know :)

But I chose not to mention that, because I have a solution - it's more complex and I haven't implemented it yet :p It would end up not using ddx/ddy on the edges, branching off into more complex code instead. It also requires a good edge outline, which I've only managed to produce by rendering the scene twice. Which all kinda defeats the initial point :p
but it's still interesting.

It all sort of stems from the thing I did ages ago with the shadowed bunnies (two more pics); however, it extends into doing antialiasing as a post-process effect, similar to this post from a while back, but fully on the GPU. From what I can tell it'd be a DX10 or 360 thing. Not doable on SM3.


This was one of the steps along the path to figuring out if my overall grand idea would actually work.
So while in its current form it's utterly useless, it does prove a point.

Sorry if I didn't make that clearer in the original post :)
I guess I got a bit wrapped up in it all :p But I still think it's interesting :)


My main goal with doing this was to test if this part of the algorithm was feasible, and it seems it is. I was very worried that precision loss through all the calculations, etc, would screw it up.
 
The position-from-depth encoding is fairly standard for deferred shading - IIRC they discuss it a bit in the GPU Gems 2 chapter on deferred rendering.

We've used something similar in a different context to get face normals and it works pretty well. I'm a little unsure of how this helps deferred shading much, though. Just to avoid storing the normal? As you note it'll be flat shaded for one, and are the 2/3 floats really that bad?

You seem to imply that this is to avoid MRT - why would you want to avoid that, especially when MRT is 100% efficient on the G80 (and hopefully R600). I also don't see how you can avoid storing the rest of the BRDF inputs (data-driven stuff like specular colours/powers, emissive, etc).

I am intrigued by the idea, but is losing normal interpolation really worth it to save 2/3 float components in the G-buffer, or is there some other motivation?
 
So, could you use the alpha channel to store a material ID?

With a material ID you can then look up the correct combination of textures/maps for each pixel.

Depth + normal + material ID with MSAA samples available when reading from the depth buffer (so, SM4/XB360 only) to produce well-behaved edges and polygon intersections?
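
Hypothetically, the packing could be as dumb as this (made-up names):

Code:
// rgb = diffuse albedo, a = material ID (up to 256 materials in an 8-bit alpha)
float4 packColourAndMaterial(float3 albedo, int materialId) : COLOR0
{
    return float4(albedo, materialId / 255.0);
}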

Jawed
 
Very nice Graham - that certainly is pretty damn cool! :) To be honest, I fail to see how the lack of interpolation is even a problem; you just need to have your height and/or normal maps created appropriately. Interpolation is a good hack if you don't have normal maps, though, which you obviously don't have in a Quake3 renderer!

AndyTX: MRT might be 100% efficient on the G80, but it's still going to scale by the datapaths. Obviously, 3x64bpp is going to take longer than 1x32bpp. What this means is that if you begin outputting a lot via MRTs, it's going to begin making sense to render things front-to-back, or even doing an initial Z-Pass. The first kills batching, the second wastes geometry processing and rendering time. Take your pick. And front-to-back is never perfect, of course. So it's not a major optimization, but certainly not useless either.

Jawed: That's worthless if you don't also have the texture coordinates. I have been researching this approach (passively, but thinking about it every couple of days) for the last few weeks. It has its fair share of theoretical problems, which are pretty much the same ones as presented by SuperCow for Graham's algorithm. Nothing that can't be fixed, though - for a cost.

In practice, this is a very inelegant but potentially quite efficient way to emulate what a much smarter rasterizer would do: allow triangles sharing an edge to be rasterized in the same quad pipeline, while simultaneously (nearly) supersampling edges.



Uttar
 
To be honest, I fail to see how the lack of interpolation is even a problem; you just need to have your height and/or normal maps created appropriately. Interpolation is a good hack if you don't have normal maps, though, which you obviously don't have in a Quake3 renderer!
But you don't have texture coordinates either, so what good is a normal map? Besides, even with normal maps you want interpolation for the tangent space basis (excepting world space normal maps, but then they need to be unique, like light maps). Theoretically texture arrays can index enough data to allow for one gigantic atlas of all textures in the scene, but I can't see that being any more efficient than the alternative.

MRT might be 100% efficient on the G80, but it's still going to scale by the datapaths.
Granted, but in my experience reading/writing the G-buffer is still the cheap part of deferred shading. In the long term it won't be a problem at all since it's predictable streaming memory access.

Obviously, 3x64bpp is going to take longer than 1x32bpp.
I'm still missing how you avoid writing the rest of the G-buffer data... maybe I'm just being stupid. Someone care to explain? If you're not writing the BRDF inputs you're missing one of the key benefits of deferred shading: the ability to use the rasterizer to solve the light contribution problem per-pixel.
 
But you don't have texture coordinates either, so what good is a normal map?
Indeed, in a traditional deferred renderer you don't have the texture coordinates. There is nothing preventing you from writing the texture coordinates and a material (-> texture!) index, though, at least for a DX10-level-only renderer. If you don't have texture arrays, this has to become a switch statement, which isn't too efficient in hardware for obvious reasons!
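
With texture arrays the per-pixel lookup in the deferred pass could then be as simple as this (D3D10 HLSL, sketch only, names invented):

Code:
Texture2DArray diffuseMaps;
SamplerState   linearSampler;

// the material index stored in the G-buffer selects the array slice - no switch needed
float3 fetchAlbedo(float2 uv, float materialIndex)
{
    return diffuseMaps.Sample(linearSampler, float3(uv, materialIndex)).rgb;
}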

But then you need a way to detect discontinuities (and Graham's approach for normals might complicate that), and while I think I've got one that makes sense on paper, I obviously haven't implemented it yet. I am also a bit uncertain about its performance characteristics... assuming it even works at all, of course.


Uttar
 
Hello.

:)

I probably should have been more obvious in my original post I guess. This was simply an idea I had, and I wanted to see if it actually worked, and more importantly find out if precision problems made it useless.

The overall benefit I see is in the displacement-mapped case: there, storing displacement in the alpha of an existing texture is a lot simpler than adding a normal map with alpha. Then again, on such hardware it's probably possible to compute the normals in the vertex/geometry shader anyway, so it may well all be totally redundant. :)

At the end of the day, it was an interesting experiment, and had some interesting maths, so I thought I'd share it :)

I'll probably end up not even using it.
 
Indeed, in a traditional deferred renderer you don't have the texture coordinates. There is nothing preventing you from writing the texture coordinates and a material (-> texture!) index, though, at least for a DX10-level-only renderer.
Sure, but your "material index" is just a (potentially compressed) representation of all of the other BRDF inputs that you're neglecting to render, so I don't think much is gained. In particular, STALKER's approach of using a material index into a 3D texture of different BRDF slices seems to be a good trade-off, but still requires the other inputs to be saved. I don't think there's any getting around the fact that these inputs must be represented in some form or another - compressed or otherwise. Maybe packing more data into the material index could be handy for some of the inputs though.
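
In shader terms that material evaluation ends up being little more than a lookup along these lines (rough sketch, names invented - say N.L and N.H on the other two axes):

Code:
sampler3D materialLUT; // BRDF slices, one per material index

float2 evaluateMaterial(float NdotL, float NdotH, float materialIndex)
{
    // the material index selects the slice; returns e.g. diffuse and specular terms
    return tex3D(materialLUT, float3(NdotL, NdotH, materialIndex)).xy;
}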

At the end of the day, it was an interesting experiment, and had some interesting maths, so I thought I'd share it :)
Certainly, and I'm glad you did! I was just trying to understand the advantages and disadvantages - sorry if I implied that it was uninteresting (quite the opposite)!
 
Hmm.

I just had another idea..

I'm wondering if the use of ddx/ddy on a texture coordinate would give enough information to eliminate the need for tangent/binormal vertex data... :) :)

Be more accurate too, but per pixel transform, not per vertex.
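
Something like this is what I'm picturing - completely untested sketch, building the tangent frame per-pixel from the screen-space derivatives of position and uv:

Code:
// untested sketch: per-pixel tangent frame from screen-space derivatives.
// N = surface normal, p = world (or view) space position, uv = texture coordinate
float3x3 tangentFrameFromDerivatives(float3 N, float3 p, float2 uv)
{
    // edge vectors of the pixel's triangle, in position and uv space
    float3 dp1  = ddx(p);
    float3 dp2  = ddy(p);
    float2 duv1 = ddx(uv);
    float2 duv2 = ddy(uv);

    // solve for the tangent and binormal
    float3 dp2perp = cross(dp2, N);
    float3 dp1perp = cross(N, dp1);
    float3 T = dp2perp * duv1.x + dp1perp * duv2.x;
    float3 B = dp2perp * duv1.y + dp1perp * duv2.y;

    // scale-invariant frame
    float invmax = rsqrt(max(dot(T, T), dot(B, B)));
    return float3x3(T * invmax, B * invmax, N);
}

No idea yet how it behaves numerically, or at the quad boundaries SuperCow mentioned.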
 
Hmm.

I just had another idea..

I'm wondering if the use of ddx/ddy on a texture coordinate would give enough information to eliminate the need for tangent/binormal vertex data... :) :)

Be more accurate too, but per pixel transform, not per vertex.
You might want to read chapter 2.6 of ShaderX5: Normal Mapping without Precomputed Tangents, by Christian Schüler. ;)
 
You might want to read chapter 2.6 of ShaderX5: Normal Mapping without Precomputed Tangents, by Christian Schüler. ;)
My copy of ShaderX5 is in the mail, so I guess I know what article I'll begin with now, thanks! ;)


Uttar
 
Ok.

Ohh and btw, the next algorithm I'll post will be even awesomer.*


*provided I get it to work correctly.
 
You might want to read chapter 2.6 of ShaderX5: Normal Mapping without Precomputed Tangents, by Christian Schüler. ;)

Could you perhaps summarize his algorithm (I don't have the book)? Is my guess right that he uses derivatives of the texture coordinates and position to compute the tangent and binormal per-pixel?
 
Another deferred shading backer :D
Though GPUs have enough power to process pixel-sized polygons, it still might not be a win compared to bigger triangles + normal maps. So I prefer packing normals.
The XBOX 1 RT DR paper could be done on ATI cards with a D3D hack, but not as easy on NV (What a shame MS does not provide DST support); or the normal could be extracted by modifying the output depth, but that might be a very bad idea.
 
Another deferred shading backer :D
Though GPUs have enough power to process pixel-sized polygons

I have a fantasy of using dynamic displacement mapping to effectively do ~pixel-sized triangles in a single pass. Then do all your fancy stuff in deferred passes. One day... one day...


not as easy on NV (What a shame MS does not provide DST support)

[edit] Binary search!

:p :p :devilish:
 