Multisample Depth Buffer Resolve

Discussion in 'Tools and Software' started by sergi.gonzalez, Jul 31, 2007.

  1. sergi.gonzalez

    Newcomer

    Joined:
    May 11, 2007
    Messages:
    35
    Likes Received:
    0
    Location:
    Barcelona
    Hi all,

    I need to use the depth buffer as a texture in order to apply several post-processing effects (depth of field, atmospheric scattering, etc.).

    In order to perform FSAA, I render the scene to multisample color and depth surfaces. Later, I call StretchRect to resolve the multisample color fragments into a color texture.
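
    The resolve step looks roughly like this (a sketch with my variable names):

    Code:
    // Sketch: resolve the multisample color render target into a plain
    // texture. pMSColorRT is a surface from CreateRenderTarget with
    // D3DMULTISAMPLE_4_SAMPLES; pColorTex is an ordinary render-target
    // texture of the same size.
    IDirect3DSurface9* pResolveSurf = NULL;
    pColorTex->GetSurfaceLevel(0, &pResolveSurf);
    pDevice->StretchRect(pMSColorRT, NULL, pResolveSurf, NULL, D3DTEXF_NONE);
    pResolveSurf->Release();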

    My questions:

    1. Is it possible to resolve a multisample depth buffer into a color/depth texture?

    The DirectX 9 documentation states it is not possible (StretchRect has additional restrictions for depth-stencil surfaces).

    2. Is rendering the scene with a z-only shader into a multisample color texture the only solution?

    3. Is there any alternative? (Rendering depth to the color buffer's alpha channel is not useful for me.)

    Thanks in advance,

    Sergi
     
  2. ChenA

    Newcomer

    Joined:
    Aug 1, 2007
    Messages:
    5
    Likes Received:
    0
    You can use a depth texture.
    Create a texture with D3DUSAGE_DEPTHSTENCIL; you can set it as the depth-stencil surface and render z to it, then you can sample it like a normal texture.
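
    For example (a sketch; the variable names are made up, and which depth formats are texturable varies by vendor and driver):

    Code:
    // Sketch: create a depth-stencil texture, render z into it, then
    // sample it. D3DFMT_D24S8 works as a texture on NVIDIA; on ATI you
    // need the DF16/DF24 FOURCC formats instead.
    IDirect3DTexture9* pDepthTex = NULL;
    pDevice->CreateTexture(width, height, 1,
                           D3DUSAGE_DEPTHSTENCIL, D3DFMT_D24S8,
                           D3DPOOL_DEFAULT, &pDepthTex, NULL);

    // Bind level 0 as the depth-stencil surface for the scene pass...
    IDirect3DSurface9* pDepthSurf = NULL;
    pDepthTex->GetSurfaceLevel(0, &pDepthSurf);
    pDevice->SetDepthStencilSurface(pDepthSurf);

    // ...render the scene...

    // ...then sample it like any other texture afterwards.
    pDevice->SetTexture(0, pDepthTex);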

    But sampling from NVIDIA depth textures will return percentage-closer-filtered (PCF) results, as the comparison with an incoming Z value is automatically performed when sampling from depth textures. So you can only use depth textures on ATI cards.
     
  3. Humus

    Humus Crazy coder
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,217
    Likes Received:
    77
    Location:
    Stockholm, Sweden
    1. Not in DX9. You need DX10.1 for that.

    2. That's the only workable DX9 approach.

    3. Not with multisampling.
     
  4. sergi.gonzalez

    Newcomer

    Joined:
    May 11, 2007
    Messages:
    35
    Likes Received:
    0
    Location:
    Barcelona
    Thanks for the replies.

    ChenA: Unfortunately, multisampling is a requirement for my application, so depth textures can't be used.

    Humus: I'll have to render the full scene twice, as I supposed.

    BTW, I think NVIDIA ForceWare drivers newer than release 95.xx will have an extension to disable the automatic PCF comparison on depth-stencil textures.

    The PCF comparison can be enabled/disabled using custom FOURCC formats (RAWZ on NV4x and G7x, INTZ on G8x) at resource creation time (the D3DFORMAT parameter of CreateTexture).
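
    Creating them is just a matter of passing the FOURCC as the format (a sketch; I haven't verified it against every driver):

    Code:
    // Sketch: NVIDIA's FOURCC depth formats, passed as the D3DFORMAT
    // argument of CreateTexture. RAWZ is for NV4x/G7x, INTZ for G8x;
    // neither is an official D3D9 format, so check support first.
    const D3DFORMAT FOURCC_RAWZ = (D3DFORMAT)MAKEFOURCC('R','A','W','Z');
    const D3DFORMAT FOURCC_INTZ = (D3DFORMAT)MAKEFOURCC('I','N','T','Z');

    IDirect3DTexture9* pDepthTex = NULL;
    pDevice->CreateTexture(width, height, 1,
                           D3DUSAGE_DEPTHSTENCIL, FOURCC_INTZ,
                           D3DPOOL_DEFAULT, &pDepthTex, NULL);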

    However, I have not checked whether this feature works in the latest release drivers (168.xx).

    -
    Sergi
     
  5. OpenGL guy

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,357
    Likes Received:
    28
    For ATI boards, you can use DF16 and DF24 for depth textures. Boards that expose the DF24 FOURCC format also support FETCH4, so you can optimize your fetches in some cases. R600-based products expose depth textures with PCF via the standard depth formats (D16, D24X8, etc.).
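
    Support can be probed like any other format (a sketch; pD3D is the IDirect3D9 interface, and driver exposure varies):

    Code:
    // Sketch: check whether the DF24 depth-texture format is exposed.
    // Per the note above, DF24 exposure also implies FETCH4 support.
    const D3DFORMAT DF24 = (D3DFORMAT)MAKEFOURCC('D','F','2','4');

    HRESULT hr = pD3D->CheckDeviceFormat(D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL,
                                         D3DFMT_X8R8G8B8,   // adapter format
                                         D3DUSAGE_DEPTHSTENCIL,
                                         D3DRTYPE_TEXTURE,
                                         DF24);
    bool df24Supported = SUCCEEDED(hr);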
     
  6. KindDragon

    Newcomer

    Joined:
    May 17, 2007
    Messages:
    7
    Likes Received:
    0
    Where can I find documentation about RAWZ and INTZ?
     
  7. Humus

    Humus Crazy coder
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,217
    Likes Received:
    77
    Location:
    Stockholm, Sweden
    Google turns up nothing, really, but I just tried using the RAWZ format, and from my findings RAWZ is a 24-bit depth format with 8-bit stencil. It seems to sample the raw 32 bits as BGRA8. So the stencil is in .b, and the depth is in .gra, with the most significant bits in .a.
     
  8. sergi.gonzalez

    Newcomer

    Joined:
    May 11, 2007
    Messages:
    35
    Likes Received:
    0
    Location:
    Barcelona
    Hi,

    I don't understand why NVIDIA still hasn't documented these features (I suppose they are busy with the DX10 drivers). It was an NVIDIA programmer who told me all of this.

    With the INTZ format, the lookups return the depth value directly, but the RAWZ value is compressed. In order to decompress it, you must do the following in the pixel shader:

    Code:
    // Rebuild the 24-bit depth from .a (high byte), .r, .g (low byte):
    // the weights are (2^16, 2^8, 1) * 255 / (2^24 - 1).
    float z = dot(tex2D(RawZSampler, tcoord).arg,
                  float3(0.996093809371817670572857294849,
                         0.0038909914428586627756752238080039,
                         1.5199185323666651467481343000015e-5));
    The formats can be created with the new drivers, but nowadays I am using normalized linear depth for other reasons (see http://forum.beyond3d.com/showthread.php?t=45628)

    I must benchmark my application in order to know which option is faster:

    1. Render the scene without a z-only pre-pass; post-process with the depth texture.
    2. Render the scene with a z-only pre-pass; post-process with the depth texture.
    3. Render the scene with a z-only pass into an R32F texture (normalized linear depth); post-process with that texture.

    The first option should be the fastest for most applications.
    The second option is the one recommended by all HW vendors (when your application is pixel-fillrate bound).
    The third option is the only one that works with multisampling in DX9, and it is the one used by Crysis (setup sketched below).
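
    For the third option, the setup would be roughly this (a sketch; the names are made up, and whether a multisampled R32F target is supported at all depends on the hardware):

    Code:
    // Sketch: multisampled R32F target for the z-only pass. The pixel
    // shader writes normalized linear depth (viewPos.z / zFar) to red.
    IDirect3DSurface9* pMSLinearZ = NULL;
    pDevice->CreateRenderTarget(width, height, D3DFMT_R32F,
                                D3DMULTISAMPLE_4_SAMPLES, 0, FALSE,
                                &pMSLinearZ, NULL);

    // ...z-only pass: set pMSLinearZ as the render target, draw the scene...

    // Resolve so the post-processing passes can sample the depth.
    IDirect3DSurface9* pLinearZSurf = NULL;
    pLinearZTex->GetSurfaceLevel(0, &pLinearZSurf); // plain R32F RT texture
    pDevice->StretchRect(pMSLinearZ, NULL, pLinearZSurf, NULL, D3DTEXF_NONE);
    pLinearZSurf->Release();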

    Your thoughts?

    Cheers,

    Sergi
     
  9. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,325
    Likes Received:
    93
    Location:
    San Francisco
    I wouldn't call it compressed; it's just a 24-bit value split over 3 color channels.
    Also, reconstructing it with just a dot product on G7x won't give you the exact original depth value.
     
  10. KindDragon

    Newcomer

    Joined:
    May 17, 2007
    Messages:
    7
    Likes Received:
    0
  11. ConorDickinson

    Newcomer

    Joined:
    Feb 2, 2008
    Messages:
    1
    Likes Received:
    0
    As far as I can tell it actually will give you the exact value, but the shader compiler will often "optimize" your code by storing the result of that dot product in an 8-bit register instead of a floating-point register. This, of course, results in some very imprecise depth values. To solve this problem, I add 0.00000001f to the result of the dot product and use that as my depth value. This trick works around some other shader compiler bugs as well.


    Conor Dickinson
    Lead World Graphics Programmer
    Cryptic Studios
     
  12. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,325
    Likes Received:
    93
    Location:
    San Francisco
    G70 doesn't have 8-bit registers.
    The sampled value will not be 'exact' because of some conversion going on in the texture units (even when only point filtering is used).
     
  13. Graham

    Graham Hello :-)
    Moderator Veteran Subscriber

    Joined:
    Sep 10, 2005
    Messages:
    1,479
    Likes Received:
    209
    Location:
    Bend, Oregon
    My experience is that no. 3 is the best option. The per-frame overhead is fairly constant, and it saves you a lot of time per deferred pass. Getting the camera-centred world position is a simple vector multiply, whereas it's a bit more involved otherwise.
    This was most significant in an app I made a while ago (for fun). Here is a buggy screenshot; that's *at least* ~20 lights/pixel, and my X1800 XL was just keeping 60 fps. Not using linear depth would bloat the lighting shader a lot, and being math-limited it'd pretty much halve its speed (of course, this is a really extreme example).
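
    The reconstruction is just this (a sketch in plain C-style math with made-up names; in the shader it's a single multiply against the interpolated view ray):

    Code:
    // Sketch: camera-relative position from normalized linear depth.
    // 'ray' is the per-pixel view ray through the far plane, interpolated
    // from the frustum corners in the vertex shader.
    struct float3 { float x, y, z; };

    float3 viewPosition(float3 ray, float linearDepth /* viewZ / zFar */)
    {
        return { ray.x * linearDepth, ray.y * linearDepth, ray.z * linearDepth };
    }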

    But of course, benchmark it yourself :) Don't take my word for it. :yes:
     
  14. eigers

    Newcomer

    Joined:
    Mar 24, 2008
    Messages:
    21
    Likes Received:
    0
    Hi all,
    Does the INTZ format contain stencil bits at all, or is it just depth?
    RAWZ seems not to be supported on my 8800 GTS. Is there some other secret format equivalent to RAWZ on G8x?

    Thanks.
    Shai.
     
  15. eigers

    Newcomer

    Joined:
    Mar 24, 2008
    Messages:
    21
    Likes Received:
    0
    OK, got the answer to that one from the NVIDIA guys. So, if anyone's interested: INTZ is equivalent to RAWZ (meaning 24 bits of depth, 8 bits of stencil), only with more sensible access to the depth value, as mentioned above.
     
  16. assen

    Veteran

    Joined:
    May 21, 2003
    Messages:
    1,377
    Likes Received:
    19
    Location:
    Skirts of Vitosha
    Is there some kind of penalty when using these formats? I read somewhere that they don't work with MSAA depth buffers; this would be very serious. I also heard through the "developer grapevine" that they turn off HiZ (or whatever NVIDIA calls their equivalent). Can somebody confirm this?
     
  17. Humus

    Humus Crazy coder
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    3,217
    Likes Received:
    77
    Location:
    Stockholm, Sweden
    Don't know about the Nvidia formats, but on ATI cards the DF16/DF24 formats disable Z-compression. However, HiZ remains functional.
     
  18. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,325
    Likes Received:
    93
    Location:
    San Francisco
    float z generally improves HiZ efficiency
     
  19. KindDragon

    Newcomer

    Joined:
    May 17, 2007
    Messages:
    7
    Likes Received:
    0
    NVIDIA's INTZ doesn't work on Windows XP.
     
  20. eigers

    Newcomer

    Joined:
    Mar 24, 2008
    Messages:
    21
    Likes Received:
    0
    Not quite true, KindDragon.
    I'm using INTZ on XP without any problem (8800 GTS/GTX/GT, 8500...). I'm also using RAWZ on XP (7900).
     