msaa & csaa

zhugel_007

Hi all,

I am not quite clear on how MSAA works.

Here is the definition from Wikipedia:
"multisampling" refers to a specific optimization of supersampling. The specification dictates that the renderer evaluate one color, stencil, etc. value per pixel, and only "truly" supersample the depth value.

If only depth is "supersampled", how is the color value calculated?

Then I read Nvidia's CSAA doc:
MSAA reduces the shader overhead of this operation by decoupling shaded
samples from stored color and coverage; this allows applications using
antialiasing to operate with fewer shaded samples while maintaining the same
quality color/z/stencil and coverage sampling. CSAA further optimizes this
process by decoupling coverage from color/z/stencil, thus reducing bandwidth
and storage costs.

Now I am completely confused. Is depth "supersampled" or not? How does CSAA work exactly?

Thanks!
 
With multisampling, a single color is shaded per pixel. Depth, on the other hand, is interpolated, and a separate depth value is provided for each sample. Then the depth and stencil tests are performed per sample, and the color is written to every sample that passes them.
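
To make that concrete, here is a minimal sketch of how one primitive updates one MSAA pixel. The class and function names (MsaaPixel, rasterize, resolve) are made up for illustration, the stencil test is left out, and the resolve is a plain box filter, which is an assumption rather than what any particular GPU does.

```python
class MsaaPixel:
    """One pixel of an N-sample MSAA target (illustrative layout, not a real API)."""

    def __init__(self, num_samples=4):
        self.colors = [(0.0, 0.0, 0.0)] * num_samples  # cleared to black
        self.depths = [1.0] * num_samples               # cleared to the far plane


def rasterize(pixel, coverage, sample_depths, shade):
    """Apply one primitive to one pixel.

    coverage      -- list of bools, True where the primitive covers the sample
    sample_depths -- depth interpolated separately at each sample position
    shade         -- callable invoked once for the whole pixel
    """
    if not any(coverage):
        return  # the primitive misses every sample, so nothing is shaded
    color = shade()  # single shader invocation per pixel
    for i, covered in enumerate(coverage):
        # The depth test runs per sample (less-than here, stencil omitted).
        if covered and sample_depths[i] < pixel.depths[i]:
            pixel.depths[i] = sample_depths[i]
            pixel.colors[i] = color  # same color replicated to each passing sample


def resolve(pixel):
    """Box-filter resolve: average the stored sample colors."""
    n = len(pixel.colors)
    return tuple(sum(c[k] for c in pixel.colors) / n for k in range(3))


shader_runs = []

def red():
    shader_runs.append(1)  # count how often the "shader" runs
    return (1.0, 0.0, 0.0)

px = MsaaPixel(4)
rasterize(px, [True, True, False, False], [0.5, 0.4, 0.6, 0.7], red)
print(len(shader_runs), resolve(px))
```

The shader runs once even though two samples are written, and each of those samples keeps its own depth value.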

Given that the vast majority of pixels aren't intersected by more than one polygon edge, it's kind of wasteful to store, for instance, 8 samples when there will rarely be more than two distinct colors in those samples. So instead of writing the color to all the covered samples, CSAA stores only one instance of the color, plus an index for each covered sample to tell which color belongs to that sample. This way you can have, for instance, 8 coverage samples but only 4 slots for the colors. If you have 4 or fewer colors it works fine, but if the pixel is intersected by a lot of polygon edges, some color information will be discarded and you may get artifacts.
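
A rough sketch of that storage scheme, with 8 coverage samples sharing 4 color slots. Everything here is an assumption made for illustration: the CsaaPixel class, the mask helper, ignoring depth, and especially the choice to simply drop a color when no slot is free. It is not a description of what Nvidia hardware does.

```python
class CsaaPixel:
    """Per-sample indices into a small shared color table (hypothetical layout)."""

    def __init__(self, coverage_samples=8, color_slots=4):
        self.slots = [None] * color_slots        # the few stored colors
        self.index = [None] * coverage_samples   # per sample: slot number or None

    def write(self, coverage, color):
        """Store color for the covered samples. Returns False if it was dropped."""
        covered = [i for i, c in enumerate(coverage) if c]
        if not covered:
            return True
        # Slots still referenced by samples this primitive does not overwrite.
        in_use = {self.index[i] for i in range(len(self.index))
                  if i not in covered and self.index[i] is not None}
        free = [s for s in range(len(self.slots)) if s not in in_use]
        if not free:
            return False  # color storage exhausted: information is lost (assumed policy)
        slot = free[0]
        self.slots[slot] = color
        for i in covered:
            self.index[i] = slot
        return True


def mask(*samples, total=8):
    """Build a coverage list with the given sample numbers set."""
    return [i in samples for i in range(total)]


px = CsaaPixel()
results = [
    px.write(mask(0, 1), "A"),
    px.write(mask(2, 3), "B"),
    px.write(mask(4, 5), "C"),
    px.write(mask(6, 7), "D"),
]
fifth_partial = px.write(mask(0), "E")    # all four slots are still referenced
fifth_full = px.write(mask(0, 1), "E")    # fully replaces "A", so its slot is reused
print(results, fifth_partial, fifth_full, px.slots)
```

Four colors fit. A fifth only fits if it completely covers the samples of an existing one, which is where the scheme turns lossy.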
 
Thanks a lot, Humus, for answering my question! (Actually, I have been a big fan of your demos for a long time. :) )

So for MSAA, even though the color is shaded once per pixel, we still need multiple samples for each pixel. For example, for 4x MSAA at a resolution of 1024x768, we still need 2048x1536 samples, right? And why is it not compatible with deferred rendering? If it is because the sample is interpolated based on the depth value, can we simply disable depth writes for all the passes that deferred rendering needs?

For CSAA, in your example, what happens if there are more than 4 colors? Which 4 colors will be stored? The first 4? How are the other colors dealt with (use the closest color instead)? How is the number of stored colors decided? Is it a fixed number (like 4), or can it be changed? I remember reading something about sampling patterns for MSAA that improve the anti-aliasing quality. Does CSAA have different patterns, too?
 
For MSAA you have n samples (fragments) per pixel. If a primitive covers at least one of them (yes, a primitive might miss all the sample locations if it's too thin), its color is evaluated at the center of the pixel (or, with centroid sampling, at the centroid of the part of the primitive inside the pixel) and is stored in the memory location of each covered fragment.

So 4x MSAA means 4 times as many fragments as pixels, you're correct.
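
The arithmetic behind that, as a quick check. The 8 bytes per sample (RGBA8 color plus a 32-bit depth/stencil value) is an assumed format, and the shading counts assume a single primitive covering the whole screen.

```python
width, height, samples_per_pixel = 1024, 768, 4

msaa_samples = width * height * samples_per_pixel   # stored samples with 4x MSAA
ssaa_samples = (2 * width) * (2 * height)           # samples in a 2048x1536 target

bytes_per_sample = 4 + 4                            # assumed: RGBA8 + D24S8
msaa_mib = msaa_samples * bytes_per_sample / 2**20  # storage in MiB

# Shading cost for one full-screen primitive:
msaa_shader_runs = width * height                   # once per pixel
ssaa_shader_runs = ssaa_samples                     # once per sample

print(msaa_samples, ssaa_samples, msaa_mib, msaa_shader_runs, ssaa_shader_runs)
```

So the storage matches a 2048x1536 buffer, but the shading work stays at the 1024x768 level. The sample positions inside each pixel are generally not laid out on a regular 2x2 grid, though.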

In Deferred Shading, you first render to a G(eometry)-Buffer, depth, surface attributes (albedo, normal...), then you read that G-Buffer to shade the fragments.
Unfortunately you can't read multisampled buffers on many GPUs, so you have to resolve your buffer into a single-sampled one first, which means your data gets processed (weighted average) and is basically junk. (OK, it's fine for any pixel whose fragments all belong to the same primitive.)
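
A small numeric illustration of why a resolved G-buffer is junk on edge pixels. The values are invented: a 4-sample pixel where two samples belong to a near surface facing the camera and two to a far surface facing sideways, resolved with an unweighted average (an assumption about the filter).

```python
import math


def resolve_average(samples):
    """Average a list of equal-length tuples component by component."""
    n = len(samples)
    return tuple(sum(s[k] for s in samples) / n for k in range(len(samples[0])))


# Hypothetical edge pixel: samples 0-1 hit surface A, samples 2-3 hit surface B.
normals = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
depths = [(0.2,), (0.2,), (0.9,), (0.9,)]

resolved_normal = resolve_average(normals)
resolved_depth = resolve_average(depths)[0]
normal_length = math.sqrt(sum(c * c for c in resolved_normal))

print(resolved_normal, normal_length, resolved_depth)
```

The resolved normal is no longer unit length and points in a direction neither surface has, and the resolved depth lies somewhere in the empty space between the two surfaces. Lighting computed from those values is meaningless, whereas averaging final colors would have been fine.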

For CSAA, I think the simplest thing is still to read:
http://developer.nvidia.com/object/coverage-sampled-aa.html
(I realize it doesn't answer all your questions, I don't remember having read how it works exactly, I could speculate, but there's likely someone who knows on this board...)
 
Thanks Roderic! :)

For MSAA:
"Unfortunately you can't read multisampled buffers on many GPUs, so you have to resolve your buffer into a single-sampled one first,"
- So for cards that can read multisampled buffers (like the GeForce 8 and 9 series with Shader Model 4), is it possible to use MSAA with deferred shading?

For CSAA:
Thanks for the link. :) I've read the Nvidia document you mentioned, but it is more about "how to use it & what it looks like" than "how it works". :(
 
For MSAA:
"Unfortunately you can't read multisampled buffers on many GPUs, so you have to resolve your buffer into a single-sampled one first,"
- So for cards that can read multisampled buffers (like the GeForce 8 and 9 series with Shader Model 4), is it possible to use MSAA with deferred shading?
No, you can't access multi sampled buffers from basic DirectX 10 shaders. You'll need DirectX 10.1 and hardware that supports DX 10.1. While GeForce 8+ hardware supports this feature, it can't do some other DX 10.1 stuff, so you're left without it.
However, with DX 10.1 and DX 10.1 hardware you can use both deferred shading and MSAA.

P.S.:
For CSAA:
Thanks for the link. :) I've read the Nvidia document you mentioned, but it is more about "how to use it & what it looks like" than "how it works". :(
CSAA stores additional information per sample (a coverage bit mask). So you don't just know that a sample is covered; you can tell, for example, that only half of the sample was actually covered by the triangle.
 
Thanks a lot, Humus, for answering my question! (Actually, I have been a big fan of your demos for a long time. :) )

Hope you didn't miss my last one then, which shows MSAA and deferred shading working together. :)
http://www.humus.name/index.php?page=3D&ID=81

For CSAA, in your example, what happens if there are more than 4 colors? Which 4 colors will be stored? The first 4? How are the other colors dealt with (use the closest color instead)? How is the number of stored colors decided?

I'd say that's "implementation dependent". I don't know what Nvidia does. But I would guess they either just use the first 4 or, if they do something more advanced, prioritize by coverage. So if a new color comes in that covers, for instance, 3 sample locations, and an already stored color covers only a single location, that one is replaced. Whichever way you do it, you can get some kind of artifact. CSAA relies on the statistically very low number of pixels that run into that kind of conflict.
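
One way to read the "prioritize by coverage" guess, as a sketch. This is purely an illustration of that guess, not known hardware behaviour, and the function name pick_slot and its inputs are hypothetical.

```python
def pick_slot(slot_coverage, new_coverage):
    """Choose which color slot a new color should take.

    slot_coverage -- number of samples currently referencing each slot
    new_coverage  -- number of samples the incoming color covers
    Returns a slot index, or None if the incoming color should be dropped.
    """
    # An unreferenced slot is always the first choice.
    for slot, count in enumerate(slot_coverage):
        if count == 0:
            return slot
    # Otherwise evict the stored color with the least coverage,
    # but only if the incoming color covers more samples than it does.
    weakest = min(range(len(slot_coverage)), key=lambda s: slot_coverage[s])
    if slot_coverage[weakest] < new_coverage:
        return weakest
    return None


print(pick_slot([2, 1, 3, 2], 3))  # the slot covering a single sample is replaced
print(pick_slot([2, 2, 2, 2], 1))  # the new color loses and is dropped
print(pick_slot([2, 0, 3, 3], 1))  # a free slot is used, nothing is lost
```

Either branch that discards a color leaves some samples with wrong or missing color information, which is the artifact case described above.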

Is it a fixed number (like 4), or can it be changed?

Fixed for a certain CSAA mode. But there are different modes with different sample counts.

I remember reading something about sampling patterns for MSAA that improve the anti-aliasing quality. Does CSAA have different patterns, too?

Yeah. CSAA isn't much different from MSAA in that way. You can see CSAA as just a compression algorithm applied to MSAA, which turns lossy once you exhaust the color storage.
 
No, you can't access multi sampled buffers from basic DirectX 10 shaders. You'll need DirectX 10.1 and hardware that supports DX 10.1. While GeForce 8+ hardware supports this feature, it can't do some other DX 10.1 stuff, so you're left without it.
However, with DX 10.1 and DX 10.1 hardware you can use both deferred shading and MSAA.

Which means only the ATi HD3 and HD4 series as of this date.

Actually, this is one of the DX10.1 features that GeForce cards support and that can be exposed via NVAPI; Ubisoft used this in Far Cry 2. Of course it requires extra work compared to using DX10.1 hardware, but it's still usable.
 
No, you can't access multi sampled buffers from basic DirectX 10 shaders.
Yes you can. Access to multi-sampled depth buffers requires DX10.1, but you can write depth to a multi-sampled texture in DX10.
 
Yes you can. Access to multi-sampled depth buffers requires DX10.1, but you can write depth to a multi-sampled texture in DX10.

Indeed.
You can use the Load instruction in a PS4.0 shader, to read from a multisampled texture.
 
Indeed.
You can use the Load instruction in a PS4.0 shader, to read from a multisampled texture.
But you can't get D3D10 to produce a multisampled depth texture - D3D10 enforces MSAA-resolve before you can access the resource as a texture. So the multiple samples are lost.

Jawed
 
But you can write depth to a non-depth texture, which means deferred shading with MSAA is entirely possible with DX10, just not as efficient.
 
But you can write depth to a non-depth texture, which means deferred shading with MSAA is entirely possible with DX10, just not as efficient.

This method lowers the image quality (very) slightly, as you only store one depth value per pixel from each polygon (the shader is run only once for each pixel inside the polygon). An MSAA depth buffer stores one depth value per sample in every pixel (it's not replicated like color outputs are).
 
I thought we were talking about reading the individual samples from a multisampled render-target texture, not about whether you can bind a multisampled resource to the pipeline without a manual resolve.
 
Alpha to coverage generates a coverage mask that is (binary) ANDed with the MSAA mask.

(ie Alpha = opacity = occlusion. Alpha = 0.5 means 50% occlusion of samples, so you combine that info with the number of samples inside the primitive to get the samples that will really be written to.)
[Basically means that if a primitive is covering 2 samples of a given pixel, with an alpha of 0.5, only one sample will receive this primitive's color.]
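
A bitmask sketch of that AND. The way alpha is turned into a mask here (set the lowest round(alpha * n) bits) is an assumption chosen for simplicity; real hardware uses its own, often dithered, patterns, so which samples survive depends on the pattern and on which samples the primitive covers.

```python
def alpha_to_coverage_mask(alpha, num_samples):
    """Convert alpha in [0, 1] to a bitmask with about alpha * num_samples bits set."""
    bits = int(alpha * num_samples + 0.5)        # round to the nearest sample count
    bits = max(0, min(num_samples, bits))        # clamp to the valid range
    return (1 << bits) - 1                       # lowest `bits` bits set (assumed pattern)


def written_samples(raster_mask, alpha, num_samples=4):
    """Samples that are actually written: rasterizer coverage AND alpha mask."""
    return raster_mask & alpha_to_coverage_mask(alpha, num_samples)


raster_mask = 0b0101                 # primitive covers samples 0 and 2
final_mask = written_samples(raster_mask, 0.5)
print(format(final_mask, "04b"), bin(final_mask).count("1"))
```

With the primitive covering samples 0 and 2 and an alpha of 0.5, only sample 0 ends up being written, which matches the two-samples-in, one-sample-out example above.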
 