NV40: Why doesn't MSAA work with FP Blending?

LeStoffer · Jul 6, 2004

Yes indeed, why doesn't MSAA work with FP Blending on the NV40?

Of course the simplest explanation could be that nVidia just decided that since the memory bandwidth (and storage) requirements will go up by quite a bit, MSAA should be a no-go.

But that explanation doesn't sit that well with me since nVidia tends to provide brand new features foremost and performance next. And 800x600 with twice FP Blending and 2xMSAA doesn't sound impossible to me anyway.

So what gives? I can't see any reason beyond the bandwidth (and some storage) constraints since MSAA is done well before the FP Blending stage anyway.

Beyond3d's NV40 preview said:
Although most of the pipeline operations work under the OpenEXR format, at present the FSAA multisampling scheme does not.

Maybe it is just a decision within the drivers for now?
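
For a sense of scale on the storage side of the 800x600, 2xMSAA case above, here is a rough back-of-the-envelope sketch. It assumes an uncompressed color buffer with one stored color per sample, a four-channel FP16 format at 8 bytes per pixel versus 4 bytes for 8-bit RGBA, and it ignores the Z buffer entirely. The function name is made up for illustration.

```python
def color_buffer_bytes(width, height, bytes_per_pixel, samples):
    """Size of an uncompressed multisampled color buffer (one color per sample)."""
    return width * height * bytes_per_pixel * samples

WIDTH, HEIGHT, SAMPLES = 800, 600, 2
RGBA8_BYTES = 4  # 4 channels x 8 bits (assumed baseline format)
FP16_BYTES = 8   # 4 channels x 16 bits

rgba8 = color_buffer_bytes(WIDTH, HEIGHT, RGBA8_BYTES, SAMPLES)
fp16 = color_buffer_bytes(WIDTH, HEIGHT, FP16_BYTES, SAMPLES)

print(f"RGBA8, 2x MSAA: {rgba8 / 2**20:.2f} MiB")
print(f"FP16,  2x MSAA: {fp16 / 2**20:.2f} MiB")
```

Under those assumptions the FP16 buffer is simply twice the size of the 8-bit one, and blending also has to read and write twice as many bytes per sample.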

ERP · Jul 6, 2004

My understanding is that FP blending is an incredibly expensive feature (transistor wise), so it's most likely related to transistor cost.

Or it's a limitation of the NV40's output logic. Does NV40 support MSAA on any >32-bit output format?

LeStoffer · Jul 6, 2004

ERP said:
My understanding is that FP blending is an incredibly expensive feature (transistor wise), so it's most likely related to transistor cost.

Yes, I thought about that too. But since MSAA and FP blending aren't performed at the same stage in the pipeline (before and after the PS units), I would assume they don't share the same logic.

But maybe there is some reuse with regard to write/read to the back buffer (for MSAA) and writing to the 'blend' buffer? :?

tb · Jul 6, 2004

ERP said:
My understanding is that FP blending is an incredibly expensive feature (transistor wise), so it's most likely related to transistor cost.

100% right. Maybe FP RTs with FSAA will be in the next VPU...

Thomas

Mintmaster · Jul 6, 2004

Is it just FP blending that doesn't work with MSAA? I thought it was FP rendering in general that doesn't work with MSAA on all DX9 cards, but of course I could be wrong.

BTW, does anyone know how the RTHDRIBL demo does FSAA?

Luminescent · Jul 6, 2004

It does seem odd that two seemingly unrelated rendering functions, such as MSAA and FP filtering, infringe upon each other in NV40. Aside from requiring floating point precision, what makes performing MSAA on integer blended surfaces different from performing AA on float blended surfaces?

ERP · Jul 6, 2004

Doesn't it depend largely if the blending occurs before or after the multisampling?

I was thinking about this and I'm not certain you can do it before. How does it determine the color of the dest pixel before replication? All 4 of them need not have the same color because of varying occlusion.

That means they would have to replicate the FP blending logic 2 or 4 times to support MSAA. Or at least reuse the same logic, which would complicate the output section.
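
To illustrate the point about destination samples differing, here is a minimal software sketch of per-sample blending. It is only a model of the arithmetic, not a description of how NV40 does it; the function names, the 4x sample layout, and the plain alpha-blend equation are assumptions chosen for the example.

```python
def blend_over(src_rgb, src_alpha, dst_rgb):
    """Plain alpha blend per channel: src * alpha + dst * (1 - alpha)."""
    return tuple(s * src_alpha + d * (1.0 - src_alpha)
                 for s, d in zip(src_rgb, dst_rgb))

def blend_pixel(src_rgb, src_alpha, dst_samples, coverage):
    """Blend one shaded fragment color into every covered sample of a pixel.

    The source color is shared (shaded once per pixel), but each sample
    keeps its own destination color, so the blend runs once per sample.
    """
    return [blend_over(src_rgb, src_alpha, dst) if covered else dst
            for dst, covered in zip(dst_samples, coverage)]

# A 4x MSAA pixel on an edge: two samples hold red, two hold blue.
dst = [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]
src = (0.0, 2.0, 0.0)  # HDR green, above 1.0, hence the need for FP

result = blend_pixel(src, 0.5, dst, coverage=[True, True, True, False])
for sample in result:
    print(sample)
```

Even with a single source color, the covered samples end up with different results because their destination colors differ, which is why one blend per pixel is not enough.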

Dave Baumann · Jul 6, 2004

FP blending and filtering will be orthogonal with all other currently available features in future architectures.

Xmas · Jul 7, 2004

LeStoffer said:
So what gives? I can't see any reason beyond the bandwidth (and some storage) constraints since MSAA is done well before the FP Blending stage anyway.

It's not. MSAA is a feature that affects operation throughout various parts of the whole pipeline, from triangle setup and others, to blending, frame buffer compression and downsampling.

Supporting MSAA on FP16 render targets would require changes to the last three parts mentioned. And while it does not seem like it requires some difficult and big changes, I don't think it is trivial or cheap either.
Of course it would be nice to have, but slow, so NVidia obviously thought it isn't worth the effort for this generation.

tb · Jul 7, 2004

Mintmaster said:
Is it just FP blending that doesn't work with MSAA? I thought it was FP rendering in general that doesn't work with MSAA on all DX9 cards, but of course I could be wrong.

BTW, does anyone know how the RTHDRIBL demo does FSAA?

It doesn't use the HW FSAA, it uses simple super-sampling. ShaderMark v2.1 will use HW FSAA with 16 bit floating point blending (ARGB16F) on NV4x HW and 16 bit integer blending (ARGB16) on R3xx and R4xx HW.

But I don't think games will use this technique, because you have to render the scene twice, once into an X8R8G8B8 FSAA render target and once into a 16-bit HDR render target (where you do all the HDR calculations), and then blend them together.

http://www.tommti-systems.de/temp/hdr_r3xx.png
http://www.tommti-systems.de/temp/hdr_nv4x.png

Thomas

KimB · Jul 7, 2004

My first guess would be that high-performance FP16 with FSAA would require framebuffer compression, and nVidia has not updated their framebuffer compression routines to work with FP16. This is the only thing I can think of that would directly require more transistors with a FP16 framebuffer.

Dave Baumann · Jul 7, 2004

Chalnoth said:
My first guess would be that high-performance FP16 with FSAA would require framebuffer compression, and nVidia has not updated their framebuffer compression routines to work with FP16

Hint: Read

3dcgi · Jul 7, 2004

The only things I can think of are the drivers just haven't implemented this feature or the memory organization with MSAA doesn't work well with FP16 blending.

FUDie · Jul 7, 2004

3dcgi said:
The only things I can think of are the drivers just haven't implemented this feature or the memory organization with MSAA doesn't work well with FP16 blending.

Or FP16 is not a displayable format.

-FUDie

KimB · Jul 7, 2004

Nah. I'm sure it's more because they decided it wasn't worth the transistor cost this generation. I expect that soon MSAA with FP16 will be possible (especially considering that Dave was told so by nVidia....if you'll read a few posts up).

Dave Baumann · Jul 7, 2004

Chalnoth said:
(especially considering that Dave was told so by nVidia....if you'll read a few posts up).

I didn't say I've been told so by NVIDIA.

FUDie · Jul 7, 2004

Chalnoth said:
Nah. I'm sure it's more because they decided it wasn't worth the transistor cost this generation. I expect that soon MSAA with FP16 will be possible (especially considering that Dave was told so by nVidia....if you'll read a few posts up).

If it's not displayable, then you wouldn't be able to downsample the FP16 AA buffer in the RAMDAC. Nothing you said contradicts what I said, yet you say I am wrong.

-FUDie

KimB · Jul 7, 2004

No, but downsampling via the RAMDAC is but one option available in the GeForce series of processors for finding the final color value of a particular pixel when using FSAA. A downsample should happen any time you attempt to read from the framebuffer (other than blending, of course), such as, for example, if you read the framebuffer in as a texture in the next pass of rendering (tone mapping in this case).

I don't really think that's a major obstacle, as the hardware could easily use the FP filtering hardware to do the downsampling.
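
As a rough picture of the downsample-then-tone-map order described above, here is a small sketch. The box-filter average and the Reinhard-style `x / (1 + x)` operator are assumptions picked for illustration; nothing in the thread says which filter or tone mapping operator the hardware or any particular demo uses.

```python
def resolve(samples):
    """Box-filter downsample: average the per-sample colors of one pixel."""
    n = len(samples)
    return tuple(sum(channel) / n for channel in zip(*samples))

def tone_map(rgb):
    """Reinhard-style operator x / (1 + x) per channel (illustrative choice)."""
    return tuple(c / (1.0 + c) for c in rgb)

# A 2x MSAA edge pixel: one very bright sample, one black sample.
pixel_samples = [(4.0, 4.0, 4.0), (0.0, 0.0, 0.0)]

resolved = resolve(pixel_samples)  # averaged in floating point
displayed = tone_map(resolved)     # mapped into the displayable 0..1 range

print(resolved)
print(displayed)
```

The averaging here happens on floating point values before any mapping to a displayable range, which is the step that needs FP-capable downsampling somewhere in the pipeline.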

Kristof · Jul 7, 2004

Not sure how NV40 implements MSAA, but inside triangles the throughput can be high since the same data is written to all "MSAA sub-pixels". These sub-pixels still need full Z-checking and blending, though, so quite possibly they simply do not have enough throughput capability with MSAA. So they don't have the required number of blending units when using MSAA, and no loopback capability to do it over multiple clocks?

K-

LeStoffer · Jul 7, 2004

Xmas said:
LeStoffer said:
So what gives? I can't see any reason beyond the bandwidth (and some storage) constraints since MSAA is done well before the FP Blending stage anyway.

It's not. MSAA is a feature that affects operation throughout various parts of the whole pipeline, from triangle setup and others, to blending, frame buffer compression and downsampling.

Thanks, I forgot about downsampling probably going on so late in the process on the GeForce cards (old 3dfx RAMDAC trick?).

But it is still a bit elusive to me why the full MSAA process (incl final downsampling) can't be finalized on the two different FP blending targets first after which these two FP 'images' are then blended together.

I'm missing something, but what?
