How to do Shader AA?

Jawed said:
If you want to render a scene with a linear tonemapping from 0-1, then you'd have to know, in advance, how to "invert" the tonal values of every element of the scene before rendering them, based on the final "exposure" you'd want for the scene.
I would think that you could make it piecewise-linear without having to know anything about the exposure. This would break somewhere in the dark range, and somewhere in the overbright range, but that shouldn't be a huge issue.
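Something like this toy sketch (Python purely for illustration; the knee and white-point values are made up): identity through the mid-range, a shallower linear segment for overbrights, clamped at the top. Since the curve is linear over most of the range, averaging (an AA resolve) and tonemapping commute wherever all samples land on the same segment, and inverting it doesn't require knowing the exposure up front:

```python
def pw_tonemap(x, knee=0.8, white=4.0):
    """Piecewise-linear tonemap (illustrative values only)."""
    # Mid-range: straight pass-through, trivially invertible.
    if x <= knee:
        return x
    # Overbright range: a second, shallower linear segment
    # mapping [knee, white] onto [knee, 1.0].
    if x <= white:
        return knee + (x - knee) * (1.0 - knee) / (white - knee)
    # Beyond the white point: clamp.
    return 1.0
```

Averaging two mid-range samples and then tonemapping gives the same result as tonemapping each first, which is where the "breaks" Jawed mentions are confined to the dark and overbright segments.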
 
Hyp-X said:
Well it isn't transparent in D3D.
Not sure about GL.

The only 'transparent' thing is when you render to RT textures directly, but in that case there's no copying. (And no AA)
Well, that does seem kind of silly on the surface, but it may actually allow one to do tonemapping before AA downsampling, and avoid having to deal with the problems Jawed pointed out.
 
Jawed said:
I don't think that's what Humus said at all :!:

Jawed

I guess I didn't say it explicitly, but it was the underlying message anyway. All hardware on the market (on the consumer side anyway, don't know with pro cards) only support 32bit framebuffers. The best you can get would be a RGB10_A2 format. So all FP16 stuff has to be done to off-screen buffers.
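For reference, RGB10_A2 packs the whole pixel into one 32-bit word. A toy packer (the bit order here is purely illustrative; the real D3D formats differ in channel order, e.g. A2R10G10B10 vs A2B10G10R10):

```python
def pack_rgb10a2(r, g, b, a):
    """Quantize floats in [0, 1] to 10/10/10/2 bits and pack them
    into a single 32-bit word (R in the low bits, A in the top two
    bits -- layout chosen for illustration only)."""
    def q(x, bits):
        m = (1 << bits) - 1  # max code for this channel width
        return min(max(int(round(x * m)), 0), m)
    return q(r, 10) | (q(g, 10) << 10) | (q(b, 10) << 20) | (q(a, 2) << 30)
```

The 10-bit channels buy some extra precision over ARGB8, but alpha is squeezed to 2 bits, and it's still fixed-point, not float.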
 
Chalnoth said:
Hrm, I was pretty sure that step 2 was transparent to the software. That is, if a multisample render target is used, it will automatically be downsampled when read in as a texture.

No hardware on the market supports texturing from multisample buffers directly. It needs to be resolved manually. In D3D you use StretchRect() and in OpenGL you use glCopyTexImage2D().
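Conceptually, the resolve that StretchRect() / glCopyTexImage2D() kicks off is just a per-pixel box average over the stored samples. A minimal sketch:

```python
def resolve_msaa(pixels):
    """Collapse each pixel's list of multisamples into a single
    value with a box filter, as a multisample resolve does.
    `pixels` is a list of per-pixel sample lists."""
    return [sum(samples) / len(samples) for samples in pixels]

# An edge pixel half-covered by white geometry resolves to grey:
resolve_msaa([[1.0, 1.0, 0.0, 0.0]])  # [0.5]
```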
 
Humus said:
I guess I didn't say it explicitly, but it was the underlying message anyway. All hardware on the market (on the consumer side anyway, don't know with pro cards) only support 32bit framebuffers. The best you can get would be a RGB10_A2 format. So all FP16 stuff has to be done to off-screen buffers.
My apologies to Hyp-X.

I admit I'm now entirely bemused by the ability of a graphics card to render (and blend) FP16 to a buffer, but not to the framebuffer.

Are you saying that FP16 has to be rendered into a pair of buffers simultaneously (e.g. two G16R16F - I think that's what I've seen - one for R,G and one for B,A)? because there's no buffer format that supports all four channels concurrently?

Damn I'm confused... All along I've been thinking FP16 was a framebuffer format. I can't be the only one, can I? ARGH...

Sleep...

Jawed
 
I can see how having texture units capable of reading directly from "multi-sample textures" could be complicated and costly HW-wise...

So is there really such a large quality degradation associated with tone-mapping an FP16 texture obtained from a multi-sampled FP16 backbuffer rendered/downsampled in linear color space (versus tone-mapping the original FP16 backbuffer)?
 
Jawed said:
My apologies to Hyp-X.

I admit I'm now entirely bemused by the ability of a graphics card to render (and blend) FP16 to a buffer, but not to the framebuffer.

Are you saying that FP16 has to be rendered into a pair of buffers simultaneously (e.g. two G16R16F - I think that's what I've seen - one for R,G and one for B,A)? because there's no buffer format that supports all four channels concurrently?

Damn I'm confused... All along I've been thinking FP16 was a framebuffer format. I can't be the only one, can I? ARGH...

Sleep...

Jawed

No I'm not.
There's A16B16G16R16F, but it can only be used as an off-screen render target / RT texture.

Fullscreen:
The framebuffer is the buffer that is read by the RAMDAC and therefore displayed on your monitor.
No card's RAMDAC supports more than 10 bits per component.
Also, there might be a limitation in Windows not being able to handle an FP video mode (maybe the video mode stuff will be more flexible with Vista...)

Windowed:
The buffer you render to is copied to the actual framebuffer (inside the window), but there's no support for FP->FX8 conversion here. It would probably need changes in the D3D runtime and the driver. But most importantly, I don't think there's hardware support for that conversion.

To make FP16 a useful framebuffer format, there would need to be a way to specify a non-linear conversion (for tone-mapping & gamma correction), which currently exists only for FX8 (the gamma table in the DAC & D3DPRESENT_LINEAR_CONTENT for copying).
 
psurge said:
So is there really such a large quality degradation associated with tone-mapping an FP16 texture obtained from a multi-sampled FP16 backbuffer rendered/downsampled in linear color space (versus tone-mapping the original FP16 backbuffer)?

Nope, actually that is the correct way to do it.
Jawed was only worried that the downsampling would apply gamma correction, which would be incorrect since the content is linear - but as I said, it's optional.

I don't think hardware supporting FP16 AA will support gamma-corrected downsampling - it doesn't make much sense...
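A toy numeric illustration of that point (assuming gamma 2.2): a gamma-corrected resolve treats the stored values as gamma-encoded - it linearizes, averages, and re-encodes - which is exactly wrong when the buffer already holds linear FP16 data:

```python
GAMMA = 2.2

def plain_resolve(samples):
    # Correct for a linear (FP16 HDR) buffer: just average the samples.
    return sum(samples) / len(samples)

def gamma_corrected_resolve(samples):
    # Assumes the stored values are gamma-encoded: linearize,
    # average, re-encode. Wrong for a buffer holding linear data.
    linear = [s ** GAMMA for s in samples]
    return (sum(linear) / len(linear)) ** (1.0 / GAMMA)

edge = [0.0, 0.0, 1.0, 1.0]            # pixel half in shadow, half lit
correct = plain_resolve(edge)           # 0.5
wrong = gamma_corrected_resolve(edge)   # ~0.73: overbrightens the edge
```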
 
This info is all here, but I thought I'd summarize it, as there appears to be some confusion...

There are basically 3 types of video surfaces (actually there are more than 3 but for this discussion 3 will do).
1) Texture
Can be read by the texture units, includes compression formats (DXT), float formats (FP32, FP16, etc.) and integer (ARGB8, ARGB16 etc.). Often these are also 'swizzled' to speed up bilinear reads.
2) Render targets
Can be rendered to; formats include integer (ARGB8, RGB10A2) and float (FP16, FP32). This is a subset of the texture formats (i.e. you can't render to DXT). Some render targets can have MSAA applied, but not all (e.g. ARGB8 can, FP16 can't).
3) Scan-out
Can be displayed on a monitor; a further subset of the render-target formats. Usually controlled by the DAC, which currently maxes out at 10-bit integer, so the two common formats are ARGB8 and RGB10A2. No card can scan out a float render target (actually, IIRC, 3DLabs' top-end card can...)

If you want to use the same surface in multiple places (texture and render target), you have to meet the limitations of both, AND often a few more. E.g. textures can't read MSAA render targets, so if you want to texture from something that was rendered with MSAA, you have to lose that information (via a copy blit).

A conventional HDR renderer will render the scene into an FP16 render target, with no MSAA, in linear colour space. You make that a texture and run a tone-mapper on it, rendering into an ARGB8 surface with gamma on (converting linear to gamma 2.2), which can then be displayed to the user. There is nowhere to get MSAA: the FP16 target can't do MSAA, and the tone-mapper pass (which could have MSAA on, since it renders to ARGB8) has no edge information.
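Per pixel, that tone-mapping pass boils down to something like this (Reinhard's x/(1+x) is just a stand-in operator; any tone-mapper fits the same slot):

```python
def tonemap_to_argb8(hdr, gamma=2.2):
    """One channel of the tone-mapping pass: linear FP16 value in,
    compress with Reinhard's x / (1 + x), gamma-encode, quantize
    to an 8-bit code for the ARGB8 surface."""
    mapped = hdr / (1.0 + hdr)          # [0, inf) -> [0, 1)
    encoded = mapped ** (1.0 / gamma)   # linear -> gamma 2.2
    return round(encoded * 255)         # ARGB8 channel value
```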
 
Big thanks to Humus, Hyp-X and DeanoC.

I shall refer to "FP16 render target blending" from now on, not FP16 framebuffer blending.
Jawed
 
Just to clarify :

NV40 and G70 support 'FP16 filtering and blending' on both Textures and Render Targets.

R4X0 supports 'FP16 textures' (but no blending or filtering). Does it support FP16 render targets? (Or just FX16?)

I suppose blending (for HDR) has to be done in the shader? Would this require ping-ponging between two render targets?
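On hardware without FP16 blending, you'd read the previous render target as a texture, add the new contribution in the pixel shader, write to the other target, and swap each pass. A toy simulation of that ping-pong (plain Python standing in for the shader passes):

```python
def pingpong_additive_blend(base, layers):
    """Emulate additive framebuffer blending in the shader:
    sample the previous render target, add the new layer's
    contribution, write to the other target, then swap."""
    rt = [list(base), [0.0] * len(base)]  # the two FP16 render targets
    src, dst = 0, 1
    for layer in layers:
        for i, contrib in enumerate(layer):
            rt[dst][i] = rt[src][i] + contrib  # the shader's "blend"
        src, dst = dst, src                    # ping-pong the targets
    return rt[src]
```

The obvious cost is a full extra texture read per blended pass, plus the bandwidth of writing the whole target out again.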
 
PeterAce said:
Just to clarify :

NV40 and G70 support 'FP16 filtering and blending' on both Textures and Render Targets.

R4X0 supports 'FP16 textures' (but no blending or filtering). Does it support FP16 render targets? (Or just FX16?)

I suppose blending (for HDR) has to be done in the shader? Would this require ping-ponging between two render targets?

Pixel formats supported on R4x0 cards:

Code:
A8R8G8B8      TCV+F RT+BL 2x 4x 6x 
X8R8G8B8      TCV+F RT+BL 2x 4x 6x 
R5G6B5        TCV+F RT+BL 2x 4x 6x 
X1R5G5B5      TCV+F RT+BL 
A1R5G5B5      TCV+F RT+BL 
A4R4G4B4      TCV+F RT+BL 
A8            TCV+F 
A2B10G10R10   TCV+F RT    
G16R16        TCV+F RT    
A2R10G10B10   TCV+F RT    
A16B16G16R16  TCV+F RT    
L8            TCV+F 
A8L8          TCV+F 
V8U8          TCV+F 
L6V5U5        TCV+F 
X8L8V8U8      TCV+F 
Q8W8V8U8      TCV+F 
V16U16        TCV+F 
A2W10V10U10   TCV+F 
UYVY          TCV+F 
YUY2          TCV+F 
DXT1          TCV+F 
DXT2          TCV+F 
DXT3          TCV+F 
DXT4          TCV+F 
DXT5          TCV+F 
ATI2 (3DC)    TCV+F 
D16_LOCKABLE        Z     
D24S8               Z     2x 4x 6x 
D24X8               Z     2x 4x 6x 
D16                 Z     2x 4x 6x 
L16           TCV+F 
Q16W16V16U16  TCV+F 
R16F          TCV   RT    
G16R16F       TCV   RT    
A16B16G16R16F TCV   RT    
R32F          TCV   RT    
G32R32F       TCV   RT    
A32B32G32R32F TCV   RT
T = texture
C = cube texture
V = volume texture
+F = filtering support
Z = depth-stencil buffer
RT = render-target
+BL = blending support
2x etc. = AA support
 