How do videocards store the frame they are outputting to the monitor

sonix666

Regular
Hi, on another forum we were discussing how a video card stores its framebuffers when FSAA is enabled. Two ideas came up:

1) Two framebuffers with all FSAA samples separate.
(memory = width * height * bytes/pixel * fsaa_samples * 2)

2) One framebuffer with all FSAA samples separate, plus another one (or two) into which the FSAA samples are merged. The idea behind this one is that it cuts down a lot on memory usage.
(memory = (width * height * bytes/pixel * fsaa_samples) + (width * height * bytes/pixel))
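
To put rough numbers on the two formulas above, here is a minimal sketch. The function name and the example values (1024x768, 4 bytes per pixel, 4 samples) are only illustrative assumptions, and it counts colour buffers only, ignoring Z/stencil and any compression a real card might apply.

```python
def fsaa_memory_bytes(width, height, bytes_per_pixel, samples):
    """Return (option_1, option_2) colour-buffer sizes in bytes."""
    resolved = width * height * bytes_per_pixel  # one merged (downsampled) buffer
    multisampled = resolved * samples            # one buffer keeping every sample
    option_1 = multisampled * 2                  # two buffers, all samples separate
    option_2 = multisampled + resolved           # one multisampled + one merged
    return option_1, option_2


# Illustrative values only: 1024x768, 32-bit colour, 4 samples.
opt1, opt2 = fsaa_memory_bytes(1024, 768, 4, 4)
print(opt1 / 2**20, opt2 / 2**20)  # 24.0 15.0 (MiB)
```

With these assumed values, option 2 needs 15 MiB instead of 24 MiB, and the gap grows with the sample count.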

Can anyone enlighten me which one video cards use (or maybe another option)?
 
To my knowledge, NV only uses 1) with the "quincunx"-type FSAA modes. Doing it that way prevents things from working properly in windowed mode, but when I run certain games in a window (particularly WoW and GW), standard 4xMSAA sure works.
 
Nvidia usually uses 1) for all MSAA modes in fullscreen. When memory is scarce (higher resolutions), for mixed SSAA/MSAA mode and in windowed mode, they fall back to 2).

1) saves bandwidth, provided the framerate is higher than (n-1)/(n+1) times the refresh rate, n being the number of samples.
2) saves memory space.
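
One way to see where (n-1)/(n+1) comes from is to count framebuffer accesses per pixel per second. This is a sketch under an assumed traffic model, not a description of any particular chip: 1) reads all n samples at every refresh, while 2) reads n samples and writes one merged pixel per rendered frame, then reads one pixel per refresh. The function names are made up for illustration.

```python
from fractions import Fraction


def traffic_option_1(samples, fps, refresh_hz):
    # Downsample at scanout: every refresh reads all samples of the pixel.
    return samples * refresh_hz


def traffic_option_2(samples, fps, refresh_hz):
    # Merge once per rendered frame (read n samples, write 1 pixel),
    # then scan out 1 merged pixel per refresh.
    return (samples + 1) * fps + refresh_hz


def break_even_fps(samples, refresh_hz):
    # Setting n*R = (n+1)*F + R and solving for F gives F = (n-1)/(n+1) * R.
    return Fraction(samples - 1, samples + 1) * refresh_hz


print(break_even_fps(4, 60))  # 36
print(break_even_fps(4, 85))  # 51
print(break_even_fps(2, 60))  # 20
```

Above the break-even framerate, 1) moves less data; below it, 2) does.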

There is a third option:
Render the antialiased scene in tiles into an on-chip memory, downsample on output to the framebuffer in external memory. This is what PowerVR chips and other TBDRs do, and Xenos as well.
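
As a rough software analogy of that third option (real tilers differ in many details): the full set of samples only ever exists for one tile at a time, and only merged pixels are written to the external framebuffer. The `shade` callback, the box filter, and the tile size are assumptions made purely for illustration.

```python
def render_tiled(width, height, tile, samples, shade):
    """Render tile by tile; only merged pixels reach the framebuffer."""
    framebuffer = [[0.0] * width for _ in range(height)]  # external memory
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            tile_h = min(tile, height - ty)
            tile_w = min(tile, width - tx)
            # Stand-in for on-chip memory: every sample, but for one tile only.
            on_chip = [[[shade(tx + x, ty + y, s) for s in range(samples)]
                        for x in range(tile_w)]
                       for y in range(tile_h)]
            # Downsample (plain average) while writing the tile out.
            for y in range(tile_h):
                for x in range(tile_w):
                    framebuffer[ty + y][tx + x] = sum(on_chip[y][x]) / samples
    return framebuffer


# Hypothetical shader: sample s simply has the value s.
fb = render_tiled(5, 3, 2, 4, lambda x, y, s: float(s))
print(fb[0][0])  # 1.5
```

External memory here holds width * height values no matter how many samples are used; the per-sample storage is limited to tile * tile * samples.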
 
arjan de lumens said:
T-buffer appears to be described in OpenGL extension #208, and seems to be not much more than a multisample buffer where you can mask which buffers you write to; I haven't yet seen a good description of what the M-buffer was supposed to be able to do.
The same, just with multisampling instead of supersampling.

Oh, and 3dfx did 1), too.
 
Xmas said:
1) saves bandwidth, provided the framerate is higher than (n-1)/(n+1) times the refresh rate, n being the number of samples.
Which is exactly why I think it's a somewhat dubious bandwidth-saving method: it mostly helps if your framerate is already high, and actually hurts (not much, though) if the framerate is low. For 4x AA at 60Hz, it only helps if the framerate is above 36fps; if you've got a CRT at, say, 85Hz, you need more than 51fps. It really helps for that Quake 3 benchmark at 600fps, though; in that case the bandwidth saving is substantial...
I could be remembering that wrong, but I thought only the GF 5 series could do it, and only at 2xMSAA (which would make some sense, because there you need a much lower framerate for it to be beneficial, i.e. only 20fps at 60Hz).
 
GeForce 4 was already able to do 2xMSAA/Quincunx with downsample at scanout. NV40 also supports this for 4xMSAA. Doesn't mean it has to, though.
 