The math

According to Nvidia's paper, it's 28 MB (1024x768x32, 4x FSAA). Democoder's calculation is right on target.
 
Xmas said:
Actually, DAC downsampling is more bandwidth efficient even before you reach a 1:1 fps/refresh ratio.
When the framerate is higher than (s-1)/(s+1) times the refresh rate (where s is the number of samples), DAC downsampling saves bandwidth (e.g. 1:3 for 2 samples, 3:5 for 4 samples)
This means however, that the more samples you take the more your framerate needs to approach refresh rate to benefit from DAC downsampling.
I must admit I was only giving an approximation but thanks for the more precise maths anyway:)
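
For anyone who wants to check Xmas's (s-1)/(s+1) rule, here is a rough sketch in Python. The cost model is my own assumption, not something stated in the thread: per pixel, DAC downsampling reads s samples every refresh, while a separate downsampling pass reads s samples and writes one per rendered frame, plus one DAC read per refresh. The function names are made up for illustration.

```python
from fractions import Fraction

def break_even_ratio(samples):
    """fps/refresh ratio above which DAC downsampling uses less bandwidth."""
    return Fraction(samples - 1, samples + 1)

def dac_saves_bandwidth(fps, refresh_hz, samples):
    # Assumed per-pixel traffic model (not from the thread):
    # DAC downsampling: s sample reads per refresh.
    dac_cost = samples * refresh_hz
    # Separate pass: s reads + 1 write per frame, plus 1 DAC read per refresh.
    pass_cost = (samples + 1) * fps + refresh_hz
    return dac_cost < pass_cost

for s in (2, 4):
    print(s, break_even_ratio(s))  # 2 1/3, then 4 3/5
```

Under that model, 4 samples at a 60 Hz refresh break even at 36 fps, so 50 fps favours DAC downsampling and 30 fps does not.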
Xmas said:
Well, no big difference here as this only affects where the samples are stored in memory. I'm not sure whether GF3/4 put the samples one after another in one buffer (=linear memory area) or uses separate buffers for each sample position. I guess this depends on what the memory controller can do best.
In some respects, yes, it is 'just a matter of where data is located', but in other respects, it isn't. I am interested, to some extent, in the HW configuration. The T-buffer approach is possibly more flexible, in that you could more easily do N independent renders, but then you either have separate memories or possibly suffer greater page break costs etc.

Chalnoth said:
(other than the Kyro series...they do FSAA internally)
Really? I must remember that.... or perhaps that's why I said I was certain that not all systems downsample in the DAC. ;)
 
I thought this would be an easy question to answer. Now look what I stirred up! ;)

Are images rendered in the back buffer (which thus needs a Z-buffer and stencil buffer) and then stored in the frame buffer(s)? And are multiple framebuffers then used to keep the screen from flashing at high framerates? Having done a little programming, I realize the need for a double buffer. Why do I need a third one?

Is the front buffer the same as the frame buffer? I'm just trying to get the terms right in my head.

So this is the general idea but different types of boards have different implementations so I basically have to download the whitepapers of the board I'm thinking about to make sure?
 
rAvEN^Rd said:
I thought this would be an easy question to answer. Now look what I stirred up! ;)

Are images rendered in the back buffer (which thus needs a Z-buffer and stencil buffer) and then stored in the frame buffer(s)? And are multiple framebuffers then used to keep the screen from flashing at high framerates? Having done a little programming, I realize the need for a double buffer.
Correct. Typically the frame buffer swap is just a matter of changing a couple of registers but, then again, if you need to do a downsampling pass (as was the case with early GeForce systems), then it'd be a bit more complicated.
Why do I need a third one?
Are you referring to triple buffering? If you are double buffering and synchronising the swap of buffers to the VSync signal (in order to avoid 'tearing' of the display), then your final instantaneous framerate will always drop to an integer fraction of your refresh rate. For example, if your refresh rate is 60Hz, then the framerates you will get are any of 60Hz, 30, 20, 15 etc. That is, if your system could only render frames at 59Hz, then you will actually only get 30Hz because of the locking to VSync.

Triple buffering allows the system to decouple the rendering and refresh to some extent so that you'd get, on average, higher overall framerates.
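
The VSync locking with double buffering can be sketched in a few lines of Python. This is only an illustration under the assumption that every frame takes the same time to render; `vsynced_fps` is a made-up name, not any driver's API.

```python
import math

def vsynced_fps(render_fps, refresh_hz):
    """Displayed framerate with double buffering and swaps locked to VSync."""
    # Each frame occupies a whole number of refresh intervals, rounded up.
    intervals = math.ceil(refresh_hz / render_fps)
    return refresh_hz / intervals

print(vsynced_fps(59, 60))  # 30.0
print(vsynced_fps(25, 60))  # 20.0
```

A system that could render at 59Hz misses every other refresh and ends up at 30Hz, which is the drop triple buffering is meant to soften.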

Is the front buffer the same as the frame buffer? I'm just trying to get the terms right in my head.
The front one is the one being read by the DAC and hence displayed. The back (i.e. hidden behind the front) is the one in which the next image is being constructed.
 
As I said before, you have to take the architecture into account when calculating the memory requirements.

A chip that does downsampling on the fly in the DAC needs one high resolution render target (A) with associated depth/stencil buffer (B), and one high resolution front buffer (C) that holds finished frames that will be shown on screen. The RAMDAC accesses C.
A buffer swap causes A and C to change their roles.
Each of these buffers is 3 MiB * s in size -> 9 MiB * s (where s is the number of samples)

Chips that don't do DAC downsampling need one high resolution render target (A) with associated depth/stencil buffer (B), one low res buffer to downsample to (C) and one low res front buffer (D). Remember you shouldn't downsample into the front buffer (i.e. the buffer that is currently read by the DAC) because that would cause tearing.
C and D are 3 MiB each, A and B 3 MiB * s each. That's 6 MiB + 6 MiB * s overall, 30 MiB for 4x AA.

That's without any kind of compression.
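
The two totals above can be reproduced with a short Python sketch. It assumes 1024x768 at 32 bits per pixel and a depth/stencil buffer of the same bit depth as the colour buffer; the function names are just for illustration.

```python
MIB = 1024 * 1024

def buffer_mib(width, height, bits_per_pixel):
    """Size of one single-sample buffer in MiB."""
    return width * height * (bits_per_pixel // 8) / MIB

def dac_downsampling_mib(width, height, bpp, samples):
    # A (render target), B (depth/stencil) and C (front buffer), all high res.
    return 3 * samples * buffer_mib(width, height, bpp)

def separate_pass_mib(width, height, bpp, samples):
    # A and B are high res; C (downsample target) and D (front buffer) are low res.
    base = buffer_mib(width, height, bpp)
    return 2 * samples * base + 2 * base

print(dac_downsampling_mib(1024, 768, 32, 4))  # 36.0
print(separate_pass_mib(1024, 768, 32, 4))     # 30.0
```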
 
Chips that downsample to a buffer could actually do it with:
High res z/stencil
High res back buffer
Low res front buffer

The downsampling takes a well-known, fixed time that is considerably shorter than a screen refresh. So when the back buffer is finished, start downsampling to the front buffer immediately (top to bottom). If the downsampling "catches up" with the point where data is read out for screen refresh, then stall the downsampling.

I don't think any chips actually do it that way though.
 
You're right, Basic, but I especially agree with that last sentence ;)

Another idea is that those who disable vsync don't care about tearing, and if you enable vsync the downsampling will be done in the vertical retrace period (if that's possible).
But I believe having one additional buffer is the simplest solution :)
 
It would be fairly safe to assume that if you start downsampling during the vertical retrace period, you will always be in front of the refresh.
 
How do I know when I have a problem with memory quantity? Suppose I can, somehow, calculate that 1280x1024x32 4 sample FSAA needs 60 MB (or whatever) to work. Does this mean that 64 MB of memory on the board will suffice (since I have more memory than that), or is it like a Windows installation where you need lots of free space after the install?

How much extra space do I need in order to not get a huge performance hit? What happens when I run out? Will it try to access main memory somehow?
 
It depends on the game how much extra memory you need, and it depends on the drivers how much the video card will allow you to have before it drops to a lower resolution.

If you don't have enough memory left over for textures after setting aside the framebuffers, then the textures will spill over into main memory. The video card will attempt to cache those textures intelligently in order to prevent too much slowdown, but if there is just too little texture memory remaining, performance will take a nosedive.
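
To put rough numbers on this, here is a back-of-the-envelope Python sketch for a 64 MB board at 1280x1024x32 with 4 samples. It assumes the layout with a separate downsample buffer described earlier in the thread, no compression, and no driver overhead; the texture working-set sizes are hypothetical, and so are the function names.

```python
MIB = 1024 * 1024

def framebuffer_mib(width, height, samples, bytes_per_pixel=4):
    """High-res colour + depth/stencil, plus low-res downsample target and front buffer."""
    base = width * height * bytes_per_pixel / MIB
    return 2 * samples * base + 2 * base

def texture_spill_mib(board_mib, fb_mib, texture_set_mib):
    """MiB of textures that no longer fit in local memory (0 if everything fits)."""
    local = max(0.0, board_mib - fb_mib)
    return max(0.0, texture_set_mib - local)

fb = framebuffer_mib(1280, 1024, 4)
print(fb)                             # 50.0
print(texture_spill_mib(64, fb, 20))  # 6.0
```

So with these assumptions the buffers fit, but only 14 MiB is left for textures, and a game using 20 MiB of textures per frame would have 6 MiB going over to main memory.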
 