Dave H said: That calculation doesn't seem to count the memory needed for the Z-buffer. With a 32-bit multisampled Z-buffer, it becomes
1600 * 1200 * ( 6*8 + 2*4 ) = ~103 MBytes.
Thanks for the correction. Aren't most Z-buffers still 24-bit, though? (That'd get you to ~92MB.)
I'm not sure if the 8-bit stencil buffer has to be replicated for each sample set...I'd think it would be.
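For what it's worth, here's a quick sanity check of the numbers in C. The per-pixel breakdown is my guess at what 6*8 + 2*4 stands for: 6x multisampling at 4 bytes of color plus 4 bytes of Z/stencil per sample, plus two 4-byte front/back color buffers; the 24-bit case assumes 3-byte Z samples with no stencil.

```c
/* Sanity check of the arithmetic above. The breakdown is my reading
 * of "6*8 + 2*4": 6 samples x (4 bytes color + 4 bytes Z/stencil),
 * plus double-buffered 4-byte front/back color buffers. */
#include <stdio.h>

int main(void)
{
    const long pixels = 1600L * 1200L;

    /* 32-bit Z (or 24-bit Z + 8-bit stencil): 8 bytes per sample */
    long b32 = pixels * (6 * (4 + 4) + 2 * 4);

    /* 24-bit Z, no stencil, packed to 3 bytes per sample */
    long b24 = pixels * (6 * (4 + 3) + 2 * 4);

    printf("32-bit Z: %.1f MB\n", b32 / (1024.0 * 1024.0));
    printf("24-bit Z: %.1f MB\n", b24 / (1024.0 * 1024.0));
    return 0;
}
/* Prints roughly 102.5 MB and 91.6 MB, matching the ~103 MB and
 * ~92 MB figures quoted above. */
```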
OT but related: how exactly do framebuffer and Z compression work such that they save bandwidth yet don't save memory usage? Conversely, wouldn't hierarchical Z require more memory (in the same way that mipmaps take up more memory but reduce bandwidth utilization)?
There have been several explanations in these forums in the past, and some concepts outlined by ATI people in scattered places. Sorry, this computer and my connection are so slow (I'm on a modem now; haven't configured the DSL yet) that doing the search for you is more annoyance than I can stomach, but IIRC the keywords "compression", "bandwidth", and "memory" should turn up the comments. I'd guess it was sireric and OpenGL guy who made the comments I recall, so perhaps search by post and look for threads that match their names.
EDIT: I just realized that it might be necessary to look at this thread to understand some of what I say above.
My brief explanation of why, if not how: what if you update the buffer and store it compressed? What happens when you update the buffer for that screen position again? What if the new data can't be compressed into the same space? How do you manage the resulting overflow? How do you maintain predictable alignment and addressing for each screen position so that the buffer can still be randomly accessed?
IIRC, the general indication is that IHVs haven't figured out how to do the above efficiently with a lossless compression scheme yet. We had an interesting discussion primarily about Z-buffer compression. As for framebuffer compression, I think Matrox FAA might be considered one method of dealing with some of the above issues, if it actually saves storage space, though the current implementation seems to have its own problems.
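To make that concrete, here's a little sketch of why a per-tile compression flag saves bandwidth but not memory (the names and layout are mine, purely illustrative, not any IHV's actual hardware): every tile keeps its full uncompressed allocation at a fixed address, so random access stays trivial, and a compressed tile just moves fewer bytes across the bus.

```c
/* Hypothetical fixed-footprint Z-tile scheme -- my own illustration.
 * Each 8x8 tile of 32-bit Z values always occupies its full 256-byte
 * slot, so tile N lives at a fixed, randomly addressable offset.
 * A per-tile flag says whether the slot currently holds compressed
 * data; if so, only the compressed bytes cross the bus, but the slot
 * itself is never reclaimed -- bandwidth savings, no memory savings. */

#include <stdint.h>

#define TILE_PIXELS 64                  /* 8x8 tile                */
#define TILE_BYTES  (TILE_PIXELS * 4)   /* full uncompressed size  */

enum tile_state { TILE_RAW, TILE_COMPRESSED };

struct z_tile_dir {
    uint8_t state;    /* TILE_RAW or TILE_COMPRESSED            */
    uint8_t comp_len; /* bytes actually valid when compressed   */
};

/* Fixed address: possible only because every tile reserves
 * TILE_BYTES whether or not its contents happen to compress. */
static inline uint32_t tile_offset(uint32_t tile_index)
{
    return tile_index * TILE_BYTES;
}

/* Bytes that must move across the memory bus for one tile read. */
static inline uint32_t tile_read_cost(const struct z_tile_dir *d)
{
    return (d->state == TILE_COMPRESSED) ? d->comp_len : TILE_BYTES;
}
```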
As for the how of color compression on the R300: if several samples are the same color, you send the color once and have it replicated into the appropriate number of places. Possible implementations might be some flexibility in the addressing controller that lets one data value be sent to multiple locations (the actual way of doing the above), or, as I proposed in another post, some sort of "all or nothing" 1-bit mask indicating that either all the sample colors for a pixel are the same or they aren't; when the mask is set, only the first color sample is read at all, because it was the only one written in the first place (the conceptual way of doing the above).
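Here's a toy version of that "all or nothing" mask idea in C, again just my own conceptual sketch rather than anything ATI has confirmed: storage for all six samples is still reserved, but a fully covered pixel gets written and read back as a single color.

```c
/* Toy "all or nothing" MSAA color compression, per my conceptual
 * description above -- not a description of the R300's real logic.
 * Storage for all N samples is still reserved (no memory saving);
 * the 1-bit mask just lets a fully covered pixel be written and
 * read back as a single color value (bandwidth saving). */

#include <stdint.h>

#define NUM_SAMPLES 6

struct msaa_pixel {
    uint32_t sample[NUM_SAMPLES]; /* full allocation always present */
    uint8_t  all_same;            /* 1-bit mask: all samples equal? */
};

/* Fully covered pixel: one write instead of NUM_SAMPLES writes. */
static void write_covered(struct msaa_pixel *p, uint32_t color)
{
    p->sample[0] = color;
    p->all_same  = 1;
}

/* Edge pixel: per-sample write, mask cleared. */
static void write_sample(struct msaa_pixel *p, int i, uint32_t color)
{
    p->sample[i] = color;
    p->all_same  = 0;
}

/* Read: replicate the single stored color when the mask is set. */
static uint32_t read_sample(const struct msaa_pixel *p, int i)
{
    return p->all_same ? p->sample[0] : p->sample[i];
}
```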
I'm pretty sure ATI's engineers could have come up with something a bit different from either of those mechanisms, but since (to my limited knowledge) both appear feasible, there's a chance they didn't have to.
I'll note that the first fits pretty well, to my mind, with the idea of only a limited number of samples being allowed, and something about addressing concerns tickles my memory in regard to the discussions I mentioned.
Don't take the above as anything but a general indication of the issues and my own theories on how they might be resolved, though...search out the discussions for better answers!