Question concerning Video Memory Utilization when using Multisampling AA

warmachine79

Newcomer
Multisampling anti-aliasing mainly impacts memory bandwidth, not fillrate. But what about the amount of video memory being used? Does it increase as heavily as bandwidth utilization does when enabling multisampling anti-aliasing?
 
Hmm, that's an answer. A pretty short one. Can someone explain it in detail, or maybe post a link to an article where it is described?
 
You need additional buffers according to the size of your desired AA target. The more samples, the more buffer memory. Special techniques such as 3dfx's/Nvidia's filter-at-scanout use up additional memory.

With filter-at-scanout you can almost saturate your average 128 MByte card at 1600x1200 with 4-sample AA just with buffers.
 
The MSAA implementations on both Nvidia and AMD desktop GPUs store all the samples until the final resolve pass. So if you're using 4xMSAA, multiply the memory size of both the color and depth buffers by 4. Normally the front buffer is downsampled, so its size doesn't increase, though Nvidia is able to filter on scanout, and in that case the front buffer is also 4 times as big.
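As a rough sanity check of the numbers above, here is a quick Python sketch, assuming 4 bytes of color and 4 bytes of depth/stencil per sample (real driver allocations add padding, compression metadata and extra surfaces, so treat this as a ballpark only):

# Rough multisampled framebuffer size, per the description above.
# Assumption: 4 bytes color + 4 bytes depth/stencil per sample; real
# allocations (tiling, padding, compression metadata) will differ.
def msaa_buffer_mb(width, height, samples, bytes_color=4, bytes_depth=4,
                   filter_on_scanout=False):
    pixels = width * height
    back_color = pixels * bytes_color * samples   # multisampled color buffer
    depth = pixels * bytes_depth * samples        # multisampled depth/stencil
    # The front buffer is normally the resolved image; with filter-on-scanout
    # it stays multisampled as well.
    front = pixels * bytes_color * (samples if filter_on_scanout else 1)
    return (back_color + depth + front) / (1024 * 1024)

print(msaa_buffer_mb(1600, 1200, 4))                         # ~66 MB
print(msaa_buffer_mb(1600, 1200, 4, filter_on_scanout=True)) # ~88 MB

That also fits the earlier point about filter-at-scanout on a 128 MByte card: with those buffers alone taking around 90 MB, there is not much left for textures.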
 
Note that color and Z buffers are typically quite highly compressed when using MSAA. In theory, it would be possible to split the buffers so that not all data needs to be on-card while still giving decent performance (say, for each compressed tile, only half the memory is on-card and the rest is placed in system memory, so only very few accesses to system memory should happen and performance wouldn't be hampered too much). However, no card can do that AFAIK; I guess too much additional hardware logic would be needed to make it worthwhile to implement.
 
I'm fairly sure framebuffer size increases a lot more than bandwidth usage does. With current implementations, you have to allocate space for all samples (as if there were no colour/Z compression), yet most of them will typically end up being highly compressed.

I think many people also underestimate the fillrate/texture/shader hit of MSAA. While it is much cheaper than SSAA, there can be quite a few extra edge fragments to render.
 
I tested it. I used an older game (Far Cry) for testing; I think it's easier to compare with an older game because the texture load on video memory won't be too heavy.
Settings: Ultra Quality, 16xAF, all optimizations disabled. I used RivaTuner and VidMemWatcher for it. I didn't use any CSAA modes because my question only concerns pure multisampling. The video card is an 8800 GTS with 640 MB.

640x480
0x 188MB
4x 199MB
8x 205MB

1280x1024
0x 210MB
4x 260MB
8x 285MB

1600x1200
0x 225MB
4x 225MB (?)
8x 335MB

Every test was run twice. I don't understand the strange result with 4xMSAA at 1600x1200.
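For comparison, the simple per-sample model from earlier in the thread would predict roughly the following deltas (a Python sketch assuming 4 bytes color + 4 bytes depth per sample and nothing else; driver overhead, TMSAA and any additional surfaces are ignored):

# Naive extra allocation vs. no AA: (samples - 1) * (color + depth) per pixel.
# Assumes 4 bytes each; purely a ballpark, not what the driver really does.
def extra_mb(width, height, samples, bytes_per_sample=8):
    return width * height * bytes_per_sample * (samples - 1) / (1024 * 1024)

for w, h in [(640, 480), (1280, 1024), (1600, 1200)]:
    print(f"{w}x{h}: 4x ~{extra_mb(w, h, 4):.0f} MB, 8x ~{extra_mb(w, h, 8):.0f} MB")
# 640x480:   4x ~7 MB,  8x ~16 MB
# 1280x1024: 4x ~30 MB, 8x ~70 MB
# 1600x1200: 4x ~44 MB, 8x ~103 MB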
 
I'm fairly sure framebuffer size increases a lot more than bandwidth usage does. With current implementations, you have to allocate space for all samples (as if there were no colour/Z compression), yet most of them will typically end up being highly compressed.

I think many people also underestimate the fillrate/texture/shader hit of MSAA. While it is much cheaper than SSAA, there can be quite a few extra edge fragments to render.

I can't imagine fillrate is affected much. I guess it's less than 5%; that's very little additional work to render.
 
I tested it. I used an older game (Far Cry) for testing; I think it's easier to compare with an older game because the texture load on video memory won't be too heavy.
Settings: Ultra Quality, 16xAF, all optimizations disabled. I used RivaTuner and VidMemWatcher for it. I didn't use any CSAA modes because my question only concerns pure multisampling. The video card is an 8800 GTS with 640 MB.

[snip]

Every test was run twice. I don't understand the strange result with 4xMSAA at 1600x1200.
Did you use the in-game settings to enable MSAA or did you force it via the driver? The reason I'm asking is that there was once a GF6 driver which did not enable 4x MSAA at 1600x1200 because Nvidia had trouble determining the correct buffer size. So the official procedure was not to use driver AA.
 
I can't imagine fillrate is affected much. I guess it's less than 5%; that's very little additional work to render.

Those 5 percent were a Matrox guess from about five years ago. The geometric complexity of your average game has increased a lot since then.
 
Did you use the in-game settings to enable MSAA or did you force it via the driver? The reason I'm asking is that there was once a GF6 driver which did not enable 4x MSAA at 1600x1200 because Nvidia had trouble determining the correct buffer size. So the official procedure was not to use driver AA.

I had both the in-game AA and the driver AA (via nHancer) enabled to make sure it works. I also activated the checkbox that says "enhance the game setting". I used a tool named "Far Cry Benchmark".
When I am home, I will set the driver AA to "let the application decide" and re-test. But it makes sense; that is exactly the scenario and settings I have been testing under.
 
Those 5 percent were a Matrox guess from about five years ago. The geometric complexity of your average game has increased a lot since then.

Do you know a game with an extraordinarily high polygon count while the textures and shading are rather "weak"? If I tested that again on such a game with the same settings, I should get as a result:

1) Massive FPS drops when increasing the AA level
2) A massive increase in memory usage when increasing the AA level

Weak shading and textures would make sure that the FPS drop / memory usage results are due to the AA increase, especially if textures are render targets. Or to put it this way: "isolating" the AA performance hits...
 
Ah, and one thing I forgot: I activated transparency MSAA. Values might differ in other benchmarks; I think TMSAA could cause a lot of load in Far Cry, since with all the vegetation I can imagine there are lots of alpha-test textures in this scenario. Activating TSSAA results in a VERY massive performance drop, I think due to the dense vegetation.
 
Note that color and Z buffers are typically quite highly compressed when using MSAA. In theory, it would be possible to split the buffers so that not all data needs to be on-card while still giving decent performance (say, for each compressed tile, only half the memory is on-card and the rest is placed in system memory, so only very few accesses to system memory should happen and performance wouldn't be hampered too much). However, no card can do that AFAIK; I guess too much additional hardware logic would be needed to make it worthwhile to implement.
Matrox (Fragment AA) and 3DLabs (SuperScene AA) implemented split buffer approaches, although both buffers are kept in video memory. The first buffer holds a fixed number of samples per pixel, while the second buffer is used for allocating additional samples on demand, therefore it can be a lot smaller for typical scenes. Running out of sample memory can become a problem for more complex scenes, though, and you add a level of indirection to framebuffer accesses.
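A toy sketch of that kind of two-level allocation, purely illustrative (the class and layout below are invented for the example, not how Matrox or 3DLabs actually laid things out):

# Toy model of split-buffer AA: every pixel keeps one sample in a fixed
# buffer; edge pixels allocate extra samples from a shared on-demand pool and
# reach them through an index (the extra level of indirection mentioned above).
class SampleStore:
    def __init__(self, width, height, pool_samples):
        self.fixed = [None] * (width * height)  # one sample per pixel
        self.pool = []                          # extra samples, allocated on demand
        self.pool_limit = pool_samples
        self.extra = {}                         # pixel index -> (start, count) in pool

    def write(self, pixel, samples):
        self.fixed[pixel] = samples[0]
        if len(samples) > 1:                    # edge pixel needs extra samples
            if len(self.pool) + len(samples) - 1 > self.pool_limit:
                raise MemoryError("sample pool exhausted (scene too complex)")
            self.extra[pixel] = (len(self.pool), len(samples) - 1)
            self.pool.extend(samples[1:])

    def resolve(self, pixel):
        values = [self.fixed[pixel]]
        if pixel in self.extra:
            start, count = self.extra[pixel]
            values += self.pool[start:start + count]
        return sum(values) / len(values)        # simple box filter

# e.g. store = SampleStore(1600, 1200, pool_samples=500_000)
#      store.write(0, [0.2, 0.8]); print(store.resolve(0))  -> 0.5

Once the pool is full, the scheme runs into exactly the problem described above for more complex scenes.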
 
Did you use the in-game settings to enable MSAA or did you force it via the driver? The reason I'm asking is that there was once a GF6 driver which did not enable 4x MSAA at 1600x1200 because Nvidia had trouble determining the correct buffer size. So the official procedure was not to use driver AA.

Quasar, you were right. Using the in-game 4xMSAA, memory usage increased from 225 MB to 275 MB.
 