What do you mean by post-process MSAA on SPU?
I'd think the real issue is bandwidth, and when that's the case, sending an unresolved buffer to main memory makes even less sense.
I seriously doubt they are doing full msaa.
Bad wording on my part, I meant "MSAA" more as "some form of anti-aliasing". I doubt they are doing full MSAA as well; I'm guessing it's just an edge-detect blur done with SPU help.
It might not use that much bandwidth to do this. This is all complete speculation on my part...but here goes. If they are indeed doing an edge-detect blur, then all the SPUs need access to is the Z buffer. It might not even need to be a full Z buffer: it could be 1/4 sized at 640x360, and it could be reduced precision as well, say 2 bytes per entry. It's entirely possible that such a buffer already exists, since it's useful for other post-process steps. So maybe ~450 KB needs to be sent back to the SPUs.
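Just to sanity check that ~450k figure, here's the arithmetic as a quick sketch. The function name is made up for illustration, and the 640x360 at 2 bytes per entry numbers are the guesses from above, not anything confirmed about the game.

```python
def depth_buffer_bytes(width, height, bytes_per_entry):
    """Size in bytes of a tightly packed depth buffer (no padding assumed)."""
    return width * height * bytes_per_entry

# Guessed 1/4 sized Z buffer: 640x360 entries at 2 bytes each
size = depth_buffer_bytes(640, 360, 2)
print(size, size / 1024)  # 460800 450.0
```

For comparison, the same 2 bytes per entry at full 1280x720 would be four times that, so the reduced size is where most of the saving would come from.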
The SPUs process that small Z buffer, detect "edges", and write out a result buffer, which again can be fairly coarse; it just needs to be hint data to the GPU as to "more or less" which pixels need to be blurred. This small, approximated "blur hint buffer" can probably be left in system RAM, since the last blur combine step gets done on the GPU anyway. As previously mentioned, I'm guessing that they already have a "blurred color buffer" available, likely used in other post-process steps and also likely to be 1/4 sized. I'm guessing that "blurred color buffer" is in video RAM.
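Here's a rough sketch of what that edge-detect pass over the small Z buffer could look like. It's plain Python rather than real SPU code, and the neighbor comparison, the threshold, and the 0/1 hint format are all my assumptions; I have no idea what the actual game does.

```python
def edge_hints(depth, width, height, threshold):
    """Build a coarse 0/1 "blur hint buffer" from a row-major depth buffer.

    A pixel is flagged as an edge when its depth differs from its right or
    bottom neighbor by more than `threshold` (a hypothetical tuning value).
    """
    hints = [0] * (width * height)
    for y in range(height):
        for x in range(width):
            i = y * width + x
            d = depth[i]
            # At the right/bottom border, compare the pixel with itself
            right = depth[i + 1] if x + 1 < width else d
            below = depth[i + width] if y + 1 < height else d
            if abs(d - right) > threshold or abs(d - below) > threshold:
                hints[i] = 1
    return hints

# Tiny 4x2 example with a depth step between columns 1 and 2
depth = [10, 10, 200, 200,
         10, 10, 200, 200]
print(edge_hints(depth, 4, 2, 50))  # [0, 1, 0, 0, 0, 1, 0, 0]
```

Only the pixel just before the depth step gets flagged here, which fits the idea that the hint buffer only has to be "more or less" right.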
So for the last step, the GPU samples from the small SPU-created "blur hint buffer" in system RAM and from the small pre-existing "blurred color buffer" in video RAM. Then, if the hint buffer says it's a blurred pixel, the shader writes the color value from the "blurred color buffer" into the final "opaque color buffer".
That last step can be done fairly cheaply by combining it with the alpha merge post-process step. Normally, when the separate 1/4 sized transparency pass is done, that transparency buffer needs to be blended back with the original opaque color buffer. In this case, tweak that step a bit, so instead of:
1) sample opaque color
2) blend with alpha color
3) write out new combined opaque/alpha color
...do it as:
1) sample both opaque color and blurred color (1/4 sized buffer so it's fairly quick) from video RAM
2) sample AA hint value from the small SPU-created hint buffer in system memory
3) blend alpha color with either opaque color or blurred color in a branchless manner
4) write out new combined opaque/alpha color
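The modified merge can be sketched per pixel like this. It's Python standing in for shader code, and using a lerp to pick between opaque and blurred color is just one way to do the "branchless" part; the function names and the simple alpha lerp are my assumptions.

```python
def lerp(a, b, t):
    """Linear interpolation: returns a at t=0 and b at t=1."""
    return a + (b - a) * t

def combine_pixel(opaque, blurred, hint, alpha_color, alpha):
    """Merge one pixel without branching on the hint value.

    opaque, blurred, alpha_color: (r, g, b) tuples of floats
    hint: 0.0 keeps the opaque color, 1.0 takes the blurred color
    alpha: opacity of the transparency-pass color (assumed plain lerp blend)
    """
    # Select the base color with a lerp instead of an if/else
    base = tuple(lerp(o, b, hint) for o, b in zip(opaque, blurred))
    # Blend the transparency color over whichever base was selected
    return tuple(lerp(c, a, alpha) for c, a in zip(base, alpha_color))

red, blue, green = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 0.0)
print(combine_pixel(red, blue, 0.0, green, 0.5))  # (0.5, 0.5, 0.0)
print(combine_pixel(red, blue, 1.0, green, 0.5))  # (0.0, 0.5, 0.5)
```

A nice side effect of the lerp is that the hint doesn't have to be strictly 0 or 1; a fractional value would give a partial blur, which might suit a coarse hint buffer.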
It's not free; there is some cost to doing it this way. But MSAA on RSX is very slow, so this can be quicker if you have SPU time to spare. This method may soften the image a bit, but from the screenshots it looks like the game already has a soft look anyway.