Alternative AA methods and their comparison with traditional MSAA*

Shifty Geezer · Dec 5, 2009

Mastperf said:
Which still doesn't look that different given the soft look of the game. There's very little aliasing even with it missing in the 360 version.

You're missing the point. The question here isn't how good the AA looks compared to XB360, but how good it looks close up compared to any game. To illustrate, I've created a few comparison pics taken from pictures 3/4 from here.

This one illustrates texture detail which Quincunx should have destroyed.

I haven't labelled them, so perception isn't skewed by platform information.

The quality of AA is exceptional. Perhaps with the blurring and JPEGing, it could, in these samples, be 4xMSAA. However, check out the white starting line from pictures 5 and 6. That looks more like a line-drawing algorithm in effect than any form of averaged sampling. At the very least, they are managing 4xMSAA on PS3 and 0AA on XB360, which goes completely against the norm. That suggests perhaps a more selective method as proposed in the presentation Arwin linked to. Perhaps PS3 is applying a selective 4xMSAA so doesn't have the bandwidth hit...but then XB360 could manage 2xAA at least.

grandmaster · Dec 5, 2009

I'll try to get a couple of lossless shots for tomorrow...

digitalwanderer · Dec 5, 2009

Quick question, with all this focus on console AA for this title shouldn't AA work with it on the PC?

It doesn't...

assen · Dec 5, 2009

No, the technique they use is very PS3-specific; I think it would work well only on really high-end PCs at really low resolutions.

Tchock · Dec 5, 2009

Could it be a tent-like custom filter resolve?

If it's running on the SPE you could theoretically have loads of AA filter variations, based on what I've seen of offline renderers. Catmull, Sinc, etc...

Or I'm on a completely different planet here

(can't get the PDFs as of now, perhaps later)

Quincunx isn't a particularly smart pattern to use, it's just a lazy exit that conveniently matches up with the lack of detail textures present in the current-gen consoles IMO.

jlippo · Dec 5, 2009

Tchock said:
Could it be a tent-like custom filter resolve?

If it's running on the SPE you could theoretically have loads of AA filter variations, based on what I've seen of offline renderers. Catmull, Sinc, etc...

Or I'm on a completely different planet here (can't get the PDFs as of now, perhaps later)

Quincunx isn't a particularly smart pattern to use, it's just a lazy exit that conveniently matches up with the lack of detail textures present in the current-gen consoles IMO.

Not likely; even with tent filters the edges would get wider than a single row of pixels.
This looks very close to the morphological AA paper from Intel, which basically gets the quality of Wu's line algorithm.
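To illustrate the point about tent filters widening edges, here is a minimal 1D sketch. It assumes a 3-tap tent kernel and a hard step edge; `tent_weights` and `filter_row` are hypothetical helper names, not anything taken from the game or the paper.

```python
def tent_weights(radius):
    """Normalised 1D tent (triangle) weights at integer offsets inside the radius."""
    offsets = range(-radius + 1, radius)
    raw = [1.0 - abs(d) / radius for d in offsets]
    total = sum(raw)
    return [w / total for w in raw]


def filter_row(row, weights):
    """Convolve a row with the kernel, clamping samples at the borders."""
    half = len(weights) // 2
    out = []
    for x in range(len(row)):
        acc = 0.0
        for k, w in enumerate(weights):
            sx = min(max(x + k - half, 0), len(row) - 1)
            acc += w * row[sx]
        out.append(acc)
    return out


weights = tent_weights(2)                       # [0.25, 0.5, 0.25]
step = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]           # hard edge between pixels 2 and 3
print(filter_row(step, weights))                # [0.0, 0.0, 0.25, 0.75, 1.0, 1.0]
```

The transition now spans two pixels (0.25 and 0.75) instead of sitting on a single boundary, which is the widening described above.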

MfA · Dec 6, 2009

assen said:
No, the technique they use is very PS3-specific; I think it would work well only on really high-end PCs at really low resolutions.

I doubt they are doing anything you couldn't do on a DX10 GPU. The GPU would probably waste a lot more time on skipping pixels than a CPU but meh ... it can spare the juice.

Weaste · Dec 6, 2009

This is very interesting. Has anyone got an interview in the pipeline to question them on exactly what they have done here (yes I know that they don't have jobs anymore), and how much SPU "power" so to speak it takes to do this? The game could run like ass for all we know.

djskribbles · Dec 6, 2009

I'm sure they wouldn't do it if it caused enough of a performance hit to make it run noticeably worse than the 360 version.

joker454 · Dec 6, 2009

Weaste said:
This is very interesting. Has anyone got an interview in the pipeline to question them on exactly what they have done here (yes I know that they don't have jobs anymore), and how much SPU "power" so to speak it takes to do this? The game could run like ass for all we know.

They could be running it in parallel with the transparency pass, so that if they have spu's free the performance hit could be absorbed. In other words, finish the opaque pass on gpu like normal, then send the gpu off to do the 1/4 res transparency pass. While the gpu churns on that, the spu's can take the now complete opaque color buffer and post process the msaa on that. There is likely already a "blurred" color buffer available someplace since it's often needed by post process steps anyways, so the spu msaa task just has to do some edge detect and replace colors in the opaque color buffer with colors from the existing blurred color buffer whenever they are near an edge. If there are enough transparencies going on in a given frame, then the spu msaa cost can potentially be totally absorbed. Once both the transparency pass and msaa tasks are done, have the gpu blend back the transparency buffer into the new msaa'd opaque color buffer.

betan · Dec 7, 2009

joker454 said:
They could be running it in parallel with the transparency pass, so that if they have spu's free the performance hit could be absorbed. In other words, finish the opaque pass on gpu like normal, then send the gpu off to do the 1/4 res transparency pass. While the gpu churns on that, the spu's can take the now complete opaque color buffer and post process the msaa on that. There is likely already a "blurred" color buffer available someplace since it's often needed by post process steps anyways, so the spu msaa task just has to do some edge detect and replace colors in the opaque color buffer with colors from the existing blurred color buffer whenever they are near an edge. If there are enough transparencies going on in a given frame, then the spu msaa cost can potentially be totally absorbed. Once both the transparency pass and msaa tasks are done, have the gpu blend back the transparency buffer into the new msaa'd opaque color buffer.

what do you mean by post process msaa on spu?
I'd think the real issue is the bandwidth, and when that's the case sending unresolved buffer to main mem makes even less sense.
I seriously doubt they are doing full msaa.

joker454 · Dec 7, 2009

betan said:
what do you mean by post process msaa on spu?
I'd think the real issue is the bandwidth, and when that's the case sending unresolved buffer to main mem makes even less sense.
I seriously doubt they are doing full msaa.

Bad wording on my part, I meant "msaa" more as "some form of anti aliasing". I doubt they are doing full msaa as well, I'm guessing it's just an edge detect blur done with spu help.

It might not use that much bandwidth to do this. This is all complete speculation on my part...but here goes. If they are indeed doing an edge detect blur, then all the spu's need access to is the Z buffer. It might not even need to be a full Z buffer, it could be 1/4 sized at 640x360, and it could be reduced precision as well, say 2 bytes per entry. It's entirely possible that such a buffer already exists since it's useful for other post process steps. So maybe ~450k needs to be sent back to the spu's.
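The ~450k figure follows directly from the guessed buffer dimensions. A quick check of the arithmetic, using only the numbers from the post (the variable names are mine):

```python
# Size of the speculated quarter-size, reduced-precision Z buffer.
WIDTH, HEIGHT = 640, 360     # half of 1280x720 in each dimension, so 1/4 the pixels
BYTES_PER_ENTRY = 2          # reduced-precision depth, as guessed in the post

size_bytes = WIDTH * HEIGHT * BYTES_PER_ENTRY
size_kib = size_bytes / 1024

print(size_bytes, size_kib)  # 460800 450.0
```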

The spu's process that small Z buffer, detect "edges", and write out a result buffer which again can be fairly coarse, it just needs to be hint data to the gpu as to "more or less" which pixels need to be blurred. This small/approximated "blur hint buffer" probably can be left in system ram, since the last blur combine step gets done on gpu anyways. As previously mentioned, I'm guessing that they already have available to them a "blurred color buffer", likely used in other post process steps, and also likely to be 1/4 sized. I'm guessing that "blurred color buffer" is in video ram.
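A rough sketch of what such an edge detect over a small Z buffer might look like, purely as an illustration of the speculation above. The function name `depth_edge_hints`, the neighbour-difference test, and the threshold value are all assumptions; nothing here is known to match what the game does.

```python
def depth_edge_hints(depth, threshold):
    """Return a 0/1 hint grid marking pixels on either side of a depth discontinuity.

    A pixel is flagged when its depth differs from its right or lower
    neighbour by more than `threshold` (both pixels of the pair are flagged).
    """
    h, w = len(depth), len(depth[0])
    hints = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            d = depth[y][x]
            if x + 1 < w and abs(d - depth[y][x + 1]) > threshold:
                hints[y][x] = hints[y][x + 1] = 1
            if y + 1 < h and abs(d - depth[y + 1][x]) > threshold:
                hints[y][x] = hints[y + 1][x] = 1
    return hints


# Tiny made-up depth buffer: a near object (10) next to a far one (50).
depth = [
    [10, 10, 50, 50],
    [10, 10, 50, 50],
]
print(depth_edge_hints(depth, threshold=5))  # [[0, 1, 1, 0], [0, 1, 1, 0]]
```

The output grid plays the role of the "blur hint buffer": it only says roughly where edges are, and leaves the actual colour work to a later step.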

So for the last step, the gpu samples from the small spu created "blur hint buffer" in system ram, samples from the small pre-existing "blurred color buffer" in video ram, then if the hint buffer says it's a blurred pixel the shader goes ahead and writes out the color value from the "blurred color buffer" into the final "opaque color buffer".

That last step can be done fairly cheap by combining it with the alpha merge post process step. Normally when the separate 1/4 sized transparency pass is done, that transparency buffer needs to be blended back with the original opaque color buffer. In this case tweak that step a bit, so instead of:

1) sample opaque color
2) blend with alpha color
3) write out new combined opaque/alpha color

...do it as:

1) sample both opaque color and blurred color (1/4 sized buffer so it's fairly quick) from video ram
2) sample AA hint value from small spu created hint buffer in system memory
3) blend alpha color with either opaque color or blurred color in a branchless manner
4) write out new combined opaque/alpha color
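The modified merge step can be sketched per pixel like this. It is a single-channel illustration only: `composite_pixel` and its parameters are hypothetical names, and it assumes the hint is 0 or 1 and that the alpha color is blended with a plain lerp.

```python
def composite_pixel(opaque, blurred, hint, alpha_color, alpha):
    """Merge the transparency layer over either the opaque or the blurred color.

    `hint` is 0 (keep the sharp opaque color) or 1 (use the blurred color).
    The selection is done arithmetically, so there is no branch on `hint`.
    """
    base = opaque + hint * (blurred - opaque)         # branchless select
    return base * (1.0 - alpha) + alpha_color * alpha  # blend in the alpha layer


# Same pixel, with and without the edge hint set.
print(composite_pixel(0.25, 0.75, 0, 1.0, 0.5))  # 0.625
print(composite_pixel(0.25, 0.75, 1, 1.0, 0.5))  # 0.875
```

On a GPU the same select would typically be a single lerp/mix instruction, which is why folding it into the existing alpha merge pass is plausible as a cheap option.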

It's not free, there is some cost to do it this way, but msaa on rsx is very slow so this can be quicker if you have spu time to spare. This method may soften the image a bit, but from the screen shots it looks like the game already has a soft look anyways.

MfA · Dec 7, 2009

Even if applied selectively, blurring just gives you a blurry staircase. For edge AA you need to do a weighted blend using the subpixel location of the edge inside the pixel (I'd try moments to determine the edge direction/location).
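As a minimal sketch of the weighted blend described here: if you know where a near-horizontal edge passes through a pixel, the blend weight is the fraction of the pixel on one side of it. The helper names and the midpoint approximation below are my assumptions; this does not use moments, it just takes the edge as a known line `v = slope * u + intercept`.

```python
def coverage_under_line(x, y, slope, intercept):
    """Approximate fraction of pixel (x, y) lying between its row start and the line.

    Uses the line's height at the pixel's horizontal centre, which is exact
    when the line enters and leaves through the pixel's left and right sides.
    """
    height_at_centre = slope * (x + 0.5) + intercept
    return min(max(height_at_centre - y, 0.0), 1.0)


def edge_blend(color_a, color_b, coverage_a):
    """Blend two colors by the fraction of the pixel covered by side A."""
    return color_a * coverage_a + color_b * (1.0 - coverage_a)


# A shallow edge rising one pixel over four columns.
row0 = [coverage_under_line(x, 0, 0.25, 0.0) for x in range(4)]
print(row0)                             # [0.125, 0.375, 0.625, 0.875]
print(edge_blend(1.0, 0.0, row0[0]))    # 0.125
```

The result is a smooth gradient along the edge rather than a softened staircase, which is the difference between coverage-based blending and a selective blur.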

jlippo · Dec 7, 2009

MfA said:
Even if applied selectively, blurring just gives you a blurry staircase. For edge AA you need to do a weighted blend using the subpixel location of the edge inside the pixel (I'd try moments to determine the edge direction/location).

First you find edges, then do pixel counting on those areas and basically fit Wu lines to each pixel row and column, and you get the needed blending information.
If I understood the idea of MLAA correctly, you do not get correct sub-pixel accuracy for the blending, but you still have full gradients to play with.

After pixel counting the blending is simple, and you have a separate pass for horizontal and vertical lines; similar to Wu's algorithm, you blend 2 texels with a certain factor and that's it.
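A heavily simplified sketch of the pixel counting and blend-factor idea, for one row of edge flags only. It assumes the simplest "L-shaped" case, where the step sits at the left end of a run and the fitted line falls from half a pixel to zero over the run length. The function names are hypothetical and this is not the algorithm from the Intel paper, just the general shape of it.

```python
def run_lengths(edge_row):
    """Pixel counting: (start, length) of each run of consecutive edge flags."""
    runs = []
    start = None
    for x, flag in enumerate(edge_row + [0]):   # trailing 0 closes a final run
        if flag and start is None:
            start = x
        elif not flag and start is not None:
            runs.append((start, x - start))
            start = None
    return runs


def l_shape_weights(length):
    """Blend weight per pixel along a run, for a line from 0.5 down to 0.

    Each weight is the area under the line within that pixel, i.e. the
    line's height at the pixel centre.
    """
    return [0.5 * (1.0 - (i + 0.5) / length) for i in range(length)]


def blend(pixel, neighbour, weight):
    """Mix in the neighbour across the edge by the computed weight."""
    return pixel * (1.0 - weight) + neighbour * weight


print(run_lengths([0, 1, 1, 1, 1, 0, 1, 1]))   # [(1, 4), (6, 2)]
print(l_shape_weights(4))                      # [0.4375, 0.3125, 0.1875, 0.0625]
```

The weights depend only on the run length and the position along it, which is why the accuracy is limited to what can be inferred from whole-pixel patterns rather than true sub-pixel geometry.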

Fun thing is that you can basically use it on whatever you want on screen: z-buffer edges, full screen color, perhaps to AA a variance shadow map buffer.

semitope · Dec 7, 2009

joker454 said:
It's not free, there is some cost to do it this way, but msaa on rsx is very slow so this can be quicker if you have spu time to spare.

Makes it sound like it would never be used. No-one likes "very slow"

Still no word on how they did it? None of the usual suspects have gotten in contact with the dev since they mentioned doing MSAA on the spus?

This was interesting. Don't quite understand everything they say there.

http://diaryofagraphicsprogrammer.blogspot.com/2009/07/msaa-on-ps3-with-light-pre-pass-on-spu.html

deepbrown · Dec 7, 2009

semitope said:
Makes it sound like it would never be used. No-one likes "very slow"

Still no word on how they did it? None of the usual suspects have gotten in contact with the dev since they mentioned doing MSAA on the spus?

This was interesting. Don't quite understand everything they say there.

http://diaryofagraphicsprogrammer.blogspot.com/2009/07/msaa-on-ps3-with-light-pre-pass-on-spu.html

A little hard to contact them as Pandemic is no more.

Arwin · Dec 7, 2009

deepbrown said:
A little hard to contact them as Pandemic is no more.

It's not as if the programmers suddenly went off and killed themselves, so it's a matter of tracking down the guys who worked on the SPUs. Should be possible. I did an interview myself recently where this exact same topic came up by the way, but I can't discuss it yet; I first have to edit it and get it PR approved.

PeterT · Dec 7, 2009

assen said:
No, the technique they use is very PS3-specific; I think it would work well only on really high-end PCs at really low resolutions.

If it's really a morphological AA technique then it's quite well suited to GPGPU. Or rather, at least the methods I experimented with in Matlab are intended for 720p @ 60 fps use with OpenCL, and I'm quite confident that it's possible.

Are there any <= 1-pixel wide structures in the shots from the game? There are some quite distinctive artifacts for that in most MLAA algorithms I've seen (and considered).

AlNom · Dec 7, 2009

PeterT said:
Are there any <= 1-pixel wide structures in the shots from the game? There are some quite distinctive artifacts for that in most MLAA algorithms I've seen (and considered).

You mean like in the zoomed image I posted earlier in the thread?

homerdog · Dec 7, 2009

MLAA looks cool, but wouldn't it miss out on small or high frequency stuff?

Could it be used in conjunction with MSAA?
