Not when compared to 1xAA, which is the traditional framebuffer; compared to the different MSAA modes, of course. Doesn't that free up more VRAM for the GPU?
It's actually unimportant to the AA method. The clever bit is smoothing the steps. Finding edges can be done any number of different ways, on any suitable source data.
You don't need vertex data, but it could prove handy in selecting edges.
If early titles had SPU cycles to spare, I wonder if the PS3 firmware could be updated to provide an MLAA post-process mode?
Sony seems to be more interested in adding 3D support to their existing library. It has far greater marketing value - "now with MLAA" would not sell more 3D TVs (and the price difference is quite atrocious, by the way 8-O). So if there's any SPE resource left unused, it'd be far more likely for them to spend the coding effort on stereo...
Actually, Intel's MLAA does not care about the geometry edges at all; it just looks for certain patterns in the final image. Many times it's technically wrong about what it thinks the underlying edges are - but the difference between their solution and the Right Thing is practically unnoticeable, at least for now.
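The core of it is pretty simple - roughly something like this (just a sketch of the idea, with made-up names, luma metric and threshold, not Intel's actual code):

[code]
// Sketch of the image-space idea only -- not Intel's code. The luma metric,
// the 0.1f threshold and the helper names are all made up for illustration.
#include <cmath>
#include <vector>

struct Frame {
    int w, h;
    std::vector<float> luma;                      // one luma value per pixel
    float at(int x, int y) const { return luma[y * w + x]; }
};

// A "separation line" exists between (x,y) and the pixel below it when the
// final image shows a strong enough discontinuity there - no geometry needed.
// (Caller is assumed to keep y < h-1.)
bool edgeBelow(const Frame& f, int x, int y, float threshold = 0.1f) {
    return std::fabs(f.at(x, y) - f.at(x, y + 1)) > threshold;
}

// Walk right along that separation line to measure the step. MLAA then
// classifies the run as part of a Z/U/L shape and blends each pixel by the
// area a straight "reconstructed" edge would cover - its guess at the
// underlying edge, which is sometimes wrong but usually close enough.
int stepLength(const Frame& f, int x, int y) {
    int len = 1;
    while (x + len < f.w && edgeBelow(f, x + len, y)) ++len;
    return len;
}
[/code]

Everything it knows about "edges" comes from those colour differences in the finished frame; geometry never enters into it.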
Yeah, that might be a good extension, although I'm not sure if any implementation is actually using it. GOW3's approach is not covered publicly in enough detail yet (probably to maintain competitive advantage).
However, I'd say it's more likely that they just introduced more patterns and a more efficient recognition scheme instead of working with edges. SPEs can't hold much data anyway; I think that increasing the tile size from 8x8 would be a better use of the limited local memory.
It might be a benefit and as such they might add it to titles that are converted to support 3D. But only to those; at least I don't see them going back to change other games to make them look better only for its own sake. There's no marketing value in MLAA on its own.
I thought that's the beauty of MLAA? You don't want to burden it with geometry edge checking unnecessarily. It's screen-space AA, right?
You want the tile size to be larger not because of the "limited local memory". It reduces the communication and access overhead; plus the SPU is fast enough to churn through the data while waiting for the next DMA.
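The usual shape of it, just as an illustration (plain C++ with memcpy standing in for the async DMA get, tile size picked arbitrarily):

[code]
// Double-buffering sketch -- the memcpy stands in for an async MFC DMA get,
// and the comments mark where the real tag-wait would go. Tile size and the
// "frame as a flat float array" layout are assumptions for illustration.
#include <cstring>
#include <vector>

constexpr int TILE_PIXELS = 32 * 32;               // assumed tile size

void processTile(float* tile, int n) { /* MLAA pattern search + blend here */ (void)tile; (void)n; }

void filterFrame(const std::vector<float>& frame) {
    if (frame.size() < TILE_PIXELS) return;        // assume whole tiles only
    float local[2][TILE_PIXELS];                   // "local store": two tile buffers
    const int tiles = static_cast<int>(frame.size()) / TILE_PIXELS;

    // Prime buffer 0 (on a real SPU this would be a tagged DMA get).
    std::memcpy(local[0], frame.data(), sizeof(local[0]));

    for (int i = 0; i < tiles; ++i) {
        const int cur = i & 1, nxt = cur ^ 1;
        // Kick off the next transfer before touching the current tile,
        // so the transfer overlaps with the filtering work.
        if (i + 1 < tiles)
            std::memcpy(local[nxt], frame.data() + (i + 1) * TILE_PIXELS, sizeof(local[nxt]));
        // (real code: wait on the tag for 'cur' here)
        processTile(local[cur], TILE_PIXELS);
    }
}
[/code]

The point being that the transfer for tile i+1 is in flight while tile i is being filtered, so a bigger tile mostly means fewer round trips, not more waiting.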
Stereoscopic 3D integration is high risk (unknown or rather low demand initially).
Yeah, it's both a strength and a weakness. We'll have to wait and see how it works with the upcoming next-gen content; heavy tessellation and displacement could seriously change its quality.
I'm not arguing with that; all I'm saying is that adding geometrical edge data would only take away from this efficiency by increasing the memory requirements and reducing the tile size, and we don't know whether it would even benefit quality.
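Just to put rough numbers on it (every figure here is an assumption - 256KB of local store, RGBA8 colour, double buffering, and a hypothetical 4 bytes of edge data per pixel):

[code]
// Back-of-the-envelope only; every number here is an assumption.
constexpr int kLocalStore  = 256 * 1024;               // SPE local store
constexpr int kTileSide    = 64;                       // a fatter tile than 8x8
constexpr int kTilePixels  = kTileSide * kTileSide;
constexpr int kColourBytes = kTilePixels * 4 * 2;      // RGBA8, double buffered = 32KB
constexpr int kEdgeBytes   = kTilePixels * 4 * 2;      // hypothetical edge IDs = another 32KB
constexpr int kLeftColour  = kLocalStore - kColourBytes;              // ~224KB left for code/stack/tables
constexpr int kLeftBoth    = kLocalStore - kColourBytes - kEdgeBytes; // ~192KB, or shrink the tile again
[/code]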
We've been through this. 3D is a top priority because of Sony's investment in 3D TVs. They need to build market demand for them by creating more content; patching old games for stereo is a relatively cheap way.
There is no perceivable demand for MLAA, and patching old games would not help them sell 3D TVs. So there's no reason to invest in it.
No one has been talking about adding MLAA to new games. The original point was that if there's SPE processing time left, it could be done - but it does not make sense for Sony to do so. But adding 3D support where possible will help with their 3D TV business.
MLAA implementation on GPU.
http://igm.univ-mlv.fr/~biri/mlaa-gpu/MLAAGPU.pdf
"Our implementation adds a total cost of 34ms (0.49ms) to the rendering
at resolution 1248x1024 on a NVidia Geforce 8600 GT (295
GTX)."
Don't get that part...
Oh, thanks. So the figures in brackets are the GTX 295 numbers - quite a difference between those two; on the 8600 GT it's totally not worth it.
Note that it does not include the costly GPU/CPU/GPU transfers in case of real time rendering. We provide an open source OpenGL/GLSL implementation of our method at http://igm.univ-mlv.fr/~biri/mlaa-gpu/.
Further works will consist in handling artifacts introduced by filter approaches in animation using techniques such as temporal coherence and auto determination of the discontinuity factor.
As expected...
Someone should try to implement this in a real time rendering framework on PC.
EDIT: Why does it need a pre-computed/pre-determined discontinuity factor?
Our implementation adds a total cost of 34ms (0.49ms) to the rendering at resolution 1248x1024 on a NVidia Geforce 8600 GT (295 GTX). The GPU version tends to scale very well since the cost at 1600x1200 resolution is only 67.5ms (0.54ms) which represents a cost 98% (11%) higher for 144% more pixels. We can compare our results to a standard CPU implementation which runs in 67ms at 1024x768 and in 128ms at 1600x1200 on a Core2Duo 2.20GHz. Note that it does not include the costly GPU/CPU/GPU transfers in case of real time rendering. We provide an open source OpenGL/GLSL implementation of our method at http://igm.univ-mlv.fr/~biri/mlaa-gpu/.
Do you know what they meant by auto-determination of discontinuity factor for future work ? Is this related to the pre-computed area table texture mentioned in step 2 ? If the algorithm is general, the SPU implementation may gain from it too.
EDIT: Interesting that the Intel "unoptimized" algorithm had the [strike]single[/strike]Quad-core 3.0GHz CPU complete 720p (?) in about 5ms, while this algorithm requires 67ms for 1024x768 on a Core2Duo 2.20GHz.
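To be concrete about what I mean: I read the "discontinuity factor" as simply the threshold that decides which pixel differences count as edges (my interpretation, not the paper's code, and the value is invented):

[code]
// My interpretation of the "discontinuity factor": a fixed threshold on the
// difference between neighbouring pixels. The 0.08f default is invented.
#include <cmath>

bool isDiscontinuity(float lumA, float lumB, float factor = 0.08f) {
    return std::fabs(lumA - lumB) > factor;
}
[/code]

If that's right, "auto-determination" would presumably mean deriving that threshold from the image itself (contrast, histogram, whatever) instead of hand-tuning it per scene - and an SPU implementation could presumably pick its threshold the same way.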
Nice to see someone's got around to trying it! Performance on the 8600GT shows that Cell was a very nice fit for this method given the age of the tech, as no GPU of the period or for some time after would have been able to add MLAA in a game. The 295GTX works out a fair bit faster per transistor per cycle, versus the 4ms figure for GOW3 across 5 SPEs (105 million transistors). That said, there's a qualitative difference. The bunny shot shows a marked transition from the white pixel to the first interpolated value, much coarser than the ideal, and the MLAA examples, very noticeable close-up on the telegraph pole, are also coarse. Recalling Santa Monica said their initial efforts were similar to 2xMSAA, I guess we're seeing early results here, and it'll be interesting to see how things get refined and where the limits lie.

MLAA on GPU... oh my!