Not when compared to 1xAA, which is the traditional framebuffer; compared to the different MSAA modes, of course. Doesn't that free up more VRAM for the GPU?
It's actually unimportant to the AA method. The clever bit is smoothing the steps. Finding edges can be done any number of different ways, on any suitable source data.
You don't need vertex data, but it could prove handy in selecting edges.
If early titles had SPU cycles to spare, I wonder if the PS3 firmware could be updated to provide an MLAA post-process mode?
Sony seems to be more interested in adding 3D support to their existing library. It has far greater marketing value - "now with MLAA" would not sell more 3D TVs (and the price difference is quite atrocious, by the way 8-O). So if there's any SPE resource left unused, it'd be far more likely for them to spend the coding effort on stereo...
Actually, Intel's MLAA does not care about the geometry edges at all; it just looks for certain patterns in the final image. Many times it's technically wrong about what it thinks the underlying edges are - but the difference between their solution and the Right Thing is practically unnoticeable, at least for now.
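The core of it is pretty simple - roughly something like this (just a sketch of the idea, with made-up names, luma metric and threshold, not Intel's actual code):

[code]
// Sketch of the image-space idea only -- not Intel's code. The luma metric,
// the 0.1f threshold and the helper names are all made up for illustration.
#include <cmath>
#include <vector>

struct Frame {
    int w, h;
    std::vector<float> luma;                      // one luma value per pixel
    float at(int x, int y) const { return luma[y * w + x]; }
};

// A "separation line" exists between (x,y) and the pixel below it when the
// final image shows a strong enough discontinuity there - no geometry needed.
// (Caller is assumed to keep y < h-1.)
bool edgeBelow(const Frame& f, int x, int y, float threshold = 0.1f) {
    return std::fabs(f.at(x, y) - f.at(x, y + 1)) > threshold;
}

// Walk right along that separation line to measure the step. MLAA then
// classifies the run as part of a Z/U/L shape and blends each pixel by the
// area a straight "reconstructed" edge would cover - its guess at the
// underlying edge, which is sometimes wrong but usually close enough.
int stepLength(const Frame& f, int x, int y) {
    int len = 1;
    while (x + len < f.w && edgeBelow(f, x + len, y)) ++len;
    return len;
}
[/code]

Everything it knows about "edges" comes from those colour differences in the finished frame; geometry never enters into it.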
Yeah, that might be a good extension, although I'm not sure if any implementation is actually using it. GOW3's approach is not covered publicly in enough detail yet (probably to maintain competitive advantage).
However, I'd say it's more likely that they just introduced more patterns and a more efficient recognition scheme instead of working with edges. SPEs can't hold much data anyway; I think that increasing the tile size from 8x8 would be a better use of the limited local memory.
It might be a benefit and as such they might add it to titles that are converted to support 3D. But only to those; at least I don't see them going back to change other games to make them look better only for its own sake. There's no marketing value in MLAA on its own.
I thought that's the beauty of MLAA? You don't want to burden it with geometry edge checking unnecessarily. It's screen-space AA, right?
You want the tile size to be larger not because of the "limited local memory". It reduces the communication and access overhead; plus the SPU is fast enough to churn through the data while waiting for the next DMA.
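The usual shape of it, just as an illustration (plain C++ with memcpy standing in for the async DMA get, tile size picked arbitrarily):

[code]
// Double-buffering sketch -- the memcpy stands in for an async MFC DMA get,
// and the comments mark where the real tag-wait would go. Tile size and the
// "frame as a flat float array" layout are assumptions for illustration.
#include <cstring>
#include <vector>

constexpr int TILE_PIXELS = 32 * 32;               // assumed tile size

void processTile(float* tile, int n) { /* MLAA pattern search + blend here */ (void)tile; (void)n; }

void filterFrame(const std::vector<float>& frame) {
    if (frame.size() < TILE_PIXELS) return;        // assume whole tiles only
    float local[2][TILE_PIXELS];                   // "local store": two tile buffers
    const int tiles = static_cast<int>(frame.size()) / TILE_PIXELS;

    // Prime buffer 0 (on a real SPU this would be a tagged DMA get).
    std::memcpy(local[0], frame.data(), sizeof(local[0]));

    for (int i = 0; i < tiles; ++i) {
        const int cur = i & 1, nxt = cur ^ 1;
        // Kick off the next transfer before touching the current tile,
        // so the transfer overlaps with the filtering work.
        if (i + 1 < tiles)
            std::memcpy(local[nxt], frame.data() + (i + 1) * TILE_PIXELS, sizeof(local[nxt]));
        // (real code: wait on the tag for 'cur' here)
        processTile(local[cur], TILE_PIXELS);
    }
}
[/code]

The point being that the transfer for tile i+1 is in flight while tile i is being filtered, so a bigger tile mostly means fewer round trips, not more waiting.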
Stereoscopic 3D integration is high risk (unknown or rather low demand initially).
Yeah, it's both a strength and a weakness. We'll have to wait and see how it works with the upcoming next-gen content; heavy tessellation and displacement could seriously change its quality.
I'm not arguing with that; all I'm saying is that adding geometrical edge data would only take away from this efficiency by increasing the memory requirements and reducing the tile size, and we don't know whether it would even benefit quality.
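Just to put rough numbers on it (every figure here is an assumption - 256KB of local store, RGBA8 colour, double buffering, and a hypothetical 4 bytes of edge data per pixel):

[code]
// Back-of-the-envelope only; every number here is an assumption.
constexpr int kLocalStore  = 256 * 1024;               // SPE local store
constexpr int kTileSide    = 64;                       // a fatter tile than 8x8
constexpr int kTilePixels  = kTileSide * kTileSide;
constexpr int kColourBytes = kTilePixels * 4 * 2;      // RGBA8, double buffered = 32KB
constexpr int kEdgeBytes   = kTilePixels * 4 * 2;      // hypothetical edge IDs = another 32KB
constexpr int kLeftColour  = kLocalStore - kColourBytes;              // ~224KB left for code/stack/tables
constexpr int kLeftBoth    = kLocalStore - kColourBytes - kEdgeBytes; // ~192KB, or shrink the tile again
[/code]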
We've been through this. 3D is a top priority because of Sony's investment in 3D TVs. They need to build market demand for them by creating more content; patching old games for stereo is a relatively cheap way.
There is no perceivable demand for MLAA, and patching old games would not help them sell 3D TVs. So there's no reason to invest in it.
No one has been talking about adding MLAA to new games. The original point was that if there's SPE processing time left, it could be done - but it does not make sense for Sony to do so. But adding 3D support where possible will help with their 3D TV business.
MLAA implementation on GPU.
http://igm.univ-mlv.fr/~biri/mlaa-gpu/MLAAGPU.pdf
"Our implementation adds a total cost of 34ms (0.49ms) to the rendering
at resolution 1248x1024 on a NVidia Geforce 8600 GT (295
GTX)."
Don't get that part...
Oh, thanks. So the figures in brackets are the GTX 295 numbers - quite a difference between those two; on the 8600 GT it's totally not worth it.
Note that it does not include the costly GPU/CPU/GPU transfers in case of real time rendering. We provide an open source OpenGL/GLSL implementation of our method at http://igm.univ-mlv.fr/~biri/mlaa-gpu/.
Further works will consist in handling artifacts introduced by filter approaches in animation using techniques such as temporal coherence and auto determination of the discontinuity factor.
As expected...
Someone should try to implement this in a real time rendering framework on PC.
EDIT: Why does it need a pre-computed/pre-determined discontinuity factor?
Our implementation adds a total cost of 34ms (0.49ms) to the rendering at resolution 1248x1024 on a NVidia Geforce 8600 GT (295 GTX). The GPU version tends to scale very well since the cost at 1600x1200 resolution is only 67.5ms (0.54ms) which represents a cost 98% (11%) higher for 144% more pixels. We can compare our results to a standard CPU implementation which runs in 67ms at 1024x768 and in 128ms at 1600x1200 on a Core2Duo 2.20GHz. Note that it does not include the costly GPU/CPU/GPU transfers in case of real time rendering. We provide an open source OpenGL/GLSL implementation of our method at http://igm.univ-mlv.fr/~biri/mlaa-gpu/.
Do you know what they meant by auto-determination of discontinuity factor for future work ? Is this related to the pre-computed area table texture mentioned in step 2 ? If the algorithm is general, the SPU implementation may gain from it too.
EDIT: Interesting that the Intel "unoptimized" algorithm had the [strike]single[/strike]Quad-core 3.0GHz CPU complete 720p (?) in about 5ms, while this algorithm requires 67ms for 1024x768 on a Core2Duo 2.20GHz.
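To be concrete about what I mean: I read the "discontinuity factor" as simply the threshold that decides which pixel differences count as edges (my interpretation, not the paper's code, and the value is invented):

[code]
// My interpretation of the "discontinuity factor": a fixed threshold on the
// difference between neighbouring pixels. The 0.08f default is invented.
#include <cmath>

bool isDiscontinuity(float lumA, float lumB, float factor = 0.08f) {
    return std::fabs(lumA - lumB) > factor;
}
[/code]

If that's right, "auto-determination" would presumably mean deriving that threshold from the image itself (contrast, histogram, whatever) instead of hand-tuning it per scene - and an SPU implementation could presumably pick its threshold the same way.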
Nice to see someone's got around to trying it! Performance on the 8600GT shows that Cell was a very nice fit for this method given the age of the tech, as no GPU of the period or for some time after would have been able to add MLAA in a game. The 295GTX works out a fair bit faster per transistor per cycle, versus the 4ms figure for GOW3 across 5 SPEs (105 million transistors). That said, there's a qualitative difference. The bunny shot shows a marked transition from the white pixel to the first interpolated value, much coarser than the ideal, and the MLAA examples, very noticeable close-up on the telegraph pole, are also coarse. Recalling Santa Monica said their initial efforts were similar to 2xMSAA, I guess we're seeing early results here, and it'll be interesting to see how things get refined and where the limits lie.

MLAA on GPU... oh my!