Lucid_Dreamer
Wow. You sure like to drop names, don't you? Imagine what those "greats" could have done with a decent architecture? It must make you really sad that Cell is dead.

That's what names are for (to be used). "Decent architecture" like "Xenon"? And I'm fine with it: they have reworked some of the Cell architectural elements in Intel processors, and GPUs are even picking up the slack now.
The interesting thing will be seeing who puts out the most technically impressive games next gen. My guess is that it will be the same people excelling this gen. Time will tell. Then we can hear the same old arguments about bad architecture for that gen: "If this was made a little bit easier, we could have done a lot more with it."
Uhhh,
1. Intel invented MLAA, not Sony.
2. Sony is unlikely to give out their version of MLAA to non-Sony devs.
3. Even if they did give it out, it would be programmed to run on the Cell, not on what everyone else uses for post-process AA (the GPU).
4. Most devs seem to like FXAA more.
5. A few 360 games do use MLAA.
1. I said "Sony's MLAA" because their version goes beyond the original paper written by Intel. Read the "making of God of War III" article from Eurogamer. And to think the initial port was 120 ms. I guess making use of the Cell architecture couldn't have helped reduce that time to 20 ms, additions included, right? That's the version I'm talking about.
4. I wonder why? It's universally applicable.
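To illustrate what I mean by universally applicable: a filter like FXAA is just one full-screen pass over the finished image, with every output pixel computed independently from a small neighbourhood, so it runs as an ordinary shader on any GPU. Here's a rough C++ caricature of that shape (this is not the actual FXAA algorithm, and all the names are made up for illustration):

```cpp
// Caricature of a single-pass, screen-space AA filter: one sweep over the
// final colour buffer, each output pixel computed independently. NOT the
// real FXAA algorithm, just the general shape of such a filter.
#include <algorithm>
#include <cstddef>
#include <vector>

struct Color { float r, g, b; };

inline float luma(const Color& c) { return 0.299f * c.r + 0.587f * c.g + 0.114f * c.b; }

std::vector<Color> post_aa_pass(const std::vector<Color>& src, int w, int h) {
    std::vector<Color> dst(src);
    auto px = [&](int x, int y) -> const Color& {
        x = std::clamp(x, 0, w - 1);
        y = std::clamp(y, 0, h - 1);
        return src[std::size_t(y) * w + x];
    };
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            // Contrast against the 4-neighbourhood decides whether to blend at all.
            float l  = luma(px(x, y));
            float lN = luma(px(x, y - 1)), lS = luma(px(x, y + 1));
            float lW = luma(px(x - 1, y)), lE = luma(px(x + 1, y));
            float contrast = std::max({lN, lS, lW, lE, l}) - std::min({lN, lS, lW, lE, l});
            if (contrast < 0.1f) continue;          // no visible edge: keep the pixel
            const Color& a = px(x, y);
            const Color& n = px(x, y - 1);          // crude blend toward one neighbour
            dst[std::size_t(y) * w + x] = { 0.5f * (a.r + n.r),
                                            0.5f * (a.g + n.g),
                                            0.5f * (a.b + n.b) };
        }
    }
    return dst;
}
```

Because nothing here depends on the order in which pixels are visited, the whole thing maps one-to-one onto a fragment shader, which is exactly why it travels across platforms so easily.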
So every GPU thread processes a single pixel in this approach. In the MLAA algorithm however, pixels are not independent, but have a rather strict order in which they need to be processed. In other words, MLAA is not embarrassingly parallel and thus hard to implement on a GPU. Edge detection is not the issue.
http://forum.beyond3d.com/showpost.php?p=1433406&postcount=443
I believe that's without even taking the "beyond" part of Sony's MLAA implementation into account.
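To make the ordering problem in that quote concrete, here's a minimal C++ sketch of the first two MLAA stages (the names and structure are mine, purely for illustration; this is not Intel's or Sony's code). The edge-detection loop only reads a pixel's immediate neighbours, so it parallelizes trivially. The span-finding loop, in the classic CPU/SPU formulation, carries state from pixel to pixel along each scanline, and that is the ordering dependence the post is talking about:

```cpp
// Minimal sketch of the first two MLAA stages, assuming a single-channel
// luma buffer of size width*height. All names are illustrative.
#include <cmath>
#include <cstddef>
#include <vector>

struct Image {
    int width = 0, height = 0;
    std::vector<float> luma;                       // grayscale input
    float at(int x, int y) const { return luma[std::size_t(y) * width + x]; }
};

// Stage 1: edge detection. Each pixel only reads its neighbours, so this
// loop is embarrassingly parallel -- one GPU thread per pixel works fine.
std::vector<bool> detect_horizontal_edges(const Image& img, float threshold) {
    std::vector<bool> edge(std::size_t(img.width) * img.height, false);
    for (int y = 0; y + 1 < img.height; ++y)
        for (int x = 0; x < img.width; ++x)
            edge[std::size_t(y) * img.width + x] =
                std::fabs(img.at(x, y) - img.at(x, y + 1)) > threshold;
    return edge;
}

// Stage 2: find edge spans. Here the trouble starts: a pixel's blend weight
// depends on where its edge run begins and ends, so this formulation walks
// each scanline left to right and carries state (the current run) from
// pixel to pixel. Consecutive pixels depend on each other, which is why
// this part does not map cleanly onto "one independent thread per pixel".
struct Span { int y, x_begin, x_end; };            // half-open: [x_begin, x_end)

std::vector<Span> find_spans(const std::vector<bool>& edge, int width, int height) {
    std::vector<Span> spans;
    for (int y = 0; y < height; ++y) {
        int run_start = -1;                        // state carried across pixels
        for (int x = 0; x < width; ++x) {
            bool e = edge[std::size_t(y) * width + x];
            if (e && run_start < 0) run_start = x;
            if (!e && run_start >= 0) { spans.push_back({y, run_start, x}); run_start = -1; }
        }
        if (run_start >= 0) spans.push_back({y, run_start, width});
    }
    return spans;
}

// Stage 3 (not shown): for each span, compute per-pixel coverage from the
// span length and blend neighbouring rows accordingly.
```

GPU ports of MLAA typically work around this by having each pixel search outward for its own span endpoints instead, trading redundant reads for thread independence.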
EDIT: I wanted to add some more information about Sony's MLAA implementation.
http://forum.beyond3d.com/showpost.php?p=1435976&postcount=248
It was extremely expensive at first. The first not-so-naive SPU version, which was considered decent, was taking more than 120 ms, at which point we had decided to pass on the technique. It quickly went down to 80 and then 60 ms, when some kind of bottleneck was reached. Our worst scene remained at 60 ms for a very long time, but simpler scenes got cheaper and cheaper. Finally, after many breakthroughs and long hours from our technology teams, especially our technology team in Europe, we shipped with the cheapest scenes around 7 ms, the average GoW3 scene at 12 ms, and the most expensive scene at 20 ms.
In terms of quality, the latest version is also significantly better than the initial 120+ ms version. It started with a quality way lower than your typical MSAA2x on more than half of the screen; it was equivalent on a good 25% and was already nicer on the rest. At that point we were only after speed. There could be a long post mortem, but it wasn't immediately obvious that it would save us a lot of RSX time, if any, so it would have been a no-go if it hadn't been optimized on the SPU.

When it was clear that we were getting a nice RSX boost (2 to 3 ms at first, 6 or 7 ms in the shipped version), we actually focused on evaluating whether it was a valid option visually. Despite any great performance gain, the team couldn't compromise on quality; there was a pretty high bar to reach to even consider the option. And, as with the speed, the improvements on the quality front were dramatic. A few months before shipping, we finally reached a quality similar to MSAA2x on almost the entire screen, and a few weeks later all the pixelated edges disappeared and the quality became significantly higher than MSAA2x or even MSAA4x on all our still shots, without any exception. In motion it became globally better too; a few minor issues remained which just can't be solved without sub-pixel sampling.
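To put those numbers in frame-budget terms, here is a quick back-of-the-envelope calculation. The 30 fps (about 33 ms) frame budget is my own assumption for illustration; the millisecond figures are the ones quoted above:

```cpp
#include <cstdio>

int main() {
    // Numbers taken from the quote above; the 30 fps frame budget is an
    // assumption for illustration only.
    const double frame_ms      = 1000.0 / 30.0;  // ~33.3 ms per frame (assumed)
    const double mlaa_avg_ms   = 12.0;           // average SPU cost (quoted)
    const double mlaa_worst_ms = 20.0;           // worst-case SPU cost (quoted)
    const double rsx_saved_ms  = 7.0;            // RSX boost in the shipped version (quoted: 6 or 7 ms)

    std::printf("SPU cost: %.0f-%.0f ms, overlapped with GPU work\n",
                mlaa_avg_ms, mlaa_worst_ms);
    std::printf("RSX time freed: %.0f ms, i.e. %.0f%% of a %.1f ms frame\n",
                rsx_saved_ms, 100.0 * rsx_saved_ms / frame_ms, frame_ms);
    return 0;
}
```

Read that way, the 12 to 20 ms of SPU time overlaps with GPU work, so what actually shows up in the frame time is the 6 to 7 ms of RSX time the technique hands back.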