How can AF be implemented effectively on consoles/RSX?

Shifty Geezer

The subject of AF comes up every now and again, especially as regards its general absence from XB360 screenshots, as they're the most prolific source of screenshots at the mo'. It's said of Red Steel for Wii that it'll have 8xAF. It's also been suggested by some that PS3 is showing AF where XB360 isn't. Now I am not saying this is the case (no XB360 vs. PS3 rubbish thanks), but, if so, whether on PS3 or Wii, what hardware trickery can be used to get over the massive texture BW demands of AF? Are we looking at large texture caches on GPU perhaps? Hasn't as much been suggested of RSX? Is the BW the major limiting factor, and are there any other special devices that could be used to add this IQ enhancement?
 
The texture BW demands of AF (per clock, which is the only metric that matters for performance) are minimal to nonexistent compared to normal straight-on viewing of a texture. The only way you could say AF costs BW is by noting that blurry textures use less bandwidth, but that's true whether you're looking at an angled surface or not. AF costs clock cycles.

The answer to minimize impact is to use AF selectively on the textures. Use it for the base colour texture, but not for normal maps, reflection maps, etc. If there is another texture for colour details, use it there too.
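As a rough illustration of the selective approach, here is a minimal Python sketch of a per-role AF table. The role names, the 8x level and the helper functions are all hypothetical and not taken from any console SDK; the only part that comes from the post is which kinds of texture get AF and which don't.

```python
# Hypothetical texture roles mapped to a maximum anisotropy level.
# 1 means "no AF" (plain bilinear/trilinear filtering).
# The 8x value is an assumption chosen for illustration only.
MAX_ANISO_BY_ROLE = {
    "base_colour": 8,     # AF on: this is where blur is most visible
    "colour_detail": 8,   # AF on: secondary colour detail texture
    "normal_map": 1,      # AF off
    "reflection_map": 1,  # AF off
}


def max_anisotropy(role, default=1):
    """Return the AF level for a texture role; unknown roles get no AF."""
    return MAX_ANISO_BY_ROLE.get(role, default)


def sampler_settings(material):
    """Map each texture slot of a material (slot -> role) to filter settings."""
    return {
        slot: {"max_anisotropy": max_anisotropy(role)}
        for slot, role in material.items()
    }


# Example material: slot names are made up for the example.
example = sampler_settings({"tex0": "base_colour", "tex1": "normal_map"})
```

The point of keeping the table per role rather than per game is that the extra sampling cycles are only spent where the eye actually notices the difference.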

There's no performance problem here. AF ran acceptably when universally applied at high resolution on graphics cards from 2002, and I used it at 1280x1024 on my Radeon 64 from 2000!

My guess is devs will eventually start to care if we keep complaining about it. :p
 
Mintmaster said:
Texture BW demands (per clock, which is the only metric that matters for performance) of AF are minimal to nonexistent compared to normal straight-on viewing of a texture.
Really?! I've always understood AF uses multiple texture reads per pixel. Citing our favourite spurious source (:D)
http://en.wikipedia.org/wiki/Anisotropic_filtering
True anisotropic filtering generates the anisotropic texture maps on the fly on a per-pixel basis. When the texture is sampled, several pixels (samples) of the texture around the center point are taken, but on a sample grid skewed according to view perspective. The texture is 'pre-perspective corrected', so to speak. A more distant part of the texture will contribute fewer samples, a closer part of it will contribute more samples. Each sample must also be trilinear filtered (or bilinear) which adds more sampling to the process. Sixteen trilinear anisotropic samples will then require 128 samples from the stored texture, as trilinear filtering needs to take four samples from each mip level and then anisotropic sampling (at 16-tap) needs to take sixteen trilinear filtered samples.
This makes anisotropic filtering extremely bandwidth intensive. Each sample is four bytes (32 bits) so each anisotropic pixel has required 64 bytes from texture memory. A display can easily contain over a million pixels, so the hit on texture memory can get very high (tens to hundreds of gigabytes per second) very quickly.
Now obviously something's going on that doesn't fit this idea, because GPUs don't have hundreds of gigabytes per second of texture bandwidth for use in AF, but they manage AF nonetheless.
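To put numbers on the quoted worst case, here is a small Python sketch of the arithmetic. It assumes no texture cache reuse at all, one texture layer, and a 1280x720 frame at 60 fps; the resolution and frame rate are assumptions for the example, not figures from the quote. Note that 128 texels at 4 bytes each works out to 512 bytes per pixel, not the 64 stated in the quoted passage.

```python
TAPS = 16                 # anisotropic taps per pixel (16-tap AF, worst case)
TEXELS_PER_TRILINEAR = 8  # 4 texels from each of 2 mip levels
BYTES_PER_TEXEL = 4       # 32-bit texels


def worst_case_bytes_per_pixel(taps=TAPS):
    """Bytes read per pixel if every texel came straight from memory."""
    return taps * TEXELS_PER_TRILINEAR * BYTES_PER_TEXEL


def worst_case_gb_per_second(pixels, fps, taps=TAPS):
    """Uncached texture traffic in GB/s for one fully anisotropic layer."""
    return worst_case_bytes_per_pixel(taps) * pixels * fps / 1e9


# Assumed 1280x720 at 60 fps, one texture layer, zero cache hits.
print(worst_case_bytes_per_pixel())              # 512
print(worst_case_gb_per_second(1280 * 720, 60))  # roughly 28.3
```

That figure is only reached if every texel is fetched from memory every time; in practice neighbouring pixels share most of their texels through the texture cache, which is one reason real GPUs manage AF without anything like this bandwidth.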
 
It might be just me but on PC, even 8x AF has been basically "free" for me for years, at least on my 5900U and now on my 6200. That's why I really don't see the reason for this huge issue with AF on the 360.
 
Shifty Geezer said:
Really?! I've always understood AF uses multiple texture reads per pixel. Citing our favourite spurious source (:D)
Note that I said per clock. If you take multiple clock cycles to do the math of AF (i.e. issuing the multiple texture reads and also adding them together), you can read more data during that time too. For the most part, an angled surface with AF actually uses a bit less BW per clock than a surface viewed head on. The cost is clock cycles.

The distinction may seem irrelevant at first because increased BW needs eventually increases cycle count anyway, but in discussing the solution it does matter. Reducing bandwidth consumption by other means, e.g. by using NT's NAO32 format instead of FP16, won't help with AF performance.
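The per-clock versus per-pixel distinction can be sketched with a toy model in Python. It assumes a texture unit that issues one bilinear fetch (4 texels) per clock and ignores the cache and trilinear filtering entirely; those are simplifying assumptions for illustration, not a description of Xenos or RSX.

```python
BYTES_PER_TEXEL = 4
TEXELS_PER_FETCH = 4  # one bilinear fetch per clock (assumed)


def af_cost(taps):
    """Return (clocks, bytes_per_pixel, bytes_per_clock) for one pixel.

    Each AF tap is modelled as one bilinear fetch taking one clock,
    with every texel read from memory (no cache).
    """
    clocks = taps
    bytes_per_pixel = taps * TEXELS_PER_FETCH * BYTES_PER_TEXEL
    return clocks, bytes_per_pixel, bytes_per_pixel / clocks


for taps in (1, 2, 4, 8, 16):
    print(taps, af_cost(taps))
```

In this model the bytes per pixel and the clocks per pixel both grow with the tap count, while the bytes per clock stay flat. That is the sense in which the cost shows up as cycles rather than as a higher bandwidth requirement; the model does not capture the claim that AF uses slightly less bandwidth per clock.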
 
Mintmaster said:
Note that I said per clock. If you take multiple clock cycles to do the math of AF (i.e. issuing the multiple texture reads and also adding them together), you can read more data during that time too. For the most part, an angled surface with AF actually uses a bit less BW per clock than a surface viewed head on. The cost is clock cycles.

The distinction may seem irrelevant at first because increased BW needs eventually increases cycle count anyway, but in discussing the solution it does matter. Reducing bandwidth consumption by other means, e.g. by using NT's NAO32 format instead of FP16, won't help with AF performance.

Minty, you're arguing semantics here. The fact is an AF-sampled fragment takes more texels to produce than a properly mipmapped isotropic fragment (i.e. the isotropic case always takes the same number of samples, but may undersample if not properly mipmapped).
 
Mintmaster said:
The cost is clock cycles.


So in your opinion the only reason we haven't seen af widely used on 360 is nobody's complained yet?

I'm not saying you're wrong but in the madden 07 thread there was a comparison brought up between the ps3 and 360 version and while the ps3 version was behind the 360 version development wise, it still had af.

I know there's a logical explanation for why we haven't seen af widely used on 360 yet but it just doesn't seem right that the reason is that not enough people complained. Is it maybe they were all under tight deadlines and af was a very low priority, hence dropped from everything we've seen so far and devs feel they've got away with it up to this point and not too many complained so why bother? Or is it really a pain in the Axx to implement on 360 for some reason or another?

I remember early in the ps2 lifecycle there were huge aliasing issues in almost all their early stuff and that was ironed out rather quickly. I guess the question on everyone's mind is, do you see the same happening with 360 (maybe ps3) this gen, so that the issue will go away and af will soon become commonplace, or are we stuck with it due to some other limitation?
 
TheChefO said:
So in your opinion the only reason we haven't seen af widely used on 360 is nobody's complained yet?

I'm not saying you're wrong but in the madden 07 thread there was a comparison brought up between the ps3 and 360 version and while the ps3 version was behind the 360 version development wise, it still had af.

I know there's a logical explanation for why we haven't seen af widely used on 360 yet but it just doesn't seem right that the reason is that not enough people complained. Is it maybe they were all under tight deadlines and af was a very low priority, hence dropped from everything we've seen so far and devs feel they've got away with it up to this point and not too many complained so why bother? Or is it really a pain in the Axx to implement on 360 for some reason or another?

I remember early in the ps2 lifecycle there were huge aliasing issues in almost all their early stuff and that was ironed out rather quickly. I guess the question on everyone's mind is, do you see the same happening with 360 (maybe ps3) this gen, so that the issue will go away and af will soon become commonplace, or are we stuck with it due to some other limitation?
ATI would have to have really screwed up to make a hardware problem with AF in Xenos, and I really doubt it. They were the first ones to implement AF in hardware (I think), and always made it fast also. It's just as easy to enable AF on 360 as other platforms, and in fact it's easier to make it selective since XB360 allows you to override texture filtering in the HLSL pixel shader itself.

If you read my post about Madden, you'll see that it's just misinterpreted "evidence" of no AF. Just look at this pic. The horizontal lines look exactly the same with or without AF. You can also see how the dark part of the runway keeps getting narrower in the distance with AF, but not without. The hashmarks in the Madden shot also keep getting narrower, suggesting AF.

I'm not sure why it isn't there in current games, but AFAICS it's simply a matter of not being a priority. Whether that's due to laziness, tight timelines, misplaced image quality goals, lack of complaining, ignorance, or whatever, I have no idea.
 
Mintmaster said:
I'm not sure why it isn't there in current games, but AFAICS it's simply a matter of not being a priority. Whether that's due to laziness, tight timelines, misplaced image quality goals, lack of complaining, ignorance, or whatever, I have no idea.


Thanks Mint -
So then, the first multiplat games that come out for both systems later this year should let us know where we stand on af. I imagine if af is there in the ps3 version and not in the 360 version, then in direct comparisons even someone who doesn't know what they're looking for would say that the ps3 version is "sharper", which feeds the notion that ps3 is more powerful.

So basically: "MS, if you don't want to feed the notion that you have a weaker system than ps3, get your developers on board with AF NOW, before it's too late to implement it in your fall releases!"

Priority may not be AF per se, but it should certainly be a priority to make sure you don't look like the weaker system to the average Joe looking at Madden etc. at the best buy display.
 
darkblu said:
Minty, you're arguing semantics here. fact is an AF-sampled fragment takes more texels
Semantics or not, the point is that having more bandwidth will not reduce the performance hit of AF, and surprisingly it would probably increase the hit a bit. If you look at the original post of this thread, you'll see that this is not just arguing semantics, and is fundamental to answering each of Shifty's questions.

Now, it's true that with AF you need to hold more texels in the cache, but no GPU in the last 5 years has been deficient in that respect. More cache could help under certain pathological conditions, I suppose.
 
Mintmaster said:
Semantics or not, the point is that having more bandwidth will not reduce the performance hit of AF, and surprisingly it would probably increase the hit a bit.

Am I reading you correctly here that, all conditions being equal, greater bandwidth would increase the performance hit? How so?

If you look at the original post of this thread, you'll see that this is not just arguing semantics, and is fundamental to answering each of Shifty's questions.

ok, how about an aniso sampler that produces the sampling coords in a single clock and passes them to multiple isotropic units - how would that not be more bandwidth per fragment?

Keep in mind that with anisotropy you may easily get much less texel reuse than in a properly mipmapped isotropic case, where texel reuse between two adjacent fragments is 25-50% at non-magnification; count in the rest of the neighbours and you get ~100% reuse (poly edges notwithstanding), which is far from the case with aniso.
 
I get the feeling (I know, not very scientific) that most pixel shaders are still texture-bound. It's the old mindset, and it used to work really well. Now it's changing in hardware and the games are catching up. Until then, all the bandwidth in the world won't help because the pixel shaders are bottlenecked at running texture fetch instructions. Making some texture reads anisotropic would certainly not help that.

Those referencing AF performance on PC hardware must realize that such hardware is built to do more texturing per pixel than Xenos (and possibly other recent GPUs). Just reading in the color map, height map, normal map, specular map and shadow map would take, what, 9 cycles? That's a lot of math ops to fill if you have no particular plan for them. If it weren't for vertex shaders filling in some of the unused math ops that pixel shaders leave, I imagine Xenos would run quite slowly in some cases.
 
Shifty Geezer said:
How can AF be implemented effectively on consoles/RSX?
By developers with understanding of and respect for good image quality. ;)

Seriously, Monolith showed quality AF on the 360 at launch with Condemned. So while few other developers have used the feature, that is obviously just because they didn't design their games with quality texture filtering in mind.
 
kyleb said:
Seriously, Monolith showed quality AF on the 360 at launch with Condemned. So while few other developers have used the feature, that is obviously just because they didn't design their games with quality texture filtering in mind.

Maybe this is because Monolith is primarily a PC developer? Possible performance issues/conjectures aside, maybe there are some other factors here like (1) because PC consumers expect AF and Condemned is also a PC title, Monolith designed with it in mind; (2) PC developers understand the benefit to IQ and have been using it for years so Monolith took it for granted it would be used; (3) Monolith, as a PC developer, has been working in HD resolutions for many years and understood the IQ impact and necessity of implementing the feature early instead of letting it fall by the wayside at crunch time?

As you point out a number of 360 games do use AF, which makes this more confusing. If it was really broke you would not expect one of the nicer looking games, like Condemned, using it while other lesser titles have ignored it. Mintmaster has given a number of reasons in the past why it may not be used. One being that it has to be enabled per surface and not globally, yet you would hope developers would understand the benefit AF has for IQ and would make time to make it happen. The cynical side of me says something else has to be going on since PC users have had AF cheaply for years, yet some E3 PS3 games are making me rethink my cynicism. We obviously have not seen any finalized PS3 software, but a number of E3 games appeared to lack AF as well. Considering the performance hit, or lack thereof, on G71 and that it probably can be enabled globally it makes you scratch your head.

We are barely 6 months into "next-gen" so I am willing to wait a little while longer. But if come fall 2007 a lot of cross platform titles are demonstrating AF on the PS3 and not on the 360 it will be something to begin seriously looking into. Until then (or until a developer speaks up) we are kind of left wandering in the dark trying to figure out why some titles use it, and others do not.
 
Acert93 said:
Mintmaster has given a number of reasons in the past why it may not be used. One being that it has to be enabled per surface and not globally,
I'm guessing you misunderstood him there. Developers can certainly get more efficient results by applying AF selectively, but I highly doubt that the 360 prevents anyone from applying AF globally to their games; at least not for any other reason than the fact that doing so would almost certainly make the games run slower than they do without it.
 
Given that we are not seeing AF utilized in games, is it reasonable to assume that Xenos must use multiple passes to handle it? I would think MS would have included documentation stating that AF was nigh 'free' if that were the case. If it does indeed require multiple passes then wouldn't raw fill become an issue for Xenos?

This is one of the reasons why I have been thinking that 16 ROPs make sense for RSX: besides GS/BC issues, if they did have 16 ROPs they could offer 'free' AF to give them a fairly clear edge over the 360 in terms of visuals. Not saying that is going to happen, but I can see why they would go in that direction as it is a point where they could easily demonstrate their 'superior power' running side by side with the 360 (it is much easier to notice AF than AA on an HDTV).
 
Acert93 said:
Maybe this is because Monolith is primarily a PC developer? Possible performance issues/conjectures aside, maybe there are some other factors here like (1) because PC consumers expect AF and Condemned is also a PC title, Monolith designed with it in mind; (2) PC developers understand the benefit to IQ and have been using it for years so Monolith took it for granted it would be used; (3) Monolith, as a PC developer, has been working in HD resolutions for many years and understood the IQ impact and necessity of implementing the feature early instead of letting it fall by the wayside at crunch time?

As you point out a number of 360 games do use AF, which makes this more confusing. If it was really broke you would not expect one of the nicer looking games, like Condemned, using it while other lesser titles have ignored it. Mintmaster has given a number of reasons in the past why it may not be used. One being that it has to be enabled per surface and not globally, yet you would hope developers would understand the benefit AF has for IQ and would make time to make it happen. The cynical side of me says something else has to be going on since PC users have had AF cheaply for years, yet some E3 PS3 games are making me rethink my cynicism. We obviously have not seen any finalized PS3 software, but a number of E3 games appeared to lack AF as well. Considering the performance hit, or lack thereof, on G71 and that it probably can be enabled globally it makes you scratch your head.

We are barely 6 months into "next-gen" so I am willing to wait a little while longer. But if come fall 2007 a lot of cross platform titles are demonstrating AF on the PS3 and not on the 360 it will be something to begin seriously looking into. Until then (or until a developer speaks up) we are kind of left wandering in the dark trying to figure out why some titles use it, and others do not.


Yeah but COD2 I think has AF on the PC version while the 360 version doesn't. Or am I wrong?
 
BenSkywalker said:
Given that we are not seeing AF utilized in games, is it reasonable to assume that Xenos must use multiple passes to handle it? I would think MS would have included documentation stating that AF was nigh 'free' if that were the case. If it does indeed require multiple passes then wouldn't raw fill become an issue for Xenos?
AF is handled on Xenos exactly as it has been on PC graphics processors. Multi-passing for AF makes zero sense, especially as, if the texture is used for any kind of shader, the whole thing would need to be executed. No, all the texture samples are taken up front, over multiple cycles in the samplers, then fed into the shader when the fully sampled fragment is complete.

This is one of the reasons why I have been thinking that 16ROPs make sense for RSX- besides GS/BC issues if they did have 16ROPs they could offer 'free' AF to give them a fairly clear edge over the 360 in terms of visuals.
16 ROPs wouldn't affect texture quality at all. (Sony's docs also indicate 8 ROPs anyway)
 
I meant multiple cycles, not actual passes :eek:

I would assume with the relatively speaking low raw fill rate that stalling the pipes would have a rather nasty performance hit to go with it.

16 ROPs wouldn't affect texture quality at all. (Sony's docs also indicate 8 ROPs anyway)

If the chip needed extra cycles for AF then having additional ROPs could grant them 'free' AF under most circumstances. Also, am I mistaken, or is the entire assumption that RSX has only 8 ROPs based on one specific output format they have listed in a dev doc? Do we know if it is accurate or not? I'm not saying it isn't printed or anything like that, just that we have seen several mistakes of that sort made throughout the years.
 