Differences in AF quality between 6800 Ultra and 9800XT

demalion said:
In your 16x"ad"AF versus 4x"b"AF comparison, you don't see any place at all where, even just evaluating image quality, 16x"ad"AF is better than 4x"b"AF?
But this isn't an accurate description of "ad"AF versus "b"AF. What about performance? What about what might be allowed by transistor savings for adequate performance having an impact on functionality and image quality?
Well, it seems to me rather obvious you're ignoring some of the angles to make that assertion, because the lack in your comparison (of 4x to 16x angle dependent AF) would be even more obtrusive. Why don't you bother to recognize that, given your opinion on image quality?
Well, in an FPS, where the benefit manifests most is where it is most noticeable, wouldn't you agree? This is why I mentioned different game types. You also completely ignore how the "bigger difference" from 2x to 4x manifests in a game scene.
I'd like to put this all together, since it sums up three main aspects: performance, quality, and transistor count. 4x"b" vs 16x"ad" was just an example that fits for a "typical" game scene (well, typical for the games I play, but this whole discussion is subjective, because it includes quality), it doesn't always hold true.

* First, a "Doom" style scene. 90° angles only, only vertical and horizontal surfaces:
- Identical performance, identical quality for identical degrees of "ad" vs "b", because full AF is applied to all surfaces
- "ad" saves transistors

* Second, a flight sim scene. Mostly rather flat ground, but often tilted
- comparing identical degrees of "ad" vs "b": "ad" is faster when turning, but with very obvious quality drawbacks. Worst case performance is almost identical, however
- comparing identical performance is hardly possible, because "ad" performance changes too much with the angle
- "ad" saves transistors

* Third, a "rolling hills" game with many different angles
- comparing identical degrees of "ad" vs "b": "ad" is faster, but with obvious quality drawbacks
- comparing identical performance ("b" at lower level): "b" still reaches an overall better, more balanced image quality when doing the same work, because the distribution is more sensible.
- "ad" saves transistors

Most games today are a blend between 1 and 3, leaning towards 3. The only consistent advantage for "ad" is that it saves transistors. IMO performance is not an advantage of "ad", because "b" can reach better or at least identical quality for the same performance cost.

This isn't always 4x"b" vs 16x"ad". The numbers may vary depending on the predominant angles in a game. However, I firmly believe that with the same amount of work, "b" can always do at least as good as "ad" regarding quality, and visibly better most of the time.
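To make the scene dependence above a bit more concrete, here is a rough sketch. The cap curve in ad_degree() is entirely made up for illustration (full degree at 0/90 degree surface orientations, falling off towards 45 degrees); no actual chip necessarily uses this schedule, it just shows how the degree that actually gets applied depends on which angles a scene contains.

# Hypothetical angle-dependent cap, invented purely for illustration.
import math

def ad_degree(requested, angle_deg):
    """Applied AF degree under a made-up angle-dependent cap."""
    falloff = abs(math.sin(math.radians(2 * angle_deg)))  # 0 at 0/90 deg, 1 at 45 deg
    capped = max(requested * (1.0 - 0.875 * falloff), 2)  # worst case 2x
    return 2 ** round(math.log2(capped))                  # snap to 2, 4, 8, 16

scenes = {
    "Doom-style (0/90 deg only)": [0, 90, 0, 90],
    "flight sim (gentle tilts)":  [0, 10, 20, 10],
    "rolling hills (any angle)":  [5, 30, 45, 70],
}
for name, angles in scenes.items():
    applied = [ad_degree(16, a) for a in angles]
    print(f'{name}: 16x "ad" applies {applied}, 16x "b" applies {[16] * len(angles)}')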

This leaves transistor count to be discussed.

On the one hand, people can only notice the difference between 16x and 4x when comparing two different screenshots, and on the other the difference between 16x and 2x on some angles on different surfaces would jump out at them? OK...
...So the ideal tradeoff for you would be to limit to optimally implementing only up to 4x AF with high performance? What strikes me here is that it sounds like, since surfaces at 2x AF versus 4x AF would be a less obtrusive change, that 4x"ad"AF would be better than 16x, because reducing the glaring difference between 16xAF and 2xAF outweighs the benefit of 4x to 16x?
You aren't trying to spin my words, are you?
People do notice the difference between 4x and 16x AF. However, you notice the difference between two adjacent surfaces on one image much, much easier than the difference between something you see and some other image you have in your mind.
16x"ad" is better than 4x"ad". However, it is not necessarily better than 4x"b". Or, generally, Mx"ad" is not necessarily better than Nx"b", M>N. The numbers just don't tell the whole story.

If they cut down AF to put in other features, then I think they got their priorities wrong.
Even if those features were floating point texture filtering and/or work to speed it up?
Now, on to transistor count. There are lots of other features than the ones you mention. What about brilinear filtering, for example?

They managed to get 8x"b" AF into NV20 with its 57M transistors. Even into NV30 which was already huge. Now they produce a 220M+ chip and don't have the space for the decreasing relative cost of non angle-dependent AF? I don't buy it.
 
Belated due to DSL service outage:

Xmas said:
...
I'd like to put this all together, since it sums up three main aspects: performance, quality, and transistor count. 4x"b" vs 16x"ad" was just an example that fits for a "typical" game scene (well, typical for the games I play, but this whole discussion is subjective, because it includes quality), it doesn't always hold true.

Thank you for making progress in addressing my points.

* First, a "Doom" style scene. 90° angles only, only vertical and horizontal surfaces:
- Identical performance, identical quality for identical degrees of "ad" vs "b", because full AF is applied to all surfaces

Is performance identical at 90 degree angles? Why is the approach still offering significant performance impact reduction in "Doom" style scenes, when the scene is almost entirely made up of surfaces at those angles?

- "ad" saves transistors

Depending on your answer to my question, it saves performance impact too, right?

* Second, a flight sim scene. Mostly rather flat ground, but often tilted
- comparing identical degrees of "ad" vs "b": "ad" is faster when turning, but with very obvious quality drawbacks.

Only faster when turning?
As far as image quality, when the ground detail is comprised of a flat texture suitable for AF, I think the observation is valid, if subjective.

Worst case performance is almost identical, however
To summarize what I'm still wondering:
1) Where are you getting indication of wide performance variance when turning for "ad" AF? You're not just assuming that, right?
2) Performance savings would be more hidden by CPU limitations, so I'm not sure an "almost identical" performance impact means much, at least without having the results you have in mind to evaluate.
3) In any case, on a card like the 16 pipe NV40, it seems clear that regardless of (and because of) the above, "b" AF's performance impact should not be a negative, and the behavior of the scene should show it to its strongest advantage; in my estimation it is likely to be clearly superior for the scenario outlined above.
- comparing identical performance is hardly possible, because "ad" performance changes too much with the angle
How much is this change that is "too much"?
- "ad" saves transistors
Ok.
* Third, a "rolling hills" game with many different angles
- comparing identical degrees of "ad" vs "b": "ad" is faster, but with obvious quality drawbacks
You keep saying things like "obvious" to represent the drawbacks as unassailable, but at the same time dismiss the drawbacks of having lesser degree at other angles (again, assuming "b" AF isn't implemented with the same maximum degree, as perhaps a transistor cost might dictate).

They are both "obvious" where they occur, but they can both go unnoticed by the viewer.
No need for subjectively weighting the discussion by substituting a subjective evaluation as an excuse to ignore one or the other, especially if a "b" AF implementation has the same maximum degree and makes the issue moot by delivering the same maximum image quality (the problem is just that this is necessary in order to make the issue of "b" AF image quality disadvantages actually moot outside of subjectivity).
- comparing identical performance ("b" at lower level): "b" still reaches an overall better, more balanced image quality when doing the same work, because the distribution is more sensible.
Why do you keep intermixing a subjective evaluation like this? This is nothing more than repeating that 4x "b" has better image quality than 16x "ad" and hiding behind a lack of specified numbers to propose it as somehow objective to say "lower" is always better than "higher" regardless of actuality.
- "ad" saves transistors
And even this by itself might allow a higher maximum degree with a given transistor budget, which you exclude from consideration for image quality by consistently and repeatedly ignoring any and all possibility of considering how this advantage manifests at all.
Even still, for this game type, angle dependent AF (the ones we are discussing, AFAIK) doesn't just show benefit at 0 and 90 degrees.
...
Overall, that still leaves the flight simulator case (for 16 pipe, and I'd speculate down to 8 pipe designs) where performance savings seem, to my expectation, likely to be completely unimportant for even high resolutions, when the ground detail is delivered by a color texture.

Most games today are a blend between 1 and 3, leaning towards 3. The only consistent advantage for "ad" is that it saves transistors.

That last statement is only because you use subjective criteria to ignore that the advantages for "b" are not consistent either, just like the ones for "ad" that you just dismissed from consideration!
This is what is useless, not your opinion of it offering superior image quality. It precludes any realistic consideration or discussion that might result in a different conclusion, and only by subjective factors that seem geared towards justifying labels like "crappy" or "stupid".

IMO performance is not an advantage of "ad", because "b" can reach better or at least identical quality for the same performance cost.
Because you stipulate something that doesn't make sense as far as I've seen (for example, that the performance advantage for "ad" is minimized in "Doom" style scenes where its more highly filtered angles occur most, and cover more screen area than the occurrence of lesser filtered angled surfaces).

This isn't always 4x"b" vs 16x"ad". The numbers may vary depending on the predominant angles in a game. However, I firmly believe that with the same amount of work, "b" can always do at least as good as "ad" regarding quality, and visibly better most of the time.

This is where the problem occurs...you've gone so far into subjectivity that you state as fact that a lesser degree of aniso is always the same as a higher one, or simply ignore its occurrence. You've dropped the number 4x (here at least), and sidestep around requiring "b" AF to get up to the same maximum degree (where, of course, I'd agree with you) by simply omitting the possibility.
This ends up ignoring the possibility that the "consistent" transistor savings might be why "ad" can go up to 16x in the first place (remains to be seen).

This leaves transistor count to be discussed.

On the one hand, people can only notice the difference between 16x and 4x when comparing two different screenshots, and on the other the difference between 16x and 2x on some angles on different surfaces would jump out at them? OK...
...So the ideal tradeoff for you would be to limit to optimally implementing only up to 4x AF with high performance? What strikes me here is that it sounds like, since surfaces at 2x AF versus 4x AF would be a less obtrusive change, that 4x"ad"AF would be better than 16x, because reducing the glaring difference between 16xAF and 2xAF outweighs the benefit of 4x to 16x?
You aren't trying to spin my words, are you?
No, it is the logic you present as I see it, as discussed above.
People do notice the difference between 4x and 16x AF. However, you notice the difference between two adjacent surfaces on one image much, much easier than the difference between something you see and some other image you have in your mind.
So, the difference in changing degree at varying angles is more significant than the change from increasing degree at some angles, is what you're saying, right?
16x"ad" is better than 4x"ad".
According to the logic of your statements, minimizing the change in degree at varying angles is more significant than the loss from dropping maximum degree used.

This is the only reading of the logic you present that fits categorically dismissing implementing "ad" AF as completely without any conceivable image quality merit, even compared to "b" AF of a lesser maximum degree, or any other merit, as you continue to do. Any other logic leaves them both with (performance independent) image quality advantages and disadvantages to consider in conjunction with performance and transistor savings factors, and leaves you without grounds to disagree with my criticisms of the lack of objectivity when you proclaim "always" as you do, and make the 16x and 4x comparisons you do.
Again:
Xmas said:
However, I firmly believe that with the same amount of work, "b" can always do at least as good as "ad" regarding quality, and visibly better most of the time.
And, as necessary to be logical and consistent with objectivity, and not to ignore the possibility of a higher AF degree where it manifests: visibly worse, and at angles that are certainly, objectively, quite common in specific game types. Again, unless the "b" has the same maximum degree.

What I don't understand is how you determine that 4x versus another surface at 2x is so significant that the 4x "ad" is not categorically better than 16x "ad", yet 4x "b" is. There just doesn't seem to be room there for a position as unassailable as you attempt to portray.
Unless you are looking at colored mip levels in a tunnel. Which is why I'm still curious about game scene evaluations you have in mind.
However, it is not necessarily better than 4x"b".
It is worse at certain angles, and better at angles that occur frequently, even in the game types you agree are most common.
"Not necessarily" doesn't mean "never". I'm arguing with your commentary based on the latter, I've never disputed the former once, have I?
Or, generally, Mx"ad" is not necessarily better than Nx"b", M>N. The numbers just don't tell the whole story.
True, that's why I'm trying to discuss it in a broader context than used to dismiss it as "crappy" and "stupid" when looking at just the numbers that occur at the "ad" worst case angles. I'm not proposing that the best case angles are the only factor, I only keep mentioning them because that is the part you refuse to acknowledge as having any possible merit.
If they cut down AF to put in other features, then I think they got their priorities wrong.
Even if those features were floating point texture filtering and/or work to speed it up?
Now, on to transistor count. There are lots of other features than the ones you mention. What about brilinear filtering, for example?
I don't understand what you are asking in the context of the statement, though I am still wondering about your thoughts on the question of performance and transistor savings with regard to floating point texture handling.
If you want my view of brilinear, in short it is a positive feature that becomes negative when mislabelled and used to misinform consumers. The only parallel to "ad" and "b" AF I see is if "b" was used in test apps and for colored mip levels and "ad" of the same degree was used everywhere else, i.e. deception. That's only while brilinear is still used in the same way, which it seems it won't be at the current time (for the NV4x series, at least).
They managed to get 8x"b" AF into NV20 with its 57M transistors.
Which was 4 pipes with much less processing functionality...
Even into NV30 which was already huge.
True, but how much of that transistor count growth was due to register files for floating point processing and the associated functionality, as well as vertex processing changes? The NV30 is somewhat of an enigma in this context in how little it delivered and for what transistor cost.
Now they produce a 220M+ chip and don't have the space for the decreasing relative cost of non angle-dependent AF?
Errr...but they quadrupled the pipeline/quad count. BTW, not that this gives useful info outside of your proposition that the NV20's transistor count tells us something by itself: 4*57 = 228.
I don't quite see how your reasoning holds together without some further clarification.
I don't buy it.
I don't see your reasoning so far.
For myself, I think "ad" might be the way they managed up to 16x degree AF with practical performance impact and transistor budget, which remains to be clarified.
I also think support for AF on floating point textures might have been a consideration (you keep not responding to my query about considering the issue), either in transistor savings (if "b" AF is either absent or restricted in degree) or performance (if "b" AF is present, though I don't think this is likely to be significant for traditional texture usage for the 16 pipeline part).
 
Sorry demalion, but I really think we're discussing entirely different points.

* As I already said, we're talking about image quality, so my POV is, of course, entirely and absolutely subjective. The best I can provide is an "informed opinion", from playing several games on ATI and NVidia hardware with AF on, and having read several papers on different AF algorithms.

* When I discuss "identical degrees" of AF, I really mean that.. Well, at least "nominally identical", since the meaning of the numbers is a bit different for "b" and "ad". I later on compare different degrees.

* When I discuss "angle dependency only", I really mean that..
Whether there is no angle dependency, or whether there is no surface that would be affected by angle dependency, the result is identical AF degrees. So if the AF is based on the same algorithm, the work done is the same, and the result is the same. Incidentally, ATI and NVidia do not use the same algorithm, even besides LOD and aniso degree determination.

However, I have to admit my "Doom" style example was wrong, because it is still affected by angle dependency. Even plain horizontal surfaces are treated differently towards the sides of the screen.

* Angle dependency caps the degree of AF applied to a surface depending on certain properties of the position of said surface. The reason this is done is that there is a mathematical simplification of a rather complex formula which exhibits its largest errors just in the case of those surfaces. The reason is not that those surfaces need less AF. They don't.

The mathematical simplification saves transistors. I accept this as a very valid point. But I don't think it makes up for the drawbacks.
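To illustrate what such a simplification can look like, here is a small sketch. The "exact" value is the major/minor axis ratio of the pixel footprint ellipse in texture space; the "simplified" value follows the cheap estimate described in the EXT_texture_filter_anisotropic spec text. It is used here only as an example of this class of shortcut and its angle-dependent error, not as either vendor's actual hardware formula.

import numpy as np

def exact_aniso(J):
    """Reference anisotropy: ratio of the singular values of the 2x2
    texture-coordinate Jacobian [[du/dx, du/dy], [dv/dx, dv/dy]], i.e. the
    major/minor axis ratio of the pixel footprint ellipse in texture space."""
    s = np.linalg.svd(J, compute_uv=False)
    return s[0] / max(s[1], 1e-8)

def simplified_aniso(J):
    """Cheaper estimate in the spirit of the EXT_texture_filter_anisotropic
    description: compare only the lengths of the two screen-axis derivative
    vectors.  It underestimates anisotropy whenever the footprint's major
    axis is not aligned with a screen axis."""
    px = np.hypot(J[0, 0], J[1, 0])  # |d(u,v)/dx|
    py = np.hypot(J[0, 1], J[1, 1])  # |d(u,v)/dy|
    return max(px, py) / max(min(px, py), 1e-8)

# An 8:1 anisotropic footprint, rotated in screen space.  The true anisotropy
# never changes; the simplified estimate collapses towards 1x at 45 degrees.
for deg in (0.0, 22.5, 45.0):
    a = np.radians(deg)
    R = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
    J = np.diag([8.0, 1.0]) @ R
    print(f"{deg:4.1f} deg: exact {exact_aniso(J):.1f}x, simplified {simplified_aniso(J):.1f}x")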

Let's say you have angle dependent AF set to degree M, doing a particular amount of work (i.e. taking a certain number of texture samples while using up a certain amount of bandwidth). And you have non angle dependent AF set to degree N <= M, doing exactly the same amount of work. In this case, I think the latter always results in at least as good IQ, because it distributes the work more sensibly.

The amount of work being the same implies that there are some surfaces which "ad" filters with higher AF, and some with lower AF. Those with higher AF do not make up for those with lower AF, IMO, in any circumstances.
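A toy piece of bookkeeping for the M-vs-N comparison, under admittedly crude assumptions: samples taken scale linearly with the applied degree, every surface covers the same screen area, and the 50/50 scene split is invented.

# Invented scene: half the visible surfaces at angles the "ad" method
# favours, half at angles it penalises down to 2x.
screen_share = {"favourable angles": 0.5, "penalised angles": 0.5}
ad_applied   = {"favourable angles": 16,  "penalised angles": 2}   # 16x "ad"

ad_work = sum(share * ad_applied[k] for k, share in screen_share.items())
print(f'16x "ad" average work: {ad_work}x samples per pixel')
print(f'so a uniform "b" mode near {ad_work:.0f}x would cost roughly the same')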


* "support for AF on floating point textures" - I don't know exactly what you mean. If you mean, saving transistors on AF to include FP16 filtering, yes, I think they could have done that, however, and IMO then they should have saved transistors on something else. Like dropping brilinear filtering. Which is, btw, the reason I mentioned brilinear. It takes additional transistors, but trades image quality for a marginal increase in performance (IIRC one of the NV40 previews had a performance comparison, and brilinear was like 2% faster, but I don't know which one from the top of my head).
If you mean something else, I don't know what it could be. AF controls the sampling positions, no matter whether the contents of the texture are encoded as 8 or 16 bit fixed point, FP16 or DXTx.
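A minimal sketch of the separation I mean (toy code, not any vendor's actual pipeline): the logic that picks the sample positions along the footprint is shared, and only the final per-texel decode step knows about the storage format.

import numpy as np

def aniso_sample_positions(center_uv, major_axis_uv, degree):
    """Sample points spread along the footprint's major axis.
    Nothing here depends on how the texels are encoded."""
    offsets = (np.arange(degree) + 0.5) / degree - 0.5
    return [center_uv + t * major_axis_uv for t in offsets]

def fetch(texture, uv, decode):
    """Nearest-texel fetch; the per-format decode step is the only
    format-specific part."""
    h, w = texture.shape[:2]
    x = int(np.clip(uv[0] * w, 0, w - 1))
    y = int(np.clip(uv[1] * h, 0, h - 1))
    return decode(texture[y, x])

decode_unorm8 = lambda texel: texel.astype(np.float32) / 255.0
decode_fp16   = lambda texel: texel.astype(np.float32)

tex_unorm8 = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
tex_fp16   = np.random.rand(64, 64, 3).astype(np.float16)

# Identical sample positions for both textures; only the decode differs.
positions = aniso_sample_positions(np.array([0.5, 0.5]), np.array([0.2, 0.05]), 8)
for tex, decode in ((tex_unorm8, decode_unorm8), (tex_fp16, decode_fp16)):
    print(np.mean([fetch(tex, p, decode) for p in positions], axis=0))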
 
Xmas said:
Sorry demalion, but I really think we're discussing entirely different points.

* As I already said, we're talking about image quality, so my POV is, of course, entirely and absolutely subjective. The best I can provide is an "informed opinion", from playing several games on ATI and NVidia hardware with AF on, and having read several papers on different AF algorithms.

There is nothing wrong with your subjective evaluation. What I'm complaining about is the exclusions you justify by it. I will explain as I go along.

* When I discuss "identical degrees" of AF, I really mean that.. Well, at least "nominally identical", since the meaning of the numbers is a bit different for "b" and "ad". I later on compare different degrees.

Yes, but 1) you're repeatedly ignoring the point I'm making, and 2) where have you actually acknowledged that a difference in degree is an improvement in image quality when the difference is in favor of "ad" AF?
First, you do this to evaluate the image quality comparison while simply excluding factors that don't work in favor of your opinion. I've discussed several advantages that could manifest as image quality advantages, generally related to higher degrees and suitable framerates and the relationship between them and the transistor budget, and you simply excise any consideration of such a relationship and isolate all benefit to transistor budget as a label that doesn't relate to anything else at all.
Second, you did this when I was discussing your 4x and 16x comparison...I only brought up 4x"ad" versus 4x"b" to illustrate an inconsistency of significance of issues presented in your commentary, so mentioning it and considering only it in your summaries only serves the above exclusion.

* When I discuss "angle dependency only", I really mean that..
Whether there is no angle dependency, or whether there is no surface that would be affected by angle dependency, the result is identical AF degrees.
This is what I mean about exclusion. Angle dependency is just one factor, and the negative one, and the above case is as far as you go in recognizing that, completely refusing to acknowledge any significance to there being higher degrees for "ad" being discussed and simply omitting them in response when I point them out.
What reason do you have for constantly seeking to isolate the negative factor when I'm not arguing at all against it being negative?
Other factors that matter are performance, transistor budget, and what they might allow a user to get for image quality, like, as only one example, the higher degree consideration you continually omit by various means to no apparent purpose except precluding anything but your established opinion.
So if the AF is based on the same algorithm, the work done is the same, and the result is the same. Incidentally, ATI and NVidia do not use the same algorithm, even besides LOD and aniso degree determination.
This statement actually goes along with my problem with your methods of isolation and polarized evaluations of the merits for each approach. I'm also not aware of the significance of this regarding both of their "ad" methods in comparison to each other, which is important beyond arguing by the label "ad" and dismissing all other considerations.
However, I have to admit my "Doom" style example was wrong, because it is still affected by angle dependency. Even plain horizontal surfaces are treated differently towards the sides of the screen.
How differently, in comparison to what, and with what impact? I assume you're not simply mentioning a distinction to dismiss the "Doom" discussion and my having pointed out issues with the relative significance of the issues you ignore and the issues you focused upon?
* Angle dependency caps the degree of AF applied to a surface depending on certain properties of the position of said surface. The reason this is done is that there is a mathematical simplification of a rather complex formula which exhibits its largest errors just in the case of those surfaces. The reason is not that those surfaces need less AF. They don't.
What is the point in correcting something that was not said? The only thing remotely similar that I've said is that their occurrence might not manifest as often in some scenes. Are you capable of conceding that at all?
The mathematical simplification saves transistors. I accept this as a very valid point. But I don't think it makes up for the drawbacks.
Again, the only benefit you are willing to concede is "saving transistors", treated as a phrase that simply ends there in significance and goes no further, as opposed to something that might deliver something of merit and complicate things beyond looking at just "drawbacks", even in the image quality department.
Let's say you have angle dependent AF set to degree M, doing a particular amount of work (i.e. taking a certain number of texture samples while using up a certain amount of bandwidth). And you have non angle dependent AF set to degree N <= M, doing exactly the same amount of work.
Things excluded by this argument construction:
- That you might have more transistors to allow the angle dependent AF to do more work (go to a higher degree M), which actually seems to be the case for past "ad" versus "b" AF implementations. You simply posit the work as "equal" and focus on your subjective view of one being "better" to continue on with labels such as "always".
- That you might have algorithms and other logic as part of the tradeoff that get the same output out of less bandwidth usage (i.e., fewer texture cache misses or otherwise more effective texture caching).
- That you might be able to do, with acceptable performance, things you could not otherwise do with the available bandwidth and logic, such as AF on floating point textures (as an example that might directly pertain to even the 16 pipe NV40).
In this case, I think the latter always results in at least as good IQ, because it distributes the work more sensibly.
This is based on the fallacious exclusions that were built into your argument construction, the issue being how you arrived at "always" by ignoring the exceptions, and how you propose "sensible" as anything more valid than an opinion re-stated while ignoring things that argue against it.
I could construct a similar argument about how spending transistors to improve the R300's AF would be a waste, by simply focusing on the things you continually drop or talk around, and omitting the things you repeat over and over as if I didn't understand.
Again, this matters for your proclamation of exclusive "sensibility", and for the notable lack of completeness in what you recognize when arguing that sensibility is absent from something simply because it does not follow your preference; it does not matter as far as your liking one more than the other (not to me, because I'm not at all trying to exclude your reasons for it).

The amount of work being the same implies that there are some surfaces which "ad" filters with higher AF, and some with lower AF.
"Work" is a slippery word, prone to abuse by making a statement using other words superficially not resemble itself when it might already have been discussed. I've discussed answers to this proposition before, both interpreted as associated with "image quality" and "maximum AF degree".
Those with higher AF do not make up for those with lower AF, IMO, in any circumstances.
And the problem is you ignore how the subjectivity of the arguments leading up to this statement allows it to be applied to your 16x"ad"AF versus 4x"b"AF example (among others) in the exact opposite of the way you stated, to the degree of argument you propose, depending solely on the personal preference of the person making it.
Both arguments are so bound to the associated preference that they serve to simply restate it.

* "support for AF on floating point textures" - I don't know exactly what you mean. If you mean, saving transistors on AF to include FP16 filtering, yes, I think they could have done that, however, and IMO then they should have saved transistors on something else. Like dropping brilinear filtering. Which is, btw, the reason I mentioned brilinear.
Thanks for the clarification.
It takes additional transistors, but trades image quality for a marginal increase in performance (IIRC one of the NV40 previews had a performance comparison, and brilinear was like 2% faster, but I don't know which one off the top of my head).
Well, the benefits do certainly seem scarcer, but I'm not so sure as you seem to be of the significance of increased transistor cost in comparison to, again, bandwidth restrictions in some situations and lower pipeline count designs.
I won't argue that I see limited benefit for the 16 pipe part for this feature in the areas of clear usefulness for it, and I'd expect the 12 pipe part as well.
If you mean something else, I don't know what it could be. AF controls the sampling positions, no matter whether the contents of the texture are encoded as 8 or 16 bit fixed point, FP16 or DXTx.
It seems I've pointed out the apparent error in your discussion about transistor count significance, and you've acknowledged the point above (though discussing brilinear seems a replacement of discussion that obscures that, not a clarification). This is sufficient if it isn't simply reversed when considering other things I've brought up.
 