Why such little drop in performance in X800 when AF applied?

Looking at the initial set of reviews, one has to wonder: why is there such a small relative drop in performance on the X800 cards when AF is applied in certain Direct3D games?

First, let's take a quick look at AA performance. Using 4xAA, the NV 6800 cards seem to take a slightly smaller performance hit than the ATI X800 cards. This seems reasonable to me, given the fact that ATI is using gamma-corrected AA, which may account for some of the differences in performance. In the future, I imagine that ATI will work on providing intelligent and adaptive algorithms that apply varying amounts of AA depending on the situation in order to help further improve performance.

Now, let's look at AF performance. Both NV and ATI are using angle-dependent AF algorithms at the moment. However, there are still some slight differences in the algorithms and in the AF quality, as some reviewers have pointed out.

That said, it seems that ATI has figured out a clever way to optimize their anisotropic and trilinear filtering algorithms so that the performance penalty in some Direct3D games is very small when using AF. Given that both NV and ATI now use angle-dependent AF algorithms, it seems that--depending on the NV settings used during benchmarking--the ATI cards may be doing less work when AF is applied.
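To make the "doing less work" point concrete, here is a toy Python sketch of how an angle-dependent AF scheme can reduce the sample count at unfavourable surface angles. The thresholds and fallback degrees are purely illustrative assumptions, not either vendor's actual hardware logic.

```python
def angle_dependent_max_af(surface_angle_deg, requested_af=16):
    """Toy model: apply the full requested AF degree only near the
    'preferred' angles (multiples of 45 degrees) and fall back to a
    lower degree in between, so fewer texture samples are taken.
    Thresholds here are illustrative, not real hardware behaviour."""
    # Distance (in degrees) to the nearest multiple of 45
    dist = min(surface_angle_deg % 45, 45 - (surface_angle_deg % 45))
    if dist < 5:
        return requested_af               # full quality near preferred angles
    elif dist < 15:
        return max(requested_af // 2, 2)  # reduced degree
    else:
        return max(requested_af // 4, 2)  # fewest samples at "off" angles

# A surface tilted 22 degrees gets far fewer samples than one at 90 degrees
print(angle_dependent_max_af(90), angle_dependent_max_af(22))  # → 16 4
```

If both vendors do something along these lines but with different thresholds, the card with the more aggressive fallback simply samples less, which would show up as a smaller AF hit in benchmarks.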

Clearly, software/driver optimization plays an important role in helping to minimize performance loss when AF is applied on the ATI cards. This is evident because both the X800XT and X800Pro incur relatively small hits to performance when using AF compared to the 6800 cards, even when a card like the X800Pro is at a significant disadvantage in fillrate and memory bandwidth compared to a card like the 6800 Ultra.

These are some reasons that I have seen suggested for why the ATI X800 cards incur such a relatively small performance hit when AF is applied:

--optimized "trylinear" filtering
--angle-dependent anisotropic filtering algorithm that differs from NV's angle-dependent AF algorithm
--texture stage optimizations, where optimized filtering may be applied to everything at or beyond the base texture stage
--LOD (level-of-detail) optimizations
--adaptive anisotropic filtering algorithm that may switch AF modes on the fly depending on the situation
--etc.
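The "trylinear" item in the list above can be sketched as follows: full trilinear always blends the two nearest mip levels, while a reduced filter blends only in a narrow band around the mip transition and does cheap bilinear everywhere else, saving the second mip lookup for most pixels. The band width here is an illustrative assumption, not ATI's actual value.

```python
def trilinear_weight(lod_fraction):
    """Full trilinear: always blend the two nearest mip levels
    by the fractional LOD."""
    return lod_fraction

def reduced_trilinear_weight(lod_fraction, band=0.25):
    """Toy model of a reduced ("trylinear"-style) filter: blend only
    within a narrow band around the mip transition and snap to pure
    bilinear elsewhere. The band width is illustrative only."""
    if lod_fraction < 0.5 - band / 2:
        return 0.0   # pure bilinear from the lower mip
    if lod_fraction > 0.5 + band / 2:
        return 1.0   # pure bilinear from the upper mip
    # Linearly remap the narrow band to a full 0..1 blend
    return (lod_fraction - (0.5 - band / 2)) / band
```

In this sketch, most pixels fall outside the band and get a single bilinear sample, which is where the performance saving would come from.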

All in all, benchmarking the ATI X800 cards against the NV 6800 cards will be quite a task, given the fact that the new NV drivers allow the reviewer to enable/disable both trilinear and anisotropic filtering optimizations, while the ATI cards have no such options to turn off optimizations via ATI drivers.

Many reviewers also tend to provide little detail on exactly what driver and game settings were used during benchmarking. To make matters even more complicated, apparently there are some filtering optimizations that may not be noticeable in a screenshot but may be noticeable in-game during play when in movement.

Thoughts on these issues are much appreciated. Thanks.
 
I'm not even sure where to begin looking in that thread. :D I am not accusing ATI of "cheating", but rather am interested in learning more about exactly why we see the relative differences in performance, and am interested in hearing opinions on how reviewers should go about benchmarking the ATI cards vs the NV cards, given the differences in optimization that can be enabled/disabled.

Also, on the NV 6800 cards, when aniso and trilinear optimizations are both enabled in the driver, is the performance hit with AF similar to what we see on the X800 cards in certain D3D games like FarCry? And if not, why would that be?
 
Benching both cards at default settings is the best idea. Besides the fact that they both use AF texture stage optimizations, adaptive AF and a reduced trilinear filter (so they should be comparable), this is also what most users will play at. Remember to comment and/or give screenshots comparing image quality. I personally prefer comments on whether image quality differences can be noticed (this gets around the problem of differences that are only noticed in motion).

No idea why ATI takes such a small hit; my theory is that it's something to do with the clock speed advantage, i.e. it helps the X800s in an area which is clock-for-clock the same on both ranges of cards. I mean, even an X800 Pro has a higher clock speed than a 6800 Ultra EE.
 
dan2097 said:
Benching both cards at default settings is the best idea. Besides the fact that they both use AF texture stage optimizations, adaptive AF and a reduced trilinear filter (so they should be comparable), this is also what most users will play at. Remember to comment and/or give screenshots comparing image quality. I personally prefer comments on whether image quality differences can be noticed (this gets around the problem of differences that are only noticed in motion).

No idea why ATI takes such a small hit; my theory is that it's something to do with the clock speed advantage, i.e. it helps the X800s in an area which is clock-for-clock the same on both ranges of cards. I mean, even an X800 Pro has a higher clock speed than a 6800 Ultra EE.

I imagine it's mostly due to fill rate (which would be directly affected by clock speeds). Since AF is mostly fill-rate limited, it would make sense if you take ATI's higher fill rates into account.
 
I thought AF was mostly memory-limited, or do all these trilinear and AF optimizations shift the bottleneck?
 
Pete said:
I thought AF was mostly memory-limited, or do all these trilinear and AF optimizations shift the bottleneck?

Are you sure? I am almost 100% sure anti-aliasing is bandwidth bound, and anisotropic filtering is fill-rate bound.
 
Depends on the title, its texturing requirements and how well they fit in the cache. Look at UT2003/4: generally speaking they have a larger hit for AF than AA, which I believe to be the size, quantity and quality of the textures forcing a lot of cache misses with AF.

Generally speaking though, most titles have fairly good caching and AF is going to be fillrate limited with AF texturing, as it's just cycling through the texture sampler up to 16 times.
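A back-of-the-envelope model of that last point: if trilinear AF reuses the texture sampler once per probe, the per-pixel cost scales with the anisotropy degree the pixel actually needs, which is why AF tends to be fillrate-limited when caching is good. The cycle counts here are illustrative, not measured hardware figures.

```python
def af_texture_cycles(base_cycles=1, af_degree=16, anisotropy_needed=16):
    """Toy cost model: the texture sampler is cycled once per AF probe,
    up to the enabled degree, so a steeply-angled pixel costs roughly
    'degree' times what a face-on pixel costs. Illustrative numbers only;
    real hardware overlaps this work with other pipeline stages."""
    probes = min(af_degree, anisotropy_needed)
    return base_cycles * probes

# Steeply-angled pixel needing the full 16x vs. a face-on pixel
print(af_texture_cycles(anisotropy_needed=16), af_texture_cycles(anisotropy_needed=1))  # → 16 1
```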
 
I imagine it's mostly due to fill rate (which would be directly affected by clock speeds). Since AF is mostly fill-rate limited, it would make sense if you take ATI's higher fill rates into account.

I'm not so sure about this, given the fact that both the 12-pipeline X800Pro and 16-pipeline X800XT seem to have a relatively small performance hit when AF is applied in certain D3D games compared to the 6800 cards. The X800Pro has approximately the same fillrate as the 6800GT, and the 6800U/UE have significantly more fillrate than either of these two cards. Also, a 475MHz core clock 6800Ultra almost certainly would have a higher percentage hit to performance in certain D3D games when AF is applied vs the 475MHz core clock X800Pro (see the Anandtech results using a 460MHz core clock 6800Ultra), so I doubt that sheer clockspeed is the primary factor here in determining the low performance penalty. Notice that this phenomenon of a very low performance hit when using AF on ATI X800 cards is not seen in OpenGL games. Adding up all these factors leads me to believe that software/driver optimizations are playing a key role here in reducing the performance hit.
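For reference, peak single-texel fillrate is just pixel pipelines times core clock. The pipe counts and clocks below are the commonly cited launch specs for these cards (treat them as assumptions; shipping clocks varied slightly):

```python
# Peak single-texel fillrate = pixel pipelines x core clock.
# Pipe counts and clocks are the commonly cited launch specs.
cards = {
    "X800 Pro":   (12, 475),   # (pipelines, core MHz)
    "X800 XT PE": (16, 520),
    "6800 GT":    (16, 350),
    "6800 Ultra": (16, 400),
}

for name, (pipes, mhz) in cards.items():
    gtexels = pipes * mhz / 1000  # Gtexels/s
    print(f"{name}: {gtexels:.1f} Gtexels/s")
```

By this measure the X800Pro (5.7 Gtexels/s) and 6800GT (5.6 Gtexels/s) are nearly identical, which is why raw fillrate alone doesn't explain the difference in AF hit between them.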
 
Re: Why such little drop in performance in X800 when AF appl

jimmyjames123 said:
First, let's take a quick look at AA performance. Using 4xAA, the NV 6800 cards seem to take a slightly smaller performance hit than the ATI X800 cards. This seems reasonable to me, given the fact that ATI is using gamma-corrected AA, which may account for some of the differences in performance.
I don't think this has anything to do with gamma-corrected AA. Both OpenGL guy and sireric have mentioned that R420 has a new highly configurable memory controller, and it'll take some time to get everything they can from it, especially when AA is enabled.

As for AF, I don't know. NVidia is accusing ATI of cheating, and some people have noticed more aliasing with R420's AF versus R300's. It makes sense that clockspeed is part of the equation, though. I'm waiting for ATI to carry these optimizations over to the OpenGL driver.
 
Another thing to bear in mind when thinking about texturing between ATI's and NVIDIA's solutions is that ATI's texturing performance isn't affected by any pixel shader operations, as it has a separate texture address processor, whereas NVIDIA's solution uses one of the shader ALUs for some operations.
 
DaveBaumann said:
Another thing to bear in mind when thinking about texturing between ATI's and NVIDIA's solutions is that ATI's texturing performance isn't affected by any pixel shader operations, as it has a separate texture address processor, whereas NVIDIA's solution uses one of the shader ALUs for some operations.


What are the advantages/disadvantages of both solutions? :?:
 
Another thing to bear in mind when thinking about texturing between ATI's and NVIDIA's solutions is that ATI's texturing performance isn't affected by any pixel shader operations, as it has a separate texture address processor, whereas NVIDIA's solution uses one of the shader ALUs for some operations.

Does this situation hold true for the Radeon 9800 R3xx-type cards? I don't recall seeing such a small percentage drop in performance on these cards when AF was applied.
 
jimmyjames123 said:
I imagine it's mostly due to fill rate (which would be directly affected by clock speeds). Since AF is mostly fill-rate limited, it would make sense if you take ATI's higher fill rates into account.

I'm not so sure about this, given the fact that both the 12-pipeline X800Pro and 16-pipeline X800XT seem to have a relatively small performance hit when AF is applied in certain D3D games compared to the 6800 cards. The X800Pro has approximately the same fillrate as the 6800GT, and the 6800U/UE have significantly more fillrate than either of these two cards. Also, a 475MHz core clock 6800Ultra almost certainly would have a higher percentage hit to performance in certain D3D games when AF is applied vs the 475MHz core clock X800Pro (see the Anandtech results using a 460MHz core clock 6800Ultra), so I doubt that sheer clockspeed is the primary factor here in determining the low performance penalty. Notice that this phenomenon of a very low performance hit when using AF on ATI X800 cards is not seen in OpenGL games. Adding up all these factors leads me to believe that software/driver optimizations are playing a key role here in reducing the performance hit.


I don't believe OpenGL gets ATI's AF optimisations. Correct me if I'm wrong, but I was almost certain of that. When you compare ATI's trilinear optimisations to NVIDIA's trilinear optimisations, ATI's fill rate advantage should be obvious.


On a minor note, ATI has used texture stage optimizations in the past with control panel AF. Are they still employing these? On NVIDIA these are disabled by default; however, you can still turn the AF optimizations on for NVIDIA cards.
 
DaveBaumann said:
Another thing to bear in mind when thinking about texturing between ATI's and NVIDIA's solutions is that ATI's texturing performance isn't affected by any pixel shader operations, as it has a separate texture address processor, whereas NVIDIA's solution uses one of the shader ALUs for some operations.
But that does not affect the performance of trilinear filtering or AF.
 
There's certainly some resource contention going on in relation to ATI's arrangement:

6800: [chart: image036.gif]

X800: [chart: image030.gif]
 
Alstrong said:
what are the advantages/disadvantages for both solutions :?:

It's really a case of balance - whilst the ratio of texturing to shader ops is relatively high on the texturing side, having a separate texture address processor can give you more opportunistic texturing/shader scheduling to minimise stalls from either one. However, as the ratio of shader instructions to textures moves towards shader instructions, it may be more beneficial to save the die space of having dedicated units and put the relevant instructions in an ALU, so you have more ALUs that can be utilised for shader processing.
 
There's certainly some resource contention going on in relation to ATI's arrangement:

Is there any chance that this situation will be different for NV when the 6x.xx series Forceware drivers become more mature? Have you noticed any differences in the above data with newer drivers and tri and/or aniso optimizations enabled on the NV cards?
 