ATi is chtg in Filtering

jimmyjames123 · May 17, 2004

NV40 has a pixel shader performance hit when utilising AF, R420 does not.

Any ideas why this would be the case?

Blacklash · May 17, 2004

sonix666 said:
This is quite fishy. After a year of very good behaviour from ATi compared to nVidia, I wouldn't have expected them to pull this stunt if it is true. Finally nVidia gave users a choice between real trilinear and brilinear, and now ATi pulls a NV3x.

This is the first real competition ATi has received from Nvidia in some time. The coming months are the time to 'watch behavior'. The way the launches went down, to me, says the card series are very close.

Dave Baumann · May 17, 2004

jimmyjames123 said:
NV40 has a pixel shader performance hit when utilising AF, R420 does not.

Any ideas why this would be the case?

The conclusion I reach is that it's because ALU1 in NV40 is tied to the texture unit, and the extra cycles spent taking the extra samples are cycles in which this ALU can't be used. R420's shader units don't have a direct relationship with the texture units, so it can probably hide the extra texture accesses a little better when executing ALU instructions.

http://www.beyond3d.com/reviews/ati/r420_x800/index.php?p=24
http://www.beyond3d.com/previews/nvidia/nv40/index.php?p=28
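A toy cycle-count model of the explanation above, purely for illustration. It assumes a coupled design pays for ALU and texture cycles back to back, while a decoupled design overlaps them perfectly. The function names and cycle figures are made up; they are not measurements of NV40 or R420, and real hardware is far more involved.

```python
def coupled_cycles(alu_cycles, tex_cycles):
    # Assumption: the ALU shares its issue slot with the texture unit,
    # so texture sampling cycles cannot overlap with ALU work.
    return alu_cycles + tex_cycles


def decoupled_cycles(alu_cycles, tex_cycles):
    # Assumption: ALU and texture units run independently, so the
    # longer of the two hides the shorter (perfect overlap, no stalls).
    return max(alu_cycles, tex_cycles)


# Hypothetical shader: 8 ALU cycles, 2 texture cycles without AF,
# 6 texture cycles once AF adds extra samples.
for label, tex in (("no AF", 2), ("with AF", 6)):
    print(label, coupled_cycles(8, tex), decoupled_cycles(8, tex))
```

In this sketch the coupled total grows with every extra AF sample, while the decoupled total stays flat as long as the ALU work is the longer of the two.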

bloodbob · May 17, 2004

Assuming there is coloured mipmap detection.

Okay folks, I think I've got an idea of HOW they might be detecting it. ATi might be detecting ALL non-autogenerated mipmaps and doing full trilinear on them. For autogenerated mipmaps they use a less than 100% trilinear/bilinear mix.

Someone should write up a little benchmark that runs with A) auto-generated mipmaps and B) custom mipmaps, and send it to some people who have access to R420 to get them to run it. (Preferably lots of hi-res textures with lots of layers.)
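One way such a check could work, sketched in Python. This is a guess for illustration only, not anything known about the driver: it assumes an autogenerated mip level is a plain 2x2 box filter of the level above, and flags anything else (for example a flat coloured mip) as custom. The names `box_downsample`, `looks_autogenerated` and the `tolerance` parameter are hypothetical, and images are single-channel nested lists with even dimensions.

```python
def box_downsample(level):
    """Return the next mip level via a 2x2 box filter (even dimensions assumed)."""
    h, w = len(level), len(level[0])
    return [
        [
            (level[y][x] + level[y][x + 1]
             + level[y + 1][x] + level[y + 1][x + 1]) / 4.0
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def looks_autogenerated(base, supplied_mip, tolerance=1.0):
    """True if supplied_mip is within `tolerance` of a box-filtered base."""
    expected = box_downsample(base)
    return all(
        abs(e - s) <= tolerance
        for erow, srow in zip(expected, supplied_mip)
        for e, s in zip(erow, srow)
    )


base = [
    [10, 20, 30, 40],
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [50, 60, 70, 80],
]
print(looks_autogenerated(base, box_downsample(base)))     # box-filtered mip
print(looks_autogenerated(base, [[255, 255], [255, 255]]))  # flat coloured mip
```

A benchmark along the lines suggested above could feed both kinds of mip chain to the card and compare frame rates.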

jimmyjames123 · May 17, 2004

Comparing the drop for AF with AA is not a good test because this can often hide the performance issues associated with AF.

If you test 4xAA and gather one set of data, and then add on 8xAF and gather another set of data, then you get the performance delta from 4xAA to 4xAA/8xAF. That's all I was showing. Whether AA is on or not probably would not change the fact that the X800XT and X800Pro have a much, much smaller performance hit from AF than the 6800U in games like UT2004 and FarCry. Tom's tested UT2004 with just 8xAF and found the same. Now we are starting to get to the heart of the matter: why are we seeing these results? Initially, I suggested that some reviewers have found NV's AF algorithm to be slightly clearer/sharper than ATI's AF algorithm, and this could account for some of the difference seen here. Also, you suggested that the 6800U loses some ps performance when AF is enabled, and this could account for some of the difference seen here as well.

Dave Baumann · May 17, 2004

9600 shows the same thing since Cat 3.4, which I believe was the first official drivers that support RV350. With full filtering you see more or less the same as you do for the X800, however if you drop down the texture quality slider BOTH the standard and coloured mip map differences also change, and the "trilinear optimisations" are present in the coloured mip shots when the texture preference slider is knocked down a notch.
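For anyone unsure what a "trilinear optimisation" usually means, here is a minimal sketch of the commonly described idea: full trilinear blends two mip levels across the whole LOD range, while a reduced version only blends in a narrow window around the mip transition and is plain bilinear elsewhere. The `band` parameter and function names are assumptions for illustration; this is not a description of what ATi's or NVIDIA's hardware actually does.

```python
import math


def trilinear_weight(lod):
    """Blend weight toward the next (smaller) mip level for full trilinear."""
    return lod - math.floor(lod)


def reduced_trilinear_weight(lod, band=0.5):
    """Hypothetical reduced blend: only mix mips inside a window of width
    `band` centred on the point where bilinear would switch levels."""
    frac = lod - math.floor(lod)
    if band <= 0:
        # No blending at all: pick the nearest mip level (pure bilinear).
        return 0.0 if frac < 0.5 else 1.0
    weight = (frac - 0.5) / band + 0.5
    return min(1.0, max(0.0, weight))


for lod in (2.1, 2.25, 2.5, 2.9):
    print(lod, trilinear_weight(lod), reduced_trilinear_weight(lod))
```

With `band=1.0` the reduced version matches full trilinear; the smaller the band, the more of the surface is filtered from a single mip level, which is where the saving would come from.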

micron · May 17, 2004

DaveBaumann said:
9600 shows the same thing since Cat 3.4, which I believe was the first official drivers that support RV350. With full filtering you see more or less the same as you do for the X800, however if you drop down the texture quality slider BOTH the standard and coloured mip map differences also change, and the "trilinear optimisations" are present in the coloured mip shots when the texture preference slider is knocked down a notch.

And you were trying to tell us that, weren't you...

Dave Baumann · May 17, 2004

If you test 4xAA and gather one set of data, and then add on 8xAF and gather another set of data, then you get the performance delta from 4xAA to 4xAA/8xAF. That's all I was showing.

And I'm saying that's not a very good test, because there are too many variables there for it to be particularly useful in narrowing down what we are looking at.

So any ideas why we would see a 20-30% decrease in performance for the X800XT/X800Pro when using 4xAA/8xAF vs 4xAA in IL-2, while the 6800U only loses 15% performance?

See, this is an example of different architectures doing different things in different scenarios when you introduce more variables. (And in this particular instance I'd take a stab at the X800 not being able to cache all the texture required for the extra samples in all cases, hence having to get more of the texture into its cache thus having a bandwidth penalty which is already constrained due to the use of FSAA).

Dave Baumann · May 17, 2004

micron said:
And you were trying to tell us that, weren't you...

Sorry?

jimmyjames123 · May 17, 2004

So DaveB, any ideas why we would see a 20-30% decrease in performance for the X800XT/X800Pro when using 4xAA/8xAF vs 4xAA in IL-2, while the 6800U only loses 15% performance?

We see a 4-7% decrease in performance for the X800XT/X800Pro when using 4xAA/8xAF vs 4xAA, while the 6800U loses 26% performance. Finally, we see an 8-9% decrease in performance for the X800XT/X800Pro when using 4xAA/8xAF vs 4xAA in FarCry, while the 6800U loses 25% performance.

Interesting results to say the least.

Dave Baumann · May 17, 2004

See above.

overclocked · May 17, 2004

Just some thoughts on the debate, from my understanding of what's been said. I'm trying to get the picture, but please correct me if I'm wrong.
First is the argument about the driver detecting the app and then changing its behaviour in a smart way, as happens when doing in-depth reviews that look at just these things. I would call that a VERY "smart" bug...

Secondly, I can see why ATI would be doing this (if it is), because when we compare the NV40U and X800XT, I and many others look just at these ultra-high settings with maximum filtering quality, and then a few frames more or less make all the difference.

jimmyjames123 · May 17, 2004

And I'm saying that's not a very good test, because there are too many variables there for it to be particularly useful in narrowing down what we are looking at.

Well, I disagree with that. These "tests" are useful in showing how performance changes from 4xAA to 4xAA/8xAF, period (there are also some 8xAF-exclusive tests at Tom's that show the disparity in pure AF performance). If we use arguments such as "different architectures do different things", then we will never have a good enough "test" to definitely explain differences anyway. But we're just arguing semantics here, and we have made some headway in proposing different reasons for why the data is as such.

Dave Baumann · May 17, 2004

jimmyjames123 said:
Well, I disagree with that. These "tests" are useful in showing how performance changes from 4xAA to 4xAA/8xAF, period. And if we use arguments such as "different architectures do different things", then we will never have a good enough "test" to definitely explain differences. But we're just arguing semantics here anyway.

A good test removes as many variables as possible (that's a fairly simple and well-established scientific principle), which in this case means, preferably, using a fixed-function pipeline title with application-controlled AF (or using rtool) that shows some variance with AF enabled. Good tests would be SS:SE and CoD.

jimmyjames123 · May 17, 2004

As many tests as possible would be nice. Will be interesting to see how things play out

Marc · May 17, 2004

Seems there is also something strange with NV40 when you enable aniso via CP

UT2003

aniso8x via game : 49.7
idem + colored mip : 49.7

aniso8x via cp : 51.0
idem + colored mip : 49.7
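To put a number on the figures above, a short sketch that computes how much each score changes when coloured mip levels are switched on. The helper name `percent_change` is just for illustration; the four scores are the ones posted.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (score without coloured mips, score with coloured mips) from the post above
scores = {
    "aniso8x via game": (49.7, 49.7),
    "aniso8x via cp": (51.0, 49.7),
}

for mode, (normal, colored) in scores.items():
    print(f"{mode}: {percent_change(normal, colored):+.1f}%")
```

The in-game setting shows no change, while the control panel setting drops by roughly 2.5% once the mip levels are coloured.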

micron · May 17, 2004

DaveBaumann said:
micron said:

And you were trying to tell us that, weren't you...

Sorry?

Regarding the 9600 series...you hinted that something was not right with the application slider, and the results given...didn't you?

Dave Baumann · May 17, 2004

No, I was just saying that the 9600 had the capabilities for these trilinear optimisations and that they could be enabled via the slider. What seems apparent is that what the article pointed out with the X800 has been present with this type of filtering ever since they introduced the hardware capabilities, which suggests that either they have attempted to conceal this from day one or we are just looking at how this particular filtering actually operates.

micron · May 17, 2004

DaveBaumann said:
either they have attempted to conceal this from day one or we are just looking at how this particular filtering actually operates.

They would have told somebody like you though, Dave....if this is truly how they planned on implementing their trilinear filtering...you don't simply decide to do something like this and expect people not to cry foul upon discovery.
**edit**
Has this technique really been around since Cat 3.4?...on the 9600 series?

Dave Baumann · May 17, 2004

9600 PRO, Cat3.4 1.2MB (JPGs will have to do for now, PNGs will take too long to upload as I'm about to hit the sack)

Anyway, chew on these for a while. These are based on the same tests as before, but using a 9600 PRO and Cat3.4s. I've included the differences of Bilinear vs Trilinear for both the full texture slider quality and one notch down, and also included the differences with the mip map colouring on.

Before you say anything, yeah that is (apparently) the difference between the performance bi/trilinear (without coloured mip levels) - I ran it several times and got the same result.
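For anyone wanting to reproduce this kind of comparison, a minimal sketch of a difference image: subtract two screenshots pixel by pixel and count how many pixels changed. It assumes equally sized single-channel images held as nested lists; the function names and the `threshold` parameter are illustrative, and this is not the tool used for the shots above.

```python
def difference_image(img_a, img_b):
    """Per-pixel absolute difference of two equally sized grayscale images."""
    return [
        [abs(a - b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(img_a, img_b)
    ]


def changed_fraction(diff, threshold=0):
    """Fraction of pixels whose difference exceeds `threshold`."""
    total = sum(len(row) for row in diff)
    changed = sum(1 for row in diff for value in row if value > threshold)
    return changed / total if total else 0.0


# Tiny made-up 2x2 "screenshots" that differ in a single pixel.
bilinear_shot = [[10, 10], [10, 10]]
trilinear_shot = [[10, 12], [10, 10]]

diff = difference_image(bilinear_shot, trilinear_shot)
print(diff, changed_fraction(diff))
```

If two filtering modes really produced identical output, the difference image would be all zeros.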
