Why is there so little drop in performance on the X800 when AF is applied?

It’s possible ATI went with a pipeline arrangement that sacrifices some non-AF performance for better AF performance. If that’s the case, it would be a good tradeoff, as AF usage is pretty well a given on a card as fast as the X800. Since the R420 is based heavily on the R300 (and is not a from-scratch design), ATI would have had more time to massage the design and remove some of the performance bottlenecks for AF.
 
Xmas said:
DaveBaumann said:
Another thing to bear in mind when thinking about texturing between ATI's and NVIDIA's solutions is that ATI's texturing performance isn't affected by any pixel shader operations, as it has a separate texture address processor, whereas NVIDIA's solution uses one of the shader ALUs for some operations.
But that does not affect the performance of trilinear filtering or AF.

Unless ATI somehow uses the pixel shaders to augment filtering operations (i.e. uses a shader that performs like a texture address processor), which could almost be like having two texturing units per pipe.
 
Radar: a) The performance graphs already indicate the opposite of that. b) ATI don't have two texture units per pipe.
 
Don't pixel shaders need to be capable of texture addressing?

Isn't it possible that, if the pixel shader doesn't share this capability with the texturing unit, it could perform some calculations on behalf of the texture unit (assisting it or speeding it up)? This would be handy for AF, where you are doing lots of lookups into the texture for filtering.
 
DaveBaumann said:
There's certainly some resource contention going on in relation to ATI's arrangement:
But there's nothing that hints at this being because more advanced filtering in the TMU blocks the ALU. It doesn't. After the texture coordinates are sent to the TMU, SU0 takes no part in the texture sampling process in any way.

If anything, these graphs show the differences between the AF algorithms. NVidia takes more samples on average, so this shader shifts from being arithmetic limited to partially texturing limited with higher AF. There's hardly any difference between NoAF and 2xAF. This indicates that the shader is still arithmetic bound. If it were the case that the additional sampling cycles took some resources from SU0, 2x would be slower.
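The arithmetic-limited vs texture-limited distinction here can be sketched with a toy throughput model (the cycle counts are made up for illustration, not either IHV's real scheduling; the model just assumes the TMU filters in parallel with the shader ALU):

```python
# Toy model: per-pixel cost is whichever unit is the bottleneck,
# assuming the TMU samples in parallel with the shader ALU.
# alu_cycles and tex_cycles_per_sample are invented numbers.

def pixel_cost(alu_cycles, samples, tex_cycles_per_sample=2):
    return max(alu_cycles, samples * tex_cycles_per_sample)

# A long arithmetic shader (16 ALU cycles) with bilinear (1 sample per
# lookup), 2xAF (~2 samples), up to 16xAF (~16 samples per lookup):
for samples in (1, 2, 4, 8, 16):
    print(samples, pixel_cost(16, samples))
```

With these assumed numbers, NoAF through 8xAF all cost the same 16 cycles (arithmetic bound), and only 16xAF tips the shader into texture-limited territory — the same shape of behaviour described above, where NoAF and 2xAF are indistinguishable.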
 
Xmas said:
If anything, these graphs show the differences between the AF algorithms. NVidia takes more samples on average, so this shader shifts from being arithmetic limited to partially texturing limited with higher AF.

More samples at 2X or 4X AF than ATI's 16X? After all, we are seeing drops at pretty much every AF setting on the NVIDIA board.

The tests are already arithmetically limited anyway; I test these across each resolution earlier in the reviews, and you can clearly see they are fill-rate (shader) bound (with a slight system limitation at lower resolutions):

http://www.beyond3d.com/previews/nvidia/nv40/index.php?p=21
[graph: r3d.gif]


X800
[graph: image005.gif]


The point being that these are already arithmetically bound, even at lower resolutions, hence the texture sampling could be "free" if you are hiding the instructions and any other temporaries - in the case of the Radeon that is happening, but it's not with the 6800.
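This "free" texturing argument can be put in back-of-the-envelope form. The instruction counts below are hypothetical, and this is not the real R420/NV40 issue logic — just a sketch of the difference between a dedicated texture address processor and one that borrows a shader ALU:

```python
# Sketch: a shader with some arithmetic ops and some texture lookups.
# All numbers are invented for illustration.

def cycles_dedicated(arith_ops, tex_ops):
    # Separate texture address processor: address calculation runs
    # alongside the ALU, so lookups hide behind the arithmetic.
    return max(arith_ops, tex_ops)

def cycles_shared(arith_ops, tex_ops, addr_cost=1):
    # Texture addressing borrows a shader ALU for addr_cost cycles per
    # lookup, lengthening the arithmetic stream.
    return max(arith_ops + addr_cost * tex_ops, tex_ops)

# An arithmetic-heavy shader: 12 math ops, 4 texture lookups.
print(cycles_dedicated(12, 4))  # sampling fully hidden behind the math
print(cycles_shared(12, 4))     # address calc eats ALU issue slots
```

Under these assumptions the dedicated-unit case finishes in 12 cycles (the lookups cost nothing extra), while the shared-ALU case stretches to 16 — the 6800-style behaviour where texturing is not free even in an arithmetic-bound shader.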
 
Xmas .. while I could agree with you there about 0xAF and 2xAF, once you start getting to 8xAF and 16xAF the cards should start to show stress. I think that's what Dave was showing.

[graph: image030.gif]


Even the X800Pro with 12 pipes doesn't flinch.

So that might show why the X800 has superior pixel shader performance under AF: it has a separate texture address processor.
 
The percentage of pixels on which 16X AF needs to be applied is usually very small. It's actually backwards from what you say: the greatest speed bump is from 1X to 2X, because in a typical 3D scene a good percentage of the pixels will need 2X AF.

This is easy to visualise: consider a typical FPS environment with a chessboard floor stretching away into the distance. You'll see that it is rare for the squares to be 16 pixels wide by 1 pixel high, as this only happens near the vanishing point, so it's a fairly small area.

Typically we see that, very approximately, the additional penalty for each doubling of the samples is half the previous step's. Therefore, if there's no visible penalty for 2X AF, I wouldn't expect to see much more even at 16X.
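This argument can be put into a toy frame-cost model. The pixel distribution below is made up, but shaped by the chessboard reasoning: most of the frame needs little or no anisotropy, and the strip needing 16:1 is tiny.

```python
# Assumed (invented) fraction of pixels by the anisotropy degree they
# actually need -- heavily weighted toward low degrees.
need = {1: 0.70, 2: 0.20, 4: 0.06, 8: 0.03, 16: 0.01}

def frame_cost(af_setting):
    # Each pixel takes min(needed, setting) samples per lookup, so
    # raising the setting only costs more on pixels that can use it.
    return sum(frac * min(deg, af_setting) for deg, frac in need.items())

for s in (1, 2, 4, 8, 16):
    print(s, round(frame_cost(s), 2))
```

With these numbers, relative frame cost goes 1.00, 1.30, 1.50, 1.66, 1.74: the 1X-to-2X step is the biggest (+0.30) and the steps shrink from there, with 8X-to-16X adding only +0.08 — consistent with the tiny 8x-to-16x drops people report below.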
 
I'm having a hard enough time keeping up with the convo (not my best subject), but I notice that ATI only loses about 2-4% performance when going from 8x aniso to 16x aniso.

Has anyone else noticed that?
 
jvd said:
I'm having a hard enough time keeping up with the convo (not my best subject), but I notice that ATI only loses about 2-4% performance when going from 8x aniso to 16x aniso.

Has anyone else noticed that?

In what games? Nvidia only loses like 1 FPS in UT2003 going from 8x to 16x
 
ChrisRay said:
jvd said:
I'm having a hard enough time keeping up with the convo (not my best subject), but I notice that ATI only loses about 2-4% performance when going from 8x aniso to 16x aniso.

Has anyone else noticed that?

In what games? Nvidia only loses like 1 FPS in UT2003 going from 8x to 16x

Quake 3 and Doom 3 :). In D3D games like Far Cry I don't notice any hit at all
 
jvd said:
ChrisRay said:
jvd said:
I'm having a hard enough time keeping up with the convo (not my best subject), but I notice that ATI only loses about 2-4% performance when going from 8x aniso to 16x aniso.

Has anyone else noticed that?

In what games? Nvidia only loses like 1 FPS in UT2003 going from 8x to 16x

Quake 3 and Doom 3 :). In D3D games like Far Cry I don't notice any hit at all

Ahh, I haven't done any AF tests in OpenGL. That surprises me, the inconsistency :)
 
ChrisRay said:
Ahh, I haven't done any AF tests in OpenGL. That surprises me, the inconsistency :)
It's ATi. It's OpenGL. It's ATi's OpenGL. What did you expect? Same efficiency as in D3D? :?
 
I am still curious whether ATI still employs the texture stage optimisations in its control panel AF. It used to do trilinear on the first stage and bilinear on the others.

Don't know if that's still occurring. Does anyone know?
 
ChrisRay said:
I am still curious whether ATI still employs the texture stage optimisations in its control panel AF. It used to do trilinear on the first stage and bilinear on the others.

Don't know if that's still occurring. Does anyone know?

ATI will do first-stage AF and the rest bilinear unless the program asks for something else.
 
jvd said:
ChrisRay said:
I am still curious whether ATI still employs the texture stage optimisations in its control panel AF. It used to do trilinear on the first stage and bilinear on the others.

Don't know if that's still occurring. Does anyone know?

ATI will do first-stage AF and the rest bilinear unless the program asks for something else.

Hmm, Nvidia doesn't do that by default; texture stage optimisations are disabled.

You can turn them on with the Nvidia drivers and performance improves significantly, but AF optimisations are off by default.
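The stage-optimisation behaviour being discussed can be written down as a tiny policy function. This is purely illustrative — the stage numbering and the "app override" flag are assumptions, not either vendor's actual driver logic:

```python
# Illustrative driver policy for per-stage filtering under
# control-panel AF. Hypothetical, not real ATI/NVIDIA driver code.

def stage_filter(stage, app_request=None, opts_enabled=True):
    """Pick the filter for one texture stage."""
    if app_request is not None:   # the program asked -> honour it
        return app_request
    if not opts_enabled:          # opts off -> full quality everywhere
        return "trilinear"
    # Optimised default: full trilinear only on stage 0 (usually the
    # base texture); cheaper bilinear on detail/lightmap stages.
    return "trilinear" if stage == 0 else "bilinear"

print(stage_filter(0))                           # base texture
print(stage_filter(1))                           # secondary stage
print(stage_filter(1, app_request="trilinear"))  # app override wins
```

The point of the "app override" branch is the behaviour jvd describes: the optimisation is a default, not something forced over an application's explicit request.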
 
ChrisRay said:
jvd said:
ChrisRay said:
I am still curious whether ATI still employs the texture stage optimisations in its control panel AF. It used to do trilinear on the first stage and bilinear on the others.

Don't know if that's still occurring. Does anyone know?

ATI will do first-stage AF and the rest bilinear unless the program asks for something else.

Hmm, Nvidia doesn't do that by default; texture stage optimisations are disabled.

You can turn them on with the Nvidia drivers and performance improves significantly, but AF optimisations are off by default.
And? If the program doesn't ask for it, why bother doing it?
 
jvd said:
ChrisRay said:
jvd said:
ChrisRay said:
I am still curious whether ATI still employs the texture stage optimisations in its control panel AF. It used to do trilinear on the first stage and bilinear on the others.

Don't know if that's still occurring. Does anyone know?

ATI will do first-stage AF and the rest bilinear unless the program asks for something else.

Hmm, Nvidia doesn't do that by default; texture stage optimisations are disabled.

You can turn them on with the Nvidia drivers and performance improves significantly, but AF optimisations are off by default.
And? If the program doesn't ask for it, why bother doing it?

I'm just saying, texture stage optimisations can make a big difference in regards to performance.

If you saw my NV40 anisotropic filtering investigation over at nvnews, on Nvidia cards it improves performance significantly. So now it makes you wonder. Nvidia has these opts disabled by default (Nvidia's texture stage opts are a bit more aggressive than ATI's, IIRC).
 
I believe NVIDIA's trilinear optimizations are enabled by default. Easily turned off, though, and yes, the filtering optimizations give roughly a 20% boost across the board, according to my somewhat limited testing.
 
ChrisRay said:
I'm just saying, texture stage optimisations can make a big difference in regards to performance.

If you saw my NV40 anisotropic filtering investigation over at nvnews, on Nvidia cards it improves performance significantly. So now it makes you wonder. Nvidia has these opts disabled by default (Nvidia's texture stage opts are a bit more aggressive than ATI's, IIRC).

I visit nvnews about as often as I visit rage3d, which is once a month.

If Nvidia has it disabled by default then that is Nvidia's problem, is it not?

If the game requests it then ATI does it. It is not forced.

So if Far Cry wants it on through all texture stages then it will do it on all texture stages.

It will not force Far Cry to do it on only the first texture stage.

I do not know if Nvidia's version will always force it or lets the application choose.
 