Anisotropic Filtering & Multi-texturing on GF3/4 a no-go

LeStoffer

I just read Typedef Enum's review on www.nvnews.net...

http://www.nvnews.net/reviews/Leadtek_ti4400/page_8.shtml

...and something interesting caught my eye: When Anisotropic Filtering is enabled the fill-rate (3Dmark2001) drops down to a certain level and stays there regardless of aniso-level and regardless of single-texturing or multi-texturing.

I just verified this on my own GF4: Without aniso, the fill-rate is 1062 MTexels (single) and 2320 MTexels (multi). With aniso (2x, 4x or 8x) it drops to a solid 612 MTexels for both single and multi.

I know that 3Dmark2001SE might be half the culprit here, but it still seems to suggest that GF3/4 uses its multi-texturing "engine" to do aniso when enabled.
 
hrm.... it's possible the aniso engine is taking as many texel samples as the pipeline can per clock, therefore taking more (enabling multitexture), drops the fillrate a corresponding amount.

This helps explain the steep performance hit (halving fillrate). It also helps explain why the performance hit appears less when coupled with FSAA... aniso mostly takes more cycles to make up the fill hit, while multisampling takes more bandwidth, but no more fillrate.

Is the fillrate still halved when using bilinear, aniso, and multitexturing?
 
rhink said:
hrm.... it's possible the aniso engine is taking as many texel samples as the pipeline can per clock, therefore taking more (enabling multitexture), drops the fillrate a corresponding amount.

This helps explain the steep performance hit (halving fillrate). It also helps explain why the performance hit appears less when coupled with FSAA... aniso mostly takes more cycles to make up the fill hit, while multisampling takes more bandwidth, but no more fillrate.

Is the fillrate still halved when using bilinear, aniso, and multitexturing?

You got it backwards, Aniso takes more bandwidth, MS AA takes more fill-rate. ^_^;
 
Tagrineth said:
You got it backwards, Aniso takes more bandwidth, MS AA takes more fill-rate. ^_^;

No.
Not on the GeForce3/4 at least.

On the Radeons Aniso only has a bandwidth hit.
On (all) GeForces it also has a fillrate hit and that is larger than the bandwidth one.

The advantage of MSAA is that it has no fill-rate hit, only bandwidth.
If you'd combine MSAA with tile-rendering it would be completely free.
 
No, MSAA is mainly a fill rate hit.

However the gf4 has a separate unit that drastically reduces this.

MSAA has a very small bandwidth hit, as for the majority of pixels (i.e. the ones not on an edge) only one texture read is needed.
 
Re: Anisotropic Filtering & Multi-texturing on GF3/4 a n

LeStoffer said:
I just read Typedef Enum's review on www.nvnews.net...

http://www.nvnews.net/reviews/Leadtek_ti4400/page_8.shtml

...and something interesting caught my eye: When Anisotropic Filtering is enabled the fill-rate (3Dmark2001) drops down to a certain level and stays there regardless of aniso-level and regardless of single-texturing or multi-texturing.

I just verified this on my own GF4: Without aniso, the fill-rate is 1062 MTexels (single) and 2320 MTexels (multi). With aniso (2x, 4x or 8x) it drops to a solid 612 MTexels for both single and multi.

I know that 3Dmark2001SE might be half the culprit here, but it still seems to suggest that GF3/4 uses its multi-texturing "engine" to do aniso when enabled.

That is an interesting bit. I think this is the chart you refer to. Thanks.

anisotropic_fillrate.gif
 
Bambers said:
No, MSAA is mainly a fill rate hit.

Let me clear this up: when I talk about fillrate (as opposed to bandwidth), I mean theoretical fillrate. This is what the card would achieve without bandwidth limitations.

The actual fillrate is limited by two factors: the bandwidth requirement and the memory controller efficiency. If the bandwidth required for the theoretical fillrate (handicapped by inefficiencies) is larger than the available memory bandwidth, then the actual fillrate will be below the theoretical one.
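
A minimal sketch of that relationship in Python. The function name, the single "efficiency" factor, and the example numbers are hypothetical and only illustrate the reasoning; they are not measurements from any card.

```python
def actual_fillrate(theoretical_mpix, bandwidth_mb_s, efficiency, bytes_per_pixel):
    """Return the achievable fillrate in MPixels/s.

    theoretical_mpix: fillrate without any memory limitation (MPixels/s)
    bandwidth_mb_s:   raw memory bandwidth (MB/s)
    efficiency:       fraction of raw bandwidth the controller delivers (0..1)
    bytes_per_pixel:  memory traffic required per rendered pixel
    """
    # Pixels per second the memory system can feed (MB/s divided by B/pixel).
    bandwidth_limited = bandwidth_mb_s * efficiency / bytes_per_pixel
    # The card cannot exceed either limit.
    return min(theoretical_mpix, bandwidth_limited)


# Hypothetical numbers: heavy per-pixel traffic makes memory the limit...
memory_bound = actual_fillrate(1000, 8000, 0.5, 8)
# ...while light per-pixel traffic leaves the theoretical rate as the limit.
pipeline_bound = actual_fillrate(1000, 8000, 0.5, 2)
```

With these made-up inputs the first case is capped by memory and the second by the pipelines, which is the distinction being drawn between "actual" and "theoretical" fillrate.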

However the gf4 has a separate unit that drastically reduces this.

What unit is this?
AFAIK, what nVidia did is that they optimized the memory controller. (More efficiency)

MSAA has a very small bandwidth hit, as for the majority of pixels (i.e. the ones not on an edge) only one texture read is needed.

Let me see (example is the 3DMark2001 fillrate test).

No AA: 1 texel read (4bit), 1 pixel read (32bit), 1 pixel write (32bit)
2x AA: 1 texel read (4bit), 2 pixel read (2x32bit), 2 pixel write (2x32bit)

It's a 68 bit/pixel -> 132 bit/pixel increase. This is a 94% increase.
You call this a small one ???
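
The arithmetic above can be checked with a short Python sketch. It assumes exactly the traffic model from the post (one 4-bit texel read, plus one 32-bit frame-buffer read and one 32-bit write per sample); the function name is made up for illustration.

```python
def bits_per_pixel(samples, texel_bits=4, pixel_bits=32):
    # One texel read, plus one pixel read and one pixel write per AA sample.
    return texel_bits + samples * pixel_bits * 2


no_aa = bits_per_pixel(1)
aa_2x = bits_per_pixel(2)
increase = (aa_2x - no_aa) / no_aa * 100

print(f"{no_aa} -> {aa_2x} bits/pixel, +{increase:.0f}%")
# prints "68 -> 132 bits/pixel, +94%"
```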
 
Re: Anisotropic Filtering & Multi-texturing on GF3/4 a n

LeStoffer said:
... regardless of aniso-level ...

GeForces do Aniso "on-demand". My guess is that the texture in the 3DMark 2001 fillrate benchmark is not distorted that much to require larger than 2x aniso. If it was the result would be even worse.
 
The GF3 & 4 only take an AA fill rate hit for pixels where the subsamples are taken from different polygons. The Ti4600 can do 4.8 billion AA samples/sec. Unfortunately it doesn't compress redundant frame buffer information (multisampling uses the same color for each subsample). I think it's safe to assume the bandwidth is the limiting factor unless aniso and/or heavy pixel shaders are used.

(edit)
I'm hoping for compressed frame buffer with JGMS for the NV30. Of course compression is unnecessary if Gigapixel tech is utilized, but I doubt that will happen.
 
easyride said:
4 Z-check units per pipe.

Even the GF3 can reject 16 pixels/clock based on the Z-check. That's 4 per pipe.
Also the 3DMark2001 fillrate test seems to do alpha-blending but no z-buffering (I might be wrong though).
 
The gf4 has a separate unit to do multisampling in parallel with the rest of the rendering. I'm not sure how it works, but the gf4 can do 4 times as many AA samples/s as pixels/s, compared to the gf3's twice as many.
 
Bambers said:
The gf4 has a separate unit to do multisampling in parallel with the rest of the rendering. I'm not sure how it works, but the gf4 can do 4 times as many AA samples/s as pixels/s, compared to the gf3's twice as many.

Both GeForce 3 and 4 are capable of producing 4 AA samples per pixel pipe (through the use of the 4 Z units per pipe).
 
GeForces do Aniso "on-demand". My guess is that the texture in the 3DMark 2001 fillrate benchmark is not distorted that much to require larger than 2x aniso. If it was the result would be even worse.

It's not some special "on demand" feature! That's just how anisotropic filtering works! Extra samples are taken depending on the angle of the polygon to the viewplane.

Now, AFAIK, the polygons in the fillrate test are actually PARALLEL to the viewplane, and as such, should not be aniso'd at all!! So Aniso should have ZERO effect on this test.
On a Radeon, what effect does Aniso have on the 3DMark fillrate test?
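
A rough Python sketch of the angle dependence argued above. It assumes a deliberately simplified model that is not taken from any driver or hardware documentation: a surface tilted by some angle away from the viewplane stretches a pixel's texture footprint by about 1/cos(angle), and the filter takes that many probes, rounded up and capped at the selected aniso level. Real hardware works from texture-coordinate derivatives, so treat the function and its numbers as illustration only.

```python
import math


def aniso_samples(tilt_degrees, max_aniso):
    """Approximate probes per pixel under the simplified 1/cos(tilt) model."""
    # Footprint stretch grows as the surface turns away from the viewplane.
    ratio = 1.0 / math.cos(math.radians(tilt_degrees))
    # The selected aniso level (2x, 4x, 8x, ...) caps the number of probes.
    return min(max_aniso, math.ceil(ratio))


for tilt in (0, 80, 89):
    print(tilt, aniso_samples(tilt, 8))
```

Under this model a polygon parallel to the viewplane (tilt 0) needs only a single probe at any aniso setting, which is why a screen-aligned fillrate test would be expected to show no aniso cost.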
 
Re: Anisotropic Filtering & Multi-texturing on GF3/4

Hyp-X said:
LeStoffer said:
... regardless of aniso-level ...

GeForces do Aniso "on-demand". My guess is that the texture in the 3DMark 2001 fillrate benchmark is not distorted that much to require larger than 2x aniso. If it was the result would be even worse.

You might be right on the money here. The fill-rate (multi) drops almost to 1/4 with aniso, which only makes sense when you think about the fact that the GF3/4 otherwise can apply 4 textures in one pass (DX8).

So GF3/4 multi texture ability is killed when aniso is enabled (according to 3Dmark2001 that is).
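
For reference, the "almost to 1/4" figure can be checked against the GF4 numbers from the first post with a few lines of Python (the variable names are only illustrative):

```python
no_aniso = {"single": 1062, "multi": 2320}  # MTexels/s, from the first post
with_aniso = 612                            # MTexels/s, same for both modes

# Fraction of the original fill-rate that survives once aniso is enabled.
remaining = {mode: with_aniso / rate for mode, rate in no_aniso.items()}

for mode, fraction in remaining.items():
    print(f"{mode}: {fraction:.0%} of the no-aniso fill-rate remains")
```

Single-texturing keeps a bit more than half of its fill-rate, while multi-texturing keeps roughly a quarter.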

Is there any chance that some of you folks have a contact within nVidia who could comment on this? Where is Ghost of Envy? 8)
 
On a Radeon, what effect does Aniso have on the 3DMark fillrate test?

It doesn't appear to have any effect at all. I threw in a couple of extra low-detail tests as a control to ensure anisotropy was properly enabled for the quick tests:


No anisotropic filtering.
Game 2 - Dragothic - Low Detail: 151.9 fps
Game 3 - Lobby - Low Detail: 126.5 fps
Fill Rate (Single-Texturing): 821.3 MTexels/s
Fill Rate (Multi-Texturing): 1807.6 MTexels/s

4x anisotropy
Game 2 - Dragothic - Low Detail: 140.4 fps
Game 3 - Lobby - Low Detail: 116.1 fps
Fill Rate (Single-Texturing): 821.2 MTexels/s
Fill Rate (Multi-Texturing): 1807.5 MTexels/s

16x anisotropy
Game 2 - Dragothic - Low Detail: 138.0 fps
Game 3 - Lobby - Low Detail: 106.3 fps
Fill Rate (Single-Texturing): 820.9 MTexels/s
Fill Rate (Multi-Texturing): 1807.5 MTexels/s
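
To make the comparison easier to read, here is a small Python sketch that turns the Radeon numbers above into percentage changes relative to the no-aniso run. The dictionary keys and the helper name are made up for illustration; the values are copied from the results above.

```python
# (no aniso, 4x, 16x) for each test, copied from the results above.
results = {
    "game2_fps": (151.9, 140.4, 138.0),
    "game3_fps": (126.5, 116.1, 106.3),
    "fill_single_mtexels": (821.3, 821.2, 820.9),
    "fill_multi_mtexels": (1807.6, 1807.5, 1807.5),
}


def percent_change(base, value):
    # Negative result means a drop relative to the no-aniso baseline.
    return (value - base) / base * 100


changes = {
    name: (percent_change(base, x4), percent_change(base, x16))
    for name, (base, x4, x16) in results.items()
}

for name, (c4, c16) in changes.items():
    print(f"{name}: 4x {c4:+.2f}%, 16x {c16:+.2f}%")
```

The game tests drop by several percent or more, so anisotropy was clearly active, while both fill-rate tests move by well under a tenth of a percent.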
 
Sharkfood said:
It doesn't appear to have any effect at all. I threw in a couple of extra low-detail tests as a control to ensure anisotropy was properly enabled for the quick tests:

This supports my recollection/theory that the Fillrate polygons ARE parallel to the viewplane, and as such, Anisotropic should have ZERO effect.
 
Althornin said:
This supports my recollection/theory that the Fillrate polygons ARE parallel to the viewplane, and as such, Anisotropic should have ZERO effect.

Yeah, when I come to think about it, we might have had this discussion before.

Anyway: There still seems to be something fishy about the way nVidia determines where and when aniso filtering should be applied. Could anyone check whether this hit in multi texturing with aniso also happens in Serious Sam? I don't have the game, but there is a benchmark for pure multi texturing, isn't there?
 