My R9700 experience and Aniso still "flawed"?

KimB · Sep 12, 2002

One poster here commented that when programming for the X-Box, disabling anisotropic on one of the texture stages can result in a very significant increase in performance.

So, another explanation may be that the GeForce4 can only calculate the degree of anisotropy once per pixel pipeline, so that if two textures go through that are both set to use anisotropic, then one has to wait for the next clock.

I could probably confirm this if I could find the settings for disabling anisotropic on every other texture stage. I thought they were available in RivaTuner, but can't seem to find them.

Humus · Sep 12, 2002

Chalnoth said:
I believe the main thing that ATI does to make its implementation a little bit more adaptive is that it reduces the amount of anisotropic filtering for textures that don't need it (such as lightmaps).

Lightmaps are typically small and will thus be magnified in almost all cases, so that shouldn't make any measurable difference.

KimB · Sep 12, 2002

Yes, Humus, which is why I feel it's definitely a good idea.

KimB · Sep 12, 2002

Which brings me to another point.

If the primary limitation of the GeForce3/4's anisotropic is indeed a failure to be able to compute the anisotropic degree for more than one texture per pixel pipeline per clock, then it may be that other hardware does not need to specifically disable anisotropic for certain texture stages for optimal performance.

After all, the lightmaps in question, since they are usually quite low-res, should not need any degree of anisotropic in most situations.

alexsok · Sep 12, 2002

I thought they were available in RivaTuner, but can't seem to find them.

They are indeed available in the latest release of RivaTuner, which you can download here

But in order to use the unique feature you're talking about, you'll need 30.30+ Detonator drivers, or the "optimize" option will be grayed out.

Xmas · Sep 12, 2002

Chalnoth said:
Notice that there is absolutely no additional performance hit from increasing the degree of anisotropic. This is simply because no anisotropic is actually being applied (something else is, apparently, causing the performance drop...hopefully the NV30 will fully address this issue). All polygons in this fillrate test are parallel to the viewplane, in which situation the anisotropic specifications require no anisotropy.

This is not true. The only reason for the "performance drop" is that anisotropic filtering really IS required here.
Remember, the degree of anisotropy depends on du/dx, dv/dx, du/dy and dv/dy only. If you put a 1024² texture on a 800x600 polygon, you need anisotropic filtering.

I did a lot of testing on my GF3 regarding AF. I wrote a simple program that just renders thousands of 1000x1000 quads with two 16² textures (totally fillrate limited) that are "stretched" across the quads.
There is no inherent fillrate hit either from using two textures or from simply enabling AF. Fillrate suffers only if AF really is applied.

2xAF is applied when the u/v ratio is above ~1.00034 (i.e. if I use texture coordinates [0, 0] [0, 1] [1.0004, 1] [1.0004, 0]), 4x above ~2.004 and 8x above ~4.02.

Of course I can only speak for GeForce 3 here.
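
As a rough illustration of those transitions, the measured thresholds can be written as a small lookup in Python. This is only a sketch: the function name and table are hypothetical, the numbers are the approximate GeForce 3 values reported above (not anything from a spec), and the ratio is assumed to be the larger texture step divided by the smaller one, so it is always at least 1.

```python
# Approximate u/v ratio thresholds reported above for a GeForce 3.
# Each entry is (ratio that must be exceeded, AF level applied).
GF3_THRESHOLDS = [(4.02, 8), (2.004, 4), (1.00034, 2)]

def observed_af_level(ratio):
    """Return the AF level the measurements suggest for a given u/v ratio.

    Hypothetical helper: `ratio` is assumed to be >= 1.
    """
    for threshold, level in GF3_THRESHOLDS:
        if ratio > threshold:
            return level
    return 1  # below the first threshold, no anisotropic filtering

for r in (1.0, 1.0004, 3.0, 5.0):
    print(r, observed_af_level(r))
```

With the texture coordinates quoted above (ratio 1.0004), the lookup lands just past the first threshold and reports 2x.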

darkblu · Sep 12, 2002

Xmas said:
2xAF is applied when the u/v ratio is above ~1.00034 (i.e. if I use texture coordinates [0, 0] [0, 1] [1.0004, 1] [1.0004, 0]), 4x above ~2.004 and 8x above ~4.02.

Of course I can only speak for GeForce 3 here.

Would you mind sharing where you got the above figures from? Devised them empirically?

Xmas · Sep 12, 2002

darkblu said:
Xmas said:

2xAF is applied when the u/v ratio is above ~1.00034 (i.e. if I use texture coordinates [0, 0] [0, 1] [1.0004, 1] [1.0004, 0]), 4x above ~2.004 and 8x above ~4.02.

Of course I can only speak for GeForce 3 here.

Would you mind sharing where you got the above figures from? Devised them empirically?

Yes. Like I said, I wrote a small program rendering thousands of quads, and counted seconds (10000 quads of 1000x1000 pixels take about 15 seconds on my Ti200). I adjusted the texture coordinates to find out where the transitions occur. (Of course I took the time only to check if performance really always cuts in half. It's easy to spot AF with colored mipmaps.)
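
For reference, the timing above translates into a fill rate with simple arithmetic. The snippet below is only a sketch of that calculation; the function name is made up for illustration, and the 30-second figure is just what "cuts in half" would imply, not a separate measurement.

```python
def pixels_per_second(quads, width, height, seconds):
    """Pixels drawn per second for `quads` full quads of width x height."""
    return quads * width * height / seconds

# 10000 quads of 1000x1000 pixels in about 15 seconds
base = pixels_per_second(10_000, 1000, 1000, 15)
print(f"{base / 1e6:.0f} Mpixels/s")    # about 667 Mpixels/s

# If performance halves when AF kicks in, the same run takes about 30 seconds
halved = pixels_per_second(10_000, 1000, 1000, 30)
print(f"{halved / 1e6:.0f} Mpixels/s")  # about 333 Mpixels/s
```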

KimB · Sep 13, 2002

Update:
Hrm, just reread your previous post. I'll leave what I just posted (as usual...I generally don't like to delete things), but I now see why a square texture applied to a non-square surface would require anisotropic filtering, even if that surface is viewplane coplanar.

Xmas said:
This is not true. The only reason for the "performance drop" is that anisotropic filtering really IS required here.
Remember, the degree of anisotropy depends on du/dx, dv/dx, du/dy and dv/dy only. If you put a 1024² texture on a 800x600 polygon, you need anisotropic filtering.

That is not true. Here's a link to the EXT_texture_filter_anisotropic extension specification:

http://oss.sgi.com/projects/ogl-sample/registry/EXT/texture_filter_anisotropic.txt

Here are the equations for generating the proper degree of anisotropic:

Px = sqrt(dudx^2 + dvdx^2)
Py = sqrt(dudy^2 + dvdy^2)

Pmax = max(Px,Py)
Pmin = min(Px,Py)

N = min(ceil(Pmax/Pmin),maxAniso)
Lamda' = log2(Pmax/N)

N is the degree of anisotropic calculated (which is generally rounded up to the nearest power of 2).

For those who have a hard time reading equations, or would rather not, here's a little explanation.

The Px and Py calculations should be recognizable as akin to the Pythagorean theorem. The result is that Px is, effectively, the number of texels crossed per pixel step in the screen x direction, whereas Py is the number of texels crossed per pixel step in the y direction. Pmax and Pmin simply sort the two into the larger and the smaller.

Notice how the degree of anisotropy chosen, N, is dependent upon the ratio of Pmax/Pmin. This means that the absolute magnitude of any of the partial derivatives is inconsequential. It's the ratios that matter.
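
To make the equations concrete, here is a minimal Python sketch of them. The function name and the default `max_aniso=8` are assumptions for illustration, and the code assumes both Px and Py are nonzero; it is a direct transcription of the formulas above, not how any particular hardware implements them.

```python
import math

def aniso_degree(dudx, dvdx, dudy, dvdy, max_aniso=8):
    """Return (N, lambda') following the equations quoted from the spec.

    Assumes neither Px nor Py is zero (hypothetical helper, no guards).
    """
    px = math.hypot(dudx, dvdx)   # texels crossed per pixel step in x
    py = math.hypot(dudy, dvdy)   # texels crossed per pixel step in y
    pmax, pmin = max(px, py), min(px, py)
    n = min(math.ceil(pmax / pmin), max_aniso)
    lod = math.log2(pmax / n)
    return n, lod

# Uniform scaling: the ratio is 1, so N = 1 regardless of magnitude
print(aniso_degree(2.0, 0.0, 0.0, 2.0)[0])
print(aniso_degree(0.1, 0.0, 0.0, 0.1)[0])

# A 1024x1024 texture stretched over an 800x600 polygon, viewplane coplanar:
# du/dx = 1024/800, dv/dy = 1024/600, ratio is about 1.33, so N = 2
print(aniso_degree(1024 / 800, 0.0, 0.0, 1024 / 600)[0])
```

The last case is the one from the quoted post: the surface is parallel to the viewplane, but the texture is stretched unevenly, so the ratio exceeds 1 and the formula asks for 2x.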

I figured that lower-resolution textures would generally select lower degrees of anisotropy, however, as whenever a texture is magnified in all directions, there is no need to use any degree of anisotropy:

The particular scheme for anisotropic texture filtering is
implementation dependent. Additionally, implementations are free
to consider the current texture minification and magnification modes
to control the specifics of the anisotropic filtering scheme used.

KimB · Sep 13, 2002

Thanks, alexsok, for the link.

With that version of RivaTuner, I did some tests.

When anisotropic filtering was disabled for stages 1 and 3, there was no difference in performance (either with ST or MT) in the fillrate tests, compared with normal anisotropic.

When anisotropic filtering was disabled for stages 0 and 2, there was no performance drop from enabling anisotropic.

This leads me to believe that my conjecture is correct: only one texture per pixel pipeline per clock can have a degree of anisotropic calculated for it, and if the first texture has anisotropic calculated, then the second texture does not go through the pipeline in the same clock.

Thus, it seems apparent to me that the best configuration for anisotropic filtering on GeForce3/4 cards is to place any high resolution texture (particularly the base texture) in stage 1 or 3, and any texture for which anisotropic might be disabled, such as a lightmap, in stage 0 or 2.

This really does seem like a bass ackwards way of doing things, so it might be nice if nVidia's drivers could switch around the stages when anisotropic was disabled for stage 1 or 3 (This might be nontrivial to accomplish, of course...).

WaltC · Sep 14, 2002

Chalnoth said:
Thanks, alexsok, for the link.

With that version of RivaTuner, I did some tests.

When anisotropic filtering was disabled for stages 1 and 3, there was no difference in performance (either with ST or MT) in the fillrate tests, compared with normal anisotropic.

When anisotropic filtering was disabled for stages 0 and 2, there was no performance drop from enabling anisotropic.

I did some similar testing with my GF4 4600 (before I retired it), but rather than run synthetic benchmarks I ran some actual 3D games I'd been playing. Regardless of how I tinkered with the stages, even setting two of them at a time to no filtering at all (and trying various combinations between the 4 stages RT allows you to control), I could find no performance difference that was apparent to me in any of the games I was playing, so in the end I just left all 4 stages set for full 8x AF.

KimB · Sep 14, 2002

While I would like to see some other situations, one of the things you must realize is that in a real gaming situation, there are often many polygons that are not viewplane coplanar, making it so that higher degrees of anisotropy are often used.

The thing is, for all polygons that are viewplane coplanar (and not distorted in any direction...), no anisotropy should be used. This is why the fillrate tests in 3DMark2k1 are of consequence.

By turning off anisotropic on the 0 and 2 stages, I think I've shown relatively effectively that at least part of the hit from enabling anisotropic stems from an inability to apply anisotropic to more than one texture per pixel pipeline.

I'd need to do some more testing, but I have a feeling that the other part has something to do with blending modes.

KimB · Sep 14, 2002

By the way, WaltC, I went ahead and did some real-game tests myself. I used Serious Sam: The Second Encounter for testing.

With stages 0 and 2 disabled, the score was much higher than with all stages enabled (85.8 fps vs. about 56 fps). As a side note, OpenGL with "performance anisotropic" ran the same demo at 68 fps on my machine.

One thing to keep in mind is that, at least in this version of RivaTuner, the particular stage settings for anisotropic are only available in Direct3D.
