When enough is enough (AF quality on g70)

Ailuros said:
There aren't or shouldn't be any optimisations enabled at high quality. All optimisations on for 77.77 means all optimisations enabled.

I do speak German since I grew up in Germany, if you've forgotten:

It shouldn't? HQ does not equal no optimizations. HQ can theoretically stand for anything. What NVidia declares as HQ may not be the same definition I or anybody else has, and the definition NVidia had with NV40 can apparently change over time with newer hardware.

I know you can read German, but it was more for the people who can't, so they can better understand what the article tries to do.
 
It shouldn't? HQ does not equal no optimizations. HQ can theoretically stand for anything. What NVidia declares as HQ may not be the same definition I or anybody else has, and the definition NVidia had with NV40 can apparently change over time with newer hardware.

Then you might want to ask computerbase for explanations, because this one here:

...that is, the gap between the results with all optimisations (77.77), with the "arrow optimisation" (76.10), and without any optimisations (71.84, identical to the image quality of the NV4x chip)...

...doesn't explain whether what they call "all optimisations" for the red-coloured bars is actually "high quality". The only optimisation that stays enabled in high quality in current drivers is the shimmering thing; all others get disabled, and there is a 25-30% performance difference between optimisations on and off on either the 6800 or the 7800.
 
To be clear - these are the trilinear/mip-map sample optimisations that the driver enables/disables - not any app-specific optimisations built into the driver profiles, if any. Correct?
 
Presumably it's possible to do two things:

1. see if the 6800U gets the same performance improvement over this driver set

2. see if the 6800U when "device ID tweaked" to make it look like a 7800GTX gets the same performance improvement

Jawed
 
2. see if the 6800U when "device ID tweaked" to make it look like a 7800GTX gets the same performance improvement

I have actually been trying this without much success. I have spent a lot of time trying to find options to give people a workaround for the shimmering. Honestly, I'm a bit perplexed by these findings. I modified the INI files myself and installed the 7800GTX on the 71.89 drivers, and my results were disastrous.
 
A bit off course...

I just wonder: is it possible at the driver or HW level to rotate the angled AF pattern together with the 3D scene, instead of fixing it to the 2D screen (which is mostly what causes the shimmering)?
That way (considering the present situation with AF filtering on NV40/G70) we wouldn't get full 16x quality anyway, but at least the shimmering would mostly be suppressed?!
 
fellix said:
I just wonder: is it possible at the driver or HW level to rotate the angled AF pattern together with the 3D scene, instead of fixing it to the 2D screen (which is mostly what causes the shimmering)?
That way (considering the present situation with AF filtering on NV40/G70) we wouldn't get full 16x quality anyway, but at least the shimmering would mostly be suppressed?!

Texture anisotropy is produced by the projection of a pixel from the 2D framebuffer onto a texture. This projection is what creates the anisotropy, as the projected shape may not be a regular square. However, in current implementations a texture sample is either a point or a 'square' of four texels (plus the two mipmaps for trilinear). To resolve the anisotropy (read: a non-squarish projected footprint), anisotropic filtering takes additional texture samples (squares) to form a shape more similar to the projected one. For example, if you measure anisotropy only along the x axis and the projected shape is longer along x than it is wide along y, you may take two or more texture samples along the direction of the x axis projected into the texture. So, obviously, the measure of the anisotropy is based on the 2D framebuffer (actually on the derivatives of the texture coordinates u and v over x and y, AKA dudx, dvdx, dudy, dvdy).
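To put hypothetical numbers on that (my example, purely for illustration): suppose a pixel's derivatives are

$$\frac{\partial u}{\partial x} = 1,\quad \frac{\partial v}{\partial x} = 0,\quad \frac{\partial u}{\partial y} = 0,\quad \frac{\partial v}{\partial y} = 4.$$

The pixel's footprint in texture space is then a 1×4 rectangle: the texture is sampled at the right rate along screen x but four times too coarsely along screen y, so the anisotropy degree is 4 and the filter would take roughly four samples spread along the y direction.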

In any case I keep saying that taking more AF samples at those angles isn't going to help unless you change the AF algorithm to support and measure anisotropy on additional axes (I may try someday to add more axes to the simulator to see how it looks), or you use an AF algorithm that doesn't measure the anisotropy along fixed axes but from the shape of the projected sample area, and that allows taking AF samples in a grid rather than along a line; in my opinion that isn't very feasible without more transistors and likely more latency/pipeline stages.
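As a rough illustration of what measuring anisotropy on additional axes could look like, here is a small sketch (mine, not simulator code; the function name and signature are my own) that evaluates the texture scale along an arbitrary screen direction:

Code:
#include <cmath>

/*  Texture scale along an arbitrary screen-space direction theta,
    generalizing the four fixed axes discussed here. With theta = 0 this
    equals px in the simulator code below and with theta = pi/2 it equals
    py; theta = +/-pi/4 matches the rotated pxy/pyx pair up to a sqrt(2)
    factor, since that code uses the unnormalized direction (1, 1).  */
float scaleAlong(float theta,
                 float dudx, float dvdx, float dudy, float dvdy)
{
    float c = std::cos(theta);
    float s = std::sin(theta);

    /*  Directional derivative of (u, v) along the unit screen direction
        (c, s); its length is the texture 'stretch' in that direction.  */
    float du = dudx * c + dudy * s;
    float dv = dvdx * c + dvdy * s;

    return std::sqrt(du * du + dv * dv);
}

An algorithm with more axes would presumably evaluate this for several values of theta and pick the direction pair with the largest ratio of maximum to minimum scale.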
 
Randell said:
To be clear - these are the trilinear/mip-map sample optimisations that the driver enables/disables - not any app-specific optimisations built into the driver profiles, if any. Correct?

I haven't found a way yet to disable the "optimisation" that causes the shimmering; thus when I read "all optimisations on" or "all optimisations off", I take them to mean the 3 optimisations available in the driver control panel.

If I'm not misreading the computerbase article and the real difference is only between green and blue, then I really start to wonder whether this one is actually intentional. And if it is, who on God's green earth would degrade IQ that much for such a small performance increase....
 
RoOoBo said:
Texture anisotropy is produced by the projection of a texture sample (in current implementations a 'square' of one to four texels, plus the two mipmaps for trilinear) over the 2D framebuffer. The projection on the 2D framebuffer is what creates the anisotropy, as the projected shape may not be a regular square. So, obviously, the measure of the anisotropy is based on the 2D framebuffer (actually on the derivatives of the texture coordinates u and v over x and y, AKA dudx, dvdx, dudy, dvdy).

In any case I keep saying that taking more AF samples at those angles isn't going to help unless you change the AF algorithm to support and measure anisotropy on additional axes (I may try someday to add more axes to the simulator to see how it looks), or you use an AF algorithm that doesn't measure the anisotropy along fixed axes but from the shape of the projected sample area, and that allows taking AF samples in a grid rather than along a line; in my opinion that isn't very feasible without more transistors and likely more latency/pipeline stages.


Since you have a fair understanding of the math involved in AF algorithms, could you think of any possible flaw/error in the equation that could cause some sort of underfiltering, yet only in a specific area of the screen?
 
Ailuros said:
Since you have a fair understanding of the math involved in AF algorithms, could you think of any possible flaw/error in the equation that could cause some sort of underfiltering, yet only in a specific area of the screen?

No.

In any case I don't think the equations are some kind of secret. The anisotropy extension shows the equations for measuring the anisotropy on two axes: X and Y. The ATI R2xx family seems to implement precisely that algorithm. The current ATI and NVidia algorithm combines the measured anisotropy for four axes: X and Y plus the -45° and 45° diagonals, which are precisely the linear combinations of the derivative vectors used for the X and Y axes.
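For reference, the two-axis measure from the OpenGL EXT_texture_filter_anisotropic extension goes roughly like this (reproduced from memory, so check the spec text itself):

$$P_x = \sqrt{\left(\frac{\partial u}{\partial x}\right)^{2} + \left(\frac{\partial v}{\partial x}\right)^{2}},\qquad P_y = \sqrt{\left(\frac{\partial u}{\partial y}\right)^{2} + \left(\frac{\partial v}{\partial y}\right)^{2}}$$

$$N = \min\left(\left\lceil \frac{\max(P_x,P_y)}{\min(P_x,P_y)} \right\rceil,\; maxAniso\right),\qquad \lambda = \log_2\frac{\max(P_x,P_y)}{N}$$

where $N$ is the number of anisotropic samples and $\lambda$ the LOD used for each of them. The px/py lines at the top of the code below compute exactly this pair of scales, extended with the rotated pxy/pyx pair.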

This is the code the simulator is currently using. The sample positions calculated for the 45° axes aren't the correct ones, though.

Code:
        /*  Calculate the texture scale along the horizontal and vertical screen axes.  */
        px = GPU_SQRT(dudx * dudx + dvdx * dvdx);
        py = GPU_SQRT(dudy * dudy + dvdy * dvdy);

        /*  Calculate the texture scale along the XY/YX axes (rotated 45 degrees).  */
        pxy = GPU_SQRT((dudx + dudy) * (dudx + dudy) + (dvdx + dvdy) * (dvdx + dvdy));
        pyx = GPU_SQRT((dudx - dudy) * (dudx - dudy) + (dvdx - dvdy) * (dvdx - dvdy));

        /*  Calculate the minimum and maximum scale for the X/Y axis pair.  */
        pMin = GPU_MIN(px, py);
        pMax = GPU_MAX(px, py);

        /*  Calculate ratio for X/Y axis.  */
        N = GPU_MIN(pMax/pMin, f32bit(maxAniso));

        /*  Calculate the minimum and maximum scale for the rotated axis pair.  */
        pMin2 = GPU_MIN(pxy, pyx);
        pMax2 = GPU_MAX(pxy, pyx);

        /*  Calculate ratio for XY/YX axis.  */
        N2 = GPU_MIN(pMax2/pMin2, f32bit(maxAniso));

        /*  Determine which axis pair shows the larger anisotropy.  */
        if (N >= N2)
        {
            if (pMax == px)
                axis = TextureAccess::X_AXIS;
            else
                axis = TextureAccess::Y_AXIS;

            /*  Calculate the number of samples required.  */
            samples = u32bit(GPU_CEIL(N));

            /*  Calculate the texture scale for each sample.  */
            //scale = pMax / f32bit(samples);
            scale = pMax / N;
        }
        else
        {
            /*  Determine the anisotropy axis.  */
            if (pMax2 == pxy)
                //axis = TextureAccess::XY_AXIS;
                axis = TextureAccess::X_AXIS;
            else
                //axis = TextureAccess::YX_AXIS;
                axis = TextureAccess::Y_AXIS;

            /*  Calculate the number of samples required.  */
            samples = u32bit(GPU_CEIL(N2));

            /*  Calculate the texture scale for each sample.  */
            //scale = pMax / f32bit(samples);
            scale = pMax / N2;
        }

        /*  Calculate the per anisotropic sample offsets in s,t space.  */
        switch(axis)
        {
            case TextureAccess::X_AXIS:
//printf("X ");
                dsOffset = dudx / f32bit(samples + 1);
                dtOffset = dvdx / f32bit(samples + 1);
                break;
            case TextureAccess::Y_AXIS:
//printf("Y ");
                dsOffset = dudy / f32bit(samples + 1);
                dtOffset = dvdy / f32bit(samples + 1);
                break;
            case TextureAccess::XY_AXIS:
//printf("XY ");
                dsOffset = (dudx + dudy) / f32bit(samples + 1);
                dtOffset = (dvdx + dvdy) / f32bit(samples + 1);
                break;
            case TextureAccess::YX_AXIS:
//printf("YX ");
                dsOffset = (dudx - dudy) / f32bit(samples + 1);
                dtOffset = (dvdx - dvdy) / f32bit(samples + 1);
                break;
        }

        /*  Normalize to s,t space : [0..1].  */
        dsOffset = dsOffset / f32bit(1 << textureWidth2[textUnit]);
        dtOffset = dtOffset / f32bit(1 << textureHeight2[textUnit]);
    }

It first calculates how much the texture coordinates 'grow' along the x vs. the y screen directions and derives a scale factor (the value used to calculate the LOD, or which mipmap to access) for each of the two axes. Those scale factors are used to calculate the anisotropy degree for that pair of axes. Then it does the same for the -/+45° screen axes. Finally it decides which of the two axis pairs has the larger degree of anisotropy and selects the axis along which the AF samples will be taken.
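To illustrate how those per-sample offsets would then be consumed, here is a minimal sketch (my code, not the simulator's; sampleTexel() and every name in it are assumed stand-ins for the simulator's bilinear/trilinear fetch):

Code:
#include <cmath>

struct Color { float r, g, b, a; };

/*  Stand-in texel fetch (a procedural checkerboard) so the sketch is
    self-contained; in a real implementation this would be the bilinear
    or trilinear lookup.  */
static Color sampleTexel(float s, float t)
{
    bool on = (long(std::floor(s * 8.0f)) + long(std::floor(t * 8.0f))) % 2 == 0;
    float v = on ? 1.0f : 0.0f;
    return { v, v, v, 1.0f };
}

/*  Average `samples` taps spaced along the selected anisotropy axis,
    centred on (s, t), using the dsOffset/dtOffset computed above.  */
Color anisoFilter(float s, float t, unsigned samples,
                  float dsOffset, float dtOffset)
{
    Color sum = { 0.0f, 0.0f, 0.0f, 0.0f };

    for (unsigned i = 0; i < samples; i++)
    {
        /*  Map i = 0 .. samples-1 to offsets symmetric around the centre.  */
        float k = float(i) - float(samples - 1) * 0.5f;
        Color c = sampleTexel(s + k * dsOffset, t + k * dtOffset);
        sum.r += c.r; sum.g += c.g; sum.b += c.b; sum.a += c.a;
    }

    /*  Equal-weight average of the anisotropic taps.  */
    float w = 1.0f / float(samples);
    return { sum.r * w, sum.g * w, sum.b * w, sum.a * w };
}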

The selection algorithm may change though, and I have doubts that the implemented one is the best. In the code, XY (+45°) becomes X when taking the samples and YX (-45°) becomes Y, because the function that calculates the actual texel positions wasn't working properly on those axes and I was too lazy to correct it. I was just trying to find the correct 'form' and then I got busy with other things. I wasn't researching AF, just trying to add another feature to the simulator in some spare time. Maybe someday I will have time to add MSAA ...
 
Ailuros said:
I haven't found a way yet to disable the "optimisation" that causes the shimmering; thus when I read "all optimisations on" or "all optimisations off", I take them to mean the 3 optimisations available in the driver control panel.

Yep - I had the impression other people thought you meant something else.
 
It'd be nice if, when you have an option to turn off optimizations for a specific function like AF, it actually turned them all off. Otherwise you end up needing 'leventy buttons --

Turn off AF Opt A
Turn off AF Opt B
Turn off AF Opt C. . .

etc, etc
 
digitalwanderer said:
How in the hell did I miss this thread? :rolleyes:

Be back after reading about 9 pages...

Yes, I wonder... there are two streams right now for R520: one is making it faster than G70, the other is slowing the G70 down with reasonable IQ.

(some nv fanboi will now scream "FIXED")

I say... well, that's about 1.5 years late I reckon... and I don't think there will be "no performance reduction".
 
neliz said:
Yes, I wonder... there are two streams right now for R520: one is making it faster than G70, the other is slowing the G70 down with reasonable IQ.

I hate to say it but we'll have to wait another couple of months for the real product to be tested/analyzed.

I say... well, that's about 1.5 years late I reckon... and I don't think there will be "no performance reduction".

There's a reason why I'm trying to get an answer from folks like RoOoBo on whether he can detect a flaw in the algorithm.
 
Regarding the computerbase article and benchmarks: the page describing the benchmark setup clearly states that ALL benchmarks were made with the driver set to HIGH QUALITY, without any optimizations enabled in the control panel:
http://www.computerbase.de/artikel/..._nvidias_g70_texturen/5/#abschnitt_benchmarks
"Als Qualitätseinstellung haben wir „High Quality“ ohne jegliche Optimierungen benutzt, da dort die vorhandenen Bildunterschiede zwischen den Chipgenerationen sehr groß sind."

The different bars represent the different optimizations which are hardcoded in the driver for the high quality setting, as demonstrated on the preceding pages.

The red bar is driver version 77.77 with all optimizations (the shimmering and the arrow optimization).
The blue bar is the 76.10 driver with only the arrow optimization.
The green bar is the 71.84 driver, which seems to have no additional optimization compared to the NV40.

http://www.computerbase.de/artikel/hardware/grafikkarten/2005/bericht_nvidias_g70_texturen/6/
 
With respect to that article, how do we know that the performance improvement is solely due to additional optimizations, given that the older drivers do not officially support the G70 and were "hacked" for the purpose of the comparison?
 
I'd really like to try the installation .inf that they used. I tried to replicate the results but Windows gave me all sorts of errors (including a darkened grey screen), so I must have done something wrong.
 
trinibwoy said:
With respect to that article, how do we know that the performance improvement is solely due to additional optimizations, given that the older drivers do not officially support the G70 and were "hacked" for the purpose of the comparison?

That's one important point, the next being the confusion it caused me while reading it. In any case, the author clearly states at the end that the results aren't yet solid enough for conclusions.
 