Why do we still have fixed-function AA and filtering?

der_turmann

Newcomer
Is it just me, or is this an outdated concept in dire need of removal? With the introduction of programmable pixel shaders, is there really any point in dedicating significant amounts of silicon to fixed-function anti-aliasing/filtering algorithms and sample patterns? First of all, it would finally end the idiotic, never-ending debate about which manufacturer has the better-quality AA, and it would also allow developers to use their own, possibly more efficient algorithms.

Sorry if I'm coming off as brash and ignorant, but I mainly just want some feedback from the more learned members of this community. I just find it ironic that with all this talk about programmability, we're still stuck with this seemingly useless approach. Personally, I don't like using any kind of anti-aliasing or texture filtering unless necessary, because it eats up bandwidth for a minimal gain in image quality. If anything, an increase in screen resolution will cure all my jaggies.
 
AA and texture filtering require lots of processing, and since the algorithms aren't going to need to change much from scene to scene or from surface to surface, there is little reason not to go fixed-function.

Now, with the NV3x, you could conceivably do texture filtering in software, but it would be, at the very fastest, 1/4 the speed of hardware texture filtering. Due to caches and whatnot, it may well be much slower.

Additionally, fragment shading seems to be headed toward higher-precision operations, while the vast majority of textures are, and probably will continue to be, simply 8 bits per channel. Spending high-precision shader throughput on filtering 8-bit data would be a poor trade next to cheap, dedicated fixed-point filtering hardware.
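
To make the cost concrete, here is a rough sketch in C (my own toy code, not any real shader API) of what a single bilinear lookup involves when emulated in software: four point-sampled texel fetches plus a stack of lerps, all of which dedicated filtering hardware turns around in a single cycle. Presumably that four-fetches-for-one ratio is where the "1/4 the speed, at best" figure comes from.

#include <math.h>

typedef struct { float r, g, b; } Texel;

static float lerp(float a, float b, float t) { return a + (b - a) * t; }

/* 'fetch' stands in for a point-sampled texel read; u and v are
   texture coordinates already scaled to texel units. */
Texel bilinear(Texel (*fetch)(int x, int y), float u, float v)
{
    int   x0 = (int)floorf(u), y0 = (int)floorf(v);
    float fu = u - (float)x0,  fv = v - (float)y0;

    /* four fetches where a filtered hardware sample needs one lookup */
    Texel t00 = fetch(x0, y0),     t10 = fetch(x0 + 1, y0);
    Texel t01 = fetch(x0, y0 + 1), t11 = fetch(x0 + 1, y0 + 1);

    Texel out;
    out.r = lerp(lerp(t00.r, t10.r, fu), lerp(t01.r, t11.r, fu), fv);
    out.g = lerp(lerp(t00.g, t10.g, fu), lerp(t01.g, t11.g, fu), fv);
    out.b = lerp(lerp(t00.b, t10.b, fu), lerp(t01.b, t11.b, fu), fv);
    return out;
}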
 
Still, it goes pretty much against the whole purpose of DirectX 9. Why not just go back to fixed-function bump mapping, specularity, vertex lighting, etc.? I mean, they must be faster, since all the computation is already built into the hardware (I know it sounds a bit radical, but you get my point).
 
der_turmann said:
Still, it goes pretty much against the whole purpose of DirectX 9. Why not just go back to fixed-function bump mapping, specularity, vertex lighting, etc.? I mean, they must be faster, since all the computation is already built into the hardware (I know it sounds a bit radical, but you get my point).
Those functions are less computationally intensive and lend themselves well to a programmable pipeline.

I think the main difference here is that texture filtering is applied to nearly all surfaces, whereas many of these other effects are applied much more selectively. That makes it more sensible to go programmable, so the resources that would be dedicated to, say, DOT3 bump mapping can be used for other things when DOT3 is not being performed.

But you do make a valid point. As various pixel shading techniques come to the fore, it may be good for performance to nix the transistors currently used for texture filtering and use them to beef up the programmable units. But one major problem remains: the MIP level and anisotropic degree selection algorithms don't lend themselves well to programmable hardware. That is, the hardware dedicated to those algorithms won't work well for much of anything else in 3D.

In the end, we're just not there yet. If fixed-function texture filtering is ever removed, it will be removed for the sake of speed. Too many games today work best with traditional texture filtering for it to be beneficial to go programmable. (Programmable filtering would be good for surfaces that don't lend themselves well to traditional filtering, which includes pretty much any non-color texture, such as a normal map.)
 
Chalnoth said:
But you do make a valid point. As various pixel shading techniques come to the fore, it may be good for performance to nix the transistors currently used for texture filtering and use them to beef up the programmable units. But one major problem remains: the MIP level and anisotropic degree selection algorithms don't lend themselves well to programmable hardware. That is, the hardware dedicated to those algorithms won't work well for much of anything else in 3D.

For MIP levels, would it be possible for developers to set a certain Z-value at which the application would apply a downsampled texture? That's probably a very vague and uneducated assumption, but hey, it's an idea. :)
 
der_turmann said:
For MIP levels, would it be possible for developers to set a certain Z-value at which the application would apply a downsampled texture? That's probably a very vague and uneducated assumption, but hey, it's an idea. :)
MIP level selection depends more on the angle of the surface than on the distance (though distance also factors in). The basic unit of computation is:

sqrt(x^2 + y^2)

Where sqrt is the square root, and x and y are the partial derivatives of the two texture coordinates with respect to one screen coordinate (which takes both distance and angle into account).

Anyway, the important thing here is that these are scalar operations, whereas much of the rest of the hardware is geared for vector ops.
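
As a minimal sketch of that unit in C (my own toy code, not how any particular chip wires it up), with x and y being the partials of the two texture coordinates along one screen axis:

#include <math.h>

/* x = du/dx, y = dv/dx (in texel units); returns the MIP level
   contribution for that screen axis, 0 meaning the base texture. */
float mip_level(float x, float y)
{
    float rho = sqrtf(x * x + y * y);   /* the sqrt(x^2 + y^2) above */
    return log2f(rho);                  /* the log is taken afterwards */
}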
 
Chalnoth said:
Anyway, the important thing here is that these are scalar operations, whereas much of the rest of the hardware is geared for vector ops.
Not always... the complete formula is:

max(sqrt((du/dx)^2 + (dv/dx)^2), sqrt((du/dy)^2 + (dv/dy)^2))

which is often approximated by:

max(max(|du/dx|, |dv/dx|), max(|du/dy|, |dv/dy|))

This is acceptable since it's 'conservative' in the sense that it never selects a blurrier MIP level than the exact formula (it can only err toward sharpness). Another way of writing it is:

max(|du/dx|, |dv/dx|, |du/dy|, |dv/dy|)

Which is the maximum of a 4D vector, so in a way it could be a vector operation instead of a scalar one. It could also be implemented differently for a quality mode, where you do need the scalar operations. I didn't really read the rest of the thread, so just ignore me if this is irrelevant...
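
For illustration, here's a toy C comparison of the exact formula and the max approximation (my own code and naming, nothing vendor-specific). The two agree on axis-aligned gradients and differ by up to a factor of sqrt(2) on diagonal ones:

#include <math.h>
#include <stdio.h>

/* maximum of a "4D vector", as described above */
static float max4(float a, float b, float c, float d)
{
    float m1 = a > b ? a : b;
    float m2 = c > d ? c : d;
    return m1 > m2 ? m1 : m2;
}

float rho_exact(float dudx, float dvdx, float dudy, float dvdy)
{
    float rx = sqrtf(dudx * dudx + dvdx * dvdx);
    float ry = sqrtf(dudy * dudy + dvdy * dvdy);
    return rx > ry ? rx : ry;
}

float rho_approx(float dudx, float dvdx, float dudy, float dvdy)
{
    return max4(fabsf(dudx), fabsf(dvdx), fabsf(dudy), fabsf(dvdy));
}

int main(void)
{
    /* worst case for the approximation: a diagonal gradient */
    printf("exact:  %.3f\n", rho_exact(1.0f, 1.0f, 0.0f, 0.0f));   /* 1.414 */
    printf("approx: %.3f\n", rho_approx(1.0f, 1.0f, 0.0f, 0.0f));  /* 1.000 */
    return 0;
}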
 
The original post also talked of AA, but you guys only replied about AF, hehe.

ATI's AA is programmable, AFAIK. Sure, it's not 100% programmable, just the sample patterns, and you could ask for even more programmability (and what would you use it for?...), but it's still fairly good.
Not sure if they're exposing that in any way, though. They love to claim it's useless because they've got the "best patterns" right now anyway, hehe. Which is quite true anyway... (maybe one of you ATI guys could clarify that? ;))

What would potentially be nice is programmable AA using the Rampage trick. I doubt nVidia has it patented, but it could be...
That is, loop back even in MSAA when you need more samples than you have Z units. The NV3x supports that, but it isn't exposed in the drivers, which means the NV3x is also capable of 8x and 16x MSAA, potentially even more.

And that's exposed in the Quadro FX, too ;)
Although it can degrade performance more than normal MSAA, because, for example, if you've got 8x MSAA but only 4 Z units, the hardware is forced to loop back whether you plan for it or not.
So the end result is the cost of 4x MSAA + 2x SSAA, with none of the quality advantages of the SSAA part!
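
As a back-of-the-envelope sketch (my own toy code; the 4 Z units are just the figure mentioned above), the pass count simply grows with the sample count divided by the number of Z units:

#include <stdio.h>

int main(void)
{
    int z_units = 4;  /* assumed per-pipe Z units, as above */
    for (int samples = 2; samples <= 16; samples *= 2) {
        /* each extra group of z_units samples costs another pass */
        int passes = (samples + z_units - 1) / z_units;
        printf("%2dx MSAA: %d pass(es) through the Z units\n",
               samples, passes);
    }
    return 0;
}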

Still, it *is* a very nice trick... I wouldn't be surprised at all if nVidia exposed it in a future NV3x product!


Uttar
 
Uttar, nVidia's 8x AA *is* mixed supersampling/multisampling. Check B3D's recent review of the FX 5900 Ultra.
 
Nick said:
Not always... the complete formula is:

max(sqrt((du/dx)^2 + (dv/dx)^2), sqrt((du/dy)^2 + (dv/dy)^2))

which is often approximated by:

max(max(|du/dx|, |dv/dx|), max(|du/dy|, |dv/dy|))
I think this was the case with nVidia's TNT line, and with ATI up through the Radeon 8500. ATI currently appears to be doing a variation on the above approximation, while nVidia currently does something that looks very close to the actual formula (though there are differences in some cases... I'm not sure exactly how those fit into the math, though I have a suspicion that they sometimes subtract instead of add... dunno why, though).

One quick note: if I remember correctly, a logarithm is always taken of the result anyway. To me, this means the square root is a trivial operation: done after the logarithm, it becomes a simple division by two. It's just an aside, but worth thinking about if one is interested in how hard the above operations actually are to implement. The most expensive part of the full operation is squaring the partials. (Another aside: I suspect that a large portion of the performance problems associated with nVidia's anisotropic filtering come down to this selection issue.)
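
A quick sketch of that aside in C (toy code): since log2(sqrt(s)) = 0.5 * log2(s), taking the logarithm first turns the square root into a multiply by one half.

#include <math.h>

float lod_with_sqrt(float s)      /* s = x^2 + y^2 */
{
    return log2f(sqrtf(s));
}

float lod_without_sqrt(float s)   /* same result, no square root */
{
    return 0.5f * log2f(s);
}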
 
Tagrineth said:
Uttar, nVidia's 8x AA *is* mixed supersampling/multisampling. Check B3D's recent review of the FX 5900 Ultra.

AFAIK, all modes higher than 4x MSAA (whether exposed through the driver panel or third-party applications) are hybrid modes.

8x = 4xOGMS + 2xOGSS
8xS = 2xRGMS + 4xOGSS

Here's an article for reference:

http://www.3dcenter.de/artikel/geforcefx_aa_modi/index_english.php

Although not officially supported, the 8xS mode is the most efficient; any other sampling mode with a higher value makes little sense.
 
I think the main reason not to use a "randomized" AA pattern is Z generation and Z compression. A fixed pattern makes Z generation and compression much easier (simple patterns, such as the one the NV30 used, are even easier).

reever said:
And while we're at it, why don't we design a display system that doesn't cause aliasing?

Because the aliasing is not caused by the display system; it's caused by the 3D rendering process.
 
reever said:
And while we're at it, why don't we design a display system that doesn't cause aliasing?

See the answer above. Anti-aliasing is a necessity in sound processing, too.
 
In reply to the original poster's comments.

Do you appreciate NVIDIA's mixed SS/MS algorithms? If you do, consider how they would be exposed as D3D caps: they can't be, because DX doesn't allow for it, and therefore your application can't offer users all those NVIDIA 4xS, 6xS and 8x modes.

While it may be good for reviewers (saves them a good deal of time!) to have, and only use, application-controlled AA/AF, I can certainly appreciate the effort NVIDIA has put into their "not supported by the API" AA modes.
 
We should definitely not appreciate the royal crap that nVidia is putting on the table, with respect to their long-in-the-tooth, slap-it-together AA modes.

I mean, call a spade a spade. nVidia should be ashamed of themselves for continuing to push these crappy-ass modes. I pretty much feel that any sample level higher than 4 is just a complete and total waste. I would be willing to put $$ on the table that virtually nobody who owns a 5900 Ultra enables any of those modes, because they:

a. take too much of a performance hit
b. barely look any better than the 4-sample variety

Anyhow...If nVidia doesn't have something totally revolutionary in the way of AA for NV40...
 
Typedef Enum said:
a. take too much of a performance hit
b. barely look any better than the 4-sample variety

Agreed on (a), but (b) is where we part ways.

4xOGMS = 2*2 EER
8xS = 4*4 EER

At the same resolution, the difference between the two in terms of EER alone is more than just a bit. Even running 4xOGMS at 1600*1200 (wherever possible) isn't going to give you better edge quality than 8xS at a much lower resolution.

The latter IS slow though; 1/4th of the fillrate says it all.
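
For concreteness, a toy calculation of that EER comparison in C (1600*1200 is from the post above; the lower resolution is just an illustrative pick of mine):

#include <stdio.h>

int main(void)
{
    /* 4xOGMS: 2*2 ordered grid -> doubles the edge resolution */
    printf("1600x1200 + 4xOGMS: EER %dx%d\n", 1600 * 2, 1200 * 2);
    /* 8xS (2xRGMS + 4xOGSS): effectively a 4*4 grid on edges */
    printf("1024x768  + 8xS:    EER %dx%d\n", 1024 * 4, 768 * 4);
    return 0;
}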
 
Tag: I ain't saying the GeForce FX 5900's 8x isn't a hybrid mode. I'm talking about the Quadro FX 2000 and 3000 here (probably the 1000 too; no idea about the 500).
Remember, nVidia has different driver teams, including one for workstation-only cards, and their priorities are very different.

Before I insist much more on this, I'm gonna have to recheck that - wouldn't want to look like an idiot, again ;)

Uttar
 
Typedef Enum said:
We should definitely not appreciate the royal crap that nVidia is putting on the table, with respect to their long-in-the-tooth, slap-it-together AA modes.

I mean, call a spade a spade. nVidia should be ashamed of themselves for continuing to push these crappy-ass modes. I pretty much feel that any sample level higher than 4 is just a complete and total waste. I would be willing to put $$ on the table that virtually nobody who owns a 5900 Ultra enables any of those modes, because they:

a. take too much of a performance hit
b. barely look any better than the 4-sample variety

Anyhow...If nVidia doesn't have something totally revolutionary in the way of AA for NV40...
There is no doubt in my mind that some purchasers of high-end video cards don't play UT2003 or Splinter Cell or 3DMark03 or any of the more recent and demanding games/apps. For those gamers still enjoying much older, less demanding games, 4xS, 6xS and 8x may really be a benefit.
 
Uttar said:
Tag: I ain't saying the GeForce FX 5900's 8x isn't a hybrid mode. I'm talking about the Quadro FX 2000 and 3000 here (probably the 1000 too; no idea about the 500).
Remember, nVidia has different driver teams, including one for workstation-only cards, and their priorities are very different.

Before I insist much more on this, I'm gonna have to recheck that - wouldn't want to look like an idiot, again ;)

Uttar

If you install a "Quadro Only" driver on a GeForce, you'll get an additional 16x mode (which is 12x in D3D and 16x in OGL, BTW).
When I asked nVidia about this, they said that 16x is only meant for the Quadro.

16x is a mixed mode, which leads me to believe that it's just a general misconception that the Quadro supports more than 4x MSAA.
Though just as you I'll have to add:
"Before I insist much more on this, I'm gonna have to recheck that - wouldn't want to look like an idiot, again ;)"
heh
 