Maybe I can help a bit with a somewhat better translation from Russian; I learned it in school (a long time ago, though):
--- START OF ARTICLE ---
Introduction
Questions about how texture filtering is implemented in GPUs have been coming up more often lately. One would assume that with the capabilities of current GPUs such a basic function as filtering would be performed perfectly, yet manufacturers have tended to cut corners ever since the early days of 3D. With current budgets (upwards of hundreds of millions) it is possible to perform perfect texture filtering.
But it appears that in some cases getting closer to "perfect" does not deliver what is expected. Our friends at 3dcenter.org were the first to notice this. NVIDIA used "optimized" trilinear filtering throughout the entire NV3x generation. But recently it turned out that ATI uses a similar "optimization" in its newer chips, from RV3x0 up to the latest R420.
The purpose of this article is to demonstrate and analyze the differences in how ATI and NVIDIA implement trilinear filtering, in both high-quality and optimized modes, on their latest generation of chips: R420 (Radeon X800) and NV40 (GeForce 6800), respectively.
Let us first go over some terms:
Trilinear filtering (trilinear MIP mapping) - a further improvement on bilinear filtering with MIP mapping. The result of trilinear filtering is a weighted mean of two bilinear samples taken from adjacent MIP levels of the texture. Depending on the ratio of pixel size to texel size, one of the two bilinear samples predominates. This method prevents flickering and abrupt changes in texture sharpness as the camera moves relative to objects.
Let me stress that this article does not investigate differences in bilinear or anisotropic filtering, only the specifics of how trilinear filtering is implemented.
Leaving aside details that are insignificant here, trilinear filtering depends on the computation of two values, Rho (1) and Lambda (2) [look at the original article for formulas].
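For readers without the original article at hand: the OpenGL specification defines these two values roughly as below in the 2D case, leaving out LOD bias and clamping. That the article's formulas (1) and (2) are written in exactly this form is an assumption on my part; u and v stand for texel coordinates, x and y for window coordinates.

```latex
\rho = \max\left\{
  \sqrt{\left(\frac{\partial u}{\partial x}\right)^{2} + \left(\frac{\partial v}{\partial x}\right)^{2}},\;
  \sqrt{\left(\frac{\partial u}{\partial y}\right)^{2} + \left(\frac{\partial v}{\partial y}\right)^{2}}
\right\} \qquad (1)

\lambda = \log_2 \rho \qquad (2)
```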
The formulas given above are taken from the OpenGL specification. Although Lambda obviously depends directly on Rho, this article uses both values side by side, and the reason will become clear later. Rho is the scale factor, and Lambda is the level of detail (LOD).
The final color value Tau is calculated as a linear interpolation between the colors sampled from two MIP levels: the integer part of Lambda determines which MIP levels the colors are taken from, and the fractional part of Lambda is the interpolation coefficient. All we need to know about formula (1), which defines Rho, is that the resulting value depends on the area of texture that maps onto one pixel of the final image.
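The interpolation just described can be sketched in a few lines of Python. This is only an illustration under assumptions: `trilinear_blend`, `sample_level` and `demo_level_color` are hypothetical names, the "bilinear sample" is replaced by a flat color per MIP level, Rho is assumed to be at least 1, and clamping to the available MIP levels is left out.

```python
import math

def trilinear_blend(rho, sample_level):
    """Blend two adjacent MIP levels for scale factor rho (assumed >= 1)."""
    lam = math.log2(rho)            # Lambda, the level of detail
    level = math.floor(lam)         # integer part selects the more detailed level
    frac = lam - level              # fractional part is the interpolation coefficient
    fine = sample_level(level)      # stand-in for a bilinear sample
    coarse = sample_level(level + 1)
    return tuple((1.0 - frac) * f + frac * c for f, c in zip(fine, coarse))

# Hypothetical test texture: level 1 is red, every other level is white.
def demo_level_color(level):
    return (1.0, 0.0, 0.0) if level == 1 else (1.0, 1.0, 1.0)

print(trilinear_blend(1.0, demo_level_color))             # pure level 0
print(trilinear_blend(math.sqrt(2.0), demo_level_color))  # about halfway between levels 0 and 1
```

With a texture like this, any pixel that is not pure white shows where the red level contributes and with what weight.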
Our regular visitors have seen images like these [look at the original article for images] many times. They are produced by rendering an infinite cylindrical tunnel whose walls are covered with a special texture. What makes this texture special is that it consists of MIP levels each of which has a different color. What is the point of these images? With one look at them it is possible to determine what level of filtering the 3D accelerator uses at different degrees of texture magnification/reduction and with isotropic/anisotropic texture mapping, and also how the filtering level depends on the angle at which textures are applied. And all of that can be determined from just one image!
Hmm... at least it was possible until recently.
...[explanation of what can be seen on the picture, not too relevant so I snipped it out]
The shape of the lobes shown in the picture above depends on the method by which the scale factor Rho is calculated. With trilinear filtering, the colors in the picture must blend smoothly.
Below we will show similar pictures that use a somewhat simplified texture: all of its MIP levels are white except for the second one by level of detail, which is painted red. With such a texture it is evident, perfectly evident, where that MIP level is used and how trilinear interpolation between adjacent MIP levels occurs.
All images and results were obtained with the TextureFilteringTester program developed at iXBT, which makes it possible to investigate the quality and speed of the texture filtering algorithms implemented by 3D accelerators.
So, let's see what the image looks like with high-quality (non-optimized) trilinear filtering on ATI R420 and NVIDIA NV40. Each picture below is hyperlinked to a larger (uncompressed) PNG version.
How were the images obtained? On NVIDIA NV40, the optimization was turned off in the control panel to obtain the image with non-optimized trilinear filtering.
ATI currently uses a more drastic approach to switching the trilinear filtering optimization on and off. When a texture is loaded, the driver analyzes the differences between its MIP levels, and if it decides that the less detailed MIP levels are not reduced copies of the previous ones, the optimization is turned off and the standard filtering method is used for that texture.
Therefore, to obtain the images with optimized trilinear filtering, the driver was deceived by using a special texture. In the course of experimenting with different textures it became clear that ATI enables the trilinear filtering optimization only for a certain class of textures, depending on how the application declares them. For example, in DirectX a texture can belong to one of three classes: static, dynamic and managed. In the current driver version, optimized trilinear filtering is enabled only for managed textures.
What immediate conclusion can be drawn from comparing the images above? To the naked eye, the completely white and completely red regions grow by roughly the same amount on both chips once optimized trilinear filtering is enabled. It is also evident that banding occurs in the images obtained from R420, whereas color blending on NV40 is exceptionally smooth.
Let us show you the image obtained with the reference rasterizer RefRast, which is part of the DirectX SDK.
It is evident that the image obtained with RefRast matches the image obtained on R420 with the filtering optimization turned off. For our readers' convenience we made difference images in Photoshop. As a result, the different types of filtering become visible to the naked eye, and at the same time the behavior of the different chips can be compared. The first conclusion is that the changes in trilinear filtering on all chips do not depend on the angle at which textures are applied. Another obvious conclusion is that the filtering results of R420 and RefRast are almost completely identical.
Let us look at what happens in real life. Below are four graphs built from experimental data, two for each of the chips R420 and NV40. One graph shows the dependence of the trilinear interpolation coefficient on Rho, and the other its dependence on Lambda (see the formulas and description at the beginning of the article). Each graph contains two curves, one for standard and one for optimized trilinear filtering, plus one for RefRast.
This explains why two graphs were needed for each chip. It is evident that the linear interpolation carried out by ATI and NVIDIA chips is based on different variables: ATI uses Rho, and NVIDIA uses Lambda. This becomes obvious if we look at the point where the interpolation coefficient equals 1/2. For ATI the coefficient reaches 1/2 at Rho = 1.5, and for NVIDIA at Lambda = 0.5. It is also evident that the ATI chip in non-optimized mode carries out trilinear interpolation equivalent to DirectX RefRast; we will get back to that shortly.
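To make the midpoint argument concrete, here is a small Python comparison of the two choices of variable. The function names are hypothetical and the code only mirrors the description above; it is not taken from any driver or from RefRast.

```python
import math

def weight_from_lambda(rho):
    """Coefficient linear in Lambda = log2(Rho)."""
    lam = math.log2(rho)
    return lam - math.floor(lam)

def weight_from_rho(rho):
    """Coefficient linear in Rho within one octave [2^n, 2^(n+1))."""
    base = 2.0 ** math.floor(math.log2(rho))
    return (rho - base) / base

for rho in (1.0, 1.25, math.sqrt(2.0), 1.5, 1.75):
    print(f"rho={rho:.3f}  lambda-based={weight_from_lambda(rho):.3f}  "
          f"rho-based={weight_from_rho(rho):.3f}")
```

The Rho-based coefficient reaches 1/2 at Rho = 1.5, while the Lambda-based one reaches it earlier, at Rho = sqrt(2) (about 1.414, i.e. Lambda = 0.5). Between the two ends of the octave the Rho-based coefficient is always the smaller one, so the more detailed MIP level gets more weight.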
Let's analyze the graphs:
- It is obvious that NVIDIA in standard (non-optimized) mode carries out trilinear filtering very close to the ideal described by the OpenGL specification (at least for 32-bit textures).
- ATI in standard mode uses higher weight coefficients for the more detailed MIP levels.
- NVIDIA in its optimized trilinear filtering uses only bilinear filtering within ±0.16 of an integer value of Lambda (LOD).
- ATI in its optimized trilinear filtering uses bilinear filtering over a section that covers approximately 30% of the entire range, and that section is located on the side where the more detailed MIP levels are used.
Based on these observations, the following conclusions can be drawn:
(a) the share of the range where ATI and NVIDIA "optimize", that is, where only bilinear filtering is used, is practically equal
(b) with both standard and optimized trilinear filtering, images obtained on ATI chips will sometimes appear sharper and images obtained on NVIDIA chips more blurred, but during camera motion the images obtained on ATI chips are more susceptible to moire and sand-like effects (dithering) that appear as a result of under-filtering.
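The two "optimized" modes described in the list can be modelled roughly as a remapping of the interpolation coefficient. This is a sketch under assumptions: the article only gives the size and position of the bilinear-only sections, so the linear ramp between them and all function names here are my own guesses, not measured behavior.

```python
def optimized_weight(t, low, high):
    """Remap coefficient t in [0, 1): bilinear outside [low, high], linear ramp inside."""
    if t <= low:
        return 0.0      # only the more detailed MIP level is sampled
    if t >= high:
        return 1.0      # only the less detailed MIP level is sampled
    return (t - low) / (high - low)

def nvidia_like(t):
    # bilinear within +/-0.16 of an integer Lambda, i.e. at both ends of the range
    return optimized_weight(t, 0.16, 0.84)

def ati_like(t):
    # bilinear over roughly 30% of the range, on the more detailed side only
    return optimized_weight(t, 0.30, 1.0)

bilinear_share_nvidia = 0.16 + (1.0 - 0.84)   # about 0.32
bilinear_share_ati = 0.30
print(bilinear_share_nvidia, bilinear_share_ati)
```

Under this model the bilinear-only share comes out at about 32% for NVIDIA and 30% for ATI, which matches conclusion (a).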
Now let us return to the analysis of the graphs obtained on R420 and RefRast. Two things stand out: the staircase shape of the graph, and the linear dependence of the interpolation coefficient on Rho rather than on Lambda, as the OpenGL specification recommends.
The staircase shape of the graph is caused by the fact that DirectX requires only 5-bit accuracy for the interpolation coefficient. As I already mentioned above, this was first noticed by our friends at 3DCenter.org. And although I would like trilinear interpolation to be carried out with greater accuracy, the DirectX specifications do not require it, and vendors are free to do whatever they want as long as they meet the specification.
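A 5-bit coefficient can take only 32 distinct values, which is where the steps come from. A minimal Python sketch, assuming the coefficient is simply truncated (whether the hardware truncates or rounds is not stated in the article, and `quantize_weight` is a hypothetical name):

```python
def quantize_weight(weight, bits=5):
    """Truncate a coefficient in [0, 1) to the given number of fractional bits."""
    steps = 1 << bits                  # 32 steps for 5 bits
    return int(weight * steps) / steps

for w in (0.0, 0.04, 0.5, 0.51, 0.99):
    print(w, quantize_weight(w))
```

Each step is 1/32 wide, so a smooth gradient between two MIP levels turns into 32 visible bands at most.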
Let us examine the second thing - the nonlinear dependence of the trilinear interpolation coefficient on Lambda. What does that mean? The point is that calculating a logarithm is an expensive operation that requires a large number of transistors. That is why an approximate calculation is used [see formula in original article]
Then x can be taken from the exponent of Rho, and y from its mantissa. This very optimization is used both in RefRast and in ATI chips.
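The formula itself is not reproduced in this translation. A common approximation of this kind, and one consistent with the coefficient reaching 1/2 at Rho = 1.5, is log2(2^x * y) ≈ x + (y - 1) with y in [1, 2). Assuming that is what the article means, it can be sketched in Python as follows (`approx_log2` is a hypothetical name):

```python
import math

def approx_log2(rho):
    """Piecewise-linear log2: exact at powers of two, linear in between."""
    m, e = math.frexp(rho)     # rho == m * 2**e with 0.5 <= m < 1
    x = e - 1                  # exponent once the mantissa is rescaled to [1, 2)
    y = 2.0 * m                # mantissa in [1, 2)
    return x + (y - 1.0)

for rho in (1.0, 1.5, 2.0, 3.0, 4.0):
    print(f"rho={rho}  approx={approx_log2(rho):.3f}  exact={math.log2(rho):.3f}")
```

The fractional part of this approximation is exactly the Rho-based coefficient from the graphs, which would explain why the curves are straight lines when plotted against Rho and curved when plotted against Lambda.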
Conclusion
I won't go deeper into analyzing the quality of trilinear filtering, because in this situation its importance is subjective. Specifications that describe optimal filtering algorithms do exist and have not changed, but there are many of them, so vendors cannot implement every one.
But as GPUs become more capable and powerful, the desire to use them as co-processors for arbitrary calculations and tasks grows. In that area, strict conformance to the specifications is more often a requirement than just a wish. Examples of such GPU use can be found at www.gpgpu.org.
--- END OF ARTICLE ---
Just a small conspiracy theory from me: this looks like a deal between Microsoft and ATI to push the OpenGL standard out of the market.
The clear advice from this article would be that anyone who needs strict OpenGL conformance should not use ATI cards, because they have reduced precision in the logarithm calculation and because they do not follow the OpenGL recommendations (Rho vs. Lambda).
Others, who use their computer only for gaming, should decide with their own eyes what they like best and whether they would settle for lower precision in their pursuit of more FPS.
Am I concerned with all this? Does it hurt me? Yes: since I don't use my computer only for gaming, my next card will be NV4x-based.