question on nv30 and ati radeon 9700

Basic said:
Using higher precision in the filtering can reduce calculation errors. So if you have banding or noise that comes from lacking calculation precision in the filtering, then it's possible to remove it. (I don't think I've seen such errors though.)

It can't remove aliasing, since that needs a better filter pattern.

Honestly, is the amount of math done in trilinear filtering going to cause precision errors? I don't think so.
 
What do you base that on? Still, what is the precision here: 8 bits, 10 bits, 12 bits per color? If 128 samples are taken and each is being used in a calculation, why wouldn't precision be important?
 
I would guess the precision is exactly what is needed in every case on all hardware. Assuming 8-bit/channel sources: if you only want point sampling, 8 bits will do; if you want to filter 4 texels you need 2 more bits, that's 10 bits; for trilinear you'll then need 11 bits; for a full 16x anisotropic filter you'll thus need 15 bits.
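To put rough numbers on that, here is a small Python sketch (my own illustration, not anything posted in the thread): in a single-stage sum, every doubling of the sample count costs one more bit on top of the 8-bit sources.

import math

def bits_needed(samples, source_bits=8):
    # Bits required to hold the exact sum of `samples` values of
    # `source_bits` each, before the final divide back to 8 bits.
    return source_bits + math.ceil(math.log2(samples))

for name, samples in [("point", 1), ("bilinear", 4),
                      ("trilinear", 8), ("16x aniso + trilinear", 128)]:
    print(f"{name:22s} {samples:4d} samples -> {bits_needed(samples)} bits")

This prints 8, 10, 11 and 15 bits respectively, matching the figures above.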
 
Humus,

Could you break that down or explain it in more detail? I, and I think others, would much appreciate the discussion. Thanks :).
 
Humus said:
for a full 16x anisotropic filter you'll thus need 15 bits.

Not if it's done in stages.

Quick example, trilinear:

10 bits for bilinear-filtered sample one
10 bits for bilinear-filtered sample two

The two samples are down-sampled to 8 bits before moving on to the blending stage, requiring 9 bits for blending.

That is, as long as you have enough bits at every stage, you shouldn't need more than about 10 bits. After all, today's anisotropic only blends multiple bilinear-filtered samples. If a pixel pipeline (as the R300's presumably is) can do four bilinear samples per clock, then it would seem to make sense that it would blend in groups of four, requiring a precision of 10 bits for blending.
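A minimal sketch of that staged idea (purely illustrative; not a claim about how R300 or any other chip actually wires it up): every stage takes 8-bit inputs, so nothing wider than about 10 bits is ever needed.

def bilinear_avg(a, b, c, d):
    # Sum of four 8-bit texels fits in 10 bits; round back to 8 bits.
    return (a + b + c + d + 2) // 4

def trilinear_avg(mip0, mip1):
    s0 = bilinear_avg(*mip0)       # 8-bit intermediate
    s1 = bilinear_avg(*mip1)       # 8-bit intermediate
    return (s0 + s1 + 1) // 2      # sum fits in 9 bits

print(trilinear_avg((255, 255, 255, 255), (0, 0, 0, 0)))  # prints 128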
 
nevidimka said:
Well, some very wise guy told me about this once. I forgot the details, but from what I can recall.. trilinear in today's graphics cards is done in a single pass but with double pipelines. It has to do with a single pipeline not being enough to carry all the values for a trilinear, so double pipelines are used. So they aren't true trilinear. Trilinear done in a single pipeline is much better IQ-wise (sharper). The Rampage was capable of this: it did trilinear in a single pipeline, single pass. :)

I don't know what the wise guy means concerning true trilinear, since I agree with OpenGL Guy's definition of trilinear filtering, but I think I know what he meant by double pipelines. I think he was only referring to texture pipelines, not pixel pipelines. For example, using common definitions, if a chip has one pipeline and two texture units it can process one dual-textured, bilinear-filtered pixel per clock. Some architectures combine the two "texture units" to process one single-textured, trilinear-filtered pixel per clock. This doesn't necessarily tell us what the hardware is doing, but it is likely that two texture pipes are combined into one for trilinear filtering.
 
Chalnoth said:
If I was properly informed, the Radeon 9700 is capable of processing four bilinear-filtered pixels per clock. So, it should not only be capable of doing one trilinear-filtered pixel per clock, but should be able to do one 2-degree trilinear aniso pixel per clock (or one 4-degree bilinear aniso pixel per clock).

There has been no information on the processing power of the NV30's pipelines, but I would expect it to be similar.

Are you correct that the 9700 can only do 4 bilinear pixels per clock and not 8? I would figure an 8x1 architecture means the x1 can be bilinear filtered. If you're correct, that means the 8 pipes are great for Gouraud-shaded pixels like shadow volumes, but the extra 4 pipes are worthless for quality texturing.
 
I meant per pixel pipeline per clock, 3dcgi.

In other words, if the texture inputs were there, when plain bilinear filtering was enabled, the Radeon 9700 should be able to handle four textures per pixel pipeline per clock (i.e. an 8x4 architecture). Quick side note: there'd also have to be additional processing for the blending between different textures, but that's not a whole lot...

I believe the main reason it doesn't do this is simply because ATI didn't feel it was necessary to optimize for non-anisotropic situations. I personally support this decision. While an 8x2 architecture would provide a small speedup when anisotropic filtering is enabled, it wouldn't be much, and may not be worth it.
 
Is there a chance here the 'wise guy' was explaining why the V5 by default didn't do true trilinear, and mip-map dithering didn't count?
 
Althorin: Agreed.

My point was that the only errors that could be removed are calculation errors, not aliasing. And I did say that I haven't seen any errors that look like they stem from filtering calculation precision.

I think we already have enough precision in the filtering to make the most out of a 32 bit texture.
 
Randell, when he said graphics cards can't do true trilinear, he also said that the V5 can't do it either, not through mipmapping. He only said the Rampage could. And I think 3dcgi is correct. I guess over time the info I received has changed a little.. lol.. my memory is failing.. ehehe..

So that means to process a trilinear, 2 TMUs are needed in each pipeline. Thus cards whose pipelines include only 1 TMU, like the Radeon 9700, need 2 pipelines.. right? So the Rampage's single TMU was capable of processing a trilinear, and I guess that's what is shown in the pic.. the floor is clear at the far end. But how about the NV30?
 
nevidimka said:
Randell, when he said graphics cards can't do true trilinear

wrong
he also said that the V5 can't do it either, not through mipmapping. He only said the Rampage could...

yes

So that means to process a trilinear, 2 TMUs are needed in each pipeline. Thus cards whose pipelines include only 1 TMU, like the Radeon 9700, need 2 pipelines.. right?

No!

So the Rampage's single TMU was capable of processing a trilinear.

IMHO yes.

And I guess that's what is shown in the pic.. the floor is clear at the far end. But how about the NV30?

You mean anisotropic filtering, or? Trilinear blurs the image far away quite a lot.
 
Just because the Radeon 9700 has only 1 TMU per pipe, don't assume what it can apply to that TMU is related to old hardware like the V5.
 
nevidimka said:
So that means to process a trilinear, 2 TMUs are needed in each pipeline.
As stated before, this is not true. Trilinear filtering is simply a method for filtering a texture. It's based on taking a certain number of texture samples from the appropriate part of the texture. The number of TMUs/pipelines/thingamajigs is completely irrelevant. For example, the Savage 3D/4/MX/NB (and others) could do trilinear filtering with a single TMU and in a single cycle from a single mipmap. Now do you see what I am saying?
Thus cards whose pipelines include only 1 TMU, like the Radeon 9700, need 2 pipelines.. right?
No.
 
Let's see if I can sum up the bit requirements for filtering:

A bilinear has 4 samples from the source texture, each one having 8 bits/channel. 2^8 x 4 = 2^10 values, or 10 bits.

A trilinear has two bilinears, so it would be 2 x 2^10 = 2^11 values, or 11 bits.

Anisotropic of 16x on the Radeon 9700 using trilinear filtering would be
16 x 2^11 = 2^4 x 2^11 = 2^15 or 15 bits of information.

Now, no one here really answered what the real hardware actually has for precision. If precision is lost, that also means colors are lost, as well as dynamic range. I've noticed that my ATI Radeon, compared to my GF3, has much better texture colors, meaning a brighter and darker appearance in the textures. It was obvious to me and to a number of others when switching over to an Nvidia card that the textures have a duller look. Could this be solely because the Radeon and beyond has a higher precision filtering algorithm? Also, on the Radeon you have a much broader range in adjusting gamma than on the Nvidia GF3 before you see banding, at least in my experience, which to me points to better precision filtering on the Radeon or ATI cards as well. Now I could be wrong and both designs use the same amount of precision, but I have evidence, at least from my standpoint, otherwise.
 
noko said:
I've noticed that my ATI Radeon, compared to my GF3, has much better texture colors, meaning a brighter and darker appearance in the textures. It was obvious to me and to a number of others when switching over to an Nvidia card that the textures have a duller look. Could this be solely because the Radeon and beyond has a higher precision filtering algorithm?

Most likely different gamma/brightness/contrast defaults. Lack of precision would show up as banding, not as being duller.

noko said:
Also, on the Radeon you have a much broader range in adjusting gamma than on the Nvidia GF3 before you see banding, at least in my experience, which to me points to better precision filtering on the Radeon or ATI cards as well. Now I could be wrong and both designs use the same amount of precision, but I have evidence, at least from my standpoint, otherwise.

AFAIK the Radeon has 10bits/channel gamma tables. Not sure about the GF3.
 
noko said:
Let's see if I can sum up the bit requirements for filtering:

A bilinear has 4 samples from the source texture, each one having 8 bits/channel. 2^8 x 4 = 2^10 values, or 10 bits.

A trilinear has two bilinears, so it would be 2 x 2^10 = 2^11 values, or 11 bits.

Anisotropic of 16x on the Radeon 9700 using trilinear filtering would be
16 x 2^11 = 2^4 x 2^11 = 2^15 or 15 bits of information.

No, because you can do the filtering in stages.

Quick example:

Mathematically, the two methods are the same:

T1 = A + B + C + D
Avg = T1 / 4

This requires T1 to have an additional two bits for the final divide to not lose anything.

Second method:

T1 = A + B
T2 = C + D
Avg1 = T1 / 2
Avg2 = T2 / 2
T3 = Avg1 + Avg2
Avg = T3 / 2

Here, because the most we're dividing by is two, no more precision than 9 bits (assuming the inputs A, B, C, and D, as well as the outputs, Avg1, Avg2, Avg are all 8-bit) is necessary.

However, one additional thing needs to be considered here, and that is that modern hardware does not always do straight averages, but usually works with weighted averages. This means that plain bilinear filtering probably needs quite a lot more than 10-bit accuracy in the calculation to be done properly in one stage. Trilinear filtering, also, depends on a weighted average, and so may need more than the suggested 9-bit accuracy.
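To illustrate the weighted-average point, here is a rough fixed-point sketch (the 4 bits of sub-texel weight precision is an assumption for the example, not a statement about any particular chip): the intermediates now need the source bits plus the weight-fraction bits.

FRAC_BITS = 4                    # assumed sub-texel weight precision
ONE = 1 << FRAC_BITS             # fixed-point 1.0

def bilinear_weighted(a, b, c, d, wu, wv):
    # a..d are 8-bit texels; wu, wv are 0..ONE fixed-point weights.
    top    = a * (ONE - wu) + b * wu          # fits in 8 + 4 = 12 bits
    bottom = c * (ONE - wu) + d * wu
    result = top * (ONE - wv) + bottom * wv   # fits in 8 + 8 = 16 bits
    return result >> (2 * FRAC_BITS)          # back to 8 bits

print(bilinear_weighted(0, 255, 0, 255, wu=8, wv=8))  # prints 127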

Anisotropic filtering, however, doesn't require any additional accuracy than about 9-10 bits, as long as the filtering is done in no more than 2-4 bilinear samples at a time, since anisotropic will just do a straight average on the bilinear samples.

Now, no one here really answered what the real hardware actually has for precision. If precision is lost, that also means colors are lost, as well as dynamic range.

No, dynamic range is not lost. Dynamic range is the difference between the darkest and brightest color. Lower precision calculations will not diminish dynamic range. They just lose color data, causing dithering/banding.
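A quick illustration of that distinction (my own example): quantizing an 8-bit gradient down to 32 levels keeps the darkest and brightest values intact, so the dynamic range is unchanged, but the lost levels show up as banding.

ramp = list(range(256))                         # full 8-bit gradient
banded = [(v >> 3) * 255 // 31 for v in ramp]   # requantized to 5 bits

print(min(ramp), max(ramp), len(set(ramp)))         # 0 255 256
print(min(banded), max(banded), len(set(banded)))   # 0 255 32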

The floating-point formats allow for a higher dynamic range simply because they are floating-point.

As for the Radeon being "higher-precision," it does have one mode that offers higher-precision, and that is when using overbright lights with PS 1.4. No game that I am aware of today uses these lights, though DOOM3 may (JC has support in the engine, but has said within the past couple of months that game developers have not yet taken advantage of it).

I don't believe for a moment that the GeForce-series has lower-precision filtering/rendering in general, however, as there is no effective loss in color depth from enabling trilinear, anisotropic, or FSAA. If there was a loss under these situations, then I might agree.

As for the gamma issues, I don't know. I would like to see some objective comparisons, but I doubt that those will show any conclusive results.
 
Chalnoth

Thanks for the well thought out explanation and your time to explain :). I am working on a sample to show some differences in Q3, which won't rule out a number of differences, mainly LOD settings that affect texture quality. Still, the differences are apparent enough to make me wonder. The hard part is setting up the shot so that they are the same viewing angle. Once again, gamma, brightness, contrast and whatnot are hard to equate between two different cards, besides the very different designs and drivers.

One test that would probably show the filtering precision difference is to take the exact same screen shot on each card, save it at 32 bits, then have a program count the colors. You could say the GPU/VPU that produces the most colors indicates a higher precision filter. What do you think?
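Here is a rough sketch of that test, assuming lossless screenshots and using the Pillow imaging library (the file names are placeholders): more unique colors in the same scene would hint at higher filtering precision, though gamma and LOD differences would muddy the comparison.

from PIL import Image

def count_colors(path):
    # Count distinct RGB triples in a screenshot.
    img = Image.open(path).convert("RGB")
    return len(set(img.getdata()))

for shot in ["radeon_shot.png", "gf3_shot.png"]:   # hypothetical files
    print(shot, count_colors(shot), "unique colors")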
 
For example, the Savage 3D/4/MX/NB (and others) could do trilinear filtering with a single TMU and in a single cycle from a single mipmap.

On the side..

I specifically remember Microsoft deeming this as NOT "true trilinear" in their earliest WHQL tests. These fudged non-uniform colors into the different mipmaps to ensure trilinear was taking its samples from multiple mipmaps, rather than mipmapping your own "on the fly" from a single mipmap. After all, a 2x2 in the current mipmap could be used to interpolate a 1x1 of your neighboring mipmap in real texturing. This used to be grounds for early WHQL rejection for trilinear, around ~DX5 if my memory serves me... unsure if this still sticks today.
 