NV30/35 & R300/R350 Pixel Shader Pipes Compared (New info)

Err Chalnoth, I may be wrong but I thought that R3X0 used fp32 for all texture addressing, and fp24 was for shading ops only.

If true then AFAICS fp24 would only be an issue for dependent texture reads (can aniso even be used with dependent reads?... how would anisotropy and MIP level be determined?).

IMO the shot you linked is much more plausibly explained by the way ATI computes anisotropy/MIP level.
 
psurge said:
(can aniso even be used with dependent reads?... how would anisotropy and MIP level be determined?)
Yes. The usual way is to run 4 pipelines (for a 2x2 pixel patch) in lockstep, and measure the differences in computed texture coordinates of adjacent pixels, and then use those differences as an estimate for the texture coordinate derivatives wrt screen coordinates - which are the data you need to do mip-level and anisotropy axis determination.
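The quad-based scheme described above can be sketched in a few lines. This is a simplified illustrative model (the function name, the single-quad layout, and the exact LOD/aniso formulas are assumptions for demonstration, not any specific chip's documented implementation):

```python
import math

def quad_lod_and_aniso(uv, tex_w, tex_h):
    """Estimate mip LOD and anisotropy ratio for a 2x2 pixel quad.

    uv is a 2x2 grid of normalized (u, v) texture coordinates:
    [[top-left, top-right], [bottom-left, bottom-right]].
    Differences between adjacent pixels in the quad approximate the
    screen-space derivatives of the texture coordinates.
    """
    # Finite differences across the quad, scaled into texel units.
    dudx = (uv[0][1][0] - uv[0][0][0]) * tex_w
    dvdx = (uv[0][1][1] - uv[0][0][1]) * tex_h
    dudy = (uv[1][0][0] - uv[0][0][0]) * tex_w
    dvdy = (uv[1][0][1] - uv[0][0][1]) * tex_h

    # Lengths of the pixel footprint's two axes in texel space.
    len_x = math.hypot(dudx, dvdx)
    len_y = math.hypot(dudy, dvdy)
    major = max(len_x, len_y)
    minor = max(min(len_x, len_y), 1e-6)  # avoid division by zero

    aniso = major / minor                # ratio -> extra samples along major axis
    lod = math.log2(max(minor, 1.0))     # pick the mip matching the minor axis
    return lod, aniso
```

For a quad whose texcoords step 1 texel horizontally and 4 texels vertically on a 256x256 texture, this yields an anisotropy ratio of 4 and an LOD of 0, i.e. sample the base mip level 4 times along the stretched axis.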

Also, AFAIK, if fp24 precision issues affected the aniso quality, the result would have been grainy or noisy mipmap boundaries with aniso enabled, not the blooming that R3xx currently exhibit.
 
That's just ATI's messy texture filtering, isn't it?

If you look closely you'll see that the mip levels only seem to be done for 4-pixel blocks rather than per pixel like on GeForces.

It was the same on the 8500, and despite its higher internal precision than the GF4 (16-bit vs 9-bit), the GF4 always produced much cleaner results in the AF test programs.
 
Chalnoth said:
I have to admit that it's not very rigorous. My basic train of thought is this:

The R3xx's FP24 has some subpixel accuracy, and is made to work properly with bilinear filtering on large textures. Anisotropic filtering requires more samples per pixel, and so, I believe, requires more subpixel accuracy.

I could be wrong, but the use of FP24 for texturing seems the most obvious reason for those artifacts.
You're grasping at straws as usual. If what you say were true, then the GeForce 4 should look worse than the R3x0 when doing aniso.

-FUDie
 
zeckensack said:
I didn't click on the link (Tom's on my embargo list), but ... what the heck? Texture filtering isn't performed by the shading units, so how would it be affected by their precision? IANAHWD, but something tells me that Tom's just repeating some uninformed babble he picked up in the laundry.

If this is about the 'angular' problem (uneven multiples of 22.5°) ... :rolleyes:
Right, which means that texture addressing doesn't absolutely need to be done at FP24, but since the shader units operate at FP24, and the GPU isn't designed around fixed-function programming, it doesn't seem farfetched to believe that ATI didn't bother to have better accuracy on fixed-function texture addressing than they do with pixel shader texture addressing.
 
FUDie said:
You're grasping at straws as usual. If what you say were true, then the GeForce 4 should look worse than the R3x0 when doing aniso.

-FUDie
No. The GeForce4 does all texture ops at much higher accuracy than the ALU ops.

This is why in the GeForce3/4 line, the pixel shader assembly is separated between texture ops and arithmetic ops. The operations are physically different in hardware. For example, whenever you want to do arithmetic leading to a texture read, you have to use specific texture shader instructions.

By contrast, with PS 2.0, texture and arithmetic operations share a single unified instruction set, but all texture ops are forced to be FP32.

Now, I'm not certain what the exact accuracy of texture ops is in the GeForce3/4, but it's probably either FP32 or a higher-precision integer format (likely higher than 16-bit).
 
Higher integer format than FP32?
I have heard of 12-, 15-, 18-, and 21-bit formats in an NVIDIA document describing a few effects, but this is for the GFFX.
 
Tahir said:
Higher integer format than FP32?
I have heard of 12-, 15-, 18-, and 21-bit formats in an NVIDIA document describing a few effects, but this is for the GFFX.
Sorry, meant higher integer format than 12-bit, which is, I believe, the integer format for the register combiners (the non-texture arithmetic ops) in the NV2x.
 
Are you saying that because GF4 and GF3 do Integer FX32 internally they are better or more accurate at aniso filtering?

As per the DirectX 9 specification, R300 supports data precision that is substantially greater than previous generations of hardware. Most graphics hardware has supported 32-bit integer formats, with internal rendering accuracy falling somewhere between 32 and 48 bits per pixel. Now supported is 128-bit floating point precision, which is obviously dramatically more accurate.

ATI's design is slightly confusing to most, for saying that it provides 128-bit floating point precision is accurate, though not entirely so. R300's texture address logic supports 32 bits per channel (128-bit floats), while the shader logic uses 24-bit per-channel floating point accuracy, totaling a precision of 96 bits per pixel. The output format can either be reduced to a format of lower accuracy, or expanded all the way to 128-bit float.

Shader logic is lower so AF is lower quality = your conclusion

... er
 
Tahir said:
Are you saying that because GF4 and GF3 do Integer FX32 internally they are better or more accurate at aniso filtering?
I didn't say they do FX32 internally. But they do use higher than FX12 for texturing ops. I'm not sure as to the exact precision, but it must be greater than FX16 for proper rendering of textures.
 
Do you mean for texture addressing? It's likely FP32, the same as the precision used in vertex shaders. FX16 is viable, though, as it provides 4-5 decimal digits, which is all that's necessary for 4096x4096 textures.
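The arithmetic behind that claim is easy to check: a normalized fixed-point texture coordinate spends log2(texture size) of its bits selecting a texel, and whatever bits remain determine sub-texel positioning for filtering. A back-of-the-envelope sketch (the function and its simple bit-budget model are illustrative assumptions, not any vendor's documented behavior):

```python
import math

def subtexel_bits(coord_bits, texture_size):
    """Fractional (sub-texel) bits remaining when a normalized fixed-point
    texture coordinate with coord_bits of precision addresses a texture
    of the given size (assumed power of two)."""
    texel_bits = int(math.log2(texture_size))  # bits consumed picking the texel
    return coord_bits - texel_bits
```

With FX16 coordinates on a 4096-texel axis this leaves 16 - 12 = 4 fractional bits, i.e. 1/16-texel positioning for the bilinear weights; on a 256-texel axis, 8 bits, i.e. 1/256-texel positioning.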
 
Chalnoth said:
Tahir said:
Are you saying that because GF4 and GF3 do Integer FX32 internally they are better or more accurate at aniso filtering?
I didn't say they do FX32 internally. But they do use higher than FX12 for texturing ops. I'm not sure as to the exact precision, but it must be greater than FX16 for proper rendering of textures.

And to be superior than the R3x0 core it needs to be greater than [equivalent] FP24 right? So at least FX25....no?

Edit: clarification
 
Chalnoth said:
The R3xx's FP24 has some subpixel accuracy, and is made to work properly with bilinear filtering on large textures. Anisotropic filtering requires more samples per pixel, and so, I believe, requires more subpixel accuracy.
Perhaps in extreme magnification cases on very large textures. But in standard mipmapping?

FP24 is by no means 'marginal' and a slightly higher accuracy requirement wouldn't be a problem.
 
Bambers said:
That's just ATI's messy texture filtering, isn't it?

If you look closely you'll see that the mip levels only seem to be done for 4-pixel blocks rather than per pixel like on GeForces.

If you look even closer you'll see that GeForces do mip levels in 4 pixel blocks too...
(Just like every other card released in the past few years.)

It was the same on the 8500, and despite its higher internal precision than the GF4 (16-bit vs 9-bit), the GF4 always produced much cleaner results in the AF test programs.

First, concerning textures, the 8500 doesn't have a higher internal precision than the GF4. (It's equal at best; the GF4 is known to handle at least FX16 precision in the texture shaders.)

But it has nothing to do with the AF quality at all.

The 8500 doesn't support trilinear with AF, and uses insufficient samples (which has a similar result to having a LOD bias).
Both of these issues have been fixed on R3xx.
 
Bambers said:
That's just ATI's messy texture filtering, isn't it?

If you look closely you'll see that the mip levels only seem to be done for 4-pixel blocks rather than per pixel like on GeForces.

ATI's messy filtering is most likely a result of its 5 bits of LOD accuracy.
 
arjan de lumens said:
Also, AFAIK, if fp24 precision issues affected the aniso quality, the result would have been grainy or noisy mipmap boundaries with aniso enabled, not the blooming that R3xx currently exhibit.
Actually, you do get grainy mipmap boundaries on R3x0. Not as "diffuse" as on Parhelia (;)) but not really precise either.


Chalnoth said:
Now, I'm not certain what the exact accuracy of texture ops is in the GeForce3/4, but it's probably either FP32 or a higher-precision integer format (likely higher than 16-bit).
NV2x uses FP32 in the "texture shader".


DaveBaumann said:
Bambers said:
That's just ATI's messy texture filtering, isn't it?

If you look closely you'll see that the mip levels only seem to be done for 4-pixel blocks rather than per pixel like on GeForces.

ATI's messy filtering is most likely a result of its 5 bits of LOD accuracy.
Correct. It's always 32 levels of mipmap interpolation, it's not related to texture size. This allows ATI to save some transistors on the final mipmap lerp.
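A 5-bit blend factor means the fractional part of the LOD snaps to one of 32 steps before the trilinear lerp, regardless of texture size. A minimal model of that quantization (a hypothetical sketch; the real hardware's rounding behavior is not documented here):

```python
def quantize_lod_fraction(lod, frac_bits=5):
    """Quantize the fractional part of a mip LOD to frac_bits bits,
    as hardware with a 5-bit trilinear blend factor would.
    Assumes a non-negative LOD for simplicity."""
    base = int(lod)                 # integer mip level is kept exactly
    steps = 1 << frac_bits          # 5 bits -> 32 blend levels
    frac = round((lod - base) * steps) / steps
    return base + frac
```

For example, an LOD of 1.30 snaps to 1 + 10/32 = 1.3125, so nearby pixels whose true LODs differ by less than 1/32 all get the same blend weight, which shows up as visible banding at mip transitions.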
 
Xmas said:
NV2x uses FP32 in the "texture shader".
Yes, that's what I had heard, but since I don't remember any official docs that state it, I felt I was better off just stating it was higher precision than FX12.
 