FP16 enough for most normalizations (NV40 related)?

LeStoffer

Veteran
I was wondering whether FP16 is enough for most normalization instructions or if FP32 is often needed.

I'm of course asking because the NV40 has the neat trick of being able to replace a dp3/rsq/mul sequence with a fast (FP16) nrm instruction. I don't know the math behind it, so I'm just asking out loud: will FP16 be enough, or will it produce artifacts?
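
Edit: without knowing the hardware internals, you can at least get a feel for the error by emulating the dp3/rsq/mul sequence at half precision on the CPU. A rough numpy sketch (the rounding model is my assumption; real hardware may round intermediates differently):

    import numpy as np

    rng = np.random.default_rng(0)

    # Random vectors of the kind a pixel shader sees after interpolating
    # per-vertex normals or light vectors across a triangle.
    v32 = rng.uniform(-1.0, 1.0, size=(100000, 3)).astype(np.float32)
    ref = v32 / np.linalg.norm(v32, axis=1, keepdims=True)  # FP32 reference

    # dp3/rsq/mul with every intermediate rounded to half precision,
    # mimicking what a _pp (partial precision) shader would do.
    v16 = v32.astype(np.float16)
    dp3 = (v16 * v16).sum(axis=1, dtype=np.float16)
    rsq = np.float16(1.0) / np.sqrt(dp3, dtype=np.float16)
    n16 = v16 * rsq[:, None]

    # Angular error of the FP16 result against the FP32 reference.
    cosang = np.clip((n16.astype(np.float32) * ref).sum(axis=1), -1.0, 1.0)
    err = np.degrees(np.arccos(cosang))
    print(f"angular error: max {err.max():.4f} deg, mean {err.mean():.4f} deg")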
 
I think FP16 is more than enough for normals (and partly for colors) because normals are always in the [-1, 1] range. The same goes for colors, although they can be bigger than 1.

Personally, in GLSL I always use half or half3 for anything concerning normals and colors, and I've never had precision problems. The rest stays FP32 (or FP24 for ATI)...
 
An experiment would be the best answer.

ATI has a high precision normal map demo; maybe you can make some small modifications to it and see the result...
 
*shrugs* Depends on the shader program. For short programs it's probably okay, but as they get longer and FP16 is still used, the errors will accumulate just as people are saying they will with FP24 (only faster).
 
Ostsol said:
*shrugs* Depends on the shader program. For short programs it's probably okay, but as they get longer and FP16 is still used, the errors will accumulate just as people are saying they will with FP24 (only faster).

If we're talking about normals and colors, I can't really see many shaders being structured in such a way that a normalise will cause the error to accumulate.

You'd need to care about a lot of accuracy after the normalisation, and in general you just don't.

I guess if you had some sort of iterative algorithm that required normalisation at each step it might be an issue; in that case you'd probably want to do an RK step to improve the result.
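
For example, one cheap refinement is a Newton-Raphson step on the reciprocal square root, which roughly doubles the number of correct bits. A minimal numpy sketch of the arithmetic (not shader code, and the helper name is made up):

    import numpy as np

    def refine_rsqrt(x, y):
        # One Newton-Raphson step for y ~ 1/sqrt(x); roughly doubles
        # the number of correct bits in the estimate.
        return y * (1.5 - 0.5 * x * y * y)

    x = np.float32(2.0)
    y16 = np.float32(np.float16(1.0) / np.sqrt(np.float16(x)))  # crude FP16 rsq
    print(y16, refine_rsqrt(x, y16), 1.0 / np.sqrt(x))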
 
I can't comment on the particular implementation, but I have done precision measurements on this sort of thing before. The experiment that I found to do a pretty good job of figuring out what levels of precision are necessary on a vector in bad cases was as follows:

  1. Render a flat quad with a normal pointing straight out, or apply a bumpmap with a very slight slope to it.
  2. Perform per pixel lighting by interpolating a light vector for a point light from the corners of the quad.
  3. Apply a specular power of 32 or greater.
  4. Play with the light distance to the object/orientation of the object.

I did these experiments back in the days of fixed-point shaders to get a feel for where some of these precision issues cropped up. I remember that, at the time, 8 bits was definitely not enough, and that the banding with a specular power of 32 didn't seem to go away until you got to around 10-12 bits plus a sign bit. I am not sure how this applies to floats with a 10-bit mantissa.
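
Something like this numpy sketch reproduces the gist of it (illustrative numbers, not my original code):

    import numpy as np

    def distinct_shades(bits, spec_power=32):
        # N.L across a smooth highlight crowds near 1.0, which is exactly
        # where quantization bites hardest at high specular powers.
        ndotl = np.linspace(0.95, 1.0, 4096)
        # Quantize the dot product to 'bits' fractional bits (the sign
        # bit is counted separately, as in the post above).
        q = np.round(ndotl * (2 ** bits - 1)) / (2 ** bits - 1)
        # Count distinct 8-bit output shades across the highlight; far
        # fewer shades than the reference means visible banding.
        return len(np.unique(np.round(q ** spec_power * 255)))

    ref = distinct_shades(23)  # ~FP32 mantissa, effectively unquantized
    for bits in (8, 10, 12, 16):
        print(f"{bits:2d} bits: {distinct_shades(bits):4d} shades (ref {ref})")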

-Evan
 
I'd say you need two conditions for FP16 not being enough for normals.

First, the normals would have to slowly vary across the surface, like the bald head of a Doom3 character as opposed to the bumpy walls in HL2.

Second, you'd have to be doing either a high exponent specular calculation or a reflective, dependent texture read (like in the ATI car demo).

Specular essentially spreads a tiny portion of the 0-1 result from a DP3 across the whole 0-1 colour range, so precision is needed to avoid banding, but you need a relatively flat or smooth surface to make the specular highlight big enough to see this.

Texture lookups always need high precision, especially since images being reflected often scroll along the reflective surface. Your reflection rays have to be accurate when they hit, say, a 256x256 cubemap face, especially when the reflection is pretty large and/or the screen resolution is high.
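
As a back-of-the-envelope check (my rough numbers, nothing rigorous):

    import numpy as np

    # Angular width of a texel at the centre of a 256x256 cubemap face;
    # each face spans 90 degrees and centre texels are the widest.
    texel = np.degrees(2 * np.arctan(np.tan(np.radians(45.0)) / 256))

    # An FP16 rounding error (10-bit mantissa, ~2^-11 relative) on a unit
    # vector is a tilt of roughly 2^-11 radians, and an error in the
    # normal is roughly doubled in the reflection vector.
    tilt = np.degrees(2 * 2.0 ** -11)

    print(f"cubemap texel ~{texel:.3f} deg, one FP16 rounding ~{tilt:.3f} deg")

That's just one rounding; a few FP16 operations in a row and the reflection starts wandering by a noticeable fraction of a texel.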
 
I wouldn't worry too much about accuracy of near unit vectors, but normalizing world scale vectors might require more range and accuracy than fp16 gives.
 
Thinking about lighting, I don't see why normalizing vectors across half the worldspace would pose a problem.

That is, if this is a vector from the viewer to the object, then when it's long the object is far away and won't be very big on screen, so errors won't be obvious.

If this is a vector from the object to the light source, then there is once again no problem: if the object is far away from the light, either the light shouldn't be rendered on that object at all, or it will be effectively unidirectional.
 
Yup, if you manage to fit the vectors into FP16 range in the first place. FP16 doesn't have infinities, so you get clamping, which might or might not be a problem, again depending on the usage. Accuracy is a minor issue if your vector points in a completely wrong direction.
 
jpaana said:
I wouldn't worry too much about accuracy of near unit vectors, but normalizing world scale vectors might require more range and accuracy than fp16 gives.
Range shouldn't be a big problem. If worst comes to worst, you can scale these vectors in the vertex shader.

Accuracy for all vectors, even near unit vectors, can indeed be an issue, as I mentioned above. Because we're talking about floating point, the angular precision will have minimal dependency on vector size. Even NVidia said FP24 is not enough for directions (though I strongly disagree), so FP16 will easily have its moments of inadequacy.
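
A quick numpy check of that scale independence (just rounding FP32 vectors to FP16 at different magnitudes and measuring the tilt):

    import numpy as np

    rng = np.random.default_rng(1)
    dirs = rng.normal(size=(100000, 3)).astype(np.float32)
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)

    # Same directions at very different magnitudes: because FP16 is a
    # floating-point format, the angular error barely depends on scale.
    for scale in (0.01, 1.0, 100.0):
        v = (dirs * scale).astype(np.float16).astype(np.float32)
        v /= np.linalg.norm(v, axis=1, keepdims=True)
        cosang = np.clip((v * dirs).sum(axis=1), -1.0, 1.0)
        print(f"scale {scale:>6}: max tilt "
              f"{np.degrees(np.arccos(cosang)).max():.4f} deg")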

The other problem with normalization via nrm_pp is that it can't be used with good normal compression, whether 3Dc or ordinary two-channel textures, because different math is needed for deriving the third component.
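
(For reference, rebuilding the third component from a two-channel map needs a real square root rather than an rsq. A numpy sketch of the usual decode, assuming the common [0,255]-to-[-1,1] mapping; the helper name is made up:)

    import numpy as np

    def decode_two_channel(texels):
        # texels: uint8 (x, y) pairs from a 3Dc / two-channel normal map.
        xy = texels.astype(np.float32) / 255.0 * 2.0 - 1.0  # to [-1, 1]
        # The stored normal is unit length, so z follows from x and y;
        # this sqrt is the step a plain nrm instruction can't provide.
        z = np.sqrt(np.maximum(0.0, 1.0 - (xy * xy).sum(axis=-1)))
        return np.concatenate([xy, z[..., None]], axis=-1)

    print(decode_two_channel(np.array([[128, 128], [255, 128]], np.uint8)))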

Free FP16 normalization was a very smart move on NVidia's part looking at today's games with ordinary 8-bit-per-channel normal maps, but it definitely has its limitations for the future. But it's "free", so I'm not complaining :D
 
Well, even with compressed normal maps, there are other vectors in the lighting calculations that will need to be normalized.
 