I was wondering whether FP16 is enough for most normalization instructions, or whether FP32 is often needed.
I'm of course asking because the NV40 has the neat trick of being able to replace dp3/rsq/mul sequences with a fast (FP16) nrm instruction. I don't know the math behind it, so I'm just asking out loud: will FP16 be enough, or will it produce artifacts?
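To get a rough feel for the numbers, here is a small Python sketch that emulates the dp3/rsq/mul sequence with every intermediate rounded to FP16 and compares it against a double-precision normalize. The helper names (`to_fp16`, `normalize_fp16`, `normalize_ref`, `max_component_error`) are made up for illustration, and rounding after every single operation is an assumption: I don't know what precision the NV40 actually keeps inside its nrm unit, so treat this as a pessimistic model, not a description of the hardware.

```python
import math
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value.

    Uses the standard 'e' (binary16) struct format. Values beyond the FP16
    range raise OverflowError, so keep inputs small.
    """
    return struct.unpack('<e', struct.pack('<e', x))[0]


def normalize_fp16(v):
    """Emulate dp3 / rsq / mul with each intermediate rounded to FP16.

    Assumption: the hardware rounds after every operation. A zero-length
    input is not handled and would divide by zero.
    """
    x, y, z = (to_fp16(c) for c in v)
    # dp3: three products and two additions, each rounded
    d = to_fp16(to_fp16(to_fp16(x * x) + to_fp16(y * y)) + to_fp16(z * z))
    # rsq: reciprocal square root, rounded
    r = to_fp16(1.0 / math.sqrt(d))
    # mul: scale each component, rounded
    return tuple(to_fp16(c * r) for c in (x, y, z))


def normalize_ref(v):
    """Reference normalize in double precision."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def max_component_error(v):
    """Largest absolute per-component difference between the two results."""
    a = normalize_fp16(v)
    b = normalize_ref(v)
    return max(abs(p - q) for p, q in zip(a, b))


if __name__ == "__main__":
    for v in [(0.3, 0.5, 0.8), (1.0, 2.0, 3.0), (0.01, 0.02, 0.99)]:
        print(v, max_component_error(v))
```

A useful yardstick for the printed errors is one step of an 8-bit colour channel, 1/255 (about 0.0039). If the error stays below that, the result of a single normalize would be hard to see directly on screen. Whether it still holds up after the normal feeds into something sensitive, such as a high specular exponent, is a separate question that this sketch does not answer.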