NVidia Maths

Bilinear interpolation (FP16 on G60/G70) = 36 flops.

Blend operation (FP16 on G60/G70/X1000) = 12 flops.

nrm_pp (FP16 on G60/G70) = 9 flops(?)

PS: Vec4 MAD*2 = 16 flops

VS : Vec4 MAD*1 + Scalar MAD*1 = 10 flops

G70 @ 550 MHz (no geometry clock delta)
(10*8 VS + (36+16+9)*24 PS + 12*16 ROP) * 550 MHz
= (80+1464+192)*550MHz
= 1736*550MHz
= 954800 MFLOPS
~ 955 GFLOPS
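
The sum above can be re-checked with a few lines of Python. This is only a sketch of the same arithmetic: the names (peak_mflops, VS_FLOPS and so on) are made up for illustration, and the unit counts (8 VS, 24 PS, 16 ROP) and per-unit flop figures are taken straight from the post, not verified against hardware.

```python
# Per-clock flop figures as claimed in the post (assumptions, not measurements).
VS_FLOPS = 10            # Vec4 MAD + scalar MAD per vertex shader unit
PS_FLOPS = 36 + 16 + 9   # bilinear + 2x Vec4 MAD + nrm_pp per pixel pipe
ROP_FLOPS = 12           # FP16 blend per ROP


def peak_mflops(clock_mhz, vs_units=8, ps_units=24, rop_units=16):
    """Return the claimed peak in MFLOPS for a given core clock in MHz."""
    per_clock = (VS_FLOPS * vs_units
                 + PS_FLOPS * ps_units
                 + ROP_FLOPS * rop_units)
    return per_clock * clock_mhz


total = peak_mflops(550)
print(total, "MFLOPS")         # 954800 MFLOPS
print(total / 1000, "GFLOPS")  # 954.8 GFLOPS
```

Changing clock_mhz or the unit counts gives the same estimate for other configurations, as long as the per-unit figures are accepted as given.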
 
cho said:
Bilinear interpolation (FP16 on G60/G70) = 36 flops.
4-channel FP16 fetches take two cycles, and I'm not sure the interpolation parameters are FP. However, many texture address calculations need FP32, and after the bilinear interpolation you need another LERP/MAD step for trilinear/anisotropic filtering.
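
For what it's worth, here is one way the 36 figure could be reached, and what the extra trilinear step would add on top of it. This is a guess at the counting, not something stated in the thread: it assumes a LERP is written as a + t*(b - a) and costs 3 flops (sub, mul, add), that a bilinear sample is 3 LERPs per channel, and that address calculation is not counted at all. The function names are hypothetical.

```python
LERP_FLOPS = 3  # assumed cost of a + t*(b - a): 1 sub + 1 mul + 1 add


def lerp(a, b, t):
    """Linear interpolation between a and b by weight t."""
    return a + t * (b - a)


def bilinear(c00, c10, c01, c11, u, v):
    """Blend four texels of one channel: two LERPs along u, one along v."""
    top = lerp(c00, c10, u)
    bottom = lerp(c01, c11, u)
    return lerp(top, bottom, v)


def bilinear_flops(channels=4):
    """3 LERPs per channel under the assumed LERP cost."""
    return 3 * LERP_FLOPS * channels


def trilinear_flops(channels=4):
    """Two bilinear samples (adjacent mip levels) plus one more LERP per channel."""
    return 2 * bilinear_flops(channels) + LERP_FLOPS * channels


print(bilinear_flops())   # 36
print(trilinear_flops())  # 84
```

If the hardware fuses the multiply and add into a single MAD, or if the weights are fixed-point, the count would come out differently, which is the uncertainty raised above.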
 