... to check the execution time of this little PS snippet?
NVShaderPerf says 6 cycles for CineFX1 and 5 for CineFX2, but unfortunately it doesn't provide numbers for NV40 yet. My guess is that it should take either 3 or 4 cycles, but I'm not sure about that.
edit: added write mask to add.
Code:
ps_2_x
dcl_2d s0
dcl_2d s1
dcl t0
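texld r2, t0, s0              ; sample s0 at the interpolated coordinate
dsx r0.xy, t0                 ; screen-space x derivative of t0
dsy r1.xy, t0                 ; screen-space y derivative of t0
add r2.xy, t0, r2             ; offset the coordinate by the sampled value
texldd r0, r2, s1, r0, r1     ; sample s1 at the offset coordinate with the t0 gradients
mov oC0, r0                   ; write the result to the color output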