I disassembled a bunch of vs/ps and, going through them, found quite a few nonsensical code fragments like this:
rsq r1.w, r5.w // 1/sqrt(), r5.w is saturated [0,1]
rcp r1.w, r1.w // 1/(1/sqrt()) = sqrt()
add r2.w, -r1.w, c0.y
cmp r1.w, -r1.w, c0.y, r2.w // -r1.w >= 0 ? ... : ..., c0.y is 1
This is (cmp picks its second source when the first is >= 0, otherwise the third):
(-sqrt(r5.w) >= 0.0 ? 1 : (1 - sqrt(r5.w)))
(sqrt(r5.w) <= 0.0 ? 1 : (1 - sqrt(r5.w)))
Makes no sense. sqrt can't yield negative results, and the input is non-negative anyway. Breaks down to:
(sqrt(r5.w) == 0.0 ? 1 : (1 - sqrt(r5.w)))
What? At 0 both branches give 1, so is this just a roundabout way to compute 1 - sqrt(r5.w)? Did I overlook something?
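One way to double-check the reading of that fragment is to emulate the four instructions on the CPU. The sketch below is only that: the helper names (rsq, rcp, cmp_op, fragment) are made up for illustration, cmp is assumed to select its second source when the first is >= 0 and the third otherwise, and rsq(0) is assumed to return +inf, which real hardware may not do.

```python
import math

def rsq(x):
    # Reciprocal square root. Assumption: rsq(0) yields +inf.
    return math.inf if x == 0.0 else 1.0 / math.sqrt(x)

def rcp(x):
    # Reciprocal. Assumption: rcp(0) yields +inf; 1/inf is 0.0 in IEEE arithmetic.
    return math.inf if x == 0.0 else 1.0 / x

def cmp_op(src0, src1, src2):
    # Assumed cmp semantics: src0 >= 0 ? src1 : src2
    return src1 if src0 >= 0.0 else src2

def fragment(r5_w, c0_y=1.0):
    r1_w = rsq(r5_w)               # rsq r1.w, r5.w
    r1_w = rcp(r1_w)               # rcp r1.w, r1.w
    r2_w = -r1_w + c0_y            # add r2.w, -r1.w, c0.y
    return cmp_op(-r1_w, c0_y, r2_w)  # cmp r1.w, -r1.w, c0.y, r2.w

# Sample the saturated input range [0, 1].
samples = [i / 16.0 for i in range(17)]
results = [fragment(x) for x in samples]
for x, r in zip(samples, results):
    print(f"r5.w = {x:.4f} -> r1.w = {r:.6f}, 1 - sqrt = {1.0 - math.sqrt(x):.6f}")
```

Under those assumptions the two printed columns agree over the whole range, so the cmp only matters at exactly 0, where both of its inputs are 1 anyway.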
---------
Got some more (the first clearly checks for == 0; the second is really r3.x <= 0, so it only checks for == 0 if r3.x is never negative):
((r0.w * r0.w) <= 0.0 ? 1.0 : 0.0)
(-r3.x >= r3.x ? 1.0 : 0.0)
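To see which inputs each of these two selects actually reacts to, here is a small sketch that evaluates them over a few values. The function names are hypothetical, and plain Python floats stand in for shader registers, so NaN and low-precision underflow behaviour is not modelled.

```python
def select_on_square(x):
    # ((x * x) <= 0.0 ? 1.0 : 0.0)
    return 1.0 if (x * x) <= 0.0 else 0.0

def select_on_negation(x):
    # (-x >= x ? 1.0 : 0.0)
    return 1.0 if -x >= x else 0.0

values = (-2.0, -0.5, 0.0, 0.5, 2.0)
table = [(x, select_on_square(x), select_on_negation(x)) for x in values]
for x, a, b in table:
    print(f"x = {x:5.1f}  x*x<=0: {a}  -x>=x: {b}")
```

The squared form fires only at x == 0, while the negated form fires for every x <= 0, so the two only agree when the register is known to be non-negative.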
Weirdos:
max(-r6.w, r6.w)
// which is abs(r6.w), the compiler didn't know _abs?
(IN.texcoord_0.x <= 0.0 ? (1 - IN.texcoord_0.x) : (IN.texcoord_0.x + 1))
// which is 1 + abs(IN.texcoord_0.x), the compiler didn't know _abs? I can't believe that breaking _abs into three schedulable ops is more efficient.
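Both abs identities are easy to sanity-check numerically. A minimal sketch, with made-up function names and Python floats standing in for the registers:

```python
def abs_via_max(x):
    # max(-x, x)
    return max(-x, x)

def one_plus_abs_via_select(x):
    # (x <= 0.0 ? (1 - x) : (x + 1))
    return (1.0 - x) if x <= 0.0 else (x + 1.0)

values = (-2.0, -0.25, 0.0, 0.25, 2.0)
checks = [
    (x, abs_via_max(x) == abs(x), one_plus_abs_via_select(x) == 1.0 + abs(x))
    for x in values
]
for x, max_ok, select_ok in checks:
    print(f"x = {x:5.2f}  max form matches abs: {max_ok}  select form matches 1+abs: {select_ok}")
```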