My app has a volume renderer that uses axis-aligned stacks of 2D textures. I recently updated the pixel shaders to ps_3_0 and added dynamic branching to skip unnecessary gradient and lighting calculations. This is a big win on the ATI X1800GTO in my dev machine (a 30% increase in fps on a busy volume, more on volumes with a lot of empty space) but gives a slight *decrease* in performance on the Go7800 in my Dell 9400 laptop. That makes no sense at all to me. It looks as though the Go7800 is executing both paths of the branch, as if it didn't support dynamic branching.
Is this something particular to the Go7800 or should I expect the same on all Nvidia 7x00 GPUs?
Another interesting performance item: replacing an opacity-correction equation in the pixel shader ( 1 - pow(1-alpha,delta) ) with a simple 2d texture lookup resulted in a 35% increase in fps on the X1800GTO.
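For anyone who wants to try the same trick, here is a rough sketch of how such a lookup table could be filled on the CPU. It is not my actual code: the function names, the layout (alpha along u, delta along v), the 8-bit format and the maxDelta range are all assumptions made for illustration. The pixel shader would then sample the table with something like tex2D instead of calling pow.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Opacity-correction formula from the post: 1 - (1 - alpha)^delta.
inline float correctedOpacity(float alpha, float delta) {
    return 1.0f - std::pow(1.0f - alpha, delta);
}

// Hypothetical helper: builds a size x size, 8-bit, row-major table.
// Column index maps to alpha in [0, 1]; row index maps to delta in
// [0, maxDelta]. Both the layout and the ranges are assumptions.
std::vector<std::uint8_t> buildOpacityTable(std::size_t size, float maxDelta) {
    assert(size >= 2 && maxDelta > 0.0f);
    std::vector<std::uint8_t> table(size * size);
    const float last = static_cast<float>(size - 1);
    for (std::size_t row = 0; row < size; ++row) {
        const float delta = maxDelta * static_cast<float>(row) / last;
        for (std::size_t col = 0; col < size; ++col) {
            const float alpha = static_cast<float>(col) / last;
            const float c = correctedOpacity(alpha, delta);
            // Quantize [0, 1] to [0, 255] with rounding.
            table[row * size + col] =
                static_cast<std::uint8_t>(std::lround(c * 255.0f));
        }
    }
    return table;
}
```

The resulting bytes could be copied into a single-channel texture (for example an L8 surface) once at startup. With bilinear filtering the lookup interpolates between table entries, so the result is an approximation of the pow version, not an exact match.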
I'm using Direct3D 9.0c and HLSL in an effect file to display a 96x96x64 scalar volume. The volume renderer is pixel shader intensive, rendering a filled 640x480 window at ~65 fps on the X1800GTO. The Go7800 only manages 30 fps in the same test.
Anyway, I'd appreciate hearing anyone's thoughts on the perf weirdness I'm seeing.
Thanks,
Mike