Jawed
Legend
Whoops, sorry, I went to bed before realising that the MUL performs the same on both GPUs, since G80 is "4x wider" per clock and 4 clocks per fragment is the "normal" rate for a vec4 operation anyway. 16 fragments x 4 clocks looks unhealthy in comparison with a vec3+SF GPU: 4 batches, each of 4 fragments x 1 clock (both GPUs considered as having 16-SIMD ALUs). The latter is 4x faster on the MUL. But that's an extreme case; generally G80 wins out significantly.
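For what it's worth, the clock counting behind the "performs the same" remark can be sketched roughly as below. This is only a toy model under my own assumptions: both GPUs are treated as 16 ALU lanes doing one component operation per lane per clock, and the function names `clocks_scalar` and `clocks_vector` are made up for illustration. It ignores scheduling, latency and co-issue entirely.

```python
# Toy throughput model (assumption: 16 lanes, one component-op per lane per clock).
LANES = 16       # both GPUs considered as having 16-SIMD ALUs
COMPONENTS = 4   # a vec4 MUL touches 4 components per fragment
FRAGMENTS = 16   # fragments to process


def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def clocks_scalar(fragments, components, lanes):
    """Scalar layout: one fragment per lane, one component issued per clock."""
    batches = ceil_div(fragments, lanes)
    return batches * components


def clocks_vector(fragments, components, lanes):
    """Vector layout: each fragment occupies `components` lanes for one clock."""
    fragments_per_clock = lanes // components
    return ceil_div(fragments, fragments_per_clock)


scalar = clocks_scalar(FRAGMENTS, COMPONENTS, LANES)
vector = clocks_vector(FRAGMENTS, COMPONENTS, LANES)
print(scalar, vector)  # prints: 4 4
```

Under these assumptions, 16 fragments x 4 clocks on the scalar layout and 4 batches x 1 clock on the vector layout both come to 4 clocks for the same 16 fragments.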
Jawed