Luminescent
Veteran
A question to start: when a company like Nvidia quotes a flop count for their processors (supposedly 200 GFLOPS for NV30), do they count only one type of floating-point operation (the FMAD), or do they also count special-purpose floating-point ops in addition to the FMADs?
I recall 51 GFLOPS being quoted for the pixel shader in NV30. With each virtual pipeline having 4 FMAD units running at half precision, the number looks like it was obtained by counting something like: 2 ops per MAD × 2 half-float ops × 4 FMAD units per pipeline × 8 pipelines × 400 million cycles per second = 51.2 GFLOPS. It seems Nvidia is not counting the special-purpose pixel units, which should be able to execute at 1 op per cycle and be included in the 8 pipelines. I am sure they have to be in there, or sin, cos, log2, ddx, ddy, etc. support would be missing, or at least not feasible. This caught my attention. Any thoughts?
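Here's the back-of-the-envelope arithmetic as a quick sanity check; the unit counts (4 FMAD units per pipeline, 8 pipelines, 400 MHz clock) are just my assumptions from above, not confirmed numbers:

```python
# Sanity check of the 51.2 GFLOPS figure (assumed unit counts, not official specs).
ops_per_mad = 2           # a multiply-add counts as 2 flops
half_floats_per_op = 2    # each FMAD unit does 2 half-precision ops per cycle
fmad_units_per_pipe = 4   # assumed 4 FMAD units per virtual pipeline
pipelines = 8             # assumed 8 pipelines
clock_hz = 400e6          # assumed 400 MHz clock

gflops = ops_per_mad * half_floats_per_op * fmad_units_per_pipe * pipelines * clock_hz / 1e9
print(gflops)  # 51.2
```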