A bit off topic, but just for clarification:
The R300 (and, presumably, the R350) holds 60 programmable floating point processors (fmad/frcp/flog/etc.). This is how the numbers add up:
In the vertex shader: there are 5 units per vertex pipeline, 4 fmads and 1 scalar (complex function) fp core. Each fmad can execute 2 fp ops per cycle and the scalar fp unit can execute 1. This yields 9 fp ops per vertex unit per cycle. Four vertex units would yield a theoretical peak fp rate of (9 x 4 x 400 MHz) 14.4 gflops at 400 MHz.
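For anyone who wants to check the vertex-side arithmetic, here is a small Python sketch. The unit counts and the 400 MHz clock are the figures quoted above; the function and variable names are made up for illustration and do not come from any ATI documentation.

```python
# Peak vertex-shader throughput, using the figures quoted in this post.
# All names here are illustrative assumptions.

def peak_mflops(ops_per_pipe_per_clock, pipes, clock_mhz):
    """Theoretical peak in MFLOPS: ops per clock per pipe * pipes * MHz."""
    return ops_per_pipe_per_clock * pipes * clock_mhz

FMADS_PER_PIPE = 4     # each fmad counted as 2 fp ops (multiply + add)
SCALAR_PER_PIPE = 1    # complex-function unit, counted as 1 fp op

vertex_ops = FMADS_PER_PIPE * 2 + SCALAR_PER_PIPE    # 9 fp ops per clock
vertex_mflops = peak_mflops(vertex_ops, pipes=4, clock_mhz=400)

print(vertex_mflops / 1000, "GFLOPS")   # 14.4 GFLOPS
```

Keep in mind this is a peak number: it assumes every unit issues a useful op on every clock.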
In the fragment shader: there are three major floating point cores: a texture filtering and coordinate unit, a texture address unit, and a fragment color unit. The major contributor to floating point ops is the fragment color unit, which consists of 4 fmads and a special purpose fp core (like the vertex shader's pipeline, organized a bit differently). Each fmad contributes a potential 2 fp ops per cycle, and the complex fp core can kick in 1 fp op. This also sums up to a total of 9 fp ops possible per clock. Eight pixel pipelines mean 8 x 9 = 72 fp color ops per cycle, which yields 28.8 gflops at 400 MHz. Counting the fp texture address unit, which is capable of 1 fp op by itself, would increase the per-cycle count to (10 x 8) 80 fp ops. This would add up to a whopping 32 gflops of capability in the R300's pixel shader. (Man that was long!)
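The pixel-side numbers can be checked the same way. This is only a sketch under the assumptions stated above (8 pipelines, 400 MHz, 9 color ops plus 1 texture address op per pipeline per clock); the constant names are mine.

```python
# Peak fragment-shader throughput from the per-pipeline counts above.
# Names are illustrative; the counts are the ones claimed in this post.

PIXEL_PIPES = 8
CLOCK_MHZ = 400

color_ops = 4 * 2 + 1    # 4 fmads at 2 ops each + 1 complex-function op
tex_addr_ops = 1         # texture address unit, 1 fp op per clock

color_mflops = color_ops * PIXEL_PIPES * CLOCK_MHZ
total_mflops = (color_ops + tex_addr_ops) * PIXEL_PIPES * CLOCK_MHZ

print(color_mflops / 1000)   # 28.8
print(total_mflops / 1000)   # 32.0
```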
These are all fully programmable units, so they are, in essence, fully programmable flops (at least for the tasks they were meant to handle). If we add the max theoretical outputs of the vertex and pixel pipelines of the R300/R350, we arrive at 14.4 + 32 = 46.4 gflops, which is nothing to scoff at. Remember, this does not include the triangle set-up unit throughput or texture filtering unit throughput, which are floating point capable. However, these units, to my knowledge, are not fully programmable. The same goes for the anti-aliasing portions of the processor.
Let us not forget that out of those 46.4 potential gflops, 28.8 of them (the ones coming from the fragment color processor) are at 24-bit per color component (96-bit total) max precision, while the others, from the vertex units and texture address unit, are 32-bit per color component (128-bit total) max precision.
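To make the precision split explicit, here is a rough Python tally. The per-unit MFLOPS values are the ones derived in this post (the 3.2 gflops for the texture address unit is just 32 minus 28.8); the dictionary layout is my own choice for illustration.

```python
# Split the claimed peak by per-component precision.
# Each entry is (peak MFLOPS, bits per component), as stated in this post.

units = {
    "vertex units": (14400, 32),
    "fragment color unit": (28800, 24),
    "texture address unit": (3200, 32),
}

by_precision = {}
for mflops, bits in units.values():
    by_precision[bits] = by_precision.get(bits, 0) + mflops

total = sum(by_precision.values())

print(by_precision)   # {32: 17600, 24: 28800}
print(total / 1000)   # 46.4
```

So roughly 17.6 of the 46.4 gflops are at 32-bit precision, and the remaining 28.8 are at 24-bit.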
I came to this information by reading a few informative threads in the Beyond 3D forum and consulting with a certain reliable individual for clarification.
Information resources may be found here:
http://www.beyond3d.com/forum/viewtopic.php?p=28682#28682
http://www.beyond3d.com/forum/viewtopic.php?p=53279#53279
http://firingsquad.gamers.com/hardware/radeon_9700/default.asp
P4 = 2 FP ops (scalar) + 2 FP ops (SSE) ~ 4 FP ops per cycle @ 3000 MHz =~12 GF
So at $60 vs $600, the R300/R350 blows the competition (PIV) away. Are you ready ... for more!!
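For what it's worth, the peak-vs-peak comparison works out as below. This is a back-of-the-envelope sketch: both figures are theoretical maxima taken from this post, not measured results, and the variable names are just for illustration.

```python
# Peak-vs-peak comparison using the figures from this post.

p4_ops_per_clock = 2 + 2     # 2 scalar FP ops + 2 SSE FP ops, as listed above
p4_mflops = p4_ops_per_clock * 3000    # 3000 MHz clock

r300_mflops = 14400 + 32000  # vertex + pixel peaks derived earlier

ratio = r300_mflops / p4_mflops

print(p4_mflops / 1000)      # 12.0
print(round(ratio, 2))       # 3.87
```

So on paper the R300 has a bit under four times the peak fp rate of a 3 GHz P4.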