FLOPS is short for FLoating point Operations Per Second, and is hence used as a measure of computational throughput.
Usually an 'op' is either an addition or a multiplication. Divisions are not counted, as they are vastly more expensive to do fast; so much so that divisions are typically done by multiplying with the reciprocal (a/b -> a * (1/b)).
The numbers normally quoted in sales material are the theoretical maximum (the 'guaranteed not to exceed' number), which is often a poor measure of the true throughput of the chip/solution.
For Nvidia to reach a number of 200 GFLOPS they probably count every single floating point unit on the chip, assume that each is active every single cycle, and multiply by the number of cycles per second. Looks good... but isn't very informative.
Take the R300 with 8 pixel shaders and 4 vertex shaders, each of which can do 8 FP ops per cycle, for a total of 31 GFLOPS of shading power. One would assume that the NV3x would wipe the floor with the R300 in FP, but it doesn't.
That's because a lot of the FP power in the Nvidia number goes into "hidden" stuff like triangle setup, FP<->int conversions, iterators etc., whereas the 31 GFLOPS for the R300 is the raw shader power.
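The R300 figure above is just "every unit busy every cycle" arithmetic. A quick sketch of that calculation (the ~325 MHz core clock is my assumption, the Radeon 9700 Pro speed; the unit counts are from the post):

```python
# Back-of-the-envelope theoretical-peak arithmetic for the R300.
# Assumption: 325 MHz core clock (Radeon 9700 Pro); unit counts as above.

def peak_gflops(units, ops_per_unit_per_cycle, clock_hz):
    """Theoretical peak: every unit doing its max ops every single cycle."""
    return units * ops_per_unit_per_cycle * clock_hz / 1e9

# 8 pixel shaders + 4 vertex shaders, 8 FP ops each per cycle
r300 = peak_gflops(8 + 4, 8, 325e6)
print(f"R300 raw shader peak: {r300:.1f} GFLOPS")  # ~31.2
```

This is exactly the kind of multiplication marketing departments do, which is why the headline number says little about sustained throughput.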
As for general purpose CPUs:
The P4 can do 4 FP ops per cycle for 12+ GFLOPS @ 3+ GHz. The Athlon/Opteron can do a similar number of ops per cycle but is clocked (a lot) lower; it has fewer restrictions on issuing FP ops, though, so it probably has slightly higher throughput per cycle than the P4.
Gecko can do 2 FP MADDs (2 FP adds and 2 FP muls) per cycle, so just under 2 GFLOPS.
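The same peak arithmetic works for the CPUs. The P4 clock is the ~3 GHz quoted above; Gecko's 485 MHz is my assumption (the GameCube clock):

```python
# Theoretical peak FP throughput for the CPUs mentioned.
# Assumption: Gecko at 485 MHz (GameCube); P4 at the ~3 GHz quoted above.

def peak_gflops(fp_ops_per_cycle, clock_hz):
    return fp_ops_per_cycle * clock_hz / 1e9

p4    = peak_gflops(4, 3.0e9)   # 4 FP ops/cycle at 3 GHz
gecko = peak_gflops(4, 485e6)   # 2 FMADDs = 4 FP ops/cycle

print(f"P4:    {p4:.1f} GFLOPS")    # 12.0
print(f"Gecko: {gecko:.2f} GFLOPS") # ~1.94, i.e. just under 2
```

As with the GPUs, these are 'guaranteed not to exceed' numbers; real code rarely keeps every FP unit busy every cycle.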
Cheers
Gubbi