What do the FLOPS hop-stop compares?

chaphack · Aug 14, 2003

I know PS2 EE is at 6.2 GFLOPS, DC SH4 2.4 GFLOPS(?) and Nvidia FX cards are around 200 GFLOPS, and of course PS3 is aiming for 1 TFLOPS.

Anyone know about the ATi R300 line? How about the latest HT P4 CPUs? Or AMD's upcoming 64-bit CPUs? Where does GC Gekko + Flipper stand?

I wonder if hundreds of millions of flops, alone, will really guarantee you good graphics?

Flops slop hop what lop?

notAFanB · Aug 14, 2003

I wonder if hundreds of millions of flops, alone, will really guarantee you good graphics?

when you got enough sure why not 8)

really it's just a measurement of computing power. the things which seem to be really important to you are already relatively fixed in terms of implementation so IQ should not be an issue at all.

Gubbi · Aug 14, 2003

FLOPS is short for FLoating point Operations Per Second, and hence is used to measure computational throughput.

Usually an 'op' is either an addition or a multiplication. Divisions are not counted as they are vastly more expensive to do fast, so much so that divisions are typically done by multiplying with the reciprocal (a/b -> a * (1/b)).
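To make the reciprocal trick concrete, here is a minimal Python sketch. The function names are made up for illustration, and Python itself gains nothing from this; it only shows the shape of the transformation. The payoff comes when many values are divided by the same b, so the one expensive reciprocal is paid for once and everything else is a multiply.

```python
def divide_via_reciprocal(a: float, b: float) -> float:
    """Compute a / b as a * (1 / b). Hypothetical helper, for illustration only."""
    reciprocal = 1.0 / b      # the one expensive operation
    return a * reciprocal     # a cheap multiply


def scale_all(values, b):
    """Divide many values by the same b using a single reciprocal."""
    reciprocal = 1.0 / b
    return [v * reciprocal for v in values]


if __name__ == "__main__":
    print(divide_via_reciprocal(10.0, 4.0))
    print(scale_all([1.0, 2.0, 3.0], 4.0))
```

Note that a * (1/b) rounds twice, so for some inputs it can differ from a true division in the last bit.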

The numbers normally quoted in sales material are the theoretical maximum (the 'guaranteed not to exceed' number), which is often a poor measure of the true throughput of the chip/solution.

For Nvidia to reach a number of 200 GFLOPS they probably count every single floating point unit on the chip, assume that each is active every single cycle, and multiply by the number of cycles per second. Looks good... but isn't very informative.

Take the R300 with 8 pixel shaders and 4 vertex shaders, each of which can do 8 FP ops per cycle, for a total of 31 GFLOPS of shading power. Going by the quoted numbers, one would assume that the NV3x would wipe the floor with the R300 in FP. It doesn't.

That's because a lot of the FP power in the Nvidia number goes into "hidden" stuff like triangle setup, FP<->int conversions, iterators etc., whereas the 31 GFLOPS for the R300 is the raw shader power.
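The arithmetic behind a peak number like that is just units x ops per cycle x clock. A rough Python sketch follows; the 325 MHz core clock is an assumption on my part (it is the Radeon 9700 Pro clock and is not stated in the post), and `peak_gflops` is a made-up helper name.

```python
def peak_gflops(units: int, ops_per_unit_per_cycle: int, clock_hz: float) -> float:
    """Theoretical peak: every unit busy on every cycle. Real code rarely gets there."""
    return units * ops_per_unit_per_cycle * clock_hz / 1e9


# Assumption: 325 MHz core clock (Radeon 9700 Pro); the post does not give a clock.
R300_CLOCK_HZ = 325e6

# 8 pixel shaders + 4 vertex shaders, 8 FP ops per cycle each (figures from the post).
r300 = peak_gflops(8 + 4, 8, R300_CLOCK_HZ)

if __name__ == "__main__":
    print(f"R300 peak shader throughput: {r300:.1f} GFLOPS")
```

With those inputs the result lands on the roughly 31 GFLOPS quoted above. The same formula produces a much bigger number if you also count setup, conversion and iterator units, which is the point: the total depends heavily on what you decide to count.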

As for general purpose CPUs:

P4 can do 4 FP ops per cycle for 12+ GFLOPS @ 3+ GHz. Athlon/Opteron can do a similar number of ops per cycle but is clocked (a lot) lower (but has fewer restrictions on issuing FP ops, so it probably has slightly higher throughput per cycle than the P4).

Gekko can do 2 FP MADDs (2 FP adds and 2 FP muls) per cycle, so just under 2 GFLOPS.
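For anyone who wants to check the CPU figures, here is a small Python sketch of the same peak arithmetic, counting each multiply-add as two ops. The clocks are assumptions rather than figures from this post: 3.0 GHz for the P4 (the "3+ GHz" floor) and 485 MHz for Gekko (the GameCube CPU clock). The helper name is made up.

```python
FLOPS_PER_MADD = 2  # one multiply-add counts as a multiply plus an add


def cpu_peak_gflops(flops_per_cycle: int, clock_hz: float) -> float:
    """Theoretical peak, assuming the FP units are fed on every single cycle."""
    return flops_per_cycle * clock_hz / 1e9


# Assumption: 3.0 GHz P4, 4 FP ops per cycle (ops figure from the post).
p4 = cpu_peak_gflops(4, 3.0e9)

# Assumption: Gekko at 485 MHz, 2 multiply-adds per cycle (MADD figure from the post).
gekko = cpu_peak_gflops(2 * FLOPS_PER_MADD, 485e6)

if __name__ == "__main__":
    print(f"P4:    {p4:.2f} GFLOPS")
    print(f"Gekko: {gekko:.2f} GFLOPS")
```

Under those assumptions this gives 12 GFLOPS for the P4 and 1.94 GFLOPS for Gekko, i.e. "just under 2". Both are ceilings, not measurements.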

Cheers
Gubbi

PiNkY · Aug 14, 2003

As Gubbi pointed out, FLOPS are somewhat of a generic measurement. To determine the FLOPS rating of an IC, usually the throughput per second of its fastest FP operation is measured. This makes FLOPS one of the most meaningless performance metrics there is (even worse than OPs counts). There are various reasons for this besides the obvious difference in arithmetic complexity between a fully IEEE-compliant 128-bit FP division and an arbitrarily rounded 16-bit FMAC, where the first one is regarded as a single op while the second counts as 2. For an in-depth, easy read on them I'd really recommend the beginning chapters of Computer Architecture by Hennessy & Patterson. Another good example is that when FP coprocessors became add-in options for low-end desktop PCs, your FLOPS rating actually shrank when adding one while your performance went up, as emulating FP operations with int arithmetic usually yielded higher instruction throughput.
