Are all FLOPS created equal?

Agisthos

Newcomer
With all this talk about next gen console CPUs, I'm hoping someone can explain a few things.

It seems that in all these forums we have tech experts who take a PPU x 2 x number of instructions = 400 GFLOPS etc.

Comparing the Xbox360's 70-90 GFLOPS to the PS3's 256 GFLOPS means game over, as the PS3 is much more powerful.
But is it as simple as that? Are we talking just theoretical figures? What about real world performance and other issues?

Most discussion here seems to revolve around "x this by that", which shows me that a lot of people have no real idea of the actual workings of these new multi-core Cell/CPUs, which is what I want to learn about.
 
More importantly... how the hell does J Allard come up with "over a teraflop of targeted computer performance" for Xbox360?? :D

(Apparently, there are different ways of interpreting FLOPS-numbers)
 
Yes, good point. There are ways of pumping up and manipulating these figures. That's where the 'theoretical vs real world' issue comes into play.
 
Good thread though.. I would also like to know how these FLOPS numbers are "calculated", and what the differences are between them and real-world figures etc etc.

PS3 seems like a beast FLOPS-wise (if the machine ends up with the 1+8 config), but these are theoretical, not "real world" numbers, right?
If so, then the supposed 70-90 GFLOPS for Xbox360 should also be theoretical..

Should be interesting to see what comes out in this thread..
 
Agisthos said:
With all this talk about next gen console CPUs, I'm hoping someone can explain a few things.

An example from the PC world: a P4, Celeron, A64 and Athlon XP all have the same peak flops per clock cycle, while a G5 actually has twice the peak FP power per clock of these chips. Which chip would you buy as a gamer? In real life a single 2.2GHz A64 would outperform a dual 2.5GHz G5 in games, in spite of the fact that the dual G5 is 4.5 times faster in peak flops.

Looking at Cell we have one general purpose processor and eight special streaming processors - offering an extremely high peak flops score, but also extremely high performance penalties when running workloads that are not suitable for this kind of processor (easily 10x or more).

The Xbox2 CPU has 3 general purpose cores/processors; it offers far less peak performance than Cell, but most types of workloads should run pretty well on it.

Which is the best solution is hard to say - it depends purely on the workload.
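Tim's 4.5x figure checks out if you plug in plausible per-clock rates. A quick sketch - note the 4 and 8 flops/cycle numbers below are my own assumptions (A64's FP units vs the G5's two FMA-capable FPUs), not figures from his post:

```python
# Peak flops = clock rate x number of cores x flops per cycle.
# Per-clock rates are assumptions for illustration.
A64_FLOPS_PER_CLOCK = 4   # assumed for a single Athlon 64 core
G5_FLOPS_PER_CLOCK = 8    # assumed: twice the per-clock FP of the A64

a64_peak = 2.2e9 * 1 * A64_FLOPS_PER_CLOCK   # single 2.2 GHz A64
g5_peak = 2.5e9 * 2 * G5_FLOPS_PER_CLOCK     # dual 2.5 GHz G5

print(a64_peak / 1e9)                 # 8.8 GFLOPS
print(g5_peak / 1e9)                  # 40.0 GFLOPS
print(round(g5_peak / a64_peak, 1))   # 4.5 - matches the claim
```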
 
FLOPS = Floating Point Operations Per Second.
'Floating Point' means a decimal number like 234.1832 or 0.001834

The FLOPS rating is therefore a measure of how many calculations can be performed per second. Given that all computing is processing numbers, that should give a rough indicator of a computer's level of performance. These are theoretical peak values.

Among these FLOPS you get general/programmable flops and targeted flops. General/programmable flops are ones you can set working on any calculations you like; you get these from your general purpose CPU. Targeted flops serve only one purpose; you get these in a graphics card. E.g. a vertex pipeline might be doing so many flops transforming vertices, but if you want to calculate weather patterns it's no good to you - it can only be used to transform vertices.

What we're seeing more and more is companies digging out as many FLOPS as they can find. Every time any action is performed on a floating point number, it's a flop. This is where Allard's 1 teraflop number comes from.

You can work out the FLOPS rating of a processor with a little sum: the number of floating point operations a core can do per clock cycle x the number of cores x the number of cycles per second.

A 3 core, 3 GHz processor where each core can do 4 floating point operations per cycle has a peak rating of 3 x 3 billion x 4 = 36 billion, or 36 gigaflops.
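That little sum as a sketch in Python (the helper name is my own invention):

```python
# Theoretical peak FLOPS: flops per cycle x cores x clock rate (in GHz,
# so the result comes out in GFLOPS).
def peak_gflops(cores, ghz, flops_per_cycle):
    return cores * ghz * flops_per_cycle

# The worked example above: 3 cores at 3 GHz, 4 flops per cycle each.
print(peak_gflops(3, 3.0, 4))  # 36.0 GFLOPS
```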

In real world situations your processor isn't JUST number crunching. It's doing other stuff too. It has to wait for data to be made available to work on, and this depends on other system bottlenecks. These restrictions vary from application to application and with different optimisations, so real-world performance isn't predictable. That's really why peak numbers are used: they are factual. With all processors quoting theoretical peaks, you get a general impression of relative power, assuming the scaling down of performance in real-world applications is comparable.
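To make that "scaling down" concrete, here's a sketch of how real-world ordering can flip relative to peak ordering. The efficiency percentages are invented purely for illustration, not measured figures:

```python
# Sustained throughput = theoretical peak x achieved efficiency.
# Efficiency figures below are made up to illustrate the point.
def sustained_gflops(peak_gflops, efficiency):
    return peak_gflops * efficiency

# A streaming-heavy chip on an unsuitable (branchy) workload...
cell_like = sustained_gflops(256, 0.10)
# ...vs general purpose cores on the same workload.
xcpu_like = sustained_gflops(80, 0.40)

# The lower-peak chip sustains more here.
print(cell_like, xcpu_like)
```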

Comparing XB360 and PS3 this way adds another nightmare in that Cell works very differently from the XB360 CPU, so predicting bottlenecks is extra hard, especially given we've no details of how Cell is incorporated into the PS3 system as a whole.

So, as is often the way, the answer is nobody knows! It's a case of wait and see, and comparing the machines when they're actually out. The only comparison it's reasonably fair to make is that if PS3 CPU = 256+ GFlops and XB360 CPU = 80 GFlops, the PS3 should be able to crunch numbers 2-3x better, though you'll even get people here saying such assumptions are wrong and Cell might only achieve 25% better performance!
 
Tim said:
Agisthos said:
With all this talk about next gen console CPUs, I'm hoping someone can explain a few things.

An example from the PC world: a P4, Celeron, A64 and Athlon XP all have the same peak flops per clock cycle, while a G5 actually has twice the peak FP power per clock of these chips. Which chip would you buy as a gamer? In real life a single 2.2GHz A64 would outperform a dual 2.5GHz G5 in games, in spite of the fact that the dual G5 is 4.5 times faster in peak flops.

Looking at Cell we have one general purpose processor and eight special streaming processors - offering an extremely high peak flops score, but also extremely high performance penalties when running workloads that are not suitable for this kind of processor (easily 10x or more).

The Xbox2 CPU has 3 general purpose cores/processors; it offers far less peak performance than Cell, but most types of workloads should run pretty well on it.

Which is the best solution is hard to say - it depends purely on the workload.

Give this man a cigar! :D
 
Tim said:
Agisthos said:
With all this talk about next gen console CPUs, I'm hoping someone can explain a few things.

An example from the PC world: a P4, Celeron, A64 and Athlon XP all have the same peak flops per clock cycle, while a G5 actually has twice the peak FP power per clock of these chips. Which chip would you buy as a gamer? In real life a single 2.2GHz A64 would outperform a dual 2.5GHz G5 in games, in spite of the fact that the dual G5 is 4.5 times faster in peak flops.
...

What you're saying here is that a dual G5 has ~4 times the FP rating of a single Athlon clock for clock, yet it's worse for games in the real world? Yes, these games must be really optimised for G5s and Athlons respectively, I must say! :rolleyes:

Yes, MS must really be retarded to have dual G5 alpha dev kits for Xenon instead of going with a single Athlon! :rolleyes:

Comparing games on PC-land open architectures does not apply to embedded, closed console architectures... :rolleyes:
 
Jaws said:
What you're saying here is that a dual G5 has ~4 times the FP rating of a single Athlon clock for clock, yet it's worse for games in the real world? Yes, these games must be really optimised for G5s and Athlons respectively, I must say! :rolleyes:

Yes, MS must really be retarded to have dual G5 alpha dev kits for Xenon instead of going with a single Athlon! :rolleyes:

Comparing games on PC-land open architectures does not apply to embedded, closed console architectures... :rolleyes:

I really think you need to buy yourself a new brain; the one you currently have has a serious reading comprehension problem.
 
No, I got the end of your post alright... your initial analogy with dual G5s and a single Athlon was way off and completely misleading. Unless you care to clarify further, as I hate misunderstandings, you know...?
 
I think this is a big misunderstanding.

Tim was saying that ultimately, the output of a chip largely depends on the software.

So one chip might have a higher theoretical "peak" number mumbo jumbo, but if the application is not optimised for it, it will perform worse than a chip with smaller PR numbers.

Nothing new really, this has been discussed before. The G5 example was a bit unclear, but it makes sense.
 
london-boy said:
I think this is a big misunderstanding.

Tim was saying that ultimately, the output of a chip largely depends on the software.
...

Yes, I got that point, as it's been discussed here to death. It's the initial analogy with a dual G5 and a single Athlon being presented as fact (or so it seemed to me) that bothered me.
 