Xbox 360 1 Teraflop of Performance Explained

Titanio said:
Pixel Shaders: 16 flops / clock cycle (only counting main ALUs)
- 16 x 550Mhz x 24 Pixel Shaders = 211.2 Gflops
One note on that calculation. It assumes there is no texturing going on. The first ALU splits time as the bi-linear filter for texture reads.
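
For reference, the quoted peak figure is just a product of three terms. A minimal Python sketch, assuming no texturing is going on (the helper name `peak_gflops` is made up for illustration):

```python
def peak_gflops(flops_per_clock, clock_mhz, units):
    """Theoretical peak in Gflops: flops per clock x clock in MHz x number of units."""
    return flops_per_clock * clock_mhz * units / 1000.0

# 16 flops/clock, 550 MHz, 24 pixel shaders (figures from the quote above)
pixel_shader_peak = peak_gflops(16, 550, 24)
print(round(pixel_shader_peak, 1))  # 211.2
```

Any time the first ALU spends on texture filtering comes straight off this number, so it is an upper bound rather than an estimate.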
 
richardpfeil said:
Titanio said:
Pixel Shaders: 16 flops / clock cycle (only counting main ALUs)
- 16 x 550Mhz x 24 Pixel Shaders = 211.2 Gflops
One note on that calculation. It assumes there is no texturing going on. The first ALU splits time as the bi-linear filter for texture reads.

Yes, this is true, but it's still programmable flops, so I think it can be counted. If it was a choice between dedicated texture addressors etc. and an ALU that can split its time that way, I'd take the latter, especially as shader lengths increase. It's a caveat when comparing with Xenos figures, though.
 
While I agree with Titanio's numbers as a straight programmable flops comparison:

PS3 = 472.8 vs X360 = 355.2

or 33% more for the PS3, there are just so many variables that need to be considered when contrasting these platforms that such numbers are truly meaningless. I can't find a single indisputable argument for either platform that could be used to establish superiority of one over the other. The only honest assessment that can be made is that the extent of any performance advantage is likely to be smaller than what the majority of consumers are able to perceive.
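
The 33% figure follows directly from the two totals above. A quick Python check, using only the numbers quoted in this post (the variable names are illustrative):

```python
ps3_gflops = 472.8   # programmable flops total quoted above
x360_gflops = 355.2  # programmable flops total quoted above

advantage_pct = (ps3_gflops / x360_gflops - 1.0) * 100.0
print(f"{advantage_pct:.1f}% more")  # 33.1% more
```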
 
x34.jpg


It's nice to see that 52X listed. That's the increase from Xbox CPU to Xbox 360 CPU in single precision floating point operations per second.

Heh, that's a larger increase in FLOPS than from Emotion Engine to PS3's CELL, which is a 35x jump.
 
But EE has ~2(?)x the flops of XCPU, so it is normal that they get a lower increase.

Anyway, we still don't know anything about XeCPU, so flops alone will tell too little...
 
So if we take the average of PS2 and XB CPU to serve as the baseline reference representing "typical" performance from that era, that comes out to 4.2 GFLOPs. Now take respective ratios, using PS3 and XB2 CPU, we get XB2 = 27x the average, PS3.....= 52x the average! :p Fancy that! ;) Numbers are so fickle.
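
The arithmetic can be reproduced roughly as follows. This is a sketch under assumptions: the old-generation figures are back-derived from the 52x and 35x multipliers mentioned in this thread and the 115.2 and 218 Gflops peak figures for the new CPUs, not taken from any spec sheet.

```python
# New-generation peak figures (Gflops) as used in this thread
xb2_cpu = 115.2
ps3_cpu = 218.0

# Old-generation figures, back-derived from the quoted multipliers (assumption)
xb_cpu = xb2_cpu / 52   # roughly 2.2 Gflops
ps2_ee = ps3_cpu / 35   # roughly 6.2 Gflops

baseline = (xb_cpu + ps2_ee) / 2   # "typical" CPU of that era, about 4.2 Gflops

xb2_ratio = xb2_cpu / baseline
ps3_ratio = ps3_cpu / baseline
print(round(baseline, 1), round(xb2_ratio), round(ps3_ratio))  # 4.2 27 52
```

Changing the baseline changes which platform gets to claim the bigger multiplier, which is the whole joke.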
 
I thought the G5 in the alpha kit did 12 floating point ops/cycle, and the XCPU did 8.... has this changed?
 
I can't get more than 8 flops either... Maybe someone is stretching things somewhere...

I guess 3x3.2x 12 = 115.2

is much more impressive than

3x3.2x8 = 76.8

( Though even Sony's numbers may fall foul of the 12 flops/cycle PowerPC marketing number )

3.2x12 + 7x3.2x8 = 217.6 ( 218 )

is bigger than ( by a smaller amount though.. )

3.2x8 + 7x3.2x8 = 204.8 ( 205 )

If you take the 8 flops/cycle numbers, the PS3 CPU is closer to 3 times faster than the XB360 (204.8 / 76.8 is about 2.7).
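
The four totals above can be checked with a short Python sketch. The 3.2 GHz clock and the per-core flops/cycle values are the ones used in this post; the helper `cpu_peak` and the (cores, flops per cycle) layout are just an illustration.

```python
CLOCK_GHZ = 3.2

def cpu_peak(core_groups):
    """Sum peak Gflops over (core_count, flops_per_cycle) groups at CLOCK_GHZ."""
    return sum(count * CLOCK_GHZ * flops for count, flops in core_groups)

xcpu_12 = cpu_peak([(3, 12)])           # 3 cores at the 12 flops/cycle figure
xcpu_8 = cpu_peak([(3, 8)])             # 3 cores at 8 flops/cycle
cell_12 = cpu_peak([(1, 12), (7, 8)])   # PPE at 12, seven SPEs at 8
cell_8 = cpu_peak([(1, 8), (7, 8)])     # PPE at 8, seven SPEs at 8

print(round(xcpu_12, 1), round(xcpu_8, 1))   # 115.2 76.8
print(round(cell_12, 1), round(cell_8, 1))   # 217.6 204.8
print(round(cell_12 / xcpu_12, 2))           # 1.89
print(round(cell_8 / xcpu_8, 2))             # 2.67
```

So the headline ratio moves from just under 2x to about 2.7x depending only on which flops/cycle figure is believed.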
 
Since when is 2.9*52 = 115? :?

CrazyAce said:
Maybe someone is stretching things somewhere...
The going explanation is 2-way single precision SIMD on the scalar FPU - like Gekko's. It's a minor modification in silicon with major PR effects when you have 3 cores.
Next to useless in real life, but who counts that :p
 
Fafalada said:
The going explanation is 2-way single precision SIMD on the scalar FPU - like Gekko's. It's a minor modification in silicon with major PR effects when you have 3 cores.
Next to useless in real life, but who counts that :p

One possible explanation :rolleyes:

My personal favorite explanation ( apart from the obvious PR cloning of G5 specs from alpha kit.. ) is that someone has made an overzealous interpretation of how many flops are used in the altivec instruction vnmsubfp ;)
 
AFAIK Xenos has 48 ALUs grouped in 3 processors (16 ALUs per processor).
At any given clock cycle a processor can work on 16 vertices or on 16 pixels (or 4 pixel quads..)
This doesn't mean Xenos batches pixels or vertices in groups of 16 entities; I heard pixel/vertex batches have a size of 64 entities.
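
As a rough illustration of what those numbers would imply: the sketch below assumes the 64-entity batch size is right and that a processor simply walks a batch 16 entities at a time. Both are assumptions rather than confirmed specs, and all names are made up.

```python
import math

TOTAL_ALUS = 48
PROCESSORS = 3
ALUS_PER_PROCESSOR = TOTAL_ALUS // PROCESSORS  # 16
BATCH_SIZE = 64  # hearsay figure, not a confirmed spec

def clocks_per_instruction(batch_size, alus):
    """Clocks for one processor to apply one instruction to a whole batch."""
    return math.ceil(batch_size / alus)

print(clocks_per_instruction(BATCH_SIZE, ALUS_PER_PROCESSOR))  # 4
```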
 
CrazyAce said:
One possible explanation
Well it's sensible enough to me, especially if VMX and FPU can coissue (and hints have been that they can). EE Flops were counted by simply adding all the FMACs together, so at least 2Flop/cycle come from scalar FPU. But perhaps you're right, the other 2 could still be from something else.

My personal favorite explanation ( apart from the obvious PR cloning of G5 specs from alpha kit.. ) is that someone has made an overzealous interpretation of how many flops are used in the altivec instruction vnmsubfp
Well, cloning of G5 specs doesn't make sense as an explanation of how the PS3 PPE got the same boost :p But the creative instruction interpretation could work for both, yeah.
 
Rockster said:
There are just so many variables that need to be considered when contrasting these platforms that such numbers are truly meaningless.
Of course. These peak figures are what's attainable if everything is processing at the same time without any holdups or bottlenecks. Yet using these processing pipelines is a whole different aspect of the system architecture.

At least for Cell we have to date a few examples of how effective the utilisation can be, with the FFT example showing it's possible for Cell to bring a substantial amount of that FP performance to bear. Of course we're lacking examples from the game world, but we can be hopeful similar performance efficiency is attainable once the right algorithms are used. At the moment we have no real-world examples of XeCPU's sustainable FLOPS in any application at all. Just how well can its VMX units be fed?
 
nAo, I remember that as well. I think the point being made is that on the surface the G70/RSX seems more granular because the SIMD paths are only 4 wide as opposed to 16. Many people don't understand that Xenos groups based on instructions, not triangles, so the 16 ALUs don't need to be working on pixels from the same triangle; they just need to be working with the same instruction.
 
So now floating point matters again? Just curious, after the campaign started by MS to inform us that floating point doesn't matter since 80% of game code is integer...

Hardware is becoming more ambiguous than fashion, one day something's cool, the next it's crap.
 
london-boy said:
So now floating point matters again? Just curious, after the campaign started by MS to inform us that floating point doesn't matter since 80% of game code is integer...

Hardware is becoming more ambiguous than fashion, one day something's cool, the next it's crap.
Flavour of the month. We all know that what's important and what's not depends not just on what your system's good at but also on what your opponent's isn't. These companies probably have whole departments given over to fine-toothing enemy PR announcements and spec-fests and inventing imaginative counter-arguments to delude the masses with. Really, all PR announcements come down to a paraphrase of

"Our system's better than theirs at this and that. And though theirs is good at that and this, those features are useless."
 
london-boy said:
So now floating point matters again? Just curious, after the campaign started by MS to inform us that floating point doesn't matter since 80% of game code is integer...

Hardware is becoming more ambiguous than fashion, one day something's cool, the next it's crap.
I don't think it matters. They are just explaining the 1 TFlop figure in their spec sheet. At least they aren't doing stupid comparisons like Sony or majornelson. :LOL:

Wish they'd explained their general purpose/integer performance figures.
 
Don't want to revisit that topic, but remember that Major Nelson article was official talk from MS engineers. The IGN article that covered it explained they were approached at the end of E3 by these MS guys with a comparison. They are no saints...
 
london-boy said:
So now floating point matters again? Just curious, after the campaign started by MS to inform us that floating point doesn't matter since 80% of game code is integer...

Hardware is becoming more ambiguous than fashion, one day something's cool, the next it's crap.


Any raw, theoretical number is worthless IMO. They are both simply spouting off unachievable figures for marketing purposes, and comparing them to each other is quite silly.



RSX Flop numbers assume that all of the Vertex and Pixel Shader units are being pushed to the maximum at 100% efficiency. That's not even remotely close to what is actually achievable.

Cell's numbers are assuming that all 7 SPE's and the PPE are all being pushed to their limits simultaneously and at 100% efficiency. We are more likely to see the start of the next ice age before that happens.

No different with the 360. No CPU is going to be pushed to 100%, especially a multi-core in-order design. The 360 GPU is probably the component that will come closest to its theoretical numbers, but even ATI admits that it will only be in the 80-90% efficiency range, meaning what is achievable is still 10-20% below their numbers.
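
To make the efficiency point concrete, here is a small Python sketch that takes the 80-90% range at face value. The function names are illustrative, and "efficiency" is assumed to mean sustained throughput divided by theoretical peak.

```python
def shortfall_pct(efficiency):
    """How far the sustained figure falls below peak, as a % of peak."""
    return (1.0 - efficiency) * 100.0

def overstatement_pct(efficiency):
    """How far the peak figure exceeds the sustained one, as a % of sustained."""
    return (1.0 / efficiency - 1.0) * 100.0

for eff in (0.80, 0.90):
    print(f"{eff:.0%} efficient: {shortfall_pct(eff):.0f}% below peak, "
          f"peak is {overstatement_pct(eff):.0f}% above sustained")
```

At 80% efficiency the quoted peak is 25% above what is actually sustained, so the gap looks a little bigger when measured from the achievable side.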

Raw power is meaningless; it's how efficient it is, and how well it does its specific job, that matters.


My car has a fraction of the power of an 18-wheeler, but guess which one is faster and more efficient at driving around town.
 
Powderkeg said:
My car has a fraction of the power of an 18-wheeler, but guess which one is faster and more efficient at driving around town.

That's a completely useless comparison. One could say the 18-wheeler can carry a load that's many times bigger than your fast car.

It has nothing to do with what's being talked about here. A car is made to be driven fast round the city, and an 18-wheeler is made to carry huge loads. Each obviously does what it is supposed to do better than the other.
 