Does Cell Have Any Other Advantages Over XCPU Other Than FLOPS?

aaronspink said:
The only lesson I've learned is that forum posters like you lack the proper experience and understanding to make an engineering decision.

Aaron Spink
speaking for myself inc.

Do you think that Sony/IBM/Toshiba chose wrong with the Cell?
 
aaronspink said:
Physics and collision detection should both be fairly easy to do with vector engines; they have, after all, been done with vector engines since the 60s and in general have the highest efficiencies on vector engines.
If all you have is a hammer ... this is no more relevant than the fact that so much of it moved to clusters in the last decade.
 
Guilty Bystander said:
I dunno if this has been debated before in this thread, but instead of comparing Xenos with a single SPE you guys should be comparing the Cell with the Xenon.
Both CPU articles are out now on IBM.com, and the two should be roughly compared, on paper of course.

We now know the Cell is able to do about 198 GFlop/s in realtime while Xenon can only hit 100-110 GFlop/s at best (probably only 80-90 GFlop/s in realtime).
What should developers do with all that extra computational power?
That's the real question that should be asked and answered here.

Every techhead should know by now that the PS3 will be much more powerful than the Xbox 360.
And for every techhead that doesn't: the Cell has more raw power (more floating point calculations, more integer calculations, more hardware threads, more cores, more vector calculations etc.), more bandwidth and more flexibility (code can be spread over more hardware threads and SPEs, which are more specialised at whatever task you give them) than the Xenon, and the RSX (if we think of it as a GTX 512 core) has more raw power (more floating point calculations, more integer calculations, more vertex calculations) and higher fillrates (13.2 GTexel/s and 8.8 GPixel/s).


Lol, far more powerful. Can't wait to see the look on the faces of some of you 3 years down the road; we are talking less than the GC-Xbox difference in graphics here lol
 
Unknown

Guilty Bystander said:
I dunno if this has been debated before in this thread, but instead of comparing Xenos with a single SPE you guys should be comparing the Cell with the Xenon.
Both CPU articles are out now on IBM.com, and the two should be roughly compared, on paper of course.

We now know the Cell is able to do about 198 GFlop/s in realtime while Xenon can only hit 100-110 GFlop/s at best (probably only 80-90 GFlop/s in realtime).
What should developers do with all that extra computational power?
That's the real question that should be asked and answered here.

Every techhead should know by now that the PS3 will be much more powerful than the Xbox 360.
And for every techhead that doesn't: the Cell has more raw power (more floating point calculations, more integer calculations, more hardware threads, more cores, more vector calculations etc.), more bandwidth and more flexibility (code can be spread over more hardware threads and SPEs, which are more specialised at whatever task you give them) than the Xenon, and the RSX (if we think of it as a GTX 512 core) has more raw power (more floating point calculations, more integer calculations, more vertex calculations) and higher fillrates (13.2 GTexel/s and 8.8 GPixel/s).

Have you seen the Heavenly Sword discussion in this thread?

http://www.beyond3d.com/forum/showthread.php?t=26468

It is quite fascinating to have an inside look into the mind of a PS3 developer in the process of determining how to use CELL. EA has claimed to have made this decision with 3 SPEs for graphics and 4 for other tasks. The Heavenly Sword team seems to have a different approach (from EA) to CELL utilization.

For information on utilization of the Xbox 360 CPU (Xenon), I point you to this thread:

http://www.beyond3d.com/forum/showthread.php?t=26487

In a link within the thread, a developer discusses use of the CPU for post effects, HDR, etc. Maybe a different implementation of HDR than expected. Despite the smaller number of available hardware threads, developers are able to use Xenon for some graphics tasks.
 
Mintmaster said:
Every uneducated wannabe techhead. ;)
scratch the "much", and there you go, an educated wannabe techhead.

six months....

hardware wise, time was a HUGE advantage to GC/Xbox over ps2, but somehow some of you ppl think Sony won't do anything with a six month advantage to stay ahead of the competition.
sorry but i'm going to refuse the idea that sony won't do a thing to improve ps3.


Anyway, moving along
Nobody remembers that back at E3 2005 it was very very very convenient for sony to:

-post cell speed at 3.2ghz?

-just a few hours after Microsoft revealed its own speed of 3.2ghz?

-and present rushed powerpoint screens with comparisons made with the X360 cpu (they don't even know the cpu name) and cell at 3.2ghz?
[image: ps3 pic.jpg]


-and even more rushed powerpoint screens showing 2Tflops in hard red letters barely visible in some random screen?
[image: ps3 pic 2.jpg]


call me weird, but it's damn....weird lol
 
it's funny that despite being overwhelmingly effective sony's presentations look like they're made by a 1st year business student learning powerpoint for the first time.

Maybe it's the secret to their success? Gotta break it down to a level the journalists can understand?

Like this:
[image: ps3 pic 2.jpg]


lol, that's classic. I don't know if I could make a worse slide if I tried.
 
that's what i mean lol....
my guess (without taking sides here) is that sony made up the cell speed and Tflop rating between the microsoft presentation and the sony one.

The Tflop rating was obvious from the beginning, the double....

but CELL at 3.2ghz may have more to do with X360 also being 3.2ghz...
it's weird because months before ibm and sony were showing cell at 4.6ghz just like that, 4.6ghz

i wonder what the specs were before sony knew the X360 specifications.....

The RSX Nvidia slides were very very well done, a lot of work went into those, so that could mean that RSX was figured out way before the X360 presentation.
CELL speed was not, i think
 
Theoretical Gflops

Although the SPU double-precision (DP) floating-point is not as high as the single-precision performance, it is still good. Each SPU is capable of executing two DP instructions every seven cycles. With Fused-Multiply-Add, an SPU can achieve a peak 1.83GFLOPS at 3.2GHz. With eight SPUs and fully pipelined DP floating-point support in the PPE's VMX, the Cell BE is capable of a peak 21.03GFLOPS DP floating-point, compared to a peak of 230.4GFLOPS SP floating point.

http://www-128.ibm.com/developerworks/power/library/pa-cellperf/

What is interesting about this statement as a theoretical performance estimate is:

1) IBM considers the Single Precision capability of the PPE = 25.6 Gflops, not 38.4 Gflops (so only VMX is counted). This is based on the statement that 8 SPE CELL has SP capability of 230.4, or 9x25.6. So maybe IBM considers 7 SPE CELL to have SP capability of 204.8 (not 217.6 Gflops), and Xenon's theoretical SP capability is not 115.2 Gflops but 76.8 Gflops.

2) IBM considers the Double Precision capability of the PPE to be 6.4 Gflops. This is based on the statement that 8 SPE CELL has DP capability of 21.03 Gflops, which leaves about 6.4 Gflops after subtracting 8 x 1.83 Gflops for the SPUs. So maybe Xenon's theoretical DP capability is 19.2 Gflops, the same as the 7 SPE CELL of PS3.
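
The arithmetic behind points 1) and 2) can be sketched in a few lines of Python. This is only a rough check under stated assumptions, not IBM's own method: it assumes one 4-wide single-precision fused multiply-add per cycle for each SPE and for the PPE's VMX unit, 4 double-precision flops every 7 cycles per SPU (the reading that reproduces the quoted 1.83 GFLOPS), and, as the post does, that each Xenon core matches the PPE. The function and constant names are made up for illustration.

```python
# Peak-GFLOPS arithmetic for the figures discussed in the post above.
CLOCK_GHZ = 3.2  # clock quoted in the IBM article

# Assumption: one 4-wide single-precision FMA per cycle per unit,
# i.e. 4 lanes x 2 flops = 8 flops/cycle (SPE or PPE VMX).
SP_FLOPS_PER_CYCLE = 4 * 2

# Assumption: 4 double-precision flops every 7 cycles per SPU,
# the reading that reproduces the quoted 1.83 GFLOPS.
DP_FLOPS_PER_CYCLE_SPU = 4 / 7

IBM_DP_PEAK_8_SPE = 21.03  # IBM's quoted DP peak for 8 SPUs plus the PPE

sp_per_unit = SP_FLOPS_PER_CYCLE * CLOCK_GHZ
dp_per_spu = DP_FLOPS_PER_CYCLE_SPU * CLOCK_GHZ
# Whatever is left of IBM's DP total after the 8 SPUs is attributed to the PPE.
dp_ppe = IBM_DP_PEAK_8_SPE - 8 * dp_per_spu


def cell_sp(n_spe):
    """Single-precision peak for a Cell with n_spe SPEs plus one PPE."""
    return (n_spe + 1) * sp_per_unit


def cell_dp(n_spe):
    """Double-precision peak for a Cell with n_spe SPEs plus one PPE."""
    return n_spe * dp_per_spu + dp_ppe


def xenon_sp(cores=3):
    """Assumption from the post: each Xenon core matches the PPE's VMX."""
    return cores * sp_per_unit


def xenon_dp(cores=3):
    """Assumption from the post: each Xenon core matches the PPE in DP."""
    return cores * dp_ppe


if __name__ == "__main__":
    print(f"SP per unit:    {sp_per_unit:.1f}")
    print(f"DP per SPU:     {dp_per_spu:.2f}")
    print(f"DP for PPE:     {dp_ppe:.2f}")
    print(f"Cell 8 SPE SP:  {cell_sp(8):.1f}")
    print(f"Cell 7 SPE SP:  {cell_sp(7):.1f}")
    print(f"Xenon SP:       {xenon_sp():.1f}")
    print(f"Cell 7 SPE DP:  {cell_dp(7):.1f}")
    print(f"Xenon DP:       {xenon_dp():.1f}")
```

The larger figures mentioned in point 1) (38.4, 217.6 and 115.2 Gflops) differ from these by 12.8 Gflops per PowerPC core, which is the part IBM's total appears to leave out.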
 
dskneo said:
six months....

hardware wise, time was a HUGE advantage to GC/Xbox over ps2, but somehow some of you ppl think Sony won't do anything with a six month advantage to stay ahead of the competition.
sorry but i'm going to refuse the idea that sony won't do a thing to improve ps3.

Yep, but the time difference was much bigger then; they had hardly started designing the Xbox hardware while the PS2 was already out. The other thing is that I think Sony takes it a bit "slower" from when the hardware is ready until they launch on the market, so those six months shouldn't make much of a difference, especially if what they do during this time is more of a modification/redesign of an existing design so that it fits their console, rather than upping the specs. After all, I think they are more concerned about yields, having many chips quite "cheap", rather than adding features at the last minute. That's not to say that RSX might not have borrowed some stuff from the future gen, just like the Xbox chip did...
 
i love aaron spink's posts, he knows a lot of things.
i'm very far from understanding all of what he is saying, hence my question.

Aaron, is your point that the cell architecture is somewhat of a waste?
edit : i know you say cell is an understandable design choice for the PS3
Let me explain myself: from what i understand of your comments, it would be better to have a simpler core (one or two ppe), a dsp-like pu for vertex, physics and lighting, and an nvidia pixel shader/rop monster?
that's the way i understand your comparison of xenos and 7 spe, which are very close in number of transistors; the efficiency of xenos at math is clear (hence it runs at 500Mhz)

don't hurt me if this post is stupid, i'm trying to get a simpler view of your opinion.

edit : talking about R&D cost would be interesting in this discussion.
edit2 : ageia has technology, ibm too, nvidia the same, and for the memory bus and memory controller (flex IO) it's up to rambus. A lot of savings in R&D costs should have been made, no?
edit : sorry for so many edits
 
aaronspink said:
And this isn't even taking into account the process differences between TSMC's 90 nm process and IBM's 90 nm SOI process, which would likely give the Xenos another 20-25% frequency boost at the same power levels. Or the redundant ALU slice. If their redundancy system had been more advanced, it would have been possible that we would have seen a Xenos with a peak performance of 320 GFLOPs.

So... uhmm.... seriously, WTF??

Everybody let's conjecture!!
 
dskneo said:
that's what i mean lol....
my guess (without taking sides here) is that sony made up the cell speed and Tflop rating between the microsoft presentation and the sony one.

The Tflop rating was obvious from the beginning, the double....

but CELL at 3.2ghz may have more to do with X360 also being 3.2ghz...
it's weird because months before ibm and sony were showing cell at 4.6ghz just like that, 4.6ghz

i wonder what the specs were before sony knew the X360 specifications.....

The RSX Nvidia slides were very very well done, a lot of work went into those, so that could mean that RSX was figured out way before the X360 presentation.
CELL speed was not, i think
That's just silly. You really think they had no slides whatsoever prepared until the end of the Xbox 360 conference?
Even if they had tweaked the numbers to better compete with MS, don't you think they'd have just made the slides ready well beforehand and replaced the numbers and columns accordingly?
That one slide is not that well made, that's true, but it is hardly any proof or indication that the slides were fabricated in response to the MS unveiling.
 
dskneo said:

8GFLOPS for a P4 but 115GFLOPS for 3 PowerPC cores, you're telling me VMX extensions account for the gap? It doesn't add up, it's not like Cell's 7 SPEs, designed from the ground up to process 3D maths. aaronspink, you dislike Cell, but will you explain the XCPU to us? Do you buy the 115GFLOPS of processing power in a game? Why are 360 games cpu bound and choking on the framerate and physics? :LOL:
Why is it that everyone and their damn dog keeps questioning whether Sony can make good on their promises but not a single soul questions ms? not a single one damnit! :rolleyes:
 
Vaan said:
So... uhmm.... seriously, WTF??

Everybody let's conjecture!!

There is a difference between conjecture and enlightened reasoning. I have a pretty good feel for the differences between the TSMC and IBM processes and it certainly wouldn't be surprising to get a ~25% performance boost from the IBM process and design flow vs the TSMC process and design flow. If you have other data, I'd love to hear it.

As far as the redundancy goes, it was I believe stated by Dave that there are actually 4 SIMD ALU arrays on the die and that 1 of the arrays is used for redundancy. It is therefore certainly possible that a better per-array redundancy mechanism could have been designed that would have allowed 4 functional SIMD arrays vs the current design that allows 3 functional SIMD arrays.

Aaron Spink
speaking for myself inc.
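
A rough sketch of how the 320 GFLOPs figure relates to the 3-of-4 array redundancy described above. The only inputs are numbers from the posts (320 GFLOPs with 4 functional arrays, 4 arrays on die, 3 usable, a 20-25% frequency boost). The assumptions are that peak throughput scales linearly with the number of functional arrays and with clock; the combined figures are illustrative, not something claimed in the thread.

```python
# Scaling arithmetic for the Xenos redundancy and process argument.
HYPOTHETICAL_PEAK_4_ARRAYS = 320.0  # GFLOPs, figure quoted in the thread
ARRAYS_ON_DIE = 4                   # per the post: 4 SIMD ALU arrays
ARRAYS_FUNCTIONAL = 3               # per the post: 1 array kept for redundancy

# Assumption: peak throughput scales linearly with functional arrays.
per_array = HYPOTHETICAL_PEAK_4_ARRAYS / ARRAYS_ON_DIE
implied_peak_3_arrays = per_array * ARRAYS_FUNCTIONAL


def with_frequency_boost(peak_gflops, boost):
    """Assumption: peak scales linearly with clock (boost=0.25 means +25%)."""
    return peak_gflops * (1.0 + boost)


if __name__ == "__main__":
    print(f"per array:             {per_array:.0f} GFLOPs")
    print(f"implied 3-array peak:  {implied_peak_3_arrays:.0f} GFLOPs")
    for boost in (0.20, 0.25):
        boosted = with_frequency_boost(implied_peak_3_arrays, boost)
        print(f"3 arrays at +{boost:.0%} clock: {boosted:.0f} GFLOPs")
```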
 
ihamoitc2005 said:
http://www-128.ibm.com/developerworks/power/library/pa-cellperf/

What is interesting about this statement as a theoretical performance estimate is:

1) IBM considers the Single Precision capability of the PPE = 25.6 Gflops, not 38.4 Gflops (so only VMX is counted). This is based on the statement that 8 SPE CELL has SP capability of 230.4, or 9x25.6. So maybe IBM considers 7 SPE CELL to have SP capability of 204.8 (not 217.6 Gflops), and Xenon's theoretical SP capability is not 115.2 Gflops but 76.8 Gflops.

2) IBM considers the Double Precision capability of the PPE to be 6.4 Gflops. This is based on the statement that 8 SPE CELL has DP capability of 21.03 Gflops, which leaves about 6.4 Gflops after subtracting 8 x 1.83 Gflops for the SPUs. So maybe Xenon's theoretical DP capability is 19.2 Gflops, the same as the 7 SPE CELL of PS3.

What's the double precision ability of current x86 cpus? I believe I've seen upwards of 10Gflops for x86 cpus, but is that single or double precision?
 
tema said:
8GFLOPS for a P4 but 115GFLOPS for 3 PowerPC cores, you're telling me VMX extensions account for the gap? It doesn't add up, it's not like Cell's 7 SPEs, designed from the ground up to process 3D maths. aaronspink, you dislike Cell, but will you explain the XCPU to us? Do you buy the 115GFLOPS of processing power in a game? Why are 360 games cpu bound and choking on the framerate and physics? :LOL:
Why is it that everyone and their damn dog keeps questioning whether Sony can make good on their promises but not a single soul questions ms? not a single one damnit! :rolleyes:

I think once you've been around a little longer you'll find that people on this forum are equally skeptical/critical of MS, Nintendo, and Sony.

In this particular discussion, when you make the claims for the Cell that STI has made, it is bound to draw some skepticism.
 
Guess

Fox5 said:
What's the double precision ability of current x86 cpus? I believe I've seen upwards of 10Gflops for x86 cpus, but is that single or double precision?

I am not certain, but here is a guess for Pentium 4, with ratings based on the IBM document for single precision performance:

P4 single precision capability is 15.2 Gflops. For comparison, 8 SPE CELL is 230.4 Gflops, 7 SPE CELL is 204.8 Gflops, and Xenon is 76.8 Gflops. For PS3 or Xbox 360, if all processor units except one are used for floating point and the remaining unit does no floating point work, then CELL is 179.2 Gflops and Xenon is 51.2 Gflops.

It is important to remember that these numbers are only "theoretical" capability, so real performance is different for many reasons, such as software and hardware efficiency and the true nature of use.
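
The "all units except one" figures can be reproduced with a short sketch, assuming the 25.6 Gflops per unit implied by the IBM numbers and counting the PS3's Cell as 7 SPEs plus one PPE (8 units) and Xenon as 3 cores. The 15.2 Gflops P4 figure is taken as given from the post and not derived here; the helper name is hypothetical.

```python
# Single-precision peaks when one processing unit does no floating point.
SP_PER_UNIT = 25.6  # Gflops per unit, implied by 230.4 / 9 in the IBM figures
P4_SP_GUESS = 15.2  # Gflops, the post's guess for Pentium 4 (taken as given)


def sp_peak(units, reserved=0):
    """Peak SP Gflops with `reserved` units kept off floating-point work."""
    return (units - reserved) * SP_PER_UNIT


cell_ps3_units = 7 + 1  # 7 SPEs plus the PPE
xenon_units = 3         # 3 PowerPC cores

cell_all = sp_peak(cell_ps3_units)
xenon_all = sp_peak(xenon_units)
cell_minus_one = sp_peak(cell_ps3_units, reserved=1)
xenon_minus_one = sp_peak(xenon_units, reserved=1)

if __name__ == "__main__":
    print(f"Cell (7 SPE), all units:  {cell_all:.1f}")
    print(f"Xenon, all units:         {xenon_all:.1f}")
    print(f"Cell (7 SPE), one idle:   {cell_minus_one:.1f}")
    print(f"Xenon, one idle:          {xenon_minus_one:.1f}")
    print(f"Cell / P4 ratio:          {cell_minus_one / P4_SP_GUESS:.1f}x")
    print(f"Xenon / P4 ratio:         {xenon_minus_one / P4_SP_GUESS:.1f}x")
```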
 
What's the double precision ability of current x86 cpus? I believe I've seen upwards of 10Gflops for x86 cpus, but is that single or double precision?
Well, vector ops (SSE) are all single precision (and sometimes less than that when getting into *effective* precision). Scalar ops on any IEEE-compliant FPU are inherently double precision. Even if you're using single precision data types, the operations will be done internally in doubles and rounded down when actually stored.

15.2 GFLOPS for P4 comes from SSE (which is single precision). But you can theoretically get half that out of scalar ops which are double.
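
A small sketch of the two ideas in this post: halving the 15.2 GFLOPS figure for scalar ops, and rounding a result computed in double precision down to single precision when it is stored. It does not model any particular x86 FPU; it only uses Python's `struct` module to show what rounding a double to a 32-bit float does to a value.

```python
import struct

SSE_SP_PEAK = 15.2                   # GFLOPS, figure from the post
scalar_dp_peak = SSE_SP_PEAK / 2     # the post's "half that" for scalar ops


def store_as_single(x):
    """Round a Python float (a double) to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Compute in double precision, then round only when "storing" to single.
computed = 0.1 + 0.2
stored = store_as_single(computed)
rounding_error = abs(stored - computed)

if __name__ == "__main__":
    print(f"scalar DP peak:  {scalar_dp_peak} GFLOPS")
    print(f"computed double: {computed!r}")
    print(f"stored single:   {stored!r}")
    print(f"rounding error:  {rounding_error:.3e}")
```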
 