Does Cell Have Any Other Advantages Over XCPU Other Than FLOPS?

I think Toshiba presented the demo as... a demo. I don't think they ever intend to use it in cosmetics or sell it, you'll probably never see it again. I think it's really just to show off the CELL. Which IMO was an impressive demo.
 
drpepper said:
I think Toshiba presented the demo as... a demo. I don't think they ever intend to use it in cosmetics or sell it, you'll probably never see it again. I think it's really just to show off the CELL. Which IMO was an impressive demo.

I know, but sometimes when certain products get a lot of praise the company notices a demand and changes their mind. I just hope the demand is high enough to change their mind.
 
mckmas8808 said:
Again we have to resort to saying something negative about the topic before acknowledging that better things can happen. Do you wear make-up scooby? I know my mom and my girlfriend would love an application like this at some malls that they go to.

Why literally put make-up on your face when the half a mirror can do it for you? Can you imagine how great it would be if a women's beauty salon had this built into a mirror at the shop? Why fight it? You and everyone here know it's a good idea. It would come down to the price that could hurt its acceptance.

You gotta have vision.

(I'm sure you're going to take this as more negativity, but oh well... :( )

Yes, but you continually link vision with this processor as if these applications ONLY come about because the Cell hath been born. Personally, I don't see any evidence that this makeup thing was "just not possible" before the Cell processor.

You need to think about why this processor was invented by STI. They wanted a processor that they OWNED, that was SCALABLE, and that they could use in EVERYTHING they make. They also wanted to be able to write software that was portable and would run on all versions of this scalable chip. It wasn't created for the SOLE purpose of making things possible that had never been possible before. I don't know why people feel the need to extol it as if that's the case.
 
nAo said:
We agree to disagree Dave but maybe we have read different threads cause if I start to call you retard I think you would call that an abuse, wouldn't you?

I've only ever told ihamoitc2005 to stop being a tard after he repeatedly made incorrect and unfair comparisons between a single SPE and the whole of the Xenos die. I've given up even reading ihamoitc2005's posts because he is incapable of reason.

The only other comment I've made that could be construed negatively is the flippant comment directed at version after he once again responded with a non sequitur containing significantly incorrect information. And as I said, reading version's posts is a waste of time, so I don't read them anymore either.

Aaron Spink
speaking for myself inc.
 
expletive said:
(I'm sure you're going to take this as more negativity, but oh well... :( )

Yes, but you continually link vision with this processor as if these applications ONLY come about because the Cell hath been born. Personally, I don't see any evidence that this makeup thing was "just not possible" before the Cell processor.

You need to think about why this processor was invented by STI. They wanted a processor that they OWNED, that was SCALABLE, and that they could use in EVERYTHING they make. They also wanted to be able to write software that was portable and would run on all versions of this scalable chip. It wasn't created for the SOLE purpose of making things possible that had never been possible before. I don't know why people feel the need to extol it as if that's the case.

That's the thing though expletive. They made the CELL chip for everything that you named and because they need a chip with a high FP number rating (which will allow them to do things that are not being done today). You are always giving 80% truth of CELL then you leave off the last 20% for some reason. Why?
 
mckmas8808 said:
That's the thing though expletive. They made the CELL chip for everything that you named and because they need a chip with a high FP number rating (which will allow them to do things that are not being done today). You are always giving 80% truth of CELL then you leave off the last 20% for some reason. Why?

I figured that since we've beaten the FP performance of the thing to death on this board, it didn't need to be repeated again.
 
I was just thinking: how well can the Cell handle tasks like face matching?

I.e., running a picture of a face through a database of the most wanted and matching it?

Would this be better suited to Cell than to an x86 chip? So much so that it could be cheaply implemented at airports and police stations?
 
Analysis tools for voice and image would benefit crime scene investigators most by being portable for on-location use. Medical functionality is already being considered for fourth generation mobile phones, like blood sample testing for diabetics and automatic linking with physician databases.
 
Another false statement by Spink

aaronspink said:
I've only ever told ihamoitc2005 to stop being a tard after he repeatedly made incorrect and unfair comparisons between a single SPE and the whole of the Xenos die. I've given up even reading ihamoitc2005's posts because he is incapable of reason.

The only other comment I've made that could be construed negatively is the flippant comment directed at version after he once again responded with a non sequitur containing significantly incorrect information. And as I said, reading version's posts is a waste of time, so I don't read them anymore either.

Aaron Spink
speaking for myself inc.

In fact after changing your mind on what comparison you would like to make (originally Xenos vs SPE, then Xenos vs entire CELL) I provided comparison of Xenos with entire CELL including all components including ones such as bus to PPE and "dead" 8th SPE yet even with extra "baggage", programmable floating point density advantage was with group of SPE as implemented in PS3 CELL.

With all 8 SPE active as with non-PS3 CELL implementation, CELL had very large advantage.

You have said many times I did not provide this information but my posts with this information are available in this thread and as you know repeated more than once.

You must stop making false statements to cover up your mistaken comparison and admit your mistake with honor. Every time you make false statements you are hurting your honor and insulting others who participate in this forum.
 
ihamoitc2005 said:
In fact after changing your mind on what comparison you would like to make (originally Xenos vs SPE, then Xenos vs entire CELL) I provided comparison of Xenos with entire CELL including all components including ones such as bus to PPE and "dead" 8th SPE yet even with extra "baggage", programmable floating point density advantage was with group of SPE as implemented in PS3 CELL.

With all 8 SPE active as with non-PS3 CELL implementation, CELL had very large advantage.

You have said many times I did not provide this information but my posts with this information are available in this thread and as you know repeated more than once.

You must stop making false statements to cover up your mistaken comparison and admit your mistake with honor. Every time you make false statements you are hurting your honor and insulting others who participate in this forum.

Since you obviously haven't been able to comprehend or understand the numbers, here they are for you...

Single SPE:
4 FMAC * 2 Flops/FMAC * 3.2 GHz = 25.6 GFLOPS
SPE die size 14.5-15.5 sq mm
GFLOPS/sq mm = 1.76 - 1.65

Xenos computation core:
5 FMACs * 2 Flops/FMAC * 48 ALUs * 0.5 GHz = 240 GFLOPS
Xenos die size: 180-220 sq mm
Xenos computational core: 90-132 sq mm
GFLOPS/sq mm = 2.66 - 1.81

And this isn't even taking into account the process differences between TSMC's 90 nm process and IBM's 90 nm SOI process, which would likely give the Xenos another 20-25% frequency boost at the same power levels. Or the redundant ALU slice. If their redundancy system had been more advanced, it is possible that we would have seen a Xenos with a peak performance of 320 GFLOPS.

The numbers aren't all that surprising. The Xenos architecture is much closer to a pure dataflow/stream architecture, relying on minimal control covering a vast amount of FP resources. In an SPE, the control logic has responsibility for 4 SP FMACs; in Xenos, the control logic has responsibility for 80 SP FMACs. This low control overhead does have some drawbacks, but for the types of workloads in use in a console (as well as the types of workloads Cell will be good at), the drawbacks should be fairly minimal.
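
For anyone who wants to check the arithmetic in this post, here is a minimal Python sketch. It only reproduces the peak-rate and per-area figures from the FMAC counts, clocks and die-area estimates quoted above; the function and variable names are illustrative assumptions, and peak GFLOPS says nothing about sustained performance.

```python
# Sketch only: reproduces the peak-GFLOPS and GFLOPS-per-area arithmetic
# using the figures quoted in the post. All names here are illustrative.

def peak_gflops(fmacs, clock_ghz, flops_per_fmac=2):
    """Peak rate, counting each FMAC as a multiply plus an add per cycle."""
    return fmacs * flops_per_fmac * clock_ghz


def density_range(peak, area_min_mm2, area_max_mm2):
    """Return (best, worst) GFLOPS per sq mm over a range of area estimates."""
    return peak / area_min_mm2, peak / area_max_mm2


spe_peak = peak_gflops(fmacs=4, clock_ghz=3.2)
xenos_peak = peak_gflops(fmacs=5 * 48, clock_ghz=0.5)

spe_best, spe_worst = density_range(spe_peak, 14.5, 15.5)
xenos_best, xenos_worst = density_range(xenos_peak, 90.0, 132.0)

print(f"SPE:   {spe_peak:.1f} GFLOPS, {spe_best:.2f} to {spe_worst:.2f} GFLOPS/sq mm")
print(f"Xenos: {xenos_peak:.1f} GFLOPS, {xenos_best:.2f} to {xenos_worst:.2f} GFLOPS/sq mm")
```

The printed densities are rounded to two decimals, so they can differ in the last digit from the figures in the post, which appear to be truncated.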

Aaron Spink
speaking for myself inc.
 
Which of the problems a game engine has to deal with are regular problems, exactly? I just see lots of either plain irregular ones (AI, etc.) or ones where you have to iteratively refine parts of the solution (collision detection and physics in general). The only thing I could see Xenos doing well is non-adaptive procedural generation of geometry (or rather, adaptive on a batch scale driven by the CPU). You can gloss over the irregularity of the problems with brute-force solutions, but you are going to be wasting floating point power.

About the percentage taken up by control logic in the SPE, wasn't storage a much bigger factor? I think I remember a quote from an IBM researcher to the effect that only a very small part of the core was taken up by control logic, but my memory might be deceiving me.
 
Pandora's Box

aaronspink said:
Since you obviously haven't been able to comprehend or understand the numbers, here they are for you...

Single SPE:
4 FMAC * 2 Flops/FMAC * 3.2 GHz = 25.6 GFLOPS
SPE die size 14.5-15.5 sq mm
GFLOPS/sq mm = 1.76 - 1.65

Xenos computation core:
5 FMACs * 2 Flops/FMAC * 48 ALUs * 0.5 GHz = 240 GFLOPS
Xenos die size: 180-220 sq mm
Xenos computational core: 90-132 sq mm
GFLOPS/sq mm = 2.66 - 1.81

Are you unaware that once again you compare apples with oranges or do you do this with intention? Also, what purpose is of comparing merely computational core when computational core is not what is Xenos which is large GPU unit. You said Xenos is superior for coprocessor than SPE yet now you extract specific component of Xenos, your "computational core" for such comparison.

Well, I can extract specific component as well.

Since you change your mind and now prefer "computational core" comparison (with no interest in if it is viable processor without supporting components), then we shall do a real one using "computational core" of SPE including corrections to your SPE die size and additional aspect of computational core figure for SPE (which you did not include) known as SPU which, as with your "computational core" for Xenos excludes supporting components.

Because I feel sorry for you due to you again embarrassing yourself with careless or desperate statements, I shall use your apples vs oranges number for Xenos gflops so you have some advantage in this meaningless comparison.

Single SPE computational core (SPU):
4 FMAC * 2 Flops/FMAC * 3.2 GHz = 25.6 GFLOPS
SPE die size: 14.53 sq mm (5.81mm x 2.51mm)
Computational Core: ~7 sq. mm
GFLOPS/sq mm = ~3.66

Xenos computation core:
5 FMACs * 2 Flops/FMAC * 48 ALUs * 0.5 GHz = 240 GFLOPS
Xenos die size: 180-220 sq mm
Xenos computational core: 90-132 sq mm
GFLOPS/sq mm = 2.66 - 1.81

So even with apples to oranges comparison of your choosing, Xenos "computational core" is not so effective as SPE or CELL "computational core." Even if I include LS (I already included LS control logic and channels), then the figure for "computational core" is still impressive 2.31 gflops/sq. mm, or approximately same as Xenos figure provided by you.
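
The per-area figures in this post can be checked the same way. The sketch below is only illustrative: it takes the poster's ~7 sq mm SPU estimate and the 2.31 GFLOPS/sq mm figure at face value, backs out the area that figure implies, and compares it with the Xenos range quoted earlier. The variable names are assumptions made for the example.

```python
# Sketch only: checks the SPU "computational core" arithmetic using the
# poster's own estimates. Names are illustrative.

SPE_PEAK_GFLOPS = 4 * 2 * 3.2      # 4 FMACs * 2 flops * 3.2 GHz
SPU_CORE_AREA_MM2 = 7.0            # poster's estimate, local store excluded
DENSITY_WITH_LS = 2.31             # poster's figure with local store included

# Density for the SPU core alone (the post gives ~3.66).
spu_core_density = SPE_PEAK_GFLOPS / SPU_CORE_AREA_MM2

# Area implied by the 2.31 figure; the post does not state this number.
implied_area_with_ls = SPE_PEAK_GFLOPS / DENSITY_WITH_LS

# Xenos computational core range from the quoted post (90-132 sq mm).
xenos_low, xenos_high = 240.0 / 132.0, 240.0 / 90.0
inside_xenos_range = xenos_low <= DENSITY_WITH_LS <= xenos_high

print(f"SPU core only:   {spu_core_density:.2f} GFLOPS/sq mm")
print(f"Implied area:    {implied_area_with_ls:.2f} sq mm with LS")
print(f"Within Xenos range ({xenos_low:.2f} to {xenos_high:.2f}): {inside_xenos_range}")
```

On these numbers the with-LS figure falls inside the quoted Xenos range, which matches the "approximately same" wording in the post.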

I hope you have learned your lesson that if you choose to make meaningless comparison for purposes of propaganda using meaningless criteria, then others can also do so using same meaningless criteria. To use poetic reference it is "Pandora's Box" you should not open.
 
ihamoitc2005 said:
I hope you have learned your lesson that if you choose to make meaningless comparison for purposes of propaganda using meaningless criteria, then others can also do so using same meaningless criteria. To use poetic reference it is "Pandora's Box" you should not open.

Aaron, Ihamoitc,

Come on guys, give it a rest. This "I'm going to show you" and "I hope you learned your lesson" attitude is getting nowhere.
 
You are right.

AlgebraicRing said:
Aaron, Ihamoitc,

Come on guys, give it a rest. This "I'm going to show you" and "I hope you learned your lesson" attitude is getting nowhere.

You are right my friend, this back and forth between Spink and I is not contributing to understanding and we are both now in the "realm" of meaningless comparisons for the purpose of showing large numbers sometimes even personal insults. Therefore I agree to "give it a rest".
 
ihamoitc2005 said:
Are you unaware that once again you compare apples with oranges or do you do this with intention? Also, what purpose is of comparing merely computational core when computational core is not what is Xenos which is large GPU unit. You said Xenos is superior for coprocessor than SPE yet now you extract specific component of Xenos, your "computational core" for such comparison.

The Xenos contains a GPU (the computational core) as well as a lot of other logic (bus interfaces, memory controllers, system interfaces, etc). The GPU portion of Xenos looks to be roughly 50-66% of the die area with other functionality consuming the rest of the die.


Because I feel sorry for you due to you again embarrassing yourself with careless or desperate statements, I shall use your apples vs oranges number for Xenos gflops so you have some advantage in this meaningless comparison.

I'm glad you feel sorry for me, I can at least say the feeling is shared.

Single SPE computational core (SPU):
4 FMAC * 2 Flops/FMAC * 3.2 GHz = 25.6 GFLOPS
SPE die size: 14.53 sq mm (5.81mm x 2.51mm)
Computational Core: ~7 sq. mm

The SPE IS the computational core. So, we'll keep it at the appropriate number of 14.5 sq mm.


I hope you have learned your lesson that if you choose to make meaningless comparison for purposes of propaganda using meaningless criteria, then others can also do so using same meaningless criteria. To use poetic reference it is "Pandora's Box" you should not open.

The only lesson I've learned is that forum posters like you lack the proper experience and understanding to make an engineering decision.

Aaron Spink
speaking for myself inc.
 
MfA said:
Which of the problems a game engine has to deal with are regular problems, exactly? I just see lots of either plain irregular ones (AI, etc.) or ones where you have to iteratively refine parts of the solution (collision detection and physics in general).

AI likely will always have to be done on the main core. The programming isn't such that something like a vector engine or limited core (SPE) will likely be able to handle it sufficiently. Physics and collision detection should both be fairly easy to do with vector engines; they have, after all, been done on vector engines since the 60s and in general have the highest efficiencies on vector engines.

The only thing I could see Xenos doing well is non-adaptive procedural generation of geometry (or rather, adaptive on a batch scale driven by the CPU). You can gloss over the irregularity of the problems by brute force solutions, but you are going to be wasting floating point power.

This is going to be true no matter what the solution is, though the brute-force solutions have significant area advantages due to the low control overhead.

About the percentage taken up by control logic in the SPE, wasn't storage a much bigger factor? I think I remember a quote from an IBM researcher to the effect that only a very small part of the core was taken up by control logic, but my memory might be deceiving me.

From looking at the die photos of the SPE, there is a significant amount of area that is not the SIMD FP unit. When you do moderate-length vectors, the other functions (loads/stores, branch, control, etc.) become a much smaller percentage of the real estate. I personally think that vectors/wide SIMD will make a comeback over the next 5-10 years due to their increased efficiency vs small SIMD designs. Vectors/wide SIMD coupled with streaming/dataflow engines are very well suited to a wide variety of game code related to both graphics and physics. In addition, programming models for vector/wide SIMD are well understood, with decades of research behind them.

Aaron Spink
speaking for myself inc.
 
I dunno if this has been debated before in this thread, but instead of comparing Xenos with a single SPE you guys should be comparing the Cell with the Xenon.
Both CPU articles are out now on IBM.com, and the two should be roughly compared, on paper of course.

We now know the Cell is able to do about 198GFlop/s in realtime while Xenon can only hit 100-110 GFlop/s at best (probably only 80-90GFlop/s in realtime).
What should developers do with all that extra computational power?
That's the real question that should asked and answered here.

Every techhead should know by now that the PS3 will be much more powerful than the Xbox 360.
And for every techhead that doesn't: the Cell has more raw power (more floating point calculations, more integer calculations, more hardware threads, more cores, more vector calculations, etc.), more bandwidth and more flexibility (code can be spread over more hardware threads and SPEs, which are more specialised at whatever task you give them) than the Xenon, and the RSX (if we think of it as a GTX 512 core) has more raw power (more floating point calculations, more integer calculations, more vertex calculations) and higher fillrates (13,2GTexel/s and 8,8GPixel/s).
 
The Xenos contains a GPU (the computational core) as well as a lot of other logic (bus interfaces, memory controllers, system interfaces, etc). The GPU portion of Xenos looks to be roughly 50-66% of the die area with other functionality consuming the rest of the die.

Why are you guys even arguing about this? A core's performance depends on its ability to get data from memory. FLOPS per sq mm is a meaningless measurement of a core if it can't be fed.

In any case Xenos and Cell are not designed to do the same thing so what's the point of comparing them anyway? Each one is going to do some things better than the other, end of story.
 
ADEX said:
Why are you guys even arguing about this? A core's performance depends on its ability to get data from memory. FLOPS per sq mm is a meaningless measurement of a core if it can't be fed.

That would be correct if you were doing nothing but floating point, but you're not. You're doing lots of other things, and the faster you get your floating point work over with, the more time you have for other things, and the better you can utilize your full bandwidth.

Anyway, lots of algorithms will be using floating point and then passing data on to the RSX, and external memory bandwidth will not even be touched, except probably for bringing in matrices.

I agree there will be cases where you are bandwidth starved, but not all the time.
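
The "can it be fed" point can be made concrete with a simple roofline-style bound, where attainable throughput is the smaller of peak compute and bandwidth times the work done per byte moved. The Python sketch below is only an illustration: the 25.6 GFLOPS peak is the single-SPE figure from earlier in the thread, while the bandwidth value and the flops-per-byte ratios are placeholder assumptions chosen for the example, not measurements of any console.

```python
# Sketch only: a roofline-style upper bound on throughput.
# PEAK_GFLOPS comes from the single-SPE figure quoted in the thread;
# BANDWIDTH_GB_S and the intensity values are hypothetical placeholders.

def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Upper bound: limited by compute or by how fast data arrives."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


PEAK_GFLOPS = 25.6       # single SPE peak, from the thread
BANDWIDTH_GB_S = 25.6    # assumed memory bandwidth, illustrative only

results = {}
for flops_per_byte in (0.25, 1.0, 4.0):
    bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GB_S, flops_per_byte)
    limit = "bandwidth" if bound < PEAK_GFLOPS else "compute"
    results[flops_per_byte] = (bound, limit)
    print(f"{flops_per_byte:5.2f} flops/byte -> {bound:5.1f} GFLOPS ({limit}-bound)")
```

Under these assumptions, code that does little arithmetic per byte is starved no matter how dense the FP hardware is, while code that reuses data already on chip runs at the compute limit. Both posts above are describing one side of that trade-off.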
 