End of Cell for IBM

From a technical POV, I think Cell was very successful. What other processor has been able to offer that price/performance/power consumption in real-world tasks over the past few years? And what was on the table as a viable platform in 2000/2001 when Sony were looking for a next-gen performance CPU?
Well, they may have developed something else, but as I stated in another post, it may not have pleased all three parties in that association.
The real reason Cell died was that STI failed to get other parties on board. If they had cultivated a development community, the value of Cell would have reached the potential we were all talking about in the early days. However, they didn't encourage developers, which didn't cultivate the software know-how, which made Cell too difficult, so it was avoided. If they had actually got behind Cell with full software support, it'd be a different story now.

You can't just supply hardware and expect it to make it on its own. PS2 got lucky. Software and support are two-thirds of the experience.
Well, my opinion (formed from reading others' opinions) is that it wasn't wise to give up on memory coherency that early (I mean, while still dealing with a reasonable number of cores). With regard to scalability and the HPC realm, limiting FlexIO's scalability to two Cells may not have been the proper move; four may have been better, etc.
I'm sure something like the CPU I've described, even using a new ISA and with little support, would have done better in the HPC realm.
 
From a technical POV, I think Cell was very successful. What other processor has been able to offer that price/performance/power consumption in real-world tasks over the past few years? And what was on the table as a viable platform in 2000/2001 when Sony were looking for a next-gen performance CPU?

Damn Right.

The real reason Cell died was that STI failed to get other parties on board. If they had cultivated a development community, the value of Cell would have reached the potential we were all talking about in the early days. However, they didn't encourage developers, which didn't cultivate the software know-how, which made Cell too difficult, so it was avoided. If they had actually got behind Cell with full software support, it'd be a different story now.

If you look at the trajectories of GPUs vs Cell, it is pretty clear that GPUs have won as the compute accelerators because they had a programming model (i.e. GLSL/OpenGL 2.0), no matter how restrictive, before they started thinking of GPGPU, so they had some ideas to shoot for.

STI had squat, so when they made the 1+8 core monster, they just threw a pthread/MIMD equivalent (in its very raw form) at the devs. No wonder the devs don't like it. Also, the need for async DMA for all I/O makes it hard to program, while GPUs have a CPU-like load-store model to simplify programming.
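
To make that concrete, here's a minimal sketch of what the SPU side of a single transfer looks like (hedged: written from memory of the SDK's spu_mfcio.h interface, and the buffer size is arbitrary; the effective address arriving as main's second argument is the usual SDK convention). Every access to main memory turns into this kick-off/wait dance, where a GPU kernel would simply index into memory:

/* SPU-side sketch: pull a chunk of main memory into local store
 * with an explicit, asynchronous DMA, then wait on it. */
#include <spu_mfcio.h>

#define CHUNK 4096
#define TAG   0

/* DMA targets in local store want 128-byte alignment. */
static volatile char buf[CHUNK] __attribute__((aligned(128)));

int main(unsigned long long spe_id,
         unsigned long long argp,   /* effective address from the PPU */
         unsigned long long envp)
{
    (void)spe_id; (void)envp;

    /* Kick off the transfer (main memory -> local store). This returns
     * immediately; the MFC moves the data in the background. */
    mfc_get(buf, argp, CHUNK, TAG, 0, 0);

    /* A real kernel would compute on a second buffer here
     * (double buffering) instead of idling. */

    /* Block until all outstanding DMAs with our tag complete. */
    mfc_write_tag_mask(1 << TAG);
    mfc_read_tag_status_all();

    /* buf[] is now safe to read. */
    return 0;
}
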
You can't just supply hardware and expect it to make it on its own. PS2 got lucky. Software and support are two-thirds of the experience.
 
CELL was never competitive in any category of processing unit. HPC would be the closest fit, but I think Clearspeed's architecture offered better possibilities, not to mention other designs which could be targeted at that sector.

Agreed, Clearspeed owned Cell...
 
If you look at the trajectories of GPUs vs Cell, it is pretty clear that GPUs have won as the compute accelerators because they had a programming model (i.e. GLSL/OpenGL 2.0), no matter how restrictive, before they started thinking of GPGPU, so they had some ideas to shoot for.
I remember a post where Nao wondered about the width used for the SPUs in regard to their intended use.
It might have changed nothing in regard to the software, but it might have created more incentive to use the chip.
 
What do you mean by unused die space? Are you talking about the sides that don't have any pins? Those will likely disappear in the 32nm shrink.
The I/O areas aren't scaling down as nicely as the rest of Cell; I don't think this will be any less troublesome at 32nm (either more relative space wasted, or more work necessary for a redesign).

Unused FlexIO lanes? Are you sure about that? I think they are all being used.
FlexIO is divided into 8 cache-coherent and 4 noncoherent lanes. The coherent lanes are (fully) attached to RSX, but only 1 noncoherent lane is used for the "Southbridge" (giving it 2GB/s, which is more than enough). So 3 of the 12 lanes aren't used - not as significant as my memory made me believe, but still wasted space, especially as those interfaces dictate the width of the die in the 45nm version.
 
Am I the only one that finds it strange that there is no official IBM press release on this, in English?

Probably because the story is flat out wrong.

All that's been announced (by an IBM guy in an interview) is that the PowerXCell32i (a chip that was never formally announced anyway) has been cancelled. That somehow got translated into "IBM has cancelled Cell".

He also said Cell will continue in other forms, possibly in POWER and Blue Gene.

The original is here: http://www.heise.de/newsticker/meldung/SC09-IBM-laesst-Cell-Prozessor-auslaufen-864497.html
 
Probably because the story is flat out wrong.

Well I'd definitely appreciate any clarity we can shed on the matter; my take-away, even if Cell itself wasn't canceled, was that the conceptual evolutions would take place on other chips using the lessons of Cell.
 
CELL was never competitive in any category of processing unit. HPC would be the closest fit, but I think Clearspeed's architecture offered better possibilities, not to mention other designs which could be targeted at that sector.

I disagree with this outright - this forum has been rife with technical documentation over the last several years of areas in which Cell provided superior real-world performance to competing solutions. Signal processing, Monte Carlo, and all such similar algorithmic work saw incredible speed-ups on Cell on both an absolute and a per-watt/per-processor level.
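
To make the Monte Carlo point concrete, here is a hedged sketch of the sort of inner loop involved (plain portable C rather than actual SPU intrinsics, and the function names are mine): no pointer chasing, a working set that fits trivially in a 256KB local store, and arithmetic that vectorizes cleanly onto a 4-wide SPE pipe, with one independent instance per SPE.

/* Illustrative Monte Carlo kernel: estimate pi by sampling the unit
 * square. Embarrassingly parallel - give each SPE its own seed and
 * sample count, then sum the partial results on the PPU. */
#include <stdint.h>
#include <stdio.h>

/* Small xorshift PRNG; any fast per-worker generator will do. */
static inline uint32_t xorshift32(uint32_t *s)
{
    *s ^= *s << 13;
    *s ^= *s >> 17;
    *s ^= *s << 5;
    return *s;
}

static float monte_carlo_pi(uint32_t seed, int samples)
{
    uint32_t s = seed;
    int hits = 0;
    for (int i = 0; i < samples; i++) {
        /* Map 32-bit randoms onto [0,1). */
        float x = xorshift32(&s) * (1.0f / 4294967296.0f);
        float y = xorshift32(&s) * (1.0f / 4294967296.0f);
        hits += (x * x + y * y <= 1.0f);  /* branch-free accumulate */
    }
    return 4.0f * (float)hits / (float)samples;
}

int main(void)
{
    /* One seed per worker in a real multi-SPE run. */
    printf("pi ~= %f\n", monte_carlo_pi(0x2545F491u, 1 << 24));
    return 0;
}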

And of course, nearly *all* code could see improvement when ported over. The problem has always been the amount of work needed on the human side of things to tap into that power, but on both the theoretical and the achievable FLOPS metrics, Cell has always ranked extremely high.

GPGPU is in that space full bore now, and was entering it when Cell launched. The economies of scale and rapid evolution timelines always made it the obvious and present 'threat' to Cell's target market. Clearspeed though... we've had that conversation like a dozen times - Clearspeed is more limited than simple FLOPS and watts would imply.

I'm happy to discuss Cell's slow-motion obsolescence in the face of rapid evolutions in other architectures, but Clearspeed is a non-factor in Cell's would-be demise.
 
Who is in charge of die shrinks :?:

It was just an automated 'dumb shrink' situation in both cases, I believe, with some SRAM power improvements along the way, and I would imagine IBM was mostly in charge. It was always them releasing the documentation, at least!

For 32nm and the process shift I think that they would have gone with a total process optimization for the chip though - it simply would have been the logical time to do so.
 
I remember reading that they optimise the shrink for power consumption, not size. Cell could be smaller; it's around 250M transistors. The PS3 only uses 7 SPUs, so when they restructure it for 32nm they should be able to remove one and make the die smaller - there is no need to keep disabling one SPU for yield.

That's not the 'dead' silicon I was talking about though. As the chip has shrunk, there has actually been a proliferation of truly unused silicon - not unused transistors, simply space with no transistors at all! That's due to the reduced scaling of the I/O vs the rest of the chip. If you look through the RWT article that was linked on the subject, you should find a picture illustrating this.

Here's an article I put up on the subject back in the day that should summarize the changes:

http://www.beyond3d.com/content/news/582
 
the IO areas arent scaling down as nicely as the rest of Cell, I dont think this will be less troublesome on 32nm (either more relative space wasted or more work necessary for redesign).
Do you think they need to scale down the I/O area at all? I mean, if they wrap it around the chip there should be plenty of space.

FlexIO is divided into 8 cache-coherent and 4 noncoherent lanes. the coherent is (fully) attached to RSX, but only 1 noncoherent lane is used for the "Southbridge" (giving it 2GB/s which is more than enough). So 3/12 lanes arent used, not as significant as my memory made me believe but still wasted space, especially as those interfaces dictate the width of the die in the 45nm version.
Yeah, it seems you are right; I thought the Southbridge used up more lanes.
 
There is hope for a Cell 2 CPU then - well, I guess there always was hope. It's just a question of Sony asking IBM to make another one?
 
Do you think they need to scale down the IO area at all? I mean if they wrap it around the chip it should be plenty of space.
That's what the article at RWT said. They kept the same floorplan, so they could not get the ideal scaling.

I/Os are known for not scaling with process geometry.
 
That's not the 'dead' silicon I was talking about though. As the chip has shrunk, there has actually been a proliferation of truly unused silicon - not unused transistors, simply space with no transistors at all! That's due to the reduced scaling of the I/O vs the rest of the chip. If you look through the RWT article that was linked on the subject, you should find a picture illustrating this.

Here's an article I put up on the subject back in the day that should summarize the changes:

http://www.beyond3d.com/content/news/582

If I remember correctly, the engineer was quizzed about the dead silicon area (or why Cell hadn't shrunk like the EE did), and that's the answer they gave: they optimized for power and were not worried about the dead area at that point.

Perhaps because the yield was already good, and dealing with the dead area wouldn't get them many more chips per wafer.
 
Probably because the story is flat out wrong.

All that's been announced (by an IBM guy in an interview) is that the PowerXCell32i (a chip that was never formally announced anyway) has been cancelled. That somehow got translated into "IBM has cancelled Cell".

He also said Cell will continue in other forms, possibly in POWER and Blue Gene.

The original is here: http://www.heise.de/newsticker/meldung/SC09-IBM-laesst-Cell-Prozessor-auslaufen-864497.html

The planned Cell32i would be underpowered compared to the likes of GPUs and Larrabee. They'd need a Cell64 or Cell128 to compete with those.
 
If I remember correctly, the engineer was quizzed about the dead silicon area (or why Cell hadn't shrunk like the EE did), and that's the answer they gave: they optimized for power and were not worried about the dead area at that point.

Perhaps because the yield was already good, and dealing with the dead area wouldn't get them many more chips per wafer.

Redesigning the layout of the chip for silicon efficiencies would have cost quite a bit. After the initial 90nm design work, it has all been about low-hanging fruit improvements. I agree though that chips per wafer wouldn't have gone up much, even if it had been addressed.

The planned Cell32i would be underpowered compared to the likes of GPUs and Larrabee. They'd need a Cell64 or Cell128 to compete with those.

I don't know that Cell32i would have been 'under'powered vs those solutions per se; if we're talking HPC, it's all about dollars, FLOPS, and watts. On 32nm, with a complete redesign and a metal gate process, let's say that Cell32i comes in at ~150mm^2. With its ability to scale near-linearly and the 'glueless' feature for their blades, I wouldn't be surprised if it was still very competitive on a cost/watts basis, even if on a chip-vs-chip basis it would obviously fall short. The much smaller die size though would be a big advantage in yields, presumably power/cooling/thermals, and supposedly costs. In addition it should be more than sufficient CPU power for the PS4, with much better launch metrics associated.
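
As a rough back-of-envelope (hedged: assuming the 3.2GHz clock and the PowerXCell 8i's 4 double-precision FLOPs per SPE per cycle carry straight over to a 32-SPE part):

8 SPEs x 4 DP FLOPs/cycle x 3.2GHz = 102.4 GFLOPS (PowerXCell 8i)
32 SPEs x 4 DP FLOPs/cycle x 3.2GHz = 409.6 GFLOPS (a hypothetical Cell32i)

Chip-for-chip that would still trail the peak DP numbers of the contemporary GPU accelerators, but not by a huge margin, and from a much smaller die.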

Not that any of the above deals with Cell as a GPU inside the console.
 
The much smaller die size though would be a big advantage in yields, presumably power/cooling/thermals, and supposedly costs. In addition it should be more than sufficient CPU power for the PS4, with much better launch metrics associated.
Indeed. Xenon was good enough for the XB360 despite being much weaker on paper. Cell 2 would offer ample performance if coupled with a well-balanced overall system, and offer developers the benefit of a familiar environment. If Sony instead go with a GPU, all those developers will then have to learn CUDA or suchlike to use it for general processing. And we don't know what exactly Larrabee will offer in terms of tools and developer environment, but it will be something of a blank slate, just as Cell was at launch. So a Cell PS4 would hit the ground running, which is a very good thing. I imagine developers are fed up with being thrown eclectic hardware each generation!
 