Enhanced Cell B.E. for HPC at Cool Chips X

one

Unruly Member
Veteran
As announced a month ago, IBM gave a presentation yesterday at Cool Chips X on the SPE in the new 65nm Cell B.E. for HPC with DP enhancement. Tech-on has an article (registration required).

The Enhanced BE supports DDR2 (DDR2-800) up to 16GB. DP throughput increased from 25.6 Gflops to 102 Gflops, and DP latency is reduced from 13 cycles to 9 cycles, with full pipelining and dual issue. It supports denormals and NaNs as expected, to be more IEEE compliant. Its SPU ISA is v1.2, with 5 new DP instructions. The transistor count is 250 million (from 241 million for 90nm Cell), the chip area is 212 mm2 (from 235 mm2), and it consumes 100 watts (from 110 watts).
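
As a rough sanity check on the 102 Gflops figure, here is a minimal sketch of the usual peak-throughput arithmetic. The inputs are assumptions rather than numbers taken from the presentation: 8 SPEs, a 3.2 GHz clock, 2 DP lanes per SIMD instruction, and a fused multiply-add counted as 2 flops. The function name is just illustrative.

```python
# Rough peak double-precision (DP) throughput estimate.
# Assumptions, not from the presentation: 8 SPEs, a 3.2 GHz clock,
# 2 DP lanes per SIMD instruction, and a fused multiply-add counted as 2 flops.

def peak_dp_gflops(units, clock_ghz, lanes, flops_per_lane):
    """Peak Gflops if every unit issues one SIMD FMA per cycle."""
    return units * clock_ghz * lanes * flops_per_lane

enhanced_gflops = peak_dp_gflops(units=8, clock_ghz=3.2, lanes=2, flops_per_lane=2)
quoted_old_gflops = 25.6  # figure quoted above for the 90nm Cell
speedup = enhanced_gflops / quoted_old_gflops

print(f"Enhanced SPEs, peak DP: {enhanced_gflops:.1f} Gflops")
print(f"Ratio to the quoted 25.6 Gflops: {speedup:.1f}x")
```

Under those assumptions the SPEs alone account for the quoted peak, which only holds if the DP pipeline really can accept one instruction per cycle.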

Apparently someone in the audience asked whether the memory bandwidth of DDR2 might hobble actual application performance.

[Attached photos of the presentation slides: P1030838.jpg, P1030856.jpg, P1030852.jpg]
 

nutball

Veteran
Subscriber
Hmmm. How bizarre. I doubt IBM see Cell HPC as console-specific technology. Ah well, it's the mod's site, not mine.

Anyway, this looks rather interesting :)
 

3dilettante

Legend
Alpha
The transistor count is 250 million (from 241 million for 90nm Cell), the chip area is 212 mm2 (from 235 mm2), and it consumes 100 watts (from 110 watts).

The area of the 65nm chip is 212 mm2 compared to 235 mm2 for the 90nm version?
And it burns 100 watts from 110 watts?

Are you sure this is the 65nm version they're talking about?

(edit: it seems from the presentation pic that it is, but I still can't believe it)

That's terrible density and power scaling for a process transition.
 

deathkiller

Newcomer
The area of the 65nm chip is 212 mm2 compared to 235 mm2 for the 90nm version?
And it burns 100 watts from 110 watts?

Are you sure this is the 65nm version they're talking about?

(edit: it seems from the presentation pic that it is, but I still can't believe it)

That's terrible density and power scaling for a process transition.
I think that having 25+GB/s using DDR2 is not exactly cheap area wise for the memory controller in the HPC Cell.
 

Neb

Iron "BEAST" Man
Legend
Since you cannot really compare those Flop numbers to CELL Flops (or other general purpose CPUs) anyways, it's useless to even do so. :)

But then why did they include it in the comparison chart, unless it is a non-serious comparison? ;)
 

Jawed

Legend
4x more DP FLOPs is nothing to be sniffed at (though counting original Cell's PPE DP FLOPs seems pretty pointless - it's really a 5x gain in just the SPEs).

Dumping the XDR interface (presumably), having realised that the real world wants to attach lots of DDR, was also a good move: 16GB for the win. Pity they didn't aim for more bandwidth though.

Jawed
 

3dilettante

Legend
Alpha
I think that having 25+GB/s using DDR2 is not exactly cheap area wise for the memory controller in the HPC Cell.

Is there a die shot of the whole chip?

The number of transistors went up by 9 million to hit 250 million transistors. That's peanuts.

Ideal scaling would have cut die size roughly in half, though real scaling is always somewhat less than ideal.

A drop in die size of 10% is way below ideal.
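
To put numbers on that, here is a small sketch comparing ideal and reported scaling. It assumes "ideal" means area shrinking with the square of the feature size, which is a simplification, and it uses only the area and transistor figures quoted in this thread.

```python
# Ideal vs. reported scaling for the 90nm -> 65nm shrink.
# Assumption: "ideal" means area scales with the square of the feature size.

OLD_NM, NEW_NM = 90, 65
OLD_AREA_MM2, NEW_AREA_MM2 = 235, 212   # die areas quoted in this thread
OLD_MTRANS, NEW_MTRANS = 241, 250       # transistor counts, in millions

ideal_factor = (NEW_NM / OLD_NM) ** 2           # ideal area ratio
ideal_area_mm2 = OLD_AREA_MM2 * ideal_factor    # area of a perfect shrink
actual_factor = NEW_AREA_MM2 / OLD_AREA_MM2     # reported area ratio

old_density = OLD_MTRANS / OLD_AREA_MM2         # Mtransistors per mm2
new_density = NEW_MTRANS / NEW_AREA_MM2
density_gain = new_density / old_density

print(f"Ideal area ratio:    {ideal_factor:.2f} (about {ideal_area_mm2:.0f} mm2)")
print(f"Reported area ratio: {actual_factor:.2f}")
print(f"Density gain:        {density_gain:.2f}x vs. ideal {1 / ideal_factor:.2f}x")
```

The extra 9 million transistors barely move the result, so the gap between the ideal and reported ratios comes almost entirely from the area figure.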

The drop in power is 10 watts, while we've seen chips cutting power consumption by more than a third, sometimes by half.
DDR2 controllers have been estimated to consume single-digit watts.
AMD's dual-channel DDR2 controller is probably less than 10 watts, and two would likely be sufficient for 25 GB/sec.
There's no way the DDR2 controller is burning 40 watts to counteract what a good process transition would bring, especially considering that Cell already has a memory controller.
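
The "two would be sufficient" estimate can be checked with a quick peak-bandwidth calculation. This is a sketch under stated assumptions: DDR2-800 is taken as 800 million transfers per second, and each channel is assumed to be 64 bits wide, as on a standard DIMM. The helper name is made up for illustration.

```python
# Peak theoretical DDR2 bandwidth.
# Assumptions: DDR2-800 means 800 million transfers per second, and each
# channel is 64 bits (8 bytes) wide, as on a standard DIMM.

def ddr_peak_gb_per_s(mega_transfers_per_s, bus_bits, channels):
    """Peak bandwidth in GB/s (decimal) across all channels."""
    bytes_per_transfer = bus_bits // 8
    return mega_transfers_per_s * bytes_per_transfer * channels / 1000

one_dual_channel = ddr_peak_gb_per_s(800, 64, channels=2)   # one dual-channel controller
two_dual_channel = ddr_peak_gb_per_s(800, 64, channels=4)   # two such controllers

print(f"One dual-channel controller:  {one_dual_channel:.1f} GB/s")
print(f"Two dual-channel controllers: {two_dual_channel:.1f} GB/s")
```

These are peak numbers only; sustained bandwidth would be lower.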

That shows near zero scaling at a target frequency that was well within reach at 90nm.
 

Crossbar

Veteran
I agree that the shrink of the actual die size seems pretty small. Is there any information about changes in the cache size of the PPE or the LS of the SPEs? If there has been some increase and those transistors were excluded from the count, that could be an explanation.

The relatively high power draw could partly be explained by the fact that the new chip has a fourfold increase in DP performance, which means that it is exercising a few more transistors harder than its 90 nm counterpart.
 

3dilettante

Legend
Alpha
There's no good reason to fib on the transistor count.
The count for the 90nm version is the total count for Cell. What good is omitting equivalent structures in the 65nm version?
 

Carl B

Friends call me xbd
Legend
Thanks for the heads-up, One; nice to see it finally arrive. Cell's been doing well enough outside gaming that I'm sure this chip will find a ready audience - after all, it improves DP performance, and that was basically the primary weak spot relative to its other advantages. But I view the move to DDR2 with some puzzlement - I guess Rambus simply hasn't been trying to increase the memory FlexIO is able to address over the last several years? Obviously for an HPC-targeted product, more memory support is required, but you'd think the pin/packaging advantages offered by XDR would have seen further build-out down that road. I wonder if it's the Rambus DDR2 design that gets used here in Cell as well.

If this chip were on 90nm, I'd certainly be lauding the achievement, but the weak transition to 65nm just reminds me of all of IBM's fabbing issues of the past. To be fair, it seems Cell is a harder chip than most to shrink - or at least that was part of the premise for pursuing a separate on-chip supply for the SRAM for this latest generation. Of course, with the power/performance gains IBM was touting at ISSCC 07, I would have expected to see these power figures for 4GHz rather than 3.2. But power aside, that die size is still absurdly large.
 

Shifty Geezer

uber-Troll!
Moderator
Legend
This raises a serious concern for PS3's price drops. If 65nm saves so little on the area, the price drop won't be significant, and PS3 will remain expensive for Sony. Bad, bad news for them. Hell, we could have seen the 65nm transition in the EU PS3 without ever knowing, because the package ended up the same size!
 

Carl B

Friends call me xbd
Legend
Well, the package could be the same size anyway... we really don't know much without someone taking the heatspreader off. That said, I would imagine that the Cell chips being used by Sony on 65nm are of a different revision... unless the inclusion of these DP considerations is so negligible in its effect on die size that they feel there is a benefit from just maintaining a unified 65nm production front at this point. But... I imagine that Nagasaki won't be making these. And in that vein, I wonder if perhaps they've achieved something a little better in terms of die/wattage.

What I'm interested in is seeing the progress Toshiba/Sony have made in moving Cell to a bulk process for CE, though the speed requirements for PS3 would likely keep that version from going bulk anytime soon.

In fact, CE applications and revisions are something I would be more interested in learning about, period - supposedly Toshiba is moving ahead on that front.
 

Kryton

Regular
This raises a serious concern for PS3's price drops. If 65nm saves so little on the area, the price drop won't be significant, and PS3 will remain expensive for Sony. Bad, bad news for them. Hell, we could have seen the 65nm transition in the EU PS3 without ever knowing, because the package ended up the same size!

I guess you missed the 5 new instructions and radical DP performance increase in this version of the chip?
 