Enhanced Cell B.E. for HPC at Cool Chips X

Discussion in 'CellPerformance@B3D' started by one, Apr 19, 2007.

  1. one

    one Unruly Member
    Veteran

    Joined:
    Jul 26, 2004
    Messages:
    4,823
    Likes Received:
    153
    Location:
    Minato-ku, Tokyo
    As announced a month ago, IBM gave a presentation yesterday at Cool Chips X about the SPE in the new 65nm Cell B.E. for HPC with DP enhancement. Tech-on has an article (registration required).

    The Enhanced BE supports up to 16GB of DDR2 (DDR2-800). Peak DP performance increases from 25.6 Gflops to 102 Gflops, and DP latency is reduced from 13 cycles to 9 cycles, with full pipelining and dual issue. It handles denormals and NaNs as expected, to be more IEEE compliant. Its SPU ISA is v1.2, with 5 new DP instructions. The transistor count is 250 million (up from 241 million for the 90nm Cell), the chip area is 212 mm² (down from 235 mm²), and it consumes 100 watts (down from 110 watts).
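
    As a rough sanity check on these figures, the quoted 102 Gflops peak is consistent with eight SPEs each completing a 2-wide DP fused multiply-add (4 flops) per cycle at 3.2 GHz. The SPE count, the clock, and the flops-per-cycle figure are assumptions here, not taken from the presentation; the other numbers are the ones quoted above. A minimal Python sketch:

```python
# Back-of-the-envelope check of the quoted figures.
# Assumptions (not from the presentation): 8 SPEs, a 3.2 GHz clock, and
# 4 DP flops per SPE per cycle (2-wide SIMD fused multiply-add, fully pipelined).

def peak_dp_gflops(spes=8, clock_ghz=3.2, flops_per_cycle=4):
    """Peak double-precision throughput in Gflops for the SPEs alone."""
    return spes * clock_ghz * flops_per_cycle

# Figures quoted in the post, as (90nm Cell, 65nm Enhanced Cell) pairs.
QUOTED = {
    "dp_gflops": (25.6, 102.0),
    "transistors_millions": (241.0, 250.0),
    "area_mm2": (235.0, 212.0),
    "power_watts": (110.0, 100.0),
}

def generation_ratios(figures=QUOTED):
    """New/old ratio for each quoted figure."""
    return {name: new / old for name, (old, new) in figures.items()}

if __name__ == "__main__":
    print(f"assumed SPE peak: {peak_dp_gflops():.1f} Gflops")
    for name, r in generation_ratios().items():
        print(f"{name}: x{r:.2f}")
```

    On these numbers the DP peak rises about 4x, while area and power each fall by roughly 10%.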

    Apparently someone in the audience asked whether DDR2's memory bandwidth may hobble actual application performance.

     
  2. nutball

    Veteran Subscriber

    Joined:
    Jan 10, 2003
    Messages:
    2,115
    Likes Received:
    436
    Location:
    en.gb.uk
    Why is this in Console technology?
     
  3. one

    one Unruly Member
    Veteran

    Joined:
    Jul 26, 2004
    Messages:
    4,823
    Likes Received:
    153
    Location:
    Minato-ku, Tokyo
    According to a mod:
    http://forum.beyond3d.com/showpost.php?p=959222&postcount=18
    To add my view, it's not at all unrelated to the 65nm Cell for games and future SPE ISA.
     
  4. nutball

    Veteran Subscriber

    Joined:
    Jan 10, 2003
    Messages:
    2,115
    Likes Received:
    436
    Location:
    en.gb.uk
    Hmmm. How bizarre. I doubt IBM see Cell HPC as console-specific technology. Ah well, it's the mod's site, not mine.

    Anyway, this looks rather interesting :)
     
  5. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    7,583
    Likes Received:
    703
    Location:
    Guess...
    Loving the way they claim the GTX peaks at only 350 GFLOPs and ignore the R580 vertex shaders.
     
  6. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,069
    Likes Received:
    2,739
    Location:
    Well within 3d
    The area of the 65nm chip is 212 mm2 compared to 235 mm2 for the 90nm version?
    And it burns 100 watts from 110 watts?

    Are you sure this is the 65nm version they're talking about?

    (edit: it seems from the presentation pic that it is, but I still can't believe it)

    That's terrible density and power scaling for a process transition.
     
    #6 3dilettante, Apr 19, 2007
    Last edited by a moderator: Apr 19, 2007
  7. deathkiller

    Newcomer

    Joined:
    Jul 24, 2005
    Messages:
    186
    Likes Received:
    4
    I think that having 25+ GB/s using DDR2 is not exactly cheap area-wise for the memory controller in the HPC Cell.
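
    To put a number on that: a 64-bit DDR2-800 channel moves 800 MT/s x 8 bytes = 6.4 GB/s, so matching the 25.6 GB/s of the original Cell's XDR interface would take four channels. The 25.6 GB/s target and the 64-bit channel width are assumptions here; IBM's actual controller layout is not given in this thread. A quick sketch in Python:

```python
# Rough DDR2 channel count for a given bandwidth target.
# Assumptions: 64-bit (8-byte) channels and a 25.6 GB/s target, the peak
# bandwidth of the original Cell's XDR interface.

def channel_bandwidth_mb_s(transfers_mt_s=800, bus_bytes=8):
    """Peak bandwidth of one DDR channel in MB/s (DDR2-800 by default)."""
    return transfers_mt_s * bus_bytes

def channels_needed(target_mb_s=25_600, per_channel_mb_s=None):
    """Smallest whole number of channels that meets the target."""
    if per_channel_mb_s is None:
        per_channel_mb_s = channel_bandwidth_mb_s()
    return -(-target_mb_s // per_channel_mb_s)  # ceiling division on integers

if __name__ == "__main__":
    print(channel_bandwidth_mb_s(), "MB/s per channel")
    print(channels_needed(), "channels for 25.6 GB/s")
```

    Four 64-bit channels imply 256 data signals before address, control, and strobes are counted, which is one plausible source of the area cost.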
     
  8. Jesus2006

    Regular

    Joined:
    Jul 14, 2006
    Messages:
    506
    Likes Received:
    10
    Location:
    Bavaria
    Since you cannot really compare those Flop numbers to CELL Flops (or other general purpose CPUs) anyways, it's useless to even do so :)
     
  9. Neb

    Neb Iron "BEAST" Man
    Legend

    Joined:
    Mar 16, 2007
    Messages:
    8,391
    Likes Received:
    3
    Location:
    NGC2264
    But then why did they include it in the comparison chart, unless it is a non-serious comparison? :wink4:
     
  10. V3

    V3
    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    3,304
    Likes Received:
    5
    It's for HPC, those GPUs are competition.

    IBM needs to do better IMO. That's a really poor process shift.
     
  11. Neb

    Neb Iron "BEAST" Man
    Legend

    Joined:
    Mar 16, 2007
    Messages:
    8,391
    Likes Received:
    3
    Location:
    NGC2264
    Exactly, GPGPU project.
     
  12. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    4x more DP FLOPs is nothing to be sniffed at (though counting original Cell's PPE DP FLOPs seems pretty pointless - it's really a 5x gain in just the SPEs).

    Dumping the XDR interface (presumably), on realising that the real world wants to attach lots of DDR, was also a good move. 16GB for the win. Pity they didn't aim for more bandwidth though.

    Jawed
     
  13. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,069
    Likes Received:
    2,739
    Location:
    Well within 3d
    Is there a die shot of the whole chip?

    The number of transistors went up by 9 million to hit 250 million transistors. That's peanuts.

    Ideal scaling should have cut die size by 1/2, though any real shrink would fall short of ideal.

    A drop in die size of 10% is way below ideal.

    The drop in power is 10 watts, while we've seen chips cutting power consumption by more than a third, sometimes by half.
    DDR2 controllers have been estimated to consume watts in the single-digits.
    AMD's dual-channel DDR2 controller is probably less than 10 watts, and two would likely be sufficient for 25 GB/sec.
    There's no way the DDR2 controller is burning 40 watts to counteract what a good process transition would bring, especially not considering that Cell already has a memory controller.

    That shows near zero scaling at a target frequency that was well within reach at 90nm.
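
    The scaling argument can be checked with a little arithmetic. Under ideal geometric scaling, area shrinks with the square of the feature-size ratio, (65/90)² ≈ 0.52. The Python sketch below applies that to the figures quoted in this thread; treating the node names as literal linear dimensions is a simplifying assumption, and real shrinks fall short of it.

```python
# Ideal vs. quoted scaling for the 90nm -> 65nm Cell shrink.
# Assumption: node names (90, 65) are treated as literal linear feature sizes.

def ideal_area_factor(old_nm=90.0, new_nm=65.0):
    """Area multiplier under perfect geometric scaling."""
    return (new_nm / old_nm) ** 2

def density(transistors_millions, area_mm2):
    """Transistor density in millions per mm^2."""
    return transistors_millions / area_mm2

if __name__ == "__main__":
    old_area, new_area = 235.0, 212.0      # mm^2, as quoted in the thread
    old_count, new_count = 241.0, 250.0    # millions of transistors, as quoted

    print(f"ideal 65nm area: {old_area * ideal_area_factor():.1f} mm^2")
    print(f"quoted area ratio: {new_area / old_area:.2f}")
    gain = density(new_count, new_area) / density(old_count, old_area)
    print(f"density gain: x{gain:.2f} (ideal x{1 / ideal_area_factor():.2f})")
```

    An ideal shrink would put the die near 123 mm², against the quoted 212 mm²; density improves about 1.15x where ideal scaling would give about 1.9x.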
     
  14. Crossbar

    Veteran

    Joined:
    Feb 8, 2006
    Messages:
    1,821
    Likes Received:
    12
    I agree that the shrink of the actual die size seems pretty low. Is there any information about changes in the cache size of the PPE or the LS of the SPEs? If there has been some increase and those transistors were excluded from the count, it could be an explanation.

    The relatively high power draw could partly be explained by the fact that the new chip has a fourfold increase in DP performance, which means it is exercising a few more transistors harder than its 90nm counterpart.
     
  15. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,069
    Likes Received:
    2,739
    Location:
    Well within 3d
    There's no good reason to fib on the transistor count.
    The count for the 90nm version is the total count for Cell. What good is omitting equivalent structures in the 65nm version?
     
  16. Carl B

    Carl B Friends call me xbd
    Moderator Legend

    Joined:
    Feb 20, 2005
    Messages:
    6,266
    Likes Received:
    63
    Thanks for the heads up One, nice to see it finally arrive. Cell's been doing well enough outside gaming that I'm sure this chip will find a ready audience - after all it improves DP performance, and that was basically the primary weak spot relative to its other advantages. But I view the move to DDR2 with some puzzlement - I guess Rambus simply hasn't been trying to increase the memory FlexIO is able to address over the last several years? Obviously for an HPC-targeted product, more memory support is required, but you'd think the pin/packaging advantages offered by XDR would have seen further build-out down that road. I wonder if it's the Rambus DDR2 design that gets used here in Cell as well.

    If this chip were on 90nm, I'd certainly be lauding the achievement, but the weak transition to 65nm just reminds me of all of IBM's fabbing issues of the past. To be fair, it seems Cell is a harder chip than most to shrink - or at least that was part of the premise for pursuing a separate on-chip supply for the SRAM for this latest generation. Of course, with the power/performance gains IBM was touting at ISSCC 07, I would have expected to see these power figures for 4GHz rather than 3.2GHz. But power aside, that die size is still absurdly large.
     
  17. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    40,346
    Likes Received:
    10,681
    Location:
    Under my bridge
    This raises a serious concern for PS3's price drops. If 65nm saves so little on the area, the price drop won't be significant, and PS3 will remain expensive for Sony. Bad, bad news for them. Hell, we could have seen the 65nm transition in the EU PS3 without ever knowing, because the package ended up the same size!
     
  18. Frank

    Frank Certified not a majority
    Veteran

    Joined:
    Sep 21, 2003
    Messages:
    3,187
    Likes Received:
    59
    Location:
    Sittard, the Netherlands
    How much has the pin count increased? Pins consume lots of area.
     
  19. Carl B

    Carl B Friends call me xbd
    Moderator Legend

    Joined:
    Feb 20, 2005
    Messages:
    6,266
    Likes Received:
    63
    Well the package could be the same size anyway... we really don't know much without someone taking the heatspreader off. That said, I would imagine that the Cell chips being used by Sony on 65nm are of a different revision... unless the inclusion of these DP considerations is so negligible in its effect on die size that they feel there is a benefit from just maintaining a unified 65nm production front at this point. But... I imagine that Nagasaki won't be making these. And in that vein, I wonder if perhaps they've achieved something a little better in terms of die/wattage.

    What I'm interested in is in seeing the progress Toshiba/Sony has made in moving Cell to a bulk process for CE, though the speed requirements for PS3 would likely keep that version from going bulk anytime soon.

    In fact, CE applications and revisions are something I would be more interested in learning about, period - supposedly Toshiba is moving ahead on that front.
     
  20. Kryton

    Regular

    Joined:
    Oct 26, 2005
    Messages:
    273
    Likes Received:
    8
    I guess you missed the 5 new instructions and radical DP performance increase in this version of the chip?
     