Cell

Discussion in 'Architecture and Products' started by bbot, Aug 6, 2002.

  1. PC-Engine

    Banned

    Joined:
    Feb 7, 2002
    Messages:
    6,799
    Likes Received:
    12
    Today's highest performing CPU core can do 8 GFLOPS at 333 MHz. If clocked at 1 GHz, you get 24 GFLOPS per core, so 64 cores would get you 1.5 TFLOPS.
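The scaling claim above can be sketched in a few lines (a back-of-envelope check that assumes FLOPS scale linearly with clock, which is an idealization):

```python
# Back-of-envelope check of the per-core scaling claim.
base_gflops = 8.0     # per core at 333 MHz
base_mhz = 333.0
target_mhz = 1000.0
cores = 64

per_core = base_gflops * (target_mhz / base_mhz)  # ~24 GFLOPS per core
total_tflops = per_core * cores / 1000.0          # ~1.54 TFLOPS for 64 cores
```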
     
  2. MfA

    MfA
    Legend

    Joined:
    Feb 6, 2002
    Messages:
    7,610
    Likes Received:
    825
    That's gonna be one huge die, though.
     
  3. psurge

    Regular

    Joined:
    Feb 6, 2002
    Messages:
    955
    Likes Received:
    52
    Location:
    LA, California
    I think it's interesting that a paper coming from IBM talking about the CELL chips of 2011 uses a performance figure which is a third of the rumored 1 TFLOP for the very first CELL chip. Not to mention it's talking about 16 cores on a die, not 64.

    Serge
     
  4. Vince

    Veteran

    Joined:
    Apr 9, 2002
    Messages:
    2,158
    Likes Received:
    7
    Yeah, even stripping down the MIPS64-20Kc core and clocking it at 1 GHz yields only 4 GFLOPS for ~3M transistors. They're going to have to resort to a more vector/scientific processor approach, which is why it irks me when people say 'general computing' - the line between 'dedicated computing' (VS) and 'general computing' (VU) is narrowing quickly.

    Their ability to hit 1 TFLOP will be totally dependent upon the process size they can use. It's difficult to imagine that performance on .10um just due to transistor limitations. Sub-.10um looks to be a necessity, but just how far south is a good question.
     
  5. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,928
    Likes Received:
    230
    Location:
    Seattle, WA
    A better question might be how far south of .10 micron can we possibly go? We're definitely getting close to the limits of silicon transistor technology.

    For example, it's fairly unrealistic to believe that chips will ever reach 100 GHz, and I doubt silicon processes will get below 0.01 microns (I'd say these are very conservative estimates, too... we'll probably not even get that far).

    Hopefully quantum technologies will become viable before we hit these limits.
     
  6. V3

    V3
    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    3,304
    Likes Received:
    5
    What FLOPS they will achieve depends largely on what kind of processor they are going to put into that cellular arrangement. If the processor is more general-purpose, then it will have less FLOPS performance compared to a special-purpose one like, say, a Vector Unit.

    Also, I thought that the 64 number referred to 64 chips in a 20x20cm package, not 64 cores on a single die.

    The chip they are projecting for 2008 seems to be around 400mm2; that's one huge chip.
     
  7. psurge

    Regular

    Joined:
    Feb 6, 2002
    Messages:
    955
    Likes Received:
    52
    Location:
    LA, California
    V3, that's fair. As far as the article goes, it refers to 64 chips with 16 cores each. I'm definitely not knowledgeable enough to say "1 TFLOP on a chip in 2005, without a doubt that's BS". I do think it sounds very fishy, though...
     
  8. megadrive0088

    Regular

    Joined:
    Jul 23, 2002
    Messages:
    700
    Likes Received:
    0
    As far as what will be used for a rasterizer in PS3... Well, my understanding was that Sony would have EE3 (500M transistors) and Graphics Synthesizer 3 (unknown transistor count, but likely astronomical) chips by 2005 or 2006. These chips would form the basis of the PS3.

    This information, although 2-3 years old now, was from Sony/Ken K.

    The EE3, according to Sony, would be a "radically new architecture". That sounds like CELL to me. Whereas the EE2 of 2002, which is intended for workstations, part of Phase II of Sony's long-term plans, and unreleased AFAIK, would be an enhanced version of the EE1 architecture.

    The rasterizer in PS3, I would think, is going to be fairly traditional. I do think Sony will opt for a GS3, as they have said in the past. At least the rasterizing portion of GS3 will be a traditional rasterizer - meaning that I do not think CELL will be used as a rasterizer, or that a second CELL will do the rasterizing. Either the main CELL CPU in PS3 will do the transform and lighting for a GS3 (as the EE's Vector Units do for the GS in PS2), or a second CELL will be bolted onto a GS3 rasterizer to act as the T&L/VU/vertex shader portion of GS3.

    The rasterizing portion of GS3, or the whole GS3, will most likely be a massively parallel version of GS2 with more features and image-processing enhancements (AA, texture compression, etc.).
     
  9. PC-Engine

    Banned

    Joined:
    Feb 7, 2002
    Messages:
    6,799
    Likes Received:
    12
    What kind of heat will a 16+ core chip generate?
     
  10. phynicle

    Newcomer

    Joined:
    Feb 9, 2002
    Messages:
    127
    Likes Received:
    0

    didn't IBM make a chip for communications that could do 200 GHz?
     
  11. phynicle

    Newcomer

    Joined:
    Feb 9, 2002
    Messages:
    127
    Likes Received:
    0
    BTW, about the Blue Gene project:
    when is IBM going to deliver the machine for the medical research on gene functions? Don't they have a deadline?
     
  12. Gubbi

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,661
    Likes Received:
    1,114
    For an IEEE-compliant FPU, count about half a million transistors.

    Let's be generous and say that each FPU can do a MADD (two FLOPs) each cycle, and that this chip runs at 1 GHz.

    Then we get 5*10^5 transistors/FPU * 1500 GFLOPS / (2 GFLOPS/FPU) = 375*10^6 transistors. Not bloody likely, is it?
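The arithmetic above, spelled out using only the numbers quoted in the post:

```python
# Re-running the transistor estimate.
transistors_per_fpu = 5e5   # ~0.5M transistors per IEEE-compliant FPU
target_gflops = 1500.0      # 1.5 TFLOPS goal
gflops_per_fpu = 2.0        # one MADD = 2 FLOPs per cycle at 1 GHz

fpus_needed = target_gflops / gflops_per_fpu      # 750 FPUs
transistors = fpus_needed * transistors_per_fpu   # 3.75e8 = 375M transistors
```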

    Either SONY/IBM will be nowhere near 1TFLOPS or CELL will be full of special purpose logic (and not general purpose).

    Cheers
    Gubbi
     
  13. ushac

    Newcomer

    Joined:
    Feb 6, 2002
    Messages:
    50
    Likes Received:
    0
    Location:
    Linköping, Sweden
    Gubbi,

    is it really that unlikely to achieve 375M transistors by 2005? I mean, the R300 already has 117M at .15u. Assuming an equal transistors/area ratio, you could fit 375M into the same area as an R300 by shrinking the die to ~.08u. Intel and others are already working on .09u and .065u processes... And with the kind of parallelism we'll see in Cell, won't there be a bunch of opportunities to exploit synergetic effects? Anyway, all these estimates have been kind of generous. If they DO achieve 1 TFLOPS, I'll be mighty impressed.
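The shrink estimate above can be checked with ideal area scaling (a rough sketch: transistor density goes as the inverse square of the feature size, and real shrinks rarely scale this perfectly):

```python
# Ideal density scaling: transistors per unit area ~ 1 / feature_size^2.
r300_transistors = 117e6
r300_process_um = 0.15
target_process_um = 0.08

density_gain = (r300_process_um / target_process_um) ** 2    # ~3.5x denser
same_area_transistors = r300_transistors * density_gain      # ~411M in an R300-sized die
```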


    On an OT note, I thought this was kinda cool:
    http://eet.com/at/news/OEG20020806S0030


    Regards / ushac
     
  14. Gubbi

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,661
    Likes Received:
    1,114
    Well, you have to consider that we're talking about 375M logic transistors. And that's just for the FPUs; then you need the apparatus to issue instructions to them, to move data to/from them, etc. So we're talking close to 500M.

    Also, SRAM arrays have much higher transistor density than logic (2-4 times). I'd love to know how many transistors of the R300 are in caches, FIFOs and other SRAM arrays. An uninformed guess is 50M (around 1 MByte in total).

    Cheers
    Gubbi
     
  15. eSa

    eSa
    Newcomer

    Joined:
    Jun 24, 2002
    Messages:
    121
    Likes Received:
    1
    With the forthcoming 0.09 micron tech, you can build a chip with, say, 350 million transistors. What the clock rate will be is another issue entirely :wink:

    Also, this whole PS3/Cell hype is a bit pointless. There is no reason why you couldn't build a massively parallel system. The real question is how you are going to fit a linear/general program flow to run on it. There has been a lot of research on parallel algorithms, and they work well IN SPECIAL CASES. It is a completely different thing to get a general game engine with physics/AI etc. to run even half decently on a parallel architecture. No middleware, no compiler is going to offer any foolproof solution to this. Sony's PS2 is a good example of what happens when you give developers a parallel system to program. You have a lot of hidden power that is very difficult to dig up and use.

    Besides, the computing power of a parallel system doesn't grow in linear fashion... With 2 processors/cores you have, say, a ~1.8x speedup, with four ~3.24x, etc.
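The quoted speedups (1.8x at 2 cores, 3.24x at 4) happen to be consistent with losing ~10% of efficiency per doubling of core count. The model below is only an illustration fitted to the post's numbers, not a general law of parallel scaling:

```python
import math

# Illustrative model: each doubling of core count keeps 90% of the
# efficiency of the previous step.
def speedup(cores, eff_per_doubling=0.9):
    return cores * eff_per_doubling ** math.log2(cores)

print(speedup(2))   # 1.8
print(speedup(4))   # ~3.24
```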
     
  16. ushac

    Newcomer

    Joined:
    Feb 6, 2002
    Messages:
    50
    Likes Received:
    0
    Location:
    Linköping, Sweden
    I'm no expert at this, but doesn't ~50M (~43%) sound a little high considering that the GF4 Ti has ~14M (~22%)? Anyway, I'd say it's questionable whether they can deliver what they promise, but not impossible.

    Does anyone know what feature width they are targeting Cell at? How will the fact that they are tuning a manufacturing process explicitly for the Cell chip affect size/yield/clock etc.? Has anyone seen any roadmaps of when the .1u - .06u category of sizes is estimated to be ready for commercial production?

    BTW, phynicle, those extremely highly clocked chips are usually a single transistor - a transmission amplifier or so, in exotic materials which often don't lend themselves well to microprocessor manufacturing techniques.

    Regards / ushac
     
  17. phynicle

    Newcomer

    Joined:
    Feb 9, 2002
    Messages:
    127
    Likes Received:
    0
    yup thanx ushac
     
  18. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,928
    Likes Received:
    230
    Location:
    Seattle, WA
    I think it was closer to 20GHz, based on a silicon-germanium process.

    Anyway, I didn't explain the full extent of the limitation. It's essentially based on the size of the circuit. Yes, it may be possible to build a 1 THz processor, but that processor would be exceptionally tiny. The barrier here is essentially just the speed of light: as soon as you attempt to push clock speeds too high for the size of the chip, the chip will start radiating like crazy, and the resistance of the metal will also increase significantly.
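A rough illustration of the size constraint above - how far light travels in one clock period is an upper bound on how large a chip can be if a signal must cross it within a cycle (real on-chip signals travel well below the speed of light, so the practical limit is even tighter):

```python
# Distance light covers in one clock period, as an upper bound on chip span.
SPEED_OF_LIGHT_MM_S = 3.0e11   # approximate speed of light, in mm per second

def max_chip_span_mm(clock_hz):
    return SPEED_OF_LIGHT_MM_S / clock_hz

print(max_chip_span_mm(1e9))    # 300 mm at 1 GHz - no problem
print(max_chip_span_mm(1e12))   # 0.3 mm at 1 THz - exceptionally tiny
```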
     
  19. V3

    V3
    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    3,304
    Likes Received:
    5
    That depends on the core, the process they are using, the clock speed, etc.

    Here that article predicted,

    From that article, it seems that Cell is just a way to utilise the high transistor counts they are going to achieve in the future. Instead of making more complex processors with fancy things to get more performance (which is not efficient, according to them), they propose to use simpler processors, but many of them, to get the performance. Well, that's the vibe I am getting from that article, anyway.

    Are they right? What do you think?

    They also have a prediction of transistor density in one of the tables in that article.
     
  20. Saem

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    1,532
    Likes Received:
    6
    Yes, I do.

    Games have large data sets that can be broken up and then processed by different processors easily. And that's just games. In the real world, when was the last time you ran a single program at once? It must have been years ago, really, because Windows and a lot of other programs run on your machine at once - virus scanner, firewall, ICQ, God knows what else... and that's just on your desktop. The thing is, there are so many little programs running that it seems weird to have one CPU jump around doing all these tasks rather than executing them at the same time. It'd be nice not to have my HDD accesses slow the computer to a crawl - yeah, yeah, that's more IDE's fault.

    Regardless, we constantly hear that we don't need faster CPUs, and I agree: we need wider CPUs, ones that can do a lot of things at once. We're used to doing things in the foreground while things are being done in the background. Think about how one works in the kitchen - I'm sure most people don't do one task at a time; they let one task happen all by its lonesome while they do something else.

    We live in a world where many things happen at the same time; it's time CPUs began to take large steps into this world.
     