ISSCC 2005

ChryZ said:
People seem to forget that this $300 box is not just made of PEs .... there will be a GPU, some mem, a Blu-ray drive, network interfaces, etc etc
Let's not forget the greatest voodoo of them all - the unbelievable memory/bus interfaces/connections/bandwidth.
 
Panajev2001a said:
It would be more than 256 GFLOPS at 4 GHz, there is a VMX unit attached to the PU, this Vector Unit will not sit there twiddling its thumbs I'd think ;).

yeah but not much more. the APUs ~ SPUs ~ SPEs provide the overwhelming majority of the flops performance, no?

the VMX unit attached to the PU would be like the equivalent of the FPU attached to the MIPS core in Emotion Engine, would it not? yeah it would add to the total flops performance, but not significantly increase it from ~256 Gflops.

we are ALL scrounging for flops, :LOL:
 
from the Microsoft Word document

CELL...bringing supercomputer power to everyday life with latest technology optimized for compute-intensive and broadband rich media applications

SUMMARY:
• Cell is a breakthrough architectural design -- featuring 8 Synergistic Processing Units (SPU) with Power-based core, with top clock speeds exceeding 4 GHz (as measured during initial laboratory testing).
• Cell is OS neutral - supporting multiple operating systems simultaneously
• Cell is a multicore chip comprising 8 SPUs and a 64-bit Power processor core capable of massive floating point processing
• Special circuit techniques, rules for modularity and reuse, customized clocking structures, and unique power and thermal management concepts were applied to optimize the design

CELL is a Multi-Core Architecture
• Contains 8 SPUs each containing a 128 entry 128-bit register file and 256KB Local Store
• Contains 64-bit Power Architecture™ with VMX that is a dual thread SMT design – views system memory as a 10-way coherent threaded machine
• 2.5MB of on Chip memory (512KB L2 and 8 * 256KB)
• 234 million transistors
• Prototype die size of 221 mm²
• Fabricated with 90 nanometer (nm) SOI process technology
• Cell is a modular architecture and floating point calculation capabilities can be adjusted by increasing or reducing the number of SPUs
CELL is a Broadband Architecture
• Compatible with 64b Power Architecture™
• SPU is a RISC architecture with SIMD organization and Local Store
• 128+ concurrent transactions to memory per processor
• High speed internal element interconnect bus performing at 96B/cycle
CELL is a Real-Time Architecture
• Resource allocation (for Bandwidth Management)
• Locking caches (via Replacement Management Tables)
• Virtualization support with real time response characteristics across multiple operating systems running simultaneously
CELL is Security Enabled Architecture
• SPUs dynamically configurable as secure processors for flexible security programming
CELL is a Confluence of New Technologies
• Virtualization techniques to support conventional and real time applications
• Autonomic power management features
• Resource management for real time human interaction
• Smart memory flow controllers (DMA) to sustain bandwidth

in bold = 8)



btw, what is the word on the eDRAM ????????????????
(not counting the 2.5 MB of on-chip cache + local storage)
 
Megadrive..take a rest :)
please don't cry..but..there is NO 4 PE CPU...and there is NO EDRAM!!!!!! :devilish:
 
passerby said:
*Patiently waits for coverage of tomorrow's presentation*

Indeed, I've lost track of which presentation was yesterday, and what's coming next. I want to know more about the PPE o_O

I think, according to Siboy, Cell "overall" is being presented next ("The Design and Implementation of a First-Generation CELL Processor (STI) ")- maybe we might see a working cell chip being demoed at that presentation? Or is all hope of that lost?

edit - And that presentation is today! Earlier too, at 9am local time (5pm GMT).

edit 2 - Also, a question: are ALL the conference papers available now? Or are they only being made available on the day by day basis? i.e. is the "The Design and Implementation of a First-Generation CELL Processor (STI) " paper available already, or can we expect some new info today?
 
That makes it just more than 3 hours to the start of the presentation at the time I submit this post. More info soon!
 
oh gawd. spong.com rears its head on Cell.

A year behind schedule, can the PlayStation 3’s Cell shrug off vapourware claims?
Bold claims back strong ISSCC showing – PlayStation 3 launch timeframe revealed
8th Feb 2005

Link to This Article http://news.spong.com/x?art=8309
At the International Solid State Circuits Conference (ISSCC) today, the PlayStation 3-powering Cell processor developed jointly by IBM, Sony and Toshiba was finally shown to an expectant crowd.

The unveiling has been a long time coming, with most estimates putting Cell development around 12 months behind schedule.

And down to business. The first point of note is that the PlayStation 3, arguably the Cell’s flagship host, will receive a prototype of the processor, hinting at Sony’s strong desire to expedite the launch of its next home console. The version to power the PS3 will have a 221mm² die and use 234 million transistors, made using ‘Holy Grail’ 90nm process technology.

It will contain eight 64-bit floating point processors, referred to as synergistic processor elements (SPEs), running alongside a 64-bit Power processor capable of running two threads. The SPEs will take 128-bit operands and split them into four 32-bit words. Up to 128 operands can be stored in the Cell register file.

"Today, we are very proud to share with you the first development of the Cell project, initiated with aspirations by the joint team of IBM, Sony Group and Toshiba in March 2001," said Ken Kutaragi, executive deputy president and COO, Sony Corporation, and president and Group CEO, Sony Computer Entertainment Inc. "With Cell opening a doorway, a new chapter in computer science is about to begin."

Initial production of Cell microprocessors is expected to begin at IBM's 300mm wafer fabrication facility in East Fishkill, N.Y., followed by Sony Group's Nagasaki Fab plant, later this year.

For an idiot’s guide to Cell expectations, see our article here.

So we are left with the question, now that the Cell is more than a vapourware dream (various dies were shown at the event as pictured) where does this leave the PlayStation 3 and its launch? SPOnG considers it unlikely that East Fishkill will manufacture PlayStation 3 chips, with that being left to Sony’s Nagasaki facility. Production there, as outlined above, commences at an unspecified point this year. So essentially, the PlayStation 3 could be on shelves within 2005 though of course, that’s wildly unlikely.

What is more likely is first-generation hardware dev kits shipping in time for Christmas, replacing the high-end PC with port guidelines adopted by studios across the world. SPOnG then estimates that Sony could realistically see a PlayStation 3 launch in time for Easter 2006. Although software would be very thin on the ground, SCEI has never seen this as an obstacle to launching a home machine. Option two would be to launch the PS3 late 2006 in time for the holiday period, though with Microsoft already racking up a year of Xbox 2 by then, timeframes may have to overpower a credible launch line-up.

And of course, Sony still has its Dreamcast-killing trick up its sleeve. The dark horse of ‘wait and see’ which worked with lethal efficiency at the dawn of the current generation of platforms.

Expect updates on all things PlayStation 3, right here, as they break.

Sony and STI are right on schedule, are they not? and wow. does Spong.com suck at downplaying PS3. wow. just wow.
 
and now a more interesting article from GI.biz



Cell consortium reveals chip details, claim 4GHz + clock speeds

Rob Fahey 12:11 08/02/2005

Few surprises at unveiling, but eight-SPU design and high clock speeds are confirmed

Official details of the Cell microprocessor have been revealed by partners IBM, Sony and Toshiba, with the multi-core architecture set to be capable of processing ten threads on a single chip clocked at over 4GHz.

The chip package will consist of a 64 bit Power processor - similar to the CPUs being used in the Xbox 2 and PowerMac G5 systems - which can process two threads simultaneously, along with eight "synergistic processing units".

These SPUs are the real horsepower behind the chip; each one has 256KB of its own memory and can handle computing tasks separately from the main processor, which will be responsible for dividing up tasks between the SPUs and running the operating system.

While clock speeds are an almost entirely meaningless measurement of processor performance, especially when comparing chips as radically different as Cell and the existing Intel / AMD families, much attention has been focused on the claim that the Cell could start out at speeds of over 4GHz.

Despite not being a clear indicator of actual performance, the speed is still a PR coup for IBM and its partners - since Intel's range of chips currently maxes out at 3.8GHz, while Cell may go as high as 4.6GHz in its early incarnations.

More useful as a performance measurement is the chip's rating in terms of calculations per second, or "gigaflops", with Cell rated at 256 gigaflops according to IBM - a fair bit short of an entry in the Top 500 Supercomputers list, which starts at 851 gigaflops, but still enormously powerful for a single chip, and of course the chips are designed to operate efficiently in clusters.

Indeed, it's widely expected that the PlayStation 3 could boast as many as four Cell chips, which would give a theoretical CPU performance of over 1000 gigaflops, or one teraflop - a very theoretical measure, admittedly, but still enough to earn the PS3 a place on the supercomputer list.

Another aspect of the performance which IBM has been quick to champion is the memory bandwidth available to the Cell, with the design utilising RAMBUS interface technology that delivers an unprecedented one hundred gigabytes per second of bandwidth to the chip, with separate interfaces for communicating with system memory and with other CPUs.

Despite Sony's claims, one thing we won't be seeing in the near future is Cell being used in portable devices such as mobile phones - according to an IBM spokesperson, the chip, which is initially being manufactured on a 90 nanometre process but will eventually move down to 65 nanometre, runs hot enough to require a cooling fan, like most desktop CPUs.

Spokespeople from the Cell consortium were quick to point out the flexibility of the system, saying that the multi-processor architecture could be used in a variety of different ways by game developers or other software creators.

However, game developers contacted by GamesIndustry.biz downplayed speculation that the PS3 would be incredibly difficult to program as a result of the new architecture, saying that the main difficulty would be the move to a multi-core system - a design shared by the Xbox 2 and almost certainly by the Nintendo Revolution.

The game development model which is used for PlayStation 2, where a few programmers work directly with the low level code to create libraries for specific functions and other developers simply use those libraries, masking the complexity of the underlying system, is likely to work just as well on PlayStation 3, while the prevalence of middleware such as Criterion's RenderWare or the Havok physics engine will also make the transition less painful.

Another factor fingered by developers is the fact that Sony's PlayStation Portable libraries and documentation have been widely praised by those working on the system, indicating that Sony has learned an important lesson from the PS2 launch - where much of the development difficulty lay not with the system itself, but with poorly translated (or un-translated) documentation and difficult to use libraries.

Along with the Cell processors, the PlayStation 3 is also set to use a graphics chipset from NVIDIA, which will be based on the company's next generation of GPU, following on from the hugely successful 6000 series of PC graphics cards.
 
Indeed, it's widely expected that the PlayStation 3 could boast as many as four Cell chips, which would give a theoretical CPU performance of over 1000 gigaflops, or one teraflop - a very theoretical measure, admittedly, but still enough to earn the PS3 a place on the supercomputer list.
Imagine if this were true, what that Top500 computers list would look like...

478 - The NEC Megabrain at the Jingo Heistemer Institute for Aquatic Plant Simulation : rated at 1.034 Teraflops
479 to 21,635,764 - The PlayStation3 in homes all over the world : rated at 1.00 teraflops

Despite Sony's claims, one thing we won't be seeing in the near future is Cell being used in portable devices such as mobile phones - according to an IBM spokesperson, the chip, which is initially being manufactured on a 90 nanometre process but will eventually move down to 65 nanometre, runs hot enough to require a cooling fan, like most desktop CPUs.
Lose 4 SPUs and drop the speed to 2 GHz, you've still got 64 GFlops. Drop down to 1 GHz and 32 GFlops still kicks mathematical bottom!
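
For anyone who wants to check the scaling numbers, here is a rough sketch of where they come from. It assumes each SPU retires one 4-wide single-precision multiply-add per cycle and that a multiply-add counts as 2 flops; that counting convention is an assumption on my part, not something stated in the announcement, though it does reproduce the quoted 256 GFlops at 4 GHz. The function name is made up for illustration.

```python
# Back-of-envelope peak single-precision rating for the SPU array only
# (the PU's VMX unit is ignored here).
LANES = 4           # a 128-bit operand split into four 32-bit words
FLOPS_PER_LANE = 2  # ASSUMPTION: one multiply-add per lane per cycle = 2 flops


def spu_peak_gflops(num_spus, clock_ghz):
    """Theoretical peak in GFlops for num_spus SPUs running at clock_ghz."""
    return num_spus * LANES * FLOPS_PER_LANE * clock_ghz


full_chip = spu_peak_gflops(8, 4.0)       # the quoted 256 GFlops figure
four_chips = 4 * full_chip                # GI.biz's "as many as four Cell chips"
top500_entry_gflops = 851                 # list entry point as quoted by GI.biz
cut_down_2ghz = spu_peak_gflops(4, 2.0)   # lose 4 SPUs, drop to 2 GHz
cut_down_1ghz = spu_peak_gflops(4, 1.0)   # lose 4 SPUs, drop to 1 GHz

print(f"8 SPUs @ 4 GHz : {full_chip:.0f} GFlops")
print(f"4 chips        : {four_chips:.0f} GFlops "
      f"(above the {top500_entry_gflops} GFlops entry point: "
      f"{four_chips > top500_entry_gflops})")
print(f"4 SPUs @ 2 GHz : {cut_down_2ghz:.0f} GFlops")
print(f"4 SPUs @ 1 GHz : {cut_down_1ghz:.0f} GFlops")
```

These are theoretical peaks only; nothing here says how much of that a real workload would sustain.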
 
Shifty Geezer said:
Indeed, it's widely expected that the PlayStation 3 could boast as many as four Cell chips, which would give a theoretical CPU performance of over 1000 gigaflops, or one teraflop - a very theoretical measure, admittedly, but still enough to earn the PS3 a place on the supercomputer list.
Imagine if this were true, what that Top500 computers list would look like...

478 - The NEC Megabrain at the Jingo Heistemer Institute for Aquatic Plant Simulation : rated at 1.034 Teraflops
479 to 21,635,764 - The PlayStation3 in homes all over the world : rated at 1.00 teraflops

Boo, I feel like writing to GI.biz about that. I don't wanna hear any "what happened to 4 PEs?" type comments after the PS3 unveiling ;)
 
It's weird no one started to yell out crazy performance figures..so let's start! :LOL:
In the simplest, classic vertex transformation from homogeneous coordinates to viewport coordinates, computations are often bound by div latency.
We don't know how many cycles a division takes on an SPU..so I'm going to make an assumption here. Let's say a div takes 8 cycles (even if I fear this is an overoptimistic estimate..).
If a div takes 8 cycles and everything needed to transform a vertex can be done while the hw performs the division, and all the SPUs are doing the same thing without stalling, etc..etc..etc.. we have 1 new transformed vertex per clock.
If we render a triangle strip we have a 4 GPoly/s figure! are you happy now? :LOL: :devilish:

ciao,
Marco
 
MfA said:
SiBoy said:
Branch mis-predicts are 18 cycles, so this has to be carefully managed in S/W. Mux instruction is used to avoid branches (compute both sides of an if-then and select the result instead of branching around one).
Oh god, fugly ... so the software can know 18 cycles ahead of time which way a branch will go, but will not have a way to tell the hardware that? Personally I would even prefer delay slots over this (split branches are even better, don't really expose the pipeline and they save you the headache of pro/epilogue code, but I guess they might be patented). I still prefer a disposable ISA over inefficient hardware.

Sorry, should have included this.

The SPU includes an instruction line buffer (3.5 lines storage). "One half-line holds instructions while they are sequenced into the issue logic; as another line buffer holds the single entry software managed branch target buffer (SMBTB) and two lines are used for inline prefetching."
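
To make the "compute both sides and select" idea from the quoted notes concrete, here is a minimal sketch in Python. It only illustrates the data flow of a select (a mask picks one of two already-computed values); the function names are invented for the example, it says nothing about the actual SPU instruction encoding, and the Python interpreter obviously still branches internally.

```python
def select(cond, if_true, if_false):
    """Branch-free select on integers: if_true when cond is truthy, else if_false."""
    mask = -int(bool(cond))  # -1 (all bits set) when cond holds, 0 otherwise
    return (if_true & mask) | (if_false & ~mask)


def clamp_branchy(x, limit):
    """Conventional version: the result depends on a taken/not-taken branch."""
    if x > limit:
        return limit
    return x


def clamp_select(x, limit):
    """Select version: both candidate results exist, the mask chooses one."""
    return select(x > limit, limit, x)


for value in (-3, 5, 10, 42):
    assert clamp_branchy(value, 10) == clamp_select(value, 10)
    print(value, "->", clamp_select(value, 10))
```

The trade is extra arithmetic (both sides are always evaluated) in exchange for never paying the mispredict penalty, which only pays off when both sides are cheap.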
 
Titanio said:
I think, according to Siboy, Cell "overall" is being presented next ("The Design and Implementation of a First-Generation CELL Processor (STI) ")- maybe we might see a working cell chip being demoed at that presentation? Or is all hope of that lost?

edit - And that presentation is today! Earlier too, at 9am local time (5pm GMT).

They don't do demos at ISSCC. Presentation is in a couple hours though, I'll post my notes after that.

Titanio said:
edit 2 - Also, a question: are ALL the conference papers available now? Or are they only being made available on the day by day basis? i.e. is the "The Design and Implementation of a First-Generation CELL Processor (STI) " paper available already, or can we expect some new info today?

Yes all the papers are available. But as in the case of the SPU, the speaker fills in a lot that's not in the written paper (usually).
 
nAo said:
It's weird no one started to yell out crazy performance figures..so let's start! :LOL:
In the simplest, classic vertex transformation from homogeneous coordinates to viewport coordinates, computations are often bound by div latency.
We don't know how many cycles a division takes on an SPU..so I'm going to make an assumption here. Let's say a div takes 8 cycles (even if I fear this is an overoptimistic estimate..).
If a div takes 8 cycles and everything needed to transform a vertex can be done while the hw performs the division, and all the SPUs are doing the same thing without stalling, etc..etc..etc.. we have 1 new transformed vertex per clock.
If we render a triangle strip we have a 4 GPoly/s figure! are you happy now? :LOL: :devilish:

ciao,
Marco

On the PS2's VU a DIV took 7 cycles.
Yes, about 4 gigavertices/sec on Cell, but if you use Bezier patches that's 250 million patches/sec; if the GPU tessellates each one into about 16*16 polygons, the max performance is 128 gigapolys/s :)
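
A quick sketch of how these figures chain together, strictly as napkin math. The 8-cycle div is nAo's guess from the quote. The 16 control points per patch (bicubic) and 2 triangles per tessellated quad are my own reading of how the 250 million and 128 giga numbers were reached, so treat them as assumptions.

```python
CLOCK_HZ = 4e9    # 4 GHz, per the announced clock
NUM_SPUS = 8
DIV_CYCLES = 8    # nAo's guess for SPU division latency (not a known figure)

# One vertex per SPU per div latency, all SPUs busy, no stalls.
verts_per_sec = NUM_SPUS * CLOCK_HZ / DIV_CYCLES

# In a long triangle strip each new vertex completes one new triangle.
strip_tris_per_sec = verts_per_sec

# ASSUMPTION: a bicubic Bezier patch has 16 control points to transform.
CONTROL_POINTS_PER_PATCH = 16
patches_per_sec = verts_per_sec / CONTROL_POINTS_PER_PATCH

# ASSUMPTION: the GPU tessellates each patch into a 16x16 grid of quads,
# and each quad is drawn as 2 triangles.
GRID = 16
tris_per_patch = GRID * GRID * 2
tessellated_tris_per_sec = patches_per_sec * tris_per_patch

print(f"vertices/s          : {verts_per_sec:.3g}")
print(f"strip triangles/s   : {strip_tris_per_sec:.3g}")
print(f"Bezier patches/s    : {patches_per_sec:.3g}")
print(f"tessellated tris/s  : {tessellated_tris_per_sec:.3g}")
```

Swapping DIV_CYCLES to 7 (the PS2 VU figure) nudges every number up by 8/7, which shows how sensitive the whole estimate is to that one guess.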
 
Megadrive said:
the VMX unit attached to the PU would be like the equivalent of the FPU attached to the MIPS core in Emotion Engine, would it not? yeah it would add to the total flops performance, but not significantly increase it from ~256 Gflops.
Actually, the announced EE GFlops numbers counted all the FMAC+FDIV units spread over the system (when you consider all the FPU units are basically the same, this makes sense - the FMAC&FDIV making up the R5900 FPU are equivalent to those found in VUs).
So you have:
10 FMACs + 4 FDIVs = (10*2 flop/cycle + 4*0.16 flop/cycle) * 300 MHz ~ 6.2 GFlops.
The R5900 core FPU thus contributed ~10% of the overall FPU rating, which may not be a lot, but it's definitely something.

In case of this Cell, VMX would potentially increase the rating another 32GFlops - which is also just over 10% of the cumulative rating, amusingly enough :p
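
The arithmetic above can be checked with a few lines. This is only a sketch of the counting convention described in the post (2 flops/cycle per FMAC, 0.16 flops/cycle per FDIV, R5900 FPU taken as one FMAC plus one FDIV); the helper name is made up, and the 256 and 32 GFlops Cell figures are the ones quoted in this thread.

```python
FMAC_FLOPS_PER_CYCLE = 2.0
FDIV_FLOPS_PER_CYCLE = 0.16


def rating_gflops(fmacs, fdivs, clock_ghz):
    """Peak rating under the FMAC/FDIV counting convention from the post."""
    per_cycle = fmacs * FMAC_FLOPS_PER_CYCLE + fdivs * FDIV_FLOPS_PER_CYCLE
    return per_cycle * clock_ghz


EE_CLOCK_GHZ = 0.3  # 300 MHz

ee_total = rating_gflops(10, 4, EE_CLOCK_GHZ)   # whole Emotion Engine
r5900_fpu = rating_gflops(1, 1, EE_CLOCK_GHZ)   # core FPU: 1 FMAC + 1 FDIV
r5900_share = r5900_fpu / ee_total

cell_spus_gflops = 256.0   # SPU figure quoted in the thread
vmx_gflops = 32.0          # potential VMX contribution, per the post
vmx_share = vmx_gflops / (cell_spus_gflops + vmx_gflops)

print(f"EE total      : {ee_total:.2f} GFlops")
print(f"R5900 FPU     : {r5900_fpu:.3f} GFlops ({r5900_share:.1%} of total)")
print(f"Cell with VMX : {cell_spus_gflops + vmx_gflops:.0f} GFlops "
      f"(VMX is {vmx_share:.1%} of it)")
```

Both shares land a little above 10%, which matches the comparison made in the post.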
 
Times Online Q & A:

Q&A: The superchip
Holden Frith answers some of the questions raised by the launch of the Chip

How soon will it be before I can buy a product that uses the Cell?

Its first scheduled appearance will be in the Sony PlayStation 3 games console, which should go on sale in early 2006. A prototype is expected to feature at the E3 computer fair in Los Angeles this May.

What difference will it make to the equipment?

The Cell chip is faster than existing microchips because it can work on several tasks at once. Computers with Cell chips can also share processing power so that if one computer is not working at full speed, another connected to it by a network or the internet can make use of its spare computing capacity.

What difference will Cell users notice?

In games consoles, faster microchips will allow game designers to employ higher quality sound and smoother, more realistic graphics. There are already suggestions that greater use of graphics from Hollywood movies may be possible.

In PCs, the main use will be multimedia applications as Cell chips will be better able to process the vast amount of information delivered by ever-faster broadband internet connections.

According to IBM, today's microchips were created with word processors and spreadsheets in mind and therefore struggle to cope with tasks such as downloading music and displaying video. The Cell chip has been designed to address this shortcoming.

And what difference might it make to the price of these goods?

It will make them more expensive, at least in the early stages of production. Financial analysts have suggested that the high cost of the PlayStation 3, which they predict will have to be priced at between $500 and $750 when it is launched in the United States, will deter potential buyers. As always with computers, though, the price is likely to fall quite rapidly with time.

What is meant by "clock speed" and "flash memory"?

The clock speed of a microchip is a measure of how quickly it can perform calculations and therefore how powerful it is. The 4 Gigahertz Cell chip will be able to perform four billion calculations per second.

Flash memory refers to the ability of a chip to store information so that it doesn't have to send it to another part of the computer for safe keeping. The more information it can store, the faster it will complete its work. Flash memory is familiar to many people who own digital cameras, which use Compact Flash cards to store photographs.

Will the Cell make my existing home PC or games console obsolete, or can I upgrade them?

The PlayStation 3 is likely to replace the PS2 just as that replaced the initial PlayStation in 2000, but there will probably be a crossover period when new game releases will continue for the older model.

Upgrading existing PCs is unlikely to be possible as the new processor requires completely different software. However, since many people do not push their PCs to the limits, the mass desertion of traditional microchips is unlikely in the immediate future.

http://business.timesonline.co.uk/article/0,,9075-1475581,00.html

Interesting. :)

Fredi
 