Old 29-Nov-2004, 05:43   #26
one
Unruly Member
 
Join Date: Jul 2004
Location: Minato-ku, Tokyo
Posts: 4,705
Default

The five CELL-related programs at ISSCC 2005:

10.2 The Design Implementation of a First-Generation CELL Processor

This is the main overview program.

7.4 A Streaming Processing Unit for a CELL Processor

26.7 A 4.8GHz Fully Pipelined Embedded SRAM in the Streaming Processor of a CELL Processor

These are for streaming processors.

20.3 A Double-Precision Multiplier for a First-Generation CELL Processor

This is by IBM,

28.9 Clocking and Circuit Design for a Parallel I/O on a First-Generation CELL Processor

and this is by Rambus, Inc. & Stanford University.

Note that all of these programs are supposed to describe the First-Generation CELL Processor, which was already in production early this year. ISSCC doesn't accept armchair theories, only actual samples.
Old 29-Nov-2004, 05:50   #27
Vince
 
Join Date: Apr 2002
Posts: 2,158
Default

Quote:
Originally Posted by one
Note that all of these programs are supposed to describe the First-Generation CELL Processor, which was already in production early this year. ISSCC doesn't accept armchair theories, only actual samples.
Cool, so I'd assume we can say with reasonable probability that the first-generation parts are based on the 10S (90nm sSOI) process, which would bode well indeed, given the speeds attained, if they scale as expected to 65nm. Oh, and thanks for the round-up, One.

EDIT: Never mind, it states as much on the first page. Sorry for the useless post.
Old 29-Nov-2004, 05:52   #28
j^aws
Senior Member
 
Join Date: Jun 2004
Posts: 1,908
Default

Quote:
Each processing element comprises a Power-architecture 64-bit RISC CPU,
This seems to have been overlooked, Power and NOT PowerPC for the PUs? Hmmm...
Old 29-Nov-2004, 05:53   #29
mozmo
Member
 
Join Date: Jul 2002
Posts: 124
Default

Quote:
Also, I wouldn't put much faith in Mr. Zimmon's comments, he's doing the same bandwidth/flop math that Deadmeat used a year or so ago. External bandwidth isn't an indication of calculation ability (a la contemporary GPUs).
Isn't this on the basis of constantly using data that isn't cached in memory? It's probably the worst case; for data that can be cached on chip the number would be higher. I guess the performance will hinge on how much on-chip cache memory Cells have. I suspect it won't be huge, so for large data sets, which is what a lot of next-gen games will be using, the memory bandwidth may hinder a lot of the potential performance of the Cell units. I guess we'll see if the 128K is enough. It could all just be another case of Sony concentrating on theoretical numbers instead of balancing the chip for maximum real-world performance.
Old 29-Nov-2004, 06:24   #30
Fafalada
Senior Member
 
Join Date: Feb 2002
Posts: 2,767
Default

Quote:
Isn't this on the basis of constantly using data that isn't cached in memory.
No, it's based on doing exactly one floating-point operation for every data read/write, which is much more extreme than simply not hitting the cache often. Even a CPU with no local memory or cache should easily beat that number with a reasonably sized register set.

Case in point - applying this logic to current Xenon specs, you top the machine out at 6.2GFlops...
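
To make the arithmetic behind that objection concrete, here is a rough sketch of the "one flop per memory access" bound. It is not taken from any post in the thread: the 25.6 GB/s bus figure, the 4-byte operand size, and the reuse factor of 20 are all hypothetical values picked only to show how the estimate moves.

```python
def bandwidth_bound_gflops(bandwidth_gb_s, bytes_per_access=4, flops_per_access=1.0):
    """Upper bound on GFLOPS if every operand has to cross the external bus.

    bandwidth_gb_s   -- external memory bandwidth in GB/s (assumed figure)
    bytes_per_access -- size of one operand, 4 bytes for a single-precision float
    flops_per_access -- how many floating-point ops are done per operand fetched
    """
    accesses_per_second_g = bandwidth_gb_s / bytes_per_access
    return accesses_per_second_g * flops_per_access


# Hypothetical bus bandwidth, for illustration only.
bus_gb_s = 25.6

# The pessimistic math: exactly one flop for every read or write.
one_flop_per_access = bandwidth_bound_gflops(bus_gb_s)

# Same bus, but each fetched operand is reused from registers or local
# memory for 20 operations before new data is needed (assumed reuse factor).
with_reuse = bandwidth_bound_gflops(bus_gb_s, flops_per_access=20)

print(f"one flop per access: {one_flop_per_access:.1f} GFLOPS")
print(f"20 flops per access: {with_reuse:.1f} GFLOPS")
```

The point of the sketch is only that the bound scales linearly with how much work is done per operand fetched, so a figure derived with a reuse factor of one says little about a chip with a decent register set or local store.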
__________________
"I see Subversion as being the most pointless project ever started."
Linus Torvalds
Old 29-Nov-2004, 06:41   #31
Brimstone
B3D Shockwave Rider
 
Join Date: Feb 2002
Posts: 1,800
Default

The memory hierarchy is configured differently on a stream processor. Multimedia applications don't run well on standard cache configurations like those of an x86 CPU.

Quote:
Imagine overcomes the bandwidth bottlenecks of global register files and memory systems by using a three-level bandwidth hierarchy organized to support stream operations. Streams are transferred between memory and a stream register file (SRF) by a four-bank streaming memory system (2GB/s) that reorders references to improve bandwidth. Once a stream is loaded from memory, it is typically circulated between the SRF and the arithmetic clusters several times before returning the result to memory, exploiting the 32GB/s bandwidth of the SRF. Finally, during a computation kernel, intermediate results are forwarded directly between local register files associated with the arithmetic units without need to return to the global register file, using the 544GB/s local register bandwidth. On representative benchmark programs, exploiting the locality inherent in stream applications in this manner reduces bandwidth demands on global register ports by a factor of 20 compared to a typical scalar architecture.

Imagine overcomes the performance limiting effects of conditional operations by sorting streams according to a conditional variable rather than through conditional control flow. These conditional stream operations divide data into homogenous sets that can then be processed without the overhead of conditional control instructions. Compared to conventional approaches of branch prediction or predication, conditional stream operations enable very high levels of instruction and data parallelism to be exploited without incurring a large penalty on every unpredictable conditional operation.
http://cva.stanford.edu/imagine/proj..._overview.html
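
As a quick check on the three-level hierarchy described in that quote, the sketch below just divides the quoted bandwidth figures (2, 32 and 544 GB/s) by one another. The level names and the helper function are my own labels for illustration, not anything from the Imagine project.

```python
# Bandwidth figures as quoted in the Imagine overview, in GB/s.
# The level names are informal labels chosen for this sketch.
levels = [
    ("memory <-> SRF", 2.0),
    ("SRF <-> arithmetic clusters", 32.0),
    ("local register files", 544.0),
]


def step_ratios(levels):
    """Return (name, ratio) pairs: each level's bandwidth over the level below it."""
    return [
        (upper_name, upper_bw / lower_bw)
        for (_, lower_bw), (upper_name, upper_bw) in zip(levels, levels[1:])
    ]


ratios = step_ratios(levels)
overall = levels[-1][1] / levels[0][1]

for name, ratio in ratios:
    print(f"{name}: {ratio:.0f}x the level below")
print(f"local registers vs. memory: {overall:.0f}x")
```

Read that way, each step up the hierarchy is worth roughly a 16x to 17x jump, and the local registers offer 272 times the bandwidth of the memory system, which is why the quoted design leans so heavily on keeping intermediate results out of memory.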
Old 29-Nov-2004, 06:47   #32
Brimstone
B3D Shockwave Rider
 
Join Date: Feb 2002
Posts: 1,800
Default

Quote:
Originally Posted by Jaws
Quote:
Each processing element comprises a Power-architecture 64-bit RISC CPU,
This seems to have been overlooked, Power and NOT PowerPC for the PUs? Hmmm...
The 64-bit Power core running in CELL will probably be a simple scalar CPU based on the PowerPC family. And the same 64-bit Power core will be in Xbox 2 and Revolution, but configured differently. That's my best guess.
Old 29-Nov-2004, 06:49   #33
FatherJohn
Junior Member
 
Join Date: Aug 2004
Posts: 33
Default

Now we still don't know if Sony's going to put 1 cell, 2 cells, or 4 cells in the PS3.

Comparing apples to pears:
Code:
Item            1 Cell            Xbox 2
CPUs            1 @ 4.8GHz        3 @ 3.5+GHz
ALUs            8 @ 4.8GHz        48 @ 500+MHz
On-CPU RAM      128K*8 = 1MB      1MB
One Cell seems very roughly in the same ballpark as an Xbox 2. (Especially if you think the GHz numbers end up converging, which is likely because both companies are buying their CPU technology from the same vendor.)

Sony seems to have a tough choice to make: launch at the same time as Xbox 2, but roughly at performance parity, or ride Moore's Law for one or two extra generations to get a more powerful product.

It looks like they're planning on waiting one year, which is 2/3rds of a Moore's Law cycle. That would tend to indicate that the PS3 will be a 2 Cell system. (And would tend to indicate that PS 3 will be 2 to 3 times more powerful than Xbox 2.)
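
The "2/3rds of a Moore's Law cycle" step can be written out as a small calculation. This is only a sketch: it assumes the common 18-month doubling period (which is what makes one year two-thirds of a cycle) and treats transistor budget as a smooth exponential, neither of which is stated in the post.

```python
def moore_growth(months_waited, doubling_months=18.0):
    """Transistor-budget multiplier after waiting, assuming a fixed doubling period."""
    return 2.0 ** (months_waited / doubling_months)


# One year of waiting is 12/18 = 2/3 of a doubling period.
one_year = moore_growth(12)
two_years = moore_growth(24)

print(f"after 12 months: {one_year:.2f}x")
print(f"after 24 months: {two_years:.2f}x")
```

Under these assumptions a one-year wait buys roughly 1.6x the transistor budget, which is a bit short of a straight doubling, so the jump from one Cell to two rests on more than Moore's Law alone.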
Old 29-Nov-2004, 06:53   #34
Vince
 
Join Date: Apr 2002
Posts: 2,158
Default

Quote:
Originally Posted by FatherJohn
Sony seems to have a tough choice to make: launch at the same time as Xbox 2, but roughly at performance parity, or ride Moore's Law for one or two extra generations to get a more powerful product.
Why would they wait? Most importantly, we don't know the area of one of these 90nm-fabbed PEs yet, so we can't make a prediction for a yieldable 65nm IC. And, secondly, they have already been sampling 65nm, 2nd-generation ICs, so whatever will end up in the PS3 likely exists already. The ones being discussed here have been around for a while. I also have my reasons to think the GHz won't converge, namely that I believe there are architectural features that make Cell inherently faster. And word on the street recently is that synthesis and placement is basically fucked for 65nm at IBM; as I told nAo, I see this as hurting the XCPU project more than Cell.

PS. You just compared a single Cell Processor to the entire X2 system and called that parity. Do you think the PS3 won't have a "GPU"? Think about it.
Old 29-Nov-2004, 07:05   #35
one
Unruly Member
 
Join Date: Jul 2004
Location: Minato-ku, Tokyo
Posts: 4,705
Default

Quote:
Originally Posted by FatherJohn
Now we still don't know if Sony's going to put 1 cell, 2 cells, or 4 cells in the PS3.
Streaming processors will consume very little power even at high frequency 8) Four PEs in a 65nm Broadband Engine is very likely, I guess.
Old 29-Nov-2004, 07:09   #36
FatherJohn
Junior Member
 
Join Date: Aug 2004
Posts: 33
Default

Yeah, I don't think PS3 will have a full GPU. There's certainly no reason to have any vertex processors. And depending on how the second Cell's APUs are configured, they may be able to use them as some sort of renderer. (The Stanford Imagine group tried an experiment where they configured their stream processor as a Reyes-style renderer. It worked, but it was abysmally slow: 20 times slower than a contemporary Z-buffer-based renderer, while using 3x the transistors. The dirty secret of stream-based processors is that they are very hard to get useful work out of.)

Sony's a fine company, but they can't magically put more transistors into their products for the same price than other people can. So they are limited in what they can do. They have to make engineering trade-offs, the same as their competitors.

They've chosen to invest heavily in APUs, and there isn't any practical use for APUs in a game machine other than rendering. So I assume that's what they're there for.

(People can wave their hands and say that the extra compute power is for physics or AI, or brand new uses that nobody's thought of yet. But I think that's pure wishful thinking. It's a waste of money to put in so much extra power unless most games can take advantage of it.)
Old 29-Nov-2004, 07:26   #37
V3
Senior Member
 
Join Date: Feb 2002
Posts: 3,266
Default

Quote:
The dirty secret of stream-based processors is that they are very hard to get useful work out of.
Are you forgetting that the programmable parts in GPUs are also based on stream processors?
Old 29-Nov-2004, 07:27   #38
one
Unruly Member
 
Join Date: Jul 2004
Location: Minato-ku, Tokyo
Posts: 4,705
Default

http://pc.watch.impress.co.jp/docs/2004/1129/cell.htm

A prototype Cell workstation by Sony/SCE/IBM is already up and running; its estimated performance is 16 TFlops per rack. Now how many Cells does it contain?
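
One way to take a stab at that question is a back-of-envelope estimate like the one below. Only the 16 TFlops figure comes from the announcement; the 8 units and the 4.8GHz clock are borrowed from guesses earlier in this thread, and the 4 SIMD lanes and 2 flops per lane per cycle (a fused multiply-add) are pure assumptions on my part, so treat the result as illustrative rather than as a real chip count.

```python
import math


def peak_gflops(units, simd_lanes, flops_per_lane_per_cycle, clock_ghz):
    """Theoretical peak GFLOPS of one chip under the stated assumptions."""
    return units * simd_lanes * flops_per_lane_per_cycle * clock_ghz


def chips_needed(target_tflops, chip_gflops):
    """Smallest whole number of chips whose combined peak reaches the target."""
    return math.ceil(target_tflops * 1000.0 / chip_gflops)


# Every per-chip parameter here is a guess, not a published spec.
per_chip = peak_gflops(units=8, simd_lanes=4, flops_per_lane_per_cycle=2, clock_ghz=4.8)
at_full_clock = chips_needed(16, per_chip)

# Same guesses, but with the clock halved to 2.4GHz.
per_chip_slow = peak_gflops(units=8, simd_lanes=4, flops_per_lane_per_cycle=2, clock_ghz=2.4)
at_half_clock = chips_needed(16, per_chip_slow)

print(f"{per_chip:.1f} GFLOPS per chip -> {at_full_clock} chips per rack")
print(f"{per_chip_slow:.1f} GFLOPS per chip -> {at_half_clock} chips per rack")
```

With these guesses the rack would hold somewhere between about 50 and about 100 chips depending on the clock, which mostly shows how sensitive the answer is to numbers nobody outside the companies has yet.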
Old 29-Nov-2004, 07:27   #39
Vince
 
Join Date: Apr 2002
Posts: 2,158
Default

Nevermind... It has begun. No use arguing about it now.
Old 29-Nov-2004, 07:35   #40
V3
Senior Member
 
Join Date: Feb 2002
Posts: 3,266
Default

http://www.gfdata.de/gamefront-temp/sonycell1.pdf

http://www.gfdata.de/gamefront-temp/sonycell2.pdf
Old 29-Nov-2004, 08:03   #42
j^aws
Senior Member
 
Join Date: Jun 2004
Posts: 1,908
Default

Quote:
The companies expect that a one rack Cell processor-based workstation will reach a performance of 16 teraflops or trillions of floating point calculations per second.
I want one!!11!1!!!!!!1!!1!!1
Old 29-Nov-2004, 08:18   #43
j^aws
Senior Member
 
Join Date: Jun 2004
Posts: 1,908
Default

Quote:
Cell is optimized for compute-intensive workloads and broadband rich media applications, including computer entertainment, movies and other forms of digital content. Other highlights of the Cell processor design include:

• Multi-thread, multicore architecture.

• Supports multiple operating systems at the same time.

• Substantial bus bandwidth to/from main memory, as well as companion chips.

• Flexible on-chip I/O (input/output) interface.

• Real-time resource management system for real-time applications.

• On-chip hardware in support of security system for intellectual property protection.

• Implemented in 90 nanometer (nm) silicon-on-insulator (SOI) technology. Additionally, Cell uses custom circuit design to increase overall performance, while supporting precise processor clock control to enable power savings.
What's with the multiple OS's? Virtual Os's?
Old 29-Nov-2004, 08:31   #44
Jov
Member
 
Join Date: Dec 2002
Posts: 503
Default

Quote:
Originally Posted by Jaws
• Supports multiple operating systems at the same time.
What's with the multiple OS's? Virtual Os's?

It’s called virtualisation... it’s a big thing that’s happening in the server world. The goal is to make optimal use of the available h/w resources.

Think VMware, Sun Domains/Zones, HP n/vPar, and IBM's LPAR (whatever it's called).
__________________
Jov
Old 29-Nov-2004, 08:32   #45
Guden Oden
Senior Member
 
Join Date: Dec 2003
Posts: 6,201
Default

Any specifics given so far, apart from 4.8GHz (!) SRAM clock? Dare we even hope for 4.8GHz ALU clock as well? I myself won't believe that just yet, lest I be disappointed when the actual speed is revealed and it turns out to be lower, say, 2.4GHz. :P
__________________
Top one reason why capital punishment is immoral and wrong:
You can release an innocently convicted man from jail,
but you cannot release an innocently convicted man from death.
Old 29-Nov-2004, 08:33   #46
Jov
Member
 
Join Date: Dec 2002
Posts: 503
Default

Quote:
Originally Posted by Jaws
Quote:
The companies expect that a one rack Cell processor-based workstation will reach a performance of 16 teraflops or trillions of floating point calculations per second.
I want one!!11!1!!!!!!1!!1!!1
Now all we need to know is how many Cells are there in the WS... and at what speed?
__________________
Jov
Old 29-Nov-2004, 08:33   #47
j^aws
Senior Member
 
Join Date: Jun 2004
Posts: 1,908
Default

Quote:
With the capability to support multiple operating systems, Cell can perform both PC/WS operating systems as well as real-time CE/Game operating systems at the same time.
Sounds like Longhorn... :P
Old 29-Nov-2004, 08:33   #48
one
Unruly Member
 
Join Date: Jul 2004
Location: Minato-ku, Tokyo
Posts: 4,705
Default

Quote:
Originally Posted by Jaws
What's with the multiple OS's? Virtual Os's?
Maybe something like Intel's Vanderpool. Say you have a Cell workstation: you assign 80% of it to Linux and 20% to another, real-time OS. Something like that, I assume.
Old 29-Nov-2004, 08:36   #49
Jov
Member
 
Join Date: Dec 2002
Posts: 503
Default

Quote:
Originally Posted by Guden Oden
Any specifics given so far, apart from 4.8GHz (!) SRAM clock? Dare we even hope for 4.8GHz ALU clock as well? I myself won't believe that just yet, lest I be disappointed when the actual speed is revealed and it turns out to be lower, say, 2.4GHz. :P
Even at that speed (2.4GHz), if the overall performance doesn't drop significantly, how disappointed will you be then?
__________________
Jov
Old 29-Nov-2004, 08:43   #50
j^aws
Senior Member
 
Join Date: Jun 2004
Posts: 1,908
Default

Quote:
Originally Posted by Jov
Quote:
Originally Posted by Guden Oden
Any specifics given so far, apart from 4.8GHz (!) SRAM clock? Dare we even hope for 4.8GHz ALU clock as well? I myself won't believe that just yet, lest I be disappointed when the actual speed is revealed and it turns out to be lower, say, 2.4GHz. :P
Even at that speed (2.4GHz), if the overall performance doesn't drop significantly, how disappointed will you be then?
Also the PE bus is 6.4GHz and the PUs are 64bit Power cores as opposed to PowerPC.

All times are GMT +1.