Details trickle out on CELL processor...

Such a signalling rate would push 6.4 Gbps with 2 pins per bit (differential signalling).

Sounds about right, which makes it ~100 GB/sec for a 128-bit memory bus, though I heard the XDR going into PS3 will be a downgraded version.
 
PC-Engine said:
Such a signalling rate would push 6.4 Gbps with 2 pins per bit (differential signalling).

Sounds about right, which makes it ~100 GB/sec for a 128-bit memory bus, though I heard the XDR going into PS3 will be a downgraded version.

You heard wrong, IMHO... last I checked they cut down main RAM and doubled the bandwidth, so we must be hearing different things.
 
version said:
Vince said:
Bohdy said:
It seems unlikely (silly even) that the overhead is referring to "per-pin" bandwidth.

Um, well, correct me if I'm wrong, but how likely is it that Cell's off-die communication (6.4Gbit/sec) is less than the infamous EE -> GS bus in the PlayStation2?

For a 4.6GHz processor, I'm going to guess that the fact that Yellowstone/XDR just happens to be 6.4Gbit/pin when the base clock is 800MHz is the more likely scenario.


Between Cell and the GPU there will be 128 pins:
6.4 Gbit/pin * 128 pins = 819.2 Gbit/s = 102.4 GByte/s

Isn't Cell>GPU Redwood? Are both Redwood and Yellowstone 6.4GBits/pin?
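For reference, a quick sketch of the arithmetic behind that 102.4 GB/s figure, assuming the numbers floated in this thread (6.4 Gbit/s per data pin, a 128-bit wide link, and 2 physical pins per bit for differential signalling); none of these are confirmed specs:

```python
# Rough bandwidth arithmetic for the speculated 128-bit Cell<->GPU link.
# All figures are the ones discussed in this thread, not confirmed specs.

GBIT_PER_PIN = 6.4   # signalling rate per data pin (Gbit/s)
DATA_BITS    = 128   # speculated width of the Cell->GPU link
PINS_PER_BIT = 2     # differential signalling uses a pin pair per bit

total_gbit_per_s   = GBIT_PER_PIN * DATA_BITS   # 819.2 Gbit/s
total_gbyte_per_s  = total_gbit_per_s / 8       # 102.4 GByte/s
physical_data_pins = DATA_BITS * PINS_PER_BIT   # 256 pins for the data lanes alone

print(f"{total_gbit_per_s:.1f} Gbit/s = {total_gbyte_per_s:.1f} GByte/s "
      f"over {physical_data_pins} physical data pins")
```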
 
london-boy said:
Errm... How can this thing run and not blow up after 0.00001 secs at 85C??
Maybe the chip was designed to run at that temperature? Do you realize how hot Apple's G5s get?
The 2.5GHz model reaches 85C after 20 minutes of load in some cases. The chip however is designed for temperatures up to 120C, so it's not a problem; that's just how it was designed.
Meanwhile Intel CPUs are usually spec'ed for around 76C (for the Xeon at least) and AMD's for around 90C (though that's for the AthlonXP, I haven't seen numbers for the Athlon64).

L.
 
Sounds about right, which makes it ~100 GB/sec for a 128-bit memory bus, though I heard the XDR going into PS3 will be a downgraded version.
Memory bus and GPU-CPU interconnect aren't the same thing though - just look at PS2 if you need evidence :p

Except that THIS time - I sincerely hope they make sure the interconnect bandwidth is "higher" than the main memory bandwidth, not the other way around.
We're supposed to be outputting generated data from all those APUs, which means MORE traffic to the GPU, not LESS, Sony :oops:
 
Fafalada said:
We're supposed to be outputting generated data from all those APUs, which means MORE traffic to the GPU, not LESS, Sony :oops:

Given that this thread, as all other CELL threads, is 98% speculation, there's still room for hope :) And for Sony to b0rk it all.

Cheers
Gubbi
 
Fafalada said:
Sounds about right, which makes it ~100 GB/sec for a 128-bit memory bus, though I heard the XDR going into PS3 will be a downgraded version.
Memory bus and GPU-CPU interconnect aren't the same thing though - just look at PS2 if you need evidence :p

Yes..that's why I said XDR instead of Visualizer..

XDR ~ Yellowstone

CPU->GPU = Redwood

Correction: Yellowstone is not XDR. Yellowstone uses 4-level signaling while XDR uses 2.
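To see why that distinction matters, note that a 4-level signal carries two bits per symbol while a 2-level signal carries one, so the same per-pin data rate needs only half the symbol rate. A minimal sketch of that relationship, using the 6.4 Gbit/s per-pin figure from this thread (the symbol rates below are just that number divided through, not published specs):

```python
import math

def bits_per_symbol(levels: int) -> int:
    # A signal with N distinct voltage levels carries log2(N) bits per symbol.
    return int(math.log2(levels))

TARGET_GBIT_PER_PIN = 6.4  # per-pin data rate discussed in this thread

for levels in (2, 4):      # 2-level (as claimed for XDR) vs 4-level (as claimed for Yellowstone)
    bps = bits_per_symbol(levels)
    symbol_rate = TARGET_GBIT_PER_PIN / bps   # required symbol rate, Gsymbols/s
    print(f"{levels}-level signalling: {bps} bit(s)/symbol -> {symbol_rate} Gsym/s per pin")
```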
 
one said:
Brimstone said:
So Microsoft, Nintendo, and Sony all have the same basic processor? They all have POWER.

IBM, Sony, Sony Computer Entertainment Inc. and Toshiba Unveil Cell Processor


Companies Released First Details of Multicore Chip Comprising Power Architecture and Synergistic Processor

CELL is just a codename, like Nintendo's Gekko CPU. POWER is used explicitly.

So the system software which controls the hardware is much more important in the next gen. I bet Cell is useless without a smart OS.


Right now I suspect Microsoft is going to release a PC based around Power, to go along with the Xbox 2 $300 console. The reports of MS releasing 3 versions of Xbox Next, plus ATI and VIA working on an XDDR memory standard, are hints at this possibility. Also the "open platform" statements from IBM.
 
Brimstone said:
Also the "open platform" statements from IBM.

That'd be STI selling Cell to other companies along with a Linux version for it... MS's position in the picture? Go figure.
 
So why no mention of eDRAM? Stream processors don't need eDRAM if their memory hierarchies are tiered?

We have 128*128bit APU registers, 128KB SRAM LS, PU (Power core) cache, L1/L2/L3, and Yellowstone RDRAM? Is that sufficient in the absence of eDRAM?
 
PC-Engine said:
Just wanted to point out that Yellowstone uses 4-level signaling while XDR uses 2 levels.

Yellowstone and XDR are the same thing with a name switch basically.

In both you take the base clock on the XDR I/O interface on the DRAM chip and the XDR Memory Controller and multiply it with a PLL (the PLL is "programmable"... it can do 4x and 8x IIRC, with the 4x setting being the one they have talked about the most so far). From that point we use DDR signalling, using both the falling and rising edges of the clock to transmit data.

We basically encode 8 (or more) bits per pin in each cycle of the external clock.

4x is the setting they prefer for the PLL, so a 6.4 Gbit/s signalling rate likely implies an 800 MHz external clock.
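A tiny sketch of that back-calculation, assuming the 4x PLL setting and double-data-rate transfer described above (the resulting 800 MHz base clock is this thread's inference, not a confirmed spec):

```python
# Working backwards from the 6.4 Gbit/s per-pin rate discussed above.
PER_PIN_RATE_GBIT = 6.4
PLL_MULTIPLIER    = 4    # the "programmable" PLL setting talked about most (8x also possible)
DDR_FACTOR        = 2    # data transferred on both rising and falling clock edges

external_clock_ghz = PER_PIN_RATE_GBIT / (PLL_MULTIPLIER * DDR_FACTOR)
bits_per_ext_clock = PLL_MULTIPLIER * DDR_FACTOR

print(f"Implied external clock: {external_clock_ghz * 1000:.0f} MHz, "
      f"{bits_per_ext_clock} bits per pin per external clock cycle")
```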
 
Jaws said:
So why no mention of eDRAM? Stream processors don't need eDRAM if their memory hierarchies are tiered?
Well... a lot of stuff is not being mentioned; even the 128KB SRAM was not mentioned (AFAIK there is 256 kb SRAM per APU ;) ), and no other kind of L1/L2/L3 cache...

We have 128*128bit APU registers, 128KB SRAM LS, PU (Power core) cache, L1/L2/L3, and Yellowstone RDRAM? Is that sufficient in the absence of eDRAM?
It could be enough... but who knows? We really need details we don't have at the moment to make an educated guess; even so, without eDRAM I don't know how they plan to keep that thing fed with fresh data to process :)

ciao,
Marco
 
nAo said:
Well... a lot of stuff is not being mentioned; even the 128KB SRAM was not mentioned (AFAIK there is 256 kb SRAM per APU ;) ), and no other kind of L1/L2/L3 cache...

New source needed ;):

  • EETimes said:
    They include a 128-kbyte local pipe-lined SRAM that goes between the stream processor and the local bus, a bank of one hundred twenty-eight 128-bit registers and a bank of four floating-point and four integer execution units, which appear to operate in single-instruction, multiple-data mode from one instruction stream. Software controls data and instruction flow through the processor.
 
nAo said:
(AFAIK there is 256 kb SRAM per APU ;) )

Where did you read 256KB SRAM? The EEtimes article states 128KB SRAM?

They include a 128-kbyte local pipe-lined SRAM that goes between the stream processor and the local bus, a bank of one hundred twenty-eight 128-bit registers and a bank of four floating-point and four integer execution units, which appear to operate in single-instruction, multiple-data mode from one instruction stream. Software controls data and instruction flow through the processor.

http://www.eet.com/semi/news/showAr...BCCKH0CJUMEKJVN?articleId=54200580&pgno=2
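For scale, the register file in that description is tiny next to the local store; a quick back-of-the-envelope using only the figures quoted from EETimes (the per-APU totals are my own arithmetic, not additional disclosed specs):

```python
# Figures taken from the EETimes description quoted above.
NUM_REGISTERS = 128    # one hundred twenty-eight registers per APU
REGISTER_BITS = 128    # each register is 128 bits wide
LOCAL_SRAM_KB = 128    # local pipelined SRAM between stream processor and local bus

register_file_kb = NUM_REGISTERS * REGISTER_BITS / 8 / 1024   # 2 KB
print(f"Register file: {register_file_kb:.0f} KB, local SRAM: {LOCAL_SRAM_KB} KB per APU")
```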
 
Oops :oops:
[edit] I never read any figure about a 256Kb local SRAM... it's just a rumour.
 
Vince said:
nAo said:
Well... a lot of stuff is not being mentioned; even the 128KB SRAM was not mentioned (AFAIK there is 256 kb SRAM per APU ;) ), and no other kind of L1/L2/L3 cache...

New source needed ;):

  • EETimes said:
    They include a 128-kbyte local pipe-lined SRAM that goes between the stream processor and the local bus, a bank of one hundred twenty-eight 128-bit registers and a bank of four floating-point and four integer execution units, which appear to operate in single-instruction, multiple-data mode from one instruction stream. Software controls data and instruction flow through the processor.

Well, for who? For the EET ;)?
 
It seems unlikely (silly even) that the overhead is referring to "per-pin" bandwidth.

Believe it or not, when you describe the performance of an external interface solution, that's what you give, and it's generally understood. I am surprised you lot are discussing this.

And I would like to hear something more solid than conjecture about "stream-processors", as right now they are sounding like they are right up there with Superman and Batman.

When people say stream processor, they mean the processor cannot create additional data; data just passes through and gets processed. An example would be a vertex shader.
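A toy illustration of that definition, i.e. a kernel that transforms each element of an input stream without creating or discarding data; the vertex-transform example is hypothetical, not Cell or PS2 code:

```python
from typing import Callable, Iterable, Iterator, Tuple

Vertex = Tuple[float, float, float]

def stream_kernel(vertices: Iterable[Vertex],
                  transform: Callable[[Vertex], Vertex]) -> Iterator[Vertex]:
    # One output per input: data just passes through and gets processed,
    # which is the "stream processor" behaviour described above.
    for v in vertices:
        yield transform(v)

# Example: a trivial "vertex shader" that scales every vertex by 2.
scaled = list(stream_kernel([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)],
                            lambda v: (v[0] * 2, v[1] * 2, v[2] * 2)))
print(scaled)   # [(2.0, 4.0, 6.0), (8.0, 10.0, 12.0)]
```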
 
From EEtimes,

And each processing element is connected to its neighbors in the cell by high-speed "highways." Designed by Rambus Inc. with a team from Stanford University, these highways — or parallel bundles of serial I/O links — operate at 6.4 GHz per link.

Redwood between Cell PEs... You know there might not even be a BE? ...Is this PE>PE 128-bit or 1024-bit? My memory's going badly! :p
 