Details trickle out on CELL processor...

Guden Oden said:
why would Cell in your opinion use a cache that runs at 4x clock of the ALU fed from the cache? You never answered this.
Nitpicking mode: it's not a cache, it's a local SRAM. SPUs have an L1 cache too.
 
one said:
Brimstone said:
but I'm still very skeptical.

ISSCC is the most authoritative conference in the semiconductor academic society and a paper without an actually running sample is not accepted. Don't confuse it with self-proclaimed press releases.

Just FYI, this isn't true.

Aaron Spink
speaking for myself inc.
 
From ISSCC site:
The most common reason for paper rejection is a lack of clear circuit evidence of what is novel in the work and the extent to which it advances the state of the art. Successful submissions contain specific new results, sufficient detail and data to be understood, and schematics and measured results for key circuits when appropriate.
 
aaronspink said:
one said:
Brimstone said:
but I'm still very skeptical.

ISSCC is the most authoritative conference in the semiconductor academic society and a paper without an actually running sample is not accepted. Don't confuse it with self-proclaimed press releases.

Just FYI, this isn't true.

Where and why? :rolleyes:
 
Guden Oden said:
Brimstone said:
I ask you once more: why would Cell in your opinion use a cache that runs at 4x clock of the ALU fed from the cache? You never answered this.

Because a Stream Processor is constantly prefetching data, unlike a regular CPU cache. Also, every port on the SRAM has the possibility of being accessed simultaneously on a stream processor.
 
Jaws said:
Jov said:
Cryect said:
I'm curious where did this 15 TFLOPs workstation come from?

And why would Sony be sending 15 TFLOP workstations for a console that's only 1/15th that at most? (heh I sure hope it's not so we can have really nice CGI cutscenes)

The Register link Jaws points to definitely sounds more believable.

I have to say that rack looks nice and maybe the multi-threading is their idea of hyperthreading? Just throwing out a wild guess with that one.

At 90nm, was it 16 TFlops for a rack mountable machine and 2 TFlops for the Workstation?

The STI press release says 16 TFlops "will reach", which implies 2nd gen. The Register mentions the "prototype" is 2 TFlops, which implies those are the ones going to devs as 1st gen.
"IBM, SONY AND SCEI POWER-ON CELL PROCESSOR-BASED WORKSTATION"

http://www.anandtech.com/news/shownews.aspx?i=23434
 
Brimstone said:
Because a Stream Processor is constantly prefetching data unlike a regular CPU cache
We are not talking about a cache, but a local RAM. SPUs already have a prefetch cache.
Also every port on SRAM has the possibility of being accessed simultaneously on a stream processor.
Who are the actors that need to access the APU local SRAM simultaneously?
 
Heh, when I first read that "Power On Cell Workstation...", I thought they had actually flipped the power switch to ON on the first Cell Workstation.

But it seems the "Power-On" in "Power-On Cell Processor based Workstation" is actually like "Power-PC" :LOL: ....Right? Or isn't it????

Edit: :oops: .... actually when I read that news, it says they have indeed powered-on a Cell workstation... bad braincell!! Try to think before posting!!
 
Please be more careful next time, since you only have one CELL left.
Actually I'm only using one.
I've been afraid to enable the other one, as I fear it might turn me into some superintelligent madman!
I might start to use it if the PS3 does not live up to the hype. Then I'll make my own superconsole, and everybody will want my super powerful StronG3000 instead of the 10000 times less powerful Revolution, Xenon or PS3.
 
"IBM, SONY AND SCEI POWER-ON CELL PROCESSOR-BASED WORKSTATION"

That is just what I was wondering about a few pages back.

IBM, Sony Corporation (Sony) and Sony Computer Entertainment Inc. (SCEI) announced today that they have powered-on the first Cell* processor-based workstation.

If it was just powered on, then how were workstations already in developers' hands? More prototypes, possibly? Or perhaps they powered it on before the announcement. But if they did, I'm sure they would provide a date for that (you know, if it was 5 months ago I'm sure they would say that). I really don't know what to think.
 
..or it's the first Cell Processor based workstation IBM and Sony powered on, the second, third and fourth may have been powered-on earlier and fifth, sixth, seventh... have been also powered-on already by developers.

It's just that this first Cell Processor based Workstation had for some reason (did they accidentally forget it in some corner while powering-on those other workstations?) been left without powering-on, and they are doing it now.
 
Your image link didn't work.

Yeah I was thinking if they did power it on before this day they would say that in the PR.
 
It wasn't supposed to be an image link, but italics.
You really didn't think I'd found a pic of the first Cell processor based workstation? I'd really like to see one, links anyone?
 
Vince said:
Pana said:
The SPU (in CELL related IBM literature and patent portfolio) would be the Synergistic Processing Unit, another name for the APU we have seen in the CELL patents which is as patents say a versatile "stream processor with 4-way SIMD engine"

I've heard that a Synergistic PU is an APU 'core' with a local flow-controller and some sort of cache. Something along the lines of what I stated here. We'll see if it turns out to be true.

Regarding what you have heard...

specifically here:

http://appft1.uspto.gov/netacgi/nph...N/"International+Business+Machines"+AND+Kahle

and here:

http://appft1.uspto.gov/netacgi/nph...N/"International+Business+Machines"+AND+Kahle

We can distinguish from APU/SPU and SPC.

10. A multiprocessor computer system comprising: one or more processors, each processor having a local store; one or more memory flow controllers (MFCs) each included in each processor, a first MFC having a load access pattern leading to a prediction of at least one potential load of data; a system memory; and a cache coupled between at least one processor and the system memory, wherein, in response to the prediction, the data is prefetched from the system memory to the cache before the first DMAC requests the data.

11. The multiprocessor computer system of claim 10, wherein at least one of the processors is a synergistic processor complex (SPC).

12. The multiprocessor computer system of claim 11, wherein the synergistic processor complex (SPC) includes a synergistic processor unit (SPU).
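To make the prefetch idea in claim 10 a bit more concrete, here is a rough Python sketch. It assumes the "load access pattern" is a simple repeated stride, which the claim text does not actually specify; `StridePredictor` and `simulate` are made-up names for illustration and not anything from the patent or from IBM.

```python
from typing import Iterable, Optional, Tuple


class StridePredictor:
    """Toy predictor (assumption): guesses the next load from a repeated stride."""

    def __init__(self) -> None:
        self.last_addr: Optional[int] = None
        self.last_stride: Optional[int] = None

    def observe(self, addr: int) -> Optional[int]:
        """Record one load address; return a predicted next address, or None."""
        prediction = None
        if self.last_addr is not None:
            stride = addr - self.last_addr
            # Predict only when the same non-zero stride shows up twice in a row.
            if stride != 0 and stride == self.last_stride:
                prediction = addr + stride
            self.last_stride = stride
        self.last_addr = addr
        return prediction


def simulate(addresses: Iterable[int]) -> Tuple[int, int]:
    """Return (hits, misses) for loads going through a cache with prefetch."""
    predictor = StridePredictor()
    cache = set()  # addresses currently held in the shared cache
    hits = misses = 0
    for addr in addresses:
        if addr in cache:
            hits += 1
        else:
            misses += 1
            cache.add(addr)  # demand fetch from system memory
        predicted = predictor.observe(addr)
        if predicted is not None:
            cache.add(predicted)  # prefetched before the load is requested
    return hits, misses


if __name__ == "__main__":
    # A regular 128-byte stride: the predictor locks on after three loads.
    print(simulate([0, 128, 256, 384, 512]))
```

With a regular stride the later loads are already in the cache when they are requested; with an irregular pattern the predictor never fires and every load goes to system memory.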

Basically now each APU has a DMA engine in addition to the Local Storage and there is a shared L1 cache for all APUs: seems like a good trade-off having that instead of e-DRAM.

I imagine that the MFC is not as large and complex as the single DMAC for each PE that was shared by all PE's APUs even though without e-DRAM on the CPU we certainly have space to be used by more DMA logic and caches.

I do like the idea of having an MFC per APU and having an L1 cache shared between the PE's APUs :).

To tell you the truth I think this e-DRAM-less version of CELL (e-DRAM on the GPU only) might even be better over-all.

We have very fast external bandwidth (thanks to XDR), we have fast Local Storage for each APU, and we have an L1 cache shared between the APUs of the same PE (while also having L1+L2 caches for the PU). This should improve performance when the APUs need to perform random accesses to main RAM rather than stream large quantities of data, since it would reduce latency for those accesses, while also allowing APUs to share data at high speeds.
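The latency argument can be sketched with the usual weighted-average access time formula. The function name and every number below are placeholders chosen for illustration only; none of them are real Cell, XDR or cache figures.

```python
def average_access_time(hit_rate: float, hit_latency: float, miss_latency: float) -> float:
    """Average latency per access for a cache sitting in front of main RAM.

    hit_rate is the fraction of accesses served by the cache (0.0 to 1.0);
    the latencies can be in any unit as long as both use the same one.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0.0 and 1.0")
    return hit_rate * hit_latency + (1.0 - hit_rate) * miss_latency


if __name__ == "__main__":
    # Hypothetical latencies in cycles, NOT measured Cell numbers.
    no_cache = average_access_time(0.0, hit_latency=10, miss_latency=100)
    with_cache = average_access_time(0.5, hit_latency=10, miss_latency=100)
    print(no_cache, with_cache)
```

The point is only qualitative: the more of the random accesses the shared cache can serve, the closer the average gets to the cache latency instead of the main RAM latency.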
 