Cell specifications released

Silkworm · Aug 25, 2005

Quite a nice surprise. Christmas in August?

http://www-128.ibm.com/developerworks/power/cell/
(free registration required to get at the pdfs)

_leech_ · Aug 25, 2005

Very nice :up:

Paul · Aug 25, 2005

I knew it the whole damn time

seismologist · Aug 25, 2005

Because the latency and instruction overhead associated with DMA transfers exceeds that of the latency of servicing a cache miss, this approach achieves an advantage only if the DMA transfer size is sufficiently large and is sufficiently predictable (that is, DMA can be issued before data is needed).

Could someone elaborate on the implications of this? Would this be practical for game programming?

rabidrabbit · Aug 25, 2005

There's quite a bit of information there, of which I don't think I really understand 1%, but I will read it anyway, 'cos there could be worse ways to spend a day at work.

ChryZ · Aug 25, 2005

Same documents, no registration:

http://cell.scei.co.jp/e_download.html

Xenus · Aug 25, 2005

Is there anyone who can actually understand this? Maybe a translation into layman's terms from faf, deano, or nao? It might take a while; who has the time to read 319 pages anyway, and that's just the first PDF!

version · Aug 25, 2005

The SPE's SUMB instruction is a monster: 24 ops/cycle

pcostabel · Aug 25, 2005

seismologist said:
Could someone elaborate on the implications of this? Would this be practical for game programming?

Of course. You typically use a double-buffer approach, e.g. if you have to transform a bunch of vertices, you DMA data into one buffer while processing the other. That's the way you do it on PS2 VU1.

creon100 · Aug 25, 2005

version said:
The SPE's SUMB instruction is a monster: 24 ops/cycle

I see why you're saying that, since the instruction requires 24 summation operations, but is that actually what it means? I'm not sure what their notation means in the ISA document. I mean, the instruction operands table for sumb shows 8 3-summation operations. Are we supposed to assume it does all of that in one clock cycle, or are we to assume each row in that table takes a clock cycle, so the entire sumb instruction takes 8 clock cycles? I haven't read any of the documentation, so forgive me if I'm asking something completely ridiculous that is cleared up somewhere earlier in the docs.

MfA · Aug 25, 2005

The description of the microarchitecture seems to be completely missing, let alone latency/throughput information ... very incomplete.

Silkworm · Aug 26, 2005

MfA said:
The description of the microarchitecture seems to be completely missing, let alone latency/throughput information ... very incomplete.

You're right. However, IBM does explicitly mention in the documents that the specifications are for the overall "Cell Broadband Engine" architecture and are not implementation-specific. This looks to be consistent with their documentation structure for the PowerPC architecture, where they had three volumes defining the overall architecture, and then implementation-specific "user's manuals" for individual parts like the PPC 750 and 970. So there should be a "PS3 Cell" user's manual somewhere down the line. Something else to look forward to, I guess.

Barbarian · Aug 26, 2005

Throughput is 1 cycle for everything except double precision.
Latencies: float ops are 6 cycles, conversions to/from float 7, complex integer (madds etc.) 7, simple integer and logic 2, shift and shuffle 4, load/store 6, branch mispredict 18; doubles are 13 cycles, non-pipelined for the first 6.
Floats and integers go in pipe 0; load/store, branches and shuffles go in pipe 1.
All in all, the pipes seem quite well balanced. Select bits in 2 cycles is quite nice. Every branch that can be converted to a conditional move should be.
A branch hint must be issued quite a lot of cycles in advance (13, I believe), even for unconditional jumps. It is possible to stall until the hint arrives, rather than filling with tons of nops.

one · Aug 28, 2005

Some interesting discussion going on there

http://www-128.ibm.com/developerwor...reeDisplayType=threadmode1&forum=739#13750566

Panajev2001a · Aug 29, 2005

one said:
Some interesting discussion going on there
http://www-128.ibm.com/developerwor...reeDisplayType=threadmode1&forum=739#13750566

Let's hope it stays. I do not see the need for regular GAF/B3D discussion being brought to a forum where STI engineers are taking time to read the posts and answer questions.

Shifty Geezer · Aug 29, 2005

Interesting how the calibre of posting keeps away the troublemakers. That was what was intended for here. The brainiacs discuss and the looker-ins think 'blimey, they're all smart, I feel intimidated and will post in GAF instead' and then go off and post 'Revolution IS da SUXXORZ!!11!'

Not that this is at all on topic for the thread.

loekf2 · Aug 29, 2005

Panajev2001a said:
Let's hope it stays. I do not see the need for regular GAF/B3D discussion being brought to a forum where STI engineers are taking time to read the posts and answer questions.

Interesting... Is H_Peter_Hofstee in that IBM forum really who he is supposed to be?

Peter Hofstee = Dutch = one of the founders of "Cell" ...

Panajev2001a · Aug 29, 2005

I do think so.

I hope these people stick around and keep posting.

speng · Aug 30, 2005

Another article:

http://www-128.ibm.com/developerworks/power/library/pa-cbea.html

Speng.
