Cell details (Nikkei Electronics)

nAo · Feb 26, 2005

version said:
If you read a vertex stream (16-bit x, y, z, normal X, Y, Z, uv1, uv2 = 128 bits, 16 bytes)
and rotate and project the vertices, that is about 10 cycles on an SPE.
Then 1 SPE loads 6.4 GB/s of data from memory, and 4 SPEs kill the XDR bandwidth.
What are the other 4 SPEs and the PPE doing?

Decent vertex shaders will take much more than 10 cycles. Moreover, vertices are not unique: most vertices are going to be reused.
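
The numbers in this exchange can be checked with a rough sketch. It assumes a 4 GHz SPE clock and a 25.6 GB/s peak XDR bandwidth; neither figure is stated in the posts, but both are consistent with the quoted 6.4 GB/s per SPE and "4 SPEs kill the XDR bandwidth". The function names are made up for illustration.

```python
# Back-of-the-envelope check of the streaming figures quoted above.
# ASSUMPTIONS (not stated in the thread): a 4 GHz SPE clock and a
# 25.6 GB/s peak XDR bandwidth. All names here are illustrative.

BYTES_PER_VERTEX = 16            # 8 components x 16 bits = 128 bits
CLOCK_HZ = 4.0e9                 # assumed SPE clock
XDR_PEAK_BYTES_PER_S = 25.6e9    # assumed peak memory bandwidth


def spe_stream_bandwidth(cycles_per_vertex):
    """Bytes per second one SPE pulls from memory when it fetches one
    fresh vertex every `cycles_per_vertex` cycles."""
    vertices_per_s = CLOCK_HZ / cycles_per_vertex
    return vertices_per_s * BYTES_PER_VERTEX


def spes_to_saturate(cycles_per_vertex):
    """Number of SPEs whose combined demand equals the assumed XDR peak."""
    return XDR_PEAK_BYTES_PER_S / spe_stream_bandwidth(cycles_per_vertex)


if __name__ == "__main__":
    # 10 cycles is the figure from the post; 40 and 100 are arbitrary
    # stand-ins for a heavier vertex shader.
    for cycles in (10, 40, 100):
        gb_s = spe_stream_bandwidth(cycles) / 1e9
        print(f"{cycles:>3} cycles/vertex: {gb_s:.2f} GB/s per SPE, "
              f"{spes_to_saturate(cycles):.1f} SPEs to saturate XDR")
```

Under these assumptions the memory demand per SPE falls in proportion to the cycles spent per vertex, so a longer shader moves the bottleneck from bandwidth to compute.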

version · Feb 26, 2005

nAo said:
version said:

If you read a vertex stream (16-bit x, y, z, normal X, Y, Z, uv1, uv2 = 128 bits, 16 bytes)
and rotate and project the vertices, that is about 10 cycles on an SPE.
Then 1 SPE loads 6.4 GB/s of data from memory, and 4 SPEs kill the XDR bandwidth.
What are the other 4 SPEs and the PPE doing?

Decent vertex shaders will take much more than 10 cycles. Moreover, vertices are not unique: most vertices are going to be reused.

25 GB/s of peak bandwidth will be 15-20 GB/s of real bandwidth (bottlenecks, latency, etc.).
If 8 SPEs have 25 GB/s of memory bandwidth, it totally sucks; Xbox 2 will be 10 times faster.

nAo · Feb 26, 2005

version said:
If 8 SPEs have 25 GB/s of memory bandwidth, it totally sucks; Xbox 2 will be 10 times faster.

Xbox will have about the same bandwidth, shared with the GPU (plus eDRAM bandwidth), IIRC. (And maybe even the CELL CPU will share that bandwidth with the GPU.)
Even if it would be nice to have more bandwidth, you have to understand that with just 256/512 MB of main RAM you would not have the space to store enough unique content to be transferred in a single frame from main memory! We are going to reuse a lot of stuff (instancing) or generate a lot of content (procedural geometry generation). That's the way we'll use all the FP power. Consoles never had the memory to store huge datasets per frame, and never will.
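
A rough sketch of that argument: compare how many bytes the bus can move in one frame with how much RAM there is to hold unique content. The 25.6 GB/s peak, the 60 and 30 fps targets, and reading "MB" as 2**20 bytes are assumptions for illustration, and the function names are made up.

```python
# How much data can cross the memory bus in one frame, compared with
# how much RAM there is to hold unique content.
# ASSUMPTIONS: 25.6 GB/s peak bandwidth, 60/30 fps targets, 1 MB = 2**20 bytes.

PEAK_BYTES_PER_S = 25.6e9   # assumed peak memory bandwidth
MIB = 2 ** 20


def bytes_per_frame(bandwidth_bytes_per_s, fps):
    """Upper bound on bytes transferred from main memory in one frame."""
    return bandwidth_bytes_per_s / fps


def ram_passes_per_frame(ram_mib, bandwidth_bytes_per_s, fps):
    """How many times the whole of RAM could be streamed in one frame."""
    return bytes_per_frame(bandwidth_bytes_per_s, fps) / (ram_mib * MIB)


if __name__ == "__main__":
    for fps in (60, 30):
        per_frame_mib = bytes_per_frame(PEAK_BYTES_PER_S, fps) / MIB
        print(f"{fps} fps: about {per_frame_mib:.0f} MiB per frame at peak")
        for ram in (256, 512):
            passes = ram_passes_per_frame(ram, PEAK_BYTES_PER_S, fps)
            print(f"  {ram} MiB RAM: {passes:.2f} full passes per frame")
```

When the per-frame transfer budget is of the same order as total RAM, extra bandwidth can only be spent re-reading data that is already resident, which is the case for instancing and procedural generation.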

j^aws · Feb 26, 2005

Noooooooo...think Onions!

Each SPE is a small onion (small pipeline) with concentric rings. Eight small onions (8 SPEs) work independently...and...CELL is one big onion with concentric rings (big pipeline encompassing 8 SPEs)...these bigger concentric rings (large pipeline) link the small onions together (small pipelines)...

...I know...I've gone mad....

version · Feb 26, 2005

nAo said:
version said:

If 8 SPEs have 25 GB/s of memory bandwidth, it totally sucks; Xbox 2 will be 10 times faster.

Xbox will have about the same bandwidth, shared with the GPU (plus eDRAM bandwidth), IIRC. (And maybe even the CELL CPU will share that bandwidth with the GPU.)
Even if it would be nice to have more bandwidth, you have to understand that with just 256/512 MB of main RAM you would not have the space to store enough unique content to be transferred in a single frame from main memory! We are going to reuse a lot of stuff (instancing) or generate a lot of content (procedural geometry generation). That's the way we'll use all the FP power. Consoles never had the memory to store huge datasets per frame, and never will.

I mean, X2 uses UMA with 75 GB/s of bandwidth and 756 MB of RAM.

nAo · Feb 26, 2005

version said:
I mean, X2 uses UMA with 75 GB/s of bandwidth and 756 MB of RAM.

Yeah..and my real name is Rocco Siffredi

London Geezer · Feb 26, 2005

nAo said:
version said:

I mean, X2 uses UMA with 75 GB/s of bandwidth and 756 MB of RAM.

Yeah..and my real name is Rocco Siffredi

IS IT?!!

nAo · Feb 26, 2005

london-boy said:
IS IT?!!

LOL 8)

Panajev2001a · Feb 26, 2005

nAo said:
london-boy said:

IS IT?!!

LOL 8)

Hey hey... London-boy, stop drooling.

Npl · Feb 26, 2005

Jaws said:
Noooooooo...think Onions!

Each SPE is a small onion (small pipeline) with concentric rings. Eight small onions (8 SPEs) work independently...and...CELL is one big onion with concentric rings (big pipeline encompassing 8 SPEs)...these bigger concentric rings (large pipeline) link the small onions together (small pipelines)...

...I know...I've gone mad....

Onions... concentric rings... Damn it, now I have the desire to order a pizza, and at least a 1:8 configuration.

Inane_Dork · Feb 26, 2005

Jaws said:
Noooooooo...think Onions!

Each SPE is a small onion (small pipeline) with concentric rings. Eight small onions (8 SPEs) work independently...and...CELL is one big onion with concentric rings (big pipeline encompassing 8 SPEs)...these bigger concentric rings (large pipeline) link the small onions together (small pipelines)...

...I know...I've gone mad....

Ogres are like onions, too.[/Shrek]

Anyway, stream processing is a great solution if you can break your problem into 8*N consecutive stages of similar processing time. That is, if you're wanting to tap all your SPEs at the same time. You could run multiple streams if not. I don't think this is going to happen for most anything outside of Naughty Dog's and Polyphony's games, though.
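
Why "similar processing time" matters can be sketched with a toy steady-state model of a chain of stages, one stage per SPE. The model ignores hand-off cost and assumes each stage works on one item at a time; the stage times are invented numbers and the function names are hypothetical.

```python
# Steady-state model of a chain of stages, one stage per SPE.
# ASSUMPTIONS: each stage handles one item at a time, hand-off cost is
# ignored, and the per-item stage times below are made-up numbers.


def pipeline_throughput(stage_times):
    """Items per unit time: the slowest stage sets the pace."""
    return 1.0 / max(stage_times)


def pipeline_utilization(stage_times):
    """Average fraction of time the stages are busy in steady state."""
    slowest = max(stage_times)
    return sum(stage_times) / (len(stage_times) * slowest)


if __name__ == "__main__":
    balanced = [1.0] * 8            # eight stages of equal cost
    lopsided = [1.0] * 7 + [4.0]    # one stage four times slower
    for name, times in (("balanced", balanced), ("lopsided", lopsided)):
        print(f"{name}: throughput {pipeline_throughput(times):.2f}, "
              f"utilization {pipeline_utilization(times):.0%}")
```

In this model a single slow stage stalls every other SPE in the chain, which is why running several shorter streams side by side can be the better fit when the work does not split evenly.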

cthellis42 · Feb 26, 2005

I hope the PS3 is like a parfait. Everybody loves parfait! A parfait may be the most delicious console in the world.

vliw · Feb 26, 2005

london-boy said:
nAo said:

version said:

I mean, X2 uses UMA with 75 GB/s of bandwidth and 756 MB of RAM.

Yeah..and my real name is Rocco Siffredi

IS IT?!!

The most famous Italian porn star

ha..ha..ha..ha.......

PC-Engine · Feb 27, 2005

Npl said:
Jaws said:

Noooooooo...think Onions!

Each SPE is a small onion (small pipeline) with concentric rings. Eight small onions (8 SPEs) work independently...and...CELL is one big onion with concentric rings (big pipeline encompassing 8 SPEs)...these bigger concentric rings (large pipeline) link the small onions together (small pipelines)...

...I know...I've gone mad....

Onions... concentric rings... Damn it, now I have the desire to order a pizza, and at least a 1:8 configuration.

Are you sure it wasn't because of nAo's Italian sausage?

j^aws · Feb 27, 2005

Inane_Dork said:
Jaws said:

Noooooooo...think Onions!

Each SPE is a small onion (small pipeline) with concentric rings. Eight small onions (8 SPEs) work independently...and...CELL is one big onion with concentric rings (big pipeline encompassing 8 SPEs)...these bigger concentric rings (large pipeline) link the small onions together (small pipelines)...

...I know...I've gone mad....

Ogres are like onions, too.[/Shrek]

Anyway, stream processing is a great solution if you can break your problem into 8*N consecutive stages of similar processing time. That is, if you're wanting to tap all your SPEs at the same time. You could run multiple streams if not. I don't think this is going to happen for most anything outside of Naughty Dog's and Polyphony's games, though.

Well, the ability to create multiple dynamic streaming pipelines is one of CELL's core abilities, so I'm hoping STI have had the foresight to create appropriate tools for it, so that more than Naughty Dog, Polyphony et al. can take advantage of it and make it sing.

Mikage · Feb 28, 2005

More info from Nikkei Electronics:

FlexIO has a function to enable NUMA (non-uniform memory access).
An outside chip (another Cell or a GPU?) can access main memory
(which is connected directly to Cell) via FlexIO.

Fafalada · Feb 28, 2005

PCEngine said:
Are you sure it wasn't because of nAo's Italian sausage?

Are you sure it's safe to mention sausages in the same thread London Boy is posting in?

version said:
If you read a vertex stream (16-bit x, y, z, normal X, Y, Z, uv1, uv2 = 128 bits, 16 bytes)

How about I read a 7:1 compressed multiresolution mesh on one SPU and transmit 7x the data to all the other SPUs over their internal bus, using only 1 GB/s of main memory bandwidth?
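
The arithmetic of that scheme, as a minimal sketch. It assumes 8 SPUs in total, so one decompressor feeds 7 consumers, and that the expanded data is split evenly between them; the post does not say how the data is divided, and the function name is hypothetical.

```python
# One SPU reads compressed data from main memory, expands it, and
# forwards the result to the other SPUs over the on-chip bus.
# ASSUMPTIONS: 8 SPUs in total (1 decompressor + 7 consumers) and an
# even split of the expanded data between the consumers.


def decompress_fanout(main_mem_gb_s, compression_ratio, consumer_spus):
    """Return (on_chip_total_gb_s, per_consumer_gb_s)."""
    expanded = main_mem_gb_s * compression_ratio
    return expanded, expanded / consumer_spus


if __name__ == "__main__":
    total, each = decompress_fanout(main_mem_gb_s=1.0,
                                    compression_ratio=7,
                                    consumer_spus=7)
    print(f"main memory: 1.0 GB/s, on-chip: {total:.1f} GB/s total, "
          f"{each:.1f} GB/s per consumer SPU")
```

The point of the sketch is that the expansion happens after the main memory bus, so the on-chip traffic grows by the compression ratio while the external traffic stays at the compressed rate.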
