CELL configuration revisited....

I'm not sure how useful this is, but we can just take it as added info about future consumer CPUs, or something...

AMD's desired roadmap (attached image: roadmap11_06_03.gif)
 
my 4 TFLOPS Cell CPU for PS3 (4 dies) should sustain 2 TFLOPS in real-world use.


my 2 TFLOPS Cell GPU (4 dies) should sustain 1 TFLOP in real-world use.

that's 3 TFLOPS sustained. I'll bet the PS2 doesn't sustain 3 GFLOPS
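The back-of-the-envelope math above is just a peak rating times an assumed efficiency. A minimal sketch; the figures and the 50% sustained fraction are the post's own speculation, not any spec:

```python
# Peak vs. sustained throughput. All numbers here are the speculative
# ones from the post above; 50% efficiency is an assumption, not a spec.
def sustained_tflops(peak_tflops, efficiency):
    """Sustained throughput = peak rating x efficiency fraction."""
    return peak_tflops * efficiency

cpu = sustained_tflops(4.0, 0.5)   # hypothetical 4 TFLOPS Cell CPU -> 2.0
gpu = sustained_tflops(2.0, 0.5)   # hypothetical 2 TFLOPS Cell GPU -> 1.0
print(cpu + gpu)                   # 3.0 TFLOPS sustained, as claimed
```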


woohooo


:oops: :rolleyes:
 
Re: ...

DeadmeatGA said:
It makes me wonder how CELL will compare to Xbox2 CPU, which appears to be going with a Power5+ style dual-core device.

o_O HOW much would MS be willing to lose again? (Or are we planning on a 2007+ launch? ;) ) I'm figuring a scaled-back (but well-modified for a self-contained box) PowerPC 970 at the GREATEST, simply because I don't think they're willing to take immense losses on their hardware, and they're likely to want to put the most effort on the GPU. (And I rather imagine they have more and higher-quality innards to think about for added capabilities. If they ARE going to fight for the living room this way... There's the possibility they'd focus the Xbox2 for maximum gaming and push harder instead from the Media Center PC direction, but...? <shrugs> I don't think they want to leave Sony alone there.)

If they are trying to reach a late-2005, early-2006 launch, Power5 could not remotely appear on a consumer device.
 
Re: ...

cthellis42 said:
If they are trying to reach a late-2005, early-2006 launch, Power5 could not remotely appear on a consumer device.

JFYI - AFAIK Power5 is a 130nm device still slated for release in 2H2004. It's pretty analogous to a CMT Power4 with SMT, Fast Path, and microarchitectural tweaks, and is projected to hit 3GHz by 2005. Power5+ is a 90nm revision and should go north of there.

What I've heard of Power6 is kinda cool. It's the first post-Cell IBM architecture on the 65nm process, and I've read it's supposed to show "very large frequency improvements," start at 8GHz (?!), and launch in late 2006.

Power5 is supposed to be large (area-wise), but I don't see why a PPC-like derivative is out of the question. Could be wrong, though.
 
UIUC has demonstrated a 509GHz transistor

...and Intel and AMD have demonstrated THz transistors; however, put hundreds of millions of them on a die and they won't be running at that speed ;)


The G5 supercomputer they're building for Virginia Tech has a theoretical peak performance of 17.6 TFLOPS, and they hit 9.56 TFLOPS, or about 54%. That's impressive for off-the-shelf chips that were never designed to be clustered for this kind of supercomputing.

Were they running Linpack to get that percentage?

AFAIK the ES's efficiency drops to 65% with real-world apps. That 86% figure is from the Linpack benchmark. I'm guessing that G5 supercomputer's figure is from Linpack also, so running real apps would further drop it to 20-30% efficiency.
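The efficiency being argued about here is just sustained throughput over theoretical peak. A minimal sketch using the numbers quoted in this thread; treat them as illustrative, not authoritative:

```python
# Efficiency = sustained throughput (e.g. Linpack Rmax) / theoretical
# peak (Rpeak). Numbers are the ones quoted in-thread, purely as examples.
def efficiency(sustained_tflops, peak_tflops):
    return sustained_tflops / peak_tflops

vt_g5 = efficiency(9.56, 17.6)   # Virginia Tech G5 cluster: ~0.54
print(f"{vt_g5:.0%}")            # ~54% of theoretical peak
```

As the thread notes, a Linpack-derived figure is a best case; less regular workloads land well below it.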
 
Re: ...

Vince said:
JFYI - AFAIK Power5 is a 130nm device still slated for release in 2H2004. It's pretty analogous to a CMT Power4 with SMT, Fast Path, and microarchitectural tweaks, and is projected to hit 3GHz by 2005. Power5+ is a 90nm revision and should go north of there.

Hrm... maybe I'm mixing things up? (It's late enough. Hehe...) As I recall, the 970 was built keeping a lot of Power4 in mind, and I was under the impression that Power5 is a substantial step forward. (Not to mention industrial-grade.) The Power4 and even the 970 are not cheap chips right now (does IBM actually have price lists for any of their chips like Intel and AMD do, or do they get to gloss over that since they offer system solutions or sell directly to OEMs?), so imagine a chip a step above, let alone an enlarged/dual-core version, on a console trying to sell in the $300 range if they're going to start bulk production at the beginning of 2005...? o_O

Not to mention a console where they'll be spending the most on the GPU and have LOTS of bases to cover? Nuh-uh. (Unless someone used a mind-control device on Bill Gates and forced him to take hundreds of dollars in losses per unit just to deliver the best gaming experience to us... in which case, GOOD JOB! ;) )
 
PC-Engine said:
UIUC has demonstrated a 509Ghz transistor

...and Intel and AMD have demonstrated THz transistors; however, put hundreds of millions of them on a die and they won't be running at that speed ;)

Um, the UIUC transistor is the current record-holder. No, Intel and AMD do not have THz transistors. Intel is trying, as you can see here. They have designs and plans, but they have not demonstrated one. Your statement is false.

More info on UIUC: http://www.news.uiuc.edu/scitips/03/1106feng.html

I never said that they could make a chip out of these transistors; I was using this to show that we are far from the constraints of the laws of physics.

Were they running Linpack to get that percentage?

Probably.

AFAIK the ES's efficiency drops to 65% with real-world apps. That 86% figure is from the Linpack benchmark. I'm guessing that G5 supercomputer's figure is from Linpack also, so running real apps would further drop it to 20-30% efficiency.

Link please? Or did you make this up too?
 
PC-Engine said:
AFAIK the ES's efficiency drops to 65% with real-world apps. That 86% figure is from the Linpack benchmark. I'm guessing that G5 supercomputer's figure is from Linpack also, so running real apps would further drop it to 20-30% efficiency.

Er, perhaps it would be a good idea not to invoke such numbers in the console discussion at all. They are derived from applications that absolutely do not relate to console games, on machines whose topologies and implementations do not remotely resemble consoles, present or future.
 
Re: ...

randycat99 said:
You just got done saying it is only a design feature. If the design hasn't changed, then you would still have a 1.6 GHz P4 today. Fortunately, we are all aware that process has changed over the lifespan of P4, hence we have observed a scaling from 1.6 to 3.0+. It works like that for all processors, in general (but not necessarily the same degree of scaling, of course).

Why 1.6GHz? The original Willamette reached 2.0GHz retail and overclockers got it higher than that, even.
 
Heya, I heard that such massively parallel stuff has very low efficiency? Wonder if Cell will be different, and not like PS2: all nice on paper but struggling with cranky and bottlenecked innards again.
 
Heya, I heard that such massively parallel stuff has very low efficiency? Wonder if Cell will be different, and not like PS2: all nice on paper but struggling with cranky and bottlenecked innards again.

Two words: Embedded DRAM.
 
Nah, parallel architectures often do constrained tasks and are actually rather efficient. On more general (read: OS/desktop) stuff, efficiency drops as expected.


don't be like PS2, all nice on paper but struggling with cranky and bottlenecked innards again.

Er, if you mean PR, sure, that would be a bad idea; if you mean the released paper figures, they are actually correct for the most part.

Why 1.6GHz? The original Willamette reached 2.0GHz retail and overclockers got it higher than that, even.

These aren't the same guys who put liquid-circulation heatsinks in their rigs, are they? :oops:
 
Isn't embedded RAM expensive, large, and hot? That's why PS2 only has a measly 4MB and GC 3MB! :oops: Of course, Nintendo did things a li'l more elegantly.
 
chaphack said:
Isn't embedded RAM expensive, large, and hot? That's why PS2 only has a measly 4MB and GC 3MB! :oops: Of course, Nintendo did things a li'l more elegantly.

It's expensive; not too clear about the heat, though (since you could always run it at a lower clock than the core).

Oh, and what do you mean big N does eDRAM more elegantly? Looks like a very similar implementation to me (segmentation aside).
 
i dunno.

Didn't Nintendo feed the eDRAM with larger pipes or something, and add texture compression... somewhat a real cache, while PS2's is more like very fast VRAM rather than cache... hence GC games, even though using 2MB (or was it 1MB, the rest buffer) for texture streaming, look smoother... or whatever... something.. :LOL:
 
chaphack said:
i dunno.

Didn't Nintendo feed the eDRAM with larger pipes or something, and add texture compression... somewhat a real cache, while PS2's is more like very fast VRAM rather than cache... hence GC games, even though using 2MB (or was it 1MB, the rest buffer) for texture streaming, look smoother... or whatever... something.. :LOL:

No, you are mistaken.

Bandwidth is quite comparable between the two (they're really both just very fast VRAM). The difference is that big N implements some neat memory management for the framebuffer and texture fetching (Flipper can request from system RAM, the GS can't; in effect it doesn't 'stream' anything. Think Xbox UMA). They are rather different approaches.

But this is really internal memory management, though.

Edit: Actually there's a lot more to it than that, but there are plenty of archived threads on these boards covering Flipper's cache system, and we could always make a separate thread for it. So let's just leave this one here, shall we?
 
Isn't embedded RAM expensive, large, and hot? That's why PS2 only has a measly 4MB and GC 3MB! Of course, Nintendo did things a li'l more elegantly.

It is expensive.

Toshiba's and SCE's DRAM cells are not large.

http://www.sony.net/SonyInfo/News/Press/200212/02-1203E/

In fact, some of the world's smallest. God knows what they have today.

Tokyo - Toshiba Corp. has developed a new cell structure for embedded DRAM on silicon-on-insulator wafers that takes advantage of SOI's specific characteristics. The cell will be an essential technology for the company's system-on-chip designs and will make it possible to integrate larger DRAM cells with the Cell processor, a joint development project of IBM, Sony and Toshiba targeting teraflops performance.

http://www.eetimes.com/issue/mn/OEG20030623S0025

Using SOI, heat won't be a huge issue.
 
How much smaller are they? Any data/benchmarks/setup/whatever on how they compare to the next-best alternative? I mean, PR is, well, PR?
 
chaphack said:
How much smaller are they? Any data/benchmarks/setup/whatever on how they compare to the next-best alternative? I mean, PR is, well, PR?

Eh, you're probably gonna have to read up on technical journals covering fab processes to find that kind of thing.

Otherwise, I'd recommend not using benchmarks unless you know how they were run and what they represent.
 