Deadmeat's past CELL = BlueGene/L posts revisited.

Status
Not open for further replies.
...

BlueGene and Cell are different projects with their own R&D money, different performance numbers.
Believe what you must, but don't be disappointed when Kutaragi Ken unveils CELL next year.

It is also the technology that will be the foundation of the next generation of gaming consoles from Nintendo
So we have a disagreement on what "it" refers to. This is one of the things I hate about the English language: pronouns are never certain about what they point to. But let us take a moment and think about the context of the article. The article was titled "A supercomputer built on a gaming chip." The Blue Gene/L ASIC is also a technology, and it differs from your standard PPC in many ways, most notably the built-in chip-to-chip networking.

The problem I see with lots of hardware-oriented people is that they focus too much on hardware; PPC is just a processor and makes up a small part of the overall Blue Gene technology. Blue Gene is really about the distributed OS and processes and chip-to-chip networking. Hell, even the Blue Gene Cyclops, which does not even use PPC, is still called a Blue Gene. Why? Because it shares the same systems architecture.

The result will be consumer devices that are more powerful than IBM's Deep Blue supercomputer, operate at low power and access the broadband Internet at ultra high speeds. Cell will be designed to deliver "teraflops" of processing power.
It is pretty much obvious that CELL will not deliver a teraflop per chip; it is a physical impossibility.

Under the agreement, SCEI, IBM and Toshiba will each manufacture the product for a variety of consumer applications.
And what consumer level product does IBM build??? IBM has zero use for CELL. What's in it for IBM other than the license fee?

The result will be consumer devices that are more powerful than IBM's Deep Blue supercomputer, operate at low power and access the broadband Internet at ultra high speeds. Cell will be designed to deliver "teraflops" of processing power.
Yep. And Blue Gene was designed for petaflops of processing power.
 
...

Anyhow, I now suspect that the performance of PE is really 32 GFLOPS. Why??

1. Kutaragi's little diagram keeps saying that each processor "chip" is worth 32 GFLOPS.
2. Suzuoki never clarified in his patent applications what the 32 GFLOPS rating was for: is it per APU, or for all the FPUs of the APUs combined, in other words, the PE?

32 GFLOPS per APU is clearly illogical, but 32 GFLOPS per PE makes perfect sense. While some Sony fans might balk at the idea of 32 GFLOPS per PE, it is still double the FLOPS of what the current state-of-the-art PPC 970 can deliver per chip and is nothing to laugh at, especially if this level of performance can be delivered cheaply, which is the whole point of the CELL architecture (cheap, low-power, and building performance by using lots of them).

Suppose each PE chip can be fabbed for something like $20 (it can be done if the core is PPC440 + 4 APUs); then Kutaragi can stick the chip everywhere (even convince its archrival Sony Electronics to use some just as an Ethernet adapter replacement to complement Sony Electronics' own home-grown media processor), and use 8 of them in PSX3 to deliver 256 GFLOPS.
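
The arithmetic behind that 256 GFLOPS figure is easy to check. Here is a minimal sketch in Python that takes the 32 GFLOPS per PE chip, the $20 per chip, and the 8 chips per console straight from the post above. The function names are made up for illustration, and none of these inputs are confirmed specs.

```python
# Back-of-the-envelope check of the "8 PE chips in a PSX3" estimate.
# All inputs are the post's own guesses, not confirmed figures.

GFLOPS_PER_PE = 32        # the post's reading of the 32 GFLOPS rating
COST_PER_PE_USD = 20      # the post's guessed fab cost per PE chip
PE_CHIPS_PER_CONSOLE = 8  # the post's guessed chip count


def aggregate_gflops(chips: int, gflops_per_chip: int) -> int:
    """Peak throughput if every chip contributes its full rating."""
    return chips * gflops_per_chip


def silicon_cost_usd(chips: int, cost_per_chip: int) -> int:
    """Fab cost of the PE chips alone (no packaging, PCB, or memory)."""
    return chips * cost_per_chip


if __name__ == "__main__":
    print(aggregate_gflops(PE_CHIPS_PER_CONSOLE, GFLOPS_PER_PE), "GFLOPS")
    print(silicon_cost_usd(PE_CHIPS_PER_CONSOLE, COST_PER_PE_USD), "USD")
```

Note that this only multiplies peak ratings; it says nothing about packaging or board costs, or about how much of that peak would be usable.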
 
Gee, speculation from a Reuters article and that one shouldn't take everything at its face value? My, I am sh0xx0red.

I like the last descriptive line, tho: They will be in the same family, like cousins who shouldn't marry. :LOL:
 
Re: ...

DeadmeatGA said:
Anyhow, I now suspect that the performance of PE is really 32 GFLOPS. Why??

1. Kutaragi's little diagram keeps saying that each processor "chip" is worth 32 GFLOPS.
2. Suzuoki never clarified in his patent applications what the 32 GFLOPS rating was for: is it per APU, or for all the FPUs of the APUs combined, in other words, the PE?

32 GFLOPS per APU is clearly illogical, but 32 GFLOPS per PE makes perfect sense. While some Sony fans might balk at the idea of 32 GFLOPS per PE, it is still double the FLOPS of what the current state-of-the-art PPC 970 can deliver per chip and is nothing to laugh at, especially if this level of performance can be delivered cheaply, which is the whole point of the CELL architecture (cheap, low-power, and building performance by using lots of them).

Suppose each PE chip can be fabbed for something like $20 (it can be done if the core is PPC440 + 4 APUs); then Kutaragi can stick the chip everywhere (even convince its archrival Sony Electronics to use some just as an Ethernet adapter replacement to complement Sony Electronics' own home-grown media processor), and use 8 of them in PSX3 to deliver 256 GFLOPS.

One thing: before you paint your picture of little Ken Kutaragi and of Sony Electronics (which is not the beloved child within Sony; it is the head of SCE who assumed control of Sony's semiconductor division) developing this all-powerful media processor in spite of Kutaragi, I would like to repeat this: Ken Kutaragi is the HEAD of SSNC, which is the consolidated unit of all the semiconductor resources in the whole Sony group.

We will talk about the patent tomorrow.
 
Wasn't Ken Kutaragi a failed yoyoist-turned-plumber who's never done a thing in his life and only got his position at Sony from winning the lottery?
 
Re: ...

DeadmeatGA said:
BlueGene and Cell are different projects with their own R&D money, different performance numbers.
Believe what you must, but don't be disappointed when Kutaragi Ken unveils CELL next year.

My friend, I suggest you choose your words carefully. I could disprove what you said right now if I chose to, although not yet. Cell is an architecture; the Broadband Engine is the OTSS partners' iteration of it. Ken will rock your world... OK, maybe not, I just thought that was funny. ;)

It is pretty much obvious that CELL will not deliver a teraflop per chip; it is a physical impossibility.

This is incorrect. In fact, it's not impossible... it's inevitable at 65nm and lower for dedicated ICs in the high-performance realm.

And what consumer level product does IBM build??? IBM has zero use for CELL. What's in it for IBM other than the license fee?

IBM can scale it [as in the architecture, or perhaps the IC] up for use in its servers. We've already seen presentations like this one showing that concurrent processing for CE is inevitable because of its intrinsic features, like its architectural scaling.
Didn't IBM just comment on using the CE ICs' economies of scale to leverage their use in massively parallel computing clusters? The work done at UIUC is neat in that the trend toward "commodity" ICs permeating new fields is industry-wide. The Linux cluster, 3D chips moving in on high-end visualization tasks... it's happening.
 
Re: ...

DeadmeatGA said:
Anyhow, I now suspect that the performance of PE is really 32 GFLOPS. Why??

1. Kutaragi's little diagram keeps saying that each processor "chip" is worth 32 GFLOPS.
2. Suzuoki never clarified in his patent applications what the 32 GFLOPS rating was for: is it per APU, or for all the FPUs of the APUs combined, in other words, the PE?

Usually, there comes a point where a person learns just enough of a new topic/field that he comes to the realization that he knows nothing. I suggest you think about that before commenting in such a manner again. I've been refraining from entering this debate up to this point (since you started it up, yet again) but I feel the need for speed, or something like that.

So, the Suzuoki patent, if you choose to believe it, does leave some room for interpretation. Yet it's decidedly clear that at the APU level there is 32 GFLOPS as a lower bound, as per the "preferred embodiment". Here's the relevant passage:

Suzuoki Patent (http://appft1.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=1&p=1&f=G&l=50&d=PG01&S1=20020138701&OS=20020138701&RS=20020138701) said:
[0068] FIG. 4 illustrates the structure of an APU. APU 402 includes local memory 406, registers 410, four floating point units 412 and four integer units 414. Again, however, depending upon the processing power required, a greater or lesser number of floating points units 512 and integer units 414 can be employed. In a preferred embodiment, local memory 406 contains 128 kilobytes of storage, and the capacity of registers 410 is 128 × 128 bits. Floating point units 412 preferably operate at a speed of 32 billion floating point operations per second (32 GFLOPS), and integer units 414 preferably operate at a speed of 32 billion operations per second (32 GOPS).

I suppose you can argue in a linguistic manner that they're referring to each FPU (singularly) as outputting 32 GFLOPS, which oddly enough would lead to the 1 GHz clock that it was once said to have, but nobody is debating that side of this issue.
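
How the 32 GFLOPS rating maps onto a clock speed depends on which reading of the passage you take and on how many floating point operations each FPU retires per cycle. The patent passage quoted above gives neither a clock nor a per-cycle figure, so the sketch below treats flops-per-cycle as an explicit assumption (2 would correspond to a multiply-add counted as two operations). The function names are hypothetical.

```python
# Peak FLOPS = number of FPUs * flops per FPU per cycle * clock.
# The FPU count (4) and the 32 GFLOPS rating come from the quoted passage;
# the flops-per-cycle values are assumptions made only for illustration.


def required_clock_ghz(target_gflops: float, fpus: int, flops_per_cycle: int) -> float:
    """Clock needed for `fpus` units together to reach `target_gflops`."""
    return target_gflops / (fpus * flops_per_cycle)


def apu_gflops_if_rating_is_per_fpu(rating_gflops: float, fpus: int) -> float:
    """APU total under the reading that each FPU alone is rated at 32 GFLOPS."""
    return rating_gflops * fpus


if __name__ == "__main__":
    # Reading A: the four FPUs of one APU deliver 32 GFLOPS together.
    print(required_clock_ghz(32, fpus=4, flops_per_cycle=2), "GHz (multiply-add assumed)")
    print(required_clock_ghz(32, fpus=4, flops_per_cycle=1), "GHz (one flop per cycle assumed)")
    # Reading B: each FPU is rated at 32 GFLOPS on its own.
    print(apu_gflops_if_rating_is_per_fpu(32, fpus=4), "GFLOPS per APU")
```

Changing the assumed flops-per-cycle moves the implied clock proportionally, which is why the two readings lead to such different conclusions.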


Now, as for Ken's slide: so little is known about the base architecture that this is really open to debate. For example, does Cell have APUs in its basic form? Why did an SCE employee patent this and not IBM? That lies in direct opposition to your supposition that IBM will be collecting "royalties" from Sony.

My take on this actually allows for the above conditions and remains consistent. I view Cell as a basic architecture designed around the PPC-derived core that can be scaled up and interoperate with other such devices - this is STI's work. I see the Broadband Engine as a Toshiba/Sony (sans IBM) IC that's built on Cell @ 65nm and is APU-heavy.

So, for me, when I see an article on IBM's Cell it means nothing. It could just be the core and a single APU, it could be just the core. It could be an iguana's brain in a vat. We don't know enough, especially knowing that the BE is a discrete entity, to make a judgement call as you've done.

But, going on precedent alone, Suzuoki's patent is roughly in line, both in timing and in what is patented, with what was done in early 1997 for the Emotion Engine.

32 GFLOPS per APU is clearly illogical, but 32 GFLOPS per PE makes perfect sense.

On the contrary, anything but 32 GFLOPS per APU is unreasonable for that time period. A Vector Unit in the Emotion Engine from 2000 (250nm) outputs 2.5-something GFLOPS. We're basically talking of a power of 10 here. Using Intel's clock scaling as a baseline, we'll see the same power-of-10 leap in the same timeframe, especially with IBM's process technology transfer and cooperative R&D with Toshiba & Sony...

While some Sony fans might balk at the idea of 32 GFLOPS per PE, it is still double the FLOPS of what the current state-of-the-art PPC 970 can deliver per chip and is nothing to laugh at, especially if this level of performance can be delivered cheaply, which is the whole point of the CELL architecture (cheap, low-power, and building performance by using lots of them).

  • First off, 32 GFLOPS is pathetic for a pseudo-dedicated 3D construct targeted for 2005. Stop comparing the BE against these superscalars and start looking at other massively concurrent designs, like the stuff nVidia and ATI are designing.
  • You use arbitrary language to describe this stuff, and it's wrong. Either call it cellular computing (in which you'd be correct), or Cell as an STI architecture (which is somewhat correct), or the Broadband Engine, in which you'd be totally incorrect.

    I posted a somewhat decent post on these distinctions here. I know you've read it, as you posted ignorant responses to it - perhaps you should reread it, not with contempt this time.

    Suppose each PE chip can be fabbed for something like $20 (it can be done if the core is PPC440 + 4 APUs); then Kutaragi can stick the chip everywhere (even convince its archrival Sony Electronics to use some just as an Ethernet adapter replacement to complement Sony Electronics' own home-grown media processor), and use 8 of them in PSX3 to deliver 256 GFLOPS.

    8 "ICs" have a really high fixed cost in the packaging, and PCB costs that don't scale with Moore's Law. Hell, they barely scale at all - which is bad.

    This is diametrically opposed to Sony's design choices in the past, which have trended toward integration and SoC designs. The PSP itself is a great example of how they're going to let Moore's Law reduce costs over time by going with a single SoC. Hell, no external RAM, nothing... it's beautiful.

    The BE is much the same in that integration is the future. Microsoft has basically admitted defeat on this front and taken a much better route this time around.
 
Re: ...

Vince said:
IBM can scale it [as in the architecture, or perhaps the IC] up for use in its servers. We've already seen presentations like this one showing that concurrent processing for CE is inevitable because of its intrinsic features, like its architectural scaling.

Thanks for that Dijkstra reference.
The "Hidden vs. exposed parallelism" discussion was good, and really should be read by anyone who is interested in future directions in computing and hasn't been exposed to this material before. It was really lightweight but still had a little technical meat to it for those who have the necessary background.
(Note however that the predominantly business/utility oriented Wintel environment might not fully agree with your claims of inevitability, Vince. Office clerical work is well suited to the present paradigm, and may not necessarily ever go the MP route. In fact, you could say that Wintel of today is defined and optimised for clerical tasks.)
 
...

IBM can scale it [as in the architecture, or perhaps the IC] up for use in its servers.
Servers don't need APUs.

Usually, there comes a point where a person learns just enough of a new topic/field that he comes to the realization that he knows nothing.
A good description of yourself.

Yet it's decidedly clear that at the APU level there is 32 GFLOPS as a lower bound, as per the "preferred embodiment".
It is interesting to observe that no one has been able to come up with a single 4-way vector processor capable of 32 GFLOPS with power consumption low enough (1~2 watts) to put 32 of them on a chip. Altivec doesn't do it, and not even the future NEC SX-7 supercomputer is capable of such a feat.

that they're referring to each FPU (singularly) as outputting 32 GFLOPS, which oddly enough would lead to the 1 GHz clock that it was once said to have
??? Do your math for me. I don't get it.

For example, does Cell have APUs in its basic form?
Absolutely. It is the APU that runs user processes, not PPC.

Why did an SCE employee patent this and not IBM?
Because IBM has no use for APU oriented programming??? IBM runs its user processes on PPC.

Which lies in direct opposition to your supposition that IBM will be collecting "royalties" from Sony.
What IBM can collect royalties on:

1. PPC
2. SOI
3. Systems software.
4. Programming knowhow.
5. Various tools.
6. Blue Gene API.

On the contrary, anything but 32 GFLOPS per APU is unreasonable for that time period.
If 32 GFLOPS were true, then there is no point to Blue Gene/L's existence; why use 65,000 chips to reach 300 TFLOPS in 2006 when 300 CELL chips would do it???
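
The comparison can be put in rough numbers. This is a minimal sketch that assumes 32 APUs per chip (the count used elsewhere in this thread) at the disputed 32 GFLOPS each, and takes the 65,000-chip, 300 TFLOPS Blue Gene/L figures from the post. It compares peak ratings only and ignores precision and sustained throughput; the function names are hypothetical.

```python
import math

# Figures quoted in the thread, used here as-is (none are confirmed specs).
BLUEGENE_TARGET_TFLOPS = 300
BLUEGENE_CHIPS = 65_000
APUS_PER_CHIP = 32     # assumed APU count per Cell chip
GFLOPS_PER_APU = 32    # the disputed per-APU reading of the patent


def chip_gflops(apus: int, gflops_per_apu: int) -> int:
    """Peak rating of one chip if every APU hits its rated figure."""
    return apus * gflops_per_apu


def chips_needed(target_tflops: float, gflops_per_chip: float) -> int:
    """Whole chips required to reach the target peak throughput."""
    return math.ceil(target_tflops * 1000 / gflops_per_chip)


def gflops_per_bluegene_chip(target_tflops: float, chips: int) -> float:
    """Per-chip peak implied by the Blue Gene/L figures in the post."""
    return target_tflops * 1000 / chips


if __name__ == "__main__":
    cell = chip_gflops(APUS_PER_CHIP, GFLOPS_PER_APU)
    print(cell, "GFLOPS per hypothetical Cell chip")
    print(chips_needed(BLUEGENE_TARGET_TFLOPS, cell), "chips for 300 TFLOPS")
    print(round(gflops_per_bluegene_chip(BLUEGENE_TARGET_TFLOPS, BLUEGENE_CHIPS), 2),
          "GFLOPS per Blue Gene/L chip")
```

Under those assumptions the chip count lands close to the "300 CELL chips" in the post, so the disagreement is really about whether the per-APU rating is believable, not about the multiplication.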

A Vector Unit in the Emotion Engine from 2000 (250nm) outputs 2.5-something GFlop. We're basically talking of a power of 10 here - Using Intel's clock scaling as a baseline, we'll see the same power of 10 leap in the same timeframe.
And look at what Intel's clock scaling has brought us: 100+ watts of power consumption at a mere 3.4 GHz. Now even "mobile" chips routinely burn 30~40 watts at peak. Going from 30 MHz to 300 MHz was easy; going from 300 MHz to 3 GHz will be 10 times as hard, as shown by the struggles of Itanium2, Power5, Pentium4, etc...

8 "ICs" have a really high fixed cost in the packaging, and PCB costs that don't scale with Moore's Law. Hell, they barely scale at all - which is bad.
Packaging & testing cost is around $5 per chip if you are willing to settle for low power consumption and low clock speed.

This is diametrically opposed to Sony's design choices in the past, which have trended toward integration and SoC designs.
Which does not necessarily result in higher pin count and expensive high thermal packaging.

The PSP itself is a great example of how they're going to let Moore's Law reduce costs over time by going with a single SoC. Hell, no external RAM, nothing... it's beautiful.
And it is rated at only 2.6 GFLOPS. To say that a single PE will be 100x as powerful as this is rather stupid.

Microsoft has basically admitted defeat on this front and taken a much better route this time around.
It is really a single Power5 vs. quad PPC440s competition. Who will laugh all the way to the bank remains to be seen.
 
Believe what you must, but don't be disappointed when Kutaragi Ken unveils CELL next year.

I imagine you're going to be shitting your pants come March.

So we have a disagreement on what "it" refers to. This is one of the things I hate about the English language: pronouns are never certain about what they point to. But let us take a moment and think about the context of the article. The article was titled "A supercomputer built on a gaming chip." The Blue Gene/L ASIC is also a technology, and it differs from your standard PPC in many ways, most notably the built-in chip-to-chip networking.

The problem I see with lots of hardware-oriented people is that they focus too much on hardware; PPC is just a processor and makes up a small part of the overall Blue Gene technology. Blue Gene is really about the distributed OS and processes and chip-to-chip networking. Hell, even the Blue Gene Cyclops, which does not even use PPC, is still called a Blue Gene. Why? Because it shares the same systems architecture.

Bottom line. You read it wrong, they were talking about PowerPC.

It is pretty much obvious that CELL will not deliver a teraflop per chip; it is a physical impossibility.

IBM claims a different story. Oh, and you're not a CMOS engineer, so just shut up already.


And what consumer level product does IBM build??? IBM has zero use for CELL. What's in it for IBM other than the license fee?

We have been through this already, and I'm not going to repeat what IBM wants to use Cell for.
 
If 32 GFLOPS were true, then there is no point to Blue Gene/L's existence; why use 65,000 chips to reach 300 TFLOPS in 2006 when 300 CELL chips would do it???

Because SCE/TOSHIBA Cell is different than IBM's Cell. You continue to forget, Cell is an ARCHITECTURE.

A good description of yourself.

Nah, it's a pretty good one of you.

And it is rated at only 2.6 GFLOPS. To say that a single PE will be 100x as powerful as this is rather stupid.

It's a handheld; there are certain thermal/cost considerations with it. Oh, and to say a PE would be 100X as powerful isn't out of the question: you are talking about an entirely new architecture in which Sony is investing billions, which they will build on the world's best fabrication processes and push the lithography until it screams for mercy, to the point of bad yields.
 
...

I imagine you're going to be shitting your pants come March.
I won't, because I know exactly what Kutaragi is up to.

You read it wrong, they were talking about PowerPC.
The issue will be resolved before you come next April.

IBM claims a different story.
What does IBM claim then? Teraflop chip???

Oh, and you're not a CMOS engineer, so just shut up already.
And you happen to be one by chance? If so, you should be the one out there explaining why it is impossible to do a teraflop chip. I have seen some calculations done by professionals and the number runs like 500 watts for the FPUs alone; God knows how much power the rest of the chip will require...

We have been through this already, and I'm not going to repeat what IBM wants to use Cell for.
No use, of course.

Because SCE/TOSHIBA Cell is different than IBM's Cell. You continue to forget, Cell is an ARCHITECTURE.
Which includes Blue Gene Cyclops, Blue Gene/L, and STI CELL..... According to you, IBM has as much right to use CELL as Kutaragi Ken, so why aren't they using it right away? One teraflop CELL would replace 200 BlueGene/L ASICs and allow IBM to claim "Behold, we have put 1 PETAFLOP into a box the size of a dishwasher, now bow down and worship this God in a box. And it costs only $399,990.00 + tax at Wal-mart". Why is IBM spending $130 million to build BlueGene/P when SCEI CELL would deliver the same performance for only $399,990.00 + tax??? ($399.99 * 1000 PSX3s)
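
The dishwasher-sized petaflop box works out as follows. This is a minimal sketch using only the figures in the post above: $399.99 per PSX3, 1000 consoles, one teraflop per CELL, and 200 BlueGene/L ASICs per teraflop. The one-teraflop-per-console rating is the hypothetical being mocked, not a spec, and the helper names are made up. Prices are kept in integer cents to avoid floating point rounding.

```python
# Figures from the post; the per-console teraflop is the hypothetical in dispute.
CONSOLE_PRICE_CENTS = 39_999   # $399.99
CONSOLES = 1_000
TFLOPS_PER_CONSOLE = 1


def total_price_usd(consoles: int, price_cents: int) -> str:
    """Total price formatted as dollars, computed in integer cents."""
    total = consoles * price_cents
    return f"${total // 100:,}.{total % 100:02d}"


def cluster_pflops(consoles: int, tflops_each: float) -> float:
    """Aggregate peak in PFLOPS, assuming perfect scaling across consoles."""
    return consoles * tflops_each / 1000


def implied_gflops_per_asic(cell_gflops: float, asics_replaced: int) -> float:
    """Per-ASIC peak implied by 'one teraflop CELL replaces 200 ASICs'."""
    return cell_gflops / asics_replaced


if __name__ == "__main__":
    print(total_price_usd(CONSOLES, CONSOLE_PRICE_CENTS))
    print(cluster_pflops(CONSOLES, TFLOPS_PER_CONSOLE), "PFLOPS")
    print(implied_gflops_per_asic(1000, 200), "GFLOPS per BlueGene/L ASIC")
```

The sketch assumes perfect scaling and ignores interconnect, memory, and everything else a real cluster needs, which is part of what makes the comparison rhetorical.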

It's a handheld, there are certain thermal/cost considerations with it.
And CELL doesn't???? You really believe PSX3 will come with a huge fan and burn power like a microwave???

Oh and to say a PE would be 100X as powerful isn't out of the question
Actually it is. According to sources, the CELL design was completed a while back but the PSP is still not done (Kutaragi confirmed they still had no physical chip), so the PSP is actually considered a later design than CELL. The reason it is coming out first is that it is more of a traditional architecture and requires less testing, whereas CELL is a BlueGene/L derivative and depends on the maturity of its mother's technology.

in which they will build on the world's best fabrication processes
65 nm by 2005 is nothing special; everybody else's new products will start appearing on that process as well. SCEI is not doing anything out of the ordinary.

to the point of bad yields.
You are missing the whole point of the CELL architecture. The CELL architecture is about aggregate power, never about the raw power of an individual node. In fact, it makes sense for SCEI to make CELL as simple and inexpensive as IBM did, to keep the cost as low as possible, so that it can use a whole bunch of them.
 
Re: ...

Why do I even spend time responding to you? You're so fucking inconsistent in your argument due to your "Anti-Sony above all else" mentality. For example:

DeadmeatGA said:
Servers don't need APUs.
&
DeadmeatGA said:
Because IBM has no use for APU oriented programming??? IBM runs its user processes on PPC.

OK, IBM doesn't have a use for the BE in its own future - we must assume there are other avenues of future development:

DMGA a few lines below said:
If 32 GFLOPS were true, then there is no point to Blue Gene/L's existence; why use 65,000 chips to reach 300 TFLOPS in 2006 when 300 CELL chips would do it???

So wait, let me get this straight, because I must be blinded by your divine brilliance. IBM doesn't need APUs - we must assume due to their task requirements, which necessitate other architectures - and then you turn around and ask why Cell won't replace other IBM projects?

Seriously, go F- yourself.

DMGA said:
For example, does Cell have APUs in its basic form?
Absolutely. It is the APU that runs user processes, not PPC.


First of all, the only source that could give you this conclusion is the Suzuoki patent, which stated 32 GFLOPS per APU - which you don't agree with. Thus, how the F- can you disagree with 70% of the patent and then agree with only that which supports your argument?

It's a PPC core, it can run anything AFAIK.

I also later stated:
Vince said:
So, for me, when I see an article on IBM's Cell it means nothing. It could just be the core and a single APU, it could be just the core. It could be an iguana's brain in a vat. We don't know enough, especially knowing that the BE is a discrete entity, to make a judgement call as you've done.

DeadmeatGA said:
What IBM can collect royalties on:

1. PPC
2. SOI
3. Systems software.
4. Programming knowhow.
5. Various tools.
6. Blue Gene API.

Prove it. I want legal documents. The SEC would mandate public disclosure of said agreements - I want links.

And look at what Intel's clock scaling has brought us: 100+ watts of power consumption at a mere 3.4 GHz. Now even "mobile" chips routinely burn 30~40 watts at peak. Going from 30 MHz to 300 MHz was easy; going from 300 MHz to 3 GHz will be 10 times as hard, as shown by the struggles of Itanium2, Power5, Pentium4, etc...

And look at AMD's cores, which utilize a first-generation IBM process technology (which has a hell of a lot more in common with Cell than Intel's beast): you'll see that going from a K6-2 @ 300MHz (~20 watts) to the full 4GHz @ 85 watts will have scaled in power dissipation at under half of your projection.

And we're also not fully acquainted with the Cell microarchitecture and layout. Thus, any thermal talk is early/ignorant at best and plain old trollish at worst.

Packaging & testing cost is around $5 per chip if you are willing to settle for low power consumption and low clock speed.

Why... why do I want someone else's input on this tidbit?


And it is rated at only 2.6 GFLOPS. To say that a single PE will be 100x as powerful as this is rather stupid.

Hardly. We can easily see a 100X+ leap in performance between a GB-SP or PDA's 3D graphics IC and those found on a R3x0 or NV3x when it comes to computational ability.

It is really a single Power5 vs. quad PPC440s competition. Who will laugh all the way to the bank remains to be seen.

What?!? You bore me with this rhetoric. It really hurts.
 
it is a physical impossibility.

There is no such thing.

Let's see how things unfold, and be not surprised if the expectations of those who're conservative... are exceeded... will history repeat itself again, as it often tends to do?

Only time will tell my friend... only time will tell...

PSX3 will see a memory capacity jump of only 8 times over PSX2.

That's news to me, when did they actually announce/say that... I'm curious...
 
Re: ...

DeadmeatGA said:
I won't, because I know exactly what Kutaragi is up to.

Yes, ever since that precious moment when their eyes crossed, a look exchanged, an ideology passionately exposed... and then the monitor went into sleep mode.
 
I won't, because I know exactly what Kutaragi is up to.

Let me be the first to say that you don't know shit.

What does IBM claim then? Teraflop chip???

Cell will be designed to deliver "teraflops" of processing power.

http://www-3.ibm.com/chips/news/2001/0312_sony-toshiba.html

And you happen to be one by chance? If so, you should be the one out there explaining why it is impossible to do a teraflop chip. I have seen some calculations done by professionals and the number runs like 500 watts for the FPUs alone; God knows how much power the rest of the chip will require...

I don't deem architectures failures, nor do I claim to know everything 3 multi-billion-dollar corporations do. Nor do I deem things impossible.


No use, of course.

Wrong.

Which includes Blue Gene Cyclops, Blue Gene/L, and STI CELL..... According to you, IBM has as much right to use CELL as Kutaragi Ken, so why aren't they using it right away? One teraflop CELL would replace 200 BlueGene/L ASICs and allow IBM to claim "Behold, we have put 1 PETAFLOP into a box the size of a dishwasher, now bow down and worship this God in a box. And it costs only $399,990.00 + tax at Wal-mart". Why is IBM spending $130 million to build BlueGene/P when SCEI CELL would deliver the same performance for only $399,990.00 + tax??? ($399.99 * 1000 PSX3s)

Again, you're wrong. Sony's implementation of Cell and IBM's are two totally different things.

Oh yeah, and you're forgetting, IBM's own Cellular processor isn't done.


And CELL doesn't???? You really believe PSX3 will come with a huge fan and burn power like a microwave???

You can do more with a console.

Actually it is. According to sources, the CELL design was completed a while back but the PSP is still not done (Kutaragi confirmed they still had no physical chip), so the PSP is actually considered a later design than CELL. The reason it is coming out first is that it is more of a traditional architecture and requires less testing, whereas CELL is a BlueGene/L derivative and depends on the maturity of its mother's technology.

Actually it isn't.

And actually you're wrong. I'm not even going to go into it. You know shit, this is the end of the story.

You are missing the whole point of the CELL architecture. The CELL architecture is about aggregate power, never about the raw power of an individual node. In fact, it makes sense for SCEI to make CELL as simple and inexpensive as IBM did, to keep the cost as low as possible, so that it can use a whole bunch of them.

Nope.
 
Re: ...

DeadmeatGA said:
IBM can scale it [as in the architecture, or perhaps the IC] up for use in its servers.
Servers don't need APUs.

Apparently they do as you yourself, the great and mighty Deadmeat judged the Broadband Engine as a Server Processor.

And Servers, even routers might need the APUs as a lot of the work they do supports a quite nice degree of parallelism.

Usually, there comes a point where a person learns just enough of a new topic/field that he comes to the realization that he knows nothing.
A good description of yourself.

You must love your wit a bit too much, but you could still use a bit of modesty in there... mr. "I am never wrong, not even when I get my ass handed to me by two pros on this board regarding game programming + GSCube ( DeanoC, Archie ), Sony history, etc..."


Yet it's decidedly clear that at the APU level there is 32 GFLOPS as a lower bound, as per the "preferred embodiment".
It is interesting to observe that no one has been able to come up with a single 4-way vector processor capable of 32 GFLOPS with power consumption low enough (1~2 watts) to put 32 of them on a chip. Altivec doesn't do it, and not even the future NEC SX-7 supercomputer is capable of such a feat.

The NEC super-computer is interested in DP math and we have seen no Altivec-like implementation for processors targeted at the 65 nm node and heavy multi-media work.

Why did an SCE employee patent this and not IBM?
Because IBM has no use for APU oriented programming??? IBM runs its user processes on PPC.

IBM is interested in STI's CELL: it allows them to enter the CE market and it is one more weapon in their MPU arsenal: they are converting their Server software to STI CELL for a reason.

It would also be good for setting up server farms for computing-for-hire scenarios, given the great parallel-processing ability and scalability of STI CELL ( and the fact it is online ready ).

IBM has previous patents outlining the structure of the APU, the structure of a PE and the SMP set-up of different PEs in the same chip: all of this is in line of the snapshot of CELL Suzuoki took in his patent.

On the contrary, anything but 32 GFLOPS per APU is unreasonable for that time period.
If 32 GFLOPS were true, then there is no point to Blue Gene/L's existence; why use 65,000 chips to reach 300 TFLOPS in 2006 when 300 CELL chips would do it???

Because BlueGene is aimed at peak utilization in mostly scalar Double Precision FP calculations: in what they do there will be parallelism, but not much at the data level to go SIMD nuts on it and Single Precision FP data types.

A Vector Unit in the Emotion Engine from 2000 (250nm) outputs 2.5-something GFlop. We're basically talking of a power of 10 here - Using Intel's clock scaling as a baseline, we'll see the same power of 10 leap in the same timeframe.
And look at what Intel's clock scaling has brought us: 100+ watts of power consumption at a mere 3.4 GHz. Now even "mobile" chips routinely burn 30~40 watts at peak. Going from 30 MHz to 300 MHz was easy; going from 300 MHz to 3 GHz will be 10 times as hard, as shown by the struggles of Itanium2, Power5, Pentium4, etc...

Hold on before you worry too much about IPF: it still has the x86 baggage, which it is shedding, and it is still one manufacturing process behind x86.

About the Pentium 4: I would like to remind you that significant portions of the chip are indeed running at a mere 6.8 GHz ( ALUs, AGUs, Register File, etc... ) and they are still using bulk CMOS.

The PSP itself is a great example of how they're going to let Moore's Law reduce costs over time by going with a single SoC. Hell, no external RAM, nothing... it's beautiful.
And it is rated at only 2.6 GFLOPS. To say that a single PE will be 100x as powerful as this is rather stupid.

Considering the SoC the VFPU was targeted for ( portable gaming devices ) and the fact that the old Emotion Engine managed 6.2 GFLOPS, I think there is a chance for a PE with 8xAPUs to be "that" powerful ( without the strict concerns of mobile applications ).
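
For scale, here is what "that powerful" would mean in ratios. The sketch assumes a PE with 8 APUs at the disputed 32 GFLOPS each and compares it with the 2.6 GFLOPS PSP rating and the 6.2 GFLOPS Emotion Engine figure quoted in this thread; all numbers are peak ratings as cited by the posters, and the function name is hypothetical.

```python
# Peak figures as quoted in the thread; none are verified here.
PSP_GFLOPS = 2.6
EMOTION_ENGINE_GFLOPS = 6.2
APUS_PER_PE = 8
GFLOPS_PER_APU = 32   # the disputed per-APU reading of the patent


def speedup(new_gflops: float, old_gflops: float) -> float:
    """How many times faster the new peak rating is than the old one."""
    return new_gflops / old_gflops


if __name__ == "__main__":
    pe_gflops = APUS_PER_PE * GFLOPS_PER_APU
    print(pe_gflops, "GFLOPS per PE")
    print(round(speedup(pe_gflops, PSP_GFLOPS), 1), "x the PSP rating")
    print(round(speedup(pe_gflops, EMOTION_ENGINE_GFLOPS), 1), "x the Emotion Engine figure")
```

So the "100x the PSP" claim is roughly what the per-APU reading implies for one PE, while the same PE would be about 41x the Emotion Engine figure.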

Microsoft has basically admitted defeat on this front and taken a much better route this time around.
It is really a single Power5 vs. quad PPC440s competition. Who will laugh all the way to the bank remains to be seen.

It is a bit different than just a quad PPC440 and it is not yet clear that Microsoft will go for a dual core POWER5.
 
...

To Vince

Seriously, go F- yourself.
Hey mod, can we get a ban on Vince???

First of all, the only source that could give you this conclusion is the Suzuoki patent.
Actually I drew my conclusion from the BlueGene documents. After all, CELL is a Blue Gene derivative.

It's a PPC core, it can run anything AFAIK.
You simply don't understand how CELL works. The whole point of CELL is the separation between OS functionality and user process functionality.

And look at AMD's cores, which utilize a first-generation IBM process technology
And Opteron is a single core device. Now imagine how much power 32 APUs and 4 PPC cores will burn???

Oh yeah, and you're forgetting, IBM's own Cellular processor isn't done.
What do you mean??? You saw the picture of completed 512 BlueGene ASICs powering BlueGene/L prototype just a couple days ago.

To Panajev

Apparently they do as you yourself, the great and mighty Deadmeat judged the Broadband Engine as a Server Processor.
Server for SCEI, not for IBM. IBM has no use for APU.

The NEC super-computer is interested in DP math and we have seen no Altivec-like implementation for processors targeted at the 65 nm node and heavy multi-media work.
Actually, PPC980 G6+ (or G7, depending on the time frame). I don't think the G6 Altivec will be rated at 32 GFLOPS.

Because BlueGene is aimed at peak utilization in mostly scalar Double Precision FP calculations:
And why not use single precision? Or even modify CELL a bit to support double precision??? After all, 600 double-precision CELLs are still 1/100th the chip count of the 65,000 chips required for a full-blown BlueGene/L.

Hold on before you worry too much about IPF: it still has the x86 baggage, which it is shedding, and it is still one manufacturing process behind x86.
And the chips without x86 baggage clock even slower.

and the fact the old Emotion Engine managed 6.2 GFLOPS
EE VUs were never rated at 6.2 GFLOPs; they were rated 2.4 GFLOPS each.

I think there is the chance for a PE with 8xAPUs to be "that" powerful
Unless Sony plans to put a liquid cooling device, hell no.

Anyhow, none of you Sony fanatics are answering my question; why is IBM sticking with BlueGene/L and Cyclops if STI CELL is as great as you folks claim it to be????
 
What do you mean??? You saw the picture of completed 512 BlueGene ASICs powering BlueGene/L prototype just a couple days ago.

That's not Cell. Hell, it's not even IBM's own Cell. How many god damn times must we tell you this?

IBM has stated that Cell would be done in 2005; in fact, IBM has said they will manufacture it for their own uses come 2005.
 