I do see a drive on the software side: the computing power needed for multi-media ( video playback, music, 3D gaming, etc... ) is growing at a pace which makes a power efficient solution desirable.
What CELL would mean to a PDA could be, in the worst case scenario ( programs that are single threaded and use all scalar operations even if compiled for CELL ), a few more transistors on the CPU.
Not all of those transistors will be constantly switching on and off and thus consuming power.
The patent I posted a picture from describes what basically happens in such a scenario: all the unused APUs would go to sleep, and the APU(s) running the program's code would be in power saving mode as well: only one of the FXUs/FPUs would be active while processing scalar instructions.
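To put rough numbers on that, here is a toy model ( my own sketch; the 8 APUs per PE and the 4 FPUs + 4 FXUs per APU layout are from the patent, the rest is my reading of it ):

```python
# Toy model of the power saving behaviour described above ( my own sketch ).
# The patent describes a PE as 1 PU + 8 APUs, each APU with 4 FPUs and 4 FXUs.

APUS_PER_PE = 8
FPUS_PER_APU = 4
FXUS_PER_APU = 4

def switching_units(busy_apus, scalar_only):
    """Execution units actually switching; unused APUs sleep entirely."""
    if scalar_only:
        # power saving mode: reading "only one of the FXUs/FPUs" as one
        # FPU and one FXU kept awake per busy APU
        per_busy_apu = 2
    else:
        per_busy_apu = FPUS_PER_APU + FXUS_PER_APU
    return busy_apus * per_busy_apu

# Worst case: a single threaded, all scalar program running on 1 APU
print(switching_units(busy_apus=1, scalar_only=True))             # -> 2
# Fully loaded PE: all 8 APUs crunching SIMD code
print(switching_units(busy_apus=APUS_PER_PE, scalar_only=False))  # -> 64
```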
Imagine a CELL chip with a single PE: 1 PU and 2 APUs, plus the Pixel Rendering part ( Pixel Engine, Image Cache and CRTC ), all running at, let's say, 500 MHz.
Say we have like 4-8 MB of e-DRAM and some off-chip DRAM.
That is not impossible at 90 nm, but come the 65 nm SOI node ( which could be used for these PDA chips as well if the yields are good enough ), allocating the transistor budget for such a set-up will not be a problem.
In this sense I can see the ideas behind the "CPU cost will not be the determinant factor" argument.
But even with that kind of transistor budget, the issue of maximizing the use of those transistors remains: setting up big high level software layers to enable neat and versatile data sharing and networking between these different devices would mean spending more transistors than what a CELL solution would need to do the same thing.
In such a worst case scenario ( as I was describing a few lines above ), which would not be the norm, this would be equivalent to running just the PU and 1 APU's FXU/FPU.
Still, that would yield a maximum peak of 1 GFLOPS or 1 GOPS, which for a PDA would not be THAT bad.
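That 1 GFLOPS figure falls straight out of the clock speed, assuming ( my assumption, the patent does not spell this out for the PDA case ) that the one active FPU retires a fused multiply-add, i.e. 2 FLOPs, per cycle; the 1 GOPS figure implies the same counting on the FXU side:

```python
CLOCK_HZ = 500e6      # the hypothetical PDA clock from above
OPS_PER_CYCLE = 2     # assumes one fused multiply-add ( 2 FLOPs ) per cycle

peak_gflops = CLOCK_HZ * OPS_PER_CYCLE / 1e9
print(peak_gflops)    # -> 1.0, the worst case peak quoted above
```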
With conventional ARM, SH or MIPS architectures you can certainly match the worst case performance of CELL ( the PDA solution ) with fewer transistors, but that is perfectly fine; even numbers like 1 GFLOPS and 1 GOPS would be fine in the right context: such a worst case scenario would mean users who only use a Word Processor or browse the web or send e-mails ( notice the use of "or" ), and that kind of processing power is more than what those users would need.
The challenge lies in 3D graphics, multi-tasking and multi-media applications in general: those play to CELL's strengths.
CELL was designed to do "ok" on applications that are serial in nature and do not offer much chance to extract power from a parallel processor configuration, and to do "very well" on very bandwidth and processing intensive applications that tend to offer more extractable parallelism.
CELL was also designed to offer power saving features for the former kind of applications: down to putting the individual unused FPUs and FXUs into sleep mode.
CELL is not truly a revolution that brings a totally new concept: while it has some innovative ideas, its base is in tons of research projects from the past 20-30 years that tried to shake the world and bring new computing paradigms.
What the CELL designers decided, starting the architecture from scratch, was to take this immense amount of past research and put it to good use now that the implementation technology finally allows those concepts to be tried in economically feasible projects.
E-DRAM is not a novel idea; highly parallel execution resources, modularity and scalability, distributed processing, etc... are not new concepts either.
Put the right ideas together with the necessary fairy dust to aggregate them, and you would get something very interesting.
The new technology finally allows all those ingenious ideas that have been sitting in labs for 10-20 years to be put to work: as I said, choose the right ones, add some spark of genius and you will probably accomplish something ( sorry if I repeat some concepts over and over ).
This is what happens when you sit down, look at the manufacturing processes you expect in the next 5-7 years ( leaving yourself headroom in case the timeline gets pushed back, or hopefully forward ) and design something new around the physical limitations you will have in 5-7 years, not the ones you have currently.
You will see the good behind this once IPF, in its markets, spreads its wings: a carefully designed ISA will have its advantages in the long run.
The problem is scaling the existing architectures to match that kind of potential: they were not designed to work on these massive ( parallel ) workloads.
Pentium 4, ARM7-11, MIPS32-64, SH-4 and SH-5 are all the result of architectures and design methodologies optimized to run the common case fast, in a period where the common case did not involve processing tons of parallel data streams.
The Pentium 4 EE symbolizes this: ridiculously fast for what 80% of the users that own PCs now would need ( e-mails, web browsing, Word Processing and Excel spreadsheets ), but not fast enough to drive 3D games without the help of expensive GPUs with powerful and highly parallel processing resources.
As they mention in the APU patent I linked previously ( if you do not have the link I will re-post it ), a big issue is designing a power efficient architecture that can deal with this new, next generation computing problem: it needs to be very fast at processing several parallel data streams, and it needs to provide decent performance for people who just do web browsing, Word Processing, etc...
That approach was good for x86 when the world was beginning to see the birth of current multimedia applications, and it was successful in scaling x86's performance to meet the demands from the content providers: right now, though, those architectures have already been left behind.
Imagine I could have a CELL CPU in a Desktop PC running JIT compiled x86 legacy code at 2-3 GFLOPS or GOPS of effective performance ( well below current Pentium 4 standards, but who needs much more for web browsing and word processing? ), yet able to reach 150-200 out of its 256-512 GFLOPS peak in complex 3D games, etc..., all for a similar price compared to a Pentium 4 EE ( by 2005 it should hit $400; it is starting later this year at $740 in 1,000-chip quantities, which means that you, the single consumer, will pay slightly more for your order, depending on who you are buying the chip from ).
Judging by the power they are trying to pack into PlayStation 3 for $299, a $400 CELL CPU ( imagining some sort of profit being made per chip ) could be a decent processor.
Even if it only did 50 GFLOPS in PC games, that would still be more than 4x the power of the Pentium 4 EE.
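As a rough sanity check on that ratio ( with my own numbers: a 3.2 GHz Pentium 4 EE and a best case of 4 single precision FLOPs per cycle through SSE ):

```python
P4EE_CLOCK_GHZ = 3.2       # Pentium 4 EE launch clock
SSE_FLOPS_PER_CYCLE = 4    # best case single precision throughput, my estimate

p4ee_peak_gflops = P4EE_CLOCK_GHZ * SSE_FLOPS_PER_CYCLE   # -> 12.8
print(50 / p4ee_peak_gflops)   # -> ~3.9, in the ballpark of the 4x above
```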
It could run single instances of Word and Excel like a Pentium III 733 MHz for all I care ( even though, if you ran several instances of those applications or tons of tasks in the background, you would probably get a nice speed-up ), but when you went to play in the areas consumers are starting to buy their Hardware for ( 3D games, high quality video [how much performance do you have left when you run the Hi-Def version of Terminator 2 on your PC? Not much, even on 3.0 GHz computers] and sound processing ), you would see this architecture leaving the common x86 in the dust.
I wanted to link this patent as well, it might be interesting to the discussions we are having:
http://appft1.uspto.gov/netacgi/nph...)&OS=an/"sony+computer"&RS=AN/"sony+computer"
A multi-processing computer architecture and a method of operating the same are provided. The multi-processing architecture provides a main processor and multiple sub-processors cascaded together to efficiently execute loop operations. The main processor executes operations outside of a loop and controls the loop. The multiple sub-processors are operably interconnected, and are each assigned by the main processor to a given loop iteration. Each sub-processor is operable to receive one or more sub-instructions sequentially, operate on each sub-instruction and propagate the sub-instruction to a subsequent sub-processor.
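If I read that abstract right, the control flow looks roughly like this ( a toy sketch with names I made up, just to illustrate the cascade, not the patent's actual mechanism ):

```python
# Toy sketch of the cascaded loop idea from the abstract: the main processor
# handles everything outside the loop and assigns iterations; each
# sub-instruction enters the chain once and propagates from sub-processor
# to sub-processor.

class SubProcessor:
    def __init__(self, index):
        self.index = index
        self.iteration = None   # loop iteration assigned by the main processor

    def run(self, sub_instruction, data, results):
        if self.iteration is not None:
            results[self.iteration] = sub_instruction(data, self.iteration)

def main_processor_loop(iterations, sub_instruction, data, n_subs=4):
    chain = [SubProcessor(i) for i in range(n_subs)]
    results = [None] * iterations
    for base in range(0, iterations, n_subs):
        # the main processor assigns one iteration per sub-processor
        for sub in chain:
            i = base + sub.index
            sub.iteration = i if i < iterations else None
        # the sub-instruction then propagates down the chain of sub-processors
        for sub in chain:
            sub.run(sub_instruction, data, results)
    return results

# usage: each cascaded sub-processor squares "its" element of the array
print(main_processor_loop(6, lambda d, i: d[i] * d[i], [1, 2, 3, 4, 5, 6]))
# -> [1, 4, 9, 16, 25, 36]
```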