Kentsfield as an alternative to Cell in PS3

The current Polaris implementation seems to rely on memory that is directly bonded to the processor die. I don't think Intel has given much detail on what exactly the memory is or its capacity, so it's hard to say if each mini-core can be treated like a very simple SPE or if there are other wrinkles to the design.
 
From front page of Beyond3D...

...the chip is merely 275mm² on 65nm, drawing 98W at 3.2GHz, or 11W at 1GHz.

...
The most surprising aspect, perhaps, is that the architecture seems to be latency-intolerant, but is coupled with stacked memory to minimize that problem and simultaneously maximize bandwidth. This is a fundamentally different paradigm from today's latency-tolerant GPUs. As such, much of the potential of Polaris as a concept has yet to be known, as it depends greatly upon the performance and size of its stacked memory, and on the overall architecture's latency, bandwidth and caching characteristics.
 
From front page of Beyond3D...

Thanks Patsu, but believe it or not the front page article is actually what sparked the question in the first place. It didn't give the in-depth description I was looking for, hence asking here whether the tech might be of some use in a future console.
 
Agreed - I'm thinking more along the lines of using it as a coprocessor. At 32nm this chip would cost peanuts.

You could probably use it as a coprocessor, for example as a PPU to handle really complicated physics, since it's capable of 1 TFLOPS (single precision) at 3GHz. You probably could divide it up into four smaller dies of 20 cores each, mounted in a SiP, to improve yields and lower costs. Intel has designed each core to be tileable, so you just slap more cores on when you want more processing capability. It doesn't even seem like the cores need to be clustered into groups of 8 etc.
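
As a sanity check on the 1 TFLOPS figure: Intel has described each Polaris tile as containing two single-precision FMAC units, and a multiply-accumulate counts as two flops. A quick back-of-the-envelope in C (treat the per-tile FMAC count as reported rather than verified here):

```c
#include <stdio.h>

int main(void) {
    const double tiles = 80.0;         /* Polaris tile count */
    const double fmacs_per_tile = 2.0; /* reported: two SP FMAC units per tile */
    const double flops_per_fmac = 2.0; /* one multiply + one add per cycle */
    const double clock_hz = 3.2e9;     /* 3.2 GHz operating point from the article */

    double peak = tiles * fmacs_per_tile * flops_per_fmac * clock_hz;
    printf("Peak: %.2f TFLOPS single precision\n", peak / 1e12);
    /* 80 * 2 * 2 * 3.2e9 = 1.024e12, i.e. ~1 TFLOPS as quoted */
    return 0;
}
```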
 
Cell slaughters Kentsfield in floating point, but Kentsfield slaughters Cell in integer and general purpose.

No, the Cell has integer performance to match its floating point performance... and in terms of 'general purpose,' what does that even mean? If we're talking about x86 and out-of-order execution, then yes. But any 'general purpose' task that can be executed on any other architecture can be optimized to run on Cell as well. Optimization, and thus effort, is the key.
 
No, the Cell has integer performance to match its floating point performance... and in terms of 'general purpose,' what does that even mean? If we're talking about x86 and out-of-order execution, then yes. But any 'general purpose' task that can be executed on any other architecture can be optimized to run on Cell as well. Optimization, and thus effort, is the key.

So with the correct optimisation, Cell effectively "slaughters" Kentsfield not only in floating point, but also in integer and even general purpose code?
 
SIMD operations can operate on int values.
Scalar int operations would favor Kentsfield per-core, and arguably would favor the x86 chip overall.
 
So with the correct optimisation, Cell effectively "slaughters" Kentsfield not only in floating point, but also in integer and even general purpose code?

Assuming you can write highly concurrent integer code. For single-threaded code it's an easy Kentsfield victory.
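
To make "highly concurrent integer code" concrete, here's a generic C/pthreads sketch of the pattern: partition independent integer work across cores and combine once at the end. This illustrates the concept only, it is not Cell- or Kentsfield-specific code; on Cell the chunks would be farmed out to SPEs rather than threads.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define NTHREADS 4        /* four cores on Kentsfield; 6-8 usable SPEs on Cell */
#define N (1 << 22)

static int32_t data[N];

struct chunk { size_t lo, hi; int64_t sum; };

static void *partial_sum(void *arg) {
    struct chunk *c = arg;
    int64_t s = 0;
    for (size_t i = c->lo; i < c->hi; i++)
        s += data[i];          /* pure integer work, no sharing until the end */
    c->sum = s;
    return NULL;
}

int main(void) {
    for (size_t i = 0; i < N; i++) data[i] = (int32_t)(i & 0xff);

    pthread_t tid[NTHREADS];
    struct chunk c[NTHREADS];
    for (int t = 0; t < NTHREADS; t++) {
        c[t].lo = (size_t)t * (N / NTHREADS);
        c[t].hi = (size_t)(t + 1) * (N / NTHREADS);
        pthread_create(&tid[t], NULL, partial_sum, &c[t]);
    }

    int64_t total = 0;
    for (int t = 0; t < NTHREADS; t++) {
        pthread_join(tid[t], NULL);
        total += c[t].sum;     /* single-threaded combine step */
    }
    printf("total = %lld\n", (long long)total);
    return 0;
}
```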
 
So with the correct optimisation, Cell effectively "slaughters" Kentsfield not only in floating point, but also in integer and even general purpose code?

See, here's the problem - two years after the fact, there are still people living in the shadow of Major Nelson's post-E3 article of '05. Toss that sh*t out.

Like I said, when people refer to 'general purpose' code, they are basically talking about unoptimized code created to perform a specific task; the burden is placed more or less on the hardware to run it effectively. Kentsfield, like most x86 architectures, is obviously well suited for this - its entire design paradigm is centered around it. Out-of-order execution is the key element, as is support for a massive legacy code base. And in this, obviously, Kentsfield lays waste to Cell.

Ok.

Now... when people use 'general purpose' loosely, as Pigman did, they do so unsure of what they're saying, but with vague notions of AI performance, gameplay elements, etc... and these are two different things. That code is only general purpose insomuch as, normally in the PC space, it is also unoptimized, relies on heavy branching, and so on. So when you hear things like "Cell is bad on general purpose code," what you're hearing is essentially a distillation of Cell's performance on current code (and current coding practices). But yes, with the time and effort taken to optimize, Cell can run these things extremely well. *That* conversation, the time/reward tradeoff of doing such optimization, is a different thread topic in and of itself. But the idea that 'general purpose' is some nebulous concept that can't have qualities associated with it, save what gets regurgitated from some wildly over-simplified (and largely incorrect) article of two years ago, has got to end. It's sad, because thanks to Nelson's legacy the average layperson actually knows less about processors rather than more, with the ironic twist that in their own minds it's just the opposite.
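
For a concrete flavour of the optimization effort being described: one classic transform is removing unpredictable branches, which cost little on an out-of-order x86 with a strong predictor but stall an in-order SPE badly. A generic C sketch of the idea (plain scalar code for illustration, not actual SPE intrinsics; the SPE's hardware select instruction does the same thing in vector form):

```c
#include <stdint.h>

/* Branchy version: fine on an out-of-order x86 with a good predictor,
   painful on an in-order SPE when the condition is unpredictable. */
int32_t clamp_branchy(int32_t x, int32_t lo, int32_t hi) {
    if (x < lo) return lo;
    if (x > hi) return hi;
    return x;
}

/* Branch-free version: compute both outcomes and pick one with a mask,
   the same pattern the SPE's select instruction (and the SSE
   and/andnot/or idiom) implements in vector form. */
int32_t clamp_branchless(int32_t x, int32_t lo, int32_t hi) {
    int32_t lt = -(x < lo);        /* all-ones mask if x < lo, else 0 */
    x = (lo & lt) | (x & ~lt);
    int32_t gt = -(x > hi);        /* all-ones mask if x > hi, else 0 */
    x = (hi & gt) | (x & ~gt);
    return x;
}
```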
 
See, here's the problem - two years after the fact, there are still people living in the shadow of Major Nelson's post-E3 article of '05. Toss that sh*t out.

Like I said, when people refer to 'general purpose' code, they are basically talking about unoptimized code created to perform a specific task; the burden is placed more or less on the hardware to run it effectively. Kentsfield, like most x86 architectures, is obviously well suited for this - its entire design paradigm is centered around it. Out-of-order execution is the key element, as is support for a massive legacy code base. And in this, obviously, Kentsfield lays waste to Cell.

Ok.

Now... when people use 'general purpose' loosely, as Pigman did, they do so unsure of what they're saying, but with vague notions of AI performance, gameplay elements, etc... and these are two different things. That code is only general purpose insomuch as, normally in the PC space, it is also unoptimized, relies on heavy branching, and so on. So when you hear things like "Cell is bad on general purpose code," what you're hearing is essentially a distillation of Cell's performance on current code (and current coding practices). But yes, with the time and effort taken to optimize, Cell can run these things extremely well. *That* conversation, the time/reward tradeoff of doing such optimization, is a different thread topic in and of itself. But the idea that 'general purpose' is some nebulous concept that can't have qualities associated with it, save what gets regurgitated from some wildly over-simplified (and largely incorrect) article of two years ago, has got to end. It's sad, because thanks to Nelson's legacy the average layperson actually knows less about processors rather than more, with the ironic twist that in their own minds it's just the opposite.

Thanks, that's a very revealing explanation of what "general purpose code" really is (something I have been trying to find out for a while now).

So the bottom line here is that while Kentsfield (or any other PC processor) is great in a PC, where code is unoptimised, the Cell is a better and more powerful processor in most if not every way, but only with very optimised code. So if every PC in the world ran a Cell and all code was purpose-built for it, pretty much every app from Windows to Excel to games would run better. Right?
 
Thanks, that's a very revealing explanation of what "general purpose code" really is (something I have been trying to find out for a while now).

So the bottom line here is that while Kentsfield (or any other PC processor) is great in a PC, where code is unoptimised, the Cell is a better and more powerful processor in most if not every way, but only with very optimised code. So if every PC in the world ran a Cell and all code was purpose-built for it, pretty much every app from Windows to Excel to games would run better. Right?

Uh, well it's more complicated than that... but essentially yes... essentially.

That doesn't make Cell a 'better' processor though - it just makes it a different processor. See, that's the whole thing; consider both Kentsfield and Cell to be 'miracle' chips - it's simply that what those miracles are, are different. With Cell, if you write the proper code, your app is going to fly like the wind. The 'miracle' here is potential; the cost is rethinking the way you write code, becoming less reliant on compilers to get you there, and basically more time and money spent on the software dev side (this is not an absolute truth, but a relative one).
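
To give a flavour of what 'rethinking the way you write code' means on Cell: each SPE computes out of a small, software-managed local store that the programmer fills explicitly via DMA, instead of a transparent cache. A generic C sketch of that staging pattern, with memcpy standing in for the DMA transfers and a hypothetical staging budget (real SPE code would use the MFC DMA intrinsics):

```c
#include <stddef.h>
#include <string.h>

#define LOCAL_BYTES (64 * 1024)                    /* hypothetical staging budget;
                                                      an SPE has 256 KB total */
#define TILE (LOCAL_BYTES / sizeof(float) / 2)     /* split across in + out buffers */

/* Process a large array in tiles that fit in fast local memory,
   instead of streaming through a hardware cache and hoping it hits. */
void scale_staged(float *dst, const float *src, size_t n, float k) {
    static float in_ls[TILE], out_ls[TILE];        /* stand-ins for local store */
    for (size_t base = 0; base < n; base += TILE) {
        size_t len = (n - base < TILE) ? (n - base) : TILE;
        memcpy(in_ls, src + base, len * sizeof(float));   /* "DMA in" */
        for (size_t i = 0; i < len; i++)
            out_ls[i] = k * in_ls[i];                     /* compute on local data */
        memcpy(dst + base, out_ls, len * sizeof(float));  /* "DMA out" */
    }
}
```

Real SPE code would also double-buffer, overlapping the next tile's transfer with the current tile's compute; that overlap is where the 'fly like the wind' part comes from, and also where the extra engineering time goes.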

On the other hand, the 'miracle' of Kentsfield - and all of the OOE x86 chips with which it competes - is that they can take the most poorly written, unoptimized, terrible piece of legacy x86 code from 1990... run it, and in fact run it much faster than it ran on the processors contemporary to when that code was written. The benefit is cheaper, quicker software development. As these architectures go increasingly multi-core, there will be more crossover between them and processors like Cell in terms of dev challenge, but I think this helps paint a picture of why, in the present day, even though a world built on Cell might ultimately be faster, it's simply a world we can't afford to move to on the desktop, at a macro level, from an infrastructure, ecosystem, or economic standpoint. I think whatever course the desktop takes, it will remain largely the realm of x86 for quite some time to come; with every year that passes, legacy support only becomes more crucial.

I'm a fan of Cell though, and I think there are definitely places for it and architectures like it going forward; not everywhere is the desktop, and there are lots of industries willing to put the time in to optimize if it means a tangible improvement in execution.
 
General purpose code sounds like any software product that has a development cost structure and target market that does not permit pervasive low-level optimization to a single platform.

Perhaps it is code that was made without the assumption that a problem can be easily subdivided and run as a streaming or batched application, where the data could not be assumed to be amenable to SIMD execution, or the dynamic behavior proves extremely difficult to profile at compile time.

Perhaps it is software for a problem that can be run as such, but that the required overhead makes it undesirable when compared to more general architectures.
 
Uh, well it's more complicated than that... but essentially yes... essentially.

That doesn't make Cell a 'better' processor though - it just makes it a different processor. See, that's the whole thing; consider both Kentsfield and Cell to be 'miracle' chips - it's simply that what those miracles are, are different. With Cell, if you write the proper code, your app is going to fly like the wind. The 'miracle' here is potential; the cost is rethinking the way you write code, becoming less reliant on compilers to get you there, and basically more time and money spent on the software dev side (this is not an absolute truth, but a relative one).

On the other hand, the 'miracle' of Kentsfield - and all of the OOE x86 chips with which it competes - is that they can take the most poorly written, unoptimized, terrible piece of legacy x86 code from 1990... run it, and in fact run it much faster than it ran on the processors contemporary to when that code was written. The benefit is cheaper, quicker software development. As these architectures go increasingly multi-core, there will be more crossover between them and processors like Cell in terms of dev challenge, but I think this helps paint a picture of why, in the present day, even though a world built on Cell might ultimately be faster, it's simply a world we can't afford to move to on the desktop, at a macro level, from an infrastructure, ecosystem, or economic standpoint. I think whatever course the desktop takes, it will remain largely the realm of x86 for quite some time to come; with every year that passes, legacy support only becomes more crucial.

I'm a fan of Cell though, and I think there are definitely places for it and architectures like it going forward; not everywhere is the desktop, and there are lots of industries willing to put the time in to optimize if it means a tangible improvement in execution.

Ok cheers, I think the bottom line I'm hearing here though is that Cell is better as a console CPU (where code can be optimised for it) than Kentsfield (with or without optimisation).

Or put another way, if you take 100% advantage of both, Cell would come out on top in most areas.
 
General purpose code sounds like any software product that has a development cost structure and target market that does not permit pervasive low-level optimization to a single platform.

Perhaps it is code that was made without the assumption that a problem can be easily subdivided and run as a streaming or batched application, where the data could not be assumed to be amenable to SIMD execution, or the dynamic behavior proves extremely difficult to profile at compile time.

Perhaps it is software for a problem that can be run as such, but that the required overhead makes it undesirable when compared to more general architectures.

I guess the real issue is where you draw the line between ease of use and power. Cell has more power, Kentsfield is easier to use; is either ideal, or is there a better implementation?

My guess is Intel will bring both sides together: x86 cores which are "easy" to use, combined with specialised high-performance cores à la Cell.
 
I guess the real issue is where you draw the line between ease of use and power. Cell has more power, Kentsfield is easier to use; is either ideal, or is there a better implementation?

My guess is Intel will bring both sides together: x86 cores which are "easy" to use, combined with specialised high-performance cores à la Cell.

Terascale ;)
 
To me the term "general purpose code" does not sound like unoptimized code. It's more like high-level language code you can directly compile for multi-platform use, e.g. a C++ FFT routine which only uses standard floating point operations.
 
To me the term "general purpose code" does not sound like unoptimized code. It's more like high-level language code you can directly compile for multi-platform use, e.g. a C++ FFT routine which only uses standard floating point operations.

Agreed.
 
To me the term "general purpose code" does not sound like unoptimized code. It's more like high-level language code you can directly compile for multi-platform use.
Which is unoptimized code! Those algorithms will run with varying efficiency on different architectures. An architecture that runs them well is considered to be good at 'general purpose code'. The actual tasks that 'general purpose code' performs are not 'general purpose tasks': 'general purpose tasks' don't exist. Thus a chip that performs poorly on 'general purpose code' may not be bad at the task that code is doing; it may just need a different, non-generalised implementation. That doesn't mean any processor can perform well at any task, though. Some processors simply aren't well designed for certain tasks, no matter what the implementation.
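
By that reading, 'general purpose code' looks like the snippet below: strictly portable C that compiles unmodified for x86, the PPE, or an SPE, and whose performance then depends entirely on how well each architecture tolerates its scalar arithmetic and data-dependent branch (the routine itself is just an illustration):

```c
#include <stddef.h>

/* Portable, unoptimized "general purpose" code: it compiles anywhere,
   but the data-dependent branch and scalar FP map very differently onto
   an out-of-order x86 core than onto an in-order SPE. */
float sum_positive(const float *x, size_t n) {
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        if (x[i] > 0.0f)          /* unpredictable branch on real data */
            s += x[i];
    return s;
}
```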
 