Kentsfield as an alternative to Cell in PS3

As for multi-platform Windows, as I said, it has been tried at great expense to Microsoft and failed miserably. For example, NT on the Alpha was fully developed and supported by Microsoft for many years, but was dropped in favour of Linux by HP due to lack of demand.

More like the NT Alpha port was dropped because Digital was unwilling to pay for its maintenance, since the Alpha was on its way to becoming a dead architecture anyway.

Similarly, Microsoft developed, supported and hyped Windows 2000 Dataserver for the Itanium platform as an enterprise-class server OS. Unfortunately, due to lack of demand that too was dropped, but Itanium is doing well as a high-end server platform running Linux.

Itanium is still a fully supported server platform in Windows Server 2003 and AFAIK will be for Vista Server.

None of the competing CPUs (Alpha, PowerPC, MIPS, Itanium or other) have historically provided enough of a compelling advantage to enter the Windows mainstream, and thus never sold in quantities large enough to make developing software specifically for them worthwhile.

That said, it's actually relatively straightforward to provide Windows software on alternative platforms. I tested my stuff on Alpha when it was still around, and today I still cross compile and test on Itanium and x64.

But it's simply not worth putting significant effort into alternate architectures because the market is so small. (Except x64; support for that is growing really rapidly.)
 
The term 'general purpose' is just plain meaningless, like a 'general purpose' book. All code has a specific purpose, and all code is written with differing degrees of optimization for different hardware, whether written in a high-level or low-level language. The idea of a CPU that runs all code well without that code having to be specifically targeted at it gives us the notion of a 'general purpose' CPU, but it's a concept born of oversimplifications and marketing, I think.

The persistence of the term for a good number of years indicates some kernel of truth, though the language is perhaps inaccurate.

Maybe it is possible to define some categories of code that usually pop up when people discuss "general purpose" code.

The first broad category I can think of is "commodity code". Even if there is no such thing as a general purpose book, we do differentiate between a newspaper and Proust.

There is just code out there that won't be optimized unless it's the default flag on whatever compiler happens to be on the machine the developer is working on.

We can probably divide that grouping again, into "commodity due to cost" and "commodity due to low utility". The second one is mostly non-performance critical, so it's not useful as an argument for x86 when a high-end chip would be overkill anyway.

The first group is the large pool of applications where performance would be nice, but other constraints limit the amount of optimization that can be done. Just like we don't pay the premium to have a modern-day Tolstoy write our ad copy, we don't pay someone to massage the code if the money just isn't in it.

More speculative:
"special purpose, dynamic optimum", code that has a specific purpose, but for whatever reason has execution requirements that shift drastically, perhaps due to specific behaviors that occur unpredictably based on data elements.

sub-categories
"special purpose, dynamic optimum (instruction)", code with branch and instruction mixes that are difficult to profile at compile time, or are a solution to a problem that simply has no ideal single instruction combination for a given design.
x86 cores are likely very good at this, or at least better than a single SPE or PPE

"special purpose, dynamic optimum (data)", code that has difficult to profile data access behavior.
This does assume that there is an optimum. If there isn't, then this falls under the category of "SOL" code.
An SPE may still do well here, if the data behavior can be usefully contained within the bounds of the LS, or is not so scattered that it can't be handled by creative DMA paging. The downside is that for commodity code, this has to be handled creatively without a code genius.
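
To make the "(data)" case concrete, here's a minimal sketch of the kind of "creative DMA paging" an SPE relies on - double-buffering data through the local store so computation overlaps with the next fetch. The dma_get()/dma_wait() helpers below are hypothetical stand-ins for the Cell SDK's MFC transfer and tag-status calls, not real API names, and alignment and remainder handling are glossed over:

```c
/* Double-buffering sketch: stream a large array through an SPE's local store
 * in fixed-size chunks, fetching the next chunk while processing the current
 * one.  dma_get()/dma_wait() are hypothetical placeholders for the Cell SDK's
 * MFC get and tag-status calls; alignment and remainder handling are omitted. */
#include <stdint.h>

#define CHUNK 4096                    /* bytes per DMA transfer              */

static char buf[2][CHUNK];            /* two LS buffers to ping-pong between */

void dma_get(void *ls, uint64_t ea, unsigned size, int tag);  /* hypothetical */
void dma_wait(int tag);                                        /* hypothetical */
void process(char *data, unsigned size);                       /* user kernel  */

void stream(uint64_t ea, unsigned total)   /* total assumed a multiple of CHUNK */
{
    int cur = 0;
    dma_get(buf[cur], ea, CHUNK, cur);             /* prime the first buffer   */

    for (unsigned off = CHUNK; off < total; off += CHUNK) {
        int nxt = cur ^ 1;
        dma_get(buf[nxt], ea + off, CHUNK, nxt);   /* start the next fetch     */
        dma_wait(cur);                             /* wait for the current one */
        process(buf[cur], CHUNK);                  /* compute while DMA runs   */
        cur = nxt;
    }
    dma_wait(cur);
    process(buf[cur], CHUNK);                      /* handle the last chunk    */
}
```

If the access pattern can be tiled like this, the SPE hides its memory latency; if it can't, the code falls into the "SOL" bucket below.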

This is very much a moving target. If compilers develop the ability to magically generate highly robust, well-optimized code from many kinds of high-level source, then the range of commodity code that does well, even if it is considered "general purpose", would increase.

A category with some overlap with commodity code:

"SOL" code. Just plain bad-luck.
Two categories:
"SOL because of the programmer", this is just junk code.
"SOL for a reason", perhaps the problem being worked on is poorly defined, or the spec keeps changing. Perhaps it's just a really hard problem. I can't think of a good one, but I'm sure these problems exist.
 
Running 8 threads on an SPE is a bizarre use of the SPE. You're better off designing software that doesn't stall and need threads to keep the logic busy, and running threads to completion, 8 or 9 at a time, before switching, unless the switches are very occasional.

You're assuming you have the time or money to change the algorithms to something SPE-friendly; this is probably not the case for the vast majority of software. That is, if it can be done at all - the solution to some problems may simply be branchy and/or have a random memory access pattern that can't be changed.

Running 8 threads may seem bizarre but if you are running lots of threads with a random access pattern it's going to work a lot better than running one thread per SPE - that single thread will be sitting around waiting on memory for much of the time.

It might be a potential solution for AI for lots of baddies - can anyone confirm this?
 
Running 8 threads may seem bizarre but if you are running lots of threads with a random access pattern it's going to work a lot better than running one thread per SPE - that single thread will be sitting around waiting on memory for much of the time.
When you say 8 threads per SPE, do you mean switching out the current thread and swapping to another - which means saving out 256 KB of LS and all the registers, and loading in 256 KB of LS and registers from the next thread. Ouch! - or do you mean running 8 threads in an SPE, which gives an average of 32 KB per thread of LS space?
 
Wouldn't that fraction be a little smaller?
Can someone back up my memory about a proof of concept paper that ran some kind of tiny embedded kernel that allowed switching threads within an SPE? I think it took a significant fraction of the LS all by itself.
 
When you say 8 threads per SPE, do you mean switching out the current thread and swapping to another - which means saving out 256 KB of LS and all the registers, and loading in 256 KB of LS and registers from the next thread. Ouch! - or do you mean running 8 threads in an SPE, which gives an average of 32 KB per thread of LS space?

8 threads in a single SPE, no swapping, switching between threads takes just 1 cycle.
I believe it was to be included in the compiler at some point.

The limited LS sounds very restrictive, but remember this is for apps (or really algorithms) that are hitting memory and so by definition are not using the LS much.
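
For what it's worth, here is a rough sketch of what "several threads in one SPE, no swapping" might look like at the source level: each micro-thread owns a fixed slice of the local store, starts an asynchronous DMA for the data it needs, and yields instead of stalling. All the names below (dma_done, resume, etc.) are hypothetical placeholders, not Cell SDK calls, and the actual register context switching would be done by hand or generated by the compiler as mentioned above:

```c
/* Cooperative micro-threads inside one SPE: the scheduler round-robins over
 * threads, skipping any whose DMA hasn't completed, so the SPE keeps computing
 * while other threads wait on memory.  All names are hypothetical placeholders. */
#define NUM_THREADS 8
#define LS_SLICE    (32 * 1024)          /* ~32 KB of local store per thread   */

enum { RUNNABLE, WAITING_DMA };

typedef struct {
    char *ls_slice;                      /* this thread's share of the LS       */
    int   dma_tag;                       /* tag of its outstanding DMA, if any  */
    int   state;
} microthread;

static microthread threads[NUM_THREADS];

int  dma_done(int tag);                  /* hypothetical: has the transfer finished? */
void resume(microthread *t);             /* hypothetical: run until it yields/blocks */

void run_scheduler(void)
{
    for (;;) {
        for (int i = 0; i < NUM_THREADS; i++) {
            microthread *t = &threads[i];
            if (t->state == WAITING_DMA && !dma_done(t->dma_tag))
                continue;                /* its data hasn't arrived yet; skip it */
            t->state = RUNNABLE;
            resume(t);                   /* runs, then yields or blocks on a DMA */
        }
    }
}
```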
 
So a Cell processor built into a PS3 that can be bought for $600 is better overall than a QX6700 costing $900 or more a piece?
 
So a Cell processor built into a PS3 that can be bought for $600 is better overall than a QX6700 costing $900 or more a piece?

Cell will be better in some areas and vice versa. Trying to claim one is "better overall" would be a gross oversimplification of the situation.
 
Cell will be better in some areas and vice versa. Trying to claim one is "better overall" would be a gross oversimplification of the situation.

Well, I don't know a lot about processors. I asked on the AVS forums and there I got this answer:

When it comes to games, a quad core CPU would blow the snot out of the PS3/360's CPU. Heck, a dual core, out-of-order execution CPU would probably be better.

I didn't get any explanation or anything, so can anyone confirm this?
 
I didn't get any explanation or anything, so can anyone confirm this?
Generalized statements generally don't fit every situation.

Even assuming a quad-core CPU could "blow the snot" out of current console CPUs not just in the average case but in any case, could such a quad-core console be built at a price an average human being could afford?

Current console CPUs look the way they do because that's the only cost-effective way of reaching that level of performance with the technology we have at our disposal today.

Just saying Hardware X blows the snot out of it is not very constructive. The Earth Simulator could very likely software-render much prettier graphics than any CPU/GPU combination available today. At a billion-dollar cost and megawatt power draw.

Is that a useful comparison? No.
Peace.
 
Well, I don't know a lot about processors. I asked on the AVS forums and there I got this answer:

When it comes to games, a quad core CPU would blow the snot out of the PS3/360's CPU. Heck, a dual core, out-of-order execution CPU would probably be better.

I didn't get any explanation or anything, so can anyone confirm this?

There is no empirical evidence to back up that claim, since there is no PC game really showing off the advantages of a quad core. People are going to point to Crysis, but until the game is actually out and benchmarked across different CPUs we won't have the facts.

Theoretically, Cell will outperform that quad core in many ideal situations. We will see toward the end of this year whether the numbers on paper materialize as actual performance.
 
Well, I don't know a lot about processors. I asked on the AVS forums and there I got this answer:

When it comes to games, a quad core CPU would blow the snot out of the PS3/360's CPU. Heck, a dual core, out-of-order execution CPU would probably be better.

I didn't get any explanation or anything, so can anyone confirm this?

Read this thread and you'll have your answer: there is no black-and-white. If you don't understand the thread, well then just leave it alone, but *please* do not just post here asking about other posts on other forums. There are too many "How does Cell/XeCPU compare to Intel?" threads/tangents around here... it's almost a weekly thing, and it's tiresome.

A quad-core OOE CPU is obviously a strong chip, and for the majority of real-world game environments it would be the preferred architecture. I'd like to add a bunch of qualifiers to that, but I'm going to save the post-response chain and just say again, read this thread; it won't change the answer favoring the quad-core CPU in most cases, but it will give you insights into why... and frankly, when it comes to understanding anything, knowing the yes/no "answer" is not enough - knowing why that answer is what it is, that's what matters.
 
OK, I was just confused by so many forums saying different things; this one seems to be correct, but it's too complicated for me to understand it all if I read this whole thread.
What I do understand after reading is that Cell is better at some things, and vice versa.
 
Cell is quite a specialised processor, which means in some areas it performs fantastically well. Consider a custom processor in a DVD player, say. It can decode movies far quicker than a CPU (of way back when) but that's all it can do. Specialised hardware is faster at its specialisation than non-specific, programmable hardware, all things being equal.

What Cell goes for is a middle ground. It's a generally programmable CPU that can do any job, but it's specialised at processing large amounts of data in bundles (vectors). You don't do one sum at a time on each SPE; you do four. To use this performance, you need to write software that can work that way. This is a headache because 15+ years of programming teaching and experience has been centered on thinking the 'x86' standard-core way. And some things just won't map very well. For a lot of game stuff, the benefits of Cell might not be very applicable.

So at the moment, there's a 'wait and see' situation. If a lot of stuff can be targeted at Cell's way of doing things, its design will allow some incredible performance to rival bigger, more expensive processors. How much that materialises and makes a difference in PS3, we don't know. Predictions are roughly divided into two camps - the optimists who think Cell will really go places, and the doubters who think the benefits of Cell won't make much real difference, either because few will write code that's Cell-optimized (because of cross-platform development, and the cost of per-platform optimizations) or because what Cell can provide isn't that much of an improvement. I.e. an 'ordinary' CPU could handle the same content but in lower quality, with that lower quality not making much difference to the overall experience. Joe Public would probably never notice if an XB360 fabric mesh was a quarter the resolution of a PS3 fabric mesh, for example.
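
To illustrate the "four sums at a time" point, here's a tiny example in plain C. The restructured loop is written so that each group of four independent multiplies maps naturally onto one 128-bit SPE vector operation; real SPE code would use the vector intrinsics, this is just the shape of it:

```c
/* One multiply per iteration vs. four at a time - the kind of restructuring
 * Cell's SIMD units reward.  Plain C for illustration only; on an SPE the
 * second loop's body would become a single vector instruction per group. */
void scale_scalar(float *out, const float *in, float k, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = in[i] * k;              /* one operation per iteration */
}

void scale_four_wide(float *out, const float *in, float k, int n)
{
    /* assumes n is a multiple of 4 and the arrays are 16-byte aligned */
    for (int i = 0; i < n; i += 4) {
        out[i + 0] = in[i + 0] * k;      /* four independent multiplies... */
        out[i + 1] = in[i + 1] * k;
        out[i + 2] = in[i + 2] * k;
        out[i + 3] = in[i + 3] * k;      /* ...one vector op on the SPE    */
    }
}
```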
 
Cell is quite a specialised processor, which means in some areas it performs fantastically well. Consider a custom processor in a DVD player, say. It can decode movies far quicker than a CPU (of way back when) but that's all it can do. Specialised hardware is faster at its specialisation than non-specific, programmable hardware, all things being equal.

What Cell goes for is a middle ground. It's a generally programmable CPU that can do any job, but it's specialised at processing large amounts of data in bundles (vectors). You don't do one sum at a time on each SPE; you do four. To use this performance, you need to write software that can work that way. This is a headache because 15+ years of programming teaching and experience has been centered on thinking the 'x86' standard-core way. And some things just won't map very well. For a lot of game stuff, the benefits of Cell might not be very applicable.

So at the moment, there's a 'wait and see' situation. If a lot of stuff can be targeted at Cell's way of doing things, its design will allow some incredible performance to rival bigger, more expensive processors. How much that materialises and makes a difference in PS3, we don't know. Predictions are roughly divided into two camps - the optimists who think Cell will really go places, and the doubters who think the benefits of Cell won't make much real difference, either because few will write code that's Cell-optimized (because of cross-platform development, and the cost of per-platform optimizations) or because what Cell can provide isn't that much of an improvement. I.e. an 'ordinary' CPU could handle the same content but in lower quality, with that lower quality not making much difference to the overall experience. Joe Public would probably never notice if an XB360 fabric mesh was a quarter the resolution of a PS3 fabric mesh, for example.

As I read your post, I understand it now. Thanks.
 
Joe Public would probably never notice if an XB360 fabric mesh was a quarter the resolution of a PS3 fabric mesh, for example.

I know, Shifty, that you probably just used a random number in there to make a point, which is fine. But I wanted to add an addendum to this! Even with tools like Edge, it remains to be seen if a situation like the above will ever occur. Certain tasks, such as skinning 2+ million vertices on the GPU, are not recommended on RSX, hence why 'skinning' was one of the bullet points in the Edge talks. But the 360 GPU has no trouble ripping through skinning, so it's totally feasible to just let it handle it in your game. In other words, just because Edge lets you do x, y and z on SPUs, that doesn't mean you have to bother with any of that on the 360 CPU; you can just let its GPU take care of it. So, we'll see if the 1/4 mesh situation ever actually happens ;)
 
Cell is about as powerful as an 8-core Conroe for games processing.

Amdahl's law says nein.

It's certainly possible to construct a workload that technically counts as a game in which this would be true, but while the 8-core Conroe should see a fairly linear increase in performance (assuming the tasks are trivially parallel, so the code isn't stopping either Cell or an 8-core Conroe from being fully utilized), Cell will ultimately be limited by the performance of its PPU on the serial portions.
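
To put a rough number on the Amdahl point, here's a quick back-of-the-envelope calculation; the 90% parallel fraction is purely an illustrative assumption, not a measured figure for any game:

```c
/* Amdahl's law: speedup = 1 / ((1 - p) + p / n), where p is the fraction of
 * the work that parallelises across cores/SPEs.  p = 0.90 is illustrative. */
#include <stdio.h>

int main(void)
{
    double p = 0.90;                            /* assumed parallel fraction  */
    for (int n = 1; n <= 8; n *= 2) {
        double speedup = 1.0 / ((1.0 - p) + p / n);
        printf("%d-wide: %.2fx\n", n, speedup); /* 8-wide gives ~4.7x, not 8x */
    }
    return 0;
}
```

So even with 90% of the work parallelising perfectly, the serial remainder caps an 8-wide chip at under 5x - which is the sense in which the serial parts fall back on the PPU.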

Now then, an 8-core Conroe would also have many times the silicon that Cell has dedicated to it, not to mention more cores outright.

In all honesty, cost-wise Cell shouldn't be compared to anything beyond a dual core, and "games" is far too generic a concept to say which would be faster. Games as they are right now would probably show even a single-core top-end x86 CPU beating Cell, but it's easy to throw in more of the kind of things Cell is good at and send the x86 CPU screeching to a halt. Vice versa as well, though just by the nature of its design I think Cell could deal with its worst-case scenario better than an x86 CPU could. Smart game design should really play to the strengths of the processor and not leave one with the impression that the game is just barely chugging along, trying to do things the system isn't meant for.

However, it could turn out that the things Cell is good at are mostly non-critical to gameplay and easily removed to improve performance, while the things an x86 CPU is better at for games aren't as easily removed or cut down. Essentially, it could turn out similar to graphics cards and CPUs now: graphics cards improve at a much faster rate, yet the exact same game can be played almost identically on graphics cards that differ by an order of magnitude in performance, whereas even half the performance on a CPU can render a game unplayable. There's a certain minimum speed needed for a CPU to do everything a game needs done; anything above that is pretty much wasted (assuming we have a framerate target) and anything below that leaves the system critically lacking in resources.
 