Today GPGPU and Xeon Phi are beating everyone of course, but GPGPU isn't nearly as flexible as the Cell was; it was a real CPU.
No. Cell is not a real CPU. Or rather, it's a real CPU, but one comparable to the high-tech wonders of the mid-80s. Being able to directly address only a few hundred kB of local pool is not some minor detail you can forget in the margin; it defines Cell. Memory access is in general more important than computation these days, and the memory architecture of Cell means that the SPEs are much less real CPUs than, say, the shader arrays in GCN, which can directly address the global pool.
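To make that concrete, here is a rough sketch of what "can't directly address the global pool" means in practice on an SPE. It's a minimal sketch, assuming the standard mfc_* intrinsics from spu_mfcio.h; the buffer name, chunk size, and the commented-out do_work() are purely illustrative:

```c
/* SPU-side C (built with spu-gcc).  Illustrative only: ls_buf, CHUNK and
 * do_work() are made up; the mfc_* calls are the standard SPU intrinsics. */
#include <spu_mfcio.h>
#include <stdint.h>

#define CHUNK 16384   /* 16 kB: the largest single DMA transfer */

static volatile uint8_t ls_buf[CHUNK] __attribute__((aligned(128)));

void touch_main_memory(uint64_t ea)   /* ea = effective address in main RAM */
{
    const unsigned tag = 0;

    /* Loads and stores on an SPE only see the 256 kB local store, so you
     * can't just dereference ea -- you have to DMA the data in... */
    mfc_get(ls_buf, ea, CHUNK, tag, 0, 0);

    /* ...and explicitly wait for the transfer to land. */
    mfc_write_tag_mask(1u << tag);
    mfc_read_tag_status_all();

    /* Only now is the data addressable.  A GCN shader array or any ordinary
     * CPU core would simply have issued a load against the global pool. */
    /* do_work(ls_buf, CHUNK); */
}
```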
The PPE of course is a real CPU, but it's also pathetic on its own.
But the Cell dominated the Green500 supercomputers for years. It took 5 years for someone to beat its performance per watt, and that was the BlueGene/Q prototype in 2010. Even then, the BlueGene/Q had to be downclocked significantly in order to beat the Cell.
If it was so good, then why didn't people want to buy it? IBM sold less than 2 PF of Cell supercomputers in total, and every one that got shipped was either heavily supported by IBM or sold to the US Govt before the characteristics of Cell were well known. It certainly wasn't expensive, what with IBM desperately trying to push it on everyone. If it was cheap and very efficient, then why didn't everyone jump at the chance?
The answer, of course, is that while it got good numbers on synthetic benchmarks with completely predictable memory access, on real code it's really bad. Bad per watt and just bad in general. There are a few things it's really good at, but in general, the things supercomputers are used for can rarely be split into nice small 64 kB chunks that you can process in parallel and independently. If they could, Cell would still be a very good system and every major operator would still be begging IBM to take their money.
IBM learned from their mistake, and now their top-end entry for supercomputers is the PPC A2, which ditched the thing that made Cell what it was and has proper memory access from every thread. And those, people actually want to buy.
It seems to have served its purpose perfectly, and led the way to the accelerated processing revolution years later.
I really don't think there is any way that Cell can be said to have led to an accelerated computing revolution. Various vector processors and other accelerated processing have existed since the beginning of high-end computing, and modern GPGPU has much more in common with, say, the Fujitsu Numerical Wind Tunnel than with Cell, both in architecture and in programming model.
I would have thought that gaming devs would prefer a very powerful OoO PPE combined with lots of SPEs instead of having to do GPGPU code.
A GPGPU platform with a shared memory space is vastly, vastly more comfortable to code for than one where you have to split your problem into nice small chunks and DMA them to the processing elements before they are needed. One of these just requires you to tolerate a little more latency (which is almost perfectly masked by the many threads in flight); the other requires you to orchestrate a careful dance of data. All it takes is one unpredictable read from a megabyte-sized pool to really, really ruin your day.
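For a feel of what that dance looks like, here's a sketch of the classic double-buffered streaming loop you end up writing on an SPE, next to the loop you'd write against a flat address space. Again a sketch, not anyone's production code: the mfc_* intrinsics are the real ones from spu_mfcio.h, but the buffer layout, tag assignment and the trivial "scale by two" workload are placeholders of mine.

```c
#include <spu_mfcio.h>
#include <stdint.h>

#define CHUNK 16384   /* 16 kB per DMA, the hardware maximum */

static volatile float buf[2][CHUNK / sizeof(float)] __attribute__((aligned(128)));

/* Flat address space (GPGPU or an ordinary CPU): the hardware hides the
 * latency for you, the whole thing is just
 *
 *     for (size_t i = 0; i < n; i++)
 *         data[i] *= 2.0f;
 */

/* Cell version: you schedule the memory system by hand. */
void scale_stream(uint64_t ea_in, uint64_t ea_out, unsigned nchunks)
{
    unsigned cur = 0;
    mfc_get(buf[0], ea_in, CHUNK, 0, 0, 0);          /* prefetch chunk 0 */

    for (unsigned c = 0; c < nchunks; c++) {
        unsigned nxt = cur ^ 1;

        if (c + 1 < nchunks) {
            /* make sure the other buffer's previous write-back has drained,
             * then start fetching the chunk after this one */
            mfc_write_tag_mask(1u << nxt);
            mfc_read_tag_status_all();
            mfc_get(buf[nxt], ea_in + (uint64_t)(c + 1) * CHUNK, CHUNK, nxt, 0, 0);
        }

        /* wait for the chunk we're about to touch */
        mfc_write_tag_mask(1u << cur);
        mfc_read_tag_status_all();

        for (unsigned i = 0; i < CHUNK / sizeof(float); i++)
            buf[cur][i] *= 2.0f;

        /* push the result back out to main memory */
        mfc_put(buf[cur], ea_out + (uint64_t)c * CHUNK, CHUNK, cur, 0, 0);
        cur = nxt;
    }

    mfc_write_tag_mask(3u);                          /* drain the last puts */
    mfc_read_tag_status_all();
}
```

And that's the easy case, a perfectly sequential stream. The moment the access pattern stops being predictable, there's no "next chunk" to prefetch and the whole scheme falls apart.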
Devs in general hate Cell. I personally hate it with a burning passion. Ever meet some old-timer PS3 devs? Buy the one with prematurely gray hair a beer and you'll get to listen to vitriol-filled horror stories about trying to fit normal processing into the programming model of the Cell. Sure, we'll use it if it's the only game in town, and we'll even grumpily admit that it's actually really good at a narrow set of tasks, but I don't think I've ever heard anyone who has actually programmed on it say they'd prefer it to anything other than having their fingers amputated. And even that would probably make a lot of us pause and think.