Cellebrate Car AcCelleration *spin-off*

A one-legged driver wants to slit his wrists at the thought of driving manual, while the two-legged driver is asking what the problem is. (just kidding)

If the cancelled Cell2 from IBM was supposed to have direct memory instructions in addition to the local store DMA, would it have solved everyone's problem or is there more?
I think it was pointed out by people cleverer than me that the STI alliance kind of drove themselves into a dead end when it comes to backward compatibility, if the Broadband Engine were ever to be modified.
Take this with a grain of salt (or search the existing threads for first-hand information with matching analysis and well-constructed points of view), but I seem to remember that existing code is effectively bound to the latency of accessing the LS; change that latency and things break.
So if you move away from the LS+DMA approach, you pretty much jeopardize existing code. Keeping that code working could require quite a bit of legacy hardware and a pretty constrained design.
On top of all the other architectural shortcomings that have been pointed out since it was released, I think this is quite a bothersome one: there is no real latency-hiding mechanism, the code is instead built around the low-latency LS, and if you change that...
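To make "bound to the LS+DMA model" concrete, here is a minimal sketch of the classic double-buffering pattern SPU code was written around. The intrinsic names come from the Cell SDK's spu_mfcio.h; the chunk size and the process() kernel are made-up placeholders. Once code is structured like this, the LS latency and the DMA model are baked into its shape, which is why changing them retroactively is not just a performance tweak.

```c
/* Hedged sketch of SPU double buffering (Cell SDK style, spu_mfcio.h).
 * CHUNK, the buffer layout and process() are illustrative only;
 * total is assumed to be a multiple of CHUNK. */
#include <spu_mfcio.h>

#define CHUNK 4096                       /* bytes per DMA transfer (illustrative) */

static char buf[2][CHUNK] __attribute__((aligned(128)));

extern void process(char *data, unsigned size);   /* hypothetical compute kernel */

void stream_from_main_memory(unsigned long long ea, unsigned long long total)
{
    unsigned int cur = 0, next = 1;

    /* Kick off the first transfer on tag 0. */
    mfc_get(buf[cur], ea, CHUNK, cur, 0, 0);

    for (unsigned long long off = 0; off < total; off += CHUNK) {
        /* Start fetching the NEXT chunk before touching the current one. */
        if (off + CHUNK < total)
            mfc_get(buf[next], ea + off + CHUNK, CHUNK, next, 0, 0);

        /* Wait only on the current chunk's tag, then compute on it.
         * The compute below overlaps with the DMA still in flight above. */
        mfc_write_tag_mask(1 << cur);
        mfc_read_tag_status_all();

        process(buf[cur], CHUNK);

        cur ^= 1;                        /* swap buffers */
        next ^= 1;
    }
}
```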

I would also point to nAo's posts about why STI settled on only 4-wide SPUs. He thought it was weird that they went with something that narrow (you can't get much narrower for vector processing).
Looking at Intel's architectures and the jump from Westmere to Sandy Bridge, I wonder myself
(not to put myself on the same level as nAo or the other POVs you may find if you search the forum's numerous threads on the matter, sometimes discussed in relation to Larrabee, FYI)
whether STI should have gone at least for something akin to the AVX introduced in Sandy Bridge. The jump in transistor count (IGP aside) between Westmere and Sandy Bridge isn't that significant, and with that light increase Intel doubled FP throughput through AVX. So I wonder if, for a non-prohibitive cost in silicon, STI could have gotten the SPUs to handle 8-wide vectors in FP mode and 4-wide vectors for integer work.
That would have granted a neat gain in throughput.
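As a rough illustration of the 4-wide vs 8-wide point (x86 intrinsics rather than SPU code; the function names, the factor of 2.0f and the alignment assumptions are mine), the same loop body handles twice as many floats per iteration in its AVX form:

```c
/* Illustrative only: 4-wide SSE vs 8-wide AVX doing the same multiply-add.
 * n is assumed to be a multiple of 8 and the pointers 32-byte aligned. */
#include <immintrin.h>

void scale_add_sse(float *dst, const float *a, const float *b, int n)
{
    for (int i = 0; i < n; i += 4) {                 /* 4 floats per iteration */
        __m128 va = _mm_load_ps(a + i);
        __m128 vb = _mm_load_ps(b + i);
        _mm_store_ps(dst + i, _mm_add_ps(va, _mm_mul_ps(vb, _mm_set1_ps(2.0f))));
    }
}

void scale_add_avx(float *dst, const float *a, const float *b, int n)
{
    for (int i = 0; i < n; i += 8) {                 /* 8 floats per iteration */
        __m256 va = _mm256_load_ps(a + i);
        __m256 vb = _mm256_load_ps(b + i);
        _mm256_store_ps(dst + i, _mm256_add_ps(va, _mm256_mul_ps(vb, _mm256_set1_ps(2.0f))));
    }
}
```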

Overall I believe that design by committee killed the chip; it is not clear to me that the chip had a single clear purpose, while at the same time it came with a pretty constraining memory model.
I wish they had pulled a "wannabe Larrabee", but that was never an option: too many different views on what the chip would/should be or be used for (with respect to the tech), IBM was adamant about PowerPC being used somehow, the others had their own views, and so on.

Edit
More geek ranting... I would also think that Sony spread itself a tad thin by working on both the Broadband Engine and the cancelled GPU.
That is more a geeky sweet dream than anything else, but I wish they had committed clearly to a software-based approach and worked on a matching solution, most likely using two of the resulting chips head to head as in the Cell blades.

I don't know what was achievable, I can only dream about it. Clearly the silicon budget for something like Larrabee was not there, but I wonder (putting aside all the constraints due to the alliance of different parties with different POVs) whether they could have designed what I called earlier a "wannabe Larrabee".
As an example of the worthless questions that plague my mind: could the Cell have been based on multi-threaded (4-way) VPUs with a width matching AVX units, L1 instruction and data caches, no standard CPU at all, and a shared pool of scratchpad memory? IBM did not use eDRAM in its CPUs at that time (on their 90nm lithography), but could they have? (So, more scratchpad memory.)
The thing would have operated at a pretty high clock speed. For some reason my mind insists on telling me that within the Cell's silicon budget (or close to it) there could have been 6 such VPUs and an unknown amount of scratchpad memory (when the internal broken guesstimator fails to deliver, it is really a bad sign... and imagination has gone out of control, lol).


I'm pretty proud of that one, it is really worthlessness at its peak... with enough "ifs" one could put Paris into a bottle... :LOL:
Sorry, but the name of the thread called for that kind of crap (good one, Alstrong).
 
I remember an insider claiming that Toshiba was there with a single purpose in mind, a stream processor for codecs, so it seems they were the winners in the committee. But an ASIC is so much better for that (the argument was that it takes years to make an ASIC, while the Cell was available right away for any new codec on the spot). Toshiba didn't need wide vector FP performance, nor any complex random memory access. Sony needed the same things as Toshiba (for Blu-ray and media playback capabilities), but they also had needs similar to IBM's for vector processing and general computing. It looks like Toshiba got every detail they needed, while IBM and Sony got an imperfect solution.

I'm curious what would have happened with a Sony+IBM-only design. Obviously, since this time they didn't go with IBM, they weren't satisfied with the Cell roadmap or the A2, but IBM cancelled the Cell very late, and maybe that was around the same time Sony decided not to pursue that solution and to go with AMD.
 
Decoding H.264 at 40 Mbit was supposed to be impossible on this "crappy" Cell (someone even wrote a paper about it), and it ran circles around that claim with two streams in 3D plus the lossless DTS audio codec, with zero dropped frames no matter what you threw at it. The most expensive PC CPU couldn't dream of it at the time, and neither could the 360, the "most balanced system ever". It's also a bad idea to compare GCN from 2012 with a chip that was produced in 2005, when GPU compute was very limited. Looking at the 2005 GPU memory model, is that better than the Cell's?
Whaaat? Do you have a link to this paper that says you can't decode a high bitrate H.264 stream on a Cell? Because it'd be full of crap. Cell was _designed_ to handle high bitrate streaming codec operations; it's practically a savant at it. Sure, it might have had some issues with CABAC, but those were obviously worked out. Toshiba engineers once showed me a demo of a single Cell decoding 48 simultaneous SD MPEG2 streams (admittedly no CABAC), then scaling and tiling them onto a 1080p frame. According to them, they didn't even have to work too hard to get there. Here's a link. They originally wanted to put a Cell into every TV, because it was so good at decoding video streams (not so good at reducing cost, though...).

As to the 360 not being able to decode high bitrate H.264: it decodes it fine. In fact it can do a second stream simultaneously, and that's using only 2 of the 3 cores (and the GPU). And the 360 GPU had full access (both read and write) to all of RAM, using MemExport. That's why we could decode H.264 efficiently in the end. One of the engineers involved speculated that with more time he could have fitted a third stream in there, but since it wasn't necessary, we stopped optimizing at exactly the point where we could meet the spec.
 
Thanks :smile:

It was a long time ago, right before the PS3 launch, maybe I remember it with opaque black colored glasses.
http://rsim.cs.illinois.edu/Pubs/sasanka-phd-thesis.pdf
This study was being quoted to draw the conclusion that the Cell didn't have the required computational power, because it shows that H.264 needs 12 times more processing power than MPEG2. (The study doesn't actually make that conclusion itself, it only explains what is needed to decode H.264 versus MPEG2; someone else made that leap, which I can't personally make.)

I know it's full of crap, that's my point. It was plastered all over the internet just in time for the PS3 release, and everyone and their mother started saying the Cell failed at its most core advantage, using the above study as "proof". It was bull. It makes it difficult for someone like me to find credible information to draw a conclusion from, based on anything other than the results, because the results at least I can judge as something absolute. So when someone says the Cell can't use more than a third of its available power, that sounds like the same kind of bias.

There was another claim around that time about 40Mbit plus lossless DTS not being possible with the 360, but it didn't talk about the GPU, it was only about the CPU comparisons. I guess it was bull too.

My own results at the time were that the files I was encoding at 40 Mbit H.264 played perfectly every time on the PS3, while a friend of mine with a 360 said none of them worked.
 
Developer Friendly = Automatic

The Cell = Manual



If the drivers who are used to the automatic tell me the manual is crappy, but I watch a few drivers beat them while driving that manual, I'm going to think that maybe the manual isn't as bad as they made it out to be, and maybe they just need to learn how to drive it better.
I think this is a good analogy.

If that manual is a stripped-down 1992 Honda Civic with a turbocharger and a completely torn-out interior, and the developer-friendly version is an automatic BMW, then sure, you've got the right analogy.
It doesn't matter. Winning/being the best (more features executed simultaneously) is all that matters, from a technical standpoint.

It seems Kool Aid is still the drink of choice seven years later.

I am impressed with what devs have been able to accomplish on PS360, so it seems silly to ignore those same devs when they complain about the hardware.

Perhaps those exclusive games mentioned earlier were good in spite of cell, rather than because of it? Same with the 360 CPU really.
Does it seem equally silly to ignore the devs who are creating masterpieces and NOT complaining about the hardware? That's what's happening here. There have been devs who said a lot of other devs are leaving a great deal of Cell performance on the table by not utilizing certain skills/tools. It goes ignored.

You had nAo and Deano C, who created great techniques on the PS3, yet their words go ignored. Those are the devs you should be listening to. They have actually created groundbreaking games, from a technical standpoint. Yet, as you said, "Kool Aid is still the drink of choice seven years later."

How can games be good/great, technically, in spite of something? That's like saying I can buy cars, buildings, etc, in spite of being broke. It doesn't make sense.

When you judge a runner, you judge them by their best times. When you want a car's 0-60 time, you take the best run. Why do we do that for hardware? It's because of the human factor. Some drivers are better than others, and even the best driver can still do more. You can't get more out of a machine than it is capable of. HOWEVER, you can get more out of a driver/operator who hasn't yet been able to exploit everything the machine has to offer. It seems that's where we are with the Cell/PS3.

Some people will never be able to bring the full potential of hardware to bear (drivers, developers, etc.).
 
IMO your argument is BS; game development has its bright minds for sure, but there are plenty elsewhere, and Cell got discarded on a solid basis by bright minds in the video game industry and elsewhere, including the main brains behind the project (i.e. IBM).

I'm close to thinking that this talk is a remnant of the "lazy devs" bullcrap, and that is not the issue here... it should get pruned.

NB: what tech are you speaking about with regard to nAo? Because his nAo32 HDR implementation has nothing to do with the Cell.
 
Would this be a good car analogy?
The Cell is faster but can only turn left. You must do a 270 to turn right.
 
IMO your argument is BS; game development has its bright minds for sure, but there are plenty elsewhere, and Cell got discarded on a solid basis by bright minds in the video game industry and elsewhere, including the main brains behind the project (i.e. IBM).
That's something I don't understand: why did IBM go as far as starting to fab the successor and then give up in 2010 over "yield issues"? That doesn't sound like IBM was dissatisfied with the Cell approach at all. That the A2 and/or POWER7 are better solutions today is obvious now, but why take Cell2 all the way to starting to fab it, if it was that obvious it wasn't a good path? It would mean the main brains dismissed it over yield issues, not architectural philosophy. Or was it just saving face?
 
Would this be a good car analogy?
The Cell is faster but can only turn left. You must do a 270 to turn right.
Well, in my view it is more of a dragster: really fast in (and designed for) straight lines, not really good for cruising the USA, where you have plenty of different types of roads: highways, standard roads, city streets, shitty falling-apart roads, etc.
Thing is (looking at the results), there were plenty of "straight lines" in a game, or parts that could be made straight (which in technical terms relates to data-parallel problems with really few dependencies), so it had its successes. (Hence nAo's comment that, while criticizing some design choices, it kind of made sense in a console; but IMHO that's a miss for such an ambitious project.)
 
That's something I don't understand: why did IBM go as far as starting to fab the successor and then give up in 2010 over "yield issues"? That doesn't sound like IBM was dissatisfied with the Cell approach at all. That the A2 and/or POWER7 are better solutions today is obvious now, but why take Cell2 all the way to starting to fab it, if it was that obvious it wasn't a good path? It would mean the main brains dismissed it over yield issues, not architectural philosophy. Or was it just saving face?
I never heard of a successor being dismissed over yield issues. Do you have a serious source for that? (There was a lot of noise surrounding the Cell thanks to its presence in a video game system.)
I remember they came out with a DP version of the Cell, but that is all. I suspect they did not make enough money out of it, and as IBM likes high margins, they killed it.
IBM usually sells high-margin products (and the software environment around them); they don't really care much about yields, and production costs are not much of an issue.

IMHO IBM judged Xenon the better bet and built the PowerPC A2, which they use in at least two products (PowerEN and Blue Gene/Q). The chief architect in charge of Xenon got the position to develop that product.
 
My own results at the time were that the files I was encoding at 40 Mbit H.264 played perfectly every time on the PS3, while a friend of mine with a 360 said none of them worked.
Aah, you're talking about the dashboard player; yeah, that one has artificial limits on bitrate and other features. The HD DVD player was a lot more efficient because it could discard about half the H.264 spec. Both Blu-ray and HD DVD use a constrained version of H.264 High Profile. The dashboard player also only supports stereo, while the HD DVD player supports all the relevant audio codecs.

When HD DVD launched, there was no encoder capable of real-time, high quality HD H.264 encoding. Studios were using clusters to try to get close to real time so they could tune the encodes per scene. Nowadays the Core i3 in your laptop has a hardware real-time 1080p H.264 encoder in it. Quite fun to think about how far it's come, and no one seems to have noticed.
 
I really hate commenting on Cell because it always sounds like whining.

I've said before that it was more evolutionary than revolutionary; it's a very clear incremental step from the VUs in the PS2, which leads me to believe it was very much Kutaragi's baby.

What I've always thought was the mistake was the assumption that vector workloads were the dominant problem in games. Sure, you do a lot of vector ops, and probably half your runtime is doing nothing but them.

It was an issue in the PS1 era, because the game logic was so simple, but of all the games I had to optimize on PS2 and Xbox, on not one did I ever save a lot of time by rewriting vector code. It was always the same: the gameplay code was always >50% of your runtime and was killing you because of the way it accessed memory. It was always death by a thousand cuts, with no clear way to optimize.
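(Purely as an illustration of that point, not code from any real engine: the first routine below is the kind of pointer-chasing, branchy gameplay update that misses cache on nearly every hop, while the second is the contiguous, vector-friendly loop the SPUs and VMX units were built for. All struct and field names are invented.)

```c
/* Illustrative only: why gameplay code hurts the memory system while
 * vector-friendly code doesn't. Types and fields are made up. */
#include <stddef.h>

/* Typical gameplay-style object graph: each update hops through pointers,
 * so nearly every field access is a potential cache (or LS DMA) miss. */
struct Target { float pos[3]; int health; };
struct Weapon { float damage; struct Target *locked_on; };
struct Entity { struct Entity *next; struct Weapon *weapon; float pos[3]; int state; };

void update_ai(struct Entity *list)
{
    for (struct Entity *e = list; e != NULL; e = e->next) {      /* pointer chase #1 */
        if (e->weapon && e->weapon->locked_on) {                  /* chases #2 and #3 */
            if (e->weapon->locked_on->health <= 0)
                e->state = 0;   /* scattered, branchy, hard to batch or prefetch */
        }
    }
}

/* The workload vector hardware is built for: contiguous arrays, no pointer
 * hops, trivially unrollable and DMA/prefetch friendly. */
void integrate_positions(float *pos_x, float *pos_y, float *pos_z,
                         const float *vel_x, const float *vel_y, const float *vel_z,
                         float dt, int count)
{
    for (int i = 0; i < count; ++i) {
        pos_x[i] += vel_x[i] * dt;
        pos_y[i] += vel_y[i] * dt;
        pos_z[i] += vel_z[i] * dt;
    }
}
```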

I've often wondered what the intent of Cell in a game console was.
What was the expectation, that developers would adapt in some way to the architecture?
I really think all of the shortcomings were glaringly obvious from the outset.
I haven't done an exhaustive review of games by any stretch, but I'd bet the bulk of SPU cycles are being spent on trivially parallel graphics tasks.

The only way I can see to judge it is as part of the platform as a whole, and you ended up with something that shipped 12 months later, at a higher price, and that for most of its lifetime has been "competitive".
 
I never heard of a successor being dismissed over yield issues. Do you have a serious source for that? (There was a lot of noise surrounding the Cell thanks to its presence in a video game system.)
I don't remember where I read it... it could have been noise. My brain has a very bad noise-filtering ability. The official statements don't mention any "official" reason.

What's interesting in their statement was:
"IBM continues to invest in Cell technologies as part of this hybrid and multicore strategy, including in new Power7-based systems expected next year. IBM continues to manufacture the Cell processor for use by Sony in its PlayStation3 and we look forward to continue developing next-generation processors for the gaming market."

Does that mean Sony ditched them later on, or that they both decided to ax the Cell roadmap together, and the statement means IBM was hoping to provide a different solution to Sony?
 
IMO your argument is BS; game development has its bright minds for sure, but there are plenty elsewhere, and Cell got discarded on a solid basis by bright minds in the video game industry and elsewhere, including the main brains behind the project (i.e. IBM).

I'm close to thinking that this talk is a remnant of the "lazy devs" bullcrap, and that is not the issue here... it should get pruned.

NB: what tech are you speaking about with regard to nAo? Because his nAo32 HDR implementation has nothing to do with the Cell.
Of course you would think my argument is BS. So your reasoning is that Cell got discarded for the future, and that's why you won't accept the logic I put forth?

Let's forget that 400 engineers designed this "crappy hardware". Let's forget the budget used to design the "crappy hardware"; I guess that was part of the design goal and it was signed off on because of it. Let's forget all the real-world tests and real-world examples of top performance using this "crappy" Cell. Let's forget the devs who used the Cell processor to create beautiful, breathing game worlds. After all, we don't judge ability by the best that can be done on something; we judge ability by what can be done poorly on it, right? I mean, that makes sense, right?

And nAo's HDR implementation is more about the spirit of Cell programming: when most other devs said it was impossible, he used his mind to make it possible. It was man vs. machine again, and man won. However, I was talking about the other comments he has made about Cell programming on this board; I've quoted some of them here a couple of times. Then there's Deano C and T.B.

Of course, this is all forgotten/ignored. Then the excuses appear.
 
Of course you would think my argument is BS. So your reasoning is that Cell got discarded for the future, and that's why you won't accept the logic I put forth?

Let's forget that 400 engineers designed this "crappy hardware". Let's forget the budget used to design the "crappy hardware"; I guess that was part of the design goal and it was signed off on because of it. Let's forget all the real-world tests and real-world examples of top performance using this "crappy" Cell. Let's forget the devs who used the Cell processor to create beautiful, breathing game worlds. After all, we don't judge ability by the best that can be done on something; we judge ability by what can be done poorly on it, right? I mean, that makes sense, right?

And nAo's HDR implementation is more about the spirit of Cell programming: when most other devs said it was impossible, he used his mind to make it possible. It was man vs. machine again, and man won. However, I was talking about the other comments he has made about Cell programming on this board; I've quoted some of them here a couple of times. Then there's Deano C and T.B.

Of course, this is all forgotten/ignored. Then the excuses appear.
I'm not sure you realize how fanboyish your "writing tone" is; it sounds almost as if you had personal feelings for a slab of silicon.
With regard to nAo32, it happens on the GPU if memory serves right... (i.e. instead of RGBA, so 8 bits per channel including transparency, you have 8 bits each for U and V (a two-dimensional chroma space) and 16 bits for the logarithm of the luminance).
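(For the curious, here is a rough sketch of what that kind of RGBA8-footprint HDR encoding looks like. The luminance weights, log range and chroma mapping below are plausible stand-ins of my own, not the actual nAo32 constants, and a real implementation would live in a pixel shader rather than C.)

```c
/* Rough sketch of a LogLuv-style HDR encode into an 8:8:16 layout.
 * All constants here are illustrative stand-ins, NOT the nAo32 ones. */
#include <math.h>
#include <stdint.h>

typedef struct { uint8_t u, v; uint16_t log_lum; } HdrTexel;

static HdrTexel encode_logluv(float r, float g, float b)
{
    HdrTexel t;

    /* Luminance from linear RGB (Rec.709-style weights, assumed). */
    float Y = 0.2126f * r + 0.7152f * g + 0.0722f * b;
    if (Y < 1e-6f) Y = 1e-6f;

    /* Store log2(Y) remapped into 16 bits (the [-16, +16] range is arbitrary). */
    float n = (log2f(Y) + 16.0f) / 32.0f;
    if (n < 0.0f) n = 0.0f; else if (n > 1.0f) n = 1.0f;
    t.log_lum = (uint16_t)(n * 65535.0f);

    /* Chrominance: luminance-independent ratios squeezed into 8 bits each. */
    float sum = r + g + b + 1e-6f;
    t.u = (uint8_t)(255.0f * (r / sum));
    t.v = (uint8_t)(255.0f * (g / sum));
    return t;
}
```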

You decide to ignore all the other POVs about the architecture; that is your problem.
By the way, I haven't said it is "crappy", though from reading other POVs, pro and con, I would say that as a CPU (so in the general case, not specifically the PS3 case) the memory model cripples the design. So does the absence of a latency-hiding mechanism, which constrains the evolution of the architecture: you might want to keep the LS latency at 6 cycles no matter what.

Back to nAo, I'm not sure what you are speaking about. I think it was in the Cell V.2 thread (it was a long while ago), but he criticized quite a few of the design choices made by STI (I remember criticism with regard to the absence of an I$, of some multi-threading, and of the width of the SPUs).

With regard to man winning vs. the machine, Watson and Terminator jokes aside, I think man hopefully wins, but not in the way you are thinking.
Man managed to harness the power of the Cell in the PS3; otherwise the games that seem, to you, to be proof of the architectural soundness of the design would not have run as they did.
And man won again by designing newer slabs of silicon that are closer to their requirements, be it more programmable GPUs, many-core CPUs, or dedicated pieces of hardware.
You make it sound like the Cell's failure is a missed opportunity for mankind, which is IMHO a bit ridiculous. It did the job in the PS3 (when you think of the investments in both hardware and software, one may say it had better get the job done). Elsewhere, all the actors involved abandoned it, notwithstanding the big investments they made in the hardware and software; Sony went as far as giving up on backward compatibility. That should speak volumes, unless you think that all the actors involved (so IBM, Sony, Toshiba, and all the customers that found no interest in the product) were dumbasses, which is quite a stretch / ridiculous.

I'm more sorry for Larrabee than for the Broadband Engine. I don't think Larrabee sucked that much; it allowed new stuff to be done (things that GPUs still don't do well). The chip was not perfect, but more importantly there was no market for it, as the GPU market is tied to APIs that throw most of Larrabee's advantages out the window.
I read in the thread about software rendering that it was actually performing at the level of a GTX 280, not too bad when you think it was in fact emulating a GPU (or the graphics pipeline as defined by the API).
 
Yes, by all means, talk about my "tone". Let's talk about anything except the fact that the things in my post actually happened/happen... here. Better yet, let's just dismiss them. ;)

Come to think of it, when was the last time the "greats" (Deano C, nAo, and T.B.) posted in the console forum? I still remember Christer Ericson trying to explain how the camera system worked in God of War 3. I remember how well that went, too.

I'll help you with the nAo stuff.

http://forum.beyond3d.com/showpost.php?p=1116934&postcount=388

There are tons of posts from T.B. in the "Was Cell Any Good?" thread, to be ignored.

Anyway, Cell was a technical powerhouse for those with the skills to wield it. That ability has now been mostly incorporated into modern and future GPUs. It has helped force better game code/design throughout the developer community (even if only a little on the whole). It took Intel a long time to surpass Cell in a few areas. Have Intel's consumer CPUs been able to surpass Cell in every area yet, 6 years later?

Consumers don't buy things for any number of reasons. I didn't buy a Mercedes E-Class when they pushed new models out, and I was a potential customer. It's not like I didn't buy it because it wasn't a great product. Do you understand?

Random thought: I still don't think I've seen an implementation of Sony's MLAA on another platform. I wonder if I will see it on another platform this generation.
 
Wow. You sure like to drop names, don't you? Imagine what those "greats" could have done with a decent architecture?

It must make you really sad that cell is dead.
 
Yes, by all means, talk about my "tone". Let's talk about anything except the fact that the things in my post actually happened/happen... here. Better yet, let's just dismiss them. ;)

Come to think of it, when was the last time the "greats" (Deano C, nAo, and T.B.) posted in the console forum? I still remember Christer Ericson trying to explain how the camera system worked in God of War 3. I remember how well that went, too.

I'll help you with the nAo stuff.

http://forum.beyond3d.com/showpost.php?p=1116934&postcount=388

There are tons of posts from T.B. in the "Was Cell Any Good?" thread, to be ignored.

Anyway, Cell was a technical powerhouse for those with the skills to wield it. That ability has now been mostly incorporated into modern and future GPUs. It has helped force better game code/design throughout the developer community (even if only a little on the whole). It took Intel a long time to surpass Cell in a few areas. Have Intel's consumer CPUs been able to surpass Cell in every area yet, 6 years later?

Consumers don't buy things for any number of reasons. I didn't buy a Mercedes E-Class when they pushed new models out, and I was a potential customer. It's not like I didn't buy it because it wasn't a great product. Do you understand?

Random thought: I still don't think I've seen an implementation of Sony's MLAA on another platform. I wonder if I will see it on another platform this generation.

Uhhh,
1. Intel invented MLAA, not Sony.
2. Sony is unlikely to give out their version of MLAA to non-Sony devs.
3. Even if they did give it out, it would be programmed to run on Cell, not what everyone else uses for post AA (the GPU).
4. Most devs seem to like FXAA more.
5. A few 360 games do use MLAA.
 
Kool Aid aside (from both ******* sides), does anyone have an answer to this question?

I can't find a reliable source for the actual reason they scrapped the whole thing so late in the process. Both seem to have abandoned it in November 2009. Did Sony and IBM decide to ditch the Cell together, or did one company ditch it first and the other have to follow?
 