Cell's dead, baby. Cell's dead. Even in its final form.

Also, if we look at the history of GPGPU, it had quite a shaky start too. In the first few years, many people were still skeptical about whether GPGPU was or would ever be useful. Many dismissed GPGPU as something only useful for specialized tasks, with no real impact on more consumer-oriented workloads. Some may argue that's still true, as most GPGPU applications remain in the scientific computing and AI realms. So, in an alternative history, it's quite possible that a Cell-like architecture would have been limited to similar scopes.

A big problem with GPGPU is that you can't really write platform-independent code. OpenCL never really took off, and the only really good framework is CUDA, which is tied to Nvidia.
 
We aren't discussing which was the more capable system; that's subjective and a different subject, since both consoles had their own sets of pros and cons. We aren't discussing whether power defines who the market leader is going to be either. Again, that's a different subject.
We aren't discussing whether Sony's overall choices for the PS3's architecture in general were the best either. We all agree that the RSX was weaker and that the CPU was a pain. So I am not sure what you are arguing.

We are only discussing the capabilities of the Cell processor and the circumstances under which it was properly or improperly utilized. In general, the PS3 punched above its weight despite its tangible weaknesses compared to the 360 and the various false notions about Cell. And it was all thanks to the CPU that, later in its life, the PS3 performed so close, in many cases equal, and on some rare occasions better, while it put out AAA titles that set new standards in console gaming visuals more often than the 360 did.

I have no idea what you're arguing here; the topic is Cell and how dead it is. I'm giving arguments as to why that is in the gaming space, though the architecture basically found no use anywhere else either.

The raw data from paper specs, performance tests, and benchmarks, coupled with an understanding of a game's data and processing workflow (or indeed any CPU-optimised workload), seeing how it can be parallelised and streamed. Firstly, your CPU has a peak amount of maths it can do. Then it gets bottlenecked trying to keep its ALUs busy. If you can keep them busy, the CPU with the most ALUs wins. The case you are wanting, a game 10x better on PS3 than XB360, you will never find, so you have to understand what's actually happening with the software, data, and processing to see where Cell's strengths lie and why it was attempted in the first place.

Paper specs, performance tests, and benchmarks are nice, though I usually look further than that: what the machine actually performed like in games. I'm also reading what developers and users who have had hands-on time with the thing have to say. That has more value to me than what people personally think about the CPU and its architecture.

We could have had better results if devs had been able to develop Cell-ideal games too and the software had actually been mapped well to it.

Basically, Cell represents a return to computing's roots, and as such will be intrinsically 'superior' at getting work done, but at the cost of software requirements. Computers work a particular way which is completely alien to human thinking. In computing's infancy, people had to learn to think like a machine. They hand-crafted code to make the machines do things the way the machines liked, to get them to work at their peak. Over time, the exponential increase in processing performance, coupled with a need for more complex software, saw processors designed to accommodate a human way of doing things. Eventually, code development became human-led, based on being easy for developers, rather than led by the machine's requirements. But this is wasteful both in processing time and silicon budget. Now your CPU is hopping around random memory and trying to shuffle which instructions to do when, just to get the developer's code to operate effectively. The result is your chunky processor attaining 1/10th of what it's actually capable of doing. But then you have to write code the old-fashioned way, fitting the machine rather than having the machine work with your human-level thinking - 'difficult to code for'.
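To make that concrete, here's a rough sketch in C of the two mindsets (the names and layout are made up for illustration, not taken from any real engine): the first version chases pointers to individually allocated objects, so the CPU spends its time waiting on memory; the second streams flat arrays, which is the shape of code an SPE, or any SIMD hardware, actually wants.

```c
#include <stddef.h>

/* Hypothetical sketch: the same update written two ways. */

/* "Human-centric": objects reached through pointers. Each access can miss
   cache, so the ALUs sit idle waiting on memory. */
typedef struct Particle { float x, y, z, vx, vy, vz; } Particle;

void update_pointers(Particle **objs, size_t n, float dt) {
    for (size_t i = 0; i < n; ++i) {          /* pointer chase per object */
        objs[i]->x += objs[i]->vx * dt;
        objs[i]->y += objs[i]->vy * dt;
        objs[i]->z += objs[i]->vz * dt;
    }
}

/* "Machine-centric": flat arrays streamed linearly - predictable,
   prefetchable, and easy to DMA into a small local store. */
void update_streamed(float *x, float *y, float *z,
                     const float *vx, const float *vy, const float *vz,
                     size_t n, float dt) {
    for (size_t i = 0; i < n; ++i) {          /* contiguous, vectorisable */
        x[i] += vx[i] * dt;
        y[i] += vy[i] * dt;
        z[i] += vz[i] * dt;
    }
}
```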

Many of the criticisms levied at Cell are only of the time. It was a processor in its infancy, with tools to match. When you look at the tools, such as that guy's video about Time To Triangle and the massive program needed for Cell proving how complicated and 'crap' the SPE was, that simple five-line 'printf' code is hiding a metric fuck-ton of code in the OS, firmware, tools, libraries, etc. The actual amount of code needed for the machine to produce that "Hello World" on screen is possibly millions of instructions. That's been hidden from the dev thanks to tools that have been developed over decades. If you were to compare the raw machine language needed to get the CPU to put something on screen to the machine language needed for the SPE, the difference would be a minuscule percentage more effort.

Now imagine Cell had been around for 15 years. You'd be able to go into VS and include a bunch of standard libraries that do the base work and boilerplate for you, just as happens with the x64 CPU you write for. Not only that, but Cell would have received numerous updates to improve its usability. Now you have a 128-core processor that's faster than a Threadripper but far smaller, because it doesn't have all the legacy machinery designed to make 'human-centric' code run at a decent speed, because all the software on Cell is 'machine-centric'. There's no reason to doubt this, as it's exactly what happened in the GPU space. GPU workloads were designed around streamed data keeping the ALUs occupied, and the software was machine-friendly, needing educated engineers to write bespoke shaders. GPU work hasn't had decades of clumsy, branchy code that modern GPUs have to try to run quickly. As such, you can stuff multiple teraflops of ALU into a GPU's silicon budget and get it to do useful work, thanks to not hopping around randomly in memory and executing instructions in arbitrary orders. Just as GPUs are faster at data-driven workloads than CPUs (and even CPUs are better at machine-centric work than at human-centric work), so too is Cell in principle.

Now imagine a modern data-driven engine on PS3, maybe using something like SDFs: new concepts that could have been supported on PS3 but weren't in the developer toolset or mindset at that point. The results would have been mind-blowing, and likely not possible on other platforms that lacked the raw power to execute them.
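For anyone unfamiliar with the SDF idea being referenced: a scene is described by a distance function and sampled by marching rays through it, and each ray is an independent, branch-light, maths-heavy job, which is exactly the kind of work you could batch out to SPEs. A minimal, purely illustrative sketch in C (not from any actual engine):

```c
#include <math.h>

/* Signed distance to a sphere of the given radius centred at the origin. */
static float sdf_sphere(float px, float py, float pz, float radius) {
    return sqrtf(px * px + py * py + pz * pz) - radius;
}

/* March along a ray until we hit the surface (distance ~ 0) or give up.
   Returns distance travelled along the ray, or -1.0f on a miss. */
float raymarch(float ox, float oy, float oz,
               float dx, float dy, float dz) {
    float t = 0.0f;
    for (int i = 0; i < 64; ++i) {
        float d = sdf_sphere(ox + dx * t, oy + dy * t, oz + dz * t, 1.0f);
        if (d < 1e-3f) return t;   /* close enough: call it a hit */
        t += d;                    /* safe step: nearest surface is d away */
        if (t > 100.0f) break;     /* marched past the scene */
    }
    return -1.0f;
}
```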

Hard disagree with all of this. It's a lot of ifs, 'imagines', different universes, and what a possible improved Cell could have been. I don't believe an 'imagined Cell' would have been faster than a Threadripper, not even in another universe.
The Cell was a weak CPU, and it was also hard to code for. The RSX was weak, and the badly managed RAM pools didn't help it much either. Nah, sorry, I won't give the Cell or the PS3 much credit, just like DF doesn't, nor do the people who have had hands-on with it (developers and hobby coders). We can all be glad Sony abandoned the idea of trying something similar again with the PS4.
 
I have no idea what you're arguing here; the topic is Cell and how dead it is. I'm giving arguments as to why that is in the gaming space, though the architecture basically found no use anywhere else either.

That will come as news to IBM, who sold a lot of Cell-based server blades, including those in the IBM Roadrunner supercomputer built for Los Alamos National Laboratory. That was decommissioned in 2013 not because it was slow, but because it was simply too energy-inefficient, which was the biggest issue for Cell.

Cell was designed when the options for achieving massive non-symmetric parallel computation were very limited outside of the supercomputer space. You have to remember that when Cell was designed, Intel's Core 2 architecture was years off and symmetric multiprocessing was also rare. IBM (with Toshiba and Sony) got Cell to market really quickly by leveraging PowerPC, and it arrived around the same time as Core 2.

While Cell scaled well in terms of performance compared to almost all other architectures, it was fundamentally a very energy-inefficient design, a situation which only worsened over time, and, for reasons that were never made clear, the design did not node-shrink well. A lot of people learned a lot from what Cell did well and what it did poorly.
 
It's a lot of ifs,
Yes. It's called a 'hypothetical discussion'. There are no real-world examples, so we have to postulate on theory. The value in such discussions, a mainstay of scientific discovery, is whether they are realistic or outlandish. The 'ifs' have to be evaluated on their plausibility to determine whether the proposed outcome is likely or not. Then, if possible, you follow up with experimental research to prove or disprove the theory.
The Cell was a weak CPU, and it was also hard to code for. The RSX was weak, and the badly managed RAM pools didn't help it much either. Nah, sorry, I won't give the Cell or the PS3 much credit, just like DF doesn't, nor do the people who have had hands-on with it (developers and hobby coders). We can all be glad Sony abandoned the idea of trying something similar again with the PS4.

The Cell was capable of 200 GFLOPS sustained in numerous workloads, and actually achieved close to this peak. It was shown doing workloads like realtime raytracing far faster than alternatives of the time. How the hell is that a 'weak CPU'? The question then becomes whether this strong CPU was inefficient at some workloads - strong but specialised, rather than weak. The classic story is that the PPE was weak as a primary core and the SPEs were marginalised for game-based work, giving rise to the overall summary that Cell was a mishmash processor not really good at anything in particular for game development, and that it would have been better to use a traditional multicore processor. The conclusion there, that it'd have been better for PS3 to use a conventional CPU, is undisputed. Yes, in PS3, a different processor with less maths and better IPC would have been better (well, that's actually disputable given the pairing of GPU; Cell looks necessary to support RSX to match Xenos). However, 'was Cell a good choice for PS3?' is a different question from 'was Cell a good CPU design?'

The real argument comes down to whether data-driven game design is better or not. If it is, then the question becomes whether Cell was (notably) more capable at it than other processors of its time. If that's the case, it's proven a 'good processor'. However, that's not a discussion you're equipped to be a part of, so at this point you'll just have to agree to disagree. It's a discussion the dev community still seems to be struggling with, and in fact one that's really a discussion in its own right. It's also one very few people could actually have, because you'd need to be able to map ideas of data-driven design onto the limitations of Cell. How many people in the world are actually positioned to know whether 256 KB of SPE LS is going to be enough to fit the latest fabulous way of doing things? My theory might come crashing down the moment I take my data-driven game engine and find it's 24 KB over LS in every workload!
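For what it's worth, the back-of-envelope check involved isn't mysterious; it's just arithmetic against the 256 KB local store. All the numbers below are assumptions purely for illustration, not measured figures from any real workload:

```c
#include <stdio.h>

int main(void) {
    const size_t local_store    = 256 * 1024; /* SPE local store */
    const size_t code_and_stack =  64 * 1024; /* assumed: SPE program + stack */
    const size_t element_size   =  48;        /* assumed: bytes per work item */

    size_t budget  = local_store - code_and_stack;
    size_t per_buf = budget / 2;              /* double-buffered DMA: one chunk
                                                 in flight while one is processed */
    printf("work items per in-flight chunk: %zu\n", per_buf / element_size);
    return 0;
}
```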
 
Dang, we should've had Cell in today's gaming machines :p
No, because of all the other issues: the problems of supporting an entirely different paradigm in a world of multiplatform titles and an industry founded on thinking and working a certain way. However, the choice of processor for PS4, PS5, and beyond isn't determined by the processor being 'bad' but by what makes sense. The very words "good" and "bad" are basically useless without further clarification of the comparison metrics and the goal they're being measured against!

Putting it another way, is the Metric System 'good' or 'bad'? Objectively, it's 'good' for its purpose, founded on a more practical basis and with interchangeability of measures. Okay, so the US should swap to Metric? Well, doing that would incur significant costs and complexities; it's not really something that can be done overnight with the flick of a switch. Therefore converting to the Metric System could be considered 'bad' and something to avoid. Cell and its paradigm could be the Metric System of processing, with the current industry being the entrenched US economy, unable to switch and only aware of all the problems and costs it would incur. That doesn't make Cell a bad design, nor does it make it a smart choice to use.

In your specific case of Cell continuing in consoles, the workloads Cell would excel at will be moved increasingly onto compute, akin to educating young Americans on the Metric System alongside the entrenched Imperial system so they can adapt to the more efficient ways of working in the sciences. The job of designing problems to fit streamable data and processing them efficiently has moved from the CPU programmers to the GPU programmers, and the work of designing processors to deal with streaming workloads has moved from the CPU engineers to the GPU engineers. The main body of programmers can still write their object-oriented code and data structures, and use optimisations that do the 'least possible work' instead of the 'fastest possible work'. This transition, with all the performance advantages it brings, is graceful in a way Cell wasn't and could never be.

And of course it's all moot anyway, because the future is likely ML and completely different paradigms again, where the workload isn't about solving exact computations as quickly as possible but about minimising approximation errors in AI solvers that are basically guessing at what we should be getting. 😉
 
Can we move off the RSX discussion here, please? Cell wasn't designed as a coprocessor for RSX, and the choice of graphics renderer Cell got paired with in PS3 didn't impact the design or decision-making that went into Cell. It was finalised and in production before the graphics half of the platform was finalised, hence whether Sony could have got G80 or something from ATi or whatever is immaterial to the discussion of Cell's suitability as a (game console) CPU.
 
I'll tell you a console that Cell might have been a good fit for, though... Dreamcast.

All that T&L power, decompression power, culling power... and not a single vertex shader to steal the glory.
 
I wonder how SPUs would do with BVH generation and updating compared to a modern, more traditional CPU.
This is where things get radical, and I vaguely recall Mike Acton coming up against this in a talk, and I don't think he explained it in a way the question-asker liked! We know a BVH is good for speeding things up, as it avoids work. Hence we want to map that onto processors to do that job quickly. If processing that tree then doesn't fit the SPEs, what use are they?

Well, the real question has to start sooner than the pre-existing understanding of the 'correct way'. Why are we using a BVH? To reduce the amount of work we have to do looking up relationships between objects. But is that the best way to get to the answer in the least time? The actual job is 'how do we compare/select all the objects based on locality in the least possible time', or somesuch. A tree is the solution on conventional processors. What if, instead of doing the work to create a subdividing structure representing the relationships and traversing it in branchy fashion, you structure the relationships in a flat format and have to do a lot more work searching, but, because that search-and-compare is now hardware-optimised, it ends up faster? This is exactly how GPGPU developed: addressing the problem from a more open starting position and mapping the data onto the hardware, rather than trying to make the hardware work with data that is not a natural fit.
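As a purely illustrative sketch of the 'flat instead of tree' idea (nothing here is from an actual SPE codebase): instead of traversing a BVH, you test every object against the query in one linear pass over flat arrays. It's more raw work, but every access is predictable and the whole loop streams and vectorises trivially.

```c
#include <stddef.h>

/* Brute-force sphere overlap query over flat, structure-of-arrays data.
   Writes the indices of overlapping objects to out_indices and returns
   how many were found. */
size_t query_overlaps(const float *cx, const float *cy, const float *cz,
                      const float *radius, size_t count,
                      float qx, float qy, float qz, float qr,
                      size_t *out_indices) {
    size_t hits = 0;
    for (size_t i = 0; i < count; ++i) {   /* no tree to chase: just a
                                              stream of compares */
        float dx = cx[i] - qx, dy = cy[i] - qy, dz = cz[i] - qz;
        float r  = radius[i] + qr;
        if (dx * dx + dy * dy + dz * dz <= r * r)
            out_indices[hits++] = i;
    }
    return hits;
}
```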

However, it's not a question that's easily answered. There isn't the ongoing research to really push this, as research is largely commercially led. We get new ideas here and there, but they aren't then explored by a sizeable academic community until they start to get traction. You also get competing ideas, and some get sidelined and not explored as far as others.
 
The Cell was a weak CPU, and it was also hard to code for. The RSX was weak, and the badly managed RAM pools didn't help it much either. Nah, sorry, I won't give the Cell or the PS3 much credit, just like DF doesn't, nor do the people who have had hands-on with it (developers and hobby coders). We can all be glad Sony abandoned the idea of trying something similar again with the PS4.
Cell's design was extremely powerful for its time. It may have been difficult to extract the power from Cell, but that was a learning process. The hardware is capable of what the hardware is capable of. I think if people had known how to maximize the PS3 from day one, it would have made a pretty solid case against the Xbox 360. It's just one of those things that comes down to the fact that developers were still largely of a single-threaded mindset back then. The fact that we have to revisit those types of software design again is sort of a testament that we got back there anyway; the only difference is that we do it through cores instead of SPEs.

The reality is that, to really maximize hardware, developers need to code software specifically designed and tailored to it, which will mean moving away from object-oriented code and code that is written for human logic. But in this day and age, performance seems to matter less than iteration speed. The reality is, the more you want to scale up, the more likely you are to end up adopting some form of coding in which everything is done in parallel at the same time. We can't just slowly iterate through each object and apply logic to it. Single-threaded game code is really about simplification for our minds to handle, but as you can see with DOD and where Unity is headed, there are ways to program in parallel that take significant advantage of the hardware.
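A tiny, hypothetical sketch of that style in C, using an OpenMP pragma only as a stand-in for whatever job system an engine would actually use (the component names are made up): each 'system' is just a loop over flat component arrays, and because each iteration only touches its own slots, chunks of the loop can be handed to any core, or, in Cell's day, any SPE.

```c
#include <stddef.h>

/* Hypothetical component data, stored as flat arrays rather than objects. */
typedef struct {
    float *health;
    const float *damage_taken;
    size_t count;
} HealthComponents;

void health_system(HealthComponents *c) {
    long n = (long)c->count;
    #pragma omp parallel for          /* stand-in for an engine job system */
    for (long i = 0; i < n; ++i) {
        c->health[i] -= c->damage_taken[i];
        if (c->health[i] < 0.0f)
            c->health[i] = 0.0f;      /* clamp here instead of branching elsewhere */
    }
}
```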

There are issues with Cell, but the parallel programming paradigm wasn't one of them; that was a developer problem, and only nearly 20 years later are we starting to see a push back in that direction. It is sad to see UE5 not able to move in this direction just yet, but honestly, so many games are still stuck in that single-threaded loop.
 
Edit: The saddest part is that Mike Acton is a B3D member who actually teamed up with Beyond3D to host the Cell development centre for discussion and best practice. You can check out his content at
I'm not sure what's particularly sad about it =P Things are always changing; people just go with the flow. I think it's great that this site has that type of history. It should be celebrated, not mourned!
 
Narrator: It turns out it wasn't.

Cell was ready, but the dev tools and documentation were not. I've programmed Cell in a large-scale server environment, and like a lot of complex architectures - and what isn't these days - everything hinges on good documentation and good tools. Sony eventually realised this, and it drove them to acquire SN Systems in 2005, but it was too late for games launching in the first couple of years.

Cell didn't work like any other processor design, and initially there were no realtime debugging tools at all; what you had was IBM's Cell emulator, which ran like a dog on hardware of any cost, because accurate Cell emulation means emulating the individual PPU and SPEs, the interactions across the internal Cell bus, and the two external memory buses in PS3, which is why Cell remains a challenge to emulate some sixteen years later.

I spent a lot of time writing code for Cell in a massive server environment, and we had pretty good tools, but we had to rewrite all of our tooling for Cell development from the ground up. It was almost like alien tech arriving, and you had to solve the mystery of how it worked. Getting Cell code to run some calculations 20,000x faster than x86 servers was joyous, but man, it took so much damn effort.
 
Reminds me exactly of my time with bleeding-edge DSPs in the mid-1990s. I would not want anyone to know or share in those experiences at all. Truly dreadful. Don't recommend.
Yup, DSPs were in a very similar situation. Another ephemeral blip of technology that was so different in implementation from anything before it, and which ultimately became just a stepping-stone to more mature technologies that deliver the same result without as many flaws. And before DSPs, it was Transputers. Every radically different technology struggles, because there is so much possibility that it's almost impossible to devise focused development tools.
 