Return of Cell for RayTracing? *split*

It also didn't help that, with the PS3 implementation of Cell at least, developers had to code for three distinct processing elements with very different models between the PPC cores, the SPEs, and the GPU. Without being able to dump one of those elements in the next design, this would likely continue to annoy developers.
Would a second generation of Cell for this gen have ballooned the price point beyond $399?
 
Cell was designed to accommodate traditional processing rather than graphics, so it was considered a strength to have a traditional general-purpose core (PowerPC) for the traditional processing jobs the SPEs were poor at, alongside the SPEs for those things they proved to excel at. RSX was Sony's last resort, which is much of the reason PS3 was that much more complex to design games for compared to 360, which was clearly designed with a unified architecture from the outset.

Frankly, I'm amazed PS3 was not more of a Frankenstein's monster. :yep2:
 
Well this whole line of thinking was started by Shifty speculating that Cell, if it had seen continual development up through today, might have been an interesting architectural choice if the traditional rendering pipeline could be dumped completely. Since that wouldn't have been viable at the start of this gen, they would have had to either add elements of the traditional rendering pipeline to Cell so it could take over for the GPU, or make Cell more viable as a general-purpose CPU. If you don't, you end up with the three distinct processing elements again. I'm very skeptical they could have pulled either of those two things off in a way that would have been more performant than what they got from the APU solution from AMD, and it would have cost a ton more in R&D. This last bit makes it hard to really know whether it was indeed off the table for technical reasons and the AMD solution was just better or, given Sony's overall financial situation and Cell's failure to expand beyond being a niche product (outside of its use in the PS3), it was all about there being a poor ROI in continuing to develop Cell.
 
You have to remember that the variant of Cell that Sony used in PS3 was different to the full-blown processor design that powered supercomputers like Los Alamos's Roadrunner. Cell's PPU/SPE design was (and to this day remains) an attractive architectural approach for certain types of loads, where the mix of general-purpose code running on the main core is proportionately balanced against the highly-vectorised code running on the SPEs - but that's probably not a great match for mainstream computing tasks, which is why personal computers are not engineered like supercomputers.

The biggest barrier for Cell was not the SPE architecture, which could have been improved massively, but the PPU being PowerPC. PowerPC was what really killed Cell as a desktop or high-performance processor.
 
Starting to drift too far OT, so I'll self-moderate here (you're welcome @BRiT). And every thread I found in search that could carry forward discussion of the merits of Cell ended up being locked after running its course. The idea of an alternate-reality Cell as a postmodern rendering device was a newish wrinkle, but the "what Cell got right/what Cell got wrong/why Cell died/how might Cell have lived" discussion has pretty much been done.
 
And it is indeed a pointless discussion as are most what-if discussions, particularly when predicated on the benefit of hindsight.
 
Is the Cell speculation going on here really serious? Maybe for PS6?
Emotion Engine 2 with GS2 and a whole lot of eDRAM, instant buy for me :) Does such an architecture even fit raytracing?
 
Too bad, maybe some other architecture in the future?

Nobody in their right mind would be thinking about changing architectures unless there were overwhelmingly strong cost or technical reasons for it. Why do you want to see a change in architectures?
 
PS2 had unique graphics for its time, thanks to its odd hardware. And of course many think it's kind of cool to have something like the EE or Cell; I do understand though that it's not developer friendly.
 
And as technology gets more complex, developer friendliness becomes ever more important. Cell took a further unconventional leap forward from PS2 that was impactful enough for developers that Sony ruled that approach out for PS4. If you want to hear it from the horse's mouth, Mark Cerny spoke about this very deliberate decision at Gamelab 2013 in his presentation, 'The road to PS4'.

After his introduction and background the next 20 minutes is spent on how things since the original PlayStation got more complicated and PS4 had to reverse that. He even mentions specific technical options (i.e. the approach that could've been) in the presentation. It's crap quality but fascinating nonetheless.

 
Too bad, maybe some other architecture in the future?
I would say learning about the drivers for why Sony left Cell could be a useful history lesson on what type of factors could go into their next-gen device.

Part of it could have been cost. The other part developer ease. From a cost perspective, the $399 price point seems wildly successful and is likely, at least from what I can see, a larger factor for success in terms of sales than developer ease when we are looking strictly at the hardware.

An interesting point here is that if we assume Cell 2 was a possibility for this gen, and at one point in time it must have been considered, then Sony effectively traded big performance for price and developer ease.

That should be a factor people should consider in these predictions.
 
Yeah, understandable, it's between the ears that exotic hardware is so cool... on the other hand I don't care what hw is in there; if the games look and behave next gen, it doesn't matter what kind of hw is under there. A next-gen AMD APU could easily be as good as an EE2/GS2.

Should PS5 be 8~10 TF, that would be about 5x the improvement over baseline PS4 hardware, and games follow the hardware. If that's 5x GoW / The Last of Us 2 / HZD, I wouldn't complain, and that's not even considering CPU/RAM/storage improvements.

For PS2 it was a bit special though, as it gave it some unique gfx effects.
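(Quick sanity check on that multiplier, taking the commonly quoted 1.84 TF for the base PS4: 1.84 TF x 5 ≈ 9.2 TF, which does land inside that 8~10 TF window.)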
 
For what it's worth, for a long time I had always assumed that rasterization and compute would be the be-all going forward, and that by the time we could muster up enough hardware power for RT, there would be so much more power in non-RT methods and approximations that real-time RT would never be a thing. Granted, true native real-time RT is still nowhere close. But the architectures and designs have come out for RT hybrids. Given enough time, we can look at the eventuality and see where things are ultimately headed.

Such a thing could be possible with other unique architectures to solve different problems. We may yet see a re-entrance of those older concepts, but it likely won't be in the cards for a while.
 
Even more ironically - Cell and its derivatives appear to be well suited for RT
-> https://www.sci.utah.edu/publications/benthin06/cell.pdf

so.. maybe a sophisticated dedicated Cell in the PS5 to handle the RT?

Nope. More efficient to just throw hardware in there to do just that and only that. My belief is that as long as you have to have a traditional rendering pipeline and you need it to have maximum performance, you need a GPU; and if you need a GPU anyway, you're going to be better off making that bigger. Cell is not specialized enough to be an efficient GPU (or an even more specialized processor like an RT processor) and too specialized to be a good CPU. There's no more place for it today than there was 5 years ago, and there wouldn't be a place for it in the next machines either. Maybe something like it would work in what comes after that, if we see a transition completely away from rasterization.
 
Edit: There are plenty of funny numbers to be had with this idea. 80 Cell BBEs would fit on the silicon. Assuming no trouble connecting them up, you'd have 80x the attained ~200 GB/s SPE data access across the EIB, so 16 TB/s internal bandwidth. 160 MBs of SRAM local storage on SPEs and another 40 MBs for the PPEs' cache.

The assumption that there's no trouble connecting things up is typically the first thing such designs founder upon. The 200 GB/s bandwidth figure is also optimistic, since it is the maximum bandwidth in the case of the processors on the EIB transferring only to their neighbors. 80 such units would also demand 80x the memory bandwidth, as their capacity to reduce traffic is confined to each individual Cell.
Trying to have a given SPE or PPE work with another core in another Cell would take the ring bus and in the absence of a topology change introduce a hop count far beyond what Cell was designed for and drop bandwidth to 25.6 GB/s (simultaneously throttling a subset of the local ring bus during the transfer).
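Just to make those funny numbers and the catch explicit, here's a quick back-of-the-envelope sketch in C. The 200 GB/s EIB figure, 256 KB of LS per SPE, 8 SPEs per BBE, 512 KB of PPE L2 and the 25.6 GB/s external interface are the published Cell numbers; the 80-chip count is purely the hypothetical from the post above.

#include <stdio.h>

int main(void) {
    const double chips   = 80.0;     /* hypothetical BBE count from the post above */
    const double eib_gbs = 200.0;    /* attained intra-Cell EIB bandwidth, GB/s    */
    const double ext_gbs = 25.6;     /* external interface per Cell, GB/s          */
    const double spes    = 8.0;      /* SPEs per BBE                               */
    const double ls_kb   = 256.0;    /* local store per SPE, KB                    */
    const double l2_kb   = 512.0;    /* PPE L2 per chip, KB                        */

    printf("aggregate intra-chip EIB : %.1f TB/s\n", chips * eib_gbs / 1000.0);      /* 16.0 */
    printf("aggregate SPE local store: %.0f MB\n",   chips * spes * ls_kb / 1024.0); /* 160  */
    printf("aggregate PPE L2         : %.0f MB\n",   chips * l2_kb / 1024.0);        /* 40   */

    /* The catch from the reply above: any SPE or PPE reaching into a different
       Cell leaves the local EIB and is bounded by the ~25.6 GB/s external link. */
    printf("cross-chip ceiling       : %.1f GB/s per chip\n", ext_gbs);
    return 0;
}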

Cell was before its time. It hit consumer hardware before highly parallelised multi-threaded code was prevalent in gaming, and Cell relied on these techniques to shine.
I did see a reference to a DSP architecture that presaged this, the TI TMS320C80 MVP.
Single generalist master core and 4 DSPs. It didn't seem to catch on.
As far as multithreading went, such techniques were needed for other hardware in the 8 years after Cell, and so reusing them would make sense generally without tacking on the complexity of the SPEs, a master core/bottleneck, and lack of caching.

What do we need to get Cell's efficiency from a more traditional x86 architecture? I was thinking Cell was fast because of the LS having ridiculously low (and fixed!) latency. It was literally like old-school DSPs: 256k of registers.

I wonder if it would be possible to modify the Zen architecture to have a local store to emulate Cell, maybe repurpose half the L2 to behave like registers or something.
The Zen architecture's FP physical register file is large enough to contain the SPE's register file. However, without an architectural change there's no encoding that can flatly address that many registers. The FPU itself has enough 128-bit operand ports to match what the SPE could do. The instruction latency for math operations would be better on Zen, although I'm not sure if the permute path can be so readily supported, particularly since Zen lost the more general permute instructions that belonged in Bulldozer's XOP set.

The L2's latency would be at least twice that of the SPE's LS, which would potentially lead to problems with running out of independent work or context space to compensate. Treating the L2 like an SPE register file would be worse in terms of operand bandwidth, whereas the L2 right now would be better than the LS in bandwidth terms.
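For what a software-only experiment might look like (leaving the register-file idea aside), a rough approximation of the SPE programming model on x86 is just a statically allocated 256 KB scratch buffer that you explicitly stage data into and out of, doing all the SIMD work inside it. A minimal sketch, assuming plain SSE and a GCC/Clang-style alignment attribute; the names and chunking are purely illustrative, and an L2-backed buffer obviously won't give you the LS's fixed handful-of-cycles latency.

#include <stddef.h>
#include <string.h>
#include <xmmintrin.h>                /* SSE: 128-bit lanes, same width as an SPE register */

#define LS_BYTES (256 * 1024)         /* SPE-sized "local store" */
static float local_store[LS_BYTES / sizeof(float)] __attribute__((aligned(64)));

/* Explicit get/put standing in for the SPE's DMA engine. */
static void ls_get(void *ls, const void *mem, size_t n) { memcpy(ls, mem, n); }
static void ls_put(void *mem, const void *ls, size_t n) { memcpy(mem, ls, n); }

/* Scale a chunk of floats; caller keeps n a multiple of 4 and
   n * sizeof(float) <= LS_BYTES to keep the sketch short. */
void scale_chunk(float *mem, size_t n, float k) {
    ls_get(local_store, mem, n * sizeof(float));        /* stage in               */
    __m128 kv = _mm_set1_ps(k);
    for (size_t i = 0; i < n; i += 4) {                 /* work entirely in "LS"  */
        __m128 v = _mm_load_ps(&local_store[i]);
        _mm_store_ps(&local_store[i], _mm_mul_ps(v, kv));
    }
    ls_put(mem, local_store, n * sizeof(float));        /* stage out              */
}

The interesting question such an experiment would probe is how much of Cell's win was the discipline (explicit staging, bounded working set) versus the fixed-latency SRAM itself.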


I would say learning about the drivers for why Sony left Cell could be a useful history lesson on what type of factors could go into their next-gen device.

Part of it could have been cost. The other part developer ease. From a cost perspective, the $399 price point seems wildly successful and is likely, at least from what I can see, a larger factor for success in terms of sales than developer ease when we are looking strictly at the hardware.

An interesting point here is that if we assume Cell 2 was a possibility for this gen, and at one point in time it must have been considered, then Sony effectively traded big performance for price and developer ease.

That should be a factor people should consider in these predictions.
There was the other issue that, of the alliance that made Cell (Sony, Toshiba, IBM), only one of them really maintained or advanced a microelectronics division and technology base at that level. IBM, being an expert in coherent SMP systems and leading-edge interconnects, was not a fan of the SPE architecture or philosophy. It was Toshiba's, and to a middling extent Sony's, old-school DSP expertise and lack of capability in the technologies of the day that pushed the comparatively unsophisticated SPE and straightforward LS.

Sony split the difference to create Cell, and its subsequent overreach (it had an entire leading-edge fab built for the architectural revolution that never came) meant this last hurrah of a toolset already showing signs of obsolescence ended in disaster. Toshiba, being the DSP maven, made a Cell derivative without a PPE that nobody went on to care about.

Given what Sony sold off, spun off, or cancelled, it's not clear where it would have gotten the expertise to get a Cell 2. IBM was done with that experiment, and there was another 8 years of advancement devoted to SMP or at least cached SIMD architectures. Sony spun the SPE as being an architectural leap, but many of its underpinnings were from a starting point that was arguably archaic. By the time of the current gen, given where GPUs and CPUs had gone, there was no thriving pool of initiatives revolving around speed demon branch-averse architectures with DMA-based memory access.

Even more ironically - Cell and its derivatives appear to be well suited for RT
-> https://www.sci.utah.edu/publications/benthin06/cell.pdf

so.. maybe a sophisticated dedicated Cell in the PS5 to handle the RT?

From that paper, it appears a big benefit is in the SIMD ISA and the generous register set.
They made some decisions in implementation to work around pain points in the architecture: running a ray tracer in each SPE rather than running separate stages of the ray tracer's pipeline on different SPEs and passing the results along; software caching; workarounds for serious branch penalties; a favorable evaluation of software multithreading for the single-context SPEs.
In effect, the architecture they were trying to write onto the hardware was a SIMD architecture (possibly with scalar ISA elements) with different forms of branch handling, better threading, and caches.
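To illustrate the branch-penalty workaround in the most generic way (this is not code from the paper, just the standard ray/box slab test written with min/max selects instead of conditionals, which is the style of trick the SPE's select instruction makes you lean on):

#include <math.h>

/* Ray vs. axis-aligned box via the slab method, with fminf/fmaxf doing the
   work that data-dependent branches would otherwise do. On an SPE the same
   shape of code would use its select instruction, sidestepping the large
   (roughly 18-cycle) cost of a mispredicted branch. */
int ray_hits_box(const float org[3], const float inv_dir[3],
                 const float bmin[3], const float bmax[3]) {
    float tmin = 0.0f;
    float tmax = 1e30f;
    for (int a = 0; a < 3; ++a) {
        float t0 = (bmin[a] - org[a]) * inv_dir[a];
        float t1 = (bmax[a] - org[a]) * inv_dir[a];
        tmin = fmaxf(tmin, fminf(t0, t1));   /* select near slab entry */
        tmax = fminf(tmax, fmaxf(t0, t1));   /* select far slab exit   */
    }
    return tmin <= tmax;
}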
 
Why were 2 of my posts moved here? They're commenting on a rumor that has nothing to do with Cell..
 
Given what Sony sold off, spun off, or cancelled, it's not clear where it would have gotten the expertise to get a Cell 2. IBM was done with that experiment, and there was another 8 years of advancement devoted to SMP or at least cached SIMD architectures. Sony spun the SPE as being an architectural leap, but many of its underpinnings were from a starting point that was arguably archaic. By the time of the current gen, given where GPUs and CPUs had gone, there was no thriving pool of initiatives revolving around speed demon branch-averse architectures with DMA-based memory access.
Which mirrors what's happened in the RT space. There's been so much investment in rasterisation that RT is at a technological disadvantage. It makes it very hard to compare the two in terms of absolute potential.
 