Return of Cell for RayTracing? *split*

Discussion in 'Console Technology' started by Arkham night 2, Aug 31, 2018.

  1. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    8,156
    Likes Received:
    6,425
    Would a second generation of Cell for this gen have ballooned their price point beyond $399?
     
  2. DSoup

    DSoup meh
    Legend Veteran Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    11,155
    Likes Received:
    5,960
    Location:
    London, UK
    Cell was designed to accommodate traditional processing rather than graphics, so it was considered a strength to have a general-purpose core (the PowerPC PPU) for the traditional processing jobs that the SPEs were poor at, alongside the SPEs for those things they proved to excel at. RSX was Sony's last resort, which is much of the reason PS3 was that much more complex to design games for compared to the 360, which was clearly designed with a unified architecture from the outset.

    Frankly, I'm amazed PS3 was not more of a Frankenstein's monster. :yep2:
     
  3. mrcorbo

    mrcorbo Foo Fighter
    Veteran

    Joined:
    Dec 8, 2004
    Messages:
    3,615
    Likes Received:
    2,023
    Well, this whole line of thinking was started by Shifty speculating that Cell, if it had seen continual development up through today, might have been an interesting architectural choice if the traditional rendering pipeline could be dumped completely. Since that wouldn't have been viable at the start of this gen, they would have had to either add elements of the traditional rendering pipeline to Cell so it could take over for the GPU, or make Cell more viable as a general-purpose CPU. If you do neither, you end up with three distinct processing elements again. I'm very skeptical they could have pulled either of those two things off in a way that would have been more performant than what they got from the APU solution from AMD, and it would have cost a ton more in R&D.

    That last bit makes it hard to really know whether it was indeed off the table for technical reasons and the AMD solution was just better, or whether, given Sony's overall financial situation and Cell's failure to expand beyond being a niche product (outside of its use in the PS3), it was all about there being a poor ROI in continuing to develop Cell.
     
    iroboto and milk like this.
  4. DSoup

    DSoup meh
    Legend Veteran Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    11,155
    Likes Received:
    5,960
    Location:
    London, UK
    You have to remember that the variant of Cell that Sony used in PS3 was different to the full-blown processor design that powered supercomputers like Los Alamos's Roadrunner. Cell's PPU/SPE design was (and to this day remains) an attractive architectural approach for certain types of loads, where the general-purpose code running on the main core is proportionately balanced against the highly-vectorised code running on the SPEs - but that's probably not a great match for mainstream computing tasks, and it's why personal computers are not engineered like supercomputers.

    The biggest barrier for Cell was not the SPE architecture, which could have been improved massively, but the PPU being PowerPC. PowerPC was what really killed Cell as a desktop or high-performance processor.
     
  5. mrcorbo

    mrcorbo Foo Fighter
    Veteran

    Joined:
    Dec 8, 2004
    Messages:
    3,615
    Likes Received:
    2,023
    Starting to drift too far OT, so I'll self-moderate here (you're welcome @BRiT). And every thread I found in search to further discussion of the merits of Cell had ended up being locked after running its course. The idea of an alternate-reality Cell as a postmodern rendering device was a newish wrinkle, but the "what Cell got right/what Cell got wrong/why Cell died/how might Cell have lived" discussion has pretty much been done.
     
    BRiT likes this.
  6. DSoup

    DSoup meh
    Legend Veteran Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    11,155
    Likes Received:
    5,960
    Location:
    London, UK
    And it is indeed a pointless discussion, as are most what-if discussions, particularly when predicated on the benefit of hindsight.
     
  7. vipa899

    Regular Newcomer

    Joined:
    Mar 31, 2017
    Messages:
    922
    Likes Received:
    354
    Location:
    Sweden
    Is the Cell speculation going on here really serious? Maybe for PS6?
    Emotion Engine 2 with GS2 and a whole lot of eDRAM, instant buy for me :) Does such an architecture even fit raytracing?
     
    Heinrich4 likes this.
  8. DSoup

    DSoup meh
    Legend Veteran Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    11,155
    Likes Received:
    5,960
    Location:
    London, UK
    No.
     
  9. vipa899

    Regular Newcomer

    Joined:
    Mar 31, 2017
    Messages:
    922
    Likes Received:
    354
    Location:
    Sweden
    Too bad, maybe some other architecture in the future?
     
  10. DSoup

    DSoup meh
    Legend Veteran Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    11,155
    Likes Received:
    5,960
    Location:
    London, UK
    Nobody in their right mind would be thinking about changing architectures unless there were overwhelmingly strong cost or technical reasons for it. Why do you want to see a change in architectures?
     
    Scott_Arm and vipa899 like this.
  11. vipa899

    Regular Newcomer

    Joined:
    Mar 31, 2017
    Messages:
    922
    Likes Received:
    354
    Location:
    Sweden
    PS2 had unique graphics for its time, thanks to its odd hardware. And of course many think it's kind of cool with something like EE or Cell; I do understand though that it's not developer friendly.
     
    Heinrich4 likes this.
  12. DSoup

    DSoup meh
    Legend Veteran Subscriber

    Joined:
    Nov 23, 2007
    Messages:
    11,155
    Likes Received:
    5,960
    Location:
    London, UK
    And as technology gets more complex, developer friendliness becomes ever-more important. Cell took a further unconventional leap forward from PS2 that was sufficiently impactful for developers that Sony ruled that approach out for PS4. If you want to hear it from the horse's mouth, Mark Cerny spoke about this very deliberate decision at Gamelab 2013 in his presentation, 'The Road to PS4'.

    After his introduction and background, the next 20 minutes are spent on how things got more complicated after the original PlayStation, and how PS4 had to reverse that. He even mentions specific technical options (i.e. the approach that could've been) in the presentation. It's crap quality but fascinating nonetheless.

     
    bitsandbytes and vipa899 like this.
  13. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    8,156
    Likes Received:
    6,425
    I would say learning about the drivers behind why Sony left Cell could be a useful history lesson on what type of factors could go into their next-gen device.

    Part of it could have been cost. The other part developer ease. The $399 price point seems wildly successful from a cost perspective, and from what I can see it is likely a larger factor in sales success than developer ease when we are looking strictly at the hardware.

    An interesting point here is that if we assume Cell 2 was a possibility for this gen, and at one point in time it must have been considered, then Sony effectively traded big performance for price and developer ease.

    That should be a factor people consider in these predictions.
     
    Heinrich4 and vipa899 like this.
  14. vipa899

    Regular Newcomer

    Joined:
    Mar 31, 2017
    Messages:
    922
    Likes Received:
    354
    Location:
    Sweden
    Yeah, understandable; it's between the ears that exotic hardware is so cool... On the other hand I don't care what hardware is in there: if the games look and behave next-gen, it doesn't matter what's under the hood. A next-gen AMD APU could easily be as good as an EE2/GS2.

    Should PS5 be 8~10 TF, that would be a 5x improvement over baseline PS4 hardware, and games follow. If that's 5x GoW / The Last of Us 2 / HZD, I wouldn't complain, and that's not even considering CPU/RAM/storage improvements.

    For PS2 it was a bit special though, as it gave it some unique gfx effects.
     
  15. iroboto

    iroboto Daft Funk
    Legend Regular Subscriber

    Joined:
    Mar 6, 2014
    Messages:
    8,156
    Likes Received:
    6,425
    For what it's worth, for a long time I had assumed that rasterization and compute would be the be-all going forward, and that by the time we could muster enough hardware power for RT, there would be so much more power in non-RT methods and approximations that real-time RT would never be a thing. Granted, true native realtime RT is still nowhere close. But the architectures and designs for RT hybrids have now come out. Given enough time, we can look at that eventuality and see where things are ultimately headed.

    Such a thing could be possible with other unique architectures that solve different problems. We may yet see a re-entrance of those older concepts, but it likely won't be in the cards for a while.
     
    vipa899 and Shifty Geezer like this.
  16. Skaal

    Newcomer

    Joined:
    Oct 16, 2015
    Messages:
    19
    Likes Received:
    10
    Heinrich4 likes this.
  17. mrcorbo

    mrcorbo Foo Fighter
    Veteran

    Joined:
    Dec 8, 2004
    Messages:
    3,615
    Likes Received:
    2,023
    Nope. It's more efficient to just throw hardware in there to do just that and only that. My belief is that as long as you have to have a traditional rendering pipeline and you need it to have maximum performance, you need a GPU; and if you need a GPU anyway, you're going to be better off making it bigger. Cell is not specialized enough to be an efficient GPU (or an even more specialized processor like an RT processor) and too specialized to be a good CPU. There's no more place for it today than there was 5 years ago, and there wouldn't be a place for it in the next machines either. Maybe something like it would work in what comes after that, if we see a transition completely away from rasterization.
     
    vipa899 and Scott_Arm like this.
  18. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,137
    Likes Received:
    2,939
    Location:
    Well within 3d
    The assumption that there's no trouble connecting things up is typically the first thing such designs founder upon. The 200 GB/s bandwidth figure is somewhat optimistic, since it is the maximum bandwidth only in the case where the processors on the EIB transfer to their immediate neighbors. 80 such units would need 80x the memory bandwidth, as their capacity to reduce traffic is confined to each individual Cell.
    Trying to have a given SPE or PPE work with a core in another Cell would take the ring bus and, in the absence of a topology change, introduce a hop count far beyond what Cell was designed for, dropping bandwidth to 25.6 GB/s (while simultaneously throttling a subset of the local ring bus during the transfer).
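
    In case the arithmetic helps, here's a back-of-the-envelope sketch of where those two figures come from. The inputs are my assumptions from the commonly published Cell BE specs (3.2 GHz core clock, EIB clocked at half that, 16-byte-wide links), and the "eight concurrent transfers" is the usual reading of the ~200 GB/s headline, not something stated in this thread:

    ```python
    # Back-of-the-envelope EIB arithmetic (assumed published Cell BE figures).
    CORE_CLOCK_HZ = 3.2e9
    EIB_CLOCK_HZ = CORE_CLOCK_HZ / 2   # the EIB runs at half the core clock
    LINK_WIDTH_BYTES = 16              # each link moves 16 bytes per EIB cycle

    per_link = EIB_CLOCK_HZ * LINK_WIDTH_BYTES / 1e9
    print(f"per-link bandwidth: {per_link:.1f} GB/s")   # 25.6 GB/s

    # The ~200 GB/s headline needs many short, non-overlapping transfers in
    # flight at once (roughly eight concurrent neighbor-to-neighbor moves):
    print(f"headline peak: {per_link * 8:.1f} GB/s")    # 204.8 GB/s

    # A single long-haul transfer across the ring, as described above, still
    # only ever sees one link's worth: 25.6 GB/s.
    ```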

    I did see a reference to a DSP architecture that presaged this, the TI TMS320C80 MVP: a single generalist master core and 4 DSPs. It didn't seem to catch on.
    As far as multithreading goes, such techniques were needed for other hardware in the 8 years after Cell, and so reusing them would make sense generally, without tacking on the complexity of the SPEs, a master-core bottleneck, and the lack of caching.

    The Zen architecture's FP physical register file is large enough to contain the SPE's register file. However, without an architectural change there's no encoding that can flatly address that many registers. The FPU itself has enough 128-bit operand ports to match what the SPE could do. The instruction latency for math operations would be better on Zen, although I'm not sure the permute path can be so readily supported--particularly since Zen lost the more general permute instructions that belonged to Bulldozer's XOP set.
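
    A quick numeric sketch of the "physically big enough, architecturally unaddressable" point. The 128x128-bit SPE register file and the 160-entry Zen FP PRF are my assumptions from commonly cited specs, not figures from this post:

    ```python
    # Rough register-file size comparison (assumed, commonly cited figures).
    SPE_ARCH_REGS = 128        # SPE ISA exposes 128 architectural registers
    ZEN_FP_PRF_ENTRIES = 160   # Zen 1 FP physical register file entries
    ZEN_XMM_NAMES = 16         # XMM registers nameable in 64-bit encodings
    ENTRY_BYTES = 128 // 8     # 128-bit entries in both cases

    print(SPE_ARCH_REGS * ENTRY_BYTES)       # 2048 B: the whole SPE file
    print(ZEN_FP_PRF_ENTRIES * ENTRY_BYTES)  # 2560 B: physically fits...
    print(ZEN_XMM_NAMES)                     # ...but only 16 are nameable,
                                             # hence the encoding problem
    ```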

    The L2's latency would be at least twice that of the SPE's LS, which could lead to problems with running out of independent work or context space to compensate. Treating the L2 like an SPE register file would be worse in terms of operand bandwidth, whereas as a stand-in for the LS the L2 would actually be better in bandwidth terms.
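
    To make the "running out of independent work" point concrete, a minimal Little's-law style sketch. All cycle counts here are illustrative assumptions (an SPE local-store load is usually quoted around 6 cycles; the L2 figure is just "2x or more"), not measurements:

    ```python
    # Why longer load latency demands more independent work in flight.
    LS_LOAD_LATENCY = 6    # assumed SPE local-store load latency, cycles
    L2_LOAD_LATENCY = 14   # illustrative "at least 2x" figure, not measured
    LOADS_PER_CYCLE = 1    # one load issued per cycle

    # independent in-flight loads needed to keep the pipeline fed:
    print(LS_LOAD_LATENCY * LOADS_PER_CYCLE)  # ~6 against the LS
    print(L2_LOAD_LATENCY * LOADS_PER_CYCLE)  # 2x+ against the L2: the extra
                                              # independent work or context
                                              # space referred to above
    ```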


    There was the other issue that, of the alliance that made Cell (Sony, Toshiba, IBM), only one of them really maintained or advanced a microelectronics division and technology base at that level. IBM, being an expert in coherent SMP systems and leading-edge interconnects, was not a fan of the SPE architecture or philosophy. It was Toshiba, and to a middling extent Sony's old-school DSP expertise and lack of capability in the technologies of the day, that pushed the comparatively unsophisticated SPE and straightforward LS.

    Sony split the difference to create Cell, and its subsequent overreach (it had an entire leading-edge fab built for the architectural revolution that never came) meant this last hurrah of a toolset already showing signs of obsolescence ended in disaster. Toshiba, being the DSP maven, made a Cell derivative without a PPE that went on to have nobody care about it.

    Given what Sony sold off, spun off, or cancelled, it's not clear where it would have gotten the expertise to build a Cell 2. IBM was done with that experiment, and there had been another 8 years of advancement devoted to SMP, or at least cached, SIMD architectures. Sony spun the SPE as an architectural leap, but many of its underpinnings came from a starting point that was arguably archaic. By the time of the current gen, given where GPUs and CPUs had gone, there was no thriving pool of initiatives revolving around speed-demon, branch-averse architectures with DMA-based memory access.

    From that paper, it appears a big benefit is in the SIMD ISA and generous register set.
    They made some implementation decisions to work around pain points in the architecture: running a full ray tracer in each SPE rather than running separate stages of the ray tracer's pipeline on different SPEs and passing along the results; software caching; workarounds for serious branch penalties; and a favorable evaluation of software multithreading for the single-context SPEs.
    The architecture they were trying to write onto the hardware was a SIMD architecture (possibly with scalar ISA elements) with different forms of branch handling, better threading, and caches.
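
    For anyone unfamiliar with the software-caching trick, a minimal sketch of the idea: a direct-mapped cache held in an SPE-like local store, with the DMA-on-miss stood in by a plain copy. The sizes and names are my illustrative assumptions, not the paper's code:

    ```python
    # Direct-mapped software cache, as used on cacheless local-store cores.
    LINE_BYTES = 128   # one "cache line" per DMA transfer
    NUM_LINES = 64     # lines resident in the local store

    tags = [None] * NUM_LINES                # which memory line each slot holds
    lines = [bytes(LINE_BYTES)] * NUM_LINES  # cached data (the "local store")

    def sw_cache_read(main_memory: bytes, addr: int) -> int:
        """Read one byte of main memory through the software cache."""
        line_no = addr // LINE_BYTES
        slot = line_no % NUM_LINES           # direct-mapped placement
        if tags[slot] != line_no:            # miss: "DMA" the whole line in
            base = line_no * LINE_BYTES
            lines[slot] = main_memory[base:base + LINE_BYTES]
            tags[slot] = line_no
        return lines[slot][addr % LINE_BYTES]  # hit path: local access only
    ```

    Every such lookup costs tag-check instructions on top of the access itself, which is part of why the post above treats software caching as a workaround rather than a feature.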
     
    Heinrich4, DSoup, mrcorbo and 4 others like this.
  19. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    10,102
    Likes Received:
    4,677
    Why were 2 of my posts moved here? They're commenting on a rumor that has nothing to do with Cell...
     
  20. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    41,054
    Likes Received:
    11,671
    Location:
    Under my bridge
    Which mirrors what's happened in the RT space. There's been so much investment in rasterisation that RT is at a technological disadvantage. That makes it very hard to compare the two in terms of absolute potential.
     
    HBRU likes this.