Why were 2 of my posts moved here? They're commenting on a rumor that has nothing to do with Cell.
Nope. More efficient to just throw hardware in there to do just that and only that. My belief is that as long as you have a traditional rendering pipeline and need maximum performance from it, you need a GPU; and if you need a GPU anyway, you're going to be better off making that bigger. Cell is not specialized enough to be an efficient GPU (or an even more specialized processor like an RT processor) and too specialized to be a good CPU. There's no more place for it today than there was 5 years ago, and there wouldn't be a place for it in the next machines either. Maybe something like it would work in what comes after that, if we see a transition completely away from rasterization.
From that paper, it appears a big benefit is in the SIMD ISA and generous register set.
They made some implementation decisions to work around pain points in the architecture: running a complete ray tracer in each SPE rather than running separate stages of the pipeline on different SPEs and passing along the results; software caching; workarounds for serious branch penalties; and a favorable evaluation of software multithreading for the single-context SPEs.
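To make the branch-penalty point concrete, here's a rough stand-in (ordinary scalar C++, not actual SPU intrinsics, and not code from the paper) for the kind of rewrite that workaround implies: a data-dependent branch turned into a compare-and-select, so the in-order pipeline never has to guess.

```cpp
// Hypothetical illustration of the usual workaround for heavy branch penalties
// on an SPE-like in-order core: replace a data-dependent branch with a
// compare-and-select. Real SPU code would do this on whole vectors with
// compare/select intrinsics; this is just the scalar shape of the idea.

// Branchy version: a mispredicted branch is expensive without a predictor.
inline float closest_hit_branchy(float t_new, float t_best) {
    if (t_new > 0.0f && t_new < t_best)
        return t_new;
    return t_best;
}

// Branch-free version: evaluate both conditions, then blend. Compilers will
// usually lower the ternary to a conditional move / select instruction.
inline float closest_hit_select(float t_new, float t_best) {
    const bool accept = (t_new > 0.0f) & (t_new < t_best);  // no short-circuit branch
    return accept ? t_new : t_best;
}
```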
The architecture they were effectively recreating in software on top of the hardware was one with a SIMD ISA (possibly with some scalar elements), different forms of branch handling, better threading, and caches.
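And for the software-caching side, a minimal sketch of the sort of structure that implies: a small direct-mapped cache living in an SPE-sized local store, with the miss path stubbed out with a memcpy where real SPE code would issue an asynchronous DMA transfer. Sizes and names here are invented for illustration, not taken from the paper.

```cpp
// Hypothetical software-managed cache for an SPE-style local store:
// direct-mapped, line-granular. The memcpy stands in for an MFC DMA.
#include <cstdint>
#include <cstring>

constexpr std::size_t LINE_BYTES = 128;  // DMA-friendly line size
constexpr std::size_t NUM_LINES  = 64;   // must fit comfortably in a 256 KB local store

struct LocalStoreCache {
    alignas(128) std::uint8_t lines[NUM_LINES][LINE_BYTES];
    std::uintptr_t tags[NUM_LINES]  = {};
    bool           valid[NUM_LINES] = {};

    // Return a local pointer for byte `offset` of the (simulated) main memory,
    // pulling in the containing line on a miss.
    const std::uint8_t* lookup(const std::uint8_t* main_mem, std::uintptr_t offset) {
        const std::uintptr_t line = offset & ~static_cast<std::uintptr_t>(LINE_BYTES - 1);
        const std::size_t    slot = (line / LINE_BYTES) % NUM_LINES;
        if (!valid[slot] || tags[slot] != line) {            // miss: fetch the whole line
            std::memcpy(lines[slot], main_mem + line, LINE_BYTES);
            tags[slot]  = line;
            valid[slot] = true;
        }
        return lines[slot] + (offset - line);                // hit: only a tag check
    }
};
```

Every "memory" access in such a ray tracer goes through something like that lookup, which is exactly the bookkeeping a hardware cache would have hidden.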
I was not suggesting Cell should play the part of a GPU in the next PS5. I was proposing to use an individual Cell as an addition, not as part of the APU. I know that the Cell is not good enough as a GPU, but as a dedicated RT chip it could do wonders.
As we don't know how the blocks in the GPU are facilitating raytracing, it's hard to compare. The flexibility of a raytracing processor rather than a memory-access block thing may also be better in supporting different algorithms, although GPUs are now very versatile in compute. Cell could also see an RT acceleration structure added in place of an SPE - the original vision allowed for specialised heterogeneous blocks to be added.

I know what you were saying. And no, it couldn't. Hardware designed to do RT and only RT would absolutely destroy it in performance and power usage, and would take up much less space on the die. You are effectively arguing that, since Cell was better at video decoding back in 2005 than a typical CPU of the time, adding Cell to modern GPUs to act as the VPU would do wonders for their ability to decode video. The dedicated fixed-function blocks in GPUs responsible for video decoding are much more performant and use up much less space and power than any evolution of Cell ever could, and it's exactly the same thing with dedicated RT hardware.
Doing some research on PVR, all this noise nVidia is getting is ridiculous. PowerVR are/were so far ahead, but they've been overlooked because they are a mobile chipset used in a limited number of devices.
I wonder what a monster-sized PowerVR GPU with raytracing would look like?
Yes, unfortunately PVR was out of the desktop space at that time and had been for many years. Had they still been involved in desktop PC computing they might have been able to make some headway. But with them basically all in on mobile at the time, what mobile developer is going to risk resources on RT for a chip that may or may not get picked up for a mobile SOC? A discrete PC solution isn't reliant on a SOC manufacturer picking it for use in a commercial product.
It's just really unfortunate that they were basically in the wrong hardware space to get something like that pushed into real use.
While pickup in the desktop space (not just gamers, but research and science as well) may still have been slower than what is happening with NV's Turing chips, given how much smaller they are than NV in size and especially in marketing and developer relations dollars, it would have seen far greater adoption than it did with them being seen as a mobile graphics provider.
Regards,
SB
I did see a reference to a DSP architecture that presaged this, the TI TMS320C80 MVP.
Single generalist master core and 4 DSPs. It didn't seem to catch on.
As far as multithreading went, such techniques were needed for other hardware in the 8 years after Cell, and so reusing them would generally make sense without also taking on the complexity of the SPEs, a master-core bottleneck, and the lack of caching.
In part the story of PowerVR Ray Tracing is about Imagination being in the wrong market to see it adopted, but from a different perspective, it's about Caustic Graphics, the company that actually developed that tech, having been acquired by Imagination and not by AMD or Nvidia.

Well, it's too bad Apple has ditched them for their own graphics IP. There could've been some legit merit in Apple using PVR's embedded RT tech in specialized workstations and even in the iPad Pro, for design professionals doing their work and showing it off to clients in realtime.
Now that Imagination is de facto owned by the Chinese, there is a new inroad for the company to produce GPUs for China's massive market across all segments. I sincerely believe dedicated GPUs could be part of that, even if they are in the lower-end range. Affordable RT hardware would be a boon to a market whose lower end can't drop oodles of cash on RTX-based Quadros for real-time ray tracing.
Seems like a chance for PVR to really cut into segments RTX probably won't cover for two to three generations.
Rasterization is still very well suited to the wide, burst-friendly cache and DRAM architectures, which Nvidia's research still seems to encourage using for the majority of what winds up on-screen. Cell's big assertion was that much of what had been put into CPU architecture in the previous decade or so could be discarded, so in that regard wouldn't requiring a lot of investment be a sort of defeat? The reasons why the individual elements Cell removed were so popular were very compelling ones, and the bet that standard architectures had run out of steam was proven wrong.

Which mirrors what's happened in the RT space. There's been so much investment in rasterisation that RT is at a technological disadvantage. That makes it very hard to compare the two in terms of absolute potential.
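Purely to illustrate the "burst-friendly" point (with made-up structures, not anyone's actual renderer): rasterisation-style work walks memory sequentially and uses every cache line and DRAM burst it touches, while per-ray traversal chases data-dependent pointers through an acceleration structure.

```cpp
// Rough illustration of why rasterisation maps so well onto burst-friendly
// caches/DRAM while ray traversal does not. Types and sizes are invented;
// the point is the access pattern, not the exact data structures.
#include <cstdint>
#include <vector>

// Rasterisation-style work: shade a span of contiguous pixels. Every access
// is the neighbour of the last one, so each cache line / burst is fully used.
void shade_span(std::vector<std::uint32_t>& framebuffer,
                std::size_t start, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i)
        framebuffer[start + i] = 0xFFFFFFFFu;   // sequential, prefetch-friendly
}

// Ray-traversal-style work: chase child indices through a tree. Successive
// accesses land on unrelated cache lines, and incoherent rays diverge further.
struct BvhNode {
    float        bounds[6];
    std::int32_t left, right;   // child indices, -1 for leaf
};

int traverse(const std::vector<BvhNode>& nodes, int node_index /*, ray ... */) {
    // Placeholder walk: each step is a data-dependent, effectively random load.
    while (node_index >= 0 && nodes[node_index].left >= 0)
        node_index = nodes[node_index].left;    // real code picks a child per ray
    return node_index;
}
```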
But would that prevent a more modern Cell from being part of a next-gen console? The paper itself suggests that there is a need for other rasterisation solutions.
If Intel could, then so would everyone else, and the 10-15 GHz processors would have been as good as or better than the dual and quad cores that we got instead. Multi-core is an inferior method for scaling performance, but one that remained physically possible.

Yup, but multi-core/multi-threaded solutions likely wouldn't have found adoption had Intel been able to keep cranking up clock speeds.
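For anyone who wants the arithmetic behind "multi-core is an inferior method for scaling": a clock bump speeds up everything, while extra cores only help the parallel fraction (Amdahl's law). The numbers below are invented purely for the example.

```cpp
// Back-of-the-envelope comparison: a 4x clock bump vs 4 cores at the same
// clock, under Amdahl's law with an assumed 80% parallel fraction.
#include <cstdio>

int main() {
    const double parallel_fraction = 0.80;  // assumption for the example

    // Hypothetical 10-15 GHz path: everything gets roughly 4x faster.
    const double clock_speedup = 4.0;

    // 4 cores at the same clock: only the parallel fraction scales.
    const double cores        = 4.0;
    const double core_speedup = 1.0 / ((1.0 - parallel_fraction)
                                       + parallel_fraction / cores);

    std::printf("4x clock : %.2fx\n", clock_speedup);  // 4.00x
    std::printf("4 cores  : %.2fx\n", core_speedup);   // ~2.50x
    return 0;
}
```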
Yes. The concept was small cores, lots of 'em, and heterogeneous. The reason useful stuff was stripped out for Cell 1 was to make them small enough to get more on a die. At smaller lithographies, more can be invested per core to make it better based on how people actually need to use it, while still providing a huge CPU core count of 60+ on a die.

Is an architecture with threading, branch handling, caches, a more modest pipeline, and other generalist features really Cell?
Isn't that Larrabee? (Failed, scrapped or not.)

Intel can't because 80x86 is heavily laden with legacy requirements. Nobody researching quantum computer applications is working with hardware anywhere near as slow [in terms of clock frequencies] as anything Intel sells commercially. Cooling remains a challenge, but not impossibly so.
80x86 is not 80 times x86 cores? Ohh, it's the 8086 et al. chip family...
That's true, but that doesn't prove it'll never be the best move in future.

The track record for success producing things that are halfway between a traditional multi-core CPU architecture and the GPU paradigm is spotty at best. Xeon Phi has its niche, but it's not exactly setting the world on fire.