PlayStation 5 [PS5] [Release November 12 2020]

An ND audio developer joked that the rendering people would want to steal the Tempest Engine, because most of the time they did not want to give any SPUs to the audio people.

https://twitter.com/sflavalle

Probably don't want to take his words literally. If it's just *a* (one) dedicated SPU, it won't be powerful enough.
Insufficient for major rendering work too.

He might mean a Compute Unit (that works like an SPU).
The rendering/lighting people would want to steal "his" Compute Unit for graphics workloads.

EDIT:
But if it is also used for BC, emulating with a hybrid setup sounds like even more work to me.
Or, if it is not more work than using the CUs, it wouldn't make things any easier either?

It does sound overly complex to stick SPUs inside.
 

It is a hybrid CU/SPU, but the programming model is the same as an SPU's. The joke goes back to the PS3 era, when the rendering people did not want to give SPUs to the audio people.
 
Even more curious, we have a performance estimate for it of 100-200 GFlops (depending on whether you go by the CPU equivalent or by CU performance). Cell would have been tiny, effective and more powerful, would have allowed PS3 BC, and could even have been usable for other things. I wonder if it was considered? Did they see the value but find it too difficult/costly to adapt Cell for an AMD SoC? Or was it not even on the table, and if not, why not?
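As a rough check on figures like the 100-200 GFlops estimate, theoretical peak throughput is simply lanes x FLOPs per lane per cycle x clock. The sketch below is only an illustration: the function name is made up, the SPE numbers (4 single-precision lanes at 3.2 GHz) are the commonly cited Cell figures, and the clocks used for a 64-lane CU are placeholder assumptions, since no Tempest clock is given in this thread.

```python
def peak_gflops(lanes: int, clock_ghz: float, flops_per_lane_per_cycle: int = 2) -> float:
    """Theoretical peak GFLOPS, counting one fused multiply-add as 2 FLOPs."""
    return lanes * flops_per_lane_per_cycle * clock_ghz


# One Cell SPE: 4 single-precision lanes at 3.2 GHz.
spe = peak_gflops(lanes=4, clock_ghz=3.2)

# Eight SPEs on a full Cell die (fewer were usable by PS3 games).
cell_8_spe = 8 * spe

# A 64-lane compute unit at two ASSUMED clocks (not confirmed figures).
cu_low = peak_gflops(lanes=64, clock_ghz=1.0)
cu_high = peak_gflops(lanes=64, clock_ghz=1.5)

print(f"SPE: {spe:.1f}  8 SPEs: {cell_8_spe:.1f}  CU: {cu_low:.1f} to {cu_high:.1f} GFLOPS")
```

These are peak numbers only; sustained throughput depends on how well the unit can be kept fed, which is the point of the DMA-based design discussed later in the thread.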
Sony might have needed to dust off the contracts and old data for Cell, if they maintained it. IBM was the last one to touch Cell, with a 45nm variant.
At this point, none of the three partners in the Cell alliance have kept up with the process node race, with IBM's server offerings lagging the least behind with a dedicated 14nm line at GF.
Someone would need to re-implement Cell, and it might be a question of the contracts and investment needed to spin that pipeline back up, or if it would be workable to invite an outside party like AMD in to implement it.
After that, some research and design modifications would be needed to get Cell working so many nodes past where it was last looked at, and to fit it as a slave device in a very different ecosystem. Something like that did happen with the Roadrunner supercomputer, although a straight port of that would require dusting off the 90nm K8 Opteron as well...

If Sony abstracted away the incompatible system architecture (EIB, the interrupt handling of a PPC system, DSPs, different DMA engines) and two incompatible ISAs, it would just be for backwards compatibility and so an entirely backwards-facing investment.
Sony probably would not be up to the task, and I suspect Toshiba wouldn't be either.
IBM might be game if Sony paid enough, and maybe AMD could be enticed with a large sum if legal matters didn't get in the way. Both of them might feel that there would be too much re-building of an abandoned architecture, though.
A sufficient payment would be much greater than what was needed for the BC measures put into the CPU and GPU blocks, and no synergy would exist with AMD's extant product pipelines.

Not making the Cell block a BC-only feature, and exposing it to the outside world instead, might have made it an ongoing engineering effort, which is something Sony tends to avoid.


The SPUs can be deployed without PPU. Toshiba sold such a configuration before:
https://en.wikipedia.org/wiki/SpursEngine
Lacking the PPE would hurt the backwards compatibility angle, however.

And it is modified so heavily that it works like an SPU. It does not work like a CU at all. This is what I said. The ND audio dev calls it an SPU, not a CU, for example.

I know this is a modified CU, but I am sure that when devs code for it, it probably has much more in common with an SPU, because of DMA and a serial programming model, than with a CU that has a cache and many wavefronts.
Details are sparse, so I'm curious how different it is.
For example, the SPU is much narrower than a SIMD, and only having 2 wavefronts doesn't seem like it allows the wide hardware to be subdivided enough to give narrow-SIMD behavior.
What exactly replaces the caches, or which caches are replaced is another question mark. The GPU cache path is extremely long-latency relative to an SPU-like architecture, but what would have been put in its place?
There would still need to be memory operations out of some kind of scratchpad, although the LDS is still a long-latency pool compared to the LS in Cell.
Other details don't rule out a variant of the GCN ISA rather than the RISC-like SPU ISA. A GCN/RDNA programming model would reduce the complexity for developers, and Sony is apparently making half of the wavefronts available to devs.
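To put a number on the width mismatch: an SPU works on 128-bit registers (four 32-bit floats), while GCN wavefronts are 64 wide and RDNA offers wave32. A minimal sketch, assuming a workload that only has a few independent items per instruction (the function name and the workload are hypothetical, and nothing here reflects how Tempest actually schedules work):

```python
import math


def lane_utilization(active_items: int, simd_width: int) -> float:
    """Fraction of SIMD lanes doing useful work for one batch of items."""
    if active_items <= 0 or simd_width <= 0:
        raise ValueError("active_items and simd_width must be positive")
    # Number of full-width issues needed to cover all items.
    issues = math.ceil(active_items / simd_width)
    return active_items / (issues * simd_width)


# Four independent values per step, e.g. one 4-float vector operation.
for width in (4, 32, 64):
    print(f"width {width:2d}: {lane_utilization(4, width):.4f}")
```

A 4-wide unit is fully used by such a workload, while a 64-wide wavefront only fills 1 lane in 16, so code written in a narrow-SIMD style would need to be batched much wider to suit the hardware.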

One type of GPU block that has a kind of batch or tile-based load, compute, export pattern is the RBEs, which is why they've historically been so effective at consuming memory bandwidth.

For years I always thought of the SPUs as a DSP-like architecture. Not sure why it's not called that.
I've seen descriptions of the SPUs as being DSPs, or DSP-like in their usage. It was pointed out this was their primary strong point, and detractors pointed out that Cell wasn't the first design to have a heterogeneous design with a general-purpose core flanked by DSPs. The lack of success of those attempts was one source of skepticism about the architecture.
 
Lacking the PPE would hurt the backwards compatibility angle, however.

Can't be helped. Cerny only mentioned SPU-like capabilities in Tempest.
It'd be a stretch to speculate the presence of a PPU in the custom CU.

For PS3 b/c, they will have to go with a different approach.
 

I mean, technically RPCS3 is open source, the requirements are well within range of the PS5, and its compatibility list is at nigh 60%. I'm sure some Sony execs would be up in arms. But what're you gonna do, rebuild an emulator from the ground up?
 
They decided to customize an AMD part for their own needs in the end; this is some hybrid CU/SPU...

An ND audio developer joked that the rendering people would want to steal the Tempest Engine, because most of the time they did not want to give any SPUs to the audio people.
'SPU' here means 'Sound Processing Unit' and not 'Synergistic Processing Unit'. There's no Cell-like SPU involvement. It's an RDNA2 CU modified to make it a bit better at DSP workloads.
 


No, they speak about it as an SPU as in the CELL, and this is where the joke comes from. If you watch the "Road to PS5" video again, Mark Cerny talks about a hybrid of GPU parallelism and an SPU-like architecture. I can search for another tweet, about James Okonomiyonda of the ICE Team, probably the ultimate CELL SPU lover, where someone told him he must be happy there is an SPU inside the PS5.

This is the slide Mark Cerny used for the Tempest Engine; he doesn't talk about DSPs... And I think they know what was modified inside the CU, while we only have partial knowledge of it. The architect of the system, devs at studios and people in the software R&D team all speak of it as an SPU-like modified CU.

[Image: Mark Cerny's Tempest Engine slide]
 
Yes, he also said something to the effect that a regression happened from PS3 to PS4 in this area.


An indie dev talks about the PS5 at the beginning of this video (02h30). He was afraid when Mark Cerny talked about the CELL, and happy once it turned out to be only about audio, because audio was a strong point of the SPUs and probably the easiest way to use them.
 
In German parlance, SPU has meant sound processing unit, and that is what I understood SPU to mean when it was talked about. I did not think of the vector sub-processors in the PS3 at that moment.
 
No, they speak about it as an SPU as in the CELL, and this is where the joke comes from. If you watch the "Road to PS5" video again, Mark Cerny talks about a hybrid of GPU parallelism and an SPU-like architecture.
Saying something is like something doesn't make it the same. Similes are used to help understanding. Honestly, it was described transparently in the talk...

The Tempest Engine (is) based on AMD's GPU technology. We modified a compute unit in such a way as to make it very close to the SPUs in PlayStation 3. Remember when I said that they were ideal for audio? So the Tempest Engine has no caches, just like an SPU; all data access is via DMA, just like an SPU. Our target was that it would have more power than a CPU, thanks to the parallelism that a GPU can achieve, and then it would be more efficient than our GPU, thanks to the SPU-like architecture, the goal being to make possible near 100% utilization.

It's a CU that operates in a fashion like a SPU in Cell, in not using caches but being constantly fed data so it can churn through it. That analogy is made so devs can understand how a CU can be made to crunch audio. There is nothing else SPU-like in the design. Why would there be? SPU is just a serial vector processor with a DMA engine on a ring bus and a Power-based ISA. A CU is a vector processor, and Tempest is a CU with a DMA engine. It doesn't need the alternative ISA or bus or anything else that a SPU has.

I agree that it could be talked about like a Synergistic Processing Unit, if one sees it as a modified CU with DMA serial RAM access instead of relying on caches, and I agree that maybe some devs might think of it as such, but it's not in any way a Cell SPU in PS5. It's an RDNA CU with caches removed.
 
There is no sound coprocessor unit in the PS3. From the DF article:

"When using the Tempest engine, we DMA in the data, we process it, and we DMA it back out again; this is exactly what happens on the SPUs on PlayStation 3," Cerny adds. "It's a very different model from what the GPU does; the GPU has caches, which are wonderful in some ways but also can result in stalling when it is waiting for the cache line to get filled.

There is only one possibility when people talk about SPUs in PS3. I think this is very clear.



Some devs outside Sony liked the SPUs too; a few years ago I saw one from Ubi who missed them.


Did I say it is exactly a CELL SPU? No, I said it is a hybrid CU/SPU, and in the end it is a bit exotic: not a plain CU at all, but a hybrid. I did not say there is a CELL SPU inside the PS5.

But people programming the Tempest Engine said it feels like programming the CELL SPUs. This guy from the ICE Team, whose Twitter profile picture is Ken Kutaragi's CELL, probably has some knowledge of the Tempest Engine as a member of that team.

https://twitter.com/okonomiyonda
 
I agree that it could be talked about like a Synergistic Processing Unit, if one sees it as a modified CU with DMA serial RAM access instead of relying on caches, and I agree that maybe some devs might think of it as such, but it's not in any way a Cell SPU in PS5. It's an RDNA CU with caches removed.
Or a GCN CU, which I find more likely, since RDNA CUs are designed to work in pairs and he specifically mentioned a singular compute unit?
There is no sound coprocessor unit in PS3 from the DF article.
Did I say it is exactly a CELL SPU? No, I said it is a hybrid CU/SPU, and in the end it is a bit exotic: not a plain CU at all, but a hybrid. I did not say there is a CELL inside the PS5.
You haven't said anything that would make it a "hybrid CU/SPU". What Cerny said is literally that it's an AMD CU from which they stripped the caches to make it work via DMA like the SPU did, which suits audio well. Working like something and being a hybrid of something are different things.
 

Stripping the cache is maybe only one part of the job; do we have all the details of the customization? Until there are some GDC presentations, we won't know exactly what they did.

[Image: Mark Cerny's Tempest Engine slide]


For the moment it looks like this, and programmers seem to think it works like this. And once they strip the cache from the CU, that changes the programming model.
 
exotic and not a CU at all

it looks like programming the CELL SPUs.

Why would you (or anyone) even want that? It's a modified CU because that probably simplifies integration into the GPU/APU by AMD.
Something exotic/Cell would most likely be less powerful, more complicated to integrate, more costly and maybe trickier for devs.

I see you mentioning Ken and Cell etc. There's a reason they moved away from that; it's the past now.
 
There is no sound coprocessor unit in PS3 from the DF article. There is only one possibility when people talk about SPUs in PS3. I think this is very clear
Yes, SPU here means Synergistic Processing element. SPU in PS5 probably means Sound Processing Unit.

Did I say it is exactly a CELL SPU? No, I said it is an hybrid CU/SPU and at the end it is a bit exotic and not a CU at all an hybrid.
Okay, what's hybrid? What has come out of the SPU into the CU to make it something exotic? An SPU consists of a cut-down Power ISA, two execution pipelines, maths units, control units, a DMA engine and an SRAM scratchpad. The CU already has the maths units. The thing that makes it different in operation to an SPU is memory access, as Cerny said. So they modified it, as Cerny said, to have no caches, like an SPU, and to DMA data in, like an SPU. The only thing a CU needs to be like an SPU is a different memory access model, which is what Cerny tells us Sony did.

But people programming the Tempest Engine said it looks like programming the CELL SPUs.
Yes, because you DMA the data into it. You have full control of the data flow - that's how it's like SPU and how they can keep very high efficiency.
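The "DMA in, process, DMA out" flow described here can be sketched as a double-buffered streaming loop. This is only a simulation under stated assumptions: `dma_get`, `dma_put` and `stream_process` are hypothetical names, plain Python lists stand in for main memory and the local scratchpad, and the transfers are synchronous copies, whereas real hardware would overlap the next transfer with the current compute step.

```python
from typing import Callable, List


def dma_get(main_memory: List[float], start: int, count: int) -> List[float]:
    """Simulated DMA read: copy a block from main memory into local storage."""
    return main_memory[start:start + count]


def dma_put(main_memory: List[float], start: int, block: List[float]) -> None:
    """Simulated DMA write: copy a processed block back to main memory."""
    main_memory[start:start + len(block)] = block


def stream_process(main_memory: List[float], block_size: int,
                   kernel: Callable[[float], float]) -> None:
    """Process main_memory in place, one block at a time, with two local buffers."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    n = len(main_memory)
    if n == 0:
        return
    local_store = [[], []]                      # two scratchpad buffers
    local_store[0] = dma_get(main_memory, 0, block_size)
    current = 0
    for start in range(0, n, block_size):
        nxt = start + block_size
        if nxt < n:
            # Fetch the next block into the idle buffer before computing.
            local_store[1 - current] = dma_get(main_memory, nxt, block_size)
        # Compute only touches data already sitting in the local buffer.
        processed = [kernel(sample) for sample in local_store[current]]
        dma_put(main_memory, start, processed)
        current = 1 - current


samples = [0.0, 0.5, -0.5, 1.0, -1.0]
stream_process(samples, block_size=2, kernel=lambda s: s * 0.5)  # halve the gain
print(samples)
```

The point of the pattern is that the programmer decides what is resident and when, so the compute step never waits on a cache miss; the cost is that the data flow has to be managed by hand.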

I don't really understand what your argument is. What customisations are you thinking Tempest has over a basic CU beyond the memory interface?

Please don't link to Twitter personalities; I've no idea what I'm supposed to be looking at there to find the evidence to support whatever argument it is you're trying to make. Please present your theory and then supporting evidence.
 
Why would you (or anyone) even want that? It's a modified CU because that probably simplifies integration into the GPU/APU by AMD.
Something exotic/Cell would most likely be less powerful, more complicated to integrate, more costly and maybe trickier for devs.

The CELL SPUs were very powerful for audio; the programming model is better than that of a CU with a cache.

They took a CU and customized it into an SPU-like architecture.

This is not an SPU, nor a CU with a cache; it works differently. It is a CU with DMA, and we don't know the extent of the customization.
 

And this is not a Twitter personality but an ICE Team programmer and ex Q Games dev.

Same to you: prove it means Sound Processing Unit.


This guy is speaking about the SPU; he was a game dev and now works on autonomous vehicles. And he is credible: he is followed by Morgan McGuire (Nvidia), Marco Salvi (Nvidia, ex Ninja Theory, nAo on B3D if I remember well), Peter Shirley (Nvidia), Alex Evans (Media Molecule) and so on.

Hybrid because there is no cache, which is a major customization of the CU, and because they use only two wavefronts and the programming model is more serial than parallel, unlike a GPU CU. Beyond that, I know it is not the same ISA as a CELL SPU, nor the same bus architecture, and there is no PPU... And it is probably better this way; no need to add problems for indie devs. Rami Ismail (Vlambeer) was afraid when he heard Mark Cerny talking about the CELL and relieved when he understood it was only for audio.

And this was the biggest problem most developers had with the CELL SPUs: the fact that you need to manage the data flow. Again, wait and see; we will probably learn what they modified later.

And a scratchpad without a cache is a little bit exotic nowadays. ;)
 