Playstation 5 [PS5] [Release November 12 2020]

Globalisateur · Apr 13, 2020

The former principal software engineer on PS5 is implying that a narrow and fast design could be an advantage in some cases for RT.

https://twitter.com/x/status/1249420046855598080

Deleted member 13524 · Apr 13, 2020

For the sake of hypothetical PS3 BC, is there anything a Cell SPE or PPE does at 3.2GHz that couldn't be emulated by a Zen2 core and its much more powerful 256-bit FPU at 3.5GHz?

Xbat said:
The specs ain't changing. The only thing that can change in a short period is clock speeds, and I doubt very much that Sony is going to up their clock speeds even more.

Other than clock speeds, specs are unlikely to change.
That's not to say they're impossible. If yields wouldn't substantially hurt their bottom line, I'd see no problem in enabling the extra CUs.
It wouldn't take them a respin to do that. The CUs that end up being disabled have the exact same layout and connections as the active ones, because they don't know which CUs end up being disabled during production.
And if Sony/AMD apparently found a way to disable 18 CUs out of 36 for base PS4 BC and up to 35 CUs out of 36 for the dashboard, then there's no reason to believe they wouldn't be able to disable 4 CUs out of 40 for PS4 Pro BC.

I also don't believe they'd need any substantial software work to enable those CUs. AMD makes firmwares that set the number of active CUs on GPUs and APUs, and drivers that just accept those differences on the fly, all the time. There's a chance the PS5 devkits already have 40 CUs enabled too, like the One X devkits.

Again: this seems unlikely at this point.

The memory bandwidth could be the bigger bottleneck, though. Sony deciding to go with 36 active CUs instead of 40 might have been influenced by their inability to secure enough 16Gbps/18Gbps chips on time. For example, more raw compute with reduced bandwidth-per-TFLOP (and the possible GPU downclock it would require) wouldn't result in a large enough real performance delta to justify slightly lower yields.
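To put rough numbers on the bandwidth-per-TFLOP point, here is a back-of-the-envelope sketch. It assumes the publicly announced PS5 figures (36 active CUs, up to 2.23 GHz, 448 GB/s) and the usual RDNA arithmetic of 64 FP32 lanes per CU and 2 FLOPs per lane per clock. The 40 CU configuration at the same clock and bandwidth is purely hypothetical, and the function names are made up for illustration.

```python
# Back-of-the-envelope numbers only; the 40 CU case is hypothetical.
LANES_PER_CU = 64      # FP32 lanes per RDNA compute unit
FLOPS_PER_LANE = 2     # one fused multiply-add counts as 2 FLOPs


def tflops(active_cus: int, clock_ghz: float) -> float:
    """Peak FP32 throughput in TFLOPs for a given CU count and clock."""
    return active_cus * LANES_PER_CU * FLOPS_PER_LANE * clock_ghz / 1000.0


def gbps_per_tflop(bandwidth_gbps: float, active_cus: int, clock_ghz: float) -> float:
    """Memory bandwidth (GB/s) available per TFLOP of peak compute."""
    return bandwidth_gbps / tflops(active_cus, clock_ghz)


if __name__ == "__main__":
    bandwidth = 448.0  # GB/s, announced figure (256-bit bus, 14Gbps GDDR6)
    for cus in (36, 40):
        print(f"{cus} CUs @ 2.23 GHz: {tflops(cus, 2.23):.2f} TFLOPs, "
              f"{gbps_per_tflop(bandwidth, cus, 2.23):.1f} GB/s per TFLOP")
```

Under these assumptions, 40 CUs at an unchanged clock would add about 11% peak compute (10.28 to 11.42 TFLOPs) while bandwidth per TFLOP drops from roughly 43.6 to 39.2 GB/s. Any downclock needed for the wider configuration would shrink the compute gain further.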

Shifty Geezer · Apr 13, 2020

chris1515 said:
And this is not a Twitter personality but an ICE Team programmer and ex-Q-Games dev.

You linked to a personality on Twitter. That doesn't form the basis of an argument, even if you link to Lord Cerny himself! Link not to the account, but to the evidence - the quote or article. In this case, we have someone calling it an SPU, like an SPU from PS3, and yes, I acknowledged that some people may be referring to it as that. However, it's not a PS3 SPU (as you recognise).

And scratchpad without cache is now a little bit exotic.

Again, I don't understand what exactly the argument is here. I'm saying Tempest is a CU with the caches replaced with a DMA solution so the compute capabilities of the CU can be better used for streaming data.

chris1515 · Apr 13, 2020

Shifty Geezer said:
You linked to a personality on Twitter. That doesn't form the basis of an argument, even if you link to Lord Cerny himself! Link not to the account, but to the evidence - the quote or article. In this case, we have someone calling it an SPU, like an SPU from PS3, and yes, I acknowledged that some people may be referring to it as that. However, it's not a PS3 SPU (as you recognise).

Again, I don't understand what exactly the argument is here. I'm saying Tempest is a CU with the caches replaced with a DMA solution so the compute capabilities of the CU can be better used for streaming data.

But in the end the programmer uses it differently than a CU in a GPU. This is why Mark Cerny presented it like this. If the programmer tries to use it like a CU in a GPU, it will not work well. That's all I said: it is exotic, and something not seen inside a console since 2013. This DMA model was the reason many devs hated the CELL, because it left the burden of memory management to the developer. Here it will be tolerated because it is easier to do with audio workloads.

But I think bringing the CELL back for backward compatibility is not a good idea from a cost point of view, and it would be even worse to use it for new development outside of audio, the most trivial case. I can imagine what devs would think about another ISA, a different endianness and an in-order PPU.

Most devs are happy with the PS4; it is a better console than the exotic PS3. In the end there are enough problems with huge budgets in AAA games, and it is more important to be indie friendly and not make devs' lives difficult. A good old x86 CPU, or an ARM CPU if they change one day, works perfectly.

Sony is again a bit more custom, at least for things outside the CPU and GPU: from the SSD to the Tempest Engine, plus some new functionality in the controller. Will all of this pay off? Time will tell.

Edit:
Ex-Frostbite technical director, one of the many CELL haters:

Globalisateur · Apr 13, 2020

chris1515 said:
But in the end the programmer uses it differently than a CU in a GPU. This is why Mark Cerny presented it like this. If the programmer tries to use it like a CU in a GPU, it will not work well. That's all I said: it is exotic, and something not seen inside a console since 2013. This DMA model was the reason many devs hated the CELL, because it left the burden of memory management to the developer. Here it will be tolerated because it is easier to do with audio workloads.

But I think bringing the CELL back for backward compatibility is not a good idea from a cost point of view, and it would be even worse to use it for new development outside of audio, the most trivial case. I can imagine what devs would think about another ISA, a different endianness and an in-order PPU.

Most devs are happy with the PS4; it is a better console than the exotic PS3. In the end there are enough problems with huge budgets in AAA games, and it is more important to be indie friendly and not make devs' lives difficult. A good old x86 CPU, or an ARM CPU if they change one day, works perfectly.

Sony is again a bit more custom, at least for things outside the CPU and GPU: from the SSD to the Tempest Engine, plus some new functionality in the controller. Will all of this pay off? Time will tell.

Edit:
Ex-Frostbite technical director, one of the many CELL haters:

Contrary to Cell they probably have easy APIs available to work with the Tempest CU.

chris1515 · Apr 13, 2020

Globalisateur said:
Contrary to Cell they probably have easy APIs available to work with the Tempest CU.

Modified compute GPU API probably...

Scott_Arm · Apr 13, 2020

chris1515 said:
Modified compute GPU API probably...

AMD already has an audio SDK for the GPU. I imagine it's going to be very similar to that.

Shifty Geezer · Apr 13, 2020

chris1515 said:
Modified compute GPU API probably...

Quite possibly not even that. If the engine is an audio processor, Sony will provide the audio library. There's a reason they bought up the makers of Wwise.

chris1515 · Apr 13, 2020

Shifty Geezer said:
Quite possibly not even that. If the engine is an audio processor, Sony will provide the audio library. There's a reason they bought up the makers of Wwise.

Yes, Audiokinetic, the biggest audio middleware; I forgot about this. It will surely be developer friendly, except for the Sony devs working on the library, but for some, like Jaymin Kessler, it is a dream.

This is OK if the burden is not on third-party devs but on Audiokinetic, ICE Team and Sony ATG devs.

patsu · Apr 13, 2020

So the Wwise stack is already integrated into the workflow and GPU architecture (e.g., ray tracing hardware)?
https://www.audiokinetic.com/products/wwise-spatial-audio/

... and then the custom CU is for rendering the audio efficiently (SPU style) for the perceivable objects and environmental sound sources?

patsu · Apr 13, 2020

ToTTenTranz said:
For a hypothetical PS3 BC sake, is there anything a Cell SPE or a PPE does at 3.2GHz that couldn't be emulated by a Zen2 core and its much more powerful 256bit FPU at 3.5GHz?

Maybe? Honestly it’s probably better to wait for Sony to say more. They have bits and pieces of relevant technologies in the new system. The RPCS3 project sounds interesting too.

But it may be a business decision first and foremost, which is out of scope in Console Technology.

chris1515 · Apr 13, 2020

patsu said:
So the Wwise stack is already integrated into the workflow and GPU architecture (e.g., ray tracing hardware)?
https://www.audiokinetic.com/products/wwise-spatial-audio/

... and then the custom CU is for rendering the audio efficiently (SPU style) for the perceivable objects and environmental sound sources?

This seems logical.

patsu said:
Maybe? Honestly it’s probably better to wait for Sony to say more. They have bits and pieces of relevant technologies in the new system. The RPCS3 project sounds interesting too.

But it may be a business decision first and foremost, which is out of scope in Console Technology.

If they can release a PS3 emulator on x86 and the AMD GPU architecture, it might make economic sense for PS Now.

Scott_Arm · Apr 13, 2020

https://github.com/GPUOpen-LibrariesAndSDKs/TAN

AMD has already optimized the most expensive audio processing functions for their GPUs.

chris1515 · Apr 13, 2020

Scott_Arm said:
https://github.com/GPUOpen-LibrariesAndSDKs/TAN

AMD has already optimized the most expensive audio processing functions for their GPUs.

Yes, but it needs to be done again because this does not work exactly the same way on the PS5 Tempest Engine, and I think Sony probably did it by optimizing the Wwise API for PS5. This is logical.

patsu · Apr 13, 2020

Scott_Arm said:
https://github.com/GPUOpen-LibrariesAndSDKs/TAN

AMD has already optimized the most expensive audio processing functions for their GPUs.

If you read the wiki for “AMD TrueAudio”, you’ll find brief mentions of Wwise’s value adds (and further optimization) over it.

And specifically for PS5, Sony will need to optimize it some more for their custom hardware, and potentially fit it into (or at least not disturb) their streaming system.

Wwise also has software to help author and manage sound objects during development.

pTmdfx · Apr 13, 2020

chris1515 said:
In the end they decided to customize an AMD part for their own needs; this is some hybrid CU/SPU. This is a bit exotic, but not too much, and for writing the audio lib it might have helped a lot that many people from ICE Team or Sony ATG were very good at programming SPUs. TrueAudio Next reserves some CUs inside a GPU, which is not the same at all. In the end, saying there is nothing exotic about a CU without cache and with an asynchronous DMA programming model is just not true, whatever people say. I don't know how Mark Cerny could be more precise than that.

Let's put it in a very naive view: almost all processors we see these days are von Neumann architectures. They load instructions, they do work as instructed, and they write results back. "A modified CU" makes no difference other than the work it specializes in and what memory it operates with. Perhaps it has more instructions catering to the heavy low-precision matrix maths that audio DSPs favor. Perhaps it has a scratchpad SRAM like SPUs, given the emphasis on a DMA programming model. But these do not constitute a fundamental need to change the general-purpose AMD CU microarchitecture, or how it fits into the system architecture.

For example, AMD and Nvidia are both experienced in customizing CUs for different market segments. Also, a DMA-based model is, eh, what GPUs have been accustomed to, especially considering the original Xbox One and its scratchpad SRAM (exotic for GPUs but common in DSPs). The Tempest CU may totally work like a normal CU at the grand system level, differing only in its resources: e.g. being controlled by a dedicated ACE pipe (priority queues like TAN), implementing a slightly different instruction set with matrix math additions, having neither L0/L1 cache nor LDS nor "graphics stuff" (TMU/export/GDS), and being physically close to a large DMA-addressable scratchpad SRAM.

Cost-effective for both AMD and Sony, and still matches the profile of the Tempest CU as described by Cerny.

TrueAudio Next reserves some CUs inside a GPU, which is not the same at all.

What I expected was for the Tempest CU to be controlled like TrueAudio Next, through AMD's compute front-end, and I acknowledged the fact that PS5 does not carve out a slice from the main pool of CUs like TAN does.

It is important to stress that I am not disputing any communicated information; I am merely trying to interpret it from a software engineer's angle, and perhaps highlight, IMO, how unnecessary a "Yes SPU Not SPU" merry-go-round is.

chris1515 · Apr 13, 2020

pTmdfx said:
Let's put it in a very naive view: almost all processors we see these days are von Neumann architectures. They load instructions, they do work as instructed, and they write results back. "A modified CU" makes no difference other than the work it specializes in and what memory it operates with. Perhaps it has more instructions catering to the heavy low-precision matrix maths that audio DSPs favor. Perhaps it has a scratchpad SRAM like SPUs, given the emphasis on a DMA programming model. But these do not constitute a fundamental need to change the general-purpose AMD CU microarchitecture, or how it fits into the system architecture.

For example, AMD and Nvidia are both experienced in customizing CUs for different market segments. Also, a DMA-based model is, eh, what GPUs have been accustomed to, especially considering the original Xbox One and its scratchpad SRAM (exotic for GPUs but common in DSPs). The Tempest CU may totally work like a normal CU at the grand system level, differing only in its resources: e.g. being controlled by a dedicated ACE pipe (priority queues like TAN), implementing a slightly different instruction set with matrix math additions, having neither L0/L1 cache nor LDS nor "graphics stuff" (TMU/export/GDS), and being physically close to a large DMA-addressable scratchpad SRAM.

Cost-effective for both AMD and Sony, and still matches the profile of the Tempest CU as described by Cerny.

What I expected was for the Tempest CU to be controlled like TrueAudio Next, through AMD's compute front-end, and I acknowledged the fact that PS5 does not carve out a slice from the main pool of CUs like TAN does.

It is important to stress that I am not disputing any communicated information; I am merely trying to interpret it from a software engineer's angle, and perhaps highlight, IMO, how unnecessary a "Yes SPU Not SPU" merry-go-round is.

When I talked about reserving CUs in the GPU, I wasn't speaking about a technical solution but about why they created Tempest. They could have gone with more CUs and given devs the possibility to reserve some CUs for audio, but like on PS3 and PS4, audio would have received only a fraction of the CPU or GPU power outside of VR. This is the reason Ninja Theory also said they are happy to have an audio DSP inside the XSX.

I say this because I know an audio guy working on a PS5 exclusive, and he told me rendering devs were always winning the argument against the audio engineers. But after VR, and with the possibility of promoting 3D audio on YouTube and using it to differentiate the PS5 from the XSX, they decided to dedicate some horsepower to it.

This is not perfect because it does not use a personalized HRTF, but with headphones this video is impressive. They can now use 3D audio inside trailers.

patsu · Apr 13, 2020

MrFox said:
For years I always thought of the SPUs as a DSP-like architecture. Not sure why it's not called that.

DMA a block in, process it in a nice massive register array or local memory, DMA the result out and... OMG the next block is already in! No coffee breaks for dsps...
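The "next block is already in" pattern described above is double buffering. Here is a minimal sketch of the idea, assuming a single background worker stands in for the DMA engine. The names `dma_in`, `stream_process` and `half_gain` are hypothetical, and Python threads only model the overlap of transfer and compute; real SPU or Tempest code would use hardware DMA queues and local memory instead.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

Block = List[float]


def dma_in(source: Sequence[float], start: int, size: int) -> Block:
    """Stand-in for a DMA transfer from main memory into local store."""
    return list(source[start:start + size])


def stream_process(source: Sequence[float], block_size: int,
                   kernel: Callable[[Block], Block]) -> List[float]:
    """Process `source` block by block, prefetching the next block
    while the current one is being worked on."""
    output: List[float] = []
    offsets = list(range(0, len(source), block_size))
    if not offsets:
        return output
    with ThreadPoolExecutor(max_workers=1) as dma:
        in_flight = dma.submit(dma_in, source, offsets[0], block_size)
        for next_offset in offsets[1:] + [None]:
            current = in_flight.result()  # wait until the block has landed
            if next_offset is not None:
                # Start the next transfer before touching the current block.
                in_flight = dma.submit(dma_in, source, next_offset, block_size)
            output.extend(kernel(current))  # "DMA out" modelled as an append
    return output


def half_gain(block: Block) -> Block:
    """Toy audio kernel: attenuate every sample by half."""
    return [sample * 0.5 for sample in block]


if __name__ == "__main__":
    print(stream_process([1.0, 2.0, 3.0, 4.0, 5.0], 2, half_gain))
```

The burden mentioned earlier in the thread shows up here: the programmer, not a cache, decides the block size and when each transfer starts.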

Might be because the SPUs can be fully autonomous after setup (like a CPU). They fetch their own instructions and run them, handle interrupts (in a limited fashion), etc. For example, the PS3 secure kernel runs completely off an SPU’s local memory and is “segregated” from the other cores.

Some day, we may find out how autonomous this custom CU is (if at all).

The Wwise acquisition is interesting in the sense that they also have a large library of software-based audio effects (on the CPU). Pretty good for cross platform development. Developers can deploy their audio solution to CPU or GPU or custom units as they see fit.

iroboto · Apr 13, 2020

Globalisateur said:
The former principal software engineer on PS5 is implying that a narrow and fast design could be an advantage in some cases for RT.

https://twitter.com/x/status/1249420046855598080

I think the answer to his question depends highly on memory/latency of data to feed the CUs.

Jay · Apr 13, 2020

patsu said:
The Wwise acquisition

Wow, didn't even know this happened!
