The former principal software engineer on PS5 is implying that a narrow and fast design could be an advantage in some cases for RT.
Other than clock speeds, specs are unlikely to change. Clock speeds are the only thing that can change in a short period, and I very much doubt that Sony is going to up their clock speeds even more.
You linked to a personality on Twitter. That doesn't form the basis of an argument, even if you link to Lord Cerny himself! Link not to the account, but to the evidence - the quote or article. In this case, we have someone calling it an SPU like an SPU from PS3, and yes, I acknowledged that some people may be referring to it as that. However, it's not a PS3 SPU (as you recognise).
And this is not a Twitter personality but an ICE Team programmer and ex-Q-Games dev.
Again, I don't understand what exactly the argument is here. I'm saying Tempest is a CU with the caches replaced with a DMA solution so the compute capabilities of the CU can be better used for streaming data.
And a scratchpad without a cache is a little bit exotic these days.
Contrary to Cell, they probably have easy APIs available to work with the Tempest CU. But in the end the programmer uses it differently from a CU in a GPU. This is why Mark Cerny presented it like this. If the programmer tries to use it like a CU in a GPU, it will not work well. That's all I said: it is exotic, and something not seen inside a console since 2013. This DMA model was the reason many devs hated the CELL, because it left the burden of memory management to the developer. Here it will be tolerated because it is easier to do with audio workloads.
But I think bringing the CELL back for backward compatibility is not a good idea from a cost point of view, and using it for new development outside of audio, the most trivial case, would be even worse. I can imagine what devs would think about another ISA, a different endianness and an in-order PPU.
Most devs are happy that PS4 is a better console than the exotic PS3. In the end there are enough problems with the huge budgets of AAA games, and it is more important to be indie friendly and not make devs' lives difficult. A good old x86 CPU, or an ARM CPU if they change one day, works perfectly.
Sony is again a bit more custom, at least for things outside the CPU and GPU: from the SSD to the Tempest Engine and the new functionality added to the controller. Will all of this pay off? Time will tell.
Edit:
Ex-Frostbite technical director, one of the many CELL haters:
Contrary to Cell they probably have easy APIs available to work with the Tempest CU.
Modified compute GPU API probably...
Quite possibly not even that. If the engine is an audio processor, Sony will provide the audio library. There's a reason they bought up the makers of Wwise.
For the sake of hypothetical PS3 BC, is there anything a Cell SPE or PPE does at 3.2GHz that couldn't be emulated by a Zen 2 core and its much more powerful 256-bit FPU at 3.5GHz?
So the Wwise stack is already integrated into the workflow and GPU architecture (e.g., ray tracing hardware)?
https://www.audiokinetic.com/products/wwise-spatial-audio/
... and then the custom CU for rendering the audio efficiently (SPU style) for the perceivable objects and environmental sound sources?
Maybe? Honestly, it's probably better to wait for Sony to say more. They have bits and pieces of relevant technologies in the new system. The RPCS3 project sounds interesting too.
But it may be a business decision first and foremost, which is out of scope in Console Technology.
https://github.com/GPUOpen-LibrariesAndSDKs/TAN
AMD has already optimized the most expensive audio processing functions for their GPUs.
In the end they decided to customize an AMD part for their own needs; this is some hybrid CU/SPU. This is a bit exotic, but not too much, and many people from the ICE Team or Sony ATG were very good at programming SPUs, which might have helped a lot when writing the audio lib. TrueAudio Next reserves some CUs inside a GPU, which is not the same at all. In the end, saying there is nothing exotic about a CU without cache and with an asynchronous DMA programming model is just not true, whatever people say. I don't know how Mark Cerny could be more precise than that.
Let's put it in a very naive view: almost all processors we see these days are von Neumann architectures. They load instructions, they do work as instructed, and they write results back. "A modified CU" makes no difference other than the work it specializes in and what memory it operates with. Perhaps it has more instructions catering to the heavy low-precision matrix maths that audio DSPs favor. Perhaps it has a scratchpad SRAM like the SPUs, given the emphasis on a DMA programming model. But none of this requires a fundamental change to the general-purpose AMD CU microarchitecture, or to how it fits into the system architecture.
For example, AMD and Nvidia are both experienced in customizing CUs for different market segments. Also, a DMA-based model is, eh, what GPUs have been accustomed to, especially considering the Xbox One and its scratchpad SRAM (exotic for GPUs but common in DSPs). The Tempest CU may totally work like a normal CU at the grand system level, differing only in its resources: e.g. being controlled by a dedicated ACE pipe (priority queues like TAN), implementing a slightly different instruction set with matrix math additions, having neither L0/L1 cache nor LDS nor "graphics stuff" (TMU/export/GDS), and being physically close to a large DMA-addressable scratchpad SRAM.
Cost-effective for both AMD and Sony, and still matches the profile of the Tempest CU as described by Cerny.
What I expected was for the Tempest CU to be controlled like TrueAudio Next, through AMD's compute front-end, and I acknowledged the fact that PS5 does not carve out a slice of the main pool of CUs like TAN does.
It is important to stress that I am not disputing any communicated information. I am merely trying to interpret it from a software engineer's angle, and perhaps highlight how unnecessary, IMO, the "Yes SPU Not SPU" merry-go-round is.
For years I always thought of the SPUs as a DSP-like architecture. Not sure why it's not called that.
DMA a block in, process it in a nice massive register array or local memory, DMA the result out and... OMG the next block is already in! No coffee breaks for DSPs...
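To make that DMA-in / process / DMA-out loop concrete, here is a rough Python sketch of double buffering. Everything in it (`dma_in`, `process`, `stream`, the gain kernel, the block size) is made up for illustration; it is not an SPU or Tempest API. The "DMA" here is a plain synchronous copy, whereas on real hardware the transfer would run asynchronously and overlap with the compute.

```python
# Illustrative sketch only: simulates double-buffered streaming in plain Python.
# dma_in, process and stream are hypothetical names, not a real SPU/Tempest API.

def dma_in(source, block_index, block_size):
    """Pretend DMA: copy one block from 'main memory' into a local buffer."""
    start = block_index * block_size
    return list(source[start:start + block_size])


def process(block, gain=0.5):
    """Stand-in for the DSP kernel: apply a gain to every sample."""
    return [sample * gain for sample in block]


def stream(source, block_size=4):
    """Process 'source' block by block, always fetching the next block early."""
    num_blocks = (len(source) + block_size - 1) // block_size
    output = []
    buffers = [None, None]  # two local buffers, used alternately
    if num_blocks:
        buffers[0] = dma_in(source, 0, block_size)  # prime the first buffer
    for i in range(num_blocks):
        current = i % 2
        if i + 1 < num_blocks:
            # On real hardware this transfer would be asynchronous and would
            # overlap with the process() call below.
            buffers[1 - current] = dma_in(source, i + 1, block_size)
        output.extend(process(buffers[current]))  # "DMA out" the result
    return output


if __name__ == "__main__":
    print(stream([2, 4, 6, 8, 10], block_size=2))
```

The point is just the shape of the loop: while one buffer is being processed, the other is already being filled, so the kernel never waits on memory as long as the transfer finishes before the compute does.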
I think the answer to his question depends highly on memory/latency of data to feed the CUs.
Wow, didn't even know this happened! (The Wwise acquisition.)