Technical investigation into PS4 and XB1 audio solutions *spawn

Brad Grenz

Philosopher & Poet
Veteran
The PS4 has its own audio processor and doesn't need to do most of the Kinect-related audio processing. There's no reason it should need to devote GPU resources to audio.
 
The PS4 has its own audio processor and doesn't need to do most of the Kinect-related audio processing. There's no reason it should need to devote GPU resources to audio.

What we know is that SHAPE is a 1-200 GF block, not a simple audio chip. Even an 8-core CPU in the dev kits is not enough to emulate SHAPE, while the audio chip in the PS4 will probably be something similar to the PS3 one. Do you have any link that proves otherwise?
 
What we know is that SHAPE is a 1-200 GF block, not a simple audio chip. Even an 8-core CPU in the dev kits is not enough to emulate SHAPE, while the audio chip in the PS4 will probably be something similar to the PS3 one. Do you have any link that proves otherwise?

Mark Cerny has already said it would do the processing for hundreds of audio streams.

And I really wish people would stop throwing around the GFLOP number for SHAPE; it's practically meaningless. It's a fixed-function device, you can't do anything else with the flops, and we have no idea how the number was calculated. Nearly all of its cores are integer, after all.

To do the majority of the normal audio functions of SHAPE (i.e. none of the Kinect stuff) would probably take less than a single Jaguar core, I'm led to believe.
 
What we know is that SHAPE is a 1-200 GF block, not a simple audio chip. Even an 8-core CPU in the dev kits is not enough to emulate SHAPE, while the audio chip in the PS4 will probably be something similar to the PS3 one. Do you have any link that proves otherwise?

PS3 didn't have an audio processor. Do you have a link that proves the PS4 solution is vastly inferior? No. I didn't think so. But guess what, even if SHAPE is capable of producing higher fidelity sound that doesn't mean PS4 devs will have to dedicate additional resources to make up the difference. They'll just get the best sound they can out of the PS4's audio chip and call it a day.
 
Mark Cerny has already said it would do the processing for hundreds of audio streams.

And I really wish people would stop throwing around the GFLOP number for SHAPE; it's practically meaningless. It's a fixed-function device, you can't do anything else with the flops, and we have no idea how the number was calculated. Nearly all of its cores are integer, after all.

To do the majority of the normal audio functions of SHAPE (i.e. none of the Kinect stuff) would probably take less than a single Jaguar core, I'm led to believe.

As bkillian stated in the related thread, there's a huge night-and-day difference between one hundred hardware voices with only 4-5 getting full effects and one hundred voices all with full effects. This should tell us something.
SHAPE is an audio block of 4 processors, and only one small one of them is for echo cancelling and audio computing for Kinect (vgleaks).

The same bkillian compared SHAPE to how much resource would be needed in CU terms; that's why I'm using the GF figure. Even fixed-function hardware uses floating-point calculations, and bkillian has hinted at some FP functions in the related thread.

So the point is: if an 8-core CPU can't reach the same number of effected voices to emulate SHAPE, what will developers do when porting?
Inferior audio, or 1-2 CUs dedicated to audio computing?
 
As bkillian stated in the related thread, there's a huge night-and-day difference between one hundred hardware voices with only 4-5 getting full effects and one hundred voices all with full effects. This should tell us something.
SHAPE is an audio block of 4 processors, and only one small one of them is for echo cancelling and audio computing for Kinect (vgleaks).

The same bkillian compared SHAPE to how much resource would be needed in CU terms; that's why I'm using the GF figure. Even fixed-function hardware uses floating-point calculations, and bkillian has hinted at some FP functions in the related thread.

So the point is: if an 8-core CPU can't reach the same number of effected voices to emulate SHAPE, what will developers do when porting?
Inferior audio, or 1-2 CUs dedicated to audio computing?

Why are we using CU terms when we have CPUs, which are probably better at the tasks anyway? IIRC only one of the cores uses floating-point ops at all.

You went from CUs to cores very quickly.

I've seen no one say that an 8-core CPU couldn't reach the same number of voices. After all, SHAPE does much more than voices. I could believe that an 8-core CPU maybe couldn't do _ALL_ of SHAPE, but I have yet to see anyone provide any evidence that a CPU couldn't do all of the _NORMAL_ audio processing.
 
Why are we using CU terms when we have CPUs, which are probably better at the tasks anyway? IIRC only one of the cores uses floating-point ops at all.

You went from CUs to cores very quickly.

I've seen no one say that an 8-core CPU couldn't reach the same number of voices. After all, SHAPE does much more than voices. I could believe that an 8-core CPU maybe couldn't do _ALL_ of SHAPE, but I have yet to see anyone provide any evidence that a CPU couldn't do all of the _NORMAL_ audio processing.

Well, are you really putting an echo-cancelling chip on the same level as hundreds of fully effected, 3D-positioned voices?

Another thing spotted in the audio discussion was that every player on X1 has a headset with a separate 3D audio stream from SHAPE (PS4 will have a mono headset, so it will lack this kind of audio experience), so those hundreds of fully effected voices have to be 3D-computed for each of 4 players. I can easily understand why an 8-core CPU can't do the same.

If by "NORMAL audio processing" you mean mixing 100 voices with only 4-5 getting full effects and the same mono output to all the players, then yes, this could take little (bkillian says that on the X360 this kind of audio takes 1-3 threads of the 6-thread CPU).


Errata corrige: Kinect work is NOT done by SHAPE at all:

audio.jpg
 
Well, are you really putting an echo-cancelling chip on the same level as hundreds of fully effected, 3D-positioned voices?

Another thing spotted in the audio discussion was that every player on X1 has a headset with a separate 3D audio stream from SHAPE (PS4 will have a mono headset, so it will lack this kind of audio experience), so those hundreds of fully effected voices have to be 3D-computed for each of 4 players. I can easily understand why an 8-core CPU can't do the same.

If by "NORMAL audio processing" you mean mixing 100 voices with only 4-5 getting full effects and the same mono output to all the players, then yes, this could take little (bkillian says that on the X360 this kind of audio takes 1-3 threads of the 6-thread CPU).

Normal audio processing being none of the Kinect stuff, I wouldn't be surprised if the PS4 chip could do audio mixing and whatnot as well.

I can't understand why an 8-core CPU that's probably 10-20x faster than the actual DSP cannot do it. It's all the Kinect stuff and voice recognition that is probably the processor-intensive part.

Audio is cheap, the voice + Kinect stuff isn't.

Errata corrige: Kinect work is NOT done by SHAPE at all:

audio.jpg

From this exact diagram, all SHAPE does is decode streams, mix with clip detection, run an EQ and state variable filters on them, and also do sample rate conversion. I see no reason a single Jaguar core cannot do all of this pretty quickly.
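
For readers unfamiliar with the "mix with clip detection" stage, here is a minimal Python sketch of what such a step generally involves: summing gain-scaled voices and saturating the result. It assumes 24-bit signed integer samples and plain float gains; the function name and data layout are hypothetical and are not taken from SHAPE or any console SDK.

```python
# Sketch only: mixing integer voices with clip detection.
# Assumptions: 24-bit signed samples, float gains, lists as frames.
# None of these names come from SHAPE documentation.
INT24_MAX = 2**23 - 1
INT24_MIN = -(2**23)


def mix_with_clip_detection(voices, gains):
    """Sum gain-scaled voices sample by sample and saturate to 24-bit range.

    Returns the mixed frame and the number of samples that had to be clamped.
    """
    frame_len = len(voices[0])
    mixed = []
    clipped = 0
    for i in range(frame_len):
        # Accumulate in an unbounded Python int so overflow is detectable.
        acc = sum(int(voice[i] * gain) for voice, gain in zip(voices, gains))
        if acc > INT24_MAX:
            acc, clipped = INT24_MAX, clipped + 1
        elif acc < INT24_MIN:
            acc, clipped = INT24_MIN, clipped + 1
        mixed.append(acc)
    return mixed, clipped


if __name__ == "__main__":
    loud = [8_000_000, -8_000_000, 100]
    quiet = [1_000_000, -1_000_000, 200]
    print(mix_with_clip_detection([loud, quiet], [1.0, 1.0]))
```

The per-sample work is one multiply and one add per voice plus two comparisons, which is the kind of cost being argued about here. It says nothing about the decode, EQ, or sample rate conversion stages.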
 
Normal audio processing being none of the Kinect stuff, I wouldn't be surprised if the PS4 chip could do audio mixing and whatnot as well.

I can't understand why an 8-core CPU that's probably 10-20x faster than the actual DSP cannot do it. It's all the Kinect stuff and voice recognition that is probably the processor-intensive part.

Audio is cheap, the voice + Kinect stuff isn't.

Please re-read my previous reply.
Kinect stuff is not done by SHAPE (source: vgleaks) but by two audio processors outside SHAPE.
We haven't any other speed hint apart from what bkillian says, so the CPU being 10-20x faster doesn't make any sense to me.


Answer to your reply after your edit:
From this exact diagram, all SHAPE does is decode streams, mix with clip detection, run an EQ and state variable filters on them, and also do sample rate conversion. I see no reason a single Jaguar core cannot do all of this pretty quickly.

I don't think so.

For example, the FLT/VOL filter can provide low pass, high pass, band pass, or notch filtering, and exposes Q and cutoff/center frequency parameters. It is used most commonly for distance and occlusion modeling. You know what this means for 4 different 3D audio streams,
and so on. Anyway, there's a thread where bkillian explains why this audio is so heavy to compute; take a look if you are interested.
 
Please re-read my previous reply.
Kinect stuff is not done by SHAPE (source: vgleaks) but by two audio processors outside SHAPE.
We haven't any other speed hint apart from what bkillian says, so the CPU being 10-20x faster doesn't make any sense to me.

It seems you're right, Kinect is worked on by something else.

SHAPE works on a 128-sample, 24-bit integer audio frame.

Which a CPU could process pretty quickly... Either way, the PS4 has an audio chip which we know does the audio processing for it. To what extent we don't know, but until we do, it's probably a smart decision to see it as similar to SHAPE.
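
To make "pretty quickly" a bit more concrete, the following Python sketch works out how many CPU cycles one core would have per voice per sample when processing 128-sample frames. Only the frame length comes from the thread; the 48 kHz sample rate, the 1.6 GHz clock, and the 128-voice count are assumptions chosen for illustration, and the function name is hypothetical.

```python
# Back-of-the-envelope budget, not a benchmark.
# Assumed (not stated in the thread): 48 kHz output rate, 1.6 GHz core clock.
def cycles_per_voice_sample(clock_hz, sample_rate_hz, frame_len, voices):
    """Cycles one core can spend on each sample of each voice per frame."""
    frame_seconds = frame_len / sample_rate_hz      # real-time length of a frame
    cycles_per_frame = clock_hz * frame_seconds     # cycles available in that time
    return cycles_per_frame / (frame_len * voices)  # spread over all voice samples


if __name__ == "__main__":
    budget = cycles_per_voice_sample(1.6e9, 48_000, 128, 128)
    print(f"{budget:.1f} cycles per voice-sample")
```

Under those assumptions the budget comes out to roughly 260 cycles per voice-sample on a single core. Whether that is enough depends entirely on how many effects each voice carries and on how many separate listener mixes are rendered, which is exactly the point in dispute.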


For example, the FLT/VOL filter can provide low pass, high pass, band pass, or notch filtering, and exposes Q and cutoff/center frequency parameters. It is used most commonly for distance and occlusion modeling. You know what this means for 4 different 3D audio streams,
and so on. Anyway, there's a thread where bkillian explains why this audio is so heavy to compute; take a look if you are interested.

You literally copied and pasted half that sentence from vgleaks.

Low pass, high pass, band pass or notch filtering seem like they would be pretty trivial to do on a CPU. I've only looked at two (low and high pass) in an actual EE sense before, but they are pretty trivial, and doing them on a CPU would take a pretty small amount of time.
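
As an illustration of the per-sample cost, here is a minimal Python sketch of a textbook Chamberlin state variable filter, which yields low pass, high pass, band pass, and notch outputs from one structure and takes cutoff and Q parameters. It is a floating-point sketch under an assumed 48 kHz sample rate; it is not SHAPE's implementation, which is described as integer, and the class name is hypothetical.

```python
import math


class StateVariableFilter:
    """Chamberlin state variable filter (textbook form, float arithmetic).

    Assumption: cutoff well below sample_rate / 6, where this form is stable.
    """

    def __init__(self, cutoff_hz, q, sample_rate_hz=48_000):
        self.f = 2.0 * math.sin(math.pi * cutoff_hz / sample_rate_hz)
        self.damp = 1.0 / q  # lower Q means more damping
        self.low = 0.0
        self.band = 0.0

    def process(self, x):
        """Filter one sample; returns (low, high, band, notch) outputs."""
        self.low += self.f * self.band
        high = x - self.low - self.damp * self.band
        self.band += self.f * high
        notch = high + self.low
        return self.low, high, self.band, notch


if __name__ == "__main__":
    svf = StateVariableFilter(cutoff_hz=1000.0, q=0.707)
    for _ in range(2000):
        outputs = svf.process(1.0)  # constant (DC) input
    print(outputs)  # low pass settles near 1.0, high and band pass near 0.0
```

Each sample costs three multiplies and a few adds for a single voice. That supports the "handful of instructions" view for one filter, but it does not by itself answer how the total scales across a hundred or more voices and several listener mixes.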
 
It seems you're right, Kinect is worked on by something else.

SHAPE works on a 128-sample, 24-bit integer audio frame.

Which a CPU could process pretty quickly...

Well, I'm not an audio engineer, but bkillian is, and if he says that there's a huge difference between one hundred voices with only a few fully effected and one hundred voices all with effects, I'm going to believe him.
Going from 4-5 fully effected voices to 100-128 fully effected voices can be a 15-25x difference, not the same ballpark in my honest opinion.

Either way, the PS4 has an audio chip which we know does the audio processing for it. To what extent we don't know, but until we do, it's probably a smart decision to see it as similar to SHAPE.

This is speculation of yours. I think that such a powerful audio block is a surprise for everyone. Probably Sony was happy with the X-Fi audio chip in the PS3 and the chip in the PS4 is the same, why not?



Edit to answer your edited reply:

You literally copied and pasted half that sentence from vgleaks

Yes, because vgleaks is the main source along with bkillian, and my English isn't good enough to change the words without maybe making errors. What's the problem?
 
Well, I'm not an audio engineer, but bkillian is, and if he says that there's a huge difference between one hundred voices with only a few fully effected and one hundred voices all with effects, I'm going to believe him.
Going from 4-5 fully effected voices to 100-128 fully effected voices can be a 15-25x difference, not the same ballpark in my honest opinion.



This is speculation of yours. I think that such a powerful audio block is a surprise for everyone. Probably Sony was happy with the X-Fi audio chip in the PS3 and the chip in the PS4 is the same, why not?



Edit to answer your edited reply:



Yes, because vgleaks is the main source along with bkillian, and my English isn't good enough to change the words without maybe making errors. What's the problem?



The PS3 didn't contain a chip at all to do any audio processing.

I honestly don't see even lots of effects running on a CPU taking more than a single core.

The problem with copying and pasting replies is that you're copying things you don't understand.

Most of the filters it mentions take a trivial number of components, and on the CPU they would be a handful of instructions, but I guess it looks impressive when you mention them all.
 
The PS3 didn't contain a chip at all to do any audio processing.

I honestly don't see even lots of effects running on a CPU taking more than a single core.

I have bkillian's words to go on. If you make this kind of statement, give us some proof and evidence.

The problem with copying and pasting replies is that you're copying things you don't understand.

As you don't understand what you are trying to answer. Keep the discussion clean of personal attacks, please.
 
The PS3 had superior audio output this generation and I can't recall a single instance when gamers or reviews gave the PS3 the edge due to this advantage. Most gamers are using stereo output on their TV or 5.1 to their audio receiver so I'm not even convinced the difference will amount to something that most could identify if one platform does get superior audio treatment from publishers.

I think it's much more likely that publishers release games with stereo, 5.1 and 7.1 options on both consoles and the vast majority don't notice any difference between the audio this generation and next.

The graphics potentially could be a different story, but I think it's possible that publishers focus on a common framework and then make minor tweaks to satisfy the manufacturers: timed exclusivity, DLC release priority, unique characters and endings, for example.

I don't expect very many people to base their decision on which version to purchase on an audio spec, and anyone who truly thinks that is relevant in the mind of consumers should show where audio has been successfully marketed before as a key differentiator.
 
The PS3 had superior audio output this generation and I can't recall a single instance when gamers or reviews gave the PS3 the edge due to this advantage. Most gamers are using stereo output on their TV or 5.1 to their audio receiver so I'm not even convinced the difference will amount to something that most could identify if one platform does get superior audio treatment from publishers.

I think it's much more likely that publishers release games with stereo, 5.1 and 7.1 options on both consoles and the vast majority don't notice any difference between the audio this generation and next.

The graphics potentially could be a different story, but I think it's possible that publishers focus on a common framework and then make minor tweaks to satisfy the manufacturers: timed exclusivity, DLC release priority, unique characters and endings, for example.

I don't expect very many people to base their decision on which version to purchase on an audio spec, and anyone who truly thinks that is relevant in the mind of consumers should show where audio has been successfully marketed before as a key differentiator.

bkillian's explanation wasn't about audio channels, it was about sound processing.
 
We might see better audio from all the extra RAM the consoles have, though.

I'm not trying to offend anyone who works in this area but how does that translate into marketing advantage for one manufacturer or another? Can you tell me how audio improved moving from PS2 to PS3? Can you point to a single instance when you chose a title on PS3 over 360 due to it supporting 7.1 versus 5.1 on the 360? Are there any websites or blogs devoted to gaming audio? How many faceoffs are done with blindfolds and headphones???

I'm not trying to be a jerk here, but I think anyone arguing that one platform has superior audio is not only comical, it's barely relevant. The marketplace hasn't shown much of an interest in this; in fact, when have you seen a console reveal where more than a cursory reference is even given to audio?

This might sound harsh, but we've seen some here argue that one platform had better specs which would be revealed later, and endured speculation about multiple APUs that were kept hidden to trick Sony. When that was proven false, the cloud suddenly became this harbinger of potential that would make up the gap, and now it's cloud and audio superiority. Most of this is wishful thinking and more emotional than anything else.

MS made a conscious decision not to focus on graphics. They chose to invest resources in Kinect and TV because they have something else in mind, and their success this time will come down to how well they can execute on that vision and how right their business plan turns out to be.
 
We might see better audio from all the extra RAM the consoles have, though.

Not to mention the extra storage space afforded by both systems using Blu-ray discs. I am guessing the reason the Xbox 360 had inferior audio compared to PS3 is partially due to the size limits of DVD.
 
I'm not trying to offend anyone who works in this area but how does that translate into marketing advantage for one manufacturer or another? Can you tell me how audio improved moving from PS2 to PS3? Can you point to a single instance when you chose a title on PS3 over 360 due to it supporting 7.1 versus 5.1 on the 360? Are there any websites or blogs devoted to gaming audio? How many faceoffs are done with blindfolds and headphones???

I'm not trying to be a jerk here, but I think anyone arguing that one platform has superior audio is not only comical, it's barely relevant. The marketplace hasn't shown much of an interest in this; in fact, when have you seen a console reveal where more than a cursory reference is even given to audio?

This might sound harsh, but we've seen some here argue that one platform had better specs which would be revealed later, and endured speculation about multiple APUs that were kept hidden to trick Sony. When that was proven false, the cloud suddenly became this harbinger of potential that would make up the gap, and now it's cloud and audio superiority. Most of this is wishful thinking and more emotional than anything else.

MS made a conscious decision not to focus on graphics. They chose to invest resources in Kinect and TV because they have something else in mind, and their success this time will come down to how well they can execute on that vision and how right their business plan turns out to be.

Excellent post, my thinking exactly.
 