Technical investigation into PS4 and XB1 audio solutions *spawn

I thought someone from Microsoft said that the audio chip wasn't programmable by game devs and that it was mostly for Kinect?

Nope, that's not entirely the case, and people have been misinterpreting those comments. The DSPs are not programmable by devs and appear, as you say, mostly reserved for Kinect processing and some other system functions. The fixed-function blocks, aka SHAPE, are something else entirely: the six fixed-function blocks that make up SHAPE are 100% available for programmers to do with as they please. It was designed with that purpose in mind. According to bkilian, the folks who designed the audio block realized they had some extra space to fit something else in there for free, and that's apparently how SHAPE came about.

The catch is just that the portion actually available for devs to program as they like is a small slice of what the full audio block on Xbox One is capable of, but it was never suggested that the entire audio block belongs to Kinect. The Xbox One audio block is more than a single chip: it's four Tensilica DSP processors, plus the SHAPE block, which is made up of six fixed-function units that are all available to developers.
 
You describe SHAPE as both fixed function and fully programmable. Which is it? My understanding is that there are no programmable elements of SHAPE exposed to developers, but that one or more of the programmable DSPs, currently dedicated to Kinect, could in theory be released to developers in the future.
 
I think how the fixed-function pipelines are accessed and used is at the developer's control, so it's a programmable pipeline using fixed-function hardware.
 
I think how the fixed-function pipelines are accessed and used is at the developer's control, so it's a programmable pipeline using fixed-function hardware.

In that sense, it's like the ACP -- configurable and usable by the game programmers in their audio pipeline, but can't be changed/overwritten directly or fundamentally by them. OTOH, it is different from ACP because the fixed function blocks are higher level and/or specialized audio IP.

When playing Blu-ray and other media, I assume the XB1 DSPs are used instead of SHAPE? If so, those DSPs are not just for Kinect. Tensilica DSPs support many AV codecs. PS4 is said to support AMD's TrueAudio. I don't know whether it coexists with or replaces the same codecs in Tensilica (TrueAudio supports more gaming-oriented use cases, like 3D positional audio, compared to Tensilica's more general-purpose pre- and post-processing).

Sony (and MS) are also capable of rolling their own codecs and effects in the DSPs.
 
You describe SHAPE as both fixed function and fully programmable. Which is it? My understanding is that there are no programmable elements of SHAPE exposed to developers, but that one or more of the programmable DSPs, currently dedicated to Kinect, could in theory be released to developers in the future.

I don't know, unless I misinterpreted his meaning in these posts.

Yeah, hard to do, because the Tensilica cores are configurable. This one is configured similarly, although there will still be differences. The 16/32-bit difference may also explain why their result is about 2x what I expected. I do know that the MS vector cores have full 32-bit float vector engines, because that's what the speech pipeline uses.

As far as I know, game developers do not have access to the 4 DSP cores. They are all system managed. They have access to codec algorithms running on the cores, and full access to the fixed function hardware. Much to the audio team's chagrin, the speech team bogarted the two vector cores. I know there was some internal pressure to force the speech team to give up some of their CPU so that developers could use it, but I have no idea if anything ever materialized from that.

Any numbers I could give you would be irrelevant. Games are not about FLOPS. SHAPE was created in order to give developers more flexibility in where they want to spend development time. Do you a) spend a bunch of time converting your audio engine over to SHAPE, and then save time on optimizing your physics engine, or b) leave the audio on CPU and spend a bunch of time optimizing the hell out of your physics engine? In the end, you will optimize and fiddle with the engines exactly the amount required to get the effect the designers want.

I like to use the example of the HD DVD player on the 360, since I have personal experience with it. When we started, we were using 100% of the 360, and getting 1 fps on a 24fps H.264 stream. When we finished, we were using 100% of the 360, and getting 24fps on a 24fps H.264 stream. Could we have optimized even more and gotten 30 or 60fps on that stream? Yes. Did we? No. It performed to spec, so we stopped right there.

In theory, the audio hardware in the One can produce results that could not be replicated by the entire 360 CPU. But what about good enough? If you don't use a polyphase SRC, and just go for the standard linear SRC? Now your audio is not quite as good (but you'd be hard pressed to tell most of the time) and you've reduced your CPU requirements a ton.
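The linear-vs-polyphase SRC trade-off described here can be sketched in a few lines. This is a toy illustration of a linear-interpolation sample rate converter, not how SHAPE or any shipping engine implements it:

```python
def linear_src(samples, src_rate, dst_rate):
    """Naive linear-interpolation sample rate converter.

    A polyphase SRC filters with a bank of FIR filters and suppresses
    aliasing properly; this toy version just interpolates between
    neighbouring samples, which is far cheaper but lets aliasing through.
    """
    ratio = src_rate / dst_rate
    out = []
    pos = 0.0
    while pos < len(samples) - 1:
        i = int(pos)
        frac = pos - i
        out.append(samples[i] * (1.0 - frac) + samples[i + 1] * frac)
        pos += ratio
    return out

# Downsample a 48 kHz ramp to 24 kHz: roughly every other sample survives.
print(linear_src([float(n) for n in range(8)], 48000, 24000))
```

For a smooth ramp this looks harmless; on real audio the missing filtering is what separates "good enough" from the polyphase result bkilian contrasts it with.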

SHAPE, if it were utilized 100% at all times, would be hard to equal in CPU, yes. But they don't need to equal it. For one, it's using better but more expensive algorithms, for which the cheaper versions work fine and have been used for the last generation without complaint. And second, it's highly doubtful that it will be utilized 100% for most games. I'd be surprised if developers used even 50% of its capabilities for most titles.

By the time developers are looking to push the capabilities of the audio block, I suspect using GPU compute audio will be well understood and a reasonable solution.

Not really. SHAPE can't do convolution reverb, which Audiokinetic just said the AMD tech can do. What SHAPE can do is free up an entire CPU core that the game can then use for convolution reverb :)

Relab must be seething that all the game companies are hyping convolution reverb and ignoring their high-end, cheaper, and better custom reverb algorithms. Don't say I didn't warn you. :)
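For reference, the convolution reverb discussed here is just the dry signal convolved with a recorded impulse response, which is why it is so compute- and bandwidth-hungry. A naive direct-form sketch (real engines use partitioned FFT convolution, and none of this reflects SHAPE's actual internals):

```python
def convolve(dry, impulse_response):
    """Direct-form convolution reverb: each output sample is a weighted
    sum of recent input samples, weighted by the impulse response.
    Cost is O(N * M), which is why a seconds-long IR can eat a CPU core.
    """
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for n, x in enumerate(dry):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out

# A unit impulse through an IR just reproduces the IR (the room's echoes).
print(convolve([1.0, 0.0], [0.5, 0.25]))
```

Algorithmic custom reverbs of the kind Relab sells avoid this cost entirely by synthesizing the tail with feedback delay networks instead of storing it.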

These comments, among a few others, seem to imply that developers have a certain degree of control over what they do with SHAPE. Doesn't this make it seem like it can be programmed by developers?
 
I don't know, unless I misinterpreted his meaning in these posts.

[quoted posts snipped]

These comments, among a few others, seem to imply that developers have a certain degree of control over what they do with SHAPE. Doesn't this make it seem like it can be programmed by developers?

"Full access to fixed function h/w" means you can configure and call these function blocks (exclusively). But since these are, as the name implies, fixed functions, you can't rewrite them. MS may be able to.
 
"Full access to fixed function h/w" means you can configure and call these function blocks (exclusively). But since these are, as the name implies, fixed functions, you can't rewrite them. MS may be able to.

Oh okay, so they're just very specific function hardware designed for audio that developers can choose to incorporate into their games or not. So it's better to say developers have full access to them and can use each fixed-function block as much as they're able within its designed capabilities, but can't reprogram them to do things other than what they're intended to do. But then what's the difference? Would you even want to rewrite them in the first place if they're already doing as intended?

In essence, don't you still code for them in order to use specific portions of the hardware up to a certain point? Or is there zero coding involved, just some simpler kind of configuration that Microsoft or some middleware company provides? I'm a little confused by the distinction being made here. You can't "program" something just by using fixed-function hardware? You specifically have to be able to rewrite its core functionality?
 
I don't know, unless I misinterpreted his meaning in these posts.

[quoted posts snipped]

These comments, among a few others, seem to imply that developers have a certain degree of control over what they do with SHAPE. Doesn't this make it seem like it can be programmed by developers?
Your 3rd quote of bkilian's post is quite disappointing; if developers have something cool at their disposal, they should use it.

I kinda consider myself an audiophile, and sound is often overlooked these days; most gamers drool over graphics and it seems they couldn't care less about decent sound. And that's why we have on-screen radars in FPSs.

As you can see in this low quality video from 2006, a guy named Renkie is playing with a modified game so he can't see his opponent.

Like the description of the video says, all he has to go on to confront his opponent is the Razer Barracuda HP-1 5.1 surround headset.

Using audio alone with invisible models, yet knowing where the enemies are, is pretty remarkable. I wonder if you can do that on PS4 or Xbox One via the audio chip without resorting to something like A3D.

 
I kinda consider myself an audiophile, and sound is often overlooked these days; most gamers drool over graphics and it seems they couldn't care less about decent sound. And that's why we have on-screen radars in FPSs.

A rare breed we are... but you have to remember, the majority of people are still using their TVs' default stereo speakers, followed by headphones. And those owning surround sound systems most likely have them set up improperly as well.
 
I kinda consider myself an audiophile, and sound is often overlooked these days; most gamers drool over graphics and it seems they couldn't care less about decent sound. And that's why we have on-screen radars in FPSs.
Low FoV is probably just as large a factor. "Player character in a console shooter" is an occupation with a 100% risk of immediate, severe glaucoma.

I wouldn't be surprised if people acclimated to good directional sound in games feel a need for better sound when the radar is turned off. But that can absolutely cause a sense of tunnel vision, too.

It gets tricky for game developers because they don't actually have much to work with for player feedback. Good sound directionality would be a fantastic thing, but it's not something developers can rely on, because people don't necessarily have a setup that can take advantage of it beyond what you can do with a pair of speakers with, hopefully, correct left-right orientation. (And who knows, maybe there's someone out there who managed to run their PS4/XO to an old CRT with mono sound, though I'm sure most developers wouldn't mind throwing that guy under the bus.)
 
Your 3rd quote of bkilian's post is quite disappointing; if developers have something cool at their disposal, they should use it.

I kinda consider myself an audiophile, and sound is often overlooked these days; most gamers drool over graphics and it seems they couldn't care less about decent sound. And that's why we have on-screen radars in FPSs.

As you can see in this low quality video from 2006, a guy named Renkie is playing with a modified game so he can't see his opponent.

Like the description of the video says, all he has to go on to confront his opponent is the Razer Barracuda HP-1 5.1 surround headset.

Using audio alone with invisible models, yet knowing where the enemies are, is pretty remarkable. I wonder if you can do that on PS4 or Xbox One via the audio chip without resorting to something like A3D.


Counter-Strike 1.5-1.6 used to be my most heavily played games, so I definitely agree with the importance of being able to locate your opponents based on sound. Sound also plays a big role in why Halo 4 is my favorite game in the franchise.
 
Counter-Strike 1.5-1.6 used to be my most heavily played games, so I definitely agree with the importance of being able to locate your opponents based on sound. Sound also plays a big role in why Halo 4 is my favorite game in the franchise.
Obviously audio quality is limited on the Xbox 360, but I played both Halo 3 and Halo 4, and going from Halo 3 to 4 is like going from a 32 kbps MP3 to a 320 kbps MP3.

Article and a video on the importance of TrueAudio in the upcoming Thief.

http://hexus.net/gaming/news/pc/64809-importance-amd-trueaudio-thief-explained-eidos-montreal/
 
I like that they seem to be implementing the same audio solution for non-TrueAudio systems, just with greater overhead on the CPU. That's definitely the way forward IMO.
 
Cadence themselves confirm that the Xbox One uses 4 of their Tensilica microprocessors for audio.

http://www.prnewswire.com/news-rele...nsilica-processors-in-xbox-one-245201271.html

A question for bkilian, which has been bugging me as of late... If you used one or more of the Tensilica processors for 3D sound, the best reverb possible, etc., would the voice recognition capabilities of the console be compromised?
Depends. Reverb is high bandwidth, and I don't know the capacity of the bus that connects the Tensilica processors to the main memory. Originally, the idea was to run two voice pipelines simultaneously, one on each vector processor. That way the system could update its pipeline over time, and the game could have a known-good implementation that didn't require them to retest every time the OS updated. I have no idea if that solution was implemented or how they eventually divvied up the resources. If they did it that way, then yes, using the vector processors for other audio effects could compromise voice reco. If not, then they could possibly use an entire core for audio effects without changing reco, assuming the effects don't overload the memory bus.
 
If the core was designed principally for the purpose of voice recognition, I'd assume that the bus design was adequate for that job and no more. Overkill there would add cost to no advantage, unless the cost is tiny and the thought of alternative functions entertained for the future.
 
Depends. Reverb is high bandwidth, and I don't know the capacity of the bus that connects the Tensilica processors to the main memory.

Hmmm... what does 'high bandwidth' mean in MB/s to you? Is it also latency sensitive?

I am not very experienced when it comes to sound stuff...

and I don't know the capacity of the bus that connects the tensilica processors to the main memory

I highly doubt the MS Tensilicas are tied directly to the memory, anyway.
 
Hmmm... what does 'high bandwidth' mean in MB/s to you? Is it also latency sensitive?

I am not very experienced when it comes to sound stuff...

I highly doubt the MS Tensilicas are tied directly to the memory, anyway.
500-800 megabytes/second? It's been a long time, and this is remembering scrawlings on a whiteboard as a coworker and I were discussing it.

And the Audio core uses a proprietary bus connecting it to the main RAM.
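For a rough sense of scale, the 500-800 MB/s figure can be approached with back-of-envelope arithmetic. Every number below (block size, impulse-response length, voice count) is an assumption chosen purely for illustration, not a known property of the Xbox One pipeline:

```python
SAMPLE_RATE = 48_000      # Hz, typical game audio rate
BYTES_PER_SAMPLE = 4      # 32-bit float samples
BLOCK = 512               # samples processed per block (assumed)
IR_SECONDS = 3.0          # hall-length impulse response (assumed)
VOICES = 10               # hypothetical number of convolved voices

# A non-partitioned convolver touches the whole impulse response
# once per processing block, so memory traffic scales with IR length.
ir_bytes = int(SAMPLE_RATE * IR_SECONDS) * BYTES_PER_SAMPLE
blocks_per_second = SAMPLE_RATE / BLOCK
per_voice_mb_s = ir_bytes * blocks_per_second / 1e6
total_mb_s = per_voice_mb_s * VOICES

print(f"{per_voice_mb_s:.0f} MB/s per voice, {total_mb_s:.0f} MB/s total")
```

With these made-up parameters the result lands at 54 MB/s per voice and 540 MB/s for ten voices, i.e. squarely inside the quoted whiteboard range, which at least shows why reverb counts as "high bandwidth" relative to plain sample playback (under 0.2 MB/s per mono voice).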
 