Xbox One (Durango) Technical hardware investigation

Status
Not open for further replies.
SHAPE sounds impressive. For an audio engineer working on a game built to use its capabilities to the fullest, I'd imagine it'd be heaven. It's basically an entire mixing console on there. That said, it seems slightly overkill to me. I'd like to hear the results though :smile:

I guess you're unaware of what's inside a Soundblaster or Asus XONAR card?

For me, SHAPE is nothing more than a hardware mixer with hardware (polyphase) sample rate converters plus DMA to fetch the data. Most PCs do this stuff in software and only have a peripheral chip with analog mixer, speaker/headphone amps, ADCs, DACs / SPDIF outputs.

You would have to do the math, but as audio is relatively "slow" in terms of Fs, the computational load is not that high.
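For anyone who wants to actually do that math, here is a back-of-envelope sketch. The voice count and the operations-per-sample figure are made-up illustrative assumptions, not SHAPE specifications; the only number taken from the thread is the 48 kHz sample rate.

```python
def mix_ops_per_second(voices, sample_rate_hz=48_000, ops_per_sample=10):
    """Rough count of arithmetic operations per second needed to process
    `voices` mono streams.

    ops_per_sample is a hypothetical budget per voice per sample
    (e.g. SRC + filter + volume); it is an assumption, not a measured value.
    """
    return voices * sample_rate_hz * ops_per_sample


# Hypothetical scenario: 512 voices at 48 kHz with ~10 ops per sample each.
load = mix_ops_per_second(voices=512)
print(f"{load / 1e6:.1f} M ops/s")
```

Even with generous assumptions the total stays in the hundreds of millions of operations per second, which is the sense in which audio is "slow" next to pixel workloads.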
 
Do you have confirmation that's not just the devkits?
Because it cuts out a lot of "emerging" markets.
A converter unit would suffice for that, no? Stick it in the HDMI and branch off audio and component. That means you're not lumbered with DAC costs in every box when only a small percentage of your customers want it. It's what I'd do.
 
Do you have confirmation that's not just the devkits?
Because it cuts out a lot of "emerging" markets.

There are no "emerging markets" for $500 consoles.

By the time Durango is cheap enough to entertain those markets, the potential buyer pool will be equipped with an HDMI-enabled display as well.

The consumer can always get an adaptor if they want to preserve their dinosaur I suppose.
 
Maybe I'm repeating stuff here, but it looks like MS still has a clear separation between GPU memory space and CPU memory space, unlike what Sony (and AMD) did for the PS4 SoC.

What makes you say that? All those memory spaces (all two of them) can be uniquely addressable or mirrored.
 
It's really hard to find any kind of useful information on audio processors. I'm trying to find info on Creative's latest, and I get nothing useful. Pretty hard to get an understanding on this audio block for Durango when I have nothing to compare it to.
 
There are no "emerging markets" for $500 consoles.

By the time Durango is cheap enough to entertain those markets, the potential buyer pool will be equipped with an HDMI-enabled display as well.

The consumer can always get an adaptor if they want to preserve their dinosaur I suppose.
As I wrote in a different thread, it seems strange that MS would get rid of the other video outputs for the console's video signal, given that they have traditionally supported every format.

It was one of the keys to the success of the Xbox 360, imho. When the console came out, early adopters LOVED the fact that they could enjoy HD gaming on their PC displays using the VGA HD AV cable.

I was awestruck by the image quality.

Additionally, I wonder if pixel counting is going to be possible next generation, especially because if pixel counters used AA to measure the native output resolution, they should rely on other methods now that AA and resolution are going to be higher.

I think Quaz51 or grandmaster suggested that some time ago.
 
As I wrote in a different thread, it seems strange that MS would get rid of the other video outputs for the console's video signal, given that they have traditionally supported every format.

It was one of the keys to the success of the Xbox 360, imho. When the console came out, early adopters LOVED the fact that they could enjoy HD gaming on their PC displays using the VGA HD AV cable.

I was awestruck by the image quality.

Additionally, I wonder if pixel counting is going to be possible next generation, especially because if pixel counters used AA to measure the native output resolution, they should rely on other methods now that AA and resolution are going to be higher.

I think Quaz51 or grandmaster suggested that some time ago.

Just buy an HDMI -> DVI cable and you can still use a monitor, as long as you have something you can hook the optical audio up to, if they don't sell some kind of adapter to provide an analog audio output.
 
I don't think "content providers" (read: MAFIAA) would be too happy seeing MS selling an official box that could circumvent HDCP.

And yet Microsoft did exactly that with the X360. It's just that without full HDCP handshaking (which component video doesn't support), then HDCP protected content is limited to SD resolutions. It works like that on all devices.

So with an HDMI -> component video breakout cable, you'd likely get SD video (basically DVD resolution) for any BluRay playback. But that would be the case with a dedicated BluRay player anyway. However, I don't believe that the actual reduction is mandated, only that an analog signal cannot be "full resolution":

The HDCP standard is more restrictive than the FCC's Digital Output Protection Technology requirement. HDCP bans compliant products from converting HDCP-restricted content to full-resolution analog form, presumably in an attempt to reduce the size of the analog hole.

So in theory a 1080p HDMI protected feed "could" legally be displayed at 720p, but no hardware manufacturer does that, IIRC. I think the highest analog resolution for HDCP protected content is basically DVD resolution. Unless the HDCP protected content is already DVD resolution, in which case by those restrictions it would have to be displayed at a lower resolution. But I don't "think" any DVDs are HDCP protected.

This obviously wouldn't apply to games.

Regards,
SB
 
@XpiderMX

No.
That is from the memory example article.

Then, why is it different?

durango_memory.jpg


:oops:
 
One is showing the max theoretical throughput on each bus and the other is showing an example snapshot of how much each bus might be saturated at a particular time.
 
Shouldn't be. The highest frequency can't get higher just from mixing, and there's no point resolving the frequency at higher resolution than 48 kHz. Heck, most of the audio is mp3 anyway, so ultimate quality is hardly important. If the audio processing is performed in 32 bit floats at 48 kHz, that's as good as anyone'd need.
At 48 kHz, the only improvement you're going to get is going to be in bit depth. 24-bit integer (or 32-bit float, they're identical for audio purposes) translates to ~120 dB dynamic range. In other words, from completely silent to so loud that listening for more than a few minutes will permanently damage your ears. You can theoretically improve the audio by increasing bit depth even more, since with 120 dB you still can't reproduce a jet engine at 100', or a 12 gauge shotgun blast to the chest, or, heaven forbid, a nuclear explosion. But since you'd need rock-concert-sized speakers to even reproduce sounds that loud, it's probably overkill.
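As a rough cross-check on the bit-depth numbers, here is a sketch using the textbook formula for an ideal N-bit quantizer driven by a full-scale sine (about 6.02·N + 1.76 dB). That formula is my addition, not something from the post: it gives the theoretical ceiling, and the ~120 dB quoted above reads more like what real converters manage once analog noise is included, which is an assumption on my part.

```python
import math


def ideal_dynamic_range_db(bits):
    """Theoretical SNR of an ideal `bits`-bit quantizer for a full-scale sine.

    Equivalent to the usual 6.02 * bits + 1.76 dB rule of thumb.
    Real DACs/ADCs fall short of this because of analog noise.
    """
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


for bits in (16, 24):
    print(bits, "bits ->", round(ideal_dynamic_range_db(bits), 1), "dB")
```

Each extra bit buys roughly 6 dB, so going from 16 to 24 bits adds about 48 dB of theoretical headroom.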

I think there is. If the source material contains higher sample rate audio, HDMI can carry it and the audio receiver can accept and output it, having the transport be the limiting factor would be disappointing. Just talking about media here, though. Game audio would probably see little if any benefit from support for higher sample rates given the typical quality of game audio assets.
And your receiver is just going to throw a 24 kHz low-pass filter on the audio and output it that way, since allowing the ultrasonic frequencies through will a) cause your speakers to alias and reduce audio quality, and b) if they can be reproduced, damage your hearing.

I have a set of transducers able to reproduce audio > 38 kHz, and the only thing it does is drive my pets completely nuts when I use it. (Although I can do awesome phantom audio effects with it. I used to take it up to the third floor of the Xbox building and project the "I'm a Cylon" music from Battlestar Galactica at specific people on the ground floor. The resulting confusion was hilarious.)

Maybe I'm repeating stuff here, but it looks like MS still has a clear separation between GPU memory space and CPU memory space, unlike what Sony (and AMD) did for the PS4 SoC.

VGLeaks also posted a picture of SHAPE (scalable audio processor). Looks like nothing special to me. Just a HW mixer and hardware SRC (sample rate conversion) block with its own DMA. Apparently Durango maxes out at 48 kHz/24-bit PCM audio output (to e.g. HDMI).
Yeah... considering the best soundblaster can mix 128 channels with 4 effects, I'm very impressed </sarcasm>
Put it this way, the performance described in that supposedly leaked doc could not be replicated on the 360, even if all 3 cores were used purely for the audio.
 
At 48 kHz, the only improvement you're going to get is going to be in bit depth. 24-bit integer (or 32-bit float, they're identical for audio purposes) translates to ~120 dB dynamic range. In other words, from completely silent to so loud that listening for more than a few minutes will permanently damage your ears. You can theoretically improve the audio by increasing bit depth even more, since with 120 dB you still can't reproduce a jet engine at 100', or a 12 gauge shotgun blast to the chest, or, heaven forbid, a nuclear explosion. But since you'd need rock-concert-sized speakers to even reproduce sounds that loud, it's probably overkill.

Yeah, but 96 kHz is bigger ...

And your receiver is just going to throw a 24 kHz low-pass filter on the audio and output it that way, since allowing the ultrasonic frequencies through will a) cause your speakers to alias and reduce audio quality, and b) if they can be reproduced, damage your hearing.

I have a set of transducers able to reproduce audio > 38 kHz, and the only thing it does is drive my pets completely nuts when I use it. (Although I can do awesome phantom audio effects with it. I used to take it up to the third floor of the Xbox building and project the "I'm a Cylon" music from Battlestar Galactica at specific people on the ground floor. The resulting confusion was hilarious.)

Great anecdote. Wish I could have seen that.

Yeah... considering the best soundblaster can mix 128 channels with 4 effects, I'm very impressed </sarcasm>
Put it this way, the performance described in that supposedly leaked doc could not be replicated on the 360, even if all 3 cores were used purely for the audio.

It's kind of annoying that you're right there, with all the answers, but can't say anything.

Edit:
So on Xbox 360, there was hardware for decoding WMA files, but all mixing, filtering, conversion, scaling, and multispeaker encoding was done on the CPU. I'm guessing all of the noise cancellation for Kinect was done on the CPU as well. So, in theory, Durango will give us a massive increase in the number of sounds that can be processed with different "effects" applied concurrently, including all of the audio processing for Kinect and voice chat, without impacting the CPU much at all, since this audio block has its own DMA. The 360 was 48 kHz 16-bit. Wikipedia mentions 320 decompression channels and 256 audio channels, but doesn't mention what the audio channels are capable of, or if they are purely a virtual definition in the software API.

I'd be curious to know how many sounds/voices a high-end game like Battlefield 3 might have processed at a given time, and what sort of effects they might have been using besides the obvious positional changes to volume.
 
So on Xbox 360, there was hardware for decoding WMA files, but all mixing, filtering, conversion, scaling, and multispeaker encoding was done on the CPU. I'm guessing all of the noise cancellation for Kinect was done on the CPU as well. So, in theory, Durango will give us a massive increase in the number of sounds that can be processed with different "effects" applied concurrently, including all of the audio processing for Kinect and voice chat, without impacting the CPU much at all, since this audio block has its own DMA. The 360 was 48 kHz 16-bit. Wikipedia mentions 320 decompression channels and 256 audio channels, but doesn't mention what the audio channels are capable of, or if they are purely a virtual definition in the software API.
On the 360, there is hardware for decoding XMA files, which is a much simpler subset of WMA. XAudio2 allows decoding of xWMA files too, but that's CPU-side software only. The XMA decoder chip is rated at 320 channels, but in reality it generally maxes out lower than that. The 256 audio channels figure was calculated using a full core, I believe, and that's using a very simple linear interpolation SRC, and possibly a filter and volume per channel.

All audio on the 360, other than XMA decompression, is software and uses the main CPU. Party chat, including codecs and mixing, happens in the system reservation. Game Chat, Kinect MEC and voice recognition, and all game audio happen in the game process and use game resources, including memory and CPU. Game audio frequently uses an entire hardware thread, and I've seen games where it uses 3 hardware threads. Car racing games, in particular, can use upwards of a hundred voices on a single car.
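For readers unfamiliar with the "very simple linear interpolation SRC" mentioned above, here is a minimal sketch of the general technique. It is an illustration only, not the 360's or XAudio2's actual code; the function name and parameters are hypothetical.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Convert `samples` from src_rate to dst_rate by linear interpolation.

    Deliberately naive: no anti-aliasing filter is applied, so downsampling
    content with energy above the new Nyquist limit will alias.
    """
    if not samples:
        return []
    step = src_rate / dst_rate            # source samples advanced per output sample
    n_out = int((len(samples) - 1) / step) + 1
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step                    # fractional read position in the source
        idx = min(int(pos), last)
        frac = pos - idx
        nxt = samples[min(idx + 1, last)]
        # Blend the two neighbouring source samples by the fractional offset.
        out.append(samples[idx] * (1.0 - frac) + nxt * frac)
    return out


# Doubling the rate of a short ramp inserts the midpoints between samples.
print(resample_linear([0.0, 1.0, 2.0, 3.0], 1, 2))
```

The per-sample cost is two multiplies and an add, which is why it is cheap enough to run hundreds of voices on one core; a polyphase SRC costs far more per sample but filters out the aliasing this version lets through.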
 
On the 360, there is hardware for decoding XMA files, which is a much simpler subset of WMA. XAudio2 allows decoding of xWMA files too, but that's CPU-side software only. The XMA decoder chip is rated at 320 channels, but in reality it generally maxes out lower than that. The 256 audio channels figure was calculated using a full core, I believe, and that's using a very simple linear interpolation SRC, and possibly a filter and volume per channel.

All audio on the 360, other than XMA decompression, is software and uses the main CPU. Party chat, including codecs and mixing, happens in the system reservation. Game Chat, Kinect MEC and voice recognition, and all game audio happen in the game process and use game resources, including memory and CPU. Game audio frequently uses an entire hardware thread, and I've seen games where it uses 3 hardware threads. Car racing games, in particular, can use upwards of a hundred voices on a single car.

Thanks for the reply. I wouldn't have thought games would use anywhere near 3 of the 6 hardware threads on the 360 CPU, or am I misinterpreting what you've written? 3 Cores each with 2 hardware threads for Xenon.


Edit:
I do find this audio block rumour interesting considering the following quote:

When Xbox 360 was in its early design phases, we knew that audio processing (except for data compression) was moving to an all-software model. Drastic increases in processing power, combined with the flexibility and ease of simply writing C code to do any arbitrary signal processing, made that clear.

http://www.gamasutra.com/view/feature/131931/sponsored_feature_an_introduction_.php?page=2

Obviously the thinking must have changed as sound processing became more intensive. I do wonder about the part about flexibility from writing C code to do signal processing. This audio block is rumoured to use the same XAudio2 API, so that flexibility must still be there. I just wonder how much can be done on the audio block vs the CPU.
 
I have a set of transducers able to reproduce audio > 38 kHz, and the only thing it does is drive my pets completely nuts when I use it. (Although I can do awesome phantom audio effects with it. I used to take it up to the third floor of the Xbox building and project the "I'm a Cylon" music from Battlestar Galactica at specific people on the ground floor. The resulting confusion was hilarious.)

That is hilarious. I reallllllllllly wish I coulda seen that.
 
The best part of any job: screwing around.
Absolutely. Screwing around... for Science!
There was some research into using parametric speakers to deliver a stereo mix directly to your ears without you wearing headphones, driven by Kinect head sensing. It could deliver a different audio mix to each player.

Mine uses 40 kHz transducers and then modulates the normal waveform onto the ultrasonic one. When done properly, the sound just seems to arrive in your head without having come through the air.
A video in which an advertising company takes advantage of this effect. They have their beam set pretty wide. Mine I have set to about 3 feet wide at 30 feet. With some more DSP playing around, I could actually have it appear in a bubble by using interference from two beams.
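The "modulates the normal waveform onto the ultrasonic one" step can be sketched as plain amplitude modulation. This is an assumption about the general idea, not a description of the poster's device: real parametric speakers, as far as I know, usually preprocess the audio before modulating. The function name, modulation depth, and sample rate below are all hypothetical.

```python
import math


def am_modulate(audio, sample_rate_hz, carrier_hz=40_000.0, depth=0.5):
    """Amplitude-modulate `audio` (samples in [-1, 1]) onto a sine carrier.

    The sample rate must exceed twice the carrier frequency, otherwise the
    carrier itself cannot be represented without aliasing.
    """
    if sample_rate_hz <= 2 * carrier_hz:
        raise ValueError("sample rate must be more than twice the carrier frequency")
    out = []
    for n, x in enumerate(audio):
        carrier = math.sin(2 * math.pi * carrier_hz * n / sample_rate_hz)
        # Classic AM: the audio signal scales the carrier's envelope.
        out.append((1.0 + depth * x) * carrier)
    return out


# Hypothetical usage: a 1 kHz tone, 192 kHz sample rate, 40 kHz carrier.
rate = 192_000
tone = [math.sin(2 * math.pi * 1_000 * n / rate) for n in range(rate // 100)]
beam = am_modulate(tone, rate)
```

Note the sample-rate check: a 40 kHz carrier cannot be generated on a 48 kHz output path, so a rig like this needs its own higher-rate converter.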
 