NextGen Audio: Tempest Engine, Project Acoustics, Windows Sonic, Dolby Atmos, DTS:X

So... it will ignore the less relevant noise. But at least noise will be realistic... I can hear rain,
Rain is a good one, because it's not one sound but thousands. It'll have to be approximated somehow so that it can be all around you when you're out in it, but a collective rain sound when you're inside listening through a window. Perhaps it'd be approximated with 1 metre square rain effects placed in space at 1 metre intervals, and the totality would give a sense of the whole environmental sound? You could have a broader environmental noise beyond that, and some closer sounds for rain on shoulders and hat.
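A rough sketch of that grid idea, purely as an illustration: the function name, the 5 m radius and the 2D layout are all assumptions, not anything from Tempest or any shipping engine. It places one hypothetical rain emitter at the centre of each 1 metre cell within earshot of the listener, snapped to a world-space grid so the emitters don't slide around as the listener moves.

```python
import math

def rain_emitters(listener, radius_m=5.0, cell_m=1.0):
    """Return centres of the rain cells within radius_m of the listener (x, y).

    Hypothetical helper: one small positioned rain loop would play at each centre.
    """
    lx, ly = listener
    n = int(math.ceil(radius_m / cell_m))
    # Snap to the world grid so cells stay put while the listener walks.
    cx0 = math.floor(lx / cell_m)
    cy0 = math.floor(ly / cell_m)
    emitters = []
    for i in range(cx0 - n, cx0 + n + 1):
        for j in range(cy0 - n, cy0 + n + 1):
            cx = (i + 0.5) * cell_m
            cy = (j + 0.5) * cell_m
            if math.hypot(cx - lx, cy - ly) <= radius_m:
                emitters.append((cx, cy))
    return emitters

nearby = rain_emitters((0.0, 0.0), radius_m=1.0)
print(len(nearby), "rain emitters around the listener")
```

Anything beyond the radius would fall back to the broader non-positional rain bed, and the indoor case could collapse the whole set into a single source at the window.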
 
It's worth bearing the context in mind: this talk was aimed at devs, who care about what their tools can do, not what the underlying technology is capable of doing. Dolby has a limit of 32 objects and Tempest does not.

Audio is not my jam but I appreciate good sound work in games. 32 objects does not sound like a lot of objects. That said, some games do wonders with their sound with presumably fewer than even 32 objects. For example, put on a set of headphones and just walk around Los Santos in GTA V and sound-wise there is a lot going on. You can hear people walking, people having conversations with other peds or on their phones, noise from cars in traffic, distant crashes, sirens etc. I'm sure RDR2 is even better but there are fewer distinctive sources of sound to pick out in the background cacophony.
Agreed. It's on Dolby to change it, if the push is to have hundreds of object sounds for dynamic objects.
 
About Atmos, the real question is what is defined in the spec and what kind of implementation exists. Is Atmos even relevant for gaming when considering more than 32 channels? What game uses more than 32 channels? Everything is possible, new specs and implementations can be made, but what really exists today? In Cerny's context, was there any realistic way for Sony to test anything beyond 32 channels with what was available when Sony was evaluating technologies and making decisions (a couple of years ago?)?
 
Rain is a good one, because it's not one sound but thousands. It'll have to be approximated somehow so that it can be all around you when you're out in it, but a collective rain sound when you're inside listening through a window. Perhaps it'd be approximated with 1 metre square rain effects placed in space at 1 metre intervals, and the totality would give a sense of the whole environmental sound? You could have a broader environmental noise beyond that, and some closer sounds for rain on shoulders and hat.

Sound LODing basically.
 
About Atmos, the real question is what is defined in the spec and what kind of implementation exists. Is Atmos even relevant for gaming when considering more than 32 channels? What game uses more than 32 channels? Everything is possible, new specs and implementations can be made, but what really exists today? In Cerny's context, was there any realistic way for Sony to test anything beyond 32 channels with what was available when Sony was evaluating technologies and making decisions (a couple of years ago?)?
That’s where both Dolby and DTS are headed: the gaming market. So it’s highly relevant if they want to continue to make licensing revenue.

Whether we have the hardware to encode 300+ dynamic sounds into Dolby for output, or what type of load that would result in, is another question.
 
Whether we have the hardware to encode 300+ dynamic sounds into Dolby for output, or what type of load that would result in, is another question.

This is the part that's more relevant to me, along with whether devs will implement it. This is where I feel first party is so important: to introduce people to new things and raise the bar, especially during the launch window.
 
This is the part that's more relevant to me, along with whether devs will implement it. This is where I feel first party is so important: to introduce people to new things and raise the bar, especially during the launch window.
I'm not even sure if Tempest can do it. Honestly speaking.
With graphics we cull nearly everything that we don't see.

With audio, it's ever present whether you are looking at it or not. And the sounds need to be working with the animations etc. So if a blacksmith is hammering away at a sword when you aren't looking at it, the animations and sounds for that NPC still need to be running and be aligned, whether or not you're rendering it.

This is going to be a massive load on the CPU to try to make really complex soundscapes.
 
Sound LODing basically.
Indeed. However, I wonder if the workload is one reason why devs haven't bothered? Contrast a binaural recording of a city ambience versus modelling the city audio. It's akin to the difference between raytracing reflections and using a reflection map, with artists creating loads of different reflection maps to blend between.

The more I think about it, the more I think the audio ambience is going to be defined not by tech but by production difficulties.

With audio, it's ever present whether you are looking at it or not. And the sounds need to be working with the animations etc. So if a blacksmith is hammering away at a sword when you aren't looking at it, the animations and sounds for that NPC still need to be running and be aligned, whether or not you're rendering it.
As Milk says, use LOD. When there's no visibility of the blacksmith, just run the audio at intervals. Isn't it done that way already? In RDR, surely you can hear a sample of a blacksmith even if you can't see them.
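To make the LOD idea concrete, here is a minimal sketch. The tier names and the 20 m / 60 m thresholds are made up for illustration and would be tuned per game; nothing here reflects how RDR or any specific engine actually does it.

```python
def audio_lod(distance_m, visible):
    """Pick a hypothetical audio detail tier for one sound source."""
    if distance_m > 60.0:
        # Far away: fold into a pre-mixed, non-positional background loop.
        return "ambient_bed"
    if not visible or distance_m > 20.0:
        # Out of sight or mid-range: positioned sample retriggered at
        # intervals, with no animation running behind it.
        return "looped_sample"
    # Close and on screen: every hammer strike driven by the animation event.
    return "animation_synced"

for d, seen in [(5.0, True), (5.0, False), (100.0, True)]:
    print(d, seen, audio_lod(d, seen))
```

The blacksmith only needs the expensive animation-synced tier while the player can actually check the sound against the hammer.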
 
In Cerny's context, was there any realistic way for Sony to test anything beyond 32 channels with what was available when Sony was evaluating technologies and making decisions (a couple of years ago?)?

I don't see why Cerny couldn't, considering Microsoft's own format supports 128 sounds on the PC: 112 dynamic and 16 static.
 
I imagine it's simply about ability to represent a realistic environment. Imagine trying to get a fix on where an opponent is but you have 300 other bushes and fences and buildings around you. There's nothing about "300 sounds" that denotes volume. Outside, you could have a stream, some crickets, some birds, as well as people moving, gunfire, etc. Sat here now, I doubt I could list 30 concurrent sounds I can hear. But take it into the city, something like Watch Dogs, and you'll have potentially hundreds of individual sounds making up the ambient soundscape. Instead of a 'background sample' playing that doesn't have 3D audio positioning, you can have the sound of a street vendor frying, some people talking, cars in the distance, planes overhead, layered and layered. It wouldn't be confusing any more than real life is. However, the production cost of creating that soundscape will be huge! So much audio data needed. Although very repetitious, so easily covered by libraries for anything real-world related.
But will my brain process it correctly like it does in real life?

If I walk up to an NPC to get a quest in, say, GTA 6, will I be able to hear him over the cars behind me, the helicopter in the air and all the police coming after me, each making noise?
Will the game abruptly cancel all noise but the NPC to deal with it?

I like the idea of more sounds, but how many is going overboard and how many are just enough?
 
But will my brain process it correctly like it does in real life?
Yes.

If I walk up to an NPC to get a quest in, say, GTA 6, will I be able to hear him over the cars behind me, the helicopter in the air and all the police coming after me, each making noise?
Will the game abruptly cancel all noise but the NPC to deal with it?
It won't need to. Sounds in the distance will be attenuated by distance, and through 3D audio you'll be able to focus on audio from a specific direction using your own senses, especially when up close and in front. But even then, it'll open up gameplay to standing close and over-hearing, where you have to listen to sounds and you could have environmental audio obscuring it, requiring you to try repositioning without your target noticing, or just listening harder.
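For a sense of scale on the distance attenuation, here is a small sketch using the plain inverse-distance law for a point source in open air. That is an assumption for illustration: real games usually author their own rolloff curves, and the function names here are hypothetical.

```python
import math

def distance_gain(distance_m, ref_m=1.0):
    """Linear gain under the inverse-distance law, clamped inside ref_m."""
    return ref_m / max(distance_m, ref_m)

def gain_db(gain):
    """Convert a linear amplitude gain to decibels."""
    return 20.0 * math.log10(gain)

for name, d in [("NPC", 1.0), ("car", 10.0), ("helicopter", 100.0)]:
    print(f"{name:10s} at {d:5.0f} m: {gain_db(distance_gain(d)):6.1f} dB")
```

Under this model each doubling of distance costs about 6 dB, so for equal source loudness a helicopter 100 m away sits roughly 40 dB below an NPC talking at 1 m. A helicopter is of course much louder at the source, which is where mixing choices still come in.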
 
I think it would be just fine if say every bullet is a sound source. And then it can rack up. Or a flock of birds.

And sure my mind won’t track all of them but if things are realistic enough it won’t be overwhelmed either.

Drop a box of metal balls behind me on a marble floor and let me hear all of them as they roll around in a realistic environment ... I won’t hear all of them but I may notice one of them rolled under a cupboard etc especially if we also get some occlusion and so on.

Should be really cool and will help with the lack of spatial information from that boring flat 4k oled screen ;) (yes I’m a VR fan ;))
 
With audio, it's ever present whether you are looking at it or not. And the sounds need to be working with the animations etc. So if a blacksmith is hammering away at a sword when you aren't looking at it, the animations and sounds for that NPC still need to be running and be aligned, whether or not you're rendering it.

This is going to be a massive load on the CPU to try to make really complex soundscapes.

Isn’t this happening already? Even though games use frustum culling to render only what is in front of you, nearby enemies and NPCs are accounted for at all times in a 360-degree area.
 
But will my brain process it correctly like it does in real life?

If I walk up to an NPC to get a quest in, say, GTA 6, will I be able to hear him over the cars behind me, the helicopter in the air and all the police coming after me, each making noise?
Will the game abruptly cancel all noise but the NPC to deal with it?

I like the idea of more sounds, but how many is going overboard and how many are just enough?

Let’s not confuse sound sources with more sounds. In a game you can have a rain sound, a thunder sound, a birds sound, all sounds. Just not placed in 3D. Rain is just a capture of the sound all the drops make, thunder will just resonate without location, same with birds.
We are not adding more sounds if we locate the origin of 100 raindrops around us, if we locate the thunder sound, or the birds sound. We are just creating a source for the sound, a source our ears can pinpoint.
Nothing changes except that. Atmos has no more sounds than mono. It just adds the sound location.
 
I think the issue is creating convincing ambient sound that matches the environment and layering those hero sounds on top of it. Creating convincing and accurate ambience can be quite a heavy operation. I'm all for better audio even if not everyone can appreciate it. I especially love this in the context of VR.

For Atmos I think Cerny's argument is taken out of context. What was available to really test and what is doable if the specification/implementation is improved are two different things.

Probably what undid Atmos was not technical feasibility but money, combined with the fact that Sony likely would have had to pay for the work done to extend the Atmos spec/implementation, plus licence fees on top of that. When you go that far it's not difficult to think: what if I just rolled my own instead of trying to use something that someone else owns/controls and which does not fully serve my needs?
 
As Milk says, use LOD. When there's no visibility of the blacksmith, just run the audio at intervals. Isn't it done that way already? In RDR, surely you can hear a sample of a blacksmith even if you can't see them.
Yea but not hundreds of sources. He’s a single person.
Yes LOD will be required, we have LOD today for games when it comes to sound already. But once again we are talking about complexity and fidelity and then multiplied by the scale.

Isn’t this happening already? Even though games use frustum culling to render only what is in front of you, nearby enemies and NPCs are accounted for at all times in a 360-degree area.
Once again: hundreds of sources (which is the marketing bullet point) vs 32.
Then we are talking about sound reflections and refractions x hundreds of sources.
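For illustration, this is roughly what living under a 32-object cap looks like: rank every candidate source by an estimated loudness at the listener and keep only the top 32. It is a simplified sketch with a hypothetical function name and a crude inverse-distance loudness estimate, not how any particular middleware or the Atmos encoder does it.

```python
import heapq
import math

def select_voices(sources, listener, max_voices=32):
    """Keep the max_voices sources estimated loudest at the listener.

    sources: list of (name, (x, y, z), base_gain) tuples (hypothetical layout).
    """
    def audibility(src):
        _, pos, base_gain = src
        d = math.dist(pos, listener)
        return base_gain / max(d, 1.0)  # crude inverse-distance estimate
    return heapq.nlargest(max_voices, sources, key=audibility)

# 300 equally loud sources strung out along the x axis, 1 m apart.
sources = [(f"src{i}", (float(i + 1), 0.0, 0.0), 1.0) for i in range(300)]
kept = select_voices(sources, listener=(0.0, 0.0, 0.0))
print(len(kept), "of", len(sources), "sources get a positioned voice")
```

The ranking itself is cheap. The cost being argued about is what happens per kept voice afterwards (occlusion, reflections, HRTF), which scales with however many sources survive the cut.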

I don’t want to be disingenuous here; the closer we get to real life, the more the required processing power increases, likely exponentially. We see this with ray tracing.
We will see this happen with wave tracing.
 
Yea; there’s nothing wrong with having the option. If someone wanted to make a cutscene with hundreds of leaves rustling in the wind; yea that’s a pretty sounding thing.
Yes.

It won't need to. Sounds in the distance will be attenuated by distance, and through 3D audio you'll be able to focus on audio from a specific direction using your own senses, especially when up close and in front. But even then, it'll open up gameplay to standing close and over-hearing, where you have to listen to sounds and you could have environmental audio obscuring it, requiring you to try repositioning without your target noticing, or just listening harder.

There is one other issue in all this, and that’s headphone technology. The quality of the headphones being used will make a large difference. It’s not going to sound like real life. It’s going to sound like positional audio in a soundstage limited by your headphones. The size of the drivers matters, the position of the drivers matters, the space within the ear cups matters etc.
 
There is one other issue in all this, and that’s headphone technology. The quality of the headphones being used will make a large difference. It’s not going to sound like real life. It’s going to sound like positional audio in a soundstage limited by your headphones. The size of the drivers matters, the position of the drivers matters, the space within the ear cups matters etc.

That's where the HRTF stuff Sony is working on becomes super important. As for the hardware quality, there is only so much Sony can do. Most likely this opens up a good chance for reputable outlets to do product reviews and recommendations. I know I would appreciate knowing which headphones are best at which price point. And this is doubly important for VR games.
 
That's where the HRTF stuff Sony is working on becomes super important. As for the hardware quality, there is only so much Sony can do. Most likely this opens up a good chance for reputable outlets to do product reviews and recommendations. I know I would appreciate knowing which headphones are best at which price point. And this is doubly important for VR games.


Rtings does great headphone reviews. They have a number of different soundstage measurements, but you’ll find good soundstage with good audio quality is very expensive.
 