Next Generation Hardware Speculation with a Technical Spin [post E3 2019, pre GDC 2020] [XBSX, PS5]

3D audio is going to require ray tracing, which is something the GPU will be good at. Why make a second ray-tracing device to do the job the GPU was already built to do?
You wouldn't. You'd use the RT hardware to trace rays, and audio hardware to process the audio.

Importantly...

[Chart: GPU convolution benchmark — voices processed vs. latency per CU reservation]


Using 8 CUs, 20% of a theoretical 36 CU PS5, which are reserved and unavailable for graphics or anything else, you can get 128 voices in 6 ms. 32 in 2 ms. With 4 CUs, a 10% reservation, you could probably do workable sound with that example convolution. That's a fair bit of GPU power taken up. DSP doesn't need all the features of a GPU compute engine, so a far more efficient processor can be used - a tiddly DSP is probably far more efficient at the task. You can't target DSPs in the PC space because they aren't universal, so it makes sense to use compute to power next-gen audio. In a console, the most efficient solution could be used to get a better feature:cost ratio.
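For a sense of scale on those voice counts, here's a back-of-envelope Python sketch of what uniformly partitioned FFT convolution costs per voice. The sample rate, impulse-response length, and block size are all assumptions for illustration, not PS5 figures:

```python
# Back-of-envelope cost of FFT-based convolution reverb, per voice.
# All numbers here are illustrative assumptions, not PS5 specs.
import math

SAMPLE_RATE = 48_000   # Hz
IR_SECONDS = 1.0       # assumed impulse-response length
BLOCK = 256            # samples per processing block

def fft_convolution_flops_per_second(ir_seconds, block, rate):
    """Rough FLOPs/s for uniformly partitioned FFT convolution of one voice."""
    ir_len = int(ir_seconds * rate)
    partitions = math.ceil(ir_len / block)
    fft_size = 2 * block
    # One forward FFT + one inverse FFT per block (~5*N*log2(N) real FLOPs each),
    # plus a complex multiply-accumulate per partition per bin (~8 FLOPs each).
    fft_cost = 2 * 5 * fft_size * math.log2(fft_size)
    mac_cost = partitions * (fft_size // 2) * 8
    blocks_per_second = rate / block
    return (fft_cost + mac_cost) * blocks_per_second

per_voice = fft_convolution_flops_per_second(IR_SECONDS, BLOCK, SAMPLE_RATE)
total_128 = 128 * per_voice
print(f"~{per_voice / 1e6:.0f} MFLOP/s per voice, ~{total_128 / 1e9:.1f} GFLOP/s for 128 voices")
```

With these assumptions it lands around 80 MFLOP/s per voice, roughly 10 GFLOP/s for 128 voices, which hints that raw FLOPs aren't the whole story; memory traffic and the hard real-time deadline are what eat into a CU reservation.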
 
You wouldn't. You'd use the RT hardware to trace rays, and audio hardware to process the audio. [...] DSP doesn't need all the features of a GPU compute engine, so a far more efficient processor can be used - a tiddly DSP is probably far more efficient at the task.

The reservation is flexible. A developer that wants to do cheap HRTF audio can do it on the CPU and use 100% of the GPU for graphics. A VR developer that doesn't care as much about graphics can use up to 20% of the GPU for a wide variety of audio processing. AMD is continually updating their SDK. They dropped DSPs. I don't know how they made that decision. I'm not up to date on current audio DSPs or how they'd compare in terms of flexibility and overall performance. 8 CUs on an RDNA GPU is a lot of math performance to compete with.
 
Using 8 CUs, 20% of a theoretical 36 CU PS5, which are reserved and unavailable for graphics or anything else, you can get 128 voices in 6 ms. [...] That's a fair bit of GPU power taken up.

It means most games would not use 3D audio if it means losing that many CUs for it.
 
The reservation is flexible. A developer that wants to do cheap HRTF audio can do it on the CPU and use 100% of the GPU for graphics. A VR developer that doesn't care as much about graphics can use up to 20% of the GPU for a wide variety of audio processing. [...]

Or non-VR dev teams will not use 3D audio, because audio competes with graphics for compute resources. When they decided to create the PS5, they asked devs from the rendering, engine, and audio sides what they wanted, and they did a postmortem of audio on PS4. Like I said, what Christophe Balestra told Quaz51 is true: when audio fights with graphics for resources, most of the team will choose graphics. VR is important for them, and they probably don't want audio to be a second-rate citizen again.
 
To me, Sony planning RT for next gen is indeed the most plausible explanation for MS's unusual behaviour here, even if it sounds far-fetched.

Cerny had interviews back in 2013, maybe 2012, stating RT was considered for PS4 but devs shut the idea down as they'd need to develop new tools and alter workflows too much from what they were doing. With RT being a consideration for PS4, MS would've been pretty sure of RT in PS5 for several years now. IIRC, Sony has RT patents dating far back as well. Not too sure of that one, though.

As for RT with NV/MS, wasn't the launch pretty bumpy and claimed by many to be rushed? Poor driver maturity at launch compared to typical NV drivers, no software support, and poor performance from anything outside the top-spec card.
 
The reservation is flexible. A developer that wants to do cheap hrtf audio can do it on cpu and use 100% of the gpu for graphics.
Yes, but that means leaving it to the devs, who won't bother. If Cerny wants audio to leap forwards, he can mandate that through the most efficient hardware.

I'm not saying a DSP is a must-have, but it's a solid addition as you get more bang-per-buck. Reduce the GPU 10%, add a tiny DSP, and get 3D audio as a standard feature and USP for your platform while saving a few bucks on having a larger GPU. It's the same argument over having RT acceleration over just doing it on compute - if the additional silicon is efficient enough, the lack of flexibility of it versus pure compute is offset in favour of its inclusion because the speed-up on a task pretty much every game will want will be worth it.
 
Cerny had interviews back in 2013, maybe 2012, stating RT was considered for PS4 but devs shut the idea down as they'd need to develop new tools and alter work flows too much from what they were doing.
Mark Cerny's 'The road to PS4' presentation at GameLabs 2013:

Mark Cerny: "I ended up talking to more than 30 teams in the US, Europe, and Japan, and I'm very glad I did, because the answers we got were not what I thought they would be. It turns out the number one piece of feedback was that they wanted a system with unified memory; that means just one pool of high-speed memory, not the two that are found on PC or on PlayStation 3. We also learned that the proper CPU count was four or eight, and that if we had the money to spend, we should invest it in a very powerful GPU. The final piece of feedback we received was that they didn't want exotic: if there was, for example, a GPU out there that could do real-time ray tracing, they did not want it in PlayStation 4. I mean, certainly that would be fascinating technology, but it would require game teams to take several years, throw out all their existing graphics technology, and rebuild it from scratch to use that exotic GPU."
 
Yes, but that means leaving it to the devs, who won't bother. If Cerny wants audio to leap forwards, he can mandate that through the most efficient hardware. [...] Reduce the GPU 10%, add a tiny DSP, and get 3D audio as a standard feature and USP for your platform while saving a few bucks on having a larger GPU.

Can you get the revolution in audio that you want with a tiny DSP? Is that assumption actually true? I'm genuinely interested in this topic, so I've been doing a lot of reading, and it sounds like the "horsepower" that Cerny mentioned being needed for 3D audio will be significant. Basically the way audio is being done in games right now is being completely replaced by things like ambisonics, and higher-order ambisonics and indirect audio will be incredibly expensive relative to how audio is done now. I'm still trying to figure out how Dolby Atmos for headphones compares to the ambisonic formats. It would be very hard for Sony to force devs to use a proprietary format, and ambisonics seems to be the growing standard for VR.
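Since ambisonics came up: the math at the heart of it is tiny. Here's a minimal first-order B-format encoder in Python, using the conventional 1/√2 weighting on the W channel. This is a generic sketch of the format, nothing Sony- or AMD-specific:

```python
import math

def encode_first_order_ambisonics(sample, azimuth, elevation):
    """Encode a mono sample into first-order B-format (W, X, Y, Z).

    Uses the traditional weighting where the omnidirectional W channel
    carries a 1/sqrt(2) factor. Angles are in radians.
    """
    w = sample * (1.0 / math.sqrt(2.0))
    x = sample * math.cos(azimuth) * math.cos(elevation)
    y = sample * math.sin(azimuth) * math.cos(elevation)
    z = sample * math.sin(elevation)
    return w, x, y, z

# A source dead ahead at ear level lands entirely in W and X.
w, x, y, z = encode_first_order_ambisonics(1.0, 0.0, 0.0)
```

The appeal for VR is that head rotation becomes a small matrix applied to (X, Y, Z) for the whole sound field rather than re-panning every source; the expensive parts are the HRTF-filtered binaural decode and higher-order variants with many more channels.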

So assuming the PS5 is 9.2 TF (just assuming, not looking to argue over the final specs), 4 CUs is about 1 TF (9.2 × 4 / 36 ≈ 1.02). How much efficiency gain can you expect from a DSP, such that it can be pretty small and still compete with 1 TF of GPU power?
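Spelling out that arithmetic (all figures are the assumptions from the post, not confirmed specs):

```python
# Share of GPU throughput taken by a CU reservation (all figures assumed).
TOTAL_TF = 9.2      # assumed total GPU throughput, TFLOPS
TOTAL_CUS = 36
RESERVED_CUS = 4

reserved_tf = TOTAL_TF * RESERVED_CUS / TOTAL_CUS
print(f"{RESERVED_CUS} CUs ≈ {reserved_tf:.2f} TFLOPS")   # ≈ 1.02 TFLOPS

# Sanity check: an RDNA CU does 64 lanes × 2 FLOPs (FMA) per clock,
# so 9.2 TF across 36 CUs implies a clock of about:
clock_ghz = TOTAL_TF * 1e12 / (TOTAL_CUS * 64 * 2) / 1e9
print(f"implied clock ≈ {clock_ghz:.2f} GHz")             # ≈ 2.00 GHz
```

The implied ~2 GHz clock is just what those two assumptions force, which is one way to sanity-check rumoured TF numbers against CU counts.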
 
Or non-VR dev teams will not use 3D audio, because audio competes with graphics for compute resources. [...] when audio fights with graphics for resources, most of the team will choose graphics.

The reality is they can't force devs to use 3D audio in a particular way. If I'm making a 2D side-scrolling game, do I even care about 3D audio? If it's a DSP some devs will let it sit there unused, especially if they're making a cross-platform game that has to run on lower-end PCs, or PS4. VR developers will use it, because it's integral to the experience. They won't skip it just because someone else didn't use it in another game.
 
Can you get the revolution in audio that you want with a tiny DSP?
I don't know exactly, but it'd be proportionally tiny versus the same power in CUs because CUs have a whole lot of silicon and features a DSP processor doesn't need. Just stands to reason.

How many TFs of compute is needed to do the work of the tiny HEVC decoder blocks in a mobile processor?
 
The reality is they can't force devs to use 3D audio in a particular way. If I'm making a 2D side-scrolling game, do I even care about 3D audio? [...]

Better said, if it's something native to PS5 (audio hardware that won't be found on other platforms), it will sit unused except for 1st-party exclusives, which means it will sit mostly unused.
 
Better said, if it's something native to PS5 (audio hardware that won't be found on other platforms), it will sit unused except for 1st-party exclusives, which means it will sit mostly unused.
All PSVR2 games will use it. Can 3D audio units be used to process 2D audio?
 
All PSVR2 games will use it.

How large of a market will PSVR2 be, looking at the PSVR install base and its available games?

Can 3D audio units be used to process 2D audio?

No idea, but I doubt it's worth it to go with something exotic (as in, no other platforms have that hardware), which means it won't see much use outside of 1st-party AAA devs, and even then it remains to be seen how much use it gets.
 
Better said, if it's something native to PS5 (audio hardware that won't be found on other platforms), it will sit unused except for 1st-party exclusives, which means it will sit mostly unused.
Unless the libraries target it. If games use WWise, which many do, then I'd expect those to map nicely to whatever hardware Sony provides - it'd be kinda stupid of Sony to buy WWise only for them not to target Sony's console audio solution. ;) Sony can also bundle WWise in with their SDK, so Unity and UE games can incorporate it.
 
I don't know exactly, but it'd be proportionally tiny versus the same power in CUs because CUs have a whole lot of silicon and features a DSP processor doesn't need. Just stands to reason.

How many TFs of compute is needed to do the work of the tiny HEVC decoder blocks in a mobile processor?

I'm not an expert on HEVC, but I would guess that there is a large efficiency gain to be had by making a dedicated HEVC decoder. The question is how well the problem maps to the hardware. In the domain of 3D audio, we're talking about physical simulation of audio in 3D space. What are those calculations, and is the GPU a good candidate? AMD seems to think so, otherwise they never would have bothered. They discontinued their DSP and switched to OpenCL on the GPU. Maybe they're selling snake oil. I don't know. I would hope not, and my assumption would be that whatever efficiency is lost is made moot by the overall computational ability. There's some factor where the efficiency drops to a point where it's worthwhile to make custom silicon. For dollars of research and fab, I don't know where that line is. As a counter to your HEVC point, where are all of the dedicated physics accelerators? CUs have a lot of silicon a physics accelerator wouldn't need, but they don't exist in the consumer space because, I'm assuming, you're better off just putting more money into your GPU.
 
Unless the libraries target it. If games use WWise, which many do, then I'd expect those to map nicely to whatever hardware Sony provides - it'd be kinda stupid of Sony to buy WWise only for them not to target Sony's console audio solution. ;) Sony can also bundle WWise in with their SDK, so Unity and UE games can incorporate it.

Exactly, they have the software solution. If the API uses the custom 3D audio DSP, it will be used.
 
In the domain of 3D audio, we're talking about physical simulation of audio in 3D space.
That's only part of it. You've lots of audio data to mix and process too, as well as the tracing.

Edit: As I understand it, the tracing records the audio energy. It should be possible to trace an audio impulse to record its energy, and then apply that to an audio processor to mix that audio appropriately.
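That two-stage split (trace to get an impulse response, then mix and filter with it) is easy to sketch. Assuming the traced energies have already been binned into per-sample gains, applying them is just a convolution, here via numpy FFTs:

```python
import numpy as np

def apply_impulse_response(dry, ir):
    """Convolve a dry mono signal with a (ray-traced) impulse response via FFT.

    'ir' would come from binning the arrival time and energy of each traced
    audio path; here it is just an array of per-sample gains.
    """
    n = len(dry) + len(ir) - 1
    nfft = 1 << (n - 1).bit_length()   # next power of two that fits the result
    wet = np.fft.irfft(np.fft.rfft(dry, nfft) * np.fft.rfft(ir, nfft), nfft)
    return wet[:n]

# Toy example: a direct path plus one delayed, attenuated reflection.
ir = np.zeros(64)
ir[0], ir[32] = 1.0, 0.5
dry = np.zeros(16)
dry[0] = 1.0   # unit impulse as the dry signal
wet = apply_impulse_response(dry, ir)
```

In a real game the IR changes as the listener and sources move, so the convolution has to be time-varying (crossfading between successive IRs), which is where the sustained DSP or compute load comes from.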

They discontinued their DSP and switched to OpenCL on the GPU. Maybe they're selling snake oil. I don't know.
More likely it added complexity and cost for a feature no-one was using and wasn't going to sell GPUs. Hence move it to compute where it's less efficient but doesn't limit who can use it. You may then get some devs actually using it. PC doesn't need to worry about efficiency as much as consoles do; you can always sell people a faster GPU when their current one isn't good enough for them.

There's some factor where the efficiency drops to a point where it's worth while to make custom silicon. For dollars of research and fab, I don't know where that line is.
Exactly. And none of us knows where that line is. However, it exists, which is why audio on GPU isn't the only sensible option here and why custom audio hardware can't be discounted as unrealistic or unnecessary.
As a counter to your HEVC point, where are all of the dedicated physics accelerators? CUs have a lot of silicon a physics accelerator wouldn't need, but they don't exist in the consumer space because I'm assuming you're better off just putting more money into your GPU.
You can't be sure PPUs are going to get used. Plenty of games have little need for complex physics. All games need audio so putting an audio processor in there, you know it's going to be used extensively. PPUs also aren't small AFAIK, unlike DSPs, having more in common with GPUs than DSPs have.

As MrFox states, 36 CUs looks to be about the size of PS4's SOC. If Sony wanted to stop there for economic reasons, yet Cerny wants next-gen audio, an obvious option would be adding DSP audio as a more economical solution.

Do you not think the acquisition of Audiokinetic in January 2019 was driven by some particular need? Do you think that's a fairly random investment without any specific intent? Taken all together, we've got:

Interview states custom 3D audio hardware (although that's not Cerny's words quoted).
Interview where Cerny says he wants next-gen audio with 3D positional audio and the gold standard will be headphones - he wants proper, effective HRTF audio.
A cost-limited hardware design that really needs optimal hardware economy.
Sony's acquisition of Audiokinetic nearly two years before PS5's release.

The idea of Sony having a custom audio block driven by an audio API fits that full overview better than Sony just using GPU audio, as GPU audio will take a notable amount away from graphics and wouldn't need Audiokinetic expertise to create a library.

Again, I'm not saying custom hardware is in, but the argument that it's out is weaker IMO than the argument that it's in, where some are quick to discount the possibility of custom hardware completely.

Edit: Cell should be ideal and tiny (few mm²) and provide PS3 BC. Just saying.
 
@Shifty Geezer I'm not discounting a custom audio processor either. I'm just trying to figure out what the most likely solution is. AMD already has one, and it's an end-to-end physical model that uses ray tracing and a library of audio effects on the GPU. I would guess it's an evolution of that process, because that's just the most likely outcome (evolution vs revolution). I'm sure the ray tracing will be done on the GPU, whether it's AMD's solution or a custom one from Sony. That leaves all of the processing for time-varying convolution to be done either on the GPU or on a custom processor.

Your assumption is that a DSP is going to be really small. So how many audio sources and how many time-varying convolution filters is it going to have to calculate per source to achieve next-gen or revolutionary audio? Are there any DSPs on the market that can do that and how big are they? Is it going to be the size of one CU? Two? Three? Four? Half? I have no way of answering that. Cerny said, "With the next console the dream is to show how dramatically different the audio experience can be when we apply significant amounts of hardware horsepower to it.” That doesn't sound small, but who knows.

"The AMD chip also includes a custom unit for 3D audio that Cerny thinks will redefine what sound can do in a videogame." It's described as a custom unit on the AMD chip. That could literally be anything from an entirely separate bespoke block, to modifications on the GPU front-end to further improve on TrueAudio Next. It's really hard to know what it is, but we can see a working solution, that AMD already provides, that has large computational resources for advanced 3D spatial audio like Steam Audio. I'm not aware of any games, besides VR games, that can do all of the things that Steam Audio is capable of.

Audiokinetic will definitely be responsible, in part, for developing a Wwise plugin for the PS5 SDK, whether it's a custom processor or the GPU doing some of the complex computation. They're a software company, not a hardware company.
 
That's only part of it. You've lots of audio data to mix and process too, as well as the tracing.

Edit: As I understand it, the tracing records the audio energy. It should be possible to trace an audio impulse to record its energy, and then apply that to an audio processor to mix that audio appropriately.

The tracing can do a bunch of stuff, from audio energy to diffraction. You can use it to model reverb and bypass convolution, or use it to measure transfer and bypass filtering. Full RT audio would be very much like full RT video in that it could do away with a lot of "shortcuts", at the cost of needing quite some grunt to do it. The good thing is that it requires WAY less grunt to pull off convincingly, because our sense of hearing is, honestly, pretty crap.
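As a toy version of "use tracing to model reverb": the classic image-source method mirrors the source across each wall of a shoebox room to get early-reflection delays and gains directly, no convolution needed. This only illustrates the idea; the room dimensions, absorption coefficient, and 1/r gain model below are all assumptions:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s

def first_order_reflections(room, src, lis, absorption=0.3):
    """First-order image-source reflections in a shoebox room.

    Returns one (delay_seconds, gain) pair per wall: mirror the source
    across each wall, then take the straight-line distance to the
    listener. A crude stand-in for what path tracing recovers in
    arbitrary scenes.
    """
    images = []
    for axis, size in enumerate(room):
        for wall in (0.0, size):
            img = list(src)
            img[axis] = 2 * wall - img[axis]   # mirror across the wall plane
            images.append(img)
    out = []
    for img in images:
        d = math.dist(img, lis)
        delay = d / SPEED_OF_SOUND
        gain = (1.0 - absorption) / max(d, 1e-6)  # 1/r spreading + absorption
        out.append((delay, gain))
    return sorted(out)

reflections = first_order_reflections((8.0, 6.0, 3.0), (2.0, 3.0, 1.5), (6.0, 3.0, 1.5))
```

Path tracing generalises this to arbitrary geometry and adds occlusion, diffraction, and scattering, which is exactly where the RT hardware would earn its keep.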

Edit: Cell should be ideal and tiny (few mm²) and provide PS3 BC. Just saying.

I, for one, support this motion.
 