DirectX 12: The future of it within the console gaming space (specifically the XB1)

For what it's worth, I think DX12 will be really good for PC gaming. With my Phenom II X4 945 I've been single-thread bottlenecked since... well, since I bought it^^ Not all games, mind you, but a lot. And there were a lot of weird cases as well, like Assassin's Creed 2 or Brotherhood, where no CPU core was fully loaded, nor was the GPU, yet the game ran between 30 and 50 Hz all the time, depending on where I was (I guess it's similar to how Unity ran on consoles, where performance could tank at any second for no particularly obvious reason).

But I still have a hard time actually believing it'll bring a lot to consoles, unless MS "ported" a lot of the performance hindrances of older DX versions over to the console in the early SDKs and is only now relieving developers of that pain. And seemingly Sony did something similar with the PS4, as real-world performance is more or less in line with the 12 vs. 18 CU difference.

I mean, sure we'll see improvements on X1. It's still young. Same goes for PS4. But is that down to DX12 or just regular improvements to the SDK? Not that it really matters...

One thing's for sure, DX12 will be great for PC gaming! As well as glNext!
 
For what it's worth, I think DX12 will be really good for PC gaming. With my Phenom II X4 945 I've been single-thread bottlenecked since... well, since I bought it^^
ahh! We're brothers! Yes, my performance is also entirely held back by the Phenom! Did you try unlocking your additional 2 cores? That could help somewhat, but DX12 should hopefully allow this processor to do better. I might be able to drag out my aging PC a little longer.

As for MS' initial SDKs, unfortunately developers were given DirectX 11 plus Xbox-specific functions. It wasn't until the March after launch that a faster, lower-level variant of DX11 was released, one that likely addresses the amount of overhead involved. DX12 should bring additional improvements, but most of the benefit could come from the freedom it gives developers to shift bottlenecks around. As many have written, it should actually bring the Xbox One in line with the performance profile of Sony's GNM.
 
The Big Interview: AMD’s Robert Hallock On Mantle, DirectX 12, PS4/Xbox One, Free-Sync And More

http://gamingbolt.com/the-big-inter...ox-one-free-sync-and-more#oRExOJYHrjMMF87e.99
Unfortunately this interview had practically zero new information; he mostly repeated things already known. The AMD vs. Intel APU power consumption comparison was completely bogus: they benchmarked with Star Swarm, and the AMD result used Mantle while the Intel result used DX11. A DX12 version of Star Swarm is available, and we all know that Intel/NVIDIA get huge gains from the new version. He also completely avoided the CPU question by saying that AMD + Mantle beats Intel + DirectX in Thief at some particular settings. We all know this. I would have liked to hear something new about the forthcoming AMD CPUs and/or GPUs, or at least something concrete about the future of Mantle.
 
where only 20% of the GPU full power was in use

I would argue that in a standard modern game at 1080p we have even worse utilization (on a GTX 980, for example).
But using the "full power" of the GPU would mean shifting the paradigm from "let's use 9K and 9000x MSAA" to "let's use 1080p, very modest AA, and much better lighting". Sadly, that is currently not mainstream (mostly because of the DX9-era problem where CPU bottlenecks stopped you from using your new GPU for amazing stuff, so instead you relied on ridiculous resolutions and AA just to give it any work at all).
 
Here is another tweet from the Stardock CEO - Andrew gets an article linked in the conversation.

Brad's tweet-
Did a test of DirectX 11 vs. DirectX 12 on an unreleased GPU with an 8core CPU. DX11: 13fps, DX12: 120fps. Lighting and lens effects.

Edit: Now he replied himself, LOL
 
Maybe someone can correct me on this, but from my limited understanding, earlier DX versions had some sort of software limitation on the number of draw calls that could be made at any time, particularly on PC. If PCs are traditionally limited to drawing many objects as instances of one item, then it's likely that a demo like Star Swarm is going to show huge increases in framerates for that platform, since I doubt you can have a single instance of thousands of ships all moving independently of one another. So in the case of the Star Swarm demo in DX12, that limitation has essentially been removed, leaving the GPU to concentrate on applying its resources to graphics.

The demo currently doesn't have a GPU bottleneck for a lot of the rendering, since once DX12 is applied the software bottleneck has been removed. It's not applying any kind of miracle update to GPUs, it's simply removing a bottleneck that shouldn't be there in the first place. I'd also argue that Star Swarm has been created precisely to demonstrate a draw call limitation - the GPUs are clearly able to render the graphics, since it's not exactly a stunningly beautiful benchmark. So the increases of between 300-800% are not going to apply to all games, only to a theoretical game with a VERY large number of draw calls (in excess of maybe 4,000). Maybe something like Dying Light might see a nice increase when the viewing-distance slider is pulled right up.
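To make the instancing point concrete, here is a minimal, purely illustrative D3D11 fragment (the Object struct and buffers are hypothetical, not from any real engine). Thousands of unique objects mean thousands of DrawIndexed calls, each paying API/driver overhead, whereas DrawIndexedInstanced only collapses copies of the same mesh into one call - which is exactly why it doesn't help a scene of thousands of different, independently moving ships.

```cpp
#include <d3d11.h>
#include <vector>

// Hypothetical per-object data; the names are illustrative only.
struct Object {
    ID3D11Buffer* vertexBuffer;
    ID3D11Buffer* perObjectConstants;
    UINT          indexCount;
};

// One draw call per unique object: the CPU pays the API/driver cost N times.
void DrawNaive(ID3D11DeviceContext* ctx, const std::vector<Object>& objects,
               UINT stride, UINT offset)
{
    for (const Object& obj : objects) {
        ctx->IASetVertexBuffers(0, 1, &obj.vertexBuffer, &stride, &offset);
        ctx->VSSetConstantBuffers(0, 1, &obj.perObjectConstants);
        ctx->DrawIndexed(obj.indexCount, 0, 0);   // thousands of these per frame
    }
}

// Instancing: one call draws N copies of the *same* mesh almost as cheaply as
// one copy, which is why it works for crates and trees but not for unique ships.
void DrawCrates(ID3D11DeviceContext* ctx, UINT indexCountPerInstance,
                UINT instanceCount)
{
    ctx->DrawIndexedInstanced(indexCountPerInstance, instanceCount, 0, 0, 0);
}
```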

From what I understand, consoles don’t have these same limitations and have always been able to make more draw calls when compared to a PC that’s restricted by DX. So the actual benefit to Xbox One will either be negligible or non-existent. Assuming the Xbox One doesn’t have the same issue as a PC on its adapted DX version. And besides I don’t imagine any game out so far has a draw call bottleneck.

So can we really use these PC benchmarks of a specifically draw call limited benchmark, as providing any kind of benefit to consoles? I’m not sure that we can.

I found the following article from 2011 very interesting on the subject:

http://www.bit-tech.net/hardware/graphics/2011/03/16/farewell-to-directx/2

So what sort of performance-overhead are we talking about here? Is DirectX really that big a barrier to high-speed PC gaming? This, of course, depends on the nature of the game you're developing.
'It can vary from almost nothing at all to a huge overhead,' says Huddy. 'If you're just rendering a screen full of pixels which are not terribly complicated, then typically a PC will do just as good a job as a console. These days we have so much horsepower on PCs that on high-resolutions you see some pretty extraordinary-looking PC games, but one of the things that you don't see in PC gaming inside the software architecture is the kind of stuff that we see on consoles all the time.

On consoles, you can draw maybe 10,000 or 20,000 chunks of geometry in a frame, and you can do that at 30-60fps. On a PC, you can't typically draw more than 2-3,000 without getting into trouble with performance, and that's quite surprising - the PC can actually show you only a tenth of the performance if you need a separate batch for each draw call.

DirectX supports instancing, meaning that several trees can be drawn as easily as a single tree. However, Huddy says this still isn't enough to compete with the number of draw calls possible on consoles.
Now the PC software architecture – DirectX – has been kind of bent into shape to try to accommodate more and more of the batch calls in a sneaky kind of way. There are the multi-threaded display lists, which come up in DirectX 11 – that helps, but unsurprisingly it only gives you a factor of two at the very best, from what we've seen. And we also support instancing, which means that if you're going to draw a crate, you can actually draw ten crates just as fast as far as DirectX is concerned.

But it's still very hard to throw tremendous variety into a PC game. If you want each of your draw calls to be a bit different, then you can't get over about 2-3,000 draw calls typically - and certainly a maximum amount of 5,000. Games developers definitely have a need for that. Console games often use 10-20,000 draw calls per frame, and that's an easier way to let the artist's vision shine through.'

Of course, the ability to program direct-to-metal (directly to the hardware, rather than going through a standardised software API) is a no-brainer when it comes to consoles, particularly when they're nearing the end of their lifespan. When a console is first launched, you'll want an API so that you can develop good-looking and stable games quickly, but it makes sense to go direct-to-metal towards the end of the console's life, when you're looking to squeeze out as much performance as possible.
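For reference, the "multi-threaded display lists" Huddy mentions are DX11's deferred contexts. A hedged sketch of how they are used (device creation, the actual recorded draws and error handling are all omitted):

```cpp
#include <d3d11.h>

// Each worker thread records into its own deferred context; playback still
// funnels through the single immediate context, where the driver re-validates
// state - one reason the gain is "a factor of two at the very best".
void RecordAndSubmit(ID3D11Device* device, ID3D11DeviceContext* immediate)
{
    ID3D11DeviceContext* deferred = nullptr;
    device->CreateDeferredContext(0, &deferred);         // one per worker thread

    // ...worker thread records its share of the frame's draw calls here...

    ID3D11CommandList* commandList = nullptr;
    deferred->FinishCommandList(FALSE, &commandList);     // close the recording

    immediate->ExecuteCommandList(commandList, FALSE);    // single-threaded playback

    commandList->Release();
    deferred->Release();
}
```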
 
From what I understand, consoles don’t have these same limitations and have always been able to make more draw calls when compared to a PC that’s restricted by DX. So the actual benefit to Xbox One will either be negligible or non-existent. Assuming the Xbox One doesn’t have the same issue as a PC on its adapted DX version.
That may be the case.
And besides I don’t imagine any game out so far has a draw call bottleneck.
You wouldn't know, because every game would be created around such a bottleneck. And for cross-platform titles, it's unlikely the devs will use 10k drawcalls on consoles (nearer 100k seems likely on PS4 given low-level API and Mantle performance) where they are limited to 3000 on the same game on PC. Instead, they'll design the whole game around 3000 calls (batching) and use that same method on consoles.

Hence enabling more calls on PC could result in games with a different design emphasis and visual changes all round. Or, it'll just simplify devs' lives and everything will stay as it is without devs having to fiddle about with graphic optimisations. ;)
 
From what I understand, consoles don’t have these same limitations and have always been able to make more draw calls when compared to a PC that’s restricted by DX. So the actual benefit to Xbox One will either be negligible or non-existent. Assuming the Xbox One doesn’t have the same issue as a PC on its adapted DX version.

Apologies for quoting and correcting myself, but I found an interview with one of the guys on the Metro Redux game:

http://www.eurogamer.net/articles/digitalfoundry-2014-metro-redux-what-its-really-like-to-make-a-multi-platform-game

He states the following:

Oles Shishkovstov: Let's put it that way - we have seen scenarios where a single CPU core was fully loaded just by issuing draw-calls on Xbox One (and that's surely on the 'mono' driver with several fast-path calls utilised). Then, the same scenario on PS4, it was actually difficult to find those draw-calls in the profile graphs, because they are using almost no time and are barely visible as a result.

In general - I don't really get why they choose DX11 as a starting point for the console. It's a console! Why care about some legacy stuff at all? On PS4, most GPU commands are just a few DWORDs written into the command buffer, let's say just a few CPU clock cycles. On Xbox One it easily could be one million times slower because of all the bookkeeping the API does.

But Microsoft is not sleeping, really. Each XDK that has been released both before and after the Xbox One launch has brought faster and faster draw-calls to the table. They added tons of features just to work around limitations of the DX11 API model. They even made a DX12/GNM style do-it-yourself API available - although we didn't ship with it on Redux due to time constraints.

It sounds like the current Xbox One DX was essentially verbatim DX11. This means that the Xbox One probably won't have as many issues with draw calls (once DX12 is applied) as it once did, assuming the developer isn't already using the "GNM style do-it-yourself API".

I wonder whether the Xbox 360's API was already more console-like at launch, whereas the Xbox One's sounds more like a traditional PC API. Anyway, it sounds like a fix to a problem rather than a wonderful improvement to the Xbox One's capabilities.
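To illustrate the "just a few DWORDs written into the command buffer" point, here is a purely hypothetical sketch (not actual GNM, D3D or Xbox code; the opcode and struct are invented) of what a console-style draw submission can boil down to:

```cpp
#include <cstdint>

// A pre-allocated command buffer that the GPU's command processor reads directly.
struct CommandBuffer {
    uint32_t* cursor;   // next free DWORD in the ring buffer
};

inline void EmitDrawIndexed(CommandBuffer& cb, uint32_t indexCount, uint32_t firstIndex)
{
    *cb.cursor++ = 0x00000010u;   // hypothetical "draw indexed" opcode
    *cb.cursor++ = indexCount;
    *cb.cursor++ = firstIndex;
    // No allocation, no validation, no kernel transition: a handful of CPU
    // cycles per draw, which matches the description of the PS4 path above.
}
```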
 
Devs have described both last-gen consoles as able to push tens of thousands of draw calls, so XB360 was quite different to DX11.
 
I hope that the removal of the draw call barrier could have a chain reaction type effect on the whole rendering setup.

Allowing developers full control over where to shift the bottleneck would help. I feel that weaker GPUs like the Xbox One's perform badly when the workload has highs and lows; its burst potential is lower, so it would benefit from dividing the work up evenly across the whole frame budget.

Not having to bundle draw calls will allow a bit more flexibility there, and multiple cores can continually fill the command buffer, keeping the GPU from idling. Ideally you'd have a consistent workload from beginning to end, as opposed to alternating between small loads and big loads.

We talked earlier about how, as long as no state change occurs, multiple draw calls should perform the same as a bundled one, so now the available threads can submit work without needing to waste time bundling it.
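As a rough sketch of that submission model (DX12-style; the device, pipeline state, allocators and the per-thread draw recording are all assumed to exist elsewhere), several cores can record command lists in parallel and hand them to the queue in one submission:

```cpp
#include <d3d12.h>
#include <thread>
#include <vector>

// Each CPU core records its own command list, then everything is submitted in
// one ExecuteCommandLists call so the GPU's command processor is kept fed.
void BuildAndSubmitFrame(ID3D12CommandQueue* queue,
                         std::vector<ID3D12GraphicsCommandList*>& perThreadLists)
{
    std::vector<std::thread> workers;
    for (ID3D12GraphicsCommandList* list : perThreadLists) {
        workers.emplace_back([list] {
            // ...record this thread's share of the frame's draw calls...
            list->Close();                    // finish recording on this core
        });
    }
    for (std::thread& t : workers) t.join();

    // One submission of all lists; no single-threaded immediate context in the way.
    std::vector<ID3D12CommandList*> submit(perThreadLists.begin(), perThreadLists.end());
    queue->ExecuteCommandLists(static_cast<UINT>(submit.size()), submit.data());
}
```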

Xbox One also benefits from very high read/write ESRAM, able to move 1024 bits in both directions. If issuing a lot of small draw calls and writing to memory were detrimental, then ESRAM - low latency, high read/write bandwidth and so on - might be a reasonable workaround. The DMA engines are capped at 30Gb/s, so it wasn't as if they were capable of moving a lot of big items well; overall, the system looks well suited to handling lots of smaller jobs.

We will need to see how things turn out, I'd like to see my predictions come true, I could tell my wife I'm not that stupid after all.
 
The SDK leak basically showed that Xbox One was running something very similar to standard DX11 up until spring(?) of last year.
 
We will need to see how things turn out, I'd like to see my predictions come true, I could tell my wife I'm not that stupid after all.

I hide all this stuff from my partner, otherwise she'd just think I'm stupid for finding it interesting. :p
 
We will need to see how things turn out, I'd like to see my predictions come true, I could tell my wife I'm not that stupid after all.

If that doesn't work, at night when she's asleep try writing "Chihuahua" on her forehead in permanent marker. When she wakes up in the morning and goes to do her makeup she'll be like "[iroboto] WTF is this!?!?!" and then you can be like "it's a type of dog".

Boom. Now she feels stupid.
 
I wish I had access to the SDK here at work lol. But I do feel rather dumb that, when Phil Spencer said
We knew what DirectX12 was doing when we made Xbox One
and gave the other quote about going ahead and customizing the Xbox One ahead of AMD and Nvidia, I thought only of Feature Level 12_0 in Xbox and what those features could bring.

I should have been asking the obvious question, which is: what big changes in DX12 would ultimately benefit from the existing architecture of the X1 today? The first place I would start is understanding the benefits of unbundling draw calls; the second is looking at that customized graphics command processor, which should be where the bottleneck sits according to Anandtech/Ryan Smith. After that, we look into how the GCP handles 8 separate threads dumping commands into the graphics queue and ultimately how it schedules the work to be done. Then we need to look at how the memory architecture would support it. And then we'll know if X1 is really made for DX12.

If the intention is to open up draw calls as the future mode of rendering, we'll need to identify why. FL 12_0 is nice to have, but I don't think it's the game changer for X1 anymore; it may help provide some longevity, but as Shifty writes, a lot of these features won't be put to use in every game - maybe a select few will benefit, exclusives really.
 
I wish I had access to the SDK here at work lol. But I do feel rather dumb that, when Phil Spencer said "We knew what DirectX12 was doing when we made Xbox One" and gave the other quote about going ahead and customizing the Xbox One ahead of AMD and Nvidia, I thought only of Feature Level 12_0 in Xbox and what those features could bring.
You actually think it was Microsoft that did any modifications to Xbox One APU, and not AMD based on what MS wanted? If there's FL 12_0 in Xbox One, it's in all GCN 1.1+ AMD GPUs
 
You actually think it was Microsoft that did any modifications to Xbox One APU, and not AMD based on what MS wanted? If there's FL 12_0 in Xbox One, it's in all GCN 1.1+ AMD GPUs

I don't think so. Microsoft's goal was/is to be ahead of other companies' schedules:

In the first Xbox, Intel and NVIDIA crafted the silicon. In the case of Xbox 360, it was more of a joint effort between Microsoft and ATI / IBM. Though Microsoft's still working with AMD to build out some of its chips this time around, it's also invested millions of dollars in building out verification facilities (among others) on-site in Mountain View and doubling the amount of in-house engineering dedicated to silicon. Holmdahl explains:

"In the consumer space, to control your destiny, you can't just rely on commodity components. You have to be able to make your own silicon. It helps with performance; it helps with the cost; it helps make your product smaller; it helps you create your own IP (always a good thing). I'll argue you're a lot more flexible -- you're not relying on somebody else's schedule; you make your own. So we're obviously heading that way. The stuff we've done over the last 13, 14 years is one example of that within Microsoft. And you're gonna see more and more of that, is my guess, as you go forward."

It's a major shift away from the company's past reliance on external partners, with only AMD serving as collaborator this time around. And like any game console launch, it's another huge investment for the next... five, eight, 10 years? That's an unknown, of course, but it seems likely based on history that we'll have the Xbox One for the foreseeable future. Whatever the future dictates, it looks like we'll see internally developed chips in many of Microsoft's products going forward.

http://www.engadget.com/2013/05/21/building-xbox-one-an-inside-look/
 
You actually think it was Microsoft that did any modifications to Xbox One APU, and not AMD based on what MS wanted? If there's FL 12_0 in Xbox One, it's in all GCN 1.1+ AMD GPUs

There's that possibility, and it's quite high. I'm taking a wait-and-see approach, however. Even if it was AMD doing the customization based on what MS wanted, AMD could still be several product cycles away from implementing those changes in its own chips.

so we took that opportunity with Xbox One and with our customised command processor we've created extensions on top of D3D which fit very nicely into the D3D model and this is something that we'd like to integrate back into mainline 3D on the PC too - this small, very low-level, very efficient object-orientated submission of your draw [and state] commands.

At the very least I know these ones aren't there yet.
 