DirectX 12: The future of it within the console gaming space (specifically the XB1)

Brad & Oxide are also at the "gl next" keynote; expect them to also be showing off Nitrous on that low-level API as well ;)
 
I thought he did say Star Control was on the way; if not, then what have I been excited for? Oh well, GDC is soon enough. They are supposed to have an X1 game, but no clue what it is.

I do wonder what this would open up for open world games, better foliage or trees?

Brad & Oxide are also at the "gl next" keynote; expect them to also be showing off Nitrous on that low-level API as well ;)

He does play dumb on twitter in regards to this so far. lol
 
I thought he did say Star Control was on the way; if not, then what have I been excited for? Oh well, GDC is soon enough. They are supposed to have an X1 game, but no clue what it is.

I do wonder what this would open up for open world games, better foliage or trees?



He does play dumb on twitter in regards to this so far. lol
I wasn't sure if his game is Star Control or if another company is making Star Control. That part wasn't clear.
 
Age of Mythology? Halo Wars 2?

Be odd to pass that to Oxide maybe. Star Control needs to remain as SB said. Nm, off topic by a mile.
 
Is anyone else a little annoyed at Brad Wardell's disingenuous intentions with his DX12 statements?

He's making it sound like DX12 will lead to huge performance increases for all scenes being rendered ONCE game engines fully support DX12 and once the second round of games supporting DX12 arrives circa 2017-18.

Yeah, this is the point I was trying to make earlier. There are lots of people that see this benchmark as providing similar improvements across the board, which couldn't be further from the truth. All we're seeing is a draw call benchmark in a scenario that doesn't (can't?) have bundled calls. It's completely disingenuous.

That's not to take anything away from what's being accomplished; it's a neat improvement, particularly on PC where it's been a bottleneck for a while. It's just that this whole 8-core CPU and 'unreleased' GPU setup sounds like it's feeding the naive.
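
Just to put rough numbers on why a pure draw call benchmark looks so dramatic, here's a toy model (the per-call costs below are invented placeholders, not measurements from Star Swarm or any driver): when a scene is thousands of unique, unbatchable draws, CPU submission time scales directly with per-call overhead and quickly eats the whole frame budget.

[code]
# Toy model of CPU-side submission cost. The per-draw overheads are made-up
# placeholders purely for illustration, not measured DX11/DX12 numbers.

DX11_US_PER_DRAW = 40.0   # hypothetical microseconds of CPU time per unbatched draw
DX12_US_PER_DRAW = 4.0    # hypothetical cheaper submission path
FRAME_BUDGET_US = 33333   # ~30fps budget on a single submission core

def submission_time_us(draw_calls, us_per_draw):
    """CPU time spent just issuing draw calls for one frame."""
    return draw_calls * us_per_draw

for draws in (1000, 5000, 20000):   # Star Swarm style scenes have huge unique-draw counts
    t11 = submission_time_us(draws, DX11_US_PER_DRAW) / 1000.0
    t12 = submission_time_us(draws, DX12_US_PER_DRAW) / 1000.0
    print(f"{draws:6d} draws: {t11:6.1f} ms vs {t12:5.1f} ms "
          f"(budget {FRAME_BUDGET_US / 1000.0:.1f} ms)")
[/code]

A game that batches or instances aggressively never gets near the top row, which is why the headline multiplier doesn't translate across the board.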

Draw calls aren't necessarily a bottleneck for the Xbox One; it depends on the API used for a game.
 
Draw calls aren't necessarily a bottleneck for the Xbox One; it depends on the API used for a game.

Unless I'm missing something, the XB1 still has Win8, which in turn has WDDM 1.3, which still has some of the CPU bottlenecks that DX12 + WDDM 2.0 resolves...

So do you know something the rest of us don't about the XB1's GameOS?!
 
You might have missed the text I'd quoted from the Metro devs earlier. They suggested that two APIs are available for the Xbox One; one being the fairly standard DX11 and the other being the "GNM style do-it-yourself" API.

Edit: I'll quote the text again in full:

"Oles Shishkovstov: Let's put it that way - we have seen scenarios where a single CPU core was fully loaded just by issuing draw-calls on Xbox One (and that's surely on the 'mono' driver with several fast-path calls utilised). Then, the same scenario on PS4, it was actually difficult to find those draw-calls in the profile graphs, because they are using almost no time and are barely visible as a result.

In general - I don't really get why they choose DX11 as a starting point for the console. It's a console! Why care about some legacy stuff at all? On PS4, most GPU commands are just a few DWORDs written into the command buffer, let's say just a few CPU clock cycles. On Xbox One it easily could be one million times slower because of all the bookkeeping the API does.

But Microsoft is not sleeping, really. Each XDK that has been released both before and after the Xbox One launch has brought faster and faster draw-calls to the table. They added tons of features just to work around limitations of the DX11 API model. They even made a DX12/GNM style do-it-yourself API available - although we didn't ship with it on Redux due to time constraints."
 
They did the same with Titanfall, and Sebbbi indicates that on GCN hardware this is the fastest method of deploying draw calls. One entire thread/core is dedicated to it; there is no deferred context.

While I agree that we can see the CPU isn't bottlenecked, we can't prove that a single thread @ 1.6GHz is enough to feed the GPU either; it might be poorly utilized, if that makes sense. The game can always be, and has traditionally been, designed to save on draw calls. That means the workload may not be evenly spread out across the CPU cores, and possibly the GPU. I suspect this has been the largest advantage of the BIG GPUs that have 64 CUs etc.: if a large last-minute job comes in, they can crush it.
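
For anyone unfamiliar with the pattern being described, here's a rough language-agnostic sketch of "one dedicated submission thread" (the names and the fake draw printout are mine, not engine or XDK code): worker threads only record/queue work, and a single thread owns the immediate context and issues every draw.

[code]
# Sketch of the dedicated-submission-thread pattern (illustrative only; a real
# engine would issue API draw calls here instead of printing).
import queue
import threading

draw_queue = queue.Queue()
STOP = object()  # sentinel telling the submission thread to shut down

def submission_thread():
    """The single thread/core that owns the immediate context and issues all draws."""
    while True:
        cmd = draw_queue.get()
        if cmd is STOP:
            break
        mesh, instances = cmd
        print(f"draw {mesh} x{instances}")   # stand-in for the actual API call

def worker(worker_id, meshes):
    """Game/worker threads only record work; they never touch the device context."""
    for mesh in meshes:
        draw_queue.put((f"{mesh}#{worker_id}", 1))

submitter = threading.Thread(target=submission_thread)
submitter.start()
workers = [threading.Thread(target=worker, args=(i, ["rock", "tree"])) for i in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()
draw_queue.put(STOP)
submitter.join()
[/code]

The appeal of DX12-style APIs is that the recording half can become real command lists built in parallel, instead of everything funnelling through that one thread.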
 
They did the same with Titanfall, and Sebbbi indicates that on GCN hardware this is the fastest method of deploying draw calls.

Oh I agree completely; using the DX11 API it's likely best to dedicate one thread to perform draw calls since the legacy API performs the bookkeeping (as mentioned by Oles Shishkovstov), but theoretically the other API doesn't have the same limitations. It was also presumably available sometime during the development of Metro Redux.

So while I agree that DX12 will free up CPU time in games compared to those developed on DX11, it's not a catch-all.

The original point that Pixel made and I reiterated: the Star Swarm benchmark is specifically designed to show a draw call bottleneck, nothing else. All of the GPUs tested in these results would be completely capable of rendering at the high end if there were no draw call limitation, or if the calls could be bundled.

I feel it's unfortunate that people are taking draw call benchmarks as having a comparable influence on framerates across the board, particularly on consoles. We've had people run the Star Swarm benchmark on a PC with a similar specification to the Xbox One and assume that the DX12 benefit will be proportional for all games. Which, obviously, it won't be.

So I agree the update will provide extra CPU time; it's just that the difference between 11 and 12 won't be as dramatic as some would hope. Expecting across-the-board increases of 1x to 10x is a fallacy.
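
A quick Amdahl's-law style check on that (the submission fractions below are assumptions for illustration, not profiled numbers): the whole-frame gain is capped by how much of the frame was spent submitting in the first place.

[code]
# Amdahl-style sanity check on "DX12 = Nx faster" claims. The fractions of frame
# time spent on draw submission below are assumed, not measured.

def frame_speedup(submission_fraction, api_speedup):
    """Overall frame-time speedup when only the submission portion gets faster."""
    return 1.0 / ((1.0 - submission_fraction) + submission_fraction / api_speedup)

for fraction in (0.10, 0.25, 0.50):
    for api_speedup in (2, 5, 10):
        print(f"submission {fraction:.0%} of frame, API {api_speedup:2d}x faster "
              f"-> frame {frame_speedup(fraction, api_speedup):.2f}x faster")
[/code]

Even with half the frame spent on submission and a 10x faster path, the frame as a whole only gets roughly 1.8x faster.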
 
Right. For me, I look at it a little differently. When the CPU is freed from its bottleneck and is just dumping in more and more work, enough to even overload the GCP, I see a workload that has no gaps. Or at least we are getting closer to producing workloads that can mimic those GPU tests with insanely repetitive algorithms that keep your ALUs, ROPs, and memory constantly in use.

Way earlier I posted an article from AMD research in another thread: they were testing HBM modules and found that 32 CUs running a compute algorithm required 700GB/s of bandwidth for bandwidth to not be the bottleneck. That's an insanely high number that really showcases how undersaturated the CUs can be. But the workloads of games seldom if ever approach that value; I believe the record for Xbox One is 140GB/s for game code, and that's likely not sustained for more than a blip or so.

So definitely there's a lot of room to scale, but I think idle time is killing our GPUs, waiting for work to do. With thousands of small draw calls being submitted, maybe the GCPs will be able to schedule better and feed the CUs more, obtaining much more performance out of our hardware than previously thought possible.

If I were to extrapolate (poorly) the required bandwidth for 12 CUs:
853MHz / 1000MHz = 0.853 (Xbox One GPU clock relative to 1GHz)
700GB/s * 0.853 = 597GB/s
12 / 32 = 0.375 (CU ratio)

597GB/s * 0.375 ≈ 224GB/s

ESRAM can only provide 192GB/s theoretical with a write bubble. It can pull an additional 40GB/s from DDR, for a combined max bandwidth of 232GB/s. It's just enough. But you'd be doing a graphics test LOL.
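
The same (admittedly crude) extrapolation written out, in case anyone wants to poke at the assumptions; the linear scaling with clock and CU count is the simplification above, not a measured relationship.

[code]
# The crude extrapolation above, written out. Assumes the 700GB/s figure from the
# AMD HBM research scales linearly with clock and CU count (a simplification).

REFERENCE_BW_GBPS   = 700.0   # bandwidth AMD found 32 CUs needed to avoid being bandwidth-bound
REFERENCE_CUS       = 32
REFERENCE_CLOCK_MHZ = 1000.0  # assumed baseline clock for that figure

XB1_CUS        = 12
XB1_CLOCK_MHZ  = 853.0
XB1_ESRAM_GBPS = 192.0        # theoretical, with a write bubble
XB1_DDR_GBPS   = 40.0         # additional bandwidth assumed pulled from DDR

needed = REFERENCE_BW_GBPS * (XB1_CLOCK_MHZ / REFERENCE_CLOCK_MHZ) * (XB1_CUS / REFERENCE_CUS)
available = XB1_ESRAM_GBPS + XB1_DDR_GBPS
print(f"extrapolated requirement: {needed:.0f} GB/s")     # ~224 GB/s
print(f"ESRAM + DDR available:    {available:.0f} GB/s")  # 232 GB/s
[/code]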

In any event, if you read the Ubisoft presentation on efficient usage of compute shaders linked here: http://twvideo01.ubm-us.net/o1/vault/gdceurope2014/Presentations/828884_Alexis_Vaisse.pdf

Lol, you can see that on the first attempt they were CPU bound. The eventual solution required a lot of bundling and creating sync points in a very long shader to complete the task. But that's exactly it: it's a very long shader that's waiting around for its group to perform tasks!
 
Unless I'm missing something, the XB1 still has Win8, which in turn has WDDM 1.3, which still has some of the CPU bottlenecks that DX12 + WDDM 2.0 resolves
You are missing the fact that there are actually 3 separate OS environments running simultaneously:
1) Windows 8, which runs the UI and DVR apps in the background;
2) statically linked game code which loads when you start the game;
3) hypervisor kernel to manage virtual hardware access between the two.

The Direct3D 11.X component is essentially a graphics driver that talks directly to the actual hardware. There are no dynamically loaded libraries (DLLs) at all - no WDDM, Direct3D runtime, or Windows 8 kernel.
The game executable is assembled from the object code shipped with the existing version of the XDK - which is a self-contained monolithic-kernel OS tailored for a very specific hardware configuration - and the game source code programmed by the developer.
 
So in theory you could be running games on Xbox One/Windows 10 that all have different DirectX APIs due to the containers?
 
Earlier I posted an article from AMD research in another thread: they were testing HBM modules and found that 32 CUs running a compute algorithm required 700GB/s of bandwidth for bandwidth to not be the bottleneck.

If I were to extrapolate (poorly) the required bandwidth for 12 CUs:
853MHz / 1000MHz = 0.853 (Xbox One GPU clock relative to 1GHz)
700GB/s * 0.853 = 597GB/s
12 / 32 = 0.375 (CU ratio)

597GB/s * 0.375 ≈ 224GB/s

ESRAM can only provide 192GB/s theoretical with a write bubble. It can pull an additional 40GB/s from DDR, for a combined max bandwidth of 232GB/s. It's just enough. But you'd be doing a graphics test LOL.

I read that the Xbox One's bandwidth is actually proportionally greater than the optimal bandwidth for a 32CU GPU to remain completely saturated. It's probably fairly unheard of for a GPU to have such a proportionally high bandwidth.

What kind of an improvement would you expect that the Xbox One will get from the switch to DX12?

From my limited understanding it would depend on a few factors: whether a game is using DX11 or the to-the-metal API, and whether the game is expected to make a lot of draw calls that can't be bundled. I'd guess that the best scenario would essentially mean freeing up nearly a whole CPU core that would otherwise be devoted to making those calls.

I kind of feel that Microsoft has made good strides in removing that bottleneck even for those using DX11, since the 7th core has become available more recently. That will obviously benefit any games on DX12 too, since it's another CPU core.
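
Rough budget arithmetic on what "freeing up nearly a whole core" is worth (the 6-vs-7 core split comes from the point above about the 7th core; the exact OS reservation is an assumption on my part):

[code]
# Share of the game's CPU budget returned by freeing one core from draw submission.
# The 6-vs-7 core split follows the discussion above; exact OS reservations are assumed.

for game_cores in (6, 7):
    freed = 1.0 / game_cores   # one core no longer tied up issuing draw calls
    print(f"{game_cores} cores available: freeing one is ~{freed:.0%} more CPU for game code")
[/code]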

I would like to hear what other improvements are possible with DX12 that aren't draw call related.
 
^ Probably wait for GDC; it seems that after that the DX12 (edit: Team) itself will be willing to talk.

Edit: Be cool if DX12 could talk by itself... ;)
 
You might have missed the text I'd quoted from the Metro devs earlier. They suggested that two APIs are available for the Xbox One; one being the fairly standard DX11 and the other being the "GNM style do-it-yourself" API.

Edit: I'll quote the text again in full:

"Oles Shishkovstov: Let's put it that way - we have seen scenarios where a single CPU core was fully loaded just by issuing draw-calls on Xbox One (and that's surely on the 'mono' driver with several fast-path calls utilised). Then, the same scenario on PS4, it was actually difficult to find those draw-calls in the profile graphs, because they are using almost no time and are barely visible as a result.

In general - I don't really get why they choose DX11 as a starting point for the console. It's a console! Why care about some legacy stuff at all? On PS4, most GPU commands are just a few DWORDs written into the command buffer, let's say just a few CPU clock cycles. On Xbox One it easily could be one million times slower because of all the bookkeeping the API does.

But Microsoft is not sleeping, really. Each XDK that has been released both before and after the Xbox One launch has brought faster and faster draw-calls to the table. They added tons of features just to work around limitations of the DX11 API model. They even made a DX12/GNM style do-it-yourself API available - although we didn't ship with it on Redux due to time constraints."

All of the details of this are in the SDK leak. https://forum.beyond3d.com/threads/xbox-one-november-sdk-leaked.56362/

The "fast semantics" API was previewed in April and released in May, but they were still adding some of the features that are showing up in DX12 as late as October 2014 (descriptor tables). There might not even be a lot of games released that fully leverage that API.
 
I read that the Xbox One's bandwidth is actually proportionally greater than the optimal bandwidth for a 32CU GPU to remain completely saturated. It's probably fairly unheard of for a GPU to have such a proportionally high bandwidth.

What kind of an improvement would you expect that the Xbox One will get from the switch to DX12?
I suspect none, actually, in the sense that nothing would change if games were coded the same as today and just switched over to DX12. To see gains, I think developers must purposefully exploit the draw call advantage: previously, a lot of small jobs (e.g. look at Ubisoft's presentation) would have been painful to implement as a compute shader because of the overhead, so maybe they went through some other path on the GPU, or maybe the work wasn't done at all.

You're right, I don't see the frames per second or the resolution necessarily increasing; in fact, as Psorcerer stated earlier, that's probably the worst-case scenario. You want to keep things around 1080p@30/60fps but you want graphical fidelity to go up, way up. Right now those monster 980 SLI setups are being wasted trying to max out their GPUs by doing things like 4K resolution with a high degree of MSAA. We really should be overtaxing those GPUs to the point that the game runs at 1080p@30fps but looks way better.

So I'm hoping in this regard, with DX12 being the path forward for Xbox, the Xbox One won't/shouldn't suffer from the eventual decline we all thought it would have: that is, if it can only do 900p today, in a few years' time it can only do 720p. I think ideally the Xbox One will hold steady at 900p/1080p but we should see the graphical fidelity ramp up. I think the days of 720p resolution should be gone with the introduction of DX12, however. 720p is a sign of a workload imbalance for the CUs; 900p could be an ESRAM limitation.
 
2) statically linked game code which loads when you start the game.

Wait.... Statically linked?!

I know there are three OSs.

And I knew that title games have previously been explained by the XB1 architects as shipping their own OS, from within what started out as a VM but became highly customized for the XB1 and is now known as a "Partition"...

I just never heard it explained as "statically linked"...

I know very well what a statically linked program is; in fact .NET Native (ProjectN) is exactly this, which I closely follow and develop for.

Do you have a reference for this 'statically linked' assertion of yours? Purely because I would love to know more about exactly how this works for the XB1.
 
You are missing the fact that there are actually 3 separate OS environments running simultaneously:
1) Windows 8, which runs the UI and DVR apps in the background;
2) statically linked game code which loads when you start the game;
3) hypervisor kernel to manage virtual hardware access between the two.

Again, not questioning the validity of what you're saying; it could very well be the actual details which I was missing... BUT this is how the 3 OSs were explained to me at the past \\build\ event...

https://forum.beyond3d.com/threads/xbox-one-november-sdk-leaked.56362/page-11#post-1818214

If in fact the details of the GameOS are that the game is a statically linked program, that's great news... Sort of confirms what I've been expecting from .NET Native and the upcoming Windows 10 Containers...
 