DirectX 12: The future of it within the console gaming space (specifically the XB1)

I'd never heard it referenced before and I don't work on Microsoft platforms myself. I wonder how much support it gets compared to CUDA or OpenCL.
As far as I've understood it, pretty much all games these days use compute shaders (for post-processing if nothing else), and for most games, which use Direct3D, the natural choice of compute API is DirectCompute.
This might be changing now that more and more games, even on PC, are being planned as multiplatform (SteamOS/Linux) titles.
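
For anyone who hasn't used it, driving a DirectCompute shader from D3D11 is only a handful of calls once the resources exist. A minimal sketch, assuming the context, a compiled post-process compute shader and a UAV for the output texture were created elsewhere (the names here are placeholders, not from any real engine):

Code:
#include <d3d11.h>

// Minimal sketch: assumes the device context, a compiled compute shader
// (e.g. a post-processing filter) and an unordered access view of the
// target texture have already been created elsewhere.
void RunPostProcessCS(ID3D11DeviceContext* ctx,
                      ID3D11ComputeShader* csPostProcess,
                      ID3D11UnorderedAccessView* outputUAV,
                      UINT width, UINT height)
{
    // Bind the compute shader and its output.
    ctx->CSSetShader(csPostProcess, nullptr, 0);
    ctx->CSSetUnorderedAccessViews(0, 1, &outputUAV, nullptr);

    // Launch one thread group per 8x8 pixel tile (assuming the HLSL
    // declares [numthreads(8, 8, 1)]).
    ctx->Dispatch((width + 7) / 8, (height + 7) / 8, 1);

    // Unbind the UAV so the texture can be read as an SRV afterwards.
    ID3D11UnorderedAccessView* nullUAV = nullptr;
    ctx->CSSetUnorderedAccessViews(0, 1, &nullUAV, nullptr);
}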
 
Not sure whether to post this here or in the ESRAM thread, but GamingBolt has an article today in which Brad Wardell makes some incendiary claims that the new DX12 SDK will solve the X1's resolution woes by allowing better control of the ESRAM. And interestingly, there is supposedly some kind of automatic ESRAM "easy button"?

http://gamingbolt.com/dx12-should-r...ssue-on-the-xbox-one-esram-to-receive-new-api

“I’ve never heard Microsoft just come out and, I mean they should just really come out and explain to people why they’re having problems getting games to run at 1080p. But maybe they don’t think their users will understand, basically it has to do with developers aren’t making effective use of the eSRAM API. So in DirectX12 they actually threw it away, they threw away the crappy one in DirectX11 and they’re replacing it with a new one. So that’s pretty huge.”

“They also released a new tool, it’s this optimization tool that will actually algorithmically try to come up with an optimization for the developer. So instead of the developer trying to hand set-up what uses eSRAM, they have their own app to try and do as much of it for them as they can. Third, DirectX11 still serializes stuff from the developer to the GPU. It is low-level but the fact is, as low-level as it is, it’s still serializing a lot of GPU calls. So it won’t be anywhere near…you won’t get the benefit on Xbox One that you’re getting on the PC.”

Regardless, there will be substantial benefits after all, including resolving the resolution issues that the Xbox One still faces to date.

“It’s completely different but you are going to get a substantial benefit. The part I think that users will care about is that it should address the resolution stuff for most people. That’s what I think is the most glaring thing that people are upset about. But it won’t do anything magically. The developers still have to use it, it’s not like your old games will magically be faster.”

“Yeah, it should do that [on resolving the resolution issues due to eSRAM], because in DirectX11 it’s really a pain to make good use of the eSRAM. Whereas supposedly in DirectX12, and this is all theory, I haven’t used it myself, but the new API is supposed to make it a lot easier to optimize your use of the eSRAM memory.”

“The API is there for me to use as a tool for the piece of hardware. And the one that was in DirectX11 was not easy, it was a very trial and error process to make use of the eSRAM. In DirectX12 they’ve tried to make it easier to make use with and the easier it is to use, the more likely you’re going to get developers who optimize for it correctly.”


 

I guess the key point here is how much further the Xbox would have to stretch to reach 1080p.
A lot of people just use TFLOPS as a way of measuring performance, but I feel that, from a technical perspective, that doesn't really capture the whole picture.

I've often wondered how AAA developers truly budget what they can and cannot fit in a game at the start of a project.

And so, after much deliberation, I believe they do it like this ;) any of you guys are free to correct me.

They look at the target resolution and the number of shaders involved, and that ultimately determines the amount of floating-point operations the game will consume. We'll say X amount - or they reverse-engineer a target from their target hardware or something. Either way, they should have an idea of how heavy their code can be.

Say you're a console developer targeting PS4, for example. You know you have a peak of 1.84 TFLOPS. No problem.
That means that every second the PS4 is capable of 1.84 trillion floating-point operations. So at 60 fps you have 16.67 ms per frame.
That means it can only do about 0.0307 TFLOPs in a 16.67 ms frame,
or roughly 30.7 billion FLOPs per frame.
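
Spelling out that arithmetic (peak numbers only, everything rounded), a quick sketch:

Code:
#include <cstdio>

int main()
{
    const double ps4_peak_flops = 1.84e12; // 1.84 TFLOPS peak
    const double fps            = 60.0;

    const double frame_time_ms   = 1000.0 / fps;          // ~16.67 ms
    const double flops_per_frame = ps4_peak_flops / fps;  // ~3.07e10

    std::printf("frame time : %.2f ms\n", frame_time_ms);
    std::printf("peak budget: %.1f billion FLOPs per frame\n",
                flops_per_frame / 1e9);                    // ~30.7
    return 0;
}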

Your game needs to come in under 30.7 billion FLOPs per frame, or it is theoretically impossible to do.
Now interestingly, once that frame timer begins, the FLOPs tick away whether each clock cycle is used or not. For reference, on PS4 that's 1.84 billion FLOPs per millisecond.
So game developers must budget for how much time the GPU could sit idle, doing nothing because of cache misses, waiting on memory reads and writes, waiting for instructions, changing states, waiting on sync points, etc.

So games will likely target much less than 30.7 billion in this case; they need to factor in latency and so on. Eventually they arrive at a budget.
Now, this is why I believe that when Naughty Dog said they could run UC4 at 1080p/60, it's because the GPU code had to come in under 30.7 billion FLOPs per frame to make it happen. But they were having trouble with the lost cycles, since once the frame timer starts you're losing FLOPs whether you're using them or not. So it runs at 30 fps instead, which gives them access to 61.4 billion FLOPs per frame; how many are wasted and how many are actually doing work is beyond me. There are only two ways ND can hit this target: either they use fewer FLOPs to do the same task (optimization), or they find a way to use more FLOPs and have less idle time (saturation).

The same thing applies to Xbox One:
about 22 billion FLOPs per frame at 60 frames per second. Now, if the budget of the game is less than 22 billion FLOPs per frame at 1080p, it could theoretically fit in the Xbox's budget. But once again you're going to have that uphill climb. Will DirectX 12 raise efficiency so much that there are very few wasted FLOPs? Shrug.
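
To put some (made-up) numbers on that budgeting idea, here's a small sketch where a guessed 60% utilisation factor stands in for all the stalls listed above; only the 1.84 and ~1.31 TFLOPS peak figures are real, the rest is purely illustrative:

Code:
#include <cstdio>

// Peak per-frame budget scaled by a guessed utilisation factor.
// The 1.84 / 1.31 TFLOPS peaks are the published figures; the 60%
// utilisation is an illustrative assumption, not a measured number.
double EffectiveFlopsPerFrame(double peak_tflops, double fps, double utilisation)
{
    return peak_tflops * 1e12 / fps * utilisation;
}

int main()
{
    const double util = 0.60;  // hypothetical: 40% of cycles lost to stalls

    std::printf("PS4 @60fps: %.1f billion usable FLOPs/frame\n",
                EffectiveFlopsPerFrame(1.84, 60.0, util) / 1e9);  // ~18.4
    std::printf("PS4 @30fps: %.1f billion usable FLOPs/frame\n",
                EffectiveFlopsPerFrame(1.84, 30.0, util) / 1e9);  // ~36.8
    std::printf("XB1 @60fps: %.1f billion usable FLOPs/frame\n",
                EffectiveFlopsPerFrame(1.31, 60.0, util) / 1e9);  // ~13.1
    return 0;
}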

If so, then yeah, I guess it can do it, but it's still down to the ability of the developers and the budget of the game.
 
My only comment is that the TFLOP output for both systems is theoretical, and I'm sure the actual output can vary depending on what exactly you are using the system for.
 
Peak, not theoretical. You'd have to use them all and not have waste, but you're right, I'm not sure how this would translate to GPGPU calculations.
 

This video from PC Perspective shows an initial look at DirectX 12 performance (draw calls per second). It seems to be a pretty substantial increase over DirectX 11, depending on the hardware. At the end (7:06 mark) they show the results from an AMD APU... this is probably the closest thing we have to compare to an Xbox One. It basically gets 8x the draw call performance. Not sure how much to read into that.
 

PC World has performed a few tests as well, and they say the potential leap is insane.

http://www.pcworld.com/article/2900814/tested-directx-12s-potential-performance-leap-is-insane.html
 
So, what's the general consensus around here these days? Is dx12 going to turn the xb1 into a ps4 beating monster, or is dx12 only for PC with xb1 getting small gains?

Personally, the more info that comes out, I'm going with the former... though if that is the case, MS have been unnecessarily quiet on the subject.

To take a couple of headline items: massively increased draw call rates due to parallel submission, and a 50% reduction in frame rendering time due to using async rendering of shadowmaps. Just these two examples alone would result in huge gains.

What is going on?
 
Is dx12 going to turn the xb1 into a ps4 beating monster, or is dx12 only for PC with xb1 getting small gains?

Personally, the more info that comes out, I'm going with the former...

Neither is true. As stated a few pages back, these benchmarks are specifically designed to show drawcall improvements. I'm surprised so many publications are suggesting that the proportional increase across games will be consistent. They won't. Not unless there's a game that uses drawcalls to this excess and can't bundle them.
 
DX12 will benefit PC games most. The Bone may get some small improvements here and there, but likely nothing most people would notice on screen.
 
I think that's a slight misconception.
To date, Xbox One supports DX11 with XB1 extensions for ESRAM and some bundle features. It also supports a low-level version of DX11 which they refer to as fast semantics. The equivalents on PS4 are GNMX and GNM. While we don't have the split on how many developers use the high-level API vs. the low-level one, it's pretty safe to assume the high-level one is used in a lot of studios.

There are many people who think that just because consoles have low-level APIs, all developers are capable of using them, or that all engines employ them.

Ryse: Son of Rome is widely considered Xbox One's best-looking game, and it is built entirely on DX11. Fast semantics didn't even become available until March 2014.

That pretty much goes for all launch games on Xbox. BF4 is such a game as well, and I don't know whether they migrated Battlefield Hardline to a fast-semantics engine.

So if a developer moves an engine like CryEngine 3 from DX11 to DX12, it means they are now moving to a low-overhead API, closer-to-the-metal code, and a multithreaded renderer, none of which they had before.

Having said that though, it wouldn't surprise me to see a lot of games released using DX11.3 and not DX12 as the years go on.
 
So, what's the general consensus around here these days? Is dx12 going to turn the xb1 into a ps4 beating monster,

This absolutely is not going to happen. The PS4 already has its own low-level API, which is likely even more efficient than DX12 (since it's specific to the PS4 architecture).

50% reduction in frame rendering time due to using async rendering of shadowmaps. Just these two examples alone would result in huge gains.

This would only happen in the most extreme corner cases; you're never going to double overall frames per second using async compute. The highest estimates I've heard so far for peak improvements are in the 30% range, with 15-18% being more typical. And the PS4 already has access to async compute through its own API.
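
For reference, "async compute" in D3D12 terms just means a second command queue of type COMPUTE that the GPU can service alongside the graphics queue, with fences for ordering. A rough sketch, assuming the device, graphics queue, fence and already-recorded command lists exist elsewhere (the compute list would be recorded against a COMPUTE-type allocator); error handling and lifetimes are omitted:

Code:
#include <d3d12.h>

// The "async" part is simply a second queue of type COMPUTE that the
// GPU can service alongside the graphics queue.
ID3D12CommandQueue* CreateComputeQueue(ID3D12Device* device)
{
    D3D12_COMMAND_QUEUE_DESC desc = {};
    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
    ID3D12CommandQueue* queue = nullptr;
    device->CreateCommandQueue(&desc, __uuidof(ID3D12CommandQueue),
                               reinterpret_cast<void**>(&queue));
    return queue;
}

void SubmitFrame(ID3D12CommandQueue* graphicsQueue,
                 ID3D12CommandQueue* computeQueue,
                 ID3D12GraphicsCommandList* gfxList,
                 ID3D12GraphicsCommandList* computeList,
                 ID3D12Fence* fence,
                 UINT64& fenceValue)
{
    // Both submissions can be in flight on the GPU at the same time.
    ID3D12CommandList* compute[] = { computeList };
    computeQueue->ExecuteCommandLists(1, compute);

    ID3D12CommandList* gfx[] = { gfxList };
    graphicsQueue->ExecuteCommandLists(1, gfx);

    // Fence so that graphics work submitted after this point only
    // starts consuming the compute results once they are done.
    computeQueue->Signal(fence, ++fenceValue);
    graphicsQueue->Wait(fence, fenceValue);
}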
 
@iroboto
Why is it "safe" to assume most XBO devs are using the high level API? For any multiplatform title, the XBO version is starting from a significant hardware deficit. I'd think these devs would jump on any performance improvement option available to them. Likewise, I'd think MS would make any and all such performance improvement options available to all devs ASAP, rather than waiting for DX12's official release.
 

Both platforms, I mean. Let's look at the number of games that started out as DX11 engines well before launch:
BF4
UE4
CryEngine
What Ubisoft uses
Likely whatever EA uses for its sports games.

We've read interviews about The Crew where they were leveraging GNMX. So that starts to sketch some of the landscape. If AAA developers aren't all fully aligned on using the low-level APIs, then it says a lot about the goals of those companies.
First-wave games might have had too hard a deadline, too much to learn, immature tools, and new hardware to get to grips with. Having to deal with a low-level API on top of that may have been too much. If the code works, why rewrite it all just for Xbox One? Who cares that it performs worse; it's MS's fault for not putting in beefy enough hardware. And if you are already learning a new API like GNM(X), do you really want to invest in learning fast semantics too?

I only see console exclusives getting that treatment. Or highly talented companies going that route where resources are plentiful.

But over the next couple of waves we will likely see more developers head to the lower level.
For developers that haven't left DX11, DX12 is a good reason to upgrade, as it benefits PC, mobile and Xbox One.
 
Neither is true. As stated a few pages back, these benchmarks are specifically designed to show drawcall improvements. I'm surprised so many publications are suggesting that the proportional increase across games will be consistent. They won't. Not unless there's a game that uses drawcalls to this excess and can't bundle them.
I agree, clearly the tests are synthetic, purely to demonstrate one facet. However, there are documented (sort of...) examples where the XB1 has been bottlenecked by its single-threaded draw call submission. That bottleneck is no more with DX12. Surely that's going to have an impact?
 
This would only happen in the most extreme corner cases; you're never going to double overall frames per second using async compute. The highest estimates I've heard so far for peak improvements are in the 30% range, with 15-18% being more typical. And the PS4 already has access to async compute through its own API.

Regarding shadowmaps, I'm wondering if this would benefit Forward+/Clustered Forward solutions more than Deferred Shading renderers.

hm...
 
This absolutely is not going to happen. The PS4 already has its own low-level API, which is likely even more efficient than DX12 (since it's specific to the PS4 architecture).

Now here I have to disagree. From what I gather, the performance gains from DX12 have little to do with getting closer to the metal, and everything to do with utilising multiple CPU cores to submit work to the GPU. DX12 is a paradigm shift in approach rather than getting even closer to the metal.
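
Roughly what that shift looks like in code: every CPU core records its own command list, and only the final submission touches the queue. A heavily simplified sketch, assuming the device, queue, per-thread allocators and the per-thread draw-recording callback all exist elsewhere:

Code:
#include <d3d12.h>
#include <thread>
#include <vector>

// Heavily simplified: the device, queue, per-thread command allocators
// and a function that records this thread's share of the draw calls
// are all assumed to exist elsewhere. Error handling omitted.
void RecordAndSubmitParallel(ID3D12Device* device,
                             ID3D12CommandQueue* queue,
                             std::vector<ID3D12CommandAllocator*>& allocators,
                             void (*recordDraws)(ID3D12GraphicsCommandList*, int))
{
    const int numThreads = static_cast<int>(allocators.size());
    std::vector<ID3D12CommandList*> lists(numThreads);
    std::vector<std::thread> workers;

    for (int i = 0; i < numThreads; ++i)
    {
        workers.emplace_back([&, i]()
        {
            // Each thread records into its own command list/allocator,
            // so no locking is needed while building the frame.
            ID3D12GraphicsCommandList* cl = nullptr;
            device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT,
                                      allocators[i], nullptr,
                                      __uuidof(ID3D12GraphicsCommandList),
                                      reinterpret_cast<void**>(&cl));
            recordDraws(cl, i);   // this thread's slice of the scene
            cl->Close();
            lists[i] = cl;
        });
    }
    for (auto& w : workers) w.join();

    // Only the submission itself is serialised, not the recording.
    queue->ExecuteCommandLists(static_cast<UINT>(lists.size()), lists.data());
}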

Having said that, the last time I got close to the metal was on the Amiga's blitter in about 1986... so I could be misunderstanding the situation...
 