DirectX 12 API Preview

PeterAce

Glad you liked it :)
It's definitely a very good talk, Max. I've been forwarding it to a number of folks who asked for a good overview, and the responses have been very positive: it's clear and at exactly the right level of detail.

Thanks!
 
One thing I'm curious about, besides performance, is stability in PC drivers. It seems like there could be big wins in stability if the API model more closely resembles the hardware state.

Edit:
I guess you might trade some application stability, because of lower level coding, for driver stability. If that's true, it would seem like a good trade, because the application developer is really the one who knows what their application should be doing, rather than having a driver team hack workarounds into their driver to correct what is really an application-side problem.
 
I guess you might trade some application stability, because of lower level coding, for driver stability.
To some extent a bunch of code that used to be in UMDs just moves into the application. At the same time, it's simpler than what a generic driver has to do since apps know which features they need at what times and aren't required to make every obscure combination of things work. Apps can also just pick implementation options that are good (enough) for a given game rather than trying to implement something that is "reasonably good across all games", etc. I think it's fair to guess that apps will also be more likely to shy away from "complicated" implementations in DX12, whereas things that invoke complicated driver paths in DX11 are hidden and get used inadvertently in a lot of cases.

So yeah, I think overall you could get more stability out of this. What you trade is that it will be far more common for games that weren't tested on a specific IHV or piece of hardware to break. And when that happens with a game/hardware combo after the game is no longer being patched... well... it'll happen, and it sucks. Right now you effectively have the IHV driver teams to fix stuff, even in cases where the application is doing something unsafe/undefined.

But yeah, overall I think it's an improvement. I'm just gonna be sad when some game breaks. As an aside, I imagine IHVs will continue to get blamed when things break even though the vast majority of issues are likely to be app issues going forward. But I guess that's not too different from now where there's a non-trivial amount of "if (IHV1) { working code that we test; } else { buggy code that we never test; }"...
 
But this is (mostly) about pushing responsibility from the driver to the application. This pushes more responsibility onto game developers, who don't necessarily have the expertise driver developers have. The end user doesn't care whether something doesn't work because of game logic or driver logic. And I don't see how this new setup would lead to a more stable end-to-end solution. Potentially more performant - sure, that's the goal after all. But stability is actually one of the trade-offs here, IMO.
 
That also gives more freedom to programmers and makes drivers simpler, and therefore more robust.
Now it's true that not all game devs/studios will have the expertise required for that kind of API, but with many engines widely available, it might not be such a problem. (Middleware companies will have the expertise and handle that, which will reduce the number of people needing support from IHVs...)

Alternatively, keeping D3D11 around (or something of that kind) could be an option, but I'd rather have game engine companies take on the burden.
 
But this is (mostly) about pushing responsibility from the driver to the application. This pushes more responsibility onto game developers, who don't necessarily have the expertise driver developers have. The end user doesn't care whether something doesn't work because of game logic or driver logic. And I don't see how this new setup would lead to a more stable end-to-end solution. Potentially more performant - sure, that's the goal after all. But stability is actually one of the trade-offs here, IMO.

This position assumes that no other options will exist, such as DX11 or "D3D12.cap11" or something like it. I can't imagine that DX12 will be such a monumental break from everything that the ONLY option is to write at the lowest level.
 
I just looked; I am guessing the important bit here is the Pixel Shader UAV ordering.

So will we be able to run GRID2's Intel effects on AMD or NVIDIA hardware? Or is it a hopeless case for GRID2?
 
When could we hope for a new info bite on DX12?

The geek withdrawal syndrome is really hard :rolleyes:

The best I can give you right now is later this year. The talk I gave at //BUILD covered the high-level architecture of D3D 12 but not the API surface itself. The next reveal will cover more of the architecture and have API details as well. I think I might need more than an hour talk though :).

Max McMullen
Direct3D Development Lead
Microsoft
 
Read through the backup slides in Max's talk (http://channel9.msdn.com/Events/Build/2014/3-564) and see what you find :)
I see both PS UAV ordering and conservative rasterization :)

I wonder if conservative rasterization is going to beat compute shader based light/particle tile binning algorithms in performance. When you combine both of these features you could for example bin particles to 8x8 lower resolution grid with conservative rasterization. 128 bpp is enough to keep 8 particles (16-bit indices) per tile. That should be enough if you always drop the most distant one when there's no room left (UAV ordering makes that possible). That's a 64x reduction in fill rate cost. Afterwards you just render a single full screen pass that gathers the particles and samples all the particle textures from a shared atlas (zero overdraw). The additional bonus is that you don't even need to depth sort your particles before rendering. Finally we can say goodbye to half resolution particles :)
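To make that concrete, here is a rough HLSL sketch of what such a binning pixel shader could look like, using a rasterizer ordered view for the per-tile read-modify-write (i.e. the pixel shader UAV ordering feature). None of this is from Max's talk; all the names (gParticles, gTileList, PARTICLE_INDEX, viewDepth) are invented for illustration, and it simply follows the eight 16-bit indices per tile layout described above.

```hlsl
// Sketch only: binning pass, drawn as particle quads at 1/8 x 1/8 resolution
// with conservative rasterization enabled. Assumes gTileList was cleared to
// 0xFFFFFFFF (all slots empty) before the draw.

struct Particle
{
    float viewDepth;   // used to decide which particle to evict from a full tile
    // ... position, size, atlas rect, etc.
};

StructuredBuffer<Particle> gParticles : register(t0);

// One uint4 (128 bits) per screen tile = eight 16-bit particle indices.
// Rasterizer ordered view: accesses to the same element from overlapping pixel
// shader invocations are ordered, so this read-modify-write needs no atomics.
RasterizerOrderedStructuredBuffer<uint4> gTileList : register(u0);

cbuffer BinningConstants : register(b0)
{
    uint gTileGridWidth;   // number of tiles per row
};

static const uint kEmptySlot = 0xFFFF;

uint GetSlot(uint4 packed, uint i)
{
    return (packed[i >> 1] >> ((i & 1) * 16)) & 0xFFFF;
}

uint4 SetSlot(uint4 packed, uint i, uint value)
{
    uint slots[4] = { packed.x, packed.y, packed.z, packed.w };
    uint shift = (i & 1) * 16;
    slots[i >> 1] = (slots[i >> 1] & ~(0xFFFFu << shift)) | (value << shift);
    return uint4(slots[0], slots[1], slots[2], slots[3]);
}

void BinParticlePS(float4 svPos : SV_Position,
                   nointerpolation uint particleIndex : PARTICLE_INDEX)
{
    uint2 tile    = uint2(svPos.xy);                  // one pixel = one 8x8 tile
    uint  tileIdx = tile.y * gTileGridWidth + tile.x;
    uint4 packed  = gTileList[tileIdx];

    // Find an empty slot, or the farthest particle already binned to this tile.
    uint  evictSlot  = 0;
    float evictDepth = -1.0f;
    [unroll]
    for (uint i = 0; i < 8; ++i)
    {
        uint idx = GetSlot(packed, i);
        if (idx == kEmptySlot) { evictSlot = i; evictDepth = 3.4e38f; break; }

        float d = gParticles[idx].viewDepth;
        if (d > evictDepth) { evictSlot = i; evictDepth = d; }
    }

    // Keep the incoming particle only if it is nearer than the one it replaces.
    if (gParticles[particleIndex].viewDepth < evictDepth)
        gTileList[tileIdx] = SetSlot(packed, evictSlot, particleIndex);
}
```

The full-screen gather pass would then read each tile's eight indices, fetch those particles, and blend them while sampling from the shared atlas, as described above.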

In the last 10 years (the Xbox 360 and PS3 era) there have been lots of people on forums wondering why consoles offer better performance than comparable PCs, and how it is possible that driver / draw call overhead can be so much higher in PC DirectX. I wholeheartedly recommend this talk to anyone wondering about this topic. Also, if you look at the 1 hour, 0 minute mark, you see that even in DirectX 12 there's still quite a bit of CPU time spent inside the UMD (user mode driver) on PC. It's impossible to kill all the overhead on PC, because the PC has such a wide variety of different GPUs. However, I must applaud the DirectX team for achieving such a good low level API that works efficiently on such a wide array of hardware (including the tile based mobile chips).
 
When you combine both of these features you could for example bin particles to 8x8 lower resolution grid with conservative rasterization.
Indeed, the two features work well together :) There are several cases where you want to use conservative rasterization to invoke the pixel shader for any pixels where things could be happening, and then make more complicated decisions in the pixel shader (e.g. append to a list, compress, discard, etc.). Binning is definitely one of the interesting ones.

It's impossible to kill all the overhead on PC, because the PC has such a wide variety of different GPUs.
Indeed, but it's definitely possible to get it to a level where it is unlikely to ever be a significant bottleneck. For instance, while smaller batches are great, most would agree it's a non-goal to be vertex bound feeding the GPU frontend one triangle at a time. As long as the CPU submission path has a healthy throughput lead on the GPU frontend, I think you're in a pretty good place. (Obviously all CPU power reductions are important on modern chips, but you get my point.)

It's also worth noting that "apples-to-apples" comparisons between PC and console aren't necessarily that well-defined. While it's easy to point to "UMD time" in the Windows version, that's not really "overhead" if it's something that would need to be done by the application on a console, for instance. Similarly, while DX12 moves a lot of work out of the UMD into the application (where it can often be specialized more and thus done somewhat more cheaply), it's not as if no work needs to be done there. The overhead of "rendering" is really the sum of work across the application, runtime and driver, even though game developers tend not to consider the application portion to be "overhead" for them. Even if you compile a specific UMD into your own application (à la console), that doesn't magically make rendering "zero overhead" for any useful definition of overhead :)
 