Drivers for old APIs vs. low-level APIs

I was just wondering, in regard to the tools at their disposal: do driver writers for the older high-level APIs have more than what is exposed to DX12/Vulkan? Any examples?
 
Well, since the WDK (Windows Driver Kit) does not expose much about the different DirectX APIs, I can only guess: with the pre-low-overhead APIs, a lot of the work is done by the runtime, so fewer - and bigger - things are exposed to the driver, mostly in a single-threaded scenario. With the low-overhead APIs there is much less intermediate work in the runtime, so much "rawer" commands are submitted to the driver, in a multi-threaded scenario. In the first case the driver is "fatter", since it has to do more work interpreting what the runtime produces; in the second case the driver is "thinner", since it receives commands at a lower level of abstraction.
But this is just my guess and I may be wrong, since I have never seen the complete D3D12 NDA driver specifications, nor have I looked for the Vulkan driver specification.
 
But wouldn't having direct hardware access, and only having to cater to one piece of hardware or one hardware family, lead you to believe that a driver for a high-level API might have more tools at its disposal than what D3D12 or Vulkan exposes to the application programmer? Also, stupid question, but it's been a while: what is the runtime responsible for again?
 
What do you mean by "tools"?
As for the (D3D12) runtime, some validation still occurs, but there is a lot less of it than in the D3D11 runtime.
 
I'm hesitant to write anything because the amount of bullshit around these new APIs is so high, and I hate to make it even a tiny bit higher, since I am in no way an expert on writing display drivers (I have written some USB drivers though).
DX12 and Vulkan are not low-level APIs. You don't get to assemble a command buffer byte by byte and send it off to the GPU. You still call methods/functions to set state and issue Draw/Dispatch commands, the same way you did with DX11 or OpenGL. They are "lower level" or "closer to the metal" in the sense that the logic of how to get things done is much closer to how modern GPUs operate. This means bindless resource access. Because the hardware no longer has some fixed number of registers for texture pointers, you (the D3D12/Vulkan programmer) no longer assign each individual texture to a specific texture slot. Instead, a shader core on the GPU gets the direct address of the texture in memory, rather than just the number of the resource slot to fetch from, and you (the D3D12/Vulkan programmer) pack the memory pointers to your textures into descriptor sets. This means the driver/runtime doesn't have to keep track of what's actually in each of 128 slots that can change at any time, even when a slot isn't accessed by the active shader. The API no longer lets you set a texture into a texture slot and leave it there for 100 frames.
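As a rough sketch of the descriptor-table idea (all names here are made up for illustration; this is not the real D3D12/Vulkan API): the application packs resource addresses into a table and hands the shader an index, instead of the driver tracking a fixed set of slots.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sketch, not a real graphics API: the app appends
// descriptors (GPU addresses) into a table; the "shader" side then
// indexes the table directly, so no per-slot bookkeeping is needed.
struct TextureDesc {
    uint64_t gpu_address;    // where the texture lives in GPU memory
    uint32_t width, height;
};

struct DescriptorTable {
    std::vector<TextureDesc> entries;

    // Application-side: append a descriptor and get back the index the
    // shader will use (instead of binding to a fixed slot 0..127).
    uint32_t push(const TextureDesc& d) {
        entries.push_back(d);
        return static_cast<uint32_t>(entries.size() - 1);
    }

    // "Shader"-side: direct lookup by index, no slot state to track.
    const TextureDesc& fetch(uint32_t index) const {
        return entries[index];
    }
};
```

The point is only the shape of the contract: the index is data the application owns, not a binding the driver must snoop.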
They are also "lower level" in the sense that neither D3D12 nor Vulkan does any extensive error checking. They will simply crash (due to an access violation, for example) if the developer forgets to set some resource and the GPU tries to use it. The same goes for resource read/write hazards. With the older APIs the driver has to resolve those, meaning again that it has to keep track of what the GPU is writing to and when it has finished. With the new APIs, developers have to keep this in mind themselves; if they don't, you'll just end up with visual corruption or a crash.
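A toy model of the hazard-tracking handoff (enum and function names are invented, loosely modeled on D3D12's resource-state idea): the application declares transitions explicitly, and nothing catches a stale assumption for you.

```cpp
// Illustrative sketch only, not a real API. Under the new APIs the
// application inserts the transition barrier itself; the driver no
// longer resolves read/write hazards on its behalf.
enum ResourceState { kRenderTarget, kShaderResource };

struct Resource {
    ResourceState state;
};

// Explicit barrier: the app declares the before/after states. Returns
// false when the assumed "before" state is wrong -- the kind of hazard
// an older driver would have silently tracked and resolved.
bool transition(Resource& r, ResourceState before, ResourceState after) {
    if (r.state != before) return false;  // stale assumption by the app
    r.state = after;
    return true;
}

// Sampling the resource in the wrong state is the corruption/crash
// case the post describes: nobody validates this at draw time.
bool sample_as_texture(const Resource& r) {
    return r.state == kShaderResource;
}
```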

What does the D3D runtime do? It implements all the methods that D3D12 exposes to the application developer, on top of the functions D3D drivers are expected to provide (user-mode functions and kernel-mode functions). You can see that the user-mode functions drivers expose haven't really changed all that much with D3D12 (WDDM2). The kernel-mode side, on the other hand, has some larger changes (GPU virtual memory, residency and context monitoring). You won't get two D3D drivers in your package from the IHV (D3D11 and D3D12), and all the legacy stuff still has to work. Vulkan is different here in that it's a clean-slate API. No legacy to support there (yet).

I think I get what you mean by "a high-level API might have more tools at their disposal": a high-level API might have "the big picture", while a low-level API deals with small chunks. The trouble is that D3D12/Vulkan are not really low-level in this sense, and D3D11/OpenGL are not really high-level. They all deal with the same logical objects: vertex buffers, index buffers, textures, render targets, shader programs... which are all fairly low-level. If you go one level higher up, you end up in 3D renderers or GameWorks-style libraries. It also in absolutely no way implies that drivers will stay "thin" forever. They are thin because we have thrown away the training wheels, not because we have prevented driver developers from doing some deeper on-the-fly analysis of what the hell the application is doing and responding accordingly (after which game developers will respond accordingly by analyzing what the hell the driver is doing).
 
I'm hesitant to write anything because the amount of bullshit around these new APIs is so high, and I hate to make it even a tiny bit higher, since I am in no way an expert on writing display drivers (I have written some USB drivers though).
DX12 and Vulkan are not low-level APIs. You don't get to assemble a command buffer byte by byte and send it off to the GPU. You still call methods/functions to set state and issue Draw/Dispatch commands, the same way you did with DX11 or OpenGL.
Good post. But you didn't mention the biggest change: resource management.

DX11 and OpenGL manage the residency of resources themselves. Resources are copied to GPU memory on demand and offloaded from GPU memory when other data needs to fit there (too large a data set, or another application). With DX12 and Vulkan the developer needs to write their own resource manager. So far most developers seem to simply allocate big chunks of memory and load stuff in bulk (mostly at level-load time). The result is that DX12 and Vulkan games tend to have difficulties with GPUs that have less than 2 GB of memory. In DX11 and OpenGL it is fine to overcommit GPU memory; no problem, since the driver manages the exact working set every frame. This obviously causes stalls if the driver notices too many non-resident resources at once. The game would have higher-level knowledge of near-future resource needs, but there is no way to communicate that to the high-level API. The low-level APIs fix this problem, but it requires a lot of developer effort.
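The "allocate big chunks and load in bulk" pattern can be sketched as a simple bump allocator over one fixed heap (all names invented; real code would suballocate from a D3D12 heap or a VkDeviceMemory block). Note there is no eviction path, which is exactly why overcommitting fails outright here instead of being paged by the driver as under D3D11/OpenGL.

```cpp
#include <cstddef>

// Illustrative sketch of an application-side GPU heap: one big
// reservation up front, allocations just advance an offset.
class GpuHeap {
public:
    static constexpr size_t kInvalid = ~size_t(0);

    explicit GpuHeap(size_t bytes) : capacity_(bytes), offset_(0) {}

    // Returns an offset into the heap, or kInvalid when the heap is
    // exhausted -- the application, not the driver, must handle that.
    // 'align' must be a power of two.
    size_t alloc(size_t bytes, size_t align) {
        size_t base = (offset_ + align - 1) & ~(align - 1);
        if (base + bytes > capacity_) return kInvalid;  // no eviction
        offset_ = base + bytes;
        return base;
    }

    size_t used() const { return offset_; }

private:
    size_t capacity_;
    size_t offset_;
};
```

A real resource manager would add freeing, defragmentation, and streaming on top; this only shows why a sub-2 GB card hits a wall when the working set is loaded in bulk.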

Another big change is how dynamic data is handled. DX11 and OpenGL create shadow copies of resources on demand. In DX12 and Vulkan you need to use fences to ensure that the CPU code isn't modifying a resource that is currently being used by the GPU. As the CPU and GPU run asynchronously (the GPU can be up to 3 frames late), it is not trivial to create an automated system that is both correct and efficient. With the new lower-level APIs this is again much improved: the game developer knows which resources are modified, how often, and how much modification happens per frame. Often a ring buffer is the best solution for data that changes every frame (such as constant buffers and dynamic vertex buffers). Resources that change at irregular intervals (less often) require permanent storage (a similar solution to the existing high-level APIs). You of course need to implement that yourself as well :)
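The per-frame ring buffer idea above can be sketched like this, with the GPU modeled as a monotonically increasing "completed fence value" (real code would query ID3D12Fence or a VkFence; everything here is an illustrative stand-in).

```cpp
#include <cstdint>

// Sketch of a 3-slot frame ring: before reusing a slot's dynamic data,
// the CPU must see that the GPU has passed the fence value recorded
// when that slot was last submitted.
class FrameRing {
public:
    static const uint32_t kFrames = 3;  // GPU may run up to 3 frames late

    FrameRing() : cpu_frame_(0) {
        for (uint32_t i = 0; i < kFrames; ++i) fence_[i] = 0;
    }

    // Begin a CPU frame. Returns false if the slot we are about to
    // reuse is still in flight on the GPU (caller would have to wait
    // on the fence instead of overwriting live data).
    bool begin_frame(uint64_t gpu_completed_fence) {
        uint32_t slot = static_cast<uint32_t>(cpu_frame_ % kFrames);
        return fence_[slot] <= gpu_completed_fence;
    }

    // End a CPU frame: remember which fence value will signal that the
    // GPU has finished consuming this slot's dynamic data.
    void end_frame(uint64_t fence_to_signal) {
        fence_[cpu_frame_ % kFrames] = fence_to_signal;
        ++cpu_frame_;
    }

private:
    uint64_t fence_[kFrames];
    uint64_t cpu_frame_;
};
```

This is the "correct and efficient" trade-off in miniature: the ring never blocks until the CPU gets a full 3 frames ahead, and never corrupts data the GPU is still reading.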
 
What do you mean by "tools"?
As for the (D3D12) runtime, some validation still occurs, but there is a lot less of it than in the D3D11 runtime.
Hmmm, by tools I mean that in D3D12 and Vulkan some of the responsibilities the driver had in previous APIs become the responsibility of the application. So, for example, barriers and fences are things that D3D12 exposes to the application programmer to take on said responsibility. I'm asking if drivers for the older APIs have more things of that nature. For example, a D3D11 driver will have an interrupt handler: is it used for more than just the equivalent of fences in D3D12? Stuff like that is what I'm talking about. Basically, I'm wondering if driver writers for the old APIs have more tools or techniques at their disposal to extract performance from the hardware than their modern lower-level API replacements?
 
Hmmm, by tools I mean that in D3D12 and Vulkan some of the responsibilities the driver had in previous APIs become the responsibility of the application. So, for example, barriers and fences are things that D3D12 exposes to the application programmer to take on said responsibility. I'm asking if drivers for the older APIs have more things of that nature. For example, a D3D11 driver will have an interrupt handler: is it used for more than just the equivalent of fences in D3D12? Stuff like that is what I'm talking about. Basically, I'm wondering if driver writers for the old APIs have more tools or techniques at their disposal to extract performance from the hardware than their modern lower-level API replacements?
Understood. I think sebbbi wrote a good response about that. The driver under the old driver model had to deal with a different memory management strategy, where most of the work was done by the API runtime; the driver itself should do more or less the same things, but since it can explicitly receive work from multiple application threads, the overhead of the driver itself is a lot lighter. Moreover, as MDolenc states, there are fewer kernel-mode calls. Though this picture is old and relates to a non-final version of the API, it can still give a good summary of how the driver behaves with proper usage of the low-overhead APIs:

[Image: CPU time comparison chart, cpucompare.png]


Finally, with the new APIs the application can explicitly decide which engines/queues are involved in a particular piece of work (e.g. assigning background texture streaming to the copy queue). This was not possible with the previous APIs, since there was no concept of multi-engine. I might dare to say some driver optimizations could still be possible by creating per-application profiles, but I am not sure whether this could be done automatically by the driver runtime with just "some" heuristic work. As for shaders, I think with SM 5.0 there should be no big differences for the driver between executing a D3D12 and a D3D11 application (note that I am not talking about resource binding, but just the shader code itself).
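The multi-engine routing described above can be reduced to a trivial sketch (queue names loosely modeled on D3D12's direct/compute/copy command queue types; nothing here is a real API call): the application, not the driver, picks the engine for each piece of work.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Illustrative sketch: each queue has its own independent backlog, so
// copy-queue work (e.g. texture streaming) can drain concurrently with
// the direct (graphics) queue instead of serializing behind it.
enum QueueType { kDirect = 0, kCompute = 1, kCopy = 2 };

class MultiEngineScheduler {
public:
    // Application-side routing decision: which engine runs this work.
    void submit(QueueType q, const std::string& label) {
        pending_[q].push_back(label);
    }

    size_t pending(QueueType q) const { return pending_[q].size(); }

private:
    std::vector<std::string> pending_[3];
};
```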

I am not aware of more details; even though I could search through some papers I received from the DX12 EAP, they are under NDA.
Maybe Andrew Lauritzen could give additional details if he reads this thread. ( :
 