optimization of the drivers

Trident

Newcomer
Hi there!

I've always wondered: how do programmers optimize drivers?
I mean, why do new driver versions provide a performance boost in some applications, and sometimes even an overall performance boost? Why can't they do this from the very beginning? Where do the improvements come from? Better algorithms, or what?


P.S. Sorry for my English!
 
I've always wondered: how do programmers optimize drivers?
Just like any other extraordinarily large and complex code base, you will find bugs in the code that require fixing. For example...
I mean, why do new driver versions provide a performance boost in some applications, and sometimes even an overall performance boost?
That would depend on the "bug" that was fixed, or the performance improvement. As an example, let's say the driver experiences a performance drop when dealing with specific nested loops of a particular shader algorithm due to bad handling of the hardware registers. By fixing that bug, the driver developer may radically accelerate that process for all applications that use that particular shader method. But not all apps use ALL shader methods, and so the register pressure condition may have never affected a large number of applications. Only the apps that call the "bugged" driver code will be affected, both con and pro.

Why can't they do this from the very beginning?
Why does any code product need updates? Windows? Linux? Office applications? A website? A browser? Because code isn't perfect from day zero, and the usage of your product will change over time which will have the tendency to expose different bugs.
 
To be slightly more specific, a lot of the driver code is actually a compiler for all the various high-level and lower-level APIs and languages that the hardware supports. My use of the term "bug" above more often points to a bad compiler decision that results in a less-than-optimal hardware translation.

One of the far more experienced members can give you concrete examples of how certain high-level API calls could translate into a hardware-level assembly instruction set. That conversion is not "pure" and involves a lot of speculative work; sometimes the compiler simply guesses wrong. The result is rarely a complete failure; it's more likely to be bad performance. Sometimes even "bad" performance isn't bad enough to notice, other times it is.
 
That would depend on the "bug" that was fixed, or the performance improvement. As an example, let's say the driver experiences a performance drop when dealing with specific nested loops of a particular shader algorithm due to bad handling of the hardware registers. By fixing that bug, the driver developer may radically accelerate that process for all applications that use that particular shader method. But not all apps use ALL shader methods, and so the register pressure condition may have never affected a large number of applications. Only the apps that call the "bugged" driver code will be affected, both con and pro.
I see.

Why does any code product need updates? Windows? Linux? Office applications? A website? A browser? Because code isn't perfect from day zero, and the usage of your product will change over time which will have the tendency to expose different bugs.

But software updates are much less likely to "kill" the whole system than hardware updates. I think it is not a fair comparison...

To be slightly more specific, a lot of the driver code is actually a code compiler for all the various high level and lower level API's and languages that the hardware supports. My use of the term "bug" above is more often pointing to a bad compiler action that results in a less-than-optimal hardware translation.

Oh! So basically a driver is a set of methods that are used by programs? And these methods do some stuff using the hardware. So they are just like any other program, except that they operate without abstraction, "to the metal"?
 
Applications (including games) do not call driver methods directly. Applications use APIs (such as DirectX and OpenGL). An API call made by the application can be implemented by calling one or more driver methods. Drivers are software just like everything else. GPU driver software basically translates high-level concepts to a format understood by the GPU hardware, and does the required data transfers to the GPU over the PCI Express bus (it might call another driver to perform the actual transfer).

The shader compiler is only one part of the driver. It translates platform-independent shader code (HLSL / GLSL) to hardware-specific microcode. DirectX actually has a built-in HLSL -> bytecode compiler. The driver thus doesn't need to parse the HLSL code (text) itself, only translate that general-purpose bytecode to hardware-specific microcode. The DirectX shader compiler does the most generic optimizations for the shader code, but the driver's shader compiler must do additional optimizations on top of the generic ones, because every GPU architecture is different.

Most GPU APIs hide GPU memory management from the application. The application cannot directly modify GPU memory mappings. Usually it's the responsibility of the driver to handle this. APIs support different kinds of resources. Some resources are static and some are temporary (temporary resources often need to have multiple copies in memory to hide CPU<->GPU latency). Memory management is not easy. Fragmentation of a limited GPU memory space is always an issue. Lifetime management (and transfers between CPU<->GPU memory spaces) can be quite complicated as well.
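To make the fragmentation point concrete, here is a minimal sketch of a first-fit suballocator over a fixed-size memory range. It is only an illustration: `FirstFitHeap` and its methods are hypothetical names, and a real driver's allocator also has to deal with alignment, resource types, residency and much more.

```python
class FirstFitHeap:
    """Toy first-fit suballocator over one contiguous range (hypothetical)."""

    def __init__(self, size):
        # Sorted list of free ranges, each stored as (offset, length).
        self.free = [(0, size)]

    def alloc(self, n):
        """Return the offset of an n-unit block, or None if no single
        free range is large enough."""
        for i, (off, length) in enumerate(self.free):
            if length >= n:
                if length == n:
                    del self.free[i]
                else:
                    self.free[i] = (off + n, length - n)
                return off
        return None

    def release(self, off, n):
        """Return a block to the heap and merge adjacent free ranges."""
        self.free.append((off, n))
        self.free.sort()
        merged = []
        for o, length in self.free:
            if merged and merged[-1][0] + merged[-1][1] == o:
                merged[-1] = (merged[-1][0], merged[-1][1] + length)
            else:
                merged.append((o, length))
        self.free = merged


heap = FirstFitHeap(100)
a = heap.alloc(30)
b = heap.alloc(30)
c = heap.alloc(30)
heap.release(a, 30)
heap.release(c, 30)
# 70 units are free in total, but they are split around block b,
# so a 50-unit request cannot be satisfied.
fragmented = heap.alloc(50)
heap.release(b, 30)
# With b gone the free ranges merge and the same request succeeds.
after_merge = heap.alloc(50)
```

In this sketch the 50-unit request fails while block `b` sits in the middle of the heap, even though more than 50 units are free overall. That is the kind of situation a driver's memory manager has to avoid or work around.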
 
In the PC space it's even more complicated: drivers try to correct bad or poor application behavior by identifying patterns in primitive submission or shader code, or by identifying the application more directly. I've always thought this was a bad idea, but all the vendors do it.

They also have to emulate old bugs that released applications have become dependent on. It's actually surprising to me that newer drivers break so little.

The short version is that graphics drivers are complicated pieces of software, and there is always scope to improve such things.
 
Shader compilers are not the hardest or most interesting pieces of optimization happening in the driver, though. Yes, you can produce more optimal code, lower register pressure and do other things, but most of the cost comes from UMD<->KMD<->HW communication (user-mode driver, kernel-mode driver, hardware).

A simple example would be clears. You can pass every clear down intertwined with some Draw calls. But perhaps you can optimize (as in: remove) some of the calls completely (e.g. Draw won't be visible because there's a clear that follows)? There are many ways you could approach this task and it boils down to implementing something, testing performance in common applications (desktop, UI-heavy tools, games) and figuring out a better strategy for the task. If you can cut down the number of UMD->KMD transitions then you're golden.
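A minimal sketch of that idea, under heavy simplifying assumptions: every draw only writes to its own render target, every clear wipes the whole target, and "present" is the only command that reads a target. The `Cmd` type and the command kinds are hypothetical and not taken from any real driver.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cmd:
    kind: str    # "draw", "clear" or "present" (hypothetical command kinds)
    target: str  # name of the render target the command touches


def drop_overwritten_work(cmds):
    """Drop draws and clears whose result is wiped by a later full clear
    of the same target before that target is presented."""
    kept = []
    pending_clear = set()  # targets cleared later with no read in between
    for cmd in reversed(cmds):
        if cmd.kind in ("draw", "clear") and cmd.target in pending_clear:
            continue  # never visible: a later clear overwrites it
        if cmd.kind == "clear":
            pending_clear.add(cmd.target)
        elif cmd.kind == "present":
            pending_clear.discard(cmd.target)
        kept.append(cmd)
    kept.reverse()
    return kept


frame = [
    Cmd("draw", "A"),     # overwritten by the clear below
    Cmd("draw", "B"),
    Cmd("clear", "A"),
    Cmd("draw", "A"),
    Cmd("present", "A"),
]
optimized = drop_overwritten_work(frame)
```

Walking the list backwards keeps the bookkeeping to a single set. A real driver has far more cases to consider (partial clears, targets read back as textures, queries), which is where the testing against common applications comes in.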

Basically most improvements come from batching communication, figuring out what doesn't need to be done[1] and minimizing the GPU idle time. Then there's shader optimization and general code performance improvements.

[1] there's a nice sentence in DX docs that applies here: the fastest pixel is the one you don't draw
 
Drivers have various optimizations for increasing fps on a per-application basis such as:

-Automatic memory management with suballocation and texture caching with priority. This is especially useful in games having dynamic loading of resources.

-Hand-optimized shaders (in PTX/IL) for popular games. This requires in-depth profiling of the game, understanding its shaders and rewriting them in the vendor's native intermediate shader language to get that extra 10% advantage over the competition!

-Some optimization for using lower precision texture formats to save memory bandwidth.

And as Dominik said, the driver won't write to the command buffer if you do something unnecessary and performance-critical, such as issuing many flushes without a draw call between them. It also tries to minimize what gets written to the command buffer so the GPU's life becomes easier.
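As a rough sketch of the flush case, assuming commands are just strings and that a flush with no work recorded since the previous flush can be skipped safely (`coalesce_flushes` is a hypothetical helper, not a real driver function):

```python
def coalesce_flushes(commands):
    """Drop flushes that have no recorded work since the previous flush."""
    out = []
    dirty = False  # True once something other than a flush was recorded
    for cmd in commands:
        if cmd == "flush":
            if not dirty:
                continue  # nothing new to submit, skip the write
            dirty = False
        else:
            dirty = True
        out.append(cmd)
    return out


stream = ["draw", "flush", "flush", "flush", "draw", "flush"]
written = coalesce_flushes(stream)
```

Here only one flush per batch of draws ends up in the output, so fewer commands are written for the GPU to process.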
 