Shader Compilation on PC: About to become a bigger bottleneck?

All the more reason for there to be a repository of compiled shaders per game and driver version, so that the game just needs to download the appropriate package.
I would love for this to be possible. It should be on the developers or Nvidia/AMD/Intel to create this and have optimal packages to optionally download with drivers.
 
I'm of the opinion that I don't care how long these shaders take to pre-compile. Pre-compile everything possible and optimize everything else that you can. The EXPERIENCE itself is what's important... not the 1 extra step required for setup before you can play.

Like I said... something is going to have to be done eventually. This stuff will keep getting worse.. and when your $2000 rig has high FPS but stutters everywhere, and yet the console versions are smooth, then people will start questioning why they even bother with PC anymore. It's in Nvidia/AMD/Intel's best interest to figure this stuff out. Resolutions are high enough, consoles are beginning to get more 60fps games and possibly 120fps games.. TVs which support those refresh rates will become more common... what good is 240Hz if you're dealing with hitches as games load stuff?
 
What determines compile times mainly? CPU?

It's a combination of things. CPU handles the job of compiling the shaders but compiler design is another big factor as well ...

If we take a look at RADV, the open source community Vulkan driver for AMD on Linux: when it transitioned from the LLVM compiler to the ACO shader compiler, shader compilation times improved transparently because ACO needed fewer redundant optimization passes than the LLVM compiler backend. The consequence of making the compiler specific to a certain set of hardware is that ACO only works on AMD GPUs, while the LLVM compiler works on many different architectures ...

By introducing AMDIL, which is a vendor specific IR, a vendor like AMD could make a really simple compiler which would vastly improve the shader compilation times. We were able to observe a significant real world improvement in shader compilation times just from designing the compiler for a specific vendor's HW while still consuming a vendor agnostic IR (SPIR-V), so we can only imagine the possibilities if we had both a vendor specific IR and compiler!
 
Possibly Disk IO, need to read and write all the shaders.
Compiled shader caches are extremely small, I don't think there's much I/O. In my experience it's mainly CPU - Dishonored 2 for example took 7-10 minutes to compile when I just had a hyperthreaded Pentium. When I got my 6-core 9400 it went down to ~2 minutes, and CPU utilization on all cores is pegged at 100% when it's doing this.
 
It's a combination of things. CPU handles the job of compiling the shaders but compiler design is another big factor as well ...

Interesting, thanks!
 
Compiled shader caches are extremely small, I don't think there's much I/O. In my experience it's mainly CPU - Dishonored 2 for example took 7-10 minutes to compile when I just had a hyperthreaded Pentium. When I got my 6-core 9400 it went down to ~2 minutes, and CPU utilization on all cores is pegged at 100% when it's doing this.
They aren't that small. Detroit: Become Human's cache is 1.26GB on my machine. Regardless, I/O does have an effect on reading and making small updates to files... but it's likely not any sort of bottleneck on a decent CPU.

My 3900X smashes through these processes.
 
I'm of the opinion that I don't care how long these shaders take to pre-compile. Pre-compile everything possible and optimize everything else that you can. The EXPERIENCE itself is what's important... not the 1 extra step required for setup before you can play.
Staring at a shader compilation screen for 5+ minutes for every game you start after a driver update (or game patch) would indeed hamper the 'experience', especially when the new consoles can have you jump back in a game from a suspended state in ~5 seconds.
 
Staring at a shader compilation screen for 5+ minutes for every game you start after a driver update (or game patch) would indeed hamper the 'experience', especially when the new consoles can have you jump back in a game from a suspended state in ~5 seconds.
Are you installing new drivers multiple times during a game playthrough? Changing GPUs often? Patches coming out every day? Are you playing 5 games at a time?

If I delete a game... and I want to play it again I have to re-download it... oh lord have mercy... not that process again!

Faster CPUs mean the process goes faster. Like I said... they'll have to figure out something one way or another.
 
Are you installing new drivers multiple times during a game playthrough? Changing GPUs often? Patches coming out every day? Are you playing 5 games at a time?
Drivers are released roughly every month, and often they're required to get the best experience. For long games yes, I can absolutely experience this process if I'm busy and can't devote 20 hours a week to it. And yes, I'll bounce through multiple games in a month, this is really not uncommon.

You have the case in Horizon now where even changing some settings in the GPU control panel screws up the graphics, requiring you to rebuild the shader cache (15 mins) if you want them corrected. Again, egregious example, but this kind of stuff does happen outside of driver updates as well.
If I delete a game... and I want to play it again I have to re-download it... oh lord have mercy... not that process again!
Bizarre and irrelevant comparison.
Faster CPUs mean the process goes faster. Like I said... they'll have to figure out something one way or another.
Hence, this thread to hear if anyone in the industry is aware of efforts to improve it.

Yet you're here, simultaneously implying I'm being hyperbolic about it being a problem, and saying 'they'll have to figure out something'? I get that you don't care, but no one is forcing you to participate in this thread.
 
Drivers are released roughly every month, and often they're required to get the best experience. For long games yes, I can absolutely experience this process if I'm busy and can't devote 20 hours a week to it. And yes, I'll bounce through multiple games in a month, this is really not uncommon.
Ah so once a month you have to sit through a 7-10 min process after a driver update. Sorry if I don't share your sentiments that it's a bigger problem than the games stuttering the entire time while I'm trying to play and enjoy them. And I've been changing many settings in Horizon Zero Dawn and have not had to rebuild my shader cache yet.

No, it's not really bizarre or irrelevant. It's a necessary thing. As will be building or having pre-compiled shader caches in the future. I guess the only option going forward is just that... give people an option. You can deal with stutters throughout the game, and I can build my shader cache one time before I play and enjoy a much better experience.

You see, I'm not simultaneously implying you're being hyperbolic and then saying "they'll have to figure something out"... the fact is it's quite subjective what people are willing to put up with, and both can be true. For you.. it's a big deal loading into the game the first time.. and for me, the gameplay experience is more important. From my view, you're exaggerating how much of a hassle it is.. acting like you're going to be constantly updating your drivers and changing settings which force you to constantly redo it. And the truth is... they DO have to figure something out.. I fully acknowledge that building shader caches is a pain in the ass... but I also understand enough to know why it's necessary.. and why it will be increasingly important in the future.

What you're seeing ARE the efforts by the industry to improve the "situation"... the situation of course being that compiling shaders during runtime can cause games to hitch like crazy and ruin the very important first impression.. thus we have a proposed solution.. pre-compiling them at first load after install.

Again, I'm not going to pretend like I'm a programmer and that I know if things can be done differently or not. I'm just a consumer voicing my opinion relevant to this thread.
 

I feel like the blog post isn't clear enough on the deeper understanding as to why shader recompilations are becoming more common in newer APIs ...

For some historical context, D3D7 to D3D9 and OpenGL were initially designed for HW modeled after state machines, and this remains true for some GPU designs today, such as Nvidia hardware, which somewhat matches this paradigm. Shader recompilations usually aren't triggered often on these types of architectures because they change rendering states through individually set registers that send commands to the fixed function HW units.

Today, D3D12 and Vulkan are designed for theoretical HW that's near stateless. In this model some hardware can potentially emulate these rendering states on their programmable shader cores, so changing states would involve patching the shader code, which can be very expensive. PSOs in these APIs exist so that some GPUs can associate these rendering states with their binary blobs. Most commonly, as we see on mobile GPUs, the blend state has to be emulated with shader code because they might not necessarily have a fixed function blending unit performing these operations. This is true for some other rendering states on AMD HW as well, such as the entire input assembler stage. Shader recompilation being triggered by a driver update is the expected outcome because the binary blobs encode some graphics state, and a blob can change between hardware generations or compiler versions.
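
To make that last point a bit more concrete, here is a rough sketch in Python of how a shader cache could key its entries. This is not how any real driver does it; the function name `pso_cache_key` and its four inputs are made up for illustration. The only point is that the render state, the GPU and the driver/compiler version all feed into the key, so changing any one of them means a cache miss and a recompile.

```python
import hashlib
import json


def pso_cache_key(shader_bytecode, render_state, gpu_id, driver_version):
    """Hypothetical cache key for one compiled pipeline blob.

    Assumption for illustration: the blob depends on the shader bytecode,
    the render state baked into the PSO, the GPU, and the driver version.
    """
    parts = [
        shader_bytecode,
        # sort_keys makes the key independent of dict insertion order
        json.dumps(render_state, sort_keys=True).encode("utf-8"),
        gpu_id.encode("utf-8"),
        driver_version.encode("utf-8"),
    ]
    h = hashlib.sha256()
    for part in parts:
        # Length-prefix each part so adjacent fields cannot run together
        h.update(len(part).to_bytes(8, "little"))
        h.update(part)
    return h.hexdigest()
```

With a key like this, the same shader with a different blend state, or the same PSO after a driver update, hashes to a different entry, which is the "rebuild everything after a driver update" behaviour people are complaining about.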

D3D10 and D3D11 are sort of in between these two extremes, where they move away from being designed solely for state machines but they aren't stateless. Their model is based around "state objects" which group together other more granular graphics state from the pipeline, and this is arguably the period when shader recompilations started becoming a bigger issue, as some newer hardware designs were moving towards emulating some of these states. It was clear at that point that runtime shader compilation wasn't going to be sustainable for some architectures, so the proposed solution was to do shader compilation during boot up or loading times, which is exactly what D3D12/Vulkan set out to support.

To put it simply, shader recompilation is a ramification of modern HW design ...
 
How much variation in shader compilation time is there depending upon the user's graphics settings?
 
How much variation in shader compilation time is there depending upon the user's graphics settings?
Depends.
Is it just more instances of expensive materials? None.
Is it additional materials / effects? Linear in the number of additional effects.
Are there some uber-shaders in which the most complex code paths are enabled? Now we are talking, shader compilation time can ramp up quickly. Like if you have additional code paths for a better lighting model which are now enabled in all materials, that's getting costly.

But let's be realistic, it's mostly the first or second case. And it's not the visual quality that drives the compilation cost, but the vast number of distinct materials with only slightly differing parameters, which requires a similarly huge number of different PSOs.
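
A toy illustration of that count, in Python. The material layout, the pass names and the `distinct_psos` helper are all invented for the example, so treat it as a sketch of the counting argument and not of any engine's data model. It compares baking each material's parameters into the pipeline against feeding them in as data.

```python
def distinct_psos(materials, passes, bake_parameters):
    """Count the unique pipelines needed for a set of materials.

    bake_parameters=True models specialization: every distinct parameter
    set becomes part of the pipeline identity. False models passing the
    same parameters as uniform data, so only shader and pass matter.
    """
    psos = set()
    for material in materials:
        for render_pass in passes:
            if bake_parameters:
                params = tuple(sorted(material["params"].items()))
                psos.add((material["shader"], render_pass, params))
            else:
                psos.add((material["shader"], render_pass))
    return len(psos)


# 1000 materials sharing one shader, differing only in a hypothetical tint id
materials = [{"shader": "pbr", "params": {"tint_id": i}} for i in range(1000)]
passes = ["depth", "gbuffer", "shadow"]

baked = distinct_psos(materials, passes, bake_parameters=True)      # 3000
unbaked = distinct_psos(materials, passes, bake_parameters=False)   # 3
```

Same visuals in principle, three orders of magnitude difference in how many pipelines have to be compiled.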

Eliminating some run time cost by resolving as much as you can through specialization rather than a UBO is a sensible optimization as long as we are only talking about run time cost, but it doesn't apply well to a heterogeneous target platform.
If we see yet another AAA console port showing up with abnormal shader compilation times, what we are seeing is an optimization that isn't unconditionally applicable backfiring. And the mitigating techniques which a native PC application might have applied, such as going for a UBO controlled uber shader first and swapping specializations in at run time as required, just haven't been implemented.
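
For what it's worth, the "uber shader first, swap specializations in later" idea can be sketched in a few lines of Python. Everything here (`PipelineCache`, `compile_fn`, the string stand-ins for pipelines) is hypothetical and only shows the control flow: the first request for a material kicks off a background compile and returns the slower generic pipeline, and later requests pick up the specialized one once it's ready.

```python
from concurrent.futures import ThreadPoolExecutor


class PipelineCache:
    """Hypothetical cache: serve an uber pipeline until the specialized one exists."""

    def __init__(self, compile_fn, uber_pipeline, executor):
        self._compile = compile_fn      # assumed slow: builds a specialized pipeline
        self._uber = uber_pipeline      # generic pipeline, parameters read from a UBO
        self._executor = executor
        self._pending = {}              # material key -> Future

    def get(self, key):
        future = self._pending.get(key)
        if future is None:
            # First use: start compiling in the background, draw with the uber shader
            self._pending[key] = self._executor.submit(self._compile, key)
            return self._uber
        if future.done():
            return future.result()      # specialization is ready, swap it in
        return self._uber               # still compiling, no hitch

    def wait_all(self):
        """Block until every background compile has finished (e.g. on a loading screen)."""
        for future in self._pending.values():
            future.result()
```

The draw call never blocks on the compiler, so you trade a few frames of slower shading for not hitching at all.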

Ultimately the question is, though, whether we actually need that many PSOs. There is just no way all of them are actually visually distinct or non-substitutable, or that they couldn't already have been stripped of some parameters by baking normalized values back into resources.
 
I'm of the opinion that I don't care how long these shaders take to pre-compile. Pre-compile everything possible and optimize everything else that you can. The EXPERIENCE itself is what's important... not the 1 extra step required for setup before you can play.

Agreed. We already have to wait an hour+ for download and install on a decent internet connection. What does another 10-15 minutes matter in the grand scheme of things?
 
In an age where entitlement and instant gratification are the primary drivers for youth.
 