Shader Compilation on PC: About to become a bigger bottleneck?

It was just claimed that that's how long shader compilation took. I asked for examples of games that do shader compilation, and I tested the games on the list that I had.
I also just tested Horizon and it took 7 minutes (my phone doesn't display seconds).
In Dishonored 2 I changed settings from auto to ultra, then high, then low, then back to auto, and I couldn't get it to do any shader compilation (unless it was instant).
 
It sounds mostly like HZD PC is not a great product. Of course the herd ran out and pre-ordered / bought it without taking that possibility into consideration.
 
So I was just checking out the GPUOpen website and came across this article about porting Detroit: Become Human to PC from PS4. It gives a bit of insight into why they made the decision they did to have a lengthy pre-generated shader pipeline process upon first launch of the game.

It's a really good article which goes over some of the decisions they made, as well as other issues during development.

https://gpuopen.com/learn/porting-detroit-1/

Shader pipelines
As we knew with our OpenGL® engine, the compilation of shaders can take a long time on PC. During the production of the game, we generated a shader cache targeting the GPU model of our workstations. It was taking a whole night to generate a complete shader cache for Detroit: Become Human! This shader cache was provided to everyone each morning. But it didn’t prevent the game from stuttering because the driver still needed to convert that code into native GPU shader assembly.

Vulkan® turned out to be much better than OpenGL® to tackle this issue.

Firstly, Vulkan® doesn’t directly use a high-level shading language such as HLSL, but a standard intermediate shader language called SPIR-V. SPIR-V makes shader compilation faster and easier to optimize for the driver shader compiler. In fact, it is similar in terms of performance to the OpenGL® shader cache system.

In Vulkan®, the shaders must be associated to form a VkPipeline. A VkPipeline can be made with a vertex and a pixel shader for instance. It also contains some render state information (depth tests, stencil, blending, and so on), and the render target’s formats. This information is important for the driver to ensure it has everything it needs to compile shaders in the most efficient way possible.

In OpenGL®, the shader compilation does not know the context of shader usage. The driver still needs to wait for a draw call to generate the GPU binary, and that’s why the first draw call with a new shader can take a long time to execute on the CPU.

With Vulkan®, VkPipeline provides the context of usage, so the driver has all the information needed to generate a GPU binary, and the first draw call has no overhead. We can also update a VkPipelineCache when creating a VkPipeline.

Initially, we tried to create the VkPipelines the first time we needed them. This caused stuttering much like the OpenGL® driver strategy. The VkPipelineCache is then up-to-date, and the stuttering will be gone for the next draw call.

Then we anticipated the creation of the VkPipelines during loading, but it was so slow when the VkPipelineCache was not up-to-date that our background loading strategy was compromised.

In the end, we decided to generate all the VkPipeline during the first launch of the game. This completely eradicated the stuttering issue, but we were now facing a new problem: the generation of the VkPipelineCache was taking a very long time.
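
To make the trade-off above concrete, here is a rough toy model (not from the article, and not real Vulkan code). `PipelineKey`, `PipelineStore` and `count_hitches` are made-up names; "compiling" is just a dictionary insert. It only shows why creating pipelines on first use moves the cost into gameplay, while building them all up front moves it to first launch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineKey:
    """Hypothetical stand-in for the state a VkPipeline bakes in."""
    vertex_shader: str
    pixel_shader: str
    render_state: str    # depth/stencil/blend settings, flattened to a string
    target_format: str   # render target format


class PipelineStore:
    """Toy model: 'compiling' only records the key and counts the work."""

    def __init__(self):
        self.compiled = {}
        self.compiles_during_gameplay = 0

    def _compile(self, key):
        return f"binary:{key.vertex_shader}+{key.pixel_shader}"

    def warm_up(self, keys):
        # First-launch strategy: build everything before gameplay starts.
        for key in keys:
            self.compiled.setdefault(key, self._compile(key))

    def get_for_draw(self, key):
        # Lazy strategy: a miss here is where a real game would hitch.
        if key not in self.compiled:
            self.compiles_during_gameplay += 1
            self.compiled[key] = self._compile(key)
        return self.compiled[key]


def count_hitches(draws, prewarm):
    """Replay a list of draws and count compiles that happen mid-gameplay."""
    store = PipelineStore()
    if prewarm:
        store.warm_up(set(draws))
    for key in draws:
        store.get_for_draw(key)
    return store.compiles_during_gameplay
```

In this model every distinct pipeline costs one hitch the first time it is drawn unless it was built during warm-up, which is the behaviour the article describes moving to first launch.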

Detroit: Become Human has around 99,500 VkPipelines! The game is using a forward rendering approach, so material shaders contain all the lighting code. Consequently, each shader can take a long time to compile.

We found a few ideas to optimize this process:

  • We optimized our data to be able to load only the SPIR-V intermediate binaries.
  • We optimized our SPIR-V intermediate binaries with SPIR-V optimizer.
  • We made sure that all CPU cores were spending 100% time on VkPipeline creation.

Finally, a big optimization was suggested by Jeff Bolz from NVIDIA and has been very effective in our case.

A lot of VkPipelines are very similar. For instance, some VkPipelines can share the same vertex and pixel shaders, differing only by some render states such as stencil parameters. In this case, the driver can consider internally that it is the same pipeline. But if we create them at the same time, one of the threads will just wait until the other one finishes the task. By nature, our process was sending all the similar VkPipelines at the same time. As a solution, we just re-sorted VkPipelines. The “clones” were put at the end, and their creation ended up much faster.
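
The re-sorting idea could look roughly like the sketch below. This is only a guess at the shape of it, not the studio's code: the `vs`/`ps`/`stencil` dictionary fields and the `defer_clones` name are hypothetical, and "clone" is assumed to mean "same vertex and pixel shader pair as a pipeline seen earlier".

```python
def defer_clones(pipelines):
    """Reorder pipeline descriptions so clones are created last.

    The first pipeline seen for each (vertex shader, pixel shader) pair keeps
    its place at the front; later pipelines with the same pair, which differ
    only in render state, are moved to the end. Relative order is preserved
    within each group.
    """
    seen_pairs = set()
    firsts, clones = [], []
    for pipeline in pipelines:
        pair = (pipeline["vs"], pipeline["ps"])
        if pair in seen_pairs:
            clones.append(pipeline)
        else:
            seen_pairs.add(pair)
            firsts.append(pipeline)
    return firsts + clones
```

The point of the ordering is that by the time worker threads reach the clones, the pipeline they duplicate has most likely already been compiled, so no thread sits waiting on another one doing the same work.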

Performance of the VkPipelines creation is very variable. In particular it depends greatly on the number of hardware threads available. With an AMD Ryzen™ Threadripper™ with 64 hardware threads, it can take only two minutes. But on a low-end PC, it can unfortunately be more than 20 minutes.
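
As a back-of-the-envelope check on those numbers, here is a small calculation. It assumes perfectly linear scaling with thread count and equal per-thread speed, which is an assumption of this sketch and certainly not true in practice; only the 99,500 pipelines, 64 threads and two minutes come from the article.

```python
PIPELINES = 99_500    # pipeline count quoted in the article
FAST_THREADS = 64     # Threadripper example from the article
FAST_MINUTES = 2      # time quoted for that machine


def estimated_minutes(hardware_threads):
    """Estimated wall time, ASSUMING perfectly linear scaling with threads."""
    thread_minutes = FAST_THREADS * FAST_MINUTES  # total work in thread-minutes
    return thread_minutes / hardware_threads


def ms_per_pipeline_per_thread():
    """Average cost of one pipeline on one thread, under the same assumption."""
    total_thread_ms = FAST_THREADS * FAST_MINUTES * 60_000
    return total_thread_ms / PIPELINES


if __name__ == "__main__":
    for threads in (4, 8, 16, 64):
        print(threads, "threads:", estimated_minutes(threads), "min (idealized)")
    print("avg per pipeline:", round(ms_per_pipeline_per_thread(), 1), "ms")
```

Under that idealized model an 8-thread CPU would need about 16 minutes and a 4-thread CPU about 32, which is at least in the same ballpark as the "more than 20 minutes" quoted for a low-end PC.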

The last case is still too long for us. Unfortunately, the only way to improve this time further is to decrease the number of shaders. It requires that we change the way we create materials to share them as much as possible. It was not feasible on Detroit: Become Human because artists would have to rework all the materials. We plan to do proper material instancing in our next game, but it is too late for Detroit: Become Human.

Pretty good article. Considering how the game was designed and that they had never considered the possibility that they might some day port the game to PC.. they made the right decision IMO. The other thing about these shader optimization/caching/generating processes being done like this is that it can scale with CPU cores. It's unfortunate for those with lower end CPUs, but on a high end CPU with many threads, the length of the process decreases tremendously. As CPUs become more powerful and as core count increases in general desktop processors, the time will continue to shrink until it's essentially a non issue.
 
Another Unreal Engine 4 game, another game with shader compilation stuttering and hitching.. I'm getting quite tired of it.

4m45s, 6m11s.. etc.

Happens often when new effects or visuals happen for the first time.

I dunno... it's just so disappointing. UE4 and 5 are used, and will be used, in so many games.. I mean, this stuff is tested before release, and they know that it's an issue.. why can't there be a step during installation which runs the game in the background and generates and caches all this stuff before we play it the first time??

I'm going to simply stop buying games which have this issue. A stutter here or there is fine due to the nature of PCs and all the background stuff they're doing, but this repeatable shader compilation stuff happening all over the place with your first experience with the game just isn't acceptable to me anymore. Either the engine, the game devs, the API, the IHVs... whatever... yall have to figure it out.
 
Switch your gaming to consoles and problem solved... ;)
 
Nah, I'm just not buying games with these issues anymore. There's still plenty of games which don't have this problem, or they just have small issues here or there. That's where my money will go.

I'll continue to complain about it though in the hopes that things will change in the future.. ;)
Also not a fan of shader comp stutter - especially when many games have shown it is 100% avoidable.
 
Yea, it's getting really tiresome. UE4 can produce such amazing visuals, it's just a shame that on PC there's all these hitches and stutters all the time. I just tried the Game Pass version of Psychonauts 2, and yea, stutters everywhere when first performing actions and hitting objects producing various effects.

How can they ship titles knowing that this stuff is happening? I can't enjoy a game when it's constantly hitching when performing basic actions for the first time.

I'm stating it plainly and simply.. we NEED games to start including the option to allow us to generate shader pipelines and compile shaders before we play the first time. It needs to just become basic practice. I mean, developers literally do this for the consoles, and they do it for a good reason. Allow us users the ability to do it for our own computers. I know the process can take time... but we need the OPTION.

About the time it takes... I was thinking.. maybe this is a dumb idea, I'm not a developer.. but what about the possibility of compiling shaders through the cloud? You launch the game, it begins the process of compiling shaders and connects you to a beefy server which throws a ton of cores at it, compiling them much faster than your CPU alone would be able to.. and instead of 5-10 minutes, it's only 1-2 minutes.. or even faster. That would cut out a lot of the frustration involved.

I dunno, but something has to be figured out. I have no problem waiting to have a proper, smooth experience with a game. These stutters and hitches are unacceptable.
 
I think it does not have to be "on the cloud," as most compile results are probably the same. It probably only depends on what GPU you have and the driver version (as explained in the article posted above, but I'm not sure about this, maybe someone in the know can shed some light on this). So, someone should be able to build a service to which a game sends the hash of a shader along with the GPU ID and driver version, and gets the compiled shader result back. It can be done with crowdsourcing, but that's not even required if that someone has access to many different GPUs and drivers, let's say, a GPU vendor.
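
A minimal sketch of what such a lookup could look like, assuming (as the post does, without confirmation) that the compiled result depends only on the shader, the GPU and the driver version. `cache_key` and `SharedShaderCache` are hypothetical names, and the in-memory dictionary stands in for whatever a real server would use.

```python
import hashlib


def cache_key(shader_bytes, gpu_id, driver_version):
    """Derive one lookup key from shader contents, GPU ID and driver version.

    Each part is length-prefixed before hashing so that different splits of
    the same bytes cannot collide.
    """
    digest = hashlib.sha256()
    for part in (shader_bytes, gpu_id.encode(), driver_version.encode()):
        digest.update(len(part).to_bytes(8, "little"))
        digest.update(part)
    return digest.hexdigest()


class SharedShaderCache:
    """Toy stand-in for the server side: maps keys to compiled binaries."""

    def __init__(self):
        self._store = {}

    def lookup(self, key):
        # Returns None on a miss, in which case the client compiles locally.
        return self._store.get(key)

    def upload(self, key, compiled_binary):
        # Crowdsourced variant: a client that compiled locally shares the result.
        self._store[key] = compiled_binary
```

A client would compute the key, try `lookup`, and only fall back to a local compile (followed by `upload`) on a miss. Whether a binary compiled on one machine can safely be trusted on another is a separate question this sketch ignores.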
 
And now you just described what e.g. Steam is doing under the label "Shader Pre-Caching".

Except it's only functioning for Vulkan, and not DX12 API, since it's based entirely on layer injection, for which the used API needs to have a proper infrastructure in the first place.

Want that to happen for DX12 too? Start pestering Microsoft about adding a proper loader about 5 years back, because getting it to work with that black box after the fact is a little bit too late...
 
A really really good 2 part blog post from Matt Pettineo from Ready at Dawn detailing the "Shader Permutation Problem" from a developer/creator point of view.. exploring the questions "How did we get here?" and "How can we fix it?"

https://therealmjp.github.io/posts/shader-permutations-part1/
https://therealmjp.github.io/posts/shader-permutations-part2/

Great read, and very well communicated.

Thanks very much for this, great read.

Speaking of shader caches, new option in the latest Nvidia driver btw:

[Image: mooDacu.png, screenshot of the new option in the Nvidia driver]
 
You're welcome! And yea, I was just reading about them adding the ability to change the cache size a bit earlier.. it's a great (long overdue) addition! I know some people who are going to be happy about this, and have been asking for it for quite some time now.

Loaded up Back 4 Blood today and the first thing it does at the loading screen is "Compiling Shaders" with a progress indicator. Took about 20 seconds.

Yea, I've noticed there's becoming more and more games which are doing it. A trend which I hope will continue in the future... at the very least in the form of an option!
 