Shader Compilation on PC: About to become a bigger bottleneck?

Rootax

Veteran
Tbf it was/is a problem in other engines too. But yeah, the problem is that the most used engine has this problem...

Give me back a 20-60s initial load time if this means shader compilation happens during this time...
 

Remij

Regular
People have always been apologetic to game developers for some odd reason. As if expecting a game to operate smoothly and do what it's advertised to do is somehow unreasonable.
Yea there certainly are people like that, which doesn't help things. I get quite annoyed when I go onto forums to give feedback, and predictably have the "it's your computer" crew immediately chime in who never seem to have any issues ever. They're so ignorant.. and it's quite frustrating because it diminishes the ability to provide proper feedback when you constantly have to argue with many of these people. I mean, that's what game specific forums are mostly for.. In the end, when we give feedback, we're just trying to improve the product for everyone.

However, there's a very thin line between constructive criticism, and outright harassment. No matter how bad the situation gets on PC, I'd never condone singling out and harassing developers over technical issues.. and that seems to be a real issue with gamers unfortunately. This current situation isn't any one person's fault, and nobody deserves that. No developer actually wants their games to have performance issues. We all know there are very real hurdles and complexities to development, so some level of understanding is necessary as a baseline. That's why it's important to discuss and engage with developers, and for developers to engage with their communities as well. For me, I can immediately tell and appreciate when a developer is outgoing and engaging with the community. It gives me far more confidence to provide better feedback (to actually want to take the time to do so) and help make things better, and I really appreciate those developers who do that.

And that's just it.. I'm not asking for anything unreasonable when I ask for games to not hitch and stutter all the time. I expect a game to perform properly the first time I experience it, as anyone should given the requirements are met. As someone who has a little more understanding of the complexity of the platform and its intricacies than the average person, I'm understanding if there needs to be a pre-compilation process that takes some time. If that improves the situation, then I don't just accept it... I expect it.

So I feel that if I'm willing to make that compromise, I absolutely should expect better from developers to put in that work to ensure this issue is as mitigated as possible.

Tbf it was/is a problem in other engines too. But yeah, the problem is that the most used engine has this problem...

Give me back a 20-60s initial load time if this means shader compilation happens during this time...

Yes, of course. There's plenty of games in other engines which have this issue. It's not a specific engine/API issue. But yea, as you said, UE is fast becoming the de facto engine for not only AA and Indie developers, but AAA developers now too. So it's naturally going to be the focus. I also find that UE does it in such a predictable way that I can tell when it's Unreal Engine just by how and when it hitches.

And yea definitely, I will always prefer a long initial load, or pre-compile process, if it means a smooth experience. These pre-compilation times will likely shorten further as CPUs become faster. A Threadripper chews through many of these pre-compilation processes compared to lower end CPUs with lower clock speeds and fewer cores.
 

Remij

Regular

Watching now.

Slower shaders at first for faster upfront compile. Compile optimized pipeline after the fact on background thread.

I'd rather have a slightly lower framerate until shaders are compiled and cached, than stutters and hitches:yes:
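
A rough sketch of the idea described above (fast compile first, optimized pipeline swapped in later from a background thread). This is only an illustration in Python: `compile_fast`, `compile_optimized` and `PipelineCache` are hypothetical stand-ins, not any engine's or driver's real API.

```python
import threading
import time


def compile_fast(desc):
    # Hypothetical stand-in for a quick, unoptimized compile.
    return ("fast", desc)


def compile_optimized(desc):
    # Hypothetical stand-in for a slow, fully optimized compile.
    time.sleep(0.01)
    return ("optimized", desc)


class PipelineCache:
    """Hands out a fast-compiled pipeline at once, upgrades it in the background."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pipelines = {}
        self._workers = []

    def get(self, desc):
        with self._lock:
            if desc in self._pipelines:
                return self._pipelines[desc]
            # First use: pay only for the cheap compile so the frame is not stalled for long.
            pipeline = compile_fast(desc)
            self._pipelines[desc] = pipeline
            worker = threading.Thread(target=self._upgrade, args=(desc,))
            self._workers.append(worker)
        worker.start()
        return pipeline

    def _upgrade(self, desc):
        optimized = compile_optimized(desc)
        with self._lock:
            # Swap in the optimized version; later draws pick it up.
            self._pipelines[desc] = optimized

    def wait_idle(self):
        # Block until all pending background upgrades have finished.
        with self._lock:
            workers, self._workers = self._workers, []
        for worker in workers:
            worker.join()
```

The trade-off is the one mentioned above: the first frames using a material run on the slower variant, but nothing blocks on the expensive compile.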
 

Remij

Regular
You know, after having my Steam Deck for 1 week already, I have to wonder about something..

The Deck runs SteamOS which is Linux based, it uses the Proton compatibility layer and translates DirectX graphics APIs to Vulkan. Of course we know that Steam on both Windows and Linux (and of course this means the Deck) can pre-cache shaders utilizing an amazing library/Vulkan layer called "Fossilize". Fossilize essentially captures blobs of pipeline state code, and allows that pipeline state to be uploaded and redistributed by Valve. That code can then be downloaded through Steam, and run through your GPU's drivers on the CPU, which then generates the compiled shader cache AS the game is downloading. This doesn't require the game to be installed or even running at all.

This is pretty incredible when you think about it.
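
To make the capture/replay idea concrete, here is a minimal sketch. It is not Fossilize's actual format or API; `Recorder`, `state_key`, `replay` and `fake_driver_compile` are hypothetical names, and the "driver" is just a string formatter.

```python
import hashlib
import json


def state_key(state):
    # Hash a canonical serialization so identical pipeline state gets one key.
    blob = json.dumps(state, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


class Recorder:
    """Captures GPU-independent pipeline state while the game runs."""

    def __init__(self):
        self.records = {}

    def on_pipeline_created(self, state):
        self.records[state_key(state)] = state

    def export(self):
        # The archive that could be uploaded and redistributed with the game.
        return json.dumps(self.records, sort_keys=True)


def fake_driver_compile(state, gpu):
    # Hypothetical stand-in for the local driver producing a GPU-specific binary.
    return f"{gpu}:{state['vertex_shader']}+{state['fragment_shader']}"


def replay(archive, gpu):
    # Run every recorded state through the local "driver" before the game ever starts.
    cache = {}
    for key, state in json.loads(archive).items():
        cache[key] = fake_driver_compile(state, gpu)
    return cache
```

The point of the split is that the archive contains no GPU-specific code, so one recording can warm the cache on any machine that replays it through its own driver.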

So on my Steam Deck I download games, and almost always get a "Shader pre-caching update" download first, before the actual game content starts to download. I've noticed they're often in the range of 150MB - 1GB or so.

I made this video of Ghostrunner (UE4 game) running on my PC showing the terrible hitching:


So I wanted to test it out on the Deck. I downloaded it, got the "shader pre-caching update" downloaded, this being the very first time this game has ever been installed/run on my Deck... and I start it up... and it's perfectly smooth.. as it should be! Unfortunately I don't have a video, but take my word for it.

Plain and simple... This is the solution. This is how I think the industry needs to tackle this issue. Is the future of Steam.. SteamOS and Linux? Is this how we finally get pre-caching crowdsourced and distributed on a wide scale? Why the hell aren't Microsoft working to make this possible for DX12/DX11 on Windows?? Something like Fossilize for DirectX which would allow companies like Valve, and Epic, and the other clients, to collect and redistribute that code so that caches can be created as the game downloads/installs..

Steam on Linux does the same thing.. this isn't just Deck specific. You download Shader pre-caching updates, which are downloaded and run through the drivers to compile the shaders before you even download/install/play the game. This is how it should be.

DirectX and Windows need this functionality.. plain and simple. We can't trust that all developers are going to do a great job and consider the PC platform and its quirks when they are developing their games... so this type of "brute force" solution is required.
 

PSman1700

Legend
MS can do it if they want, they have the manpower and resources, that's for sure. Valve is showing it can be done at the least, which is a start. Usually it takes someone to be leading the way.
 

Phantom88

Newcomer
ResetEra had its once in a decade quality thread yesterday where a dev opened a reddit-like Q&A thread for users to ask questions. And someone asked about the DX12 shader compilation issue and got a couple of responses; elenarie is a dev at Dice. Mainly what we already knew about the issue, but maybe a bit more.





 
So the shader compilation issue is unsolvable with DX12. Great.

Not unsolvable, but it just requires more effort and expertise on the part of the developer. In DX11 you have fewer stuttering issues because the driver provided by the respective IHVs can to a greater or lesser extent handle/hide what the developer doesn't handle themselves. With DX12, since the developer has more power and freedom, IHVs can't make as many assumptions about how a developer might or might not want something handled in a game.

Regards,
SB
 

DSoup

Series Soup
Legend
Subscriber
Not unsolvable, but just requires more effort and expertise on the part of the developer.
This would seem to be an impractical amount of effort, having to include compiled shaders for every possible permutation of OS, driver and API at launch, then keep updating them in perpetuity. This would also seem to undermine the aim of having recompilable shaders, e.g. new drivers delivering better compilation and better runtime performance. You can't have it both ways.
 
This would seem to be an impractical amount of effort, having to include compiled shaders for every possible permutation of OS, driver and API at launch, then keep updating them in perpetuity. This would also seem to undermine the aim of having recompilable shaders, e.g. new drivers delivering better compilation and better runtime performance. You can't have it both ways.

I don't think anyone has seriously proposed having pre-compiled shaders at launch as an actually viable solution to the problem, that's not how it's ever worked. They've always been compiled by the client PC, it's just that in DX9/DX11 the shaders were far simpler, and the compile process could be hidden by level loads or done upfront - the diminishing opportunities to hide these compilation steps in modern games with no load times was precisely my concern that sparked this thread.

The closest we can get to that solution is when Steam downloads compiled shader caches for Vulkan/OpenGL games from users that have already passed certain sections of games, but that is clearly not sufficient. If this is tackled, and by that I mean 'mitigated as best as it can be', it's probably going to be through a multi-pronged approach.

I've asked in the ResetEra thread what the developer thinks those might be, at least to the extent of what proposals they've heard of.
 

Remij

Regular
I'm banned from ResetEra, so I can't post there, but if I could... I'd post this.

(I'm going to spoiler it because it's long lol)

Shader Compilation and Hitching in PC games SUCKS

What is the reality of this situation?
We as gamers simply can't have any faith that a dev studio will do the required work to fix/mitigate this issue on their own. Whether by lack of knowledge/ability, tools/support, time/budget, or straight up lack of care.. this is not something we can expect consistently across all developers/studios.

What do we as gamers know? We know many games have terrible hitching on PC due to PSO/shader compilation, and it's getting worse. We also know that if we play through a game enough times, the more shader/pipeline state is compiled and cached, and those stutters tend to go away.

Now what seems to be the problem for developers? (I'm guessing here) Obvious things. PC platform complexities (GPUs, Drivers, OSes, APIs) for one. Also engines like Unreal, which allow less technically proficient (or should I say specialized) developers and artists to create games with stunning looking visuals through their intuitive development tools... which can potentially end up unoptimized for PC. That's fine for console development, where the engine can compile the code knowing the exact environment that it will run in. However, when it comes to the PC, it's anything but predictable, and it seems to catch them off guard.

Why? This isn't an Unreal Engine specific problem... other engines essentially have to do the same thing. Typically though if a studio has a proprietary engine, they have the low level coders with intimate knowledge of exactly how the engine works and so they're very cognizant of how the shaders should be authored in the first place to reduce permutations/pipelines to as low as possible, and how to capture that pipeline data. With Unreal.. that's not a guarantee. By design it allows developers to go wild.. and thus things get out of control quickly. Sure there are amazing developers that can make that engine hum and purr like a kitten on PC... but the vast majority of Unreal developers are indie/AA studios which definitely don't.

Essentially it comes down to gathering as much shader and pipeline state information as possible so that it can be used to pre-compile shaders, which they can do at initial load, loading screens, or in the background... but they need to know that information so they can compile the shaders/materials they need. The only way to do this.. is essentially by playing through the game over and over again, doing everything and redoing it again slightly different, and then again, and again. The more it's done, the better the outcome, the more they can compile upfront.
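
As a toy illustration of why repeated playthroughs help, the sketch below merges the pipeline states seen in each run and measures how much of what the game actually needs has been captured. The function names and the idea of a known "needed" set are assumptions for the example; in practice nobody has the full list up front, which is exactly the difficulty.

```python
def merge_playthroughs(playthroughs):
    # Union of every pipeline state id observed across all recorded runs.
    seen = set()
    for run in playthroughs:
        seen.update(run)
    return seen


def coverage(recorded, needed):
    # Fraction of the needed pipeline states that can be pre-compiled upfront.
    if not needed:
        return 1.0
    return len(recorded & needed) / len(needed)
```

Anything outside the merged set still compiles on first use, so each state a playthrough misses is a potential hitch for players.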

This process seems to be an issue for developers. Not everything can be easily captured, and maybe it's just not feasible for smaller studios to spend precious time (which means budget) trying to do it the best they can. Maybe some developers miss, or don't bother with doing a thorough job of it? *sigh* I don't know. I always hate to say stuff like that, because you'd think that once a game goes gold, they'd toss it in a random PC and play through it and say "ok something is wrong here".. but I digress..


So... at the end of the day what can be done? Multiple things. Developers already do multiple things that we know of (diligence during development, pre-comp processes, background shader comp, etc.). I definitely don't want to come across as if they aren't already doing anything in their power to mitigate the issue.. there are some freaking heroes out there. However like I said... we cannot rely on hopes and dreams that all developers will or can do all this stuff... so what's the next best thing?

Valve with Steam once again, have essentially solved this problem in an incredible way. When I say essentially, I mean "for the most part"...excluding some fringe cases. They've created a system which essentially allows them to capture GPU independent shader/pipeline state, and replay it. This allows them to collect, upload, and redistribute this GPU independent code alongside their respective game packages. Steam downloads a "pre-caching update" and it will take those files, and run them through the driver and compile the shaders/pipeline as the game downloads. It's called Fossilize, and it's pretty damn cool.

The only issue, is that it's Vulkan only :(

This is honestly the most elegant "solution" to this problem that I can see for many reasons:
1. It can be done completely independent of the developer. (Doesn't require game specific support)
2. It's done at the Steam level... meaning any games that support Vulkan can benefit from this. (more on this in a sec)
3. Outside of developers doing a better job themselves, this can catch a lot of the more egregious examples of compilation stutters. (Games don't need to be perfect, but much better than they are)
4. It's the least intrusive method because it allows shaders to be compiled as the game downloads.. when you can't play it. Reducing friction, without requiring long pre-compiling processes upon initial launch.
5. Gives the user a choice. (Can be opted-in or out)

User choice is important because these downloads can begin to add up in size when you have multiple games receiving them, and as the pre-caching updates grow in size. If you're on a limited data plan, you may want to forego them.

To me, this entire setup is excellent. Of course the best thing is for developers to do as good of a job on their side as possible first and foremost. After that, it helps catch those cases where developers can't do the job on their own. There's also the fact that this issue will only get harder for developers themselves to deal with in the future, as things get ever more complex. I think most people would accept a bit larger of a download to have a smoother game without hitching. It's about as little friction as one could hope for, given the realities of the PC platform. And it's essentially completely transparent to the user. They don't have to think about anything or do anything special.. they're just downloading their games as usual.

Of course, the caveat is that, as I said, it's Vulkan only. Steam through Proton on Linux is essentially brute forcing this stuff.. because as you know every supported game gets translated to Vulkan. However in Windows-land, we're stuck with Vulkan only games supporting it currently. Which is why we have to push Microsoft, to build a Fossilize equivalent for DirectX APIs. Maybe that's not possible? I don't know... but what I do know is that teams of incredibly talented and intelligent people are putting in massive amounts of work to make this a reality through Vulkan, and it works. It's being put to the test every single day.

If for some reason MS can't do something equivalent and make it work for Windows (and really this should be their highest damn priority!) then I either want Valve to somehow implement VKD3D/DXVK into Steam directly on Windows so it can be done, or.. I'd rather just move over to SteamOS completely and wish for developers to drop DirectX and just go full Vulkan.

That's where I'm at. Thanks for reading.

This post was brought to you by: Hopes and Dreams™
 

DSoup

Series Soup
Legend
Subscriber
The closest we can get to that solution is when Steam downloads compiled shader caches for Vulkan/OpenGL games from users that have already passed certain sections of games, but that is clearly not sufficient. If this is tackled, and by that I mean 'mitigated as best as it can be', it's probably going to be through a multi-pronged approach.

I've asked in the ResetEra thread what the developer thinks those might be, at least to the extent of what proposals they've heard of.
A smarter download - but without a return to the horrible days when most PC games had annoying custom installers - is a partial solution, but it doesn't negate the need for driver updates to initiate compilation. If you play a game today, everything may be cool then Nvidia may release a driver update that brings real shader performance improvements but which triggers a recompilation and that happens when you're next playing.

It feels like there needs to be more of an interconnect between the drivers, Steam/EGS/Microsoft stores and game launchers and individual games, and that feels like a lot of work. It will be interesting to hear what solutions the dev thinks might be within the control of individual developers. :yes:
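
A small sketch of the driver-update problem described above, assuming a cache keyed on both the shader and the driver version. `ShaderCache` and its methods are hypothetical names for illustration, not how any particular driver or store implements it.

```python
class ShaderCache:
    """Compiled binaries are only valid for the driver version that produced them."""

    def __init__(self, driver_version):
        self.driver_version = driver_version
        self.entries = {}
        self.compile_count = 0

    def get(self, shader_id):
        key = (shader_id, self.driver_version)
        if key not in self.entries:
            # Cache miss: this is the compile that shows up as a hitch mid-game.
            self.compile_count += 1
            self.entries[key] = f"binary({shader_id}, driver {self.driver_version})"
        return self.entries[key]

    def update_driver(self, new_version):
        # Old entries stay on disk but no longer match any lookup key.
        self.driver_version = new_version

    def prewarm(self, shader_ids):
        # What a store or launcher could trigger right after a driver update.
        for shader_id in shader_ids:
            self.get(shader_id)
```

Calling something like `prewarm` between the driver update and the next play session is the kind of coordination that would need the driver, store and game to talk to each other.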
 

DSoup

Series Soup
Legend
Subscriber
There are games which show that this is not the case. It's all up to the developers.
As a consumer, when no games exhibit this issue, I'll consider the problem has gone away.
Technical solutions that are not used may as well not exist.
 

pTmdfx

Regular
This would seem to be an impractical amount of effort, having to include compiled shaders for every possible permutation of OS, driver and API at launch, then keep updating them in perpetuity. This would also seem to undermine the aim of having recompilable shaders, e.g. new drivers delivering better compilation and better runtime performance. You can't have it both ways.
With Metal 3, Apple has committed to support offline-compiled GPU binaries, with asynchronous ahead-of-time recompilation at app install time and during OS updates (i.e., presumably IR bitcode is still bundled). Though no doubt that it has a way more compact landscape, having only their own Apple Silicon GPUs, AMD Vega, AMD RDNA 1 & 2, and Intel Gen9.5 graphics to be dealt with.
 

Remij

Regular
With Metal 3, Apple has committed to support offline-compiled GPU binaries, with asynchronous ahead-of-time recompilation at app install time and during OS updates (i.e., presumably IR bitcode is still bundled). Though no doubt that it has a way more compact landscape, having only their own Apple Silicon GPUs, AMD Vega, AMD RDNA 1 & 2, and Intel Gen9.5 graphics to be dealt with.
I was literally just watching this video about it and was going to post it here. Thought it was pretty interesting.

 

DSoup

Series Soup
Legend
Subscriber
With Metal 3, Apple has committed to support offline-compiled GPU binaries, with asynchronous ahead-of-time recompilation at app install time and during OS updates (i.e., presumably IR bitcode is still bundled). Though no doubt that it has a way more compact landscape, having only their own Apple Silicon GPUs, AMD Vega, AMD RDNA 1 & 2, and Intel Gen9.5 graphics to be dealt with.
Apple also employ Bitcode app optimisation - in terms of both size and performance - for the iOS App Store. These approaches are complex but achievable when one party owns the platform, top to bottom: OS, devkit, SDKs, APIs and the Store.

There's nothing to stop AMD and Nvidia providing a cloud-based solution to provide pre-compiled shaders, but this has to be built into something. Should it be in the app, or in the store/launcher, or the driver? I think the Store makes sense so that it can update shaders along with other updates. But it doesn't seem solvable without a lot of co-operation amongst a number of parties who are not seemingly motivated to co-operate.
 

pTmdfx

Regular
Apple also employ Bitcode app optimisation - in terms of both size and performance - for the iOS App Store. These approaches are complex but achievable when one party owns the platform, top to bottom: OS, devkit, SDKs, APIs and the Store.
Plot twist: They are sunsetting their (CPU) bitcode initiative. 😛

There's nothing to stop AMD and Nvidia providing a cloud-based solution to provide pre-compiled shaders, but this has to be built into something. Should it be in the app, or in the store/launcher, or the driver? I think the Store makes sense so that it can update shaders along with other updates. But it doesn't seem solvable without a lot of co-operation amongst a number of parties who are not seemingly motivated to co-operate.
I'd argue that the API and toolchain owner/driver should be the leader in designing a solution, and in particular Microsoft is in a position to solve it with a more vertically integrated solution, being the OS vendor as well.

AFAIK, Apple is not solving it like their CPU Bitcode attempt — the OS recompilation component of Metal 3 GPU binary support is independent from the App Store, especially considering that Metal on Mac has quite a significant scene outside the Mac App Store with all the productivity/creation tools.
 