Shader Compilation on PC: About to become a bigger bottleneck?

It's interesting. You have someone like Tom Forsyth commenting about how this change is better. I don't have the experience to judge for myself, but some real industry legends like the change. Eric Lengyel is no slouch either.

*snip*
You have to admit it's ironic that he works for an employer who can't trust their own driver team to make competent native drivers for D3D10(!) and lower, for all his preaching that "drivers are able to do it better" ...

It's weird how Intel needs a translation layer for D3D10 when it's basically D3D11 minus compute shaders, tessellation, and some reference counting behaviour. You'd think they'd share driver code between D3D10 and D3D11, but it hurts too much to see them resorting to a translation layer ...
 
Then those people should go on to use translation layers (DXVK/D3D11on12/etc), helper libraries (V-EZ/RPS/MA), or some emulation layer, but the core API shouldn't be polluted with design patterns that we dropped for reasons that still hold true ...

The industry should at least give the compromise option (graphics pipeline libraries) a chance at an implementation before it's deprecated in favour of separate shader objects JUST AFTER 1 YEAR! This reeks of desperation, very much like the Khronos Group of the past, whose handling of OpenGL ended up as a dumpster fire ...

Are they actually deprecating pipelines or pipeline libraries? That tweet from @phys_ballsocket suggests they're not
 
I'm actually curious whether we'll see generational splits, where highly proficient DX11 developers want to go back a bit to the old way, while newer Vulkan/DX12 programmers have built most of their expertise by working through the teething problems of the newer APIs.
 
Are they actually deprecating pipelines or pipeline libraries? That tweet from @phys_ballsocket suggests they're not
Monolithic pipelines are never going away, since mobile GPU drivers don't get updates and historically have bad compilers, so developers are going to have to keep using them. Mobile vendors might eventually get around to implementing GPL, but I don't expect them to implement separate shader objects for years to come, if ever ... (if anyone does need SSO over GPL they're probably going to learn the hard way, since the only thing SSO offers over GPL is separate tessellation/geometry shaders, which have cursed implementations over there)

SSO is a competitor to GPL, since they both fill the role of state and shader management. Valve employees aren't interested in replacing GPL with SSO for their use case, but I'm kinda scared that the Khronos Group released an alternative model just a year later when neither has widespread coverage yet ...

Hopefully SSOs are only ever implemented on Nvidia or open source drivers (for emulation purposes) and nobody else, so we can move past this blunder and forget about it, since the only purpose for it so far is to emulate OpenGL over Vulkan (the Zink project) and it's insane to have other models be subverted with that purpose in mind ...

Vulkan is beyond bloated with quite a few different ways of doing the same things and it's not that old ...

Binding: Descriptor Sets vs Descriptor Indexing vs Buffer Addresses vs Descriptor Buffer

Compilation: Monolithic Pipelines vs GPL vs SSOs (including all the dynamic states)

Geometry: Vertex/Tessellation/Geometry vs Mesh Shading

Framebuffer: Renderpasses vs Dynamic Rendering (multiple ways to do programmable blending as well)

TBF, Vulkan is starting to suck a lot for an API that isn't that old ...
 
If I'm remembering correctly, the issue with PSOs was that games like Fortnite or GTA (really any game with a prolific marketplace) can have any combination of skins for weapons, players etc, which causes a snowball of shader compilations before each match. The approach taken by some games is to limit the number of PSOs, but those are typically games with a more controlled game design (not open world, no marketplace, etc.) that makes this easier? So graphics pipeline libraries were meant to address that issue.

Edit:
Ok, refreshing with this post. I've read it and forgotten it. Basically shaders that accept parameters have to be compiled in all of their variations. So one shader can produce many variants depending on how many parameters it accepts and how many states each parameter has, and each one has to be compiled.

NumPermutations = NumStates_featureA × NumStates_featureB × NumStates_featureC × …
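To put some made-up numbers on it: a shader with a feature A that has 2 states, a feature B with 4 states and a feature C with 3 states already needs 2 × 4 × 3 = 24 separate compiles, and real materials tend to have far more toggles than that.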
 
If I'm remembering correctly, the issue with PSOs was that games like Fortnite or GTA (really any game with a prolific marketplace) can have any combination of skins for weapons, players etc, which causes a snowball of shader compilations before each match. The approach taken by some games is to limit the number of PSOs, but those are typically games with a more controlled game design (not open world, no marketplace, etc.) that makes this easier? So graphics pipeline libraries were meant to address that issue.
@Bold Recent idTech games use an ubershader approach to minimize the number of PSOs and they ban material graphs (Unreal) or shader graphs (Unity) as well ...

GPL does solve the combinatorial explosion of different pre-rasterization pipeline and fragment shader combinations ...

Suppose we have 10 of each which are unique in state configuration or shader program:

With PSOs, we'd have potentially 100 (10*10) different pipelines to compile. Just before drawing we have to swap out entire PSOs if we want to do something different ...

In the case of GPL, we'd compile them separately so we compile no more than 20 objects in total. How it works during runtime is we bind our pre-rasterization pipeline and select our fragment shader. Once we issue our draw command, the driver links these objects together just before rendering. If you want to change to a somewhat different combination, you don't have to change both objects at the same time and can opt to change just one of them so that you can reuse the compiled object while the driver compiles the unique object that doesn't have a cached binary. The result is that your driver doesn't perform redundant compilations since it can reuse separate objects and link them together as needed ...

If you change both objects and neither is cached, compilation is just as slow as an uncached PSO swap. If only one object is changed, then compilation will be faster than an uncached PSO swap. Most important of all is the linking process, which allows us to keep the different combinations under control ...
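A rough sketch of that flow in C against the VK_EXT_graphics_pipeline_library extension, just to make the compile-separately-then-link idea concrete. The handles (device, layout, shader stage, state structs) are assumed to already exist, the vertex-input and fragment-output library parts and all error handling are omitted; this is illustrative, not production code.

Code:
#include <vulkan/vulkan.h>

/* Compile the pre-rasterization part (vertex shader + raster state) once, as a
 * library. A fragment-shader library is built the same way, just with
 * VK_GRAPHICS_PIPELINE_LIBRARY_FRAGMENT_SHADER_BIT_EXT instead. */
VkPipeline create_pre_raster_library(VkDevice device, VkPipelineLayout layout,
                                     const VkPipelineShaderStageCreateInfo *vs_stage,
                                     const VkPipelineViewportStateCreateInfo *viewport,
                                     const VkPipelineRasterizationStateCreateInfo *raster)
{
    VkGraphicsPipelineLibraryCreateInfoEXT lib = {
        .sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_LIBRARY_CREATE_INFO_EXT,
        .flags = VK_GRAPHICS_PIPELINE_LIBRARY_PRE_RASTERIZATION_SHADERS_BIT_EXT,
    };
    VkGraphicsPipelineCreateInfo info = {
        .sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO,
        .pNext = &lib,
        .flags = VK_PIPELINE_CREATE_LIBRARY_BIT_KHR,
        .stageCount = 1,
        .pStages = vs_stage,
        .pViewportState = viewport,
        .pRasterizationState = raster,
        .layout = layout,
    };
    VkPipeline part = VK_NULL_HANDLE;
    vkCreateGraphicsPipelines(device, VK_NULL_HANDLE, 1, &info, NULL, &part);
    return part;
}

/* Link already-compiled parts into an executable pipeline. The expensive shader
 * compilation already happened per part, so 10 pre-raster + 10 fragment
 * libraries cover all 100 combinations without compiling 100 full PSOs. */
VkPipeline link_pipeline(VkDevice device, VkPipelineLayout layout,
                         const VkPipeline *parts, uint32_t part_count)
{
    VkPipelineLibraryCreateInfoKHR link = {
        .sType = VK_STRUCTURE_TYPE_PIPELINE_LIBRARY_CREATE_INFO_KHR,
        .libraryCount = part_count,
        .pLibraries = parts,
    };
    VkGraphicsPipelineCreateInfo info = {
        .sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO,
        .pNext = &link,
        .layout = layout,
    };
    VkPipeline pipeline = VK_NULL_HANDLE;
    vkCreateGraphicsPipelines(device, VK_NULL_HANDLE, 1, &info, NULL, &pipeline);
    return pipeline;
}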

Separate shader objects go off the deep end by breaking down the pre-rasterization stages into their own objects. You're basically inviting yourself to extend awful functionality like tessellation shaders or geometry shaders to do the same thing. I'm pretty confident that's not how modern AMD HW works, and it's a regression compared to GPL in terms of abstraction ...

Hence why I earlier characterized the functionality as vendors fundamentally implementing D3D11/OpenGL in HW ...
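For contrast, here's a sketch of what the separate-shader-object model (VK_EXT_shader_object) looks like at the API level: every stage is its own object, there is no pipeline object at all, and everything else is set dynamically on the command buffer. The device, set layouts, SPIR-V blobs and command buffer are assumed to already exist, and in a real app the two EXT entry points would be loaded via vkGetDeviceProcAddr.

Code:
#include <vulkan/vulkan.h>

/* Create vertex and fragment shaders as fully independent (unlinked) objects,
 * so any vertex/fragment mix can be bound later without a pipeline. */
void create_and_bind_shaders(VkDevice device, VkCommandBuffer cmd,
                             const uint32_t *vs_code, size_t vs_size,
                             const uint32_t *fs_code, size_t fs_size,
                             const VkDescriptorSetLayout *set_layouts,
                             uint32_t set_layout_count)
{
    VkShaderCreateInfoEXT infos[2] = {
        {
            .sType = VK_STRUCTURE_TYPE_SHADER_CREATE_INFO_EXT,
            .stage = VK_SHADER_STAGE_VERTEX_BIT,
            .nextStage = VK_SHADER_STAGE_FRAGMENT_BIT,
            .codeType = VK_SHADER_CODE_TYPE_SPIRV_EXT,
            .codeSize = vs_size, .pCode = vs_code, .pName = "main",
            .setLayoutCount = set_layout_count, .pSetLayouts = set_layouts,
        },
        {
            .sType = VK_STRUCTURE_TYPE_SHADER_CREATE_INFO_EXT,
            .stage = VK_SHADER_STAGE_FRAGMENT_BIT,
            .codeType = VK_SHADER_CODE_TYPE_SPIRV_EXT,
            .codeSize = fs_size, .pCode = fs_code, .pName = "main",
            .setLayoutCount = set_layout_count, .pSetLayouts = set_layouts,
        },
    };
    VkShaderEXT shaders[2];
    vkCreateShadersEXT(device, 2, infos, NULL, shaders);

    /* At draw time each stage is bound on its own; swapping just the fragment
     * shader doesn't touch the vertex stage at all. */
    VkShaderStageFlagBits stages[2] = { VK_SHADER_STAGE_VERTEX_BIT,
                                        VK_SHADER_STAGE_FRAGMENT_BIT };
    vkCmdBindShadersEXT(cmd, 2, stages, shaders);
}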
 
BTW if I had to rank the biggest problems causing compilation stutters this would be my order ...

1. Material/Shader graphs (no solution)
2. PSOs (Separable pipelines via GPL)
3. Static states leading to redundant pipelines (dynamic states; see the sketch below)

PSOs are beside the main issue. They don't even compare to the combinatorial explosion caused by Material/Shader graphs, which generate an insane number of shader variants. None of those engines are going to implement ubershaders or ban the graphs, because of performance issues and loss of artist productivity respectively ...
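To illustrate point 3, a small sketch assuming Vulkan 1.3 (where the basic extended dynamic state is core) and a pipeline that declared these states as dynamic via VkPipelineDynamicStateCreateInfo: the state changes on the command buffer, so the same compiled pipeline is reused instead of compiling one pipeline per state combination.

Code:
#include <vulkan/vulkan.h>

/* One compiled pipeline covers several state combinations because cull mode,
 * front face and depth test were declared dynamic at pipeline creation time. */
void record_draws(VkCommandBuffer cmd, VkPipeline pipeline, uint32_t vertex_count)
{
    vkCmdBindPipeline(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);

    vkCmdSetCullMode(cmd, VK_CULL_MODE_BACK_BIT);
    vkCmdSetFrontFace(cmd, VK_FRONT_FACE_COUNTER_CLOCKWISE);
    vkCmdSetDepthTestEnable(cmd, VK_TRUE);
    vkCmdDraw(cmd, vertex_count, 1, 0, 0);

    /* Different "static-looking" state, same pipeline: no extra compilation. */
    vkCmdSetCullMode(cmd, VK_CULL_MODE_NONE);
    vkCmdSetDepthTestEnable(cmd, VK_FALSE);
    vkCmdDraw(cmd, vertex_count, 1, 0, 0);
}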
 
BTW if I had to rank the biggest problems causing compilation stutters this would be my order ...

1. Material/Shader graphs (no solution)
2. PSOs (Separable pipelines via GPL)
3. Static States leading to redundant pipelines (dynamic states)

PSOs are beside the main issue. They don't even compare to the combinatorial explosion caused by Material/Shader graphs, which generate an insane number of shader variants. None of those engines are going to implement ubershaders or ban the graphs, because of performance issues and loss of artist productivity respectively ...
Material/Shader graphs produce a crazy amount of permutations, but aren't you able to rule out large swaths of them by excluding the ones your application doesn't/won't ever use?

Of course I get that there's really two separate problems going on. One is the consumer side... that being how the game performs at run-time with hitches and stuttering.. and the other being the developer side where all 3 points you mentioned mean that compile/build times become increasingly long, stopping, or greatly slowing down productivity as developers wait to be able to work.

Am I wrong to think that developers mostly worry about permutation explosions as an issue of development productivity and not so much runtime performance? I mean sure on console they are concerned about fixed storage capacity and not wanting to waste space.. but other than that, it seems like the concern is mostly to do with productivity.
 
Material/Shader graphs produce a crazy amount of permutations, but aren't you able to rule out large swaths of them by excluding the ones your application doesn't/won't ever use?

Of course I get that there's really two separate problems going on. One is the consumer side... that being how the game performs at run-time with hitches and stuttering.. and the other being the developer side where all 3 points you mentioned mean that compile/build times become increasingly long, stopping, or greatly slowing down productivity as developers wait to be able to work.

Am I wrong to think that developers mostly worry about permutation explosions as an issue of development productivity and not so much runtime performance? I mean sure on console they are concerned about fixed storage capacity and not wanting to waste space.. but other than that, it seems like the concern is mostly to do with productivity.
The permutations are a function of the features/states used by the artist. There's no way to get rid of them unless you opt in to using ubershaders instead ...

Shader graphs exist to solve both productivity AND runtime performance. By allowing the driver to compile these specialized shader variants, there are more chances for it to apply optimization passes and get ideal register allocation/occupancy/memory usage, i.e. cheaper shaders. Cutting down on specialized shaders means the slowest program path in the combined shader forces the driver to take the worst possible path, leading to subpar performance ...
 
Which is why Khronos mentioned Vulkan limiting games'/game engines' dynamism in its current state. They also clearly articulate that it keeps several configurations from reaching their maximum hardware potential (pretty sure these configurations include some NVIDIA GPUs). So now you have software limitations and hardware limitations.

hence why D3D11 is disappearing from AAA games
It's disappearing because most developers are switching to DX12 to implement Ray Tracing, or some of the features in the DX12U package (mesh shaders/variable rate shading .. etc). The pace of DX12 adoption has greatly accelerated after DXR.

I'm reluctant to explain further but things are absolutely better off for
This is the kind of thinking that got us here in the first place. We are only starting to fully unravel the big picture years after the inception of DX12/Vulkan: we were fed a bunch of misleading points about the advantages of going lower level, while the major disadvantages were carefully hidden and most developers were afraid to talk about or criticize them. Almost a decade later things are clear, and voices are loudly demanding change. This is wrong. This discussion should've happened many years ago, instead of forcing the entire spectrum of developers down an undesirable path and then back-pedalling on it so late in the game.

Everybody concentrates on the user experience, but what about the developer side, since they have far more weight from a technical perspective?
Very weird take, user experience is the reason developers spend so much time writing code. It's the product in the end that matters, not the journey per se. This is the reason we have so many arrogant developers not caring about stuttering on PC at all.

I mean who stands to gain from all of this? software and hardware limitations, bugs, bad memory management, horrendous stuttering. Which developer? Which IHV? what percentage of market share do they have so that they can force the entire industry and the userbase along with them into this sub optimal position?
 
The permutations are a function of the features/states used by the artist. There's no way to get rid of them unless you opt in to using ubershaders instead ...

Shader graphs exist to solve both productivity AND runtime performance. By allowing the driver to compile these specialized shader variants, there are more chances for it to apply optimization passes and get ideal register allocation/occupancy/memory usage, i.e. cheaper shaders. Cutting down on specialized shaders means the slowest program path in the combined shader forces the driver to take the worst possible path, leading to subpar performance ...
I gotcha. Makes sense, but when speaking of performance, I was more specifically referring to the compilation stutter you mentioned. I understand that shaders can be optimized by the driver and run more efficiently on the GPU, but with regards to PSOs and compilation at run-time.. less optimal shaders can be compiled faster.

I'm the type of person that would accept less/no compilation hitching at the expense of some performance overhead. At least within reason.
 
Which is why Khronos mentioned Vulkan limiting games'/game engines' dynamism in its current state. They also clearly articulate that it keeps several configurations from reaching their maximum hardware potential (pretty sure these configurations include some NVIDIA GPUs). So now you have software limitations and hardware limitations.


It's disappearing because most developers are switching to DX12 to implement Ray Tracing, or some of the features in the DX12U package (mesh shaders/variable rate shading .. etc). The pace of DX12 adoption has greatly accelerated after DXR.


This is the kind of thinking that got us here in the first place. We are only starting to fully unravel the big picture years after the inception of DX12/Vulkan: we were fed a bunch of misleading points about the advantages of going lower level, while the major disadvantages were carefully hidden and most developers were afraid to talk about or criticize them. Almost a decade later things are clear, and voices are loudly demanding change. This is wrong. This discussion should've happened many years ago, instead of forcing the entire spectrum of developers down an undesirable path and then back-pedalling on it so late in the game.


Very weird take, user experience is the reason developers spend so much time writing code. It's the product in the end that matters, not the journey per se. This is the reason we have so many arrogant developers not caring about stuttering on PC at all.

I mean who stands to gain from all of this? software and hardware limitations, bugs, bad memory management, horrendous stuttering. Which developer? Which IHV? what percentage of market share do they have so that they can force the entire industry and the userbase along with them into this sub optimal position?
I would say only Nvidia, Intel and AMD stand to gain, as continually worse performance will drive people to upgrade in an attempt to brute force past the issues.
 
Another take. I find it interesting how varied the takes are on this. I saw one person suggest it was similar to "tabs vs spaces"

This person doesn't like pipelines at all, but doesn't know if this new solution is the correct one.

An interesting note:
Is ESO the right answer? I don't know. I don't think anyone does. It's super easy to implement on NVIDIA. It's a giant pain, but possible, on AMD and Intel. On Intel, we already paid most of that cost with pipeline libraries so ESO isn't really costing that much more.


 
The industry should at least give the compromise option (graphics pipeline libraries) a chance at an implementation before it's deprecated in favour of separate shader objects JUST AFTER 1 YEAR!
We're 7-8 years into the new APIs though. How much more time do we need to see that it doesn't work?

It's disappearing because most developers are switching to DX12 to implement Ray Tracing, or some of the features in the DX12U package (mesh shaders/variable rate shading .. etc). The pace of DX12 adoption has greatly accelerated after DXR.
The actual reason for DX11 disappearing is that it is hard enough to write and support one renderer, and when you add the new features which are present only in D3D12 (or VK) it's rather easy to choose which API to support going forward. I'd be wary of saying that D3D12 is getting chosen because it allows things which D3D11 doesn't from the API perspective. It is quite likely a simple budgeting choice, not at all related to how each API runs on what h/w.

I would say only Nvidia, Intel and AMD stand to gain, as continually worse performance will drive people to upgrade in an attempt to brute force past the issues.
AMD h/w may potentially have issues with SSOs and state changes, just like it does in D3D11?
Saying that this is "D3D11 in h/w" though is a bit like saying "our slow RT is the proper way to do it in h/w". Just fix the damn h/w.
 
@DegustatoR the extension for graphics pipeline libraries came out in March 2022. Seems like with dxvk it solved all of their issues and we see how well games run on steamdeck.
Is it a correct assumption though to just transfer what an API translation layer is doing to the renderers themselves? It seems that it's a rather different type of workload to begin with (as in translating API calls), while the guys behind DXVK are likely to be waaaay more knowledgeable about how APIs and h/w work than the average graphics programmer at some AA-level studio.

Also I kinda find it weird when the whole focus of an API development becomes "so that it wouldn't break that API translation app we have". It seems to me that this should be an afterthought here.

I also don't see what's so bad in options co-existing in the API. To me it seems a much better way of choosing what should be used than just forcing some theoretically great model on everyone. If some way of rendering will be better than the others then it will win natively on its own.
 
We're 7-8 years into the new APIs though. How much more time do we need to see that it doesn't work?
If a consortium like the Khronos Group is throwing crap at the wall to see what sticks, they'll start losing major credibility among their own members, who will stop placing any amount of trust in their standards. If there are major doubts among members, they're just going to take a wait-and-see approach until Khronos makes another quick, potentially bad move by introducing other shortsighted ideas while their last design, SSOs, is tossed into the garbage immediately thereafter ...

The industry should at least get some time to evaluate a new concept; rushing things will just lead to more cruft and bloat. I don't even think Khronos has gotten any real-world feedback yet on GPL, since nearly no one has even implemented it. Those 7-8 years you talk about were feedback on monolithic pipelines, so dismissing GPL, which doesn't have any data yet, in favour of SSOs is plainly engaging in bad faith ...
 
@DegustatoR the extension for graphics pipeline libraries came out in March 2022. Seems like with dxvk it solved all of their issues and we see how well games run on steamdeck.
The great thing about GPL is that it's NOT intended to be a match for the D3D11/OpenGL SSO model. It too has the concept of pipelines, but it improves over the pitfalls of the previous default monolithic design, so the functionality stands on its own merits ...

With SSOs, there's no compelling reason behind the functionality other than to emulate those older APIs in translation layers. DXVK could use it, but they make their distaste for it very clear, and their needs are well met by GPL ...
 
If a consortium like the Khronos Group is throwing crap at the wall to see what sticks, they'll start losing major credibility among their own members, who will stop placing any amount of trust in their standards.
Why would an additional option added into the API create any trust issues? You don't like it? Don't use it, all previously available options are still there and there are no signs of them being removed, yet at least.

If there are major doubts among members, they're just going to take a wait-and-see approach until Khronos makes another quick, potentially bad move by introducing other shortsighted ideas while their last design, SSOs, is tossed into the garbage immediately thereafter ...
You don't have to be a Khronos member to use Vulkan.

The industry should at least get some time to evaluate a new concept; rushing things will just lead to more cruft and bloat.
I'm sorry but why is SSO "cruft and bloat" while GPLs aren't? Isn't adding them both essentially the same thing - evolving the API with new features and options?

I don't even think Khronos has gotten any real-world feedback yet on GPL, since nearly no one has even implemented it.
Which is already a bit telling about how the market needs them...

Those 7-8 years you talk about were feedback on monolithic pipelines, so dismissing GPL, which doesn't have any data yet, in favour of SSOs is plainly engaging in bad faith ...
Correct me if I'm wrong but nobody is dismissing anything with the introduction of SSOs.
 