Shader Compilation on PC: About to become a bigger bottleneck?

Then developers should by all means use Vulkan if they want a less explicit API (with less performance as well) and shouldn't ruin the last holdout left (D3D12), which offers a brighter and bolder vision for the future. Even Apple at least cares: Metal is explicit for their own hardware. Microsoft shouldn't let the dream of a more explicit future die; they should remain the sole visionary of graphics technology going forward while the Khronos Group rots in the past like they always have ...

Developers should put their money where their mouths are and use an API not backed by Microsoft, or be at the mercy of its whims ...
 

5. Issues

5.1. RESOLVED: How should implementations which absolutely must link shader stages implement this extension?

The purpose of this extension is to expose the flexibility of those implementations which allow arbitrary combinations of unlinked but compatible shader stages and state to be bound independently. Attempting to modify this extension to support implementations which do not have this flexibility would defeat the entire purpose of the extension. For this reason, implementations which do not have the required flexibility should not implement this extension.
IHVs whose implementations have such limitations today are encouraged to consider incorporating changes which could remove these limitations into their future hardware roadmaps.
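For reference, here is a minimal C++ sketch of what that flexibility looks like in use: compiling a vertex and a fragment stage as unlinked shader objects via VK_EXT_shader_object and binding them independently. The function name, handles and SPIR-V blobs passed in are illustrative assumptions, not part of the quoted text, and in a real application the EXT entry points would be loaded through vkGetDeviceProcAddr.

#include <vulkan/vulkan.h>
#include <cstdint>
#include <cstddef>

// Build two unlinked shader objects and bind them for drawing.
// All handles and SPIR-V blobs are passed in; the names are illustrative only.
void BindUnlinkedShaders(VkDevice device, VkCommandBuffer cmd,
                         VkDescriptorSetLayout setLayout,
                         const uint32_t* vsSpirv, size_t vsSize,
                         const uint32_t* fsSpirv, size_t fsSize)
{
    VkShaderCreateInfoEXT infos[2] = {};
    infos[0].sType          = VK_STRUCTURE_TYPE_SHADER_CREATE_INFO_EXT;
    infos[0].stage          = VK_SHADER_STAGE_VERTEX_BIT;
    infos[0].nextStage      = VK_SHADER_STAGE_FRAGMENT_BIT;
    infos[0].codeType       = VK_SHADER_CODE_TYPE_SPIRV_EXT;
    infos[0].codeSize       = vsSize;
    infos[0].pCode          = vsSpirv;
    infos[0].pName          = "main";
    infos[0].setLayoutCount = 1;
    infos[0].pSetLayouts    = &setLayout;

    infos[1]           = infos[0];                 // same layout, new stage
    infos[1].stage     = VK_SHADER_STAGE_FRAGMENT_BIT;
    infos[1].nextStage = 0;
    infos[1].codeSize  = fsSize;
    infos[1].pCode     = fsSpirv;

    // No VK_SHADER_CREATE_LINK_STAGE_BIT_EXT flag: the stages stay unlinked,
    // so any compatible vertex/fragment pair can be mixed and matched later.
    VkShaderEXT shaders[2];
    vkCreateShadersEXT(device, 2, infos, nullptr, shaders);

    // Bind the stages individually instead of a monolithic pipeline object.
    const VkShaderStageFlagBits stages[2] = { VK_SHADER_STAGE_VERTEX_BIT,
                                              VK_SHADER_STAGE_FRAGMENT_BIT };
    vkCmdBindShadersEXT(cmd, 2, stages, shaders);
    // ...set the required dynamic state, then vkCmdDraw as usual.
}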

TL;DR: implement D3D11/OpenGL in hardware ...

Kill it with fire already ...

The functionality just stinks of the series of bad reactionary design moves Khronos has made in the past. With this extension they're already obsoleting graphics pipeline libraries, which shipped just a year ago and still remain unimplemented in most drivers. They should be ashamed of themselves ...
 
At what point do we accept that low level APIs are just not going to work out in practice in the PC ecosystem? Almost a decade in, and you can count on the fingers of one hand the number of developers who were able to release a game without significant issues and performance regressions compared to their work on DX11. In some of those cases you still have to limit the scope of analysis to a subset of GPUs to avoid said regressions. There isn’t even any progress happening; I would argue it’s only getting worse.
 
@Lurkmass It looks like you can mix pipelines and shader objects. So where one case is better than the other, you can pick your poison.

In environments where VK_EXT_shader_object is supported, applications can choose to use only pipelines, only shader objects, or an arbitrary mix of the two.
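A rough sketch of that mixing within one command buffer, building on the shader objects from the earlier snippet; the handles passed in are assumed to exist, so treat this as illustrative rather than prescriptive:

#include <vulkan/vulkan.h>

// One command buffer that uses both approaches: a precompiled pipeline where
// the state is known ahead of time, shader objects where it must stay dynamic.
void RecordMixedDraws(VkCommandBuffer cmd, VkPipeline pipeline,
                      const VkShaderStageFlagBits* stages,
                      const VkShaderEXT* shaders)
{
    vkCmdBindPipeline(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
    vkCmdDraw(cmd, 3, 1, 0, 0);

    vkCmdBindShadersEXT(cmd, 2, stages, shaders);   // switch to shader objects
    // ...re-set whatever dynamic state this path requires, then draw again.
    vkCmdDraw(cmd, 3, 1, 0, 0);
}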
 
Developers should put their money where their mouths are and use an API not backed by Microsoft, or be at the mercy of its whims ...
Developers are bumbling in the dark, releasing stutter-infested games, unable to properly manage RAM and VRAM, and even unable to push performance harder.

As a long-time gamer, DX12 has been the single biggest cause of problems in modern PC gaming history. We have multiple dedicated threads for the mess DX12 causes alone, we have multiple developers testifying that coding for DX12 is bad, and we have a former member of the DX12 board testifying here on the forums that they made major mistakes with DX12. It's baffling. Your claimed future is non-existent.

And here it's confirmed by Khronos themselves:
Many of these assumptions have since proven to be unrealistic.

The entire premise is screwed up hard. Here Khronos explains why (in summary, Vulkan/DX12 constrain gaming dynamism).

On the application side, many developers considering or implementing Vulkan and similar APIs found them unable to efficiently support important use cases which were easily supportable in earlier APIs. This has not been simply a matter of developers being stuck in an old way of thinking or unwilling to "rein in" an unnecessarily large number of state combinations, but a reflection of the reality that the natural design patterns of the most demanding class of applications which use graphics APIs — video games — are inherently and deeply dependent on the very "dynamism" that pipelines set out to constrain.

And here Khronos admits it all in one paragraph: the problems Vulkan and DX12 sought to solve were simply transferred from the driver to the game, without giving developers the necessary know-how and tools to solve them (unlike the driver, which handled them gracefully), thus directly causing the stutter problem (shader compilation, the very topic of this thread).

As a result, renderers with a choice of API have largely chosen to avoid Vulkan and its "pipelined" contemporaries, while those without a choice have largely just devised workarounds to make these new APIs behave like the old ones — usually in the form of the now nearly ubiquitous hash-n-cache pattern. These applications set various pieces of "pipeline" state independently, then hash it all at draw time and use the hash as a key into an application-managed pipeline cache, reusing an existing pipeline if it exists or creating and caching a new one if it does not. In effect, the messy and inefficient parts of GL drivers that pipelines sought to eliminate have simply moved into applications, except without the benefits of implementation specific knowledge which might have reduced their complexity or improved their performance.
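For readers who haven't seen it in code, a minimal C++ sketch of that hash-n-cache pattern; Pipeline, CreatePipelineFromState and the DrawState fields are hypothetical stand-ins for the real API objects and calls (e.g. VkPipeline / vkCreateGraphicsPipelines), not code from any particular engine:

#include <cstdint>
#include <unordered_map>

struct Pipeline;                         // opaque handle for the real API pipeline

struct DrawState {                       // the "pipeline" state set independently
    uint64_t vertexShader   = 0;         // handles/hashes of the bound shaders
    uint64_t fragmentShader = 0;
    uint32_t blendMode      = 0;         // plus rasterizer, depth, vertex layout,
    uint32_t depthCompare   = 0;         // render-target formats, and so on
};

// Stand-in for the expensive call (e.g. vkCreateGraphicsPipelines).
Pipeline* CreatePipelineFromState(const DrawState& s);

static uint64_t Mix(uint64_t h, uint64_t v) { return (h ^ v) * 1099511628211ull; }

// Combine all the independently-set state into one key (FNV-1a style).
static uint64_t HashState(const DrawState& s) {
    uint64_t h = 1469598103934665603ull;
    h = Mix(h, s.vertexShader);
    h = Mix(h, s.fragmentShader);
    h = Mix(h, s.blendMode);
    h = Mix(h, s.depthCompare);
    return h;
}

static std::unordered_map<uint64_t, Pipeline*> g_pipelineCache;

// Called at draw time: reuse a cached pipeline if this state has been seen
// before, otherwise compile a new one (the miss is where the visible hitch is).
Pipeline* GetOrCreatePipeline(const DrawState& s) {
    const uint64_t key = HashState(s);
    auto it = g_pipelineCache.find(key);
    if (it != g_pipelineCache.end())
        return it->second;
    Pipeline* p = CreatePipelineFromState(s);
    g_pipelineCache.emplace(key, p);
    return p;
}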

They also admit what we've all suspected from the beginning: DX12/Vulkan DO NOT reduce CPU overhead, but actually directly INCREASE it.

On the driver side, pipelines have provided some of their desired benefits for some implementations, but for others they have largely just shifted draw time overhead to pipeline bind time (while in some cases still not entirely eliminating the draw time overhead in the first place). Implementations where nearly all "pipeline" state is internally dynamic are forced to either redundantly re-bind all of this state each time a pipeline is bound, or to track what state has changed from pipeline to pipeline — either of which creates considerable overhead on CPU-constrained platforms.
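To make the two options concrete, a small C++ sketch of "re-bind everything" versus "track what changed" at pipeline bind time; the State fields and Emit* calls are hypothetical stand-ins, not taken from any real driver:

struct State {
    unsigned blend     = 0;
    unsigned depthTest = 0;
    unsigned cullMode  = 0;
    // ...dozens more fields in a real implementation
};

// Stand-ins for writing state into the hardware command stream.
void EmitBlend(unsigned);
void EmitDepthTest(unsigned);
void EmitCullMode(unsigned);

static State g_current;                  // state last written to the hardware

// Option A: redundantly re-emit everything on every pipeline bind.
void BindAllState(const State& s) {
    EmitBlend(s.blend);
    EmitDepthTest(s.depthTest);
    EmitCullMode(s.cullMode);
    g_current = s;
}

// Option B: diff against the last-known state and emit only the changes;
// this saves redundant work, but the per-bind comparisons are pure CPU cost.
void BindChangedState(const State& s) {
    if (s.blend     != g_current.blend)     EmitBlend(s.blend);
    if (s.depthTest != g_current.depthTest) EmitDepthTest(s.depthTest);
    if (s.cullMode  != g_current.cullMode)  EmitCullMode(s.cullMode);
    g_current = s;
}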

Worse yet, these "low level" APIs (the quotation marks are Khronos's, not mine) actually prevented certain GPU architectures from accessing their full potential.

For certain implementations, the pipeline abstraction has also locked away a significant amount of the flexibility supported by their hardware, thereby paradoxically leaving many of their capabilities inaccessible in the newer and ostensibly "low level" API, though still accessible through older, high level ones. In effect, this is a return to the old problem of the graphics API artificially constraining applications from accessing the full capabilities of the GPU, only on a different axis.

Even consoles didn't fare well with these "low level" APIs. We finally heard it from an official source: these APIs caused havoc among games, and their benefits were overshadowed by their mistaken assumptions and terrible execution. I expect Microsoft to follow Khronos's lead (they already announced some updates coming to DX12 soon).

 
At what point do we accept that low level APIs are just not going to work out in practice in the PC ecosystem? Almost a decade in, and you can count on the fingers of one hand the number of developers who were able to release a game without significant issues and performance regressions compared to their work on DX11. In some of those cases you still have to limit the scope of analysis to a subset of GPUs to avoid said regressions. There isn’t even any progress happening; I would argue it’s only getting worse.
Were high level APIs working out at the end of their lives when developers started asking for features like bindless, persistent mapping, ray tracing, mesh shading, and more advanced GPU-driven rendering (ExecuteIndirect)? Is regressing to old habits and their problems supposed to be the solution? The beauty of Direct3D and its evolution is that it wasn't afraid to deprecate features like the fixed function pipeline when necessary in order to move towards a better future, even if it meant breaking compatibility with the old ways. When Direct3D13 comes around, will we be able to move forwards by removing cruft like tessellation/geometry shaders and transform feedback for good, or will we be stuck with those cursed mistakes so developers can keep using them?

Find a better design or a model but don't look to the past to repeat the same mistakes ...

@Lurkmass It looks like you can mix pipelines and shader objects. So where one case is better than the other, you can pick your poison.

Drivers shouldn't ever expose the functionality if it's suboptimal for their HW design. I don't trust developers not to abuse the feature if they can ... (they'll probably only use shader objects if they can) ...
 
Were high level APIs working out at the end of their lives when developers started asking for features like bindless, persistent mapping, ray tracing, mesh shading, and more advanced GPU-driven rendering (ExecuteIndirect)? Is regressing to old habits and their problems supposed to be the solution? The beauty of Direct3D and its evolution is that it wasn't afraid to deprecate features like the fixed function pipeline when necessary in order to move towards a better future, even if it meant breaking compatibility with the old ways. When Direct3D13 comes around, will we be able to move forwards by removing cruft like tessellation/geometry shaders and transform feedback for good, or will we be stuck with those cursed mistakes so developers can keep using them?

Find a better design or a model but don't look to the past to repeat the same mistakes ...



Drivers shouldn't ever expose the functionality if it's suboptimal for their HW design. I don't trust developers not to abuse the feature if they can ... (they'll probably only use shader objects if they can) ...
They were working much better than what we have today. Other than RT, nothing positive has been delivered. CPU and GPU performance have gotten much worse, VRAM usage has skyrocketed and stuttering is now commonplace. All of this with zero improvement to any aspect of the visuals/experience outside of RT.
 
Most developers seem to be unable to optimize for every architecture. When you see something like this, it's time to change the API:
[Image: Hogwarts Legacy GPU benchmarks, 1920x1080]

 
They were working much better than what we have today. Other than RT, nothing positive has been delivered. CPU and GPU performance have gotten much worse, VRAM usage has skyrocketed and stuttering is now commonplace. All of this with zero improvement to any aspect of the visuals/experience outside of RT.
That's not true, and even if it were, we're basing our observations on too few and too old data points for comparison. If we take Unreal Engine 5 with Nanite and virtual shadow maps as an example, D3D12 runs much better than D3D11, to the point where they deprecated the latter in their engine, and I'm sure you've heard this yourself from an insider at Epic Games ...

If you've seen the benchmarks with resizable BAR enabled vs disabled, we've seen a speed-up in many games, and it's virtually mandatory on discrete Intel graphics hardware. I detailed recently how being more explicit about memory management will allow us to extract these gains more consistently, as opposed to relying on the driver ...
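As one concrete illustration of what "more explicit" buys here, a C++ sketch of an application picking a resizable-BAR style memory type in Vulkan itself rather than hoping the driver promotes allocations into it; the function name is an illustrative assumption:

#include <vulkan/vulkan.h>
#include <cstdint>

// Returns the index of a memory type that is both DEVICE_LOCAL and
// HOST_VISIBLE (the kind of heap resizable BAR exposes), or UINT32_MAX if
// none is available. With such a type the application can write directly
// into VRAM instead of staging uploads through system memory.
uint32_t FindReBarMemoryType(VkPhysicalDevice gpu, uint32_t allowedTypeBits) {
    VkPhysicalDeviceMemoryProperties props;
    vkGetPhysicalDeviceMemoryProperties(gpu, &props);

    const VkMemoryPropertyFlags wanted =
        VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT |
        VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
        VK_MEMORY_PROPERTY_HOST_COHERENT_BIT;

    for (uint32_t i = 0; i < props.memoryTypeCount; ++i) {
        const bool allowed = (allowedTypeBits & (1u << i)) != 0;
        const bool matches =
            (props.memoryTypes[i].propertyFlags & wanted) == wanted;
        if (allowed && matches)
            return i;
    }
    return UINT32_MAX;   // fall back to a staging-buffer upload path
}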
 
That's not true, and even if it were, we're basing our observations on too few and too old data points for comparison. If we take Unreal Engine 5 with Nanite and virtual shadow maps as an example, D3D12 runs much better than D3D11, to the point where they deprecated the latter in their engine, and I'm sure you've heard this yourself from an insider at Epic Games ...

If you've seen the benchmarks with resizable BAR enabled vs disabled, we've seen a speed-up in many games, and it's virtually mandatory on discrete Intel graphics hardware. I detailed recently how being more explicit about memory management will allow us to extract these gains more consistently, as opposed to relying on the driver ...
There aren't any games on UE5 to use as a data point. How about you use an available game from the last 8 years of releases to demonstrate your point? There is certainly a plethora of them, many on Unreal Engine itself. Benchmarks of one were even posted just above.
 
There aren't any games on UE5 to use as a data point. How about you use an available game from the last 8 years of releases to demonstrate your point? There is certainly a plethora of them, many on Unreal Engine itself. Benchmarks of one were even posted just above.
I really can't show you, since AAA games have been exclusively using D3D12 for some time now, so they don't have a D3D11 backend for comparison. That's not for a lack of trying either: tons of developers are now using bindless, which is incompatible with older APIs, so developers at least seem to think that more explicit APIs are worth buying into for their features ...

Nearly all of the games that did offer both D3D12 and D3D11 aren't really relevant comparison points anymore ...
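For anyone unfamiliar with what "bindless" means here, a C++ sketch of the Vulkan descriptor-indexing setup behind it: one large, partially-bound texture table that shaders index dynamically, which has no direct D3D11/GL equivalent. The function name and the array capacity are illustrative assumptions:

#include <vulkan/vulkan.h>
#include <cstdint>

// Creates a descriptor set layout with one big array of sampled images that
// can be sparsely populated and updated while in use ("bindless" textures).
VkDescriptorSetLayout CreateBindlessTextureLayout(VkDevice device, uint32_t capacity)
{
    VkDescriptorSetLayoutBinding binding{};
    binding.binding         = 0;
    binding.descriptorType  = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
    binding.descriptorCount = capacity;             // e.g. tens of thousands
    binding.stageFlags      = VK_SHADER_STAGE_ALL;

    const VkDescriptorBindingFlags flags =
        VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT |
        VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT;

    VkDescriptorSetLayoutBindingFlagsCreateInfo flagsInfo{};
    flagsInfo.sType         = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_BINDING_FLAGS_CREATE_INFO;
    flagsInfo.bindingCount  = 1;
    flagsInfo.pBindingFlags = &flags;

    VkDescriptorSetLayoutCreateInfo layoutInfo{};
    layoutInfo.sType        = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
    layoutInfo.pNext        = &flagsInfo;
    layoutInfo.flags        = VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT;
    layoutInfo.bindingCount = 1;
    layoutInfo.pBindings    = &binding;

    VkDescriptorSetLayout layout = VK_NULL_HANDLE;
    vkCreateDescriptorSetLayout(device, &layoutInfo, nullptr, &layout);
    return layout;
}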
 
I really can't show you, since AAA games have been exclusively using D3D12 for some time now, so they don't have a D3D11 backend for comparison. That's not for a lack of trying either: tons of developers are now using bindless, which is incompatible with older APIs, so developers at least seem to think that more explicit APIs are worth buying into for their features ...

Nearly all of the games that did offer both D3D12 and D3D11 aren't really relevant comparison points anymore ...
Why aren't the games that did use both relevant? Nothing has changed.

We can look at the performance trends of games becoming hugely more demanding despite the complete lack of any progress in visuals/experience outside of RT. PC GPU performance as a whole is drifting further below its console equivalent, when low level APIs were designed to do the opposite.
 
Why aren't the games that did use both relevant? Nothing has changed.

We can look at the performance trends of games becoming hugely more demanding despite the complete lack of any progress in visuals/experience outside of RT. PC GPU performance as a whole is drifting further below its console equivalent, when low level APIs were designed to do the opposite.
So you'd rather accept outdated data instead of finding a proper comparison point, even if one isn't possible? I realize "trust me bro" isn't a very compelling argument, but do you really think that absolutely nothing has changed when developers are just now finding out the value of the new features behind more explicit APIs? Attempting to work around them on older APIs would result in much lower performance, hence why offering those backends is redundant work for them ...

You're just going to have to buy into the sort of "survivorship bias" that's inherent to my proposition, but I assure you that there are very plausible technical reasons that things have changed, and in aggregate for the better as well ...
 
So you'd rather accept outdated data instead of finding a proper comparison point, even if one isn't possible? I realize "trust me bro" isn't a very compelling argument, but do you really think that absolutely nothing has changed when developers are just now finding out the value of the new features behind more explicit APIs? Attempting to work around them on older APIs would result in much lower performance, hence why offering those backends is redundant work for them ...

You're just going to have to buy into the sort of "survivorship bias" that's inherent to my proposition, but I assure you that there are very plausible technical reasons that things have changed, and in aggregate for the better as well ...
I'm legitimately asking you why that data is outdated and not relevant; it wasn't a snarky response. In terms of what the gamer experiences from the software, I do feel nothing has changed for the better outside of the titles that incorporate RT.
 
I'm legitimately asking you why that data is outdated and not relevant; it wasn't a snarky response. In terms of what the gamer experiences from the software, I do feel nothing has changed for the better outside of the titles that incorporate RT.
Because developers' usage patterns with recent APIs have genuinely changed, hence why D3D11 is disappearing from AAA games. They're using D3D12 in ways that are incompatible with D3D11, and trying to make a renderer for the latter would be a purely academic exercise ...

Everybody concentrates on the user experience, but what about the developer side, since they carry far more weight from a technical perspective? I'm reluctant to explain further, but things are absolutely better for them, since they do use functionality that's unique to more explicit APIs to create content ...

If they truly felt that more explicit APIs don't empower them in any form, they could continue to use older APIs, but they don't, so why is that?
 
Then those people should go use translation layers (DXVK/D3D11on12/etc), helper libraries (V-EZ/RPS/MA), or some emulation layer, but the core API shouldn't be polluted with design patterns that we dropped, because the initial reasons for dropping them still hold true ...

The industry should at least give the compromise option (graphics pipeline libraries) a chance at an implementation first before it's deprecated by separate shader objects JUST AFTER 1 YEAR! This just reeks of desperation, very much like the Khronos Group's handling of OpenGL in the past, which ended up as a dumpster fire ...
 