Shader Compilation on PC: About to become a bigger bottleneck?

Separate shader objects go off the deep end by breaking down the pre-rasterization stages into their own objects. You're basically inviting yourself to extend awful functionality like tessellation or geometry shaders to do the same thing. I'm pretty confident that's not how modern AMD HW works, and it's a regression from GPL in terms of abstraction ...

Hence why I described the functionality earlier as vendors fundamentally implementing D3D11/OpenGL in HW ...

This is all super interesting! Could you elaborate further on why SSOs are similar to D3D11? If the benefit of GPL is greater compile-time granularity than PSOs, then wouldn't the even greater granularity of SSOs bring even bigger benefits in reducing the number of compile-time permutations?

SSOs seem to be just as “low level” as GPL and PSO because the application is still explicitly telling the driver what it wants to do at linking time. The downside is the hardware has to be more flexible to accommodate more dynamic linking. I’m sure I’m missing something but that’s my understanding from reading the high level descriptions of the GPL and SSO extensions.

There are subtle undertones in those blog posts that point to these extensions being more compatible with some architectures than others. I suspect this is also a source of friction. Purely from a developer's perspective though, are there downsides to SSOs?
 
BTW if I had to rank the biggest problems causing compilation stutters this would be my order ...

1. Material/Shader graphs (no solution)
2. PSOs (Separable pipelines via GPL)
3. Static states leading to redundant pipelines (dynamic states; see the sketch below)
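
To illustrate item 3, here's a minimal sketch (assuming Vulkan 1.3 or VK_EXT_extended_dynamic_state; the helper names are mine) of how marking states dynamic lets one pipeline cover combinations that would otherwise each need their own PSO:

```cpp
// Minimal sketch: instead of baking cull mode and depth state into the PSO
// (forcing a separate pipeline per combination), mark them dynamic and set
// them at record time. Assumes Vulkan 1.3 / VK_EXT_extended_dynamic_state.
#include <vulkan/vulkan.h>
#include <array>

VkPipeline createPipelineWithDynamicState(VkDevice device,
                                          VkGraphicsPipelineCreateInfo baseInfo)
{
    // These states are no longer part of the immutable pipeline key.
    std::array<VkDynamicState, 3> dynamicStates = {
        VK_DYNAMIC_STATE_CULL_MODE,          // core in Vulkan 1.3
        VK_DYNAMIC_STATE_DEPTH_TEST_ENABLE,
        VK_DYNAMIC_STATE_DEPTH_COMPARE_OP,
    };

    VkPipelineDynamicStateCreateInfo dynamicInfo{};
    dynamicInfo.sType             = VK_STRUCTURE_TYPE_PIPELINE_DYNAMIC_STATE_CREATE_INFO;
    dynamicInfo.dynamicStateCount = static_cast<uint32_t>(dynamicStates.size());
    dynamicInfo.pDynamicStates    = dynamicStates.data();

    baseInfo.pDynamicState = &dynamicInfo;

    VkPipeline pipeline = VK_NULL_HANDLE;
    vkCreateGraphicsPipelines(device, VK_NULL_HANDLE, 1, &baseInfo, nullptr, &pipeline);
    return pipeline;
}

void recordDraw(VkCommandBuffer cmd, VkPipeline pipeline)
{
    vkCmdBindPipeline(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
    // One pipeline, several state combinations: no extra PSO compiles.
    vkCmdSetCullMode(cmd, VK_CULL_MODE_BACK_BIT);
    vkCmdSetDepthTestEnable(cmd, VK_TRUE);
    vkCmdSetDepthCompareOp(cmd, VK_COMPARE_OP_GREATER_OR_EQUAL);
    // vkCmdDraw(...) etc.
}
```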

PSOs are beside the main issue. They don't even compare to the combinatorial explosion caused by material/shader graphs, which generate an insane number of shader variants. None of them are going to implement ubershaders or ban shader graphs, because of performance issues and loss of artist productivity respectively ...

That makes sense. It begs the question though: if neither GPL nor SSO solves the material permutation problem, and games will therefore still stutter, why even bother?
 
Alright, I am going to be as frank as possible here, since no one else is going to be.

I would say only Nvidia, Intel and AMD stand to gain, as continually worse performance will drive people to upgrade in an attempt to brute force past the issues.
History tells us a slightly different version of the story here; even Khronos alluded to it.

Lower-level APIs were created on the promise of increasing draw calls and reducing driver complexity to eliminate CPU overhead; it was said that DX11 was the limiting factor and that we needed to dump it quickly and move on. AMD in particular championed this direction (along with a few developers). AMD was having a very hard time properly optimizing for DX11 and OpenGL, and they still do, by the way, as they only managed to properly optimize for DX11 last year, when they released several drivers that increased fps in old DX11/OGL games many years after those games launched. So the problem still exists on their end to this day.

So they created Mantle and managed to amass some support from several developers. Mantle had the same problems we have now, but to a lesser degree. Nevertheless, Microsoft didn't like AMD's move to a new exclusive API, so they quickly assembled DX12, after which AMD ceased their Mantle efforts and scrapped the plan to implement it in future games.

So all was good in the lands of AMD. DX12 was demoed with Ashes of the Singularity, a game designed around the API. However, adoption of DX12 was slow as hell, and with each game problems were introduced: stuttering increased, CPU overhead stayed the same or became worse, performance was slower rather than faster than DX11, etc. Soon after, several developers expressed their displeasure with DX12, implying you should only move to DX12 if your needs align with that API. Even the developers of Ashes of the Singularity moved back to DX11 for their next game (Star Control), saying it didn't make sense to ship it with DX12, and they advised people to move to DX12 only for features, not performance.

So while the move to DX12 was highly beneficial to AMD, who still can't develop proper DX11 drivers quickly enough, the rest of the industry suffered. NVIDIA (who has the largest market share) suffered more, and developers suffered tremendously (as stated by Khronos and several other developers); even the developers who initially supported DX12 quickly had their enthusiasm die down! We gamers suffered as well: the experience of playing games was hectic, unpredictable and buggy for almost a decade. Even engine designers suffered, as Unity, Frostbite and Unreal are moving away from the direction of DX12's core concept.


And this is really the crux of the problem: AMD insisted on changing the core API swiftly, without proper consideration of a wider industry consensus; Microsoft quickly gave in and made that change a reality without proper consideration either; then Khronos followed their lead as well. Now we have the current situation, with people realizing that this core change only benefited a few, while the rest of the industry gained nothing but suffering and a vastly worse user experience.

We are in 2023, and the problems DX12 set out to solve have become worse. CPU overhead is vastly increased with no visual gains, and we struggle with our overpowered CPUs to render last-gen graphics! VRAM consumption is blown out of proportion for last-gen graphics, stuttering has increased tenfold for practically the same effects, and performance is worse on both AMD and NVIDIA! AMD didn't gain much from this either, as their performance and features lagged behind NVIDIA's, their market share dwindled to its absolute lowest point, and they are a generation behind NVIDIA in ray tracing, machine learning and upscaling. On the other hand, we the users got none of the promised explosion of draw calls (and thus more stuff rendered on screen), and none of the promised proper utilization of our hardware; in fact, it's the opposite.

So, the question is, did we waste it all so that one IHV could make their driver life easy?

Khronos wants to penetrate more markets and reach more developers, and that means going back to the way things were, when stuff was just easier and we got more done through more powerful hardware instead of the API getting in the way. This will again make things harder for AMD, no doubt, but to be honest it won't change much for them; their problems were never just bad DX11 drivers. They should focus on building better hardware, not on crippling APIs so that their hardware can look a little bit better.


I say it's time for things to return to the way they were. Those who want to do very advanced stuff can stick to the old DX12 paradigm, provided they handle its problems better (increased VRAM usage, stutter, increased CPU overhead). They don't get to pick and choose here: if they want to go DX12, they should give it their best shot, cover all corner cases and make sure their code is actually faster and more performant; in short, do a proper job, not a half-assed one with all the usual suspect problems present. If they can, so be it; if they can't, they should stick to the (new and old) DX11 paradigm, so that all of us can have a better experience.


 
Why would an additional option added into the API create any trust issues? You don't like it? Don't use it; all previously available options are still there and there are no signs of them being removed, yet at least.
It sends a message that the organization isn't interested in stable and forward looking design for their standards. This backtracking is almost as bad as that time the Khronos Group announced OpenCL 3.0, which was a downgrade compared to its previous iteration. Did going back to the old ways over there (i.e. OpenCL 1.x features) make strides in the compute space? The solid answer to that was no, since every vendor is now doing their own thing ...
You don't have to be a Khronos member to use Vulkan.
You kinda need Khronos members to implement their standards, otherwise no one can use them! Considering their rocky history with OpenGL and OpenCL, they're running on borrowed time. If they start wrecking the only good thing left going for them, their members are going to begin winding down their participation and activities in the group ...
I'm sorry but why is SSO "cruft and bloat" while GPLs aren't? Isn't adding both of them essentially the same thing: evolving the API with new features and options?
"Evolving the API" is a hilariously bad take with SSOs. It's literally "devolving" the API into the familiar and old functionality that's existed before with the exact same problems coming with them when we had *reasons* in the first place to abandon them ...
This is all super interesting! Could you elaborate further on why SSOs are similar to D3D11? If the benefit of GPL is greater compile-time granularity than PSOs, then wouldn't the even greater granularity of SSOs bring even bigger benefits in reducing the number of compile-time permutations?

SSOs seem to be just as “low level” as GPL and PSO because the application is still explicitly telling the driver what it wants to do at linking time. The downside is the hardware has to be more flexible to accommodate more dynamic linking. I’m sure I’m missing something but that’s my understanding from reading the high level descriptions of the GPL and SSO extensions.

There are subtle undertones in those blog posts that point to these extensions being more compatible with some architectures than others. I suspect this is also a source of friction. Purely from a developer's perspective though, are there downsides to SSOs?
SSOs could see a greater reduction in compile time on some hypothetical configuration, and they could be lower level, hence my remark about literally *implementing D3D11 in hardware* ...

The *best compatibility* with any architecture out there is virtually always going to be default monolithic pipelines. SSOs will bring back hidden recompilations and shader uploads, since they don't match the way most hardware works and changing these states dynamically incurs a performance cost ...

I don't see many drivers ever implementing SSOs so it's strange to see Khronos releasing a competitor to GPL ...
That makes sense. It begs the question though: if neither GPL nor SSO solves the material permutation problem, and games will therefore still stutter, why even bother?
GPL/SSO mostly solves the permutations caused by different combinations of the programmable geometry stages and the fragment shader. You're still going to have shader graphs generating at least XXXX unique fragment shaders. Most of the problem comes from how many individual shader variants an engine creates. You could cut down a decent amount of permutations by factoring out the different configurations of geometry stages, but there are potentially far bigger gains on the other side; id Software showed that PSOs were innocent by using ubershaders to combine the different variants into fewer shaders and by banning artists from using shader graphs ...

PSOs alone just aren't the main issue; they're somewhat of a red herring. It's that PSOs combined with shader graphs become a deadly combination, and the origin of the underlying cause lies closer to shader graphs. Compilation stutters are very likely for as long as shader graphs stay around ...
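
To put rough, entirely made-up numbers on that argument, here's a trivial back-of-the-envelope sketch of why separating out the geometry stages helps but doesn't remove the explosion:

```cpp
// Back-of-the-envelope sketch with made-up numbers, just to show where the
// explosion comes from and what GPL-style separation can and cannot fix.
#include <cstdio>

int main()
{
    const long geometryConfigs  = 8;     // hypothetical pre-raster stage combinations
    const long fragmentVariants = 4000;  // hypothetical shader-graph-generated fragment shaders

    // Monolithic PSOs: every combination is its own compile.
    const long monolithicPipelines = geometryConfigs * fragmentVariants;   // 32,000

    // GPL-style separation: compile each part once, link cheaply afterwards.
    const long separateLibraries = geometryConfigs + fragmentVariants;     // 4,008

    std::printf("monolithic compiles: %ld\n", monolithicPipelines);
    std::printf("separate compiles:   %ld\n", separateLibraries);
    // The fragment-variant count still dominates either way, which is why the
    // shader-graph explosion remains the real problem.
    return 0;
}
```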
 
Alright, I am going to be as frank as possible here, since no one else is going to be.
I think this is completely unrelated to the topic at hand. I had the chance to speak in person with Max McCullen at Build 2015, and he was demoing work made by @Andrew Lauritzen at the time. Both have had a large hand in influencing and offering advice on DX12, and I've never seen them suggest anything like what you have.
 
It sends a message that the organization isn't interested in stable and forward looking design for their standards.

The “forward looking design” has been proven unsuccessful in practice though, so it does make sense to pivot. Some of the basic assumptions that Mantle/Vulkan/DX12 are based on haven't held up. Others, like multi-threaded command queues, seem to be working out well though.

SSOs could see a greater reduction in compile time on some hypothetical configuration, and they could be lower level, hence my remark about literally *implementing D3D11 in hardware* ...

You’ve said that a few times but what does it mean? Do SSOs force you to submit individual shaders mapped 1:1 to the stages of the classic DX11 pipeline? The blog post implies that you have freedom to bind and link whatever shaders you want and can skip pipeline stages if desired.

The *best compatibility* with any architecture out there is virtually always going to be default monolithic pipelines. SSOs will bring back hidden recompilations and shader uploads, since they don't match the way most hardware works and changing these states dynamically incurs a performance cost ...

I don't see many drivers ever implementing SSOs so it's strange to see Khronos releasing a competitor to GPL ...

I'm trying to read tea leaves here, so I could be way off, but is the actual issue that AMD's primitive shader approach for the pre-raster stages is thrown for a toss by SSO? It seems SSO is objectively a good thing for developers and gamers, so maybe the right solution is to change the hardware.
 
It sends a message that the organization isn't interested in stable and forward looking design for their standards.
Quite the contrary: it shows that the organization is evolving the API in areas which proved to be problematic.

"Evolving the API" is a hilariously bad take with SSOs. It's literally "devolving" the API into the familiar and old functionality that's existed before with the exact same problems coming with them when we had *reasons* in the first place to abandon them ...
And these reasons are? What does the PSO model provide that SSO would destroy? Especially considering that you can literally use both at the same time.

SSOs could see a greater reduction in compile time on some hypothetical configuration, and they could be lower level, hence my remark about literally *implementing D3D11 in hardware* ...
Not sure why "implementing D3D11 in hardware" is a bad thing.

SSOs will bring back hidden recompilations and shader uploads, since they don't match the way most hardware works and changing these states dynamically incurs a performance cost ...
So should we just stick with a bad API which produces headaches and issues left and right, just because evolving it further will require - oh horror! - some changes to be made to the h/w?
Oh, wait, this is why we moved from D3D11/OGL to D3D12/VK in the first place. It seems to me that the underlying intent here is exactly the same. But now it's bad because?..
Btw, AMD didn't sit and wait with Mantle until the "members" and the whole industry agreed; they just released it. Seems about the same too.
 
DX11 definitely had headaches. The drivers are doing a lot to keep games working, and I believe it has its own issues with unpredictable draw call behaviour. I'd have to go back and research the motivations for DX12.
 
The “forward looking design” has been proven unsuccessful in practice though, so it does make sense to pivot. Some of the basic assumptions that Mantle/Vulkan/DX12 are based on haven't held up. Others, like multi-threaded command queues, seem to be working out well though.
Some changes are going to be necessary as time goes on if things don't work out, but repeating the same mistakes shouldn't be the intention ...
You’ve said that a few times but what does it mean? Do SSOs force you to submit individual shaders mapped 1:1 to the stages of the classic DX11 pipeline? The blog post implies that you have freedom to bind and link whatever shaders you want and can skip pipeline stages if desired.
SSOs force HW designers to implement their hardware shader stages to exactly match the software stages. The *freedom to dynamically link shaders* on the fly has been a source of driver bugs, increased memory consumption and performance problems, and it constrains future hardware design. We took that feature away for very good reasons, because it wasn't working out for nearly anybody else ...
I'm trying to read tea leaves here, so I could be way off, but is the actual issue that AMD's primitive shader approach for the pre-raster stages is thrown for a toss by SSO? It seems SSO is objectively a good thing for developers and gamers, so maybe the right solution is to change the hardware.
It's not just AMD that has problems with SSO, but yes, it does compete with GPL, which I take to be a very worrying sign that Khronos is confused, like they were in the past with OpenGL and OpenCL. We tried the experiment with SSOs before, and if you seriously think changing the hardware design is the solution then you haven't looked hard enough at the problem space ...

It's not just *AMD* that's designed a more monolithic geometry pipeline in hardware. That's the entire concept of mesh shaders as well! Mesh shading was an attempt to move away from the more granular and overloaded model of the traditional geometry pipeline. The industry has tried so hard to move away from features like tessellation and geometry shaders, and empowering them would undermine this effort. If that's not the future, then what are the chances that the more explicit and monolithic mesh shading pipeline is somehow the future?

We move away from features that didn't work out, to make room in hardware designs for new features that might work out, all the time, and developers absolutely shouldn't have a say in this either, because they don't have nearly as much foresight as the architects who work for the hardware vendors ...
 
This is probably a good refresher for people who want to go back and understand why Khronos chose the programming model for Vulkan that it did, and what issues they were trying to address from OpenGL.

It's interesting. Looking at dxvk and vkd3d-proton, we have really efficient translation from one API to another because the APIs are low-level enough that you can. I kind of wonder if it wouldn't be better for Nvidia, AMD and Intel to design their own APIs that match their hardware, and then basically provide a translation layer so people can write to a more high-level API like D3D. Allow people to write directly to the native API if they want. I guess this is sort of what happens in the driver, but the problem is we can't interface with the driver directly. It seems fairly obvious that Nvidia, AMD and Intel's hardware is different enough that trying to have one single API that abstracts over all of them in a reasonable way might be problematic. CPU architectures are similar enough between AMD and Intel, and even if you look at ARM the general programming model looks the same, just with different instructions. GPUs are definitely far more varied, and we probably wouldn't want to force them to agree on a uniform instruction set across vendors. I don't think we want something like x86 for GPUs.

Edit: I'd really like to understand why the new extension is easier to implement on Nvidia hardware, and much more difficult on AMD. It seems like the graphics pipeline abstraction maps closer to AMD than Nvidia.
 
Edit: I'd really like to understand why the new extension is easier to implement on Nvidia hardware, and much more difficult on AMD. It seems like the graphics pipeline abstraction maps closer to AMD than Nvidia.
It's because Nvidia implements D3D11 style shader and state management in their hardware!

Just about every graphics state in D3D11 is a dynamic state, and all shader states and programs (we call this the shader object) can be dynamically linked, which matches Nvidia's HW model ...

For all other hardware vendors, not every graphics state or shader can be dynamic. This is why API designers created the concept of "static states" when they came up with the idea of PSOs. Static states and static shader bindings allow the driver/compiler to optimize code generation based on the pipeline state itself. Drivers are able to apply these optimizations *liberally* since PSOs by design are *immutable*: the states/shaders within a PSO can't be changed. Mutable states and shaders throw a wrench into this, because drivers have to decide between quickly generating unoptimized shaders, which will severely degrade GPU-bound performance, or letting the compiler apply optimization passes, which will cause far longer compilation stutters, on top of the many other heroics vendors have to go through in their drivers that are unnecessary with PSOs. If you want to change states or shaders, you create a new PSO and then swap the old one for the new one ...

GPL is a hybrid model where we have monolithic geometry pipelines (all pre-rasterization states and shaders) with dynamic shader linking for fragment shaders ...
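
For anyone who hasn't looked at the extension, a minimal sketch of what that hybrid looks like with VK_EXT_graphics_pipeline_library (assuming the extension is enabled; the create-info setup and the vertex-input/fragment-output interface libraries are omitted for brevity, and the function name is mine):

```cpp
// Minimal sketch of VK_EXT_graphics_pipeline_library usage; assumes
// 'preRasterInfo'/'fragmentInfo' are otherwise fully filled-in
// VkGraphicsPipelineCreateInfo structs for their respective parts.
#include <vulkan/vulkan.h>

VkPipeline buildPipelineFromLibraries(VkDevice device,
                                      VkGraphicsPipelineCreateInfo preRasterInfo,
                                      VkGraphicsPipelineCreateInfo fragmentInfo)
{
    // 1) Pre-rasterization library: the "monolithic geometry pipeline" part.
    VkGraphicsPipelineLibraryCreateInfoEXT preRasterLib{};
    preRasterLib.sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_LIBRARY_CREATE_INFO_EXT;
    preRasterLib.flags = VK_GRAPHICS_PIPELINE_LIBRARY_PRE_RASTERIZATION_SHADERS_BIT_EXT;
    preRasterInfo.pNext = &preRasterLib;
    preRasterInfo.flags |= VK_PIPELINE_CREATE_LIBRARY_BIT_KHR;

    VkPipeline preRasterPipeline = VK_NULL_HANDLE;
    vkCreateGraphicsPipelines(device, VK_NULL_HANDLE, 1, &preRasterInfo, nullptr, &preRasterPipeline);

    // 2) Fragment shader library: the part that gets swapped per material variant.
    VkGraphicsPipelineLibraryCreateInfoEXT fragmentLib{};
    fragmentLib.sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_LIBRARY_CREATE_INFO_EXT;
    fragmentLib.flags = VK_GRAPHICS_PIPELINE_LIBRARY_FRAGMENT_SHADER_BIT_EXT;
    fragmentInfo.pNext = &fragmentLib;
    fragmentInfo.flags |= VK_PIPELINE_CREATE_LIBRARY_BIT_KHR;

    VkPipeline fragmentPipeline = VK_NULL_HANDLE;
    vkCreateGraphicsPipelines(device, VK_NULL_HANDLE, 1, &fragmentInfo, nullptr, &fragmentPipeline);

    // 3) Link the libraries into an executable pipeline. A real renderer would
    //    also link the vertex input and fragment output interface libraries.
    VkPipeline libraries[] = { preRasterPipeline, fragmentPipeline };
    VkPipelineLibraryCreateInfoKHR linkInfo{};
    linkInfo.sType        = VK_STRUCTURE_TYPE_PIPELINE_LIBRARY_CREATE_INFO_KHR;
    linkInfo.libraryCount = 2;
    linkInfo.pLibraries   = libraries;

    VkGraphicsPipelineCreateInfo linkedInfo{};
    linkedInfo.sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO;
    linkedInfo.pNext = &linkInfo;
    // VK_PIPELINE_CREATE_LINK_TIME_OPTIMIZATION_BIT_EXT could be added here to
    // trade link speed for more driver optimization.

    VkPipeline linked = VK_NULL_HANDLE;
    vkCreateGraphicsPipelines(device, VK_NULL_HANDLE, 1, &linkedInfo, nullptr, &linked);
    return linked;
}
```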
 
It's because Nvidia implements D3D11 style shader and state management in their hardware!

Just about every graphics state in D3D11 is a dynamic state, and all shader states and programs (we call this the shader object) can be dynamically linked, which matches Nvidia's HW model ...

For all other hardware vendors, not every graphics state or shader can be dynamic. This is why API designers created the concept of "static states" when they came up with the idea of PSOs. Static states and static shader bindings allow the driver/compiler to optimize code generation based on the pipeline state itself. Drivers are able to apply these optimizations *liberally* since PSOs by design are *immutable*: the states/shaders within a PSO can't be changed. Mutable states and shaders throw a wrench into this, because drivers have to decide between quickly generating unoptimized shaders, which will severely degrade GPU-bound performance, or letting the compiler apply optimization passes, which will cause far longer compilation stutters, on top of the many other heroics vendors have to go through in their drivers that are unnecessary with PSOs. If you want to change states or shaders, you create a new PSO and then swap the old one for the new one ...

GPL is a hybrid model where we have monolithic geometry pipelines (all pre-rasterization states and shaders) with dynamic shader linking for fragment shaders ...

I had just figured this out by re-reading the Khronos blog and the vulkan-docs. This almost looks like Vulkan is going to offer different paths for Nvidia and AMD, which is kind of wild.

On some implementations, there is no downside. On these implementations, unless your application calls every state setter before every draw, shader objects outperform pipelines on the CPU and perform no worse than pipelines on the GPU. Unlocking the full potential of these implementations has been one of the biggest motivating factors driving the development of this extension.

On other implementations, CPU performance improvements from simpler application code using shader object APIs can outperform equivalent application code redesigned to use pipelines by enough that the cost of extra implementation overhead is outweighed by the performance improvements in the application.

The "implementation" here with no downsides is Nvidia. The "other implementations" are AMD and Intel.


5.1. RESOLVED: How should implementations which absolutely must link shader stages implement this extension?


The purpose of this extension is to expose the flexibility of those implementations which allow arbitrary combinations of unlinked but compatible shader stages and state to be bound independently. Attempting to modify this extension to support implementations which do not have this flexibility would defeat the entire purpose of the extension. For this reason, implementations which do not have the required flexibility should not implement this extension.

IHVs whose implementations have such limitations today are encouraged to consider incorporating changes which could remove these limitations into their future hardware roadmaps.

Again, the implementation that should implement the extension is Nvidia and the implementations that should not implement the extension are likely AMD and Intel.
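
For context, the flexibility the proposal is describing looks roughly like this with VK_EXT_shader_object (a minimal sketch assuming the extension is enabled and SPIR-V blobs are already loaded; descriptor set layouts and the rest of the dynamic state are omitted):

```cpp
// Minimal sketch of VK_EXT_shader_object; assumes the extension is enabled
// and that vsCode/fsCode hold valid SPIR-V for a vertex and fragment shader.
#include <vulkan/vulkan.h>
#include <vector>

void createAndBindShaderObjects(VkDevice device, VkCommandBuffer cmd,
                                const std::vector<uint32_t>& vsCode,
                                const std::vector<uint32_t>& fsCode)
{
    // Extension entry points are not core, so load them from the device.
    auto pfnCreateShaders = reinterpret_cast<PFN_vkCreateShadersEXT>(
        vkGetDeviceProcAddr(device, "vkCreateShadersEXT"));
    auto pfnBindShaders = reinterpret_cast<PFN_vkCmdBindShadersEXT>(
        vkGetDeviceProcAddr(device, "vkCmdBindShadersEXT"));

    VkShaderCreateInfoEXT infos[2]{};

    // Vertex stage, created as linked with the following fragment stage.
    infos[0].sType     = VK_STRUCTURE_TYPE_SHADER_CREATE_INFO_EXT;
    infos[0].flags     = VK_SHADER_CREATE_LINK_STAGE_BIT_EXT;
    infos[0].stage     = VK_SHADER_STAGE_VERTEX_BIT;
    infos[0].nextStage = VK_SHADER_STAGE_FRAGMENT_BIT;
    infos[0].codeType  = VK_SHADER_CODE_TYPE_SPIRV_EXT;
    infos[0].codeSize  = vsCode.size() * sizeof(uint32_t);
    infos[0].pCode     = vsCode.data();
    infos[0].pName     = "main";

    // Fragment stage.
    infos[1].sType     = VK_STRUCTURE_TYPE_SHADER_CREATE_INFO_EXT;
    infos[1].flags     = VK_SHADER_CREATE_LINK_STAGE_BIT_EXT;
    infos[1].stage     = VK_SHADER_STAGE_FRAGMENT_BIT;
    infos[1].codeType  = VK_SHADER_CODE_TYPE_SPIRV_EXT;
    infos[1].codeSize  = fsCode.size() * sizeof(uint32_t);
    infos[1].pCode     = fsCode.data();
    infos[1].pName     = "main";

    VkShaderEXT shaders[2] = {};
    pfnCreateShaders(device, 2, infos, nullptr, shaders);

    // No pipeline object at all: bind the stages directly, then set every other
    // piece of state dynamically before drawing.
    const VkShaderStageFlagBits stages[2] = { VK_SHADER_STAGE_VERTEX_BIT,
                                              VK_SHADER_STAGE_FRAGMENT_BIT };
    pfnBindShaders(cmd, 2, stages, shaders);
    // vkCmdSet* dynamic state calls and vkCmdDraw(...) would follow here.
}
```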
 
Again, since no one here wants to directly address the subject (it seems most people probably didn't read Khronos' statement, or the other developers' statements, well enough), I already did in my previous post; this is the crux of the matter actually. AMD insisted on changing the core API, but it didn't/doesn't work out. Not for AMD, certainly not for NVIDIA, and not for general users.

We don't have to hide behind pretty jargon here to see it clearly.

Again, the implementation that should implement the extension is Nvidia and the implementations that should not implement the extension are likely AMD and Intel
This would lead to a vastly better experience on NV hardware, and a vastly inferior experience on AMD and Intel, with issues like long compilation times, frequent stutters, VRAM issues, etc.

Remember, games/engines are moving in the direction of more dynamism, which goes against the current DX12/Vulkan paradigm, thus compounding these issues further. So AMD hardware will be left with the status quo, while NVIDIA hardware will return to the more comfortable state of old.

All hardware should use this extension to level the playing field. AMD should make the necessary changes (Khronos actually suggests they do exactly that; they have the lowest market share anyway), or fork their own Mantle like they did before and handle this their own way.

IHVs whose implementations have such limitations today are encouraged to consider incorporating changes which could remove these limitations into their future hardware roadmaps.
 
It's because Nvidia implements D3D11 style shader and state management in their hardware!
It's because Nvidia h/w has good state management. Has nothing to do with "implementing D3D11 in their h/w".
You make it sound like it's a h/w deficit when in fact it is an advantage - and there are no logical reasons why APIs shouldn't expose this advantage, especially if it leads directly to a better user experience.

Again, the implementation that should implement the extension is Nvidia and the implementations that should not implement the extension are likely AMD and Intel.
If we go by Faith's posts above, Intel has already implemented it. The question is how it will run across all h/w.
 
Games can't use features that aren't exposed by the API/driver, and one of Microsoft's Direct3D representatives said they're going to hold out and see how it turns out. Developers aren't going to drop D3D12 for Vulkan or make a separate backend for one vendor ...

Vendors should just remove obsolete features (D3D11) that go unused to be in line with the rest of the industry (literally everyone else) ...

Nobody is interested in the extension in any official capacity other than open source driver developers (emulation), Nvidia, and Nintendo, which is just another arm of the former these days. Hopefully, the functionality is going to die and fade away into obscurity while Microsoft and the others move on to better things ...
 
Vendors should just remove obsolete features (D3D11) that go unused to be in line with the rest of the industry (literally everyone else) ...
You seriously propose to make Nv h/w worse to be "in line with the rest of the industry"?

Hopefully, the functionality is going to die and fade away into obscurity while Microsoft and the others move on to better things ...
Which are? The three options for solving the issue at hand are listed in Khronos' blog post. One of them is basically fantasy land, as it goes directly against the general needs of gaming graphics and hasn't really happened over the 8+ years of Mantle+ API development. The other two are both in place now. Which one is the better approach - we will see over the next few years.
 
You seriously propose to make Nv h/w worse to be "in line with the rest of the industry"?
Well, let me see: you currently have at least 3 hardware vendors shipping D3D12 drivers, and 2 of them are pretty clearly worse off (2 vs 1), and that's not counting Qualcomm, who also ship D3D12, don't ship native D3D11 drivers, and cater to an API (OpenGL ES) that specifically waives conformance tests for SSOs! (It's likely 3 vs 1 at that point.)

It'd be a surprise if any other hardware vendors wanted to really implement it in their official drivers ...

Also, the idea of Nvidia somehow forming a coalition with Intel to force AMD to implement SSOs is comical at best and downright not worth a thought at worst, since they have even less common ground on this problem. Out of all the mechanisms for state and shader management, Intel prefers PSOs the most while SSOs are the worst for them; meanwhile, it's the opposite for Nvidia. Nvidia would have a better chance forming a coalition with AMD to force Intel to implement GPL, since those two have more in common here than either does with Intel ...
Which are? The three options for solving the issue at hand are listed in Khronos' blog post. One of them is basically fantasy land, as it goes directly against the general needs of gaming graphics and hasn't really happened over the 8+ years of Mantle+ API development. The other two are both in place now. Which one is the better approach - we will see over the next few years.
Then maybe Nvidia should start taking Vulkan more seriously if they don't like having a dictator like Microsoft deciding whether or not the capabilities of their hardware get arbitrarily restricted. Either they work with them or go against them while AMD and Intel give chase for dominance over D3D ...
 
Vendors should just remove obsolete features (D3D11) that go unused to be in line with the rest of the industry (literally everyone else) ...

Obsolete features meaning geometry and tessellation shaders? GPL supports those shader stages too. The only real difference seems to be that AMD hardware is optimized for rolling vertex/geometry/tessellation shaders into one combined “primitive shader” stage while Nvidia retained support for dynamically linking those discrete stages at runtime.

GPL isn’t any more forward looking than SSO. It simply maps better to AMD hardware.

Nobody is interested in the extension in any official capacity other than open source driver developers (emulation), Nvidia, and Nintendo, which is just another arm of the former these days. Hopefully, the functionality is going to die and fade away into obscurity while Microsoft and the others move on to better things ...

Nvidia has a beta driver out with SSO support on day one so it seems they’re at least trying to fix the problem. Has any IHV implemented official support for GPL since it launched a year ago? It will be interesting to see where Microsoft takes DX13. Maybe they will drop support altogether for the legacy geometry pipeline.
 
Out of all the mechanisms for state and shader management, Intel prefers PSOs the most while SSOs are the worst for them; meanwhile, it's the opposite for Nvidia.
It is not "opposite for Nvidia", they do both equally well (they actually do PSOs better than AMD atm if we take the issues with TLOU on PC as a sign of who's doing how in that right now). It is the deficit of non-Nv h/w which should be fixed in h/w - finally, as it's been more than 10 years of IHVs transferring this issue to s/w vendors instead.

And no, this is not "implementing D3D11 in h/w", because it has nothing to do with D3D11 beyond the fact that D3D11 s/w runs better on h/w which has fast state management. If you expose this advantage in a modern API like VK or D3D12 then it suddenly becomes "implementing a new feature of D3D12 in h/w". Sounds quite a bit different, doesn't it?

Then maybe Nvidia should start taking Vulkan more seriously if they don't like having a dictator like Microsoft deciding whether or not the capabilities of their hardware get arbitrarily restricted. Either they work with them or go against them while AMD and Intel give chase for dominance over D3D ...
What gives you the idea that Nvidia doesn't work with Microsoft on evolving D3D too? RT was added in D3D12 as an Nvidia-exclusive feature. A bunch of other changes to stuff like RS and such were implemented because the original spec missed some things in Nv h/w, making it run worse than it could. How are PSOs any different?
 