Direct3D feature levels discussion

Scott_Arm · Oct 17, 2024

techuse said:
Is this going to end up as yet another entry in the long list of failures from Microsoft's API team?

Too early. AMD and Nvidia drivers may just need a ton of work and optimization. Who knows.

It’ll take a while before we find out, if that’s the state of things.

Krteq · Oct 17, 2024

Meanwhile, Work Graphs mesh nodes implemented in Vulkan

GPU Work Graphs mesh nodes now in Vulkan®

We’ve added mesh nodes to our Vulkan® experimental extension, VK_AMDX_shader_enqueue.

gpuopen.com

raytracingfan · Oct 18, 2024

Microsoft itself acknowledges that Work Graphs aren't always the best choice for GPU-driven rendering

D3D12 Work Graphs - DirectX Developer Blog

Official release for Work Graphs in D3D12. This programming model unlocks latent capability for GPU autonomy and enables future evolution.

devblogs.microsoft.com

Despite the potential advantages, the free scheduling model may not always be the best target for an app’s workload. Characteristics of the task, such as how it interacts with memory/caches, or the sophistication of hardware over time, may dictate whether some existing approach is better. Like continuing to use `ExecuteIndirect`. Or building producer consumer systems out of relatively long running compute shader threads that cross communicate – clever and fragile. Or using the paradigms in the DirectX Raytracing model, involving shaders splitting up and continuing later. Work graphs are a new tool in the toolbox.

I'm not sure why the blog post claims ExecuteIndirect will be soft-deprecated.

On a related note, what are the advantages and disadvantages of work graphs versus callable shaders as mechanisms for one shader to request the execution of another? Is there a particular technical reason why callable shaders are for the RT pipeline only and work graphs only support compute and mesh nodes? My understanding is that callable shaders are more flexible than work graphs, but does that flexibility come at a performance cost?

Scott_Arm · Oct 18, 2024

@raytracingfan I need to refresh my memory on why work graphs were proposed in the first place. Watching this video which is kind of a technical breakdown.

Ethatron · Oct 18, 2024

raytracingfan said:
Microsoft itself acknowledges that Work Graphs aren't always the best choice for GPU-driven rendering

D3D12 Work Graphs - DirectX Developer Blog

Official release for Work Graphs in D3D12. This programming model unlocks latent capability for GPU autonomy and enables future evolution.

devblogs.microsoft.com

I'm not sure why the blog post claims ExecuteIndirect will be soft-deprecated.

On a related note, what are the advantages and disadvantages of work graphs versus callable shaders as mechanisms for one shader to request the execution of another? Is there a particular technical reason why callable shaders are for the RT pipeline only and work graphs only support compute and mesh nodes? My understanding is that callable shaders are more flexible than work graphs, but does that flexibility come at a performance cost?

Callable shaders are not flexible. You can make only one decoupled call at a time; a sequence of calls is executed sequentially. You can't control what type of invocation (reduction or amplification) you want to have for the called shader.

Work Graphs supports raytracing just fine. It's flexible enough that you can coalesce hits according to any criteria you like.

Lurkmass · Oct 19, 2024

raytracingfan said:
I'm not sure why the blog post claims ExecuteIndirect will be soft-deprecated.

Why would you use the ExecuteIndirect (or standardized Vulkan device generated commands) API when Work Graphs with mesh nodes can provide so MUCH MORE with the PSO swapping capability? You can change shaders and pipeline states (which are included in the PSO swapping functionality) from the GPU timeline, in addition to being able to change the indirect draw/dispatch arguments, root signature/constants, or the index/vertex buffer bindings ...

Work Graphs aren't compatible with RT pipelines, but who really cares about doing GPU-driven RT pipelines when CPU overhead is hardly the concern over there? Wouldn't end users prefer to exploit the potential of the GPU being able to perform faster graphics state changes, as opposed to having the CPU compile EVERY unique combinatorial variant of PSOs?

AFAIC, Work Graphs w/ mesh nodes is just Microsoft porting over Xbox specific extensions to its ExecuteIndirect API implementation on PC ...

raytracingfan said:
On a related note, what are the advantages and disadvantages of work graphs versus callable shaders as mechanisms for one shader to request the execution of another? Is there a particular technical reason why callable shaders are for the RT pipeline only and work graphs only support compute and mesh nodes? My understanding is that callable shaders are more flexible than work graphs, but does that flexibility come at a performance cost?

Callable shaders and Work Graphs don't necessarily compete with each other ...

Callable shaders are exclusive to RT pipelines because a Microsoft representative is all but absolutely insistent that GPUs will never have true (general purpose & not exclusive to a specific PSO model) function calls that are performant. Having callable shaders restricted to RT PSOs allows GPU compilers to apply inlining optimizations more easily ...

RT nodes can't exist within Work Graphs because the DXR pipeline's stack-based model clashes with the feed-forward model of Work Graphs. With Work Graphs, recursion is limited to a single node by design since you can't revisit any prior nodes/shaders during execution. With DXR, you can't do arbitrary work amplification as seen with either amplification shaders or Work Graphs, so the number of threads declared at the start of execution must match the number at the end of execution ...

The Work Graphs execution model can be sort of described as an unhinged in-between combination of amplification shaders and DXR/RT pipelines ...
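
To make the feed-forward point above a bit more concrete, here is a toy CPU-side sketch in Python. It is not D3D12 or HLSL, and it says nothing about how real hardware schedules anything. The `run_graph` helper, the node names `cull` and `shade`, and the record format are all made up for illustration. The only rules it models are the two described above: a node may emit any number of records (amplification), and it may target itself or a later node but never an earlier one.

```python
from collections import deque

def run_graph(nodes, order, entry, records):
    """Toy feed-forward scheduler (illustrative only).

    nodes:   dict mapping node name -> function(record) -> list of
             (target_node, record) outputs
    order:   node names in graph order; a node may only target itself
             or a node that comes later in this list
    Returns how many times each node was launched.
    """
    rank = {name: i for i, name in enumerate(order)}
    queue = deque((entry, r) for r in records)
    launches = {name: 0 for name in order}
    while queue:
        name, rec = queue.popleft()
        launches[name] += 1
        for target, out in nodes[name](rec):
            # Feed-forward rule: never revisit a prior node.
            if rank[target] < rank[name]:
                raise ValueError(f"{name} -> {target} goes backwards")
            queue.append((target, out))
    return launches

# Hypothetical nodes: 'cull' amplifies object N into N 'shade' records,
# 'shade' is a leaf that emits nothing.
def cull(obj):
    return [("shade", (obj, m)) for m in range(obj)]

def shade(rec):
    return []

launches = run_graph({"cull": cull, "shade": shade},
                     order=["cull", "shade"], entry="cull",
                     records=[0, 1, 3])
print(launches)  # {'cull': 3, 'shade': 4}
```

Three input records turn into four `shade` launches, which is the kind of thread-count change a DXR-style "same number of threads in and out" model doesn't allow. A node that tried to emit back to `cull` from `shade` would raise here, which is the toy equivalent of the no-revisiting rule.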

Krteq · Nov 11, 2024

If you want to experiment with Work Graphs (even with a non-WG-capable GPU), AMD has provided a "Playground" framework/utility/tool

Work Graph Playground: a learning framework for GPU Work Graphs

Read about our latest sample for D3D12 GPU Work Graphs. We're making Work Graphs more accessible with a tutorial framework.

gpuopen.com

mr magoo · Sunday at 10:37 AM

https://twitter.com/x/status/1868046579002921034

Interesting thoughts from Sebbi

IQandHDR · Sunday at 1:52 PM

mr magoo said:
https://twitter.com/x/status/1868046579002921034

Interesting thoughts from Sebbi

If you thought TPM requirements for Windows 11 made people whine, just wait until people hear that their 10 year old GPU doesn't support the new API. :runaway:

chris1515 · Sunday at 2:02 PM

mr magoo said:
https://twitter.com/x/status/1868046579002921034

Interesting thoughts from Sebbi

https://twitter.com/x/status/1868047332300013633

https://twitter.com/x/status/1868224216673534407

https://twitter.com/x/status/1868226166307377431

He wants Pascal GPU support to disappear.

IQandHDR · Sunday at 2:11 PM

I kinda want support for them to go away too.
They are pre-RTX.
Time to rip off the band-aid

HUB will be fun to watch when that happens...the crying will be entertaining

Remij · Sunday at 2:22 PM

Exactly. We're almost 4 generations into the RTX line of GPUs now. It's time to drop the old stuff that is impeding progress.

Shifty Geezer · Sunday at 3:06 PM

That's a business decision. How many Pascal-generation GPUs are out there and what's the spending on them like? If the sizeable majority of PC gamers are using more recent archs, it's a net win to abandon GPUs that are lacking features.

However, in the past you could more readily move to newer GPUs because they didn't cost the earth and consumers were able to upgrade. As the cost to progress has increased, the gains need to be there. And notably, DX12 was very lacking in that respect.

DegustatoR · Sunday at 3:10 PM

chris1515 said:
He wants Pascal GPU support to disappear.

If there is a new API, then anything below DX12U will likely end up unsupported, so not only Pascal (2016) but also Vega (2019) and RDNA1 (2020) and even "small Turing" (2019).
It would also mean that a crap load of iGPUs wouldn't be able to handle this new API. Intel ships non-DX12U CPUs right now I believe?

It's also not entirely clear that a "clean break" will end up as clean as the provided examples of Apple and Qcom would suggest. Both are kinda limited to their own markets and thus can do whatever they want with the h/w, and both do have issues handling s/w which wasn't written for that h/w as well. Even making DX12U the base feature level for a supposed new API likely won't get rid of all the IHV specific things and optimizations current APIs have. And I'd wager that it shouldn't, because something which is "clean and easy" for a programmer has a very high chance of ending up being a total compatibility nightmare for the end users.

IQandHDR · Sunday at 4:52 PM

Shifty Geezer said:
That's a business decision. How many Pascal-generation GPUs are out there and what's the spending on them like? If the sizeable majority of PC gamers are using more recent archs, it's a net win to abandon GPUs that are lacking features.

However, in the past you could more readily move to newer GPUs because they didn't cost the earth and consumers were able to upgrade. As the cost to progress has increased, the gains need to be there. And notably, DX12 was very lacking in that respect.

On Steam they are sub 10% in regards to hardware:

NVIDIA GeForce GTX 1050 0.94%
NVIDIA GeForce GTX 1050 Ti 2.02%
NVIDIA GeForce GTX 1060 2.68%
NVIDIA GeForce GTX 1070 1.02%
NVIDIA GeForce GTX 1070 Ti 0.33%
NVIDIA GeForce GTX 1080 0.67%
NVIDIA GeForce GTX 1080 Ti 0.49%

For a total of: 8.15%
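
For anyone who wants to re-check that total, here is a quick Python sum. The shares are copied straight from the list in this post, not re-pulled from the Steam survey, and `pascal_share` is just a name picked for the example. `Decimal` is used so the two-decimal figures add up exactly instead of picking up float rounding noise.

```python
from decimal import Decimal

# Shares as quoted in the post above (percent of surveyed Steam users).
pascal_share = {
    "GTX 1050": "0.94",
    "GTX 1050 Ti": "2.02",
    "GTX 1060": "2.68",
    "GTX 1070": "1.02",
    "GTX 1070 Ti": "0.33",
    "GTX 1080": "0.67",
    "GTX 1080 Ti": "0.49",
}

total = sum(Decimal(v) for v in pascal_share.values())
print(total)  # 8.15
```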

RTX GPUs are +50%

And some of those Pascal GPUs would fall flat running modern games, even without RT, unless you think 720p Low settings is a target for developers in 2024.

But again, the whine will be loud, as always with minorities trying to stay relevant

mr magoo · Monday at 1:40 PM

https://twitter.com/x/status/1868557462544208262

All replies are very good. Lots of good stuff here.

raytracingfan · Monday at 6:44 PM

If DX13/Vulkan 2 is released and drops support for anything without full DX12U support, then they can go even further and drop the vertex shader pipeline entirely. Developers can just use mesh shaders or work graphs with mesh nodes for their rasterization needs. Making something like extended dynamic state or graphics pipeline libraries standard would also help cut down on PSO permutations.

DegustatoR · Monday at 6:59 PM

raytracingfan said:
If DX13/Vulkan 2 is released and drops support for anything without full DX12U support, then they can go even further and drop the vertex shader pipeline entirely. Developers can just use mesh shaders or work graphs with mesh nodes for their rasterization needs. Making something like extended dynamic state or graphics pipeline libraries standard would also help cut down on PSO permutations.

Drop from what? The API? Not sure if there's any benefit to having fewer options instead of more. H/w? Nothing stops anyone from doing it even now I think.

troyan · Monday at 7:07 PM

mr magoo said:
https://twitter.com/x/status/1868557462544208262

All replies are very good. Lots of good stuff here.

And DXR is the result of a pure software API (nVidia has OptiX). So I don't understand the complaint. We want real-time raytracing in games. RT Cores accelerate workloads which have so far been run via compute on the GPU.
Epic is trying to be better than hardware and their result is just bad. DICE implemented raytracing within a few weeks in Battlefield 5 and the result is still much better than what software Lumen can do.

An RTX 2080 with the same rasterization performance as the GTX 1080 Ti performs much better with raytracing: https://www.tomshardware.com/reviews/nvidia-pascal-ray_tracing-tested,6085.html
It's nearly twice as fast in Metro Exodus.

homerdog · Monday at 11:07 PM

DegustatoR said:
Drop from what? The API? Not sure if there's any benefit to having fewer options instead of more. H/w? Nothing stops anyone from doing it even now I think.

I was wondering this. Are we talking about features that don't exist in DX12U? Devs get to choose the minimum spec to run their games. Alan Wake 2 effectively didn't support Pascal, at least at launch. Of course the game didn't sell well, but I don't think that's the reason.
