Direct3D feature levels discussion

Is this going to end up as yet another entry in the long list of failures from Microsoft's API team?

Too early. AMD and Nvidia drivers may just need a ton of work and optimization. Who knows.

It’ll take a while before we find out, if that’s the state of things.
 
Microsoft itself acknowledges that Work Graphs aren't always the best choice for GPU-driven rendering:
"Despite the potential advantages, the free scheduling model may not always be the best target for an app’s workload. Characteristics of the task, such as how it interacts with memory/caches, or the sophistication of hardware over time, may dictate whether some existing approach is better. Like continuing to use `ExecuteIndirect`. Or building producer consumer systems out of relatively long running compute shader threads that cross communicate – clever and fragile. Or using the paradigms in the DirectX Raytracing model, involving shaders splitting up and continuing later. Work graphs are a new tool in the toolbox."
I'm not sure why the blog post claims ExecuteIndirect will be soft-deprecated.

On a related note, what are the advantages and disadvantages of work graphs versus callable shaders as mechanisms for one shader to request the execution of another? Is there a particular technical reason why callable shaders are for the RT pipeline only and work graphs only support compute and mesh nodes? My understanding is that callable shaders are more flexible than work graphs, but does that flexibility come at a performance cost?
 
Callable shaders are not flexible. You can make only one decoupled call at a time; a sequence of calls is executed sequentially. You can't control what type of invocation (reduction or amplification) the called shader gets.

Work Graphs support raytracing just fine. They're flexible enough that you can coalesce hits according to any criteria you like.
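
To make the coalescing point concrete, here is a toy CPU-side sketch in C++ of what "coalescing hits by a criterion" could look like. It is not D3D12 or HLSL code: `Hit`, `materialId`, and `coalesceHits` are hypothetical names, and grouping by material is just one assumed criterion. A callable-shader-style path would instead handle each hit in arrival order, one call at a time.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical hit record; a real one would carry whatever the shading step needs.
struct Hit {
    std::uint32_t rayIndex;
    std::uint32_t materialId;
};

// Group hits so that every batch contains only hits that share one material.
// In this toy model a std::map stands in for routing records to per-material work;
// the return value maps materialId -> ray indices, in arrival order.
std::map<std::uint32_t, std::vector<std::uint32_t>>
coalesceHits(const std::vector<Hit>& hits) {
    std::map<std::uint32_t, std::vector<std::uint32_t>> batches;
    for (const Hit& h : hits) {
        batches[h.materialId].push_back(h.rayIndex);
    }
    return batches;
}
```

Each batch can then be processed as one coherent group. Swapping `materialId` for any other key changes the coalescing criterion without touching the rest of the sketch.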
 
I'm not sure why the blog post claims ExecuteIndirect will be soft-deprecated.
Why would you use the ExecuteIndirect API (or the standardized Vulkan device-generated commands) when Work Graphs with mesh nodes can provide so MUCH MORE through PSO swapping? You can change shaders and pipeline state (both covered by PSO swapping) from the GPU timeline, in addition to changing the indirect draw/dispatch arguments, root signature/constants, or the index/vertex buffer bindings ...

Work Graphs aren't compatible with RT pipelines, but who really cares about GPU-driven RT pipelines when CPU overhead is hardly the concern there? Wouldn't end users prefer to exploit the GPU's ability to perform faster graphics state changes, as opposed to having the CPU compile EVERY unique combinatorial PSO variant?

AFAIC, Work Graphs w/ mesh nodes is just Microsoft porting over Xbox-specific extensions to its ExecuteIndirect API implementation on PC ...
On a related note, what are the advantages and disadvantages of work graphs versus callable shaders as mechanisms for one shader to request the execution of another? Is there a particular technical reason why callable shaders are for the RT pipeline only and work graphs only support compute and mesh nodes? My understanding is that callable shaders are more flexible than work graphs, but does that flexibility come at a performance cost?
Callable shaders and Work Graphs don't necessarily compete with each other ...

Callable shaders are exclusive to RT pipelines because a Microsoft representative is all but absolutely insistent that GPUs will never have true function calls (general purpose and not tied to a specific PSO model) that are performant. Restricting callable shaders to RT PSOs allows GPU compilers to apply inlining optimizations more easily ...

RT nodes can't exist within Work Graphs because the DXR pipeline's stack-based model clashes with the feed-forward model of Work Graphs. With Work Graphs, recursion is limited to a single node by design, since you can't revisit any prior nodes/shaders during execution. With DXR, you can't do arbitrary work amplification as seen with either amplification shaders or Work Graphs, so the number of threads declared at the start of execution must match the number at the end ...
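
For anyone who wants to poke at the feed-forward idea, here is a rough CPU-side toy in C++. It assumes nodes are stored in topological order and only models three things from the paragraph above: a node may amplify (emit many records), it may target itself up to a recursion budget, and it may never target an earlier node. All names (`Record`, `Output`, `NodeFn`, `runGraph`, `maxSelfRecursion`) are made up for illustration and none of this is the D3D12 API.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <stdexcept>
#include <vector>

// Toy model only: hypothetical types, not D3D12 structures.
struct Record {
    std::size_t node;  // index of the node that will consume this record
    int payload;
    int depth;         // how many self-recursive hops led to this record
};

struct Output {
    std::size_t targetNode;
    int payload;
};

// A node consumes one payload and emits zero or more outputs (amplification).
using NodeFn = std::function<std::vector<Output>(int payload)>;

// Runs the graph and returns how many records each node consumed.
// Feed-forward rule: a node may target itself (bounded) or a later node only.
std::vector<std::size_t> runGraph(const std::vector<NodeFn>& nodes,
                                  Record entry, int maxSelfRecursion) {
    std::vector<std::size_t> consumed(nodes.size(), 0);
    std::deque<Record> pending{entry};
    while (!pending.empty()) {
        Record r = pending.front();
        pending.pop_front();
        ++consumed.at(r.node);  // throws if the entry node index is invalid
        for (const Output& out : nodes[r.node](r.payload)) {
            if (out.targetNode < r.node || out.targetNode >= nodes.size()) {
                throw std::logic_error("output must feed forward");
            }
            if (out.targetNode == r.node) {
                if (r.depth >= maxSelfRecursion) continue;  // budget exhausted
                pending.push_back({out.targetNode, out.payload, r.depth + 1});
            } else {
                pending.push_back({out.targetNode, out.payload, 0});
            }
        }
    }
    return consumed;
}
```

A stack-based model would instead let a shader call into another one and resume afterwards. Here nothing ever returns to its producer, which is why the number of records in flight can grow or shrink freely between nodes.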

The Work Graphs execution model can be sort of described as an unhinged in-between combination of amplification shaders and DXR/RT pipelines ...
 