DX11 Compute Shader Dependencies

Rogon

Newcomer
Hi,

I was wondering if anyone here has a notion of how DX11 schedules compute jobs. As we all know, the threads in a thread group execute in lock-step, but other thread groups from the same Dispatch() call may execute on other units.

My question is, can jobs from other Dispatch() calls also execute in parallel? If so, how are dependencies tracked? For instance, how do you guarantee that compute job A has finished writing its results before compute job B starts up (in case B reads a RWBuffer generated by A)?

Thanks.
 
Multiple dispatches can execute in parallel, but only if there are no dependencies. If the output of A is an input to B, B will not start until A finishes.
 
There are no guarantees; it's up to you to synchronize. You can, for example, use full memory barriers, which at least guarantee that outstanding writes from one group have been made visible to all others. In practice you can use that to run algorithms that rely on memory coherency, and to channel information from group to group in a lock-free setup.
You don't have mutexes in DC, so you can't implement lock-based algorithms the regular way.
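
To make the lock-free idea above a bit more concrete, here is a minimal sketch of one common pattern ("last group done"), written as a CPU-side analogy in C++ rather than as shader code. Each std::thread stands in for a thread group, and the acquire-release fetch_add stands in for what, in a compute shader, would roughly be a DeviceMemoryBarrier() followed by an InterlockedAdd() on a counter. The names LastGroupReduce, finish_group and run_groups are made up for illustration, and the mapping to HLSL is an assumption, not something stated in this thread.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// CPU-side analogy only: each std::thread stands in for one thread group.
// Hypothetical names throughout; this is not D3D11 or HLSL code.
struct LastGroupReduce {
    std::vector<int> partials;              // one slot per group, like a RWBuffer
    std::atomic<unsigned> groups_done{0};   // counter bumped once per group
    int total = 0;                          // written only by the last group

    explicit LastGroupReduce(unsigned group_count) : partials(group_count, 0) {}

    // Called once per group. Returns true for the group that finished last.
    bool finish_group(unsigned group_id, int partial_sum) {
        assert(group_id < partials.size());
        partials[group_id] = partial_sum;   // plain write to "device memory"

        // acq_rel plays the role of barrier + atomic: the release half publishes
        // the write above, the acquire half makes the writes of groups that
        // finished earlier visible to this one.
        unsigned finished_before =
            groups_done.fetch_add(1, std::memory_order_acq_rel);
        if (finished_before + 1 != partials.size()) return false;

        // Only the last group gets here, so no lock and no spinning is needed.
        for (int p : partials) total += p;
        return true;
    }
};

// Runs group_count "groups" concurrently; group g contributes g + 1.
int run_groups(unsigned group_count) {
    assert(group_count > 0);
    LastGroupReduce state(group_count);
    std::atomic<unsigned> last_groups{0};
    std::vector<std::thread> groups;
    for (unsigned g = 0; g < group_count; ++g) {
        groups.emplace_back([&state, &last_groups, g] {
            if (state.finish_group(g, static_cast<int>(g) + 1))
                last_groups.fetch_add(1);
        });
    }
    for (auto& t : groups) t.join();
    assert(last_groups.load() == 1);        // exactly one group saw itself as last
    return state.total;
}
```

For example, run_groups(8) adds up 1 through 8 and returns 36. No group ever waits for another one here, which matters because nothing guarantees that all groups of a dispatch are running at the same time.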
 
First, thanks both!

@3dcgi: I didn't see this mentioned in Microsoft's SDK documentation. Would you mind sharing where this is specified?

@ethatron: How do you specify barriers in the DX11 command buffer? And you mention "DC", what is that?

Thanks!
 
Thanks both!

3dcgi: Do you know where in Microsoft's documentation this is specified?

Ethatron: What is DC? Also, how do you make memory barriers in DX11? I didn't see any mention of them in the DeviceContext classes.
 
The D3D model requires that the inputs of a dispatch or draw call correctly reflect the results of any dispatch or draw call that was executed previously. Hence, there is no explicit dispatch-to-dispatch or draw-to-dispatch synchronization in D3D. The driver and GPU are able to implement it however they want behind the scenes, as long as they maintain correctness from the point of view of the programmer. In practice GPUs are certainly capable of having multiple draws and dispatches in flight simultaneously, and the driver is responsible for analyzing your commands to determine dependencies so that it can insert sync points.
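
As a rough illustration of the kind of dependency analysis described above, here is a toy model in C++. It is only a sketch under simple assumptions: each command lists the resources it reads and writes by name, and a sync point is placed before any command that conflicts with work still in flight. Command, overlaps and find_sync_points are hypothetical names, and real drivers are certainly more elaborate than this.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy model only: driver internals are not documented in this thread.
// All names here are hypothetical.
struct Command {
    std::string name;
    std::set<std::string> reads;    // resources bound as inputs
    std::set<std::string> writes;   // resources bound as outputs
};

// True if the two resource sets share at least one resource.
static bool overlaps(const std::set<std::string>& a,
                     const std::set<std::string>& b) {
    for (const auto& r : a)
        if (b.count(r)) return true;
    return false;
}

// Returns the indices of commands that need a sync point inserted before them.
std::vector<std::size_t> find_sync_points(const std::vector<Command>& commands) {
    std::vector<std::size_t> sync_before;
    std::set<std::string> in_flight_reads, in_flight_writes;

    for (std::size_t i = 0; i < commands.size(); ++i) {
        const Command& c = commands[i];
        bool hazard = overlaps(c.reads, in_flight_writes)    // read-after-write
                   || overlaps(c.writes, in_flight_writes)   // write-after-write
                   || overlaps(c.writes, in_flight_reads);   // write-after-read
        if (hazard) {
            // Wait for everything submitted so far, then start a fresh batch.
            sync_before.push_back(i);
            in_flight_reads.clear();
            in_flight_writes.clear();
        }
        in_flight_reads.insert(c.reads.begin(), c.reads.end());
        in_flight_writes.insert(c.writes.begin(), c.writes.end());
    }
    return sync_before;
}
```

With three dispatches where A writes bufA, C touches unrelated buffers, and B reads bufA, this model lets A and C overlap and places a single sync point before B.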
 
Rogon, I don't know where anything is documented, just what happens in the hardware. MJP summed it up well.
 