On PC, neither DirectX 11 nor OpenGL compute supports multiple asynchronous compute queues.
Hopefully OpenGL 5.0 and DirectX 12 will introduce that support, finally allowing us to fully utilize modern GPUs on PC. Currently the only way to take advantage of the multiple compute queues on modern NVIDIA and AMD GPUs is to use CUDA or OpenCL (or possibly Mantle, I'm not sure).
Perhaps @Dave Baumann would be kind enough to advise us on that one?