My interpretation is that software is under no obligation to order asynchronous compute's queued commands sequentially with respect to the graphics/synchronous queue, barring explicit barriers. That's not the same as the GPU being obligated to actually process them concurrently.
But isn't Async Compute supposed to be a feature where the GPU does not need to juggle between two distinct modes (graphics/compute)?
That's partly why I was curious whether yielding or stalls on the graphics side might still allow some asynchronous behavior.
Sort of like this:
AC queue: A B C D E F G
Graphics: a b c d *maybe stall* e f g
The actual sequence might be abcdABCDEFGefg, abcdefgABCDEFG, or, if the compute portion has some level of preemption, abcdABC*preempt*efgDEFG.
Even if they cannot process concurrently, the compute queue's command processing relative to the graphics domain is not fixed.
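To make the orderings above concrete, here is a minimal toy sketch in Python. It is not a model of how any real GPU schedules work; the function names (`schedule`, `is_valid_merge`) and the parameters `stall_after` and `preempt_after` are hypothetical, made up purely for illustration. It assumes the graphics queue owns the GPU, compute only gets to run when graphics stalls, and graphics can optionally take the GPU back after a given number of compute commands.

```python
GRAPHICS = "abcdefg"   # commands in the graphics queue, in submission order
COMPUTE = "ABCDEFG"    # commands in the async compute queue, in submission order


def schedule(graphics, compute, stall_after=None, preempt_after=None):
    """Toy serial scheduler (illustrative assumption, not real hardware behavior).

    stall_after:   number of graphics commands issued before graphics stalls
                   (None means graphics never stalls).
    preempt_after: number of compute commands run during the stall before
                   graphics resumes (None means compute drains completely).
    """
    if stall_after is None:
        # No stall: graphics runs as one solid block, compute waits until the end.
        return graphics + compute
    n = len(compute) if preempt_after is None else preempt_after
    return (graphics[:stall_after]      # graphics up to the stall
            + compute[:n]               # compute fills the gap
            + graphics[stall_after:]    # graphics resumes
            + compute[n:])              # whatever compute was left over


def is_valid_merge(sequence, graphics, compute):
    """True if each queue's own order is preserved and nothing is lost or duplicated."""
    g = [c for c in sequence if c in graphics]
    k = [c for c in sequence if c in compute]
    return (g == list(graphics) and k == list(compute)
            and len(sequence) == len(graphics) + len(compute))


if __name__ == "__main__":
    for kwargs in ({}, {"stall_after": 4}, {"stall_after": 4, "preempt_after": 3}):
        seq = schedule(GRAPHICS, COMPUTE, **kwargs)
        print(kwargs, seq, is_valid_merge(seq, GRAPHICS, COMPUTE))
```

All three results are valid merges: each queue keeps its own internal order, but where the compute commands land relative to the graphics commands varies, even though nothing in this toy model ever runs concurrently.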
I'd still prefer more data.
So now we have a second test pointing to what both the Oxide employee and AMD_Robert claimed?
The context switch requirement prior to Maxwell 2 is known.
http://www.anandtech.com/show/9124/amd-dives-deep-on-asynchronous-shading
What policy Maxwell 2 actually follows is not yet clear.
It may have a more limited set of options than GCN for switching on specific events. But if its architecture has not split compute from graphics as heavily, a solidly running graphics context might still have priority, or a lack of preemption might make the benchmark's graphics portion a solid block that the GPU cannot interleave other work with.