Not sure if the Sony composer allows you to do that, but can't one just put compute threads in the hiperf 3D queue? It's just a command buffer, in the end. Whether the shader does 3D or not, it should not matter...
Did you mean high-priority 3D? That one is diagrammed as not being compute-capable in the Vgleaks docs and it's reserved for VSHELL.
I assume the above scenario is inserting a standalone compute job to run simultaneously with 3D rendering?
It's in the context of audio effects running on the GPU. Latencies were considered acceptable generally, but become very high when the GPU becomes heavily utilized.
I thought the extra compute queues introduced in the latest GCN revision were meant to deal with stuff like that.
The queue improvements cover a different part of the GPU compute process.
Having many queues reduces contention when multiple threads are trying to send commands to the GPU, and it reduces the chance that the in-order queues will stall a ready kernel because its commands are mixed with others that are not.
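To make the stall concrete, here is a toy model of in-order queues. It is only a sketch: the command names and tick values are made up, and it ignores everything that happens after a command issues (resource allocation, arbitration), which is the part the extra queues do not fix.

```python
from dataclasses import dataclass

@dataclass
class Command:
    name: str
    ready_at: int  # tick at which the command's dependencies are satisfied (hypothetical)

def issue_times(queue):
    """In-order queue: a command cannot issue before the command ahead of it."""
    times = {}
    previous_issue = 0
    for cmd in queue:
        issued = max(cmd.ready_at, previous_issue)
        times[cmd.name] = issued
        previous_issue = issued
    return times

# Hypothetical workload: one command still waiting on a dependency, one ready now.
render_wait = Command("render_wait", ready_at=10)
audio_fx = Command("audio_fx", ready_at=0)

# One shared queue: the ready kernel sits behind the one that is not ready.
single = issue_times([render_wait, audio_fx])

# Separate queues: each stream only waits on its own commands.
split = {}
for queue in ([render_wait], [audio_fx]):
    split.update(issue_times(queue))

print(single["audio_fx"], split["audio_fx"])
```

In the shared queue the audio kernel issues at tick 10 even though it was ready at tick 0; with its own queue it issues immediately.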
The expanded number of ACEs also boosts the number of wavefronts that can launch or receive commands in a cycle, and I'm assuming (dangerous to do, but I hope this is the case) the wavefront completion logic is also scaled up.
This significantly improves the process of getting a wavefront to the point of allocating resources and launching, which is where Orbis, as it has been disclosed, starts becoming inconsistent.
The front end processors need to be able to allocate the necessary resources for their wavefronts, and those become available when the CUs release them and the ready wavefronts win out in the arbitration phase.
Wavefront execution is a pretty coarse thing, and the audio presentation's lament is that a latency-sensitive load cannot count on those resources being ready in a timely fashion.
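A rough way to picture the problem is a CU with a fixed number of wavefront slots and no preemption. The function below is a deliberately simplified sketch, not a description of the real hardware: the slot count, the remaining run times, and the `waiters_ahead` parameter (other ready wavefronts that win arbitration first) are all assumptions for illustration.

```python
def launch_delay(resident_remaining_ms, total_slots, waiters_ahead=0):
    """Delay in ms before a new wavefront can launch, assuming no preemption.

    resident_remaining_ms: remaining run time of each wavefront holding a slot.
    total_slots: number of wavefront slots (hypothetical figure).
    waiters_ahead: ready wavefronts that win arbitration before this one.
    """
    free_slots = total_slots - len(resident_remaining_ms)
    if waiters_ahead < free_slots:
        return 0.0  # a slot is free even after the waiters ahead take theirs
    # Waiters ahead take the free slots and then the earliest releases;
    # this wavefront gets the next release after that.
    release_index = waiters_ahead - free_slots
    releases = sorted(resident_remaining_ms)
    if release_index >= len(releases):
        raise ValueError("more waiters than this simple model can resolve")
    return releases[release_index]

light_load = launch_delay([5.0, 12.0], total_slots=4)
heavy_load = launch_delay([5.0, 12.0, 33.0, 8.0], total_slots=4)
heavy_contended = launch_delay([5.0, 12.0, 33.0, 8.0], total_slots=4, waiters_ahead=2)
print(light_load, heavy_load, heavy_contended)
```

With a lightly loaded GPU the delay is zero. Once every slot is occupied, the startup time is dictated by whenever somebody else's wavefront happens to finish, and it gets worse if other ready wavefronts are ahead in arbitration.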
Audio is particularly sensitive, but spiking up to 33ms for startup time in other workloads is going to become noticeable.
It's still the early days, so perhaps some of the extra QoS tweaks or better tools might provide a way around some of the problems faced at launch.
In other unrelated news, if I run Folding@home simultaneously with a game (Diablo 3 in this particular case), my PC will bluescreen, usually within minutes. Or at least this was the case with the WHQL driver from last year; I haven't gotten around to trying with the new release from a couple of days ago.
I'm not sure of the reasons, since this among other things depends on the quality of the card's driver development.
However, one tweak that has been used to reduce the chance of this happening is to find a driver setting to extend the timeout for the card to respond. Long-running kernels can take longer to complete and generate responses than the standard timeout. The OS assumes they've locked up, which is something I alluded to earlier.
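The timeout logic can be sketched as a simple watchdog check. The kernel names, run times, and both timeout values below are hypothetical; the actual default and the way to change it depend on the OS and driver.

```python
def kernels_flagged_as_hung(kernel_runtimes_s, timeout_s):
    """Return the kernels a watchdog would treat as locked up.

    A kernel that takes longer than timeout_s to respond is assumed hung,
    even if it is simply long-running.
    """
    return [name for name, runtime in kernel_runtimes_s.items() if runtime > timeout_s]

# Hypothetical run times in seconds.
kernels = {"game_frame": 0.016, "folding_work_unit": 3.5}

default_timeout_s = 2.0    # assumed default, for illustration only
extended_timeout_s = 10.0  # assumed extended value

flagged_default = kernels_flagged_as_hung(kernels, default_timeout_s)
flagged_extended = kernels_flagged_as_hung(kernels, extended_timeout_s)
print(flagged_default, flagged_extended)
```

With the short timeout the long-running compute kernel is treated as a hang; with the extended timeout nothing is flagged, at the cost of a slower reaction to a genuine lockup.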