Isn't async compute simply the ability of a GPU to run compute shaders independently of, and asynchronously with, graphics workloads?
If so, doing it inter-shader instead of intra-shader should be sufficient to meet that definition.
Nobody said that it has to be the most efficient or fastest implementation in existence. Similarly, nobody said that enabling async compute has to be faster than not enabling it: if a particular implementation can't find inefficiencies to exploit, then so be it.