Yeah, I was thinking in the case of using the previous render target as a texture in the next pass. There's only so much you can do to prevent a pipeline flush in that case, as far as I can see... Wouldn't it even be possible to avoid pipeline flushes for these situations? It's probably not worth the complexity, but I'm just thinking theoretically.
A pipeline flush (explicit or not) is the only explanation for what you are describing. If there is no pipeline flush, then the rest of your post is wrong, and that's that.
Jawed said: There's no pipeline flush there. Read more carefully.
Here's an extreme example that should be easy to understand. Consider a full-screen quad covering millions of pixels. Every multiprocessor has up to 768 pixels in flight at a given time. As soon as all the multiprocessors are filled with threads, every time a warp exits, a new one will enter. As long as the scheduler is smart enough not to run all the warps perfectly in sync, you won't see bubbles then.
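For what it's worth, here is a toy model of that full-screen quad case on a single multiprocessor. It is only a rough sketch: the warp size of 32 and the random per-warp latency range are my assumptions, not figures from the hardware docs, and `slot_utilization` is a made-up helper name. It just shows that when a new warp enters every time one exits, the slots stay busy until the work runs out.

```python
import heapq
import random

WARP_SIZE = 32                          # assumption: 32 pixels per warp
PIXELS_IN_FLIGHT = 768                  # per-multiprocessor figure from the post
SLOTS = PIXELS_IN_FLIGHT // WARP_SIZE   # 24 resident warps


def slot_utilization(total_pixels, seed=0):
    """Fraction of slot-time spent busy when exiting warps are replaced at once."""
    rng = random.Random(seed)
    pending = total_pixels // WARP_SIZE  # warps still waiting to enter
    running = []                         # min-heap of finish times of resident warps
    busy = 0                             # total time all warps spent resident
    makespan = 0                         # time at which the last warp exits

    # Fill phase: occupy every slot. Random latencies keep warps out of sync.
    while pending and len(running) < SLOTS:
        latency = rng.randint(50, 150)   # assumption: arbitrary latency range
        heapq.heappush(running, latency)
        busy += latency
        pending -= 1

    # Steady state: each time a warp exits, a new one enters the freed slot.
    while running:
        now = heapq.heappop(running)
        makespan = now
        if pending:
            latency = rng.randint(50, 150)
            heapq.heappush(running, now + latency)
            busy += latency
            pending -= 1

    return busy / (SLOTS * makespan) if makespan else 0.0


if __name__ == "__main__":
    # Roughly a million-pixel quad; the only idle time is the drain at the very end.
    print(slot_utilization(1024 * 1024))
```

The only bubble in this model is the tail, when the last warps finish and nothing is left to replace them.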
To make sure we have the same definition of a pipeline flush, let's also take an example with CUDA. After every call to the API, you are guaranteed that all stages on the GPU are empty, because CUDA is fully synchronous. When the pipeline is full and begins emptying, the pipeline flush has started. When new threads begin entering again, the pipeline flush has ended.
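To put a rough number on that definition, here is a small simulation of one multiprocessor with 24 warp slots (768 pixels at an assumed 32 per warp). It is a sketch under my own assumptions: the latency range is arbitrary, and `run_batch` and `utilization` are hypothetical helpers, not anything from the CUDA API. With `flush_between=True` every batch must drain completely before the next one starts, which is the flush as defined above; with `False` the same warps are fed in back to back.

```python
import heapq
import random

SLOTS = 24  # assumption: 768 pixels in flight / 32 pixels per warp


def run_batch(start, warps, rng):
    """Run `warps` warps from time `start`; return (finish_time, busy_time)."""
    running = []          # min-heap of finish times of resident warps
    busy = 0
    pending = warps
    end = start
    while pending and len(running) < SLOTS:
        latency = rng.randint(50, 150)   # assumption: arbitrary latency range
        heapq.heappush(running, start + latency)
        busy += latency
        pending -= 1
    while running:
        end = heapq.heappop(running)     # a warp exits
        if pending:                      # a new warp enters only if work remains
            latency = rng.randint(50, 150)
            heapq.heappush(running, end + latency)
            busy += latency
            pending -= 1
    return end, busy


def utilization(batches, warps_per_batch, flush_between, seed=0):
    """Busy fraction of all slots, with or without a drain between batches."""
    rng = random.Random(seed)
    if not flush_between:
        end, busy = run_batch(0, batches * warps_per_batch, rng)
    else:
        end, busy = 0, 0
        for _ in range(batches):
            # The pipeline empties fully before the next batch may enter.
            end, batch_busy = run_batch(end, warps_per_batch, rng)
            busy += batch_busy
    return busy / (SLOTS * end)


if __name__ == "__main__":
    print("flush between batches:", utilization(100, 48, True))
    print("no flush:             ", utilization(100, 48, False))
```

The gap between the two numbers is the bubble: the time slots sit empty between the pipeline starting to drain and new threads entering again.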