It seems like a further evolution though: a visibility buffer based upon culled triangles for multiple views with a lifetime of multiple frames. That's two non-trivial multipliers for rendering efficiency right there.
The main point of our SIGGRAPH paper was our fine-grained GPU cluster culling. We also do multiview culling, use last frame's results as an occlusion hint, and have cluster backface culling similar to what is described in this paper. Deferred texturing suited the GPU-driven pipeline really well. Just like a triangle id buffer, deferred texturing also supports variable rate shading and texture space shading. This deferred texturing technique is actually fully dependent on virtual texturing, making texture space shading/caching techniques a perfect fit for it. We used (virtual) texture space decaling and material blending already in Trials Evolution on Xbox 360 (
http://www.eurogamer.net/articles/digitalfoundry-trials-evolution-tech-interview). These techniques are certainly ready for production use on consoles, but API limitations, such as limited support for multidraw, make them less viable on PC (*).
(*) DirectX 12 needs Windows 10 and doesn't support the Radeon 5000/6000 series or GeForce Fermi (no drivers despite Nvidia's promises). Vulkan has similar GPU limitations. Additionally, Intel doesn't yet have any Windows Vulkan drivers for consumers (only a developer beta), and its commitment to Vulkan is unclear. See here:
https://communities.intel.com/thread/104380?start=30&tstart=0.
Quote (Intel representative): The current Plan Of Record is that Intel® is not supporting Vulkan on Windows drivers. The drivers that were made available on Developer.com are intended for Vulkan developers.
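To make the cluster backface culling mentioned above a bit more concrete, here is a minimal C++ sketch of the usual normal-cone test (in practice this runs per cluster in a GPU compute shader, alongside the frustum and depth-based occlusion tests). The Cluster layout and the way coneCutoff is precomputed are illustrative assumptions, not our actual data layout.

```cpp
#include <cmath>

// Hypothetical per-cluster data: a bounding sphere plus a normal cone
// (average triangle normal + half-angle covering all the cluster's normals).
struct Cluster
{
    float center[3];     // bounding sphere center
    float radius;        // bounding sphere radius
    float coneAxis[3];   // normalized average normal of the cluster's triangles
    float coneCutoff;    // sin(normal cone half-angle), precomputed offline
};

static float dot3(const float a[3], const float b[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Returns true if every triangle in the cluster is guaranteed to face away from
// the camera, so the whole cluster can be skipped. If the direction from the
// camera to the cluster lies deep enough inside the cone around the average
// normal, no triangle in the cluster can be front-facing. The bounding sphere
// radius is added to keep the test conservative.
bool ClusterIsBackfacing(const Cluster& c, const float cameraPos[3])
{
    float toCluster[3] = {
        c.center[0] - cameraPos[0],
        c.center[1] - cameraPos[1],
        c.center[2] - cameraPos[2]
    };
    float dist = std::sqrt(dot3(toCluster, toCluster));
    return dot3(c.coneAxis, toCluster) >= c.coneCutoff * dist + c.radius;
}
```

The same per-cluster granularity is used for the frustum and occlusion tests, so a single compute pass can reject a cluster for any of the three reasons.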
Sure there's rough edges (moving camera, dynamic geometry). I can't tell how close to "game engine ready" this is. I'll have to defer to the developers on that!
We have noticed no issues with a moving camera or dynamic geometry in GPU-driven rendering. GPU-driven rendering handles a fast-moving camera better than CPU culling, as the culling information is up to date (current-frame depth buffer). CPU-based occlusion culling techniques used either last frame's data (reprojection and flickering issues) or software-rasterized low-polygon proxies (another bag of issues and significantly worse culling performance). GPU-driven culling has a huge advantage in shadow map rendering, as the scene depth buffer can be scanned to identify receiving surfaces precisely. This brings big gains over the commonly used cascaded shadow mapping technique. 3x+ gains (1/3 render time) are possible in complex scenes.
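A rough sketch of that depth-buffer scan, written as serial C++ for clarity (in practice it is a compute shader): reconstruct each visible pixel's world position, project it into light space, and mark the shadow-map tiles that actually contain receivers. Shadow casters are then culled against that mask instead of rendering the whole cascade, which is where the shadow rendering savings come from. The matrix conventions (row-major, D3D-style [0,1] depth) and the function itself are illustrative assumptions, not our exact implementation.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct Float4 { float x, y, z, w; };

// Minimal row-major 4x4 matrix * point helper.
static Float4 transform(const float m[16], float x, float y, float z, float w)
{
    return {
        m[0]  * x + m[1]  * y + m[2]  * z + m[3]  * w,
        m[4]  * x + m[5]  * y + m[6]  * z + m[7]  * w,
        m[8]  * x + m[9]  * y + m[10] * z + m[11] * w,
        m[12] * x + m[13] * y + m[14] * z + m[15] * w
    };
}

// Marks which shadow-map tiles contain at least one visible receiver.
// invViewProj reconstructs world positions from the main view depth buffer,
// lightViewProj is the shadow map's view-projection.
std::vector<uint8_t> BuildReceiverMask(const float* depth, int width, int height,
                                       const float invViewProj[16],
                                       const float lightViewProj[16],
                                       int tilesX, int tilesY)
{
    std::vector<uint8_t> tileHasReceiver(tilesX * tilesY, 0);

    for (int y = 0; y < height; ++y)
    for (int x = 0; x < width; ++x)
    {
        float d = depth[y * width + x];
        if (d >= 1.0f)
            continue;                                   // sky / far plane: no receiver

        // Reconstruct the world-space position of this depth sample.
        float ndcX = (x + 0.5f) / width  *  2.0f - 1.0f;
        float ndcY = 1.0f - (y + 0.5f) / height * 2.0f;
        Float4 world = transform(invViewProj, ndcX, ndcY, d, 1.0f);
        world.x /= world.w; world.y /= world.w; world.z /= world.w;

        // Project into light space and find the shadow-map tile it falls in.
        Float4 ls = transform(lightViewProj, world.x, world.y, world.z, 1.0f);
        float u = ls.x / ls.w *  0.5f + 0.5f;
        float v = ls.y / ls.w * -0.5f + 0.5f;
        if (u < 0.0f || u > 1.0f || v < 0.0f || v > 1.0f)
            continue;                                   // outside this shadow map

        int tx = std::min(int(u * tilesX), tilesX - 1);
        int ty = std::min(int(v * tilesY), tilesY - 1);
        tileHasReceiver[ty * tilesX + tx] = 1;
    }
    return tileHasReceiver;
}
```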
It's as if, after Xenos where developers learnt to directly access the MSAA format for their own resolves, AMD decided to keep that concept in GCN knowing that on console the hack could be formalised. And then ignore the consequences for performance on PC.
Their PC compiler could generate shader instructions to load samples directly from a compressed MSAA (CMASK) target without decompression. This should be a win in most cases. They already generate complex sequences of ALU instructions for vertex data interpolation, cubemap fetch, register indexing (older GCN cards don't support register indexing), wave-incoherent resource load/store (DX12 bindless), etc. I fail to see why Texture2DMS::Load couldn't be handled similarly by the compiler. But this is no longer an issue, since Polaris' improved DCC handles MSAA loads natively. The primitive discard accelerator also removes the bottleneck of subpixel triangles (not hitting any sample points), increasing MSAA performance. If you want to see current AMD PC results, test against Polaris.
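To illustrate what such a compiler-generated sequence could look like, here is a conceptual C++ sketch of reading one sample straight from a compressed MSAA surface: consult the clear mask first, otherwise use the fragment mask to map the sample index to the stored fragment and fetch that. The metadata layout, bit packing, and struct are hypothetical; real GCN CMASK/FMASK encodings are tile-based and more involved.

```cpp
#include <cstdint>

// Hypothetical, simplified metadata layout for a compressed MSAA color target.
// Only illustrates the idea of loading a sample without a full decompress pass.
struct CompressedMsaaSurface
{
    const uint8_t*  clearMask;      // 1 bit per tile: tile is fast-cleared
    const uint8_t*  fragMask;       // per pixel: maps each sample to a stored fragment
    const uint32_t* fragments;      // stored fragment colors (often fewer than samples)
    uint32_t        clearColor;     // color used for fast-cleared tiles
    int             width;
    int             samplesLog2;    // e.g. 2 for 4x MSAA
    int             tileSize;       // e.g. 8x8 pixels per metadata tile
};

// Conceptual equivalent of Texture2DMS::Load(pixel, sampleIndex) on compressed data.
uint32_t LoadMsaaSample(const CompressedMsaaSurface& s, int x, int y, int sampleIndex)
{
    // 1. Fast-cleared tile: no per-pixel data exists, return the clear color.
    int tileIndex = (y / s.tileSize) * (s.width / s.tileSize) + (x / s.tileSize);
    if (s.clearMask[tileIndex >> 3] & (1 << (tileIndex & 7)))
        return s.clearColor;

    // 2. Map the sample to the fragment that actually covers it. With 4x MSAA,
    //    a pixel covered by a single triangle stores one fragment and all four
    //    samples point at it.
    int pixelIndex    = y * s.width + x;
    int bitsPerSample = s.samplesLog2;                       // hypothetical packing
    int shift         = sampleIndex * bitsPerSample;
    int fragIndex     = (s.fragMask[pixelIndex] >> shift) & ((1 << bitsPerSample) - 1);

    // 3. Fetch the stored fragment color.
    int samplesPerPixel = 1 << s.samplesLog2;
    return s.fragments[pixelIndex * samplesPerPixel + fragIndex];
}
```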