@sebbbi just a curiosity, but how exactly does the thought process play out for all these new render pipelines and techniques? Is it looking into research? Or just a lot of trial and error with some intuition to guide you?
The thought process goes something like this: What is the minimal amount of data we need to identify each surface pixel on the screen?
A depth-only pass would actually be enough. You just need a function (or a lookup) to transform the (x, y, depth) triplet to a UV (of the virtual texture). You could, for example, use a world-space sparse 3D grid that describes the UV mapping of each (x, y, z) location. However, since you would have N triangles in each grid cell, the grid would require variable-sized data per pixel (meaning a loop + another indirection is needed). So this kind of method would only be practical for volume rendering (or terrain rendering) with some simple mapping such as triplanar. Unfortunately we are not yet rendering volumes (like Media Molecule does), so this idea was scrapped.
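For illustration only, here is a minimal CPU-side sketch of the first half of that idea: reconstructing a world position from an (x, y, depth) triplet with an inverse view-projection matrix, then deriving a simple triplanar UV from it. It assumes column-vector math and a D3D-style [0, 1] depth range; the sparse 3D grid lookup itself is omitted, and all the type/function names are made up for the example.

```cpp
#include <cmath>

struct Float3 { float x, y, z; };
struct Float4 { float x, y, z, w; };
struct Float4x4 { float m[4][4]; };   // row-major storage, column-vector convention

// Multiply a 4D point by a 4x4 matrix and divide by w.
static Float3 TransformPoint(const Float4x4& mat, const Float4& p)
{
    const float v[4] = { p.x, p.y, p.z, p.w };
    float out[4] = { 0, 0, 0, 0 };
    for (int row = 0; row < 4; ++row)
        for (int col = 0; col < 4; ++col)
            out[row] += mat.m[row][col] * v[col];
    return { out[0] / out[3], out[1] / out[3], out[2] / out[3] };
}

// (x, y) = pixel coordinate, depth = depth buffer value in [0, 1].
static Float3 ReconstructWorldPos(int x, int y, float depth,
                                  int width, int height,
                                  const Float4x4& invViewProj)
{
    Float4 clip;
    clip.x = (x + 0.5f) / width  * 2.0f - 1.0f;   // NDC x in [-1, 1]
    clip.y = 1.0f - (y + 0.5f) / height * 2.0f;   // NDC y, flipped for a top-left pixel origin
    clip.z = depth;
    clip.w = 1.0f;
    return TransformPoint(invViewProj, clip);
}

// Simple triplanar mapping: project onto the plane facing the dominant normal axis.
static void TriplanarUV(const Float3& worldPos, const Float3& normal, float& u, float& v)
{
    const float ax = std::fabs(normal.x), ay = std::fabs(normal.y), az = std::fabs(normal.z);
    if (ax >= ay && ax >= az)      { u = worldPos.y; v = worldPos.z; }
    else if (ay >= ax && ay >= az) { u = worldPos.x; v = worldPos.z; }
    else                           { u = worldPos.x; v = worldPos.y; }
}
```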
So in practice, the smallest amount of data you need per pixel is a triangle id. 32 bits is enough for this, and since the triangle id is constant across a triangle, the MSAA trick works perfectly. With 8xMSAA you have 8 samples per real pixel. You don't need multiple samples per pixel, since the triangle id + the implicit screen (x, y) allow you to calculate perfect analytical AA in the lighting. This method works very well with GPU-driven culling, since the culling gives each cluster a number (= the array index of that cluster in the visible cluster list). This way you can enumerate each triangle without needing to double-pass your geometry (and use an atomic counter for visible pixels). Unfortunately this method is incompatible with procedurally generated geometry, such as most terrain rendering implementations and tessellation. Skinning is also awkward. So I scrapped this idea as well. There is actually an Intel paper about this tech (our research was independent of that). I recommend reading it if you are interested.
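To make the numbering concrete, here is a small sketch of how such a 32-bit per-pixel id could be packed from the visible-cluster index (the array index produced by the culling pass) plus the triangle's index inside its cluster. The 7-bit triangle field (128 triangles per cluster) is just an assumption for the example, not the exact layout from our talk.

```cpp
#include <cstdint>

// Assumed split: 25 bits of visible-cluster index, 7 bits of in-cluster triangle index.
constexpr uint32_t kTriangleBits = 7;                        // up to 128 triangles per cluster
constexpr uint32_t kTriangleMask = (1u << kTriangleBits) - 1;

// Pack the visible-cluster index (from the GPU culling pass) and the triangle index.
inline uint32_t PackTriangleId(uint32_t visibleClusterIndex, uint32_t triangleInCluster)
{
    return (visibleClusterIndex << kTriangleBits) | (triangleInCluster & kTriangleMask);
}

// Recover both indices from the 32-bit per-pixel value in the lighting pass.
inline void UnpackTriangleId(uint32_t id, uint32_t& visibleClusterIndex, uint32_t& triangleInCluster)
{
    visibleClusterIndex = id >> kTriangleBits;
    triangleInCluster   = id & kTriangleMask;
}
```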
So we ended up just storing the UV directly. This is efficient since our UV address space for the currently visible texture data is only 8k * 8k texels, thanks to virtual texturing (with a software indirection). This is one reason why we don't use hardware PRT. The other reason is that there is no UpdateTileMappingsIndirect: only the CPU can change the tile mappings. Hopefully we get this in a future API.
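As a rough illustration of why that is cheap: a UV into an 8k * 8k cache fits comfortably into a single 32-bit value. The sketch below packs it as two 16-bit normalized values (8192 texels plus three sub-texel bits per axis); the exact encoding here is just an example, not necessarily our actual G-buffer layout.

```cpp
#include <algorithm>
#include <cstdint>

// u, v in [0, 1) over the 8192 x 8192 virtual texture cache.
inline uint32_t PackCacheUV(float u, float v)
{
    const float scale = 65535.0f;   // 16 bits per axis = 8192 texels * 8 sub-texel steps
    const uint32_t ui = static_cast<uint32_t>(std::min(std::max(u, 0.0f), 1.0f) * scale + 0.5f);
    const uint32_t vi = static_cast<uint32_t>(std::min(std::max(v, 0.0f), 1.0f) * scale + 0.5f);
    return (vi << 16) | ui;
}

inline void UnpackCacheUV(uint32_t packed, float& u, float& v)
{
    u = (packed & 0xFFFFu) / 65535.0f;
    v = (packed >> 16)     / 65535.0f;
}
```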
We just used the same tangent representation as Far Cry did. A tangent frame (instead of a normal) allows anisotropic lighting and parallax mapping in the lighting shader. You can also try to reconstruct the tangent from the UV gradient and the depth gradient (calculated in screen space), but the failed pixels will look horrible. In comparison, a failed gradient calculation only results in bilinear filtering, and that is not clearly noticeable on single-pixel-wide geometry (check the gradient error image in the slides).
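For reference, that screen-space reconstruction looks roughly like this: take the per-pixel gradients of world position and UV and solve a 2x2 system for dP/du. This is the standard tangent-from-derivatives trick, sketched here in plain C++ with made-up helper types; the near-zero determinant case is exactly where the "failed pixels" come from.

```cpp
#include <cmath>

struct Float3 { float x, y, z; };
struct Float2 { float x, y; };

inline Float3 Sub(Float3 a, Float3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
inline Float3 Mul(Float3 a, float s)  { return { a.x * s, a.y * s, a.z * s }; }

// dpdx/dpdy: screen-space gradients of world position.
// duvdx/duvdy: screen-space gradients of UV.
// Returns false when the 2x2 determinant is near zero (the tangent is unreliable).
inline bool ReconstructTangent(Float3 dpdx, Float3 dpdy,
                               Float2 duvdx, Float2 duvdy,
                               Float3& tangent)
{
    const float det = duvdx.x * duvdy.y - duvdy.x * duvdx.y;
    if (std::fabs(det) < 1e-8f)
        return false;
    // Solve dP = T * du + B * dv for T = dP/du.
    tangent = Mul(Sub(Mul(dpdx, duvdy.y), Mul(dpdy, duvdx.y)), 1.0f / det);
    return true;
}
```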