I'd like to challenge your assumption here. Not every byte of VRAM needs to be accessed every frame in order to be useful. Having larger video memory will enable more detailed worlds (no more close-up blurriness) and fewer compromises. It will enable faster loading times (or no loading times at all once everything is in memory) and instant teleporting in big open worlds.
Loading enough data to fill 16 GB of memory takes 4x longer than loading enough to fill 4 GB. Good streaming technology is essential for keeping loading times down. The gap between storage and RAM speed is getting wider every day. If you load everything during the loading screen, you will be waiting considerably longer.
A good streaming system will not keep the up-close detail of anything in memory, except for the surfaces close to the player character. Because the required mip level falls off logarithmically with distance, the area around the player that would ever access the most detailed mip level (mip 0) is very small. The streaming system will of course load data from a longer radius to ensure that it is present when needed, but there's no reason to keep all the highest-detail mip data permanently resident in memory. If you did this in a AAA game, even a 16 GB GPU wouldn't be enough in current games (to produce results identical to a 4 GB GPU with a good streaming system).
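To make that logarithmic falloff concrete, here's a minimal sketch. It assumes a constant world-space texel density and a hypothetical calibration constant mip0Distance (the distance at which one texel covers roughly one pixel); both are illustrative, not numbers from any particular engine.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>

// Required mip level as a function of view distance, assuming constant
// world-space texel density. With a perspective projection the on-screen
// texel footprint grows roughly linearly with distance, so the required
// mip grows logarithmically.
// mip0Distance is a hypothetical calibration constant: the distance at
// which one texel maps to about one pixel (mip 0 is fully needed).
float RequiredMip(float viewDistance, float mip0Distance)
{
    return std::max(0.0f, std::log2(viewDistance / mip0Distance));
}

int main()
{
    const float mip0Distance = 2.0f; // meters, illustrative only
    for (float d : {1.0f, 2.0f, 4.0f, 8.0f, 16.0f, 32.0f, 64.0f})
        std::printf("distance %5.1f m -> mip %.1f\n", d, RequiredMip(d, mip0Distance));
    // Mip 0 is only ever sampled inside a ~2 m radius; every doubling of
    // distance drops one mip level (a quarter of the texel data).
    return 0;
}
```

The point of the sketch: the mip 0 footprint is a tiny bubble around the camera, so keeping full-detail data resident everywhere buys nothing.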
I agree that instant teleporting is a problem for all data streaming systems. However, the flip side is loading everything into memory, which drastically increases level loading times. Contrary to common belief, a very fine grained system (such as virtual texturing) actually handles instant teleporting better than coarse grained streaming systems, because virtual texturing only needs to load the data required to render the current frame. You can load a 1080p frame's worth of texel pages in <200 ms. That still feels instant. With a more coarse grained system (load a whole area), you would need to wait a lot longer. Loading everything at startup is obviously impossible for open worlds: a 50 GB Blu-ray disc's worth of data doesn't fit in memory (and there might be downloadable DLC areas in the game world as well). You need at least some form of streaming. My experience is that fine grained is better than coarse grained. But only a handful of developers have implemented fine grained streaming systems, as the engineering and maintenance effort is huge.
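As a rough sanity check on that <200 ms figure, here's a back-of-the-envelope sketch. Every constant in it (page size, layers per pixel, bytes per texel, disk throughput) is an illustrative assumption, not a number from any particular engine.

```cpp
#include <cmath>
#include <cstdio>

// Back-of-the-envelope estimate of the data needed to texture one 1080p
// frame with a virtual texturing system. All constants are assumptions.
int main()
{
    const double screenTexels   = 1920.0 * 1080.0;  // ~2.07M visible texels
    const double pageTexels     = 128.0 * 128.0;    // typical VT page size
    const double layersPerPixel = 4.0;              // albedo/normal/etc. + partially covered pages
    const double bytesPerTexel  = 1.0;              // block-compressed texture data
    const double diskMBps       = 80.0;             // conservative HDD throughput

    double pages     = std::ceil(screenTexels / pageTexels) * layersPerPixel;
    double megabytes = pages * pageTexels * bytesPerTexel / (1024.0 * 1024.0);
    double seconds   = megabytes / diskMBps;

    std::printf("pages ~%.0f, data ~%.1f MB, load ~%.0f ms\n",
                pages, megabytes, seconds * 1000.0);
    // With these assumptions: ~508 pages, ~8 MB, ~100 ms. Even with generous
    // padding, the unique data needed for a single frame stays small, which
    // is why a page-granular system recovers quickly from an instant teleport.
    return 0;
}
```

Real systems also pay seek latency and load a guard band around the visible set, but the order of magnitude is what matters: one frame's worth of pages is tiny compared to a whole area.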
I have developed several console games (last gen and current gen) that allowed players to create levels containing all game assets (even 2 GB of assets in a single level on a console with 512 MB of memory). We didn't limit asset usage at all. There was a single big game world that contained all the levels. With a fine grained streaming system (including virtual texturing for all texture data) we managed to hit 3-5 second loading times for levels. That is what is possible with a good fine grained streaming system.
I know you're an engine programmer, you live and breathe efficiency, but historically there has never been such a thing as too much memory (640 KB should be enough *cough* *cough*). Having to worry less about memory pressure will give engine programmers such as yourself more time to do other things. At least until those 16 GB become too small once again.
Wasting memory is easy, but it comes with a cost. HBM1, for example, didn't become widely used because it was capped at 4 GB. All current PC games would be fine with 4 GB if memory were used sparingly. But because developers waste memory, products with a larger amount (8 GB) of slower memory are winning the race. The problem is that the faster memory would give improved visuals, since faster memory lets you use better-looking algorithms. Instead we have to settle for a larger amount of slower memory, because memory management isn't being done well. A larger memory size always means the memory has to sit further away from the processing unit, which makes it slower. Larger != no compromise.
Custom memory paging (such as software virtual texturing) and custom fine grained streaming systems are complex and require lots of developer resources and maintenance. This is a bit similar to automated caches vs scratchpad memories (Cell SPUs and GPU groupshared memory vs automated L1/L2/L3 CPU caches): the automated system is a bit less efficient in the worst case (and uses more energy), but requires far less developer work. Hopefully Vega's automated memory paging system brings similar gains to game memory management. A developer could load a huge amount of assets and textures into system RAM without thinking about GPU memory at all, while only the currently active set of memory pages stays resident on the GPU (fully automated). In the best case this is like fully automated tiled resources for everything, with no developer intervention needed. CUDA (Pascal P100) also offers a paging hint API, so you can tell the system in advance when you know that some data will be needed. This is a bit similar to CPU cache prefetch hints: way better than a fully manual system, and you still have just the right amount of control when you need it. This is the future.
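For illustration, here's a minimal host-side sketch of those CUDA 8 / Pascal unified-memory hints (cudaMallocManaged, cudaMemAdvise, cudaMemPrefetchAsync). The buffer size and access pattern are made up for the example; the API calls themselves are the real CUDA runtime entry points.

```cpp
#include <cstdio>
#include <cuda_runtime.h>

// Sketch of the unified-memory paging hints mentioned above. The 256 MB
// "asset" buffer and the access pattern are illustrative assumptions.
int main()
{
    const size_t bytes = 256ull << 20; // 256 MB of "asset" data (illustrative)
    int device = 0;
    cudaGetDevice(&device);

    float* data = nullptr;
    cudaMallocManaged(&data, bytes);   // migratable between CPU and GPU

    // Fill on the CPU; pages are resident in system RAM at this point.
    for (size_t i = 0; i < bytes / sizeof(float); ++i)
        data[i] = 0.0f;

    // Hints: this data is mostly read by the GPU, so mark it read-mostly and
    // migrate it ahead of time (similar spirit to a CPU cache prefetch hint).
    cudaMemAdvise(data, bytes, cudaMemAdviseSetReadMostly, device);
    cudaMemPrefetchAsync(data, bytes, device, 0);

    // ... kernels reading 'data' would be launched here. Pages the GPU never
    // touches never migrate, and touched pages fault in automatically even if
    // the prefetch hint is skipped.

    cudaDeviceSynchronize();
    cudaFree(data);
    std::printf("done\n");
    return 0;
}
```

The appeal is the division of labour: the fault-driven paging works correctly with no hints at all, and the hints are optional performance advice rather than correctness requirements, which is exactly the "just the right amount of control" point above.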