I think it's a good idea for the same reason virtual memory in an OS is a ridiculously good idea. All of the same principles apply, and adding the GPU on to the cache hierarchy is smart!
Whether to expose explicit control to the application is debatable, and I have no strong opinion. However, GL has shot itself in the foot numerous times by trying to "infer" too much about what the user is doing, to the point that there is no clear "fast path" any more... stupid subtle changes can cause chaotic performance. OpenGL 3.0 LM is *supposed* to fix that (by getting rid of and/or layering all of the cruft), but I have no idea if/when it is coming.
I disagree; OS virtual memory is a totally different beast.
On a CPU, programs operate on raw addresses, so dedicated hardware is needed in the CPU to implement memory paging in a way that is invisible to the running program.
Graphics applications, however, use a (relatively) high-level API. They don't see any raw addresses, just objects such as textures and framebuffers (we can ignore locking, as it's irrelevant to this discussion). This allows a 3D API to hide memory management from the user completely, purely in software (unlike OS virtual memory).
The only situation in which dedicated paging hardware might make a difference is the granularity of the paging. In the software method (which has been available for ages), the granularity of a swapped memory block is an entire texture. Dedicated paging hardware on board the GPU could reduce the granularity to n*n pixel blocks. This would let you upload, say, a 1 GB texture, let it be swapped out, and then render a scene that accesses only a tiny portion of this giant texture, paging in just the blocks that are actually touched.
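To put rough numbers on the granularity argument, here is a minimal sketch. Everything in it is an assumption made for illustration: a 16384x16384 RGBA8 texture (exactly 1 GiB), n = 256 for the block size, and a scene that samples a single 512x512 texel rectangle. The helper names `pages_touched` and `resident_bytes` are hypothetical and not part of any API.

```python
def pages_touched(x0, y0, width, height, page_size):
    """Return the (page_x, page_y) indices of every page_size*page_size
    block overlapped by the texel rectangle starting at (x0, y0)."""
    first_px, last_px = x0 // page_size, (x0 + width - 1) // page_size
    first_py, last_py = y0 // page_size, (y0 + height - 1) // page_size
    return [(px, py)
            for py in range(first_py, last_py + 1)
            for px in range(first_px, last_px + 1)]


def resident_bytes(num_pages, page_size, bytes_per_texel):
    """Memory needed to keep num_pages square blocks resident."""
    return num_pages * page_size * page_size * bytes_per_texel


# Assumed example values, not taken from any real driver or card.
TEX_SIDE = 16384          # texels per side
BYTES_PER_TEXEL = 4       # RGBA8
PAGE = 256                # n in "n*n pixel blocks"

whole_texture = TEX_SIDE * TEX_SIDE * BYTES_PER_TEXEL
touched = pages_touched(1000, 2000, 512, 512, PAGE)
paged = resident_bytes(len(touched), PAGE, BYTES_PER_TEXEL)

print("whole-texture granularity:", whole_texture, "bytes")
print("block granularity:", len(touched), "blocks,", paged, "bytes")
```

With whole-texture granularity the full 1 GiB has to be resident before the draw; with block granularity only the blocks overlapping the sampled rectangle do. Mipmapping and filtering footprints are ignored here, so treat the figures as an order-of-magnitude illustration only.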
However, this was already done a long time ago (so I still don't see what's so special about Vista's solution).
(quote from http://www.thecomputershow.com/computershow/hardware/permedia3.htm)
Additionally, PERMEDIA 3 is expected to be the first graphics processor to offer Virtual Texturing - a capability that automatically manages optimal placement of textures in system and local graphics memory. The PERMEDIA 3 architecture incorporates a demand-page texture sub-system that causes a dedicated DMA unit to download 256x256 pages of textures to local memory when they are first accessed. This will allow software developers to straightforwardly load all textures into system memory, while the hardware autonomously maintains an optimal working set of texture pages cached in available local graphics memory for maximum performance. Virtual Texturing will allow execution of textures from system memory in PCI, as well as AGP systems, provides optimized use of backplane bandwidth and avoids local texture memory fragmentation through virtual to physical texture address mapping.
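For anyone who wants to see the mechanism in the quote spelled out, here is a toy software model of demand-paged texturing with virtual-to-physical page mapping. It is only a sketch: the 256x256 page size comes from the quote, but the LRU replacement policy, the class name `PagedTextureCache`, and the `translate` method are my own assumptions, and nothing here claims to reflect how the PERMEDIA 3 hardware actually works internally.

```python
from collections import OrderedDict

PAGE_SIZE = 256  # texels per page side, as stated in the quoted description


class PagedTextureCache:
    """Toy model: local graphics memory is a fixed number of page slots;
    textures live in system memory and pages are fetched on first access."""

    def __init__(self, local_page_slots):
        # (texture_id, page_x, page_y) -> physical slot, kept in LRU order.
        self.page_table = OrderedDict()
        self.free_slots = list(range(local_page_slots))
        self.downloads = 0  # stands in for DMA transfers from system memory

    def translate(self, texture_id, u, v):
        """Map a virtual texel address to (slot, u_in_page, v_in_page),
        downloading the page first if it is not resident."""
        key = (texture_id, u // PAGE_SIZE, v // PAGE_SIZE)
        if key in self.page_table:
            self.page_table.move_to_end(key)       # mark as recently used
        else:
            if self.free_slots:
                slot = self.free_slots.pop()
            else:
                # Assumed policy: evict the least recently used page.
                _, slot = self.page_table.popitem(last=False)
            self.page_table[key] = slot
            self.downloads += 1                    # "DMA" the page in
        return self.page_table[key], u % PAGE_SIZE, v % PAGE_SIZE


cache = PagedTextureCache(local_page_slots=2)
cache.translate(0, 10, 10)    # page (0, 0): first access, downloaded
cache.translate(0, 20, 30)    # same page: already resident
cache.translate(0, 300, 10)   # page (1, 0): downloaded
cache.translate(0, 600, 10)   # page (2, 0): evicts the LRU page (0, 0)
print("pages downloaded:", cache.downloads)
```

Because any page can land in any free slot, local memory never fragments in this model, which is the same benefit the quote attributes to virtual-to-physical texture address mapping.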