That's not the definition of preemption: if VRAM were totally controlled by an application, the OS would stall until the application released it, which would require the application to service other processes' requests. That would result in application complexity, performance degradation, and a pile of security issues.
In the context of a more console-like model, the memory requested by the game would be mostly exempt from the demand-paging pressure of other processes once the allocation was spun up. Operating systems can exempt or constrain paging for ranges of memory, though outside of specific purposes it's kept to a minimum. If there are specific transition points, like the game being pushed to the background, closing, or hitting an error, the OS could offer firmer promises provided the initial spin-up can be completed.
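To be clearer about what I mean by exempting a range from paging, here's a minimal POSIX-style sketch of a process pinning one of its own buffers so the kernel won't page it out. It's only a CPU-side analogue of the idea (VRAM residency on Windows is managed by the graphics stack, not by mlock), and the 64 MiB size is just a placeholder:

```c
/* Minimal sketch: pin a buffer so the kernel will not page it out.
 * CPU-side analogue only; not a VRAM-residency mechanism. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t size = 64 * 1024 * 1024;          /* 64 MiB placeholder "reservation" */
    void *buf = NULL;

    if (posix_memalign(&buf, (size_t)page, size) != 0)
        return 1;
    memset(buf, 0, size);                    /* touch the pages so they are resident */

    if (mlock(buf, size) != 0) {             /* exempt the range from paging */
        perror("mlock");                     /* often limited by RLIMIT_MEMLOCK */
        free(buf);
        return 1;
    }

    /* ... the pinned buffer stays resident until munlock()/exit ... */

    munlock(buf, size);
    free(buf);
    return 0;
}
```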
QoS has nothing to do with this, since it relates to prioritizing network traffic and packet switching.
I was going with a more informal use of QoS, which has been used to describe the level of performance in shared services or hardware outside of networking. For compute hardware, HSA 1.1 included QoS changes unrelated to networking, and discussions about shared or virtualized GPUs have used the term for the time-slice and resource-allocation adjustments made between separate clients.
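As a toy illustration of that informal sense of QoS, think of dividing a fixed scheduling period across clients by weight. The names, weights, and 10 ms period below are made up for the example; this is not any vendor's actual GPU scheduler:

```c
/* Hypothetical weighted time-slice split of a scheduling period
 * between clients sharing a GPU. Illustrative only. */
#include <stdio.h>

#define PERIOD_US 10000                /* 10 ms scheduling period (assumed) */

struct client {
    const char *name;
    unsigned weight;                   /* relative share of GPU time */
};

int main(void)
{
    struct client clients[] = {
        { "foreground_game", 6 },      /* e.g. a protected foreground budget */
        { "system_shell",    1 },
        { "background_app",  1 },
    };
    unsigned n = sizeof clients / sizeof clients[0];

    unsigned total = 0;
    for (unsigned i = 0; i < n; i++)
        total += clients[i].weight;

    for (unsigned i = 0; i < n; i++) {
        unsigned slice = PERIOD_US * clients[i].weight / total;
        printf("%-16s gets %u us of each %u us period\n",
               clients[i].name, slice, PERIOD_US);
    }
    return 0;
}
```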
The Game Mode performance increase is a myth; it's more about asking UWP applications to go to the background, and it's only a mitigation for systems with tons of Store crapware running. The system already reclaims UWP apps' VRAM and memory if needed.
I was not claiming that it was happening now, but speculating as to whether it could be combined with the advancing features for sharing and virtualization in GPUs to allow something like this to have benefits.
VM encapsulation like on the Xbox is just used for DRM and nothing more. The DX kernel already gives maximum priority to the foreground application.
The OS cannot promise anything, since a PC is not a console: every PC has its own unique combination of hardware and software, and on console this is done more for DRM reasons than for performance. Even on console, no one wants another PS3-style memory management scheme.
Security and DRM are major drivers of the Xbox One's virtualized setup, but the console also leverages the setup to provide some enforced resource and time budgets to developers. With the game in the foreground, at least 6 cores and 5 GB of DRAM are allocated to the game partition. The reservation's memory is not subject to paging, and the isolation of the game and system partitions also allows for some rather stringent promises on the percentage of time the game will be given use of the GPU.
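For a rough feel of what that kind of reservation looks like in ordinary OS terms, here's a Linux-flavored sketch that confines a process to six cores and locks its memory against paging. It's only an analogue: the Xbox One enforces its budgets below the guest OS via the hypervisor, and the core numbering here is an assumption:

```c
/* Rough Linux analogue of a core/memory reservation: restrict the process
 * to a fixed set of cores and lock its memory so it is not paged out.
 * This is not the Xbox One mechanism, just an illustration of the idea. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    for (int cpu = 0; cpu < 6; cpu++)   /* "game" cores 0-5 (assumed layout) */
        CPU_SET(cpu, &set);

    /* Pin this process to the chosen cores. */
    if (sched_setaffinity(0, sizeof set, &set) != 0)
        perror("sched_setaffinity");

    /* Lock current and future allocations so they are not paged out,
     * mirroring the "reservation's memory is not subject to paging" point. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        perror("mlockall");             /* typically needs CAP_IPC_LOCK */

    /* ... game/work loop would run here ... */
    return 0;
}
```

Note that affinity alone only confines this process to those cores; it doesn't keep anything else off them. A real reservation would need something like cgroup cpusets or hypervisor-level partitioning, which is exactly what the console's setup provides.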
Additionally, the partitioned system allows for independent OS versioning for the console's application partition versus the OS version seen by a given game.
There would be specific transition points, like switching between foreground and background, and adjustments to the time slice and allocation based on the version of the SDK a game opts to use. Other violations of the granted resources and their accessibility typically revolve around game-ending or crash scenarios, such as long-running compute failing to yield at the end of the game partition's GPU time slice. The latter problem could potentially be handled better by modern hardware, with features like SR-IOV and the preemption capabilities the CI-era architectures lacked.
It's not the only way to go about this, since the PS4 has similar partitioning of CPUs and memory without a hypervisor and multiple guest operating systems.
Whether it's a significant enough use case in PC hardware to justify the effort is uncertain, but making stronger guarantees than the developer's best effort is already possible for servers. A hypervisor can allocate cores and memory to an instance and leave that allocation's physical resources off-limits to other VMs.
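As one server-side example, libvirt exposes calls for pinning a guest's vCPUs to specific host cores. The sketch below is hypothetical (the domain name, core numbers, and host CPU count are assumptions), and the memory side of a real setup, locked or hugepage-backed guest RAM, would be handled in the domain definition rather than through this call:

```c
/* Sketch using the libvirt C API: pin a guest's vCPUs to dedicated host
 * cores so other VMs scheduled elsewhere do not share them. */
#include <stdio.h>
#include <string.h>
#include <libvirt/libvirt.h>

int main(void)
{
    virConnectPtr conn = virConnectOpen("qemu:///system");
    if (!conn)
        return 1;

    virDomainPtr dom = virDomainLookupByName(conn, "game-server"); /* assumed name */
    if (!dom) {
        virConnectClose(conn);
        return 1;
    }

    unsigned char cpumap[VIR_CPU_MAPLEN(16)];   /* assumes <= 16 host CPUs */
    int maplen = (int)sizeof cpumap;

    /* Map vCPUs 0..3 onto host cores 8..11; true exclusivity also requires
     * keeping other workloads off those cores. */
    for (unsigned int vcpu = 0; vcpu < 4; vcpu++) {
        memset(cpumap, 0, sizeof cpumap);
        VIR_USE_CPU(cpumap, 8 + vcpu);
        if (virDomainPinVcpu(dom, vcpu, cpumap, maplen) != 0)
            fprintf(stderr, "failed to pin vcpu %u\n", vcpu);
    }

    virDomainFree(dom);
    virConnectClose(conn);
    return 0;
}
```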