Stability isn't the only important thing; smooth convergence matters too. When you suddenly have to get close to the final solution inside the foveation point, you still can't afford to flicker from the previous solution ... you should go from blurred to sharp. A poorly "denoised", aliased result isn't necessarily the same as blurred.
I would solve it this way: shade at higher (coarser) mip levels if out of focus (or out of view, occluded etc.). When a region comes into focus (or view), upsample the coarser mips to fill the finer ones, and increase detail over time with a simple exponential average filter.
I do it this way with my GI stuff and it works fine for this kind of low-frequency data, but I don't know how it would work for the full image, especially for specular.
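To make the blurred-to-sharp idea concrete, here is a minimal toy sketch (my own illustration, not the actual GI code): a fine-mip texel is seeded from the upsampled coarse mip and then blended toward freshly shaded samples with an exponential average, so detail appears gradually instead of flickering in from a stale solution.

```python
# Toy sketch of exponential-average refinement after a region comes into
# focus. All names and numbers are illustrative assumptions.

def refine(coarse_value, shade, alpha=0.2, frames=30):
    """coarse_value: upsampled value from the coarser mip (the blurred start).
    shade: callable returning a freshly shaded sample each frame (sharp, maybe noisy).
    alpha: per-frame blend weight; higher converges faster but is less stable."""
    value = coarse_value  # start blurred, never from the previous (wrong) solution
    for _ in range(frames):
        value += alpha * (shade() - value)  # exponential moving average
    return value

# Example: the coarse mip says 0.5, the true fine-detail signal is 1.0;
# after ~30 frames the texel has smoothly converged close to 1.0.
result = refine(0.5, lambda: 1.0)
```

With noisy samples the same filter also averages the noise down over time, which is why it suits low-frequency GI data well.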
The topic 'object space shading' is very broad. (I adopt the term over texture space shading because most people use it now after the Oxide talk.)
I see those options:
* Store just irradiance and combine with material when building the frame, or store radiance with material already applied?
The former is surely better in the foveated scenario discussed here, and it also allows shading at lower than texture resolution in general. Since normal maps are usually the highest-resolution textures, baking the material in at a lower shading resolution would mean quite a loss of detail. If you have denoising in mind, however, the former is the only option.
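A minimal sketch of the "store irradiance, apply material at frame build" option (names and values are my own assumptions): the cache holds possibly low-resolution irradiance, while albedo and normal-map detail stay at full texture resolution and are applied per frame, so material detail survives even when shading resolution drops.

```python
# Two neighboring pixels share one cached (low-res) irradiance texel,
# but still differ through their full-resolution material term.

def build_pixel(cached_irradiance, albedo):
    """cached_irradiance: fetched from the (possibly low-res) shading cache.
    albedo: full-resolution material color for this pixel."""
    return albedo * cached_irradiance  # material applied at frame build

p0 = build_pixel(0.8, albedo=0.9)  # same irradiance texel ...
p1 = build_pixel(0.8, albedo=0.2)  # ... different full-res albedo
```

Storing radiance instead would fold the albedo into the cache, tying shading resolution to material resolution.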
* Store just the stuff in frustum, or store the full environment around the camera?
I think the former is the usual assumption, e.g. in Sebbi's overview given here some time back, leading to a guess of about 1.3 times the shading area.
But the latter could still use lower LOD behind the camera, and the information would still be guaranteed to be there if requested.
The latter also becomes more interesting if shading is really expensive. It is what I have in mind when I talk about it, but the memory / shading requirement of maybe up to 8 times more makes it rather unattractive.
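A back-of-envelope check (my own numbers, assuming a 90-degree-FOV frustum roughly equals one cube-map face) of where a figure like "up to 8 times" could come from: the full environment is six such faces, and applying the ~1.3x overshading guess to each gives roughly 6 * 1.3, i.e. close to 8x the cost of a single view.

```python
# Rough cost estimate for full-environment caching vs. frustum-only caching.
# Both inputs are assumptions, not measured data.

frustum_fraction = 1 / 6   # one 90-degree cube face out of six
overshade = 1.3            # cached shading area vs. visible area (guess)
full_env_cost = overshade / frustum_fraction  # ~7.8, close to 8x
```

Lower LOD behind the camera would pull this number down, which is the main argument for the full-environment variant being less hopeless than the raw factor suggests.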
* Store just the diffuse term, or diffuse and specular?
Can specular be cached at all without looking bad? Perhaps use a high-res frustum model for specular and a low-res environment model for diffuse?
Gains complexity, but starts to make sense...
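A hedged sketch of the diffuse/specular split (all names, and the Blinn-Phong-style specular lobe, are my own illustration): diffuse is view-independent, so it can come from a cache that may be stale or low-res, while specular depends on the view direction and is recomputed per frame.

```python
import math

def shade(cached_diffuse, normal, view, light, shininess=32.0):
    """cached_diffuse: view-independent term fetched from the cache.
    normal, view, light: unit vectors as 3-tuples; specular is per-frame."""
    # Half vector for a Blinn-Phong style specular lobe (view-dependent).
    half = tuple(v + l for v, l in zip(view, light))
    norm = math.sqrt(sum(h * h for h in half)) or 1.0
    half = tuple(h / norm for h in half)
    ndoth = max(0.0, sum(n * h for n, h in zip(normal, half)))
    specular = ndoth ** shininess
    return cached_diffuse + specular  # cached term + fresh view-dependent term

# Head-on view over a lit surface: full specular highlight on top of the cache.
lit = shade(0.3, (0, 0, 1), (0, 0, 1), (0, 0, 1))
```

The appeal is that only the cheap, high-frequency specular term has to track the camera, while the expensive GI lives in the diffuse cache.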
This separation also makes sense if we think about cloud gaming: for a multiplayer game the diffuse part could be shared, and multiple servers could calculate accurate GI more easily.
Btw, my personal vision of cloud gaming has always been this: stream diffuse lightmaps and texture / model data, but build the final frame on a thin client (smartphone-class, low-cost HW). This way the latency problem could be solved.
I still think this would be 'cloud gaming done right', and it would also enable VR/AR, but the problem is how to calculate specular on a thin client. There are surely options, but photorealism is likely not possible.
There is a common belief that diffuse GI is much more expensive than specular reflections, but this is true only for special cases like perfect mirrors or no specular at all. I think specular will turn out to be more expensive in the long run.
This is also the main argument that could convince me of a need for FF RT, and a point where I disagree with many game developers who say reflections are not soooo important or could be faked / approximated.