Why is a high MIP something you'd pull on demand? As in, why is that more latency tolerant? The issue isn't BW but the time between sampler feedback reporting during texture sampling (as the object is being drawn), "I need a higher LOD on this texture," and that texture sampler actually getting the new texture data from the SSD.
Texturing on GPUs is only fast and effective because the textures are pre-loaded into the GPU caches for the texture samplers to read. The regular 2D data structure and data access make caching very effective. The moment texture data isn't in the texture cache, you have a cache miss and stall until the missing texture data, many nanoseconds away, is loaded. If even that stall hurts, fetching the data from SSD at that point is clearly an impossible ask.
The described systems included mip mapping and feedback to load and blend better data in subsequent frames. You want to render a surface. The required LOD isn't in RAM so you use the existing lower LOD to draw that surface, and start the fetching process. When the higher quality LOD is loaded a frame or two later, you either have pop-in or you can blend between LOD levels, aided by SFS if that is present.
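To make that flow concrete, here's a minimal sketch of the "draw with what you have, fetch, then blend" loop. Everything in it is hypothetical and only for illustration: the `StreamedTexture` class, the two-frame load latency and the four-frame cross-fade are assumptions, not how any particular console or API does it. Mip 0 is the highest detail.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class StreamedTexture:
    """Hypothetical model of one texture whose mips stream in over frames."""
    resident_mip: int                # finest mip currently in RAM (0 = best)
    load_latency_frames: int = 2     # ASSUMED storage-to-RAM latency, in frames
    blend_frames: int = 4            # ASSUMED cross-fade length, in frames
    pending: Dict[int, int] = field(default_factory=dict)  # mip -> frames left
    blend_from: Optional[int] = None  # old mip we are fading away from
    blend_t: float = 1.0              # weight of the new mip during a fade

    def sample(self, wanted_mip: int) -> Tuple[int, int, float]:
        """Return (old_mip, new_mip, weight_of_new) for this frame. Never waits."""
        # Feedback step: a finer mip is wanted but not resident, so queue a load.
        if wanted_mip < self.resident_mip and wanted_mip not in self.pending:
            self.pending[wanted_mip] = self.load_latency_frames
        if self.blend_from is None:
            # No fade in progress: draw with whatever is resident right now.
            return (self.resident_mip, self.resident_mip, 1.0)
        return (self.blend_from, self.resident_mip, self.blend_t)

    def end_frame(self) -> None:
        """Advance any running fade and any in-flight loads by one frame."""
        if self.blend_from is not None:
            self.blend_t = min(1.0, self.blend_t + 1.0 / self.blend_frames)
            if self.blend_t >= 1.0:
                self.blend_from = None  # fade finished
        for mip in list(self.pending):
            self.pending[mip] -= 1
            if self.pending[mip] <= 0:
                del self.pending[mip]
                if mip < self.resident_mip:
                    # Finer data arrived: start fading from the old mip to it.
                    self.blend_from = self.resident_mip
                    self.resident_mip = mip
                    self.blend_t = 0.0


if __name__ == "__main__":
    tex = StreamedTexture(resident_mip=3)
    for frame in range(7):
        print(frame, tex.sample(wanted_mip=1))
        tex.end_frame()
```

The point of the sketch is that `sample` never blocks. The surface is always drawn with resident data, and the better mip only shows up (and gets faded in) in later frames.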
When it comes to mid-frame loads as described in that theoretical suggestion in the earlier interview (things to look into for the future), we'd be talking about replacing data that's no longer needed this frame. There's no way fetching data from storage mid-render is ever going to happen on anything that's not approaching DRAM speeds. The latencies are just too high.
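As a rough back-of-the-envelope on why, here's the arithmetic with placeholder numbers. The latencies passed in below (about 100 ns for a DRAM access, about 100 µs for an SSD read) are order-of-magnitude assumptions of mine, not measurements of any specific hardware, and the function names are made up for this sketch. Swap in your own figures.

```python
def stall_ratio(storage_latency_ns: int, dram_latency_ns: int) -> int:
    """How many times longer a sampler would stall on storage than on DRAM."""
    return storage_latency_ns // dram_latency_ns


def serial_fetches_per_frame(frame_ns: int, latency_ns: int) -> int:
    """How many back-to-back (dependent) fetches fit in one frame's time."""
    return frame_ns // latency_ns


if __name__ == "__main__":
    ASSUMED_DRAM_NS = 100         # assumption: ~100 ns per DRAM access
    ASSUMED_SSD_NS = 100_000      # assumption: ~100 microseconds per SSD read
    FRAME_NS = 16_666_667         # one frame at 60 fps, in nanoseconds

    print(stall_ratio(ASSUMED_SSD_NS, ASSUMED_DRAM_NS))
    print(serial_fetches_per_frame(FRAME_NS, ASSUMED_DRAM_NS))
    print(serial_fetches_per_frame(FRAME_NS, ASSUMED_SSD_NS))
```

Under those assumptions each miss that goes to storage costs on the order of a thousand DRAM misses, which is why a stalled sampler can't wait on it.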