What you're describing is what AI reconstruction does before frame generation. Frame gen already analyzes multiple full frames to predict what the in-between frames should look like when it generates them. Could ML incorporate something like a lower resolution rendered frame to help in the frame gen process? Perhaps render an intermediate frame at 360p and use the data from it to help generate the rest of the frame.
Reconstruct from lower res frames > analyze full frames > generate in-between frames
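
To make the ordering of that pipeline concrete, here's a toy sketch in Python. It is not how any real upscaler or frame gen works: the nearest-neighbour upscale is just a stand-in for AI reconstruction, the linear blend is a stand-in for the ML frame generation step, and the function names (`reconstruct`, `generate_between`, `pipeline`) are made up for illustration.

```python
import numpy as np

def reconstruct(low_res, scale):
    # Stand-in for AI reconstruction (assumption): plain nearest-neighbour upscale.
    return np.repeat(np.repeat(low_res, scale, axis=0), scale, axis=1)

def generate_between(prev_full, next_full, t=0.5):
    # Stand-in for frame generation (assumption): linear blend of two full frames.
    # Real frame gen also uses motion data; this only shows where the step sits.
    return (1.0 - t) * prev_full + t * next_full

def pipeline(low_res_frames, scale):
    # Step 1: reconstruct every rendered low-res frame to full resolution.
    full = [reconstruct(f, scale) for f in low_res_frames]
    if not full:
        return []
    out = []
    # Steps 2 and 3: look at pairs of full frames, insert a generated frame between them.
    for a, b in zip(full, full[1:]):
        out.append(a)
        out.append(generate_between(a, b))
    out.append(full[-1])
    return out
```

Feeding it two rendered frames gives back three: the two reconstructed ones plus one generated in between. The generated frame only ever sees the already-reconstructed full frames, which is the order described above.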