Actually it's not impossible; it's quite easy, because with 3D data you have exact information about the motion vector of each pixel. You know the movement (and acceleration) and rotation (and angular acceleration) of your objects and your camera. With this information you can do cheaper and more correct motion estimation than current codecs and HDTVs. The biggest problem is frame latency and stuttering, because this has to be done in real time. The motion-estimated frames are much cheaper to render (around 10x in my testing scenario) than the real frames, which causes noticeable stuttering unless I queue the frames. Queuing frames, however, causes noticeable control latency (much like AFR SLI setups).
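The per-pixel motion vector the poster mentions can be derived by reprojecting each pixel with the previous frame's camera transform. Below is a minimal, hypothetical sketch of that reprojection for static geometry only, assuming GLM for the math, a [0,1] depth buffer, and OpenGL-style clip space; the function and parameter names are illustrative, not the poster's actual code.

```cpp
#include <glm/glm.hpp>  // assumption: GLM is used for vector/matrix math

// Hypothetical sketch: given a pixel's UV and depth plus the current frame's
// inverse view-projection and the previous frame's view-projection, find where
// that pixel was last frame. The screen-space difference is the motion vector
// used to extrapolate a cheap in-between frame. Dynamic objects would also
// need their previous model matrices, which this sketch ignores.
glm::vec2 MotionVector(glm::vec2 uv, float depth,
                       const glm::mat4& invViewProjCurr,
                       const glm::mat4& viewProjPrev)
{
    // Reconstruct the pixel's clip-space position in the current frame
    // (assuming a [0,1] depth buffer and OpenGL-style NDC).
    glm::vec4 clip(uv * 2.0f - 1.0f, depth * 2.0f - 1.0f, 1.0f);

    // Unproject to world space.
    glm::vec4 world = invViewProjCurr * clip;
    world /= world.w;

    // Project into the previous frame's clip space.
    glm::vec4 prevClip = viewProjPrev * world;
    prevClip /= prevClip.w;
    glm::vec2 prevUv = glm::vec2(prevClip) * 0.5f + 0.5f;

    // Screen-space motion between the two frames.
    return uv - prevUv;
}
```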
So it's quite easy yet causes stuttering or control latency. Sounds like it's not so easy at all. And how does motion estimation compensate for lighting and perspective differences?
Also, on deferred rendering systems you can motion-compensate only the g-buffer creation while recalculating the lighting every frame. But if you only insert one extrapolated frame between rendered frames (like I do), the lighting error between two frames is usually not noticeable. In 3D rendering, however, you can detect these unusual scenarios: if the camera moves too much in one frame, or some light turns on or off, you can do the motion estimation more precisely for that frame (or just render a real frame instead).
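That fallback decision could be sketched roughly as follows; the struct, thresholds, and function names here are hypothetical illustrations, not code from the posts.

```cpp
#include <cmath>

// Hypothetical per-frame decision: extrapolate the next frame from motion
// vectors, or fall back to a fully rendered frame when the scene has changed
// too much for extrapolation to look right.
struct FrameState {
    float cameraTranslation;  // camera movement since last real frame (world units)
    float cameraRotation;     // camera rotation since last real frame (radians)
    bool  lightingChanged;    // a light was toggled, added, or removed
};

bool ShouldRenderRealFrame(const FrameState& s)
{
    // Thresholds are made up for illustration; in practice they would be tuned.
    const float kMaxTranslation = 0.5f;
    const float kMaxRotation    = 0.1f;

    if (s.lightingChanged) return true;                      // lighting error would be visible
    if (s.cameraTranslation > kMaxTranslation) return true;  // too much disocclusion
    if (std::fabs(s.cameraRotation) > kMaxRotation) return true;
    return false;                                            // safe to extrapolate
}
```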
In any event, I disagree with your claims. If the framerate is high, the deltas between frames may be small, but if framerates are so high, then you probably don't need to use this mechanism at all. If framerates are low, then the deltas will be higher, causing differences in lighting, position, etc. to be more noticeable. Also, the data stored in the frame and depth buffers are only approximations to what was actually sent down the 3D pipe. How would you get reasonable antialiasing by reusing data from a previous frame?
Films have an advantage that makes them more amenable to MPEG-type compression: motion blur. Since things in motion are blurred anyway, you can get away with interpolation. In 3D graphics, motion blur is (currently) a post-processing effect, not a natural feature of the rendering process.
Lux_ said:
I agree that, as of today, there probably are limitations in current APIs and hardware that make it not worth the effort: how to manage some kind of general data structure that keeps track of changes, and how to sync it between GPU and CPU.
Yet this approach is already in use on a smaller scale. For example (if I remember correctly), Crysis uses extrapolation for some lighting calculations and recalculates only every N frames, or when something significant happens. Also, instancing is a different face of the same cube.
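The every-N-frames pattern Lux_ describes might look something like the sketch below; this is a generic illustration with made-up names, not Crysis code.

```cpp
// Hypothetical sketch of amortizing an expensive calculation over N frames:
// reuse a cached result most of the time, and recompute it every N frames
// or when something significant changes in the scene.
class AmortizedEffect {
public:
    explicit AmortizedEffect(int interval) : interval_(interval) {}

    // 'sceneChangedSignificantly' would be set by the engine when, e.g.,
    // a light is toggled or the camera jumps.
    void Update(bool sceneChangedSignificantly)
    {
        ++framesSinceUpdate_;
        if (sceneChangedSignificantly || framesSinceUpdate_ >= interval_) {
            RecomputeExpensiveResult();  // full recalculation
            framesSinceUpdate_ = 0;
        }
        // Otherwise keep using (or extrapolating from) the cached result.
    }

private:
    void RecomputeExpensiveResult() { /* expensive lighting work goes here */ }

    int interval_;
    int framesSinceUpdate_ = 0;
};
```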
Sure, you can choose not to update some render-to-texture effect, but then you're sacrificing some quality for performance.
If the quality isn't identical, then it's not relevant. There are plenty of ways to speed things up if performance is all you care about; how about applications using simpler shaders/textures every other frame? If quality is your concern, and it should be since graphics cards are expensive, then you shouldn't settle for compromises on image quality.
-FUDie