Sure, but in the case of Killzone, you don't have a static image of "2488246086" across 10 pixels; you have 5 pixels, because the native buffer is that size. Are you telling me, then, that you can somehow pick the "perfect" pixels to be interlaced when you don't even have the actual image to begin with?
If you have the actual image, won't you just output at native 1080p60 then?
You don't have the actual full-res image rendered in a single field, no, but you have the actual scene which you're rendering from. The odd field samples the odd locations of the scene, which gives you the odd values in the full-scene dataset. Ditto for the even field giving you the even values in the full-scene dataset.
Eh, maybe this will make more sense if I show it graphically. Time to bust out the MS Paint.
So, here's your scene prior to actually being rendered: a continuous signal across x, which happens to pass through the values 2, 4, 8, 8, 2, 4, 6, 0, 8, 6 at x=1 through x=10. Let's suppose that the scene is not in motion, so it doesn't change from frame to frame.
Now, if we carry out a full-res render (10 sample points evenly spaced from x=1 through x=10), we sample at x=1, 2, 3, ..., 10. So our final "rendered" dataset is 2, 4, 8, 8, 2, 4, 6, 0, 8, 6. Which is of course "2488246086."
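If you'd rather poke at this in code than in MS Paint, here's a minimal Python sketch. The scene() helper and its name are mine, and I'm modeling the scene as a linear interpolation between the ten example values, which is an assumption, not anything Guerrilla actually does:

```python
# Hypothetical model of the example scene: a continuous 1-D signal we can
# sample at any real x, built by linearly interpolating the ten known
# values of "2488246086" at x = 1 through x = 10.
SCENE_VALUES = [2, 4, 8, 8, 2, 4, 6, 0, 8, 6]

def scene(x):
    """Sample the scene at any real x in [1, 10] via linear interpolation."""
    if x <= 1:
        return SCENE_VALUES[0]
    if x >= 10:
        return SCENE_VALUES[-1]
    i = int(x)                 # index of the known value at or left of x
    frac = x - i               # fractional distance toward the next value
    return SCENE_VALUES[i - 1] * (1 - frac) + SCENE_VALUES[i] * frac

# Full-res render: 10 sample points evenly spaced from x=1 through x=10.
full_render = [scene(x) for x in range(1, 11)]
print(full_render)  # [2, 4, 8, 8, 2, 4, 6, 0, 8, 6] -> "2488246086"
```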
Now, let's consider how this scene would be rendered interlaced. First, we render the odd field. When we render the odd field, we get a 5-sample result, sampling at x=1, x=3, x=5, x=7, and x=9. So: 2, 8, 2, 6, 8.
Now, it's 1/60th of a second later, and we render the even field. When we render the even field, we get a 5-sample result, sampling at x=2, x=4, x=6, x=8, and x=10. So: 4, 8, 4, 0, 6.
We've made the assumption that our output image is the naive blend of these two fields (sample 1 from the odd field gets placed at x=1, sample 1 from the even field gets placed at x=2, sample 2 from the odd field gets placed at x=3, etc.). So we get 2, 4, 8, 8, 2, 4, 6, 0, 8, 6, which is the same as the full 10-point render. (The blended interlaced result for a non-moving scene is identical to a full-res render, even with naive blending.)
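Continuing the sketch from above, here are the two fields and the naive blend:

```python
# Odd field: 5 samples at x = 1, 3, 5, 7, 9.
odd_field = [scene(x) for x in range(1, 11, 2)]    # [2, 8, 2, 6, 8]

# Even field, rendered 1/60 s later (the scene hasn't moved):
# 5 samples at x = 2, 4, 6, 8, 10.
even_field = [scene(x) for x in range(2, 11, 2)]   # [4, 8, 4, 0, 6]

# Naive blend: interleave the fields back into their original positions.
blended = [None] * 10
blended[0::2] = odd_field    # odd-field samples land at x = 1, 3, 5, 7, 9
blended[1::2] = even_field   # even-field samples land at x = 2, 4, 6, 8, 10

assert blended == full_render  # identical to the full 10-point render
```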
Now, let's consider what you did. Your "odd" field started with points one half-pixel to the right of the leftmost part of the scene, which meant you sampled at x=1.5, 3.5, 5.5, 7.5, and 9.5, getting 3, 8, 3, 3, 7.
Your "even" field started with points three half-pixels to the right of the leftmost part of the scene, which meant you sampled
this. Note the rightmost pixel in the even field is actually outside of the bounds of the scene, which is seemingly (?) what drove you to average it with 0.
Your full sampled dataset, when the fields are combined, looks like 3, 6, 8, 5, 3, 5, 3, 4, 7, 3. Because of how you generated it, it happens to be a half-pixel shift to the right of the original image (the values in your "combined interlaced" scene are actually linearly-interpolated intermediates of the original scene values). To see this clearly, line the original 10-pixel render up against your "interlaced" result:

original: 2, 4, 8, 8, 2, 4, 6, 0, 8, 6 (at x = 1, 2, ..., 10)
yours:    3, 6, 8, 5, 3, 5, 3, 4, 7, 3 (at x = 1.5, 2.5, ..., 10.5)

Every value in yours is the average of the two original values it sits between.
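And here's your half-pixel-shifted version in the same sketch, with the out-of-bounds point at x=10.5 averaged with 0 the way you did. The scene_padded() helper and its zero-padding are my guess at how to reproduce your numbers:

```python
def scene_padded(x):
    """Like scene(), but treats everything past x = 10 as 0, so the
    sample at x = 10.5 comes out as (6 + 0) / 2 = 3."""
    padded = SCENE_VALUES + [0]          # pretend the scene ends in a 0
    i = int(x)
    frac = x - i
    right = padded[i] if i < len(padded) else 0
    return padded[i - 1] * (1 - frac) + right * frac

# "Odd" field, one half-pixel right of the scene's left edge.
shifted_odd = [scene_padded(x) for x in (1.5, 3.5, 5.5, 7.5, 9.5)]
# -> [3.0, 8.0, 3.0, 3.0, 7.0]

# "Even" field, three half-pixels right; x = 10.5 is out of bounds.
shifted_even = [scene_padded(x) for x in (2.5, 4.5, 6.5, 8.5, 10.5)]
# -> [6.0, 5.0, 5.0, 4.0, 3.0]

# Combine the fields the same naive way as before.
combined = [None] * 10
combined[0::2] = shifted_odd
combined[1::2] = shifted_even
# -> [3, 6, 8, 5, 3, 5, 3, 4, 7, 3]: the scene resampled at
#    x = 1.5, 2.5, ..., 10.5, i.e. shifted half a pixel to the right.
```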
Now, what went wrong in your comparison of the method's accuracy to the original scene is that, rather than considering what was actually being represented (a linearly-resampled variant of the scene with an x-axis shift), you took the delta between the "non-interlaced" dataset and the "interlaced" dataset as though they represented the same spatial region: deltas of 1, 2, 0, -3, 1, 1, -3, 4, -1, -3, which look like large errors but are really just the half-pixel shift showing up.
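Finally, putting numbers on the comparison (same sketch as above):

```python
# Your delta: compare position-by-position, as though both datasets
# covered the same spatial region.
naive_delta = [c - f for c, f in zip(combined, full_render)]
# -> [1.0, 2.0, 0.0, -3.0, 1.0, 1.0, -3.0, 4.0, -1.0, -3.0]
#    Looks like big errors.

# Fair comparison: the combined result actually represents the scene
# sampled at x = 1.5 through x = 10.5, so compare it against that.
shifted_reference = [scene_padded(x + 0.5) for x in range(1, 11)]
fair_delta = [c - r for c, r in zip(combined, shifted_reference)]
# -> all zeros: the "interlaced" result is exact for what it sampled.
```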