Digital Foundry Article Technical Discussion Archive [2014]

The cat switches to a dog half the time :smile:

Regular upscaling is a hybrid cat dog monster all the time :???:

More like you somehow cut a cat into 2 halves, take 1 half and travel to the future, and stitch the "cats" back together and claim that it's the same as 1 future cat.

well, or at least it looks like a future cat. :rolleyes:
 
To illustrate the merge reconstruction (naive):



Ya, ya, ya, I know I know, but being able to show this mathematically is pretty cool.
Why is your interlace method using values half-pixel offset from the "native" full-res ordered grid? That doesn't make sense to me, and in your example you seem to be introducing a ton of error by shifting the entire "interlaced" result one half-pixel to the left (which also creates a bunch of error at the right-side edge where you seem to have averaged the "6" with an invisible "0" to the right for some reason).

If you're not accounting for motion, a typical naive interlace will give results identical to the full-res image.
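For what it's worth, here is a minimal sketch of what a naive interlace weave of a static scene means here, written in Python and using the 10-value row (2488246086) that comes up later in the thread. Each field point-samples alternating positions, and putting the samples back where they came from reproduces the full-res row exactly; this is only an illustration of the idea, not anyone's actual implementation.

Code:
# Naive weave of odd/even fields from a STATIC scene (no motion assumed).
full = [2, 4, 8, 8, 2, 4, 6, 0, 8, 6]   # full-res "scene", 10 samples

odd_field  = full[0::2]   # samples at x = 1, 3, 5, 7, 9  -> [2, 8, 2, 6, 8]
even_field = full[1::2]   # samples at x = 2, 4, 6, 8, 10 -> [4, 8, 4, 0, 6]

# Put each field's samples back at the positions they were taken from.
woven = [0] * len(full)
woven[0::2] = odd_field
woven[1::2] = even_field

assert woven == full      # identical to the full-res render for a static scene
print(woven)              # [2, 4, 8, 8, 2, 4, 6, 0, 8, 6]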
 
More like you somehow cut a cat into 2 halves, take 1 half and travel to the future, and stitch the "cats" back together and claim that it's the same as 1 future cat.

well, or at least it looks like a future cat. :rolleyes:

Leave Mr. Bigglesworth alone!
 
Why is your interlace method using values half-pixel offset from the "native" full-res ordered grid? That doesn't make sense to me, and in your example you seem to be introducing a ton of error by shifting the entire "interlaced" result one half-pixel to the left (which also creates a bunch of error at the right-side edge where you seem to have averaged the "6" with an invisible "0" to the right for some reason).

If you're not accounting for motion, a typical naive interlace will give results identical to the full-res image.

I'm shifting because I'm introducing details (noise); GG probably does the same thing.

The fact is that you are not really sampling at the actual rate:

2488246086: the actual image has 10 samples

You can't do perfect interlace for:
2-8-2-6-8- and
-4-8-4-0-6

because you don't have 10 samples, you have 5, and you shift the pixels to introduce noise, because if you don't, this happens:
3-8-3-3-7-
-3-8-3-3-7

which is just as bad as half resolution. I hope this makes sense.
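A quick sketch of the arithmetic being described in this post, assuming the half-res pixel is modeled as a 2-tap box filter over the full-res scene (an assumption made here purely for illustration): with no camera shift, both frames come out as the same 3-8-3-3-7 values.

Code:
# Half-res rendering modeled as a 2-tap box filter over the 10-sample scene.
scene = [2, 4, 8, 8, 2, 4, 6, 0, 8, 6]

def half_res_box(scene):
    # average adjacent pairs: (2+4)/2, (8+8)/2, (2+4)/2, (6+0)/2, (8+6)/2
    return [(a + b) // 2 for a, b in zip(scene[0::2], scene[1::2])]

frame_a = half_res_box(scene)   # [3, 8, 3, 3, 7]
frame_b = half_res_box(scene)   # camera didn't move -> identical frame
print(frame_a, frame_b)         # no weave of these two recovers the 10 values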
 
Before you go and try to smear me on things that I have never said, go back and read a little. Or let me refresh your memory: I freaking quantified the additional benefits with an analysis.

Let me say it again, once and for all: the technique provides a far superior final image, comparable to 900p upscaled, at almost 40% less pixel cost. It's really cool.

Better than native 900p upscaled to 1080p? I don't think so.

If your point is that this technique can output a near 1080p/60fps experience in non-action sections while the player doesn't move (in a multiplayer match), then I should ask you: what's the point of such a solution? You can make a game that keeps the resolution and drops the framerate (from 60fps), like Tomb Raider (PS4), or make a game that keeps the framerate and drops the resolution (from 1080p/720p), like Wolfenstein: TNO/Rage.

Killzone's answer to this problem keeps neither the framerate nor the resolution/IQ stable. They have chosen this solution knowingly because of their vision:

"Running 60 has become this Holy Grail. Suddenly people think if you run 60 your game is better. Technically, that's not really true," he explained. "But what it does do is it makes decisions go from input to on-screen a lot easier. So, having a constant 60 is not actually better than having a 'lot of the time' 60. It sounds weird, but it's actually true. Because usually in the moments where we're going to drop framerate, either you're already dead or it's too late anyway."

http://www.ign.com/articles/2013/09...lls-framerate-isnt-locked-at-60-and-heres-why

Killzone MP framerate is variable between 30 and 60fps (from the 24-player mode on different maps to the 7 vs 7 mode), and its resolution is variable between 720p and 1080p (from action sections to no-movement parts). If you run through the map (alone) you'll miss the 1080p IQ (fast movement), and if you enter a heavy fight you'll miss your 60fps feeling.

It can't be a perfect answer to every type of game or developer need, while a good 900p game could keep the framerate/resolution (IQ) stable through the entire game and offer a better overall experience.

I didn't play Killzone Shadow Fall personally, and it's better to ask those who played its MP, but your assumption can't always be true. At least not when you're playing it.
 
Better than native 900p upscaled to 1080p? I don't think so.

I didn't say it's better, I said it's comparable, which is quite an achievement really, considering the difference in pixel count. I personally would still take a progressive frame over this, because of the artifacts.

If your point is that this technique can output a near 1080p/60fps experience in non-action sections while the player doesn't move (in a multiplayer match), then I should ask you: what's the point of such a solution?
Actually, I've proven that perfect reconstruction is not possible, and I never even thought it would be. Perhaps this is a better question for the "native 1080p" camp. ;)
 
2488246086: the actual image has 10 samples

You can't do perfect interlace for:
2-8-2-6-8- and
-4-8-4-0-6
Actually, that's exactly how interlacing works (in television standards, obviously on the flipped axis). If you have a static image of 2488246086, it will typically be rendered as odd fields of 28268 and even fields of 48406.

In any case, it should be plainly obvious that you introduced error by shifting the entire "interlaced" frame by one half-pixel away from the grid of the original "native" frame.
 
Actually, that's exactly how interlacing works (in television standards, obviously on the flipped axis). If you have a static image of 2488246086, it will typically be rendered as odd fields of 28268 and even fields of 48406.

In any case, it should be plainly obvious that you introduced error by shifting the entire "interlaced" frame by one half-pixel away from the grid of the original "native" frame.

Sure, but in the case of Killzone you don't have a static image at 2488246086 for 10 pixels; you have 5 pixels, because the native buffer is that size. Are you telling me, then, that you can somehow pick the "perfect" pixels to be interlaced when you don't even have the actual image to begin with?

If you have the actual image, won't you just output at native 1080p60 then?

2nd MP detail analysis:
http://i.imgur.com/LwtoV26.png
 
Nope (not 2D upscales anyhow). This reprojection/interpolation concept will give differing results depending on the difference between the previous frame and current one. An upscale gives differing results depending on neighbouring pixels.

Let's consider a line of pixels of values :
...


Thanks for the post. Interesting way to explain the interpolated data.

What I was trying to say was this: I don't believe you could call this a 1080p render, the same way you couldn't call an upscaled image a 1080p render. In both cases you are finding a way to create data to fill in the gaps between your render (960x1080 in this case) and your final image (1080p). The way upscaling and this type of interpolation/reprojection achieve that goal is different, and the results are different, but the purpose for doing so is essentially the same - you need to output 1080p, but it is cheaper to render less.
 
The fact is new techniques like this will simply force us to be more precise with the terminology we use. Maybe the new question is "what resolution do you rasterize opaque geometry in each frame?" Of course real-time 3D graphics is all about creative shortcuts so we shouldn't be immediately dismissive of this temporal reprojection thing (or other future methods not yet conceived of) or seek to denigrate them through association with up-scaling or interlacing.
 
Sure, but in the case of Killzone you don't have a static image at 2488246086 for 10 pixels; you have 5 pixels, because the native buffer is that size. Are you telling me, then, that you can somehow pick the "perfect" pixels to be interlaced when you don't even have the actual image to begin with?

If you have the actual image, won't you just output at native 1080p60 then?
You don't have the actual full res image rendered in a single field, no, but you have the actual scene which you're rendering from. The odd field samples the odd locations of the scene, which gives you the odd values in the full-scene dataset. Ditto for the even field giving you the even values in the full-scene dataset.

Eh, maybe this will make more sense if I show it graphically. Time to bust out the MS Paint.

So, here's your scene prior to actually being rendered. Let's suppose that the scene is not in motion, so it doesn't change "from frame to frame."

Now, if we carry out a full-res render (10 sample points evenly spaced from x=1 through x=10), we sample at these locations. So, our final "rendered" data set looks like this. Which is of course "2488246086."

Now, let's consider how this scene would be rendered interlaced. First, we render the odd field. When we render the odd field, we get a 5-sample result, sampling at x=1, x=3, x=5, x=7, and x=9. So, this.
Now, it's 1/60th of a second later, and we render the even field. When we render the even field, we get a 5-sample result, sampling at x=2, x=4, x=6, x=8, and x=10. So, this.
We've made the assumption that our output image is the naive blend of these two frames (sample 1 from the odd field gets placed at x=1, sample 1 from the even field gets placed at x=2, sample 2 from the odd field gets placed at x=3, etc). So, we get this, which is the same as the full 10-point render (the blended interlaced result for a non-moving scene is identical to a full-res render, even with naive blending).

Now, let's consider what you did. Your "odd" field started with points one half-pixel to the right of the leftmost part of the scene, which meant you sampled this.
Your "even" field started with points three half-pixels to the right of the leftmost part of the scene, which meant you sampled this. Note the rightmost pixel in the even field is actually outside of the bounds of the scene, which is seemingly (?) what drove you to average it with 0.
Your full sampled data set, when the fields are combined, looks like this. Because of how you generated it, it happens to be a half-pixel shift to the right of the original image (the values in your "combined interlaced" scene are actually linearly-interpolated intermediates of the original scene values). To see this clearly, we can show the original 10-pixel render alongside your "interlaced" result in purple, like this.

Now, what went wrong in your comparison of the method's accuracy to the original scene is that, rather than consider what was actually being represented (a linearly-resampled variant of the scene with an x-axis shift), you took the delta between the "non-interlaced" dataset and the "interlaced" dataset as though they represented the same spatial region, like this.
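Here is a numeric sketch of the walkthrough above, for anyone who wants to check the arithmetic themselves. The only assumption added is that the "scene" is linearly interpolated between its ten integer sample positions (and zero outside them), purely so the half-pixel-shifted sample points have defined values; none of this is anyone's actual renderer.

Code:
# The scene is only defined at x = 1..10; off-grid values are interpolated
# for illustration only.
scene = {x: v for x, v in zip(range(1, 11), [2, 4, 8, 8, 2, 4, 6, 0, 8, 6])}

def sample(x):
    # point-sample the scene; linear interpolation off-grid, 0 outside it
    lo, hi = int(x), int(x) + 1
    if x == lo:
        return scene.get(lo, 0)
    t = x - lo
    return (1 - t) * scene.get(lo, 0) + t * scene.get(hi, 0)

full = [sample(x) for x in range(1, 11)]                 # [2, 4, 8, 8, 2, 4, 6, 0, 8, 6]

# Interlaced fields on the SAME grid as the full-res render:
odd  = [sample(x) for x in (1, 3, 5, 7, 9)]              # [2, 8, 2, 6, 8]
even = [sample(x) for x in (2, 4, 6, 8, 10)]             # [4, 8, 4, 0, 6]
woven = [v for pair in zip(odd, even) for v in pair]
assert woven == full                                      # exact match

# Fields shifted half a pixel to the right (what the earlier example did):
odd_s  = [sample(x + 0.5) for x in (1, 3, 5, 7, 9)]      # [3.0, 8.0, 3.0, 3.0, 7.0]
even_s = [sample(x + 0.5) for x in (2, 4, 6, 8, 10)]     # [6.0, 5.0, 5.0, 4.0, 3.0]
woven_s = [v for pair in zip(odd_s, even_s) for v in pair]

# woven_s is the scene resampled half a pixel to the right; diffing it
# against `full` position-for-position reports "error" that is really just
# the shift, not a failure of the weave itself.
print([a - b for a, b in zip(full, woven_s)])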
 
You don't have the actual full res image rendered in a single field, no, but you have the actual scene which you're rendering from. The odd field samples the odd locations of the scene, which gives you the odd values in the full-scene dataset. Ditto for the even field giving you the even values in the full-scene dataset.

This is where you are wrong though. You simply don't get to "perfectly" sample at the scene that way.
Since your native buffer is half as wide, each of your samples (pixels) will contain half as much detail coming out of the rendering pipeline.

For example, 2x the texels will be sampled into 1 pixel; you are losing details that way. You don't get to pick whether you are sampling the odd pixels or the even pixels, it's not possible.
Hence one can judder the camera and try to get more details out of the 2 half-resolution frames of the scene, but then you think I'm misleading you by introducing "noise" :rolleyes:

I don't know how else to explain this, it's a pretty basic rendering concept.
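For illustration, a tiny sketch of the "2x texels into 1 pixel" point, assuming a simple 2-tap box downsample for the half-width buffer; the texel values here are made up, and the point-sampling line is included only to show the alternative model being argued over in this thread.

Code:
# Downsampling a detailed texture row into half as many pixels.
texels = [0, 9, 0, 9, 0, 9, 0, 9]           # high-frequency detail

# Box-filter model: adjacent texel pairs are averaged into one pixel.
half_res = [(a + b) / 2 for a, b in zip(texels[0::2], texels[1::2])]
print(half_res)                              # [4.5, 4.5, 4.5, 4.5] -- detail gone

# Point-sample model: the values actually touched stay exact, but every
# other texel is simply skipped instead of being blended in.
point_sampled = texels[0::2]                 # [0, 0, 0, 0]
print(point_sampled)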
 
This is where you are wrong though. You don't get to sample at the scene that way.
Why not? Am I going to get pulled over by the GPU police? Is someone going to come over and tell me where I am and am not allowed to sample within the scene?
 
The fact is new techniques like this will simply force us to be more precise with the terminology we use. Maybe the new question is "what resolution do you rasterize opaque geometry in each frame?" Of course real-time 3D graphics is all about creative shortcuts so we shouldn't be immediately dismissive of this temporal reprojection thing (or other future methods not yet conceived of) or seek to denigrate them through association with up-scaling or interlacing.

I don't think anyone is denigrating anything. I just don't think you can say that you're rendering the half of the image that is reprojected/interpolated. Otherwise why couldn't you call upscaling rendering?

I think the results are very good, from what I've seen, and it's a smart idea. I'm looking forward to details on the implementation, and seeing if any other devs take a stab at it. Anything that can get better framerates with good image quality is going to be welcomed by me.
 
Before you go and try to smear me on things that I have never said, go back and read a little. Or let me refresh your memory: I freaking quantified the additional benefits with an analysis.

Let me say it again, once and for all: the technique provides a far superior final image, comparable to 900p upscaled, at almost 40% less pixel cost. It's really cool.

I am not trying to smear anything; maybe it's the tone in your posts, which was really what made me wonder how old you were, in relation to the joke. But thanks for clarifying your stance.
 
Why not? Am I going to get pulled over by the GPU police? Is someone going to come over and tell me where I am and am not allowed to sample within the scene?

[image: V0bi4sp.jpg]


The black line is your scene, the red lines are samples at 1920 wide, and the blue lines are samples at 960.
The scene is the same, and you are losing details by sampling at 960 pixels per frame.
To compensate for the loss of detail, you judder the camera to sample it differently and get more details out of it; the green lines represent the next frame.
 
Since your native buffer is half as wide, each of your samples (pixels) will contain half as much detail coming out of the rendering pipeline.
No, this is false. Each sample point in a halfsize buffer contains just as much detail as a sample point in the fullsize buffer. The main reason a halfsize buffer contains less information than the fullsize buffer isn't that the pixels contain less info, it's that there are fewer pixels.

If the scene isn't moving, as long as the pixels have coverage that aligns with pixels in the fullsize buffer, they can exactly represent sections of the fullsize buffer. By combining multiple halfsize buffers with coverage that differs from each other, you can perfectly reconstruct the fullsize buffer.

In practice, this gets a little wishy-washy when dealing with effects that require the values of neighboring pixels (in particular, with post-process stuff like bloom). But outside of the occasional caveat which might require that some processing be withheld until the buffers have been combined, what I've said tends to hold.
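A minimal sketch of that last claim, using a tiny NumPy array as a stand-in for a 1920x1080 frame: two half-width buffers that point-sample alternating columns of the same static scene can be recombined into the full-size buffer exactly. The array contents are random placeholders, and this ignores the neighbour-dependent post-process caveats mentioned above.

Code:
import numpy as np

rng = np.random.default_rng(0)
scene = rng.integers(0, 256, size=(4, 8))        # stand-in "full-res" frame

field_a = scene[:, 0::2]                          # odd columns  (4 x 4)
field_b = scene[:, 1::2]                          # even columns (4 x 4)

recon = np.empty_like(scene)
recon[:, 0::2] = field_a
recon[:, 1::2] = field_b

assert np.array_equal(recon, scene)               # exact for a static scene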

The black line is your scene, the red lines are samples at 1920 wide, and the blue lines are samples at 960.
The scene is the same, and you are losing details by sampling at 960 pixels per frame.
To compensate for the loss of detail, you judder the camera to sample it differently and get more details out of it; the green lines represent the next frame.
How did you even choose values? I'm not sure where your sample points are. Unless you're trying to show a perfectly-supersampled result (which that one blue line on the right side disagrees with VERY strongly), and which you haven't done correctly anyway, as pixels in the halfsize buffer really OUGHT to use halfsize coverage in order to get accurate results; the coverage of blue areas and green areas in a correctly-handled supersampled interlaced video stream should NOT be overlapping like that.
 
I think you lack basic understanding of CG rendering, and I am not a good teacher, so I can't explain it clearly enough to set you on the right path. I'll let someone else take a crack at this, or try reading about rasterization.

the coverage of blue areas and green areas in a correctly-handled supersampled interlaced video stream should NOT be overlapping like that

except there's no video stream, there is nothing. Each pixel is a sample of the scene out of the rendering pipeline, from triangles, textures, shaders, projected into this tiny dot.

Essentially you are suggesting that you found a way to do lossless compression at a constant 50% ratio. You are set for life if you have found that, just saying.
 
In a fully static image, you CAN "50%" your image... but only at twice the framerate. You're sending the image split into 2 interlaced fields and weaving them together in post. That's how regular PAL does it, too. Either it's sending "progressive" interlaced (25/30Hz with 2 even/odd fields representing progressive images), or, like TV soaps, at 60Hz with alternating lines.

In KZ it works well for static scenes. I don't know about movement (haven't played MP). But in this case it could work like a dynamic scaler, as in Wipeout. When there's movement you get 50% res at worst (with interlacing artifacts), or most likely something in the middle, depending on how well the weaving works. While standing still, you can always achieve 100% coverage with your samples.
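A small sketch of the moving case, with a made-up one-pixel shift between the two fields (purely illustrative numbers again): the naive weave then matches neither full-res frame, which is the combing/artifact situation described above and why a motion-aware merge or fallback is needed.

Code:
frame1 = [2, 4, 8, 8, 2, 4, 6, 0, 8, 6]
frame2 = frame1[1:] + [5]          # same content shifted left by one pixel
                                   # (the trailing 5 is just made-up new data)

odd_field  = frame1[0::2]          # rendered from the first moment in time
even_field = frame2[1::2]          # rendered 1/60 s later, after the shift

woven = [0] * len(frame1)
woven[0::2] = odd_field
woven[1::2] = even_field

print(woven)                              # matches neither frame1 nor frame2
print(woven == frame1, woven == frame2)   # False False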
 
I think you lack basic understanding of CG rendering
Well.

It's true that I have no professional experience in the video game industry, or with computer graphics.
I was, however, a student of EE and mathematics, and I have enough interest in computer graphics that I've familiarized myself to some extent with various projections, ways to go about deciding which points lie within a polygon, basic shading concepts...

Enough that I, as part of a year-long school project, once built a rasterizer and simple SH-precomputed-shadowing lighting system pretty much from the ground up, which I used to render simple scenes with simple low-frequency infinite-distance light distributions like these.

Massive credentials of extreme impressiveness? Not at all, but I like to think that I'm not 100% entirely clueless on this stuff.

there's no video stream, there is nothing. Each pixel is a sample of the scene out of the rendering pipeline, from triangles, textures, shaders, projected into this tiny dot.
I admit my choice of words could have been better, but there is a video stream in at least some abstract sense. Whether you're referring to the stream of pixels at the output of a graphical algorithm or hardware, or the stream of data over an NTSC connection, the way you've represented pixel coverage in your visual a few posts up doesn't really represent how interlacing works.

I think what you're getting hung up on is that you're visualizing pixel coverage, rather than sample point. If your "correct" full-res scene is going to have a notion of pixel coverage where each pixel is its own fully separate bucket like this, then a "correct" interlaced breakdown should use the same coverage for its pixels. So, if odd/even is blue/red, the corresponding interlaced pixel coverage breakdown ought to look sort of like this. In your example, your "interlaced buffer" pixels were doing "something" over areas twice as wide as their counterparts in the non-"interlaced" case (lots of overlap).
When trying to grasp what's going on with either interlacing or temporal reprojection, it can be better to visualize pixels as point-samples than as "buckets."

Essentially you are suggesting that you found a way to do lossless compression at a constant 50% ratio.
No, I'm not. I'm suggesting that if a 60mph car can travel a mile every minute, a 30mph car can travel a mile every two minutes.
 