Which upscaling technologies offer the best long-term option?

invictis

Newcomer
There are a number of upscaling technologies in use at the moment, with more on the horizon.
Currently we have ones such as checkerboarding, DLSS and TSAA, and there are new ones still to be released, such as AMD's Super Resolution.

There are distinct differences between them, from machine learning to Sony's checkerboarding.

Which ones offer the best results at the lowest cost right now, and which will become the de facto standard in the future?

Can Sony improve their checkerboarding to cover both axes?

Will machine learning become more common, or is the fractured nature of the tech holding it back?

AMD's solution will apparently be open-sourced; does that mean it will work on Nvidia cards?

Would be interested to see where this tech goes.
 
Cost in terms of what? Workforce, hardware performance, etc.?

Hardware-performance-wise, DLSS seems to give the best bang for the buck, since it uses dedicated tensor cores on top of not needing much manual work from the devs.
 
Cost as in dev time etc.
Which ones are easier to apply, and which are harder?
Is checkerboarding just a flick of a switch, while DLSS requires a training phase?
 
The "best long-term option" for upscaling technology is going to be something that can be independently implemented and maintained by developers ...

One of the reasons why the developers behind Valorant opted to use UE4's forward renderer (used for VR) instead of the default deferred renderer was that they wanted to use MSAA and didn't want to implement motion vectors at all. Destiny 2 doesn't have an option for TAA because they don't want to implement motion vectors either. In both cases, they don't have very many options for implementing upscaling outside of the most naive one ...

DLSS is beyond the reach of most developers/games, since most of them can't edit the model themselves, so it would be another dependency they'd have to add, which isn't ideal for long-term projects like game engines. The dependency problem could be solved if developers could find a productive way to edit the model ...
 
Regardless of the signal processing, there is one big, cheap set of information the algorithms can still use: surface ID. Though it's more invasive and less of a drop-in replacement for TAA.

Visibility buffer with motion vectors would be the ideal input for multi-frame super-resolution.
 
The PS4 Pro brought support for surface ID, so all PS5 models should support it in single-pass rendering.
Some checkerboard solutions already use it (Frostbite and Dark Souls Remastered).

It is also used in some of the games using 'geometry upsampling' (a Sony marketing name, apparently).
In its purest form they render the image at 1080p and then, when upscaling, fill the empty space within the same ID range (and if there is still empty space, some random blur).
This gives a similar effect to the MSAA subsamples-to-pixels tricks, although it is possible to miss some polygon details etc. (a new polygon ID in a sample which was not shaded).

I'm sure we will hear more use cases for it; it's not just for one system type. (Dark Souls Remastered used the ID buffer to fight ghosting.)
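
To make the "fill within the same ID range" step concrete, here is a minimal CPU sketch, assuming a sparse shading mask (e.g. a checkerboard pattern) plus a full-resolution ID buffer. The structures and the 3x3 neighbourhood are assumptions for illustration, not how any shipped title does it:

#include <cstdint>
#include <vector>

struct Pixel { float r, g, b; };

struct Frame {
    int width, height;
    std::vector<Pixel>    color;     // shaded colour, valid only where 'shaded' is true
    std::vector<uint32_t> surfaceId; // ID buffer, rasterized for every target pixel
    std::vector<bool>     shaded;    // true for the sparse set of pixels actually shaded this frame
};

// Fill unshaded pixels from shaded neighbours that share the same surface ID.
// If no neighbour matches, fall back to a plain average ("some random blur").
void FillHolesById(Frame& f) {
    for (int y = 0; y < f.height; ++y) {
        for (int x = 0; x < f.width; ++x) {
            int idx = y * f.width + x;
            if (f.shaded[idx]) continue;

            Pixel sumSame{0, 0, 0}, sumAny{0, 0, 0};
            int nSame = 0, nAny = 0;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx) {
                    int nx = x + dx, ny = y + dy;
                    if (nx < 0 || ny < 0 || nx >= f.width || ny >= f.height) continue;
                    int n = ny * f.width + nx;
                    if (!f.shaded[n]) continue; // only read pixels shaded this frame
                    sumAny.r += f.color[n].r; sumAny.g += f.color[n].g; sumAny.b += f.color[n].b;
                    ++nAny;
                    if (f.surfaceId[n] == f.surfaceId[idx]) {
                        sumSame.r += f.color[n].r; sumSame.g += f.color[n].g; sumSame.b += f.color[n].b;
                        ++nSame;
                    }
                }
            const Pixel& s = nSame ? sumSame : sumAny;
            int n = nSame ? nSame : (nAny ? nAny : 1);
            f.color[idx] = { s.r / n, s.g / n, s.b / n };
            // A new polygon ID with no shaded neighbour of the same ID is exactly
            // the "missed detail" case mentioned above.
        }
    }
}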
 
One important step forward will be to use proper motion vectors, which is to say forward motion vectors from the previous (SR-)frame towards the current frame, not the backwards motion vectors from the current frame to the previous one. The backwards motion vectors are easier to render, but the forward motion vectors are what you really need for proper sample insertion.
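
As a toy illustration of the difference, here is a CPU sketch contrasting a backward gather against a forward scatter ("sample insertion"). Buffer layouts and the nearest-pixel rounding are assumptions, not an engine implementation:

#include <cmath>
#include <vector>

struct Vec2 { float x, y; };

// Backward: for each current pixel, fetch where it came from in the previous frame.
// Easy to render (one vector per current pixel), but several current pixels can
// gather the same old sample, and disoccluded pixels fetch garbage history.
float BackwardGather(const std::vector<float>& prevColor,
                     const std::vector<Vec2>& backwardMv,
                     int w, int h, int x, int y) {
    Vec2 mv = backwardMv[y * w + x];
    int px = (int)std::lround(x - mv.x);
    int py = (int)std::lround(y - mv.y);
    if (px < 0 || py < 0 || px >= w || py >= h) return 0.0f; // no usable history
    return prevColor[py * w + px];
}

// Forward: for each sample of the previous frame, push it to where it lands in the
// current frame. Every old sample ends up in exactly one place, at the cost of a
// scatter and of having to produce motion vectors for the old frame's samples.
void ForwardScatter(const std::vector<float>& prevColor,
                    const std::vector<Vec2>& forwardMv,
                    int w, int h,
                    std::vector<float>& accumColor,
                    std::vector<float>& accumWeight) {
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            Vec2 mv = forwardMv[y * w + x];
            int tx = (int)std::lround(x + mv.x);
            int ty = (int)std::lround(y + mv.y);
            if (tx < 0 || ty < 0 || tx >= w || ty >= h) continue; // sample left the frame
            accumColor[ty * w + tx] += prevColor[y * w + x];
            accumWeight[ty * w + tx] += 1.0f;
        }
}

After the scatter, accumColor divided by accumWeight gives the reprojected history wherever the weight is non-zero; the zero-weight pixels are exactly the disocclusions that need fresh samples.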
 
Cache the motion vector buffer from the previous frame? Interpolate with the new frame's motion vectors? You've got the storage and bandwidth now; it might be worth it.

It'd also be nice to have full six-axis motion. Adding another channel for depth motion is a bit costly but straightforward. As for rotational motion, you could probably get away with 10:10:10 (use the extra 2-bit channel for whatever). After all, you're only doing it relative to the object itself, and if you're flipping over 90 degrees on any axis you're probably rejecting history anyway, since you've got an entirely new view and/or new lighting. The same works for motion blur, though I imagine artists breaking it immediately with some sort of hyper-spinny cube thing. Still, even there, add some sort of custom shader for that case and you're probably good.
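
A hedged sketch of what such a 10:10:10(:2) rotational-motion channel could look like as a packing function; the angle encoding and ranges here are assumptions, purely for illustration:

#include <algorithm>
#include <cmath>
#include <cstdint>

// Map an angular delta in [-pi, pi) to a signed 10-bit integer in [-512, 511].
static uint32_t EncodeAngle10(float radians) {
    const float kPi = 3.14159265358979f;
    float t = std::clamp(radians / kPi, -1.0f, 0.998f); // keep within the 10-bit range
    int q = (int)std::lround(t * 512.0f);
    return (uint32_t)(q & 0x3FF); // two's complement, masked to 10 bits
}

// Pack pitch/yaw/roll deltas plus a free 2-bit flag into one 32-bit value,
// mirroring a 10:10:10:2 texture format.
uint32_t PackRotationDelta(float pitch, float yaw, float roll, uint32_t flag2) {
    return  EncodeAngle10(pitch)
         | (EncodeAngle10(yaw)  << 10)
         | (EncodeAngle10(roll) << 20)
         | ((flag2 & 0x3u)      << 30);
}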
 
Nah, what you want is not the backwards motion vectors from the previous frame. What you ideally want is the forward motion vectors for all the samples in the previous high-resolution frame, so you can do sample insertion for each high-resolution sample individually. Though it might be too expensive.
 
It'd also be nice to have full six axis motion. Adding another channel for depth motion is a bit costly but straightforward. As for rotational motion, you could probably get away with 10:10:10
What for? All you really need is translation and a significance value, to deal with areas which are likely *not* reconstructable by patching. Coincidentally, that's also a fairly good heuristic for where you should probably ramp up the render resolution (not limited to shading rate, but also full geometry detail). Depth variance multiplied by the norm of the motion vector multiplied by the age of the data should be a good estimate of where the data is going to be garbage. Take special care if you know you have areas with distinct patterns in texture or micro-geometry, or if you know normals are changing rapidly. You should evaluate the likelihood of successful reconstruction on both the source as well as the target buffer.
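
Read literally, that heuristic could look something like the following per-pixel score; the names, the extra detail factor and any scaling are assumptions added for illustration:

#include <cmath>

struct PixelStats {
    float depthVariance;    // local depth variance in a small neighbourhood
    float motionX, motionY; // screen-space motion vector
    float ageFrames;        // frames since this data was last actually rendered
    float detailFactor;     // >1 for distinct texture patterns / rapidly changing normals
};

// Higher score = reconstruction from history is more likely to be garbage,
// so this pixel is a candidate for full-resolution re-rendering.
float ReconstructionRisk(const PixelStats& p) {
    float motionLen = std::sqrt(p.motionX * p.motionX + p.motionY * p.motionY);
    return p.depthVariance * motionLen * p.ageFrames * p.detailFactor;
}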

You might even get away without explicitly emitting refined motion vectors at all (even though it helps!) by just using dedicated hardware for a straightforward optical flow analysis.
Optical flow analysis is part of the fully decoupled video encoder engines, so as long as you get access to their output (https://developer.nvidia.com/blog/an-introduction-to-the-nvidia-optical-flow-sdk/), and manage to feed them specifically non-ambiguous buffers (you don't want to get caught up on repeated patterns in textured content, and absolutely no illumination!), you should get an accurate forward motion vector field down to 4-pixel granularity. And you've got enough throughput to do this at 4K@120fps forward-only, with significance values for the motion estimate itself thrown in for free.
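
For intuition only, here is a single-level block-matching sketch on the CPU (4x4 blocks, SAD cost). It is a strawman of what such an analysis does in principle, not the hardware engine or the NVOF SDK API:

#include <cstdint>
#include <cstdlib>
#include <limits>
#include <vector>

struct FlowResult { int dx, dy; float sad; };

// bx, by must be a valid 4x4 block origin inside the previous frame.
// The returned offset is a forward motion estimate for that block, and the
// residual SAD doubles as an (inverse) significance value for the estimate.
FlowResult MatchBlock(const std::vector<uint8_t>& prev, const std::vector<uint8_t>& curr,
                      int w, int h, int bx, int by, int radius) {
    FlowResult best{0, 0, std::numeric_limits<float>::max()};
    for (int dy = -radius; dy <= radius; ++dy)
        for (int dx = -radius; dx <= radius; ++dx) {
            // Skip candidates whose 4x4 block would leave the current frame.
            if (bx + dx < 0 || by + dy < 0 || bx + dx + 4 > w || by + dy + 4 > h) continue;
            float sad = 0.0f;
            for (int y = 0; y < 4; ++y)
                for (int x = 0; x < 4; ++x)
                    sad += std::abs((int)prev[(by + y) * w + (bx + x)]
                                  - (int)curr[(by + dy + y) * w + (bx + dx + x)]);
            if (sad < best.sad) best = {dx, dy, sad};
        }
    return best;
}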

You then only need to choose a threshold where the (adjusted) significance is too low for your quality target, and render those parts at the highest resolution you can afford. For the stuff you can patch, remember to degrade the significance of data you haven't redrawn for a couple of frames (where the motion vector was also non-zero), so you know you need to refresh it eventually.
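
A minimal sketch of that bookkeeping, assuming a per-pixel significance buffer; the decay constant and the threshold are placeholders:

#include <cstddef>
#include <vector>

struct ReuseState {
    std::vector<float> significance;    // confidence that the cached data is still good
    std::vector<float> motionLength;    // per-pixel |motion vector| this frame
    std::vector<bool>  redrawnThisFrame;
};

void UpdateAndClassify(ReuseState& s, float decayPerFrame, float qualityThreshold,
                       std::vector<bool>& renderFullRes) {
    for (std::size_t i = 0; i < s.significance.size(); ++i) {
        if (s.redrawnThisFrame[i]) {
            s.significance[i] = 1.0f;            // fresh data
        } else if (s.motionLength[i] > 0.0f) {
            s.significance[i] *= decayPerFrame;  // stale *and* moving: degrade
        }
        // Below the quality target: patching won't cut it, re-render at full detail.
        renderFullRes[i] = s.significance[i] < qualityThreshold;
    }
}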

You don't even *want* TSAA side effects like motion blur if you can just get a stable, high frame rate instead, even if you're lagging behind by 2-3 frames (since you can't extrapolate as easily as you can interpolate). And you don't want multi-frame accumulation of details either if you can just opt to "render the stuff that matters at full detail NOW" and get away with rendering nothing but highly simplified proxies into minimal buffers for the rest.

Think of the whole upscaling business less like a denoiser and take inspiration from video compression instead. That's all about encoding - or in this case re-creating - only the minimal amount of information necessary to introduce details which are new in a frame, while straight-out copying the rest for as long as you can get away with it.
 
What for?

For good motion blur, of course! How accurate can translation alone be? I don't know offhand, but normal direction, and thus the change in lighting, is definitely partially rotation-dependent. Anyway, we're obviously only hitting 60fps at best in higher-end titles this gen, and that's still squarely in motion-blur-is-useful territory. But more importantly, that's just an art decision. The "need" for higher framerates, no motion blur, etc. is just one possible look for a given project. Movies don't work well at high FPS, and I've been one of the people least bothered by the likes of The Hobbit and Gemini Man; hell, I kind of liked it in 3D. But frames per second have a definitive impact on the overall tone of a project, games included, as does motion blur. The "need" for "moar fps" and "moar clean!" above everything else is definitely limited to a subset of consumers.

As for variable-resolution shading for new information... foveated rendering is already a major headache for various reasons, and I wouldn't want to try including it unless it was already a high-end VR title with a ton of time. But you could probably approximate it with variable rate shading and a spatial upscaler. Use something like FSR on the new information for a single part of each frame? Combined with increasing shading samples for those tiles, you've got a similar effect that's a lot easier and more straightforward than trying to implement some crazy variable primary sampling scheme.

And of course any improved way of getting information for TAA and motion blur is welcome; it doesn't really matter how it's done. Rotation motion vectors just slot in nicely with what everyone already has. But if there's some improved neural-net feature extraction or optical flow method or whatever, well, better performance and/or quality is better.
 
It certainly would be nice to have full stochastic rasterization with time information.

Perhaps good old rasterization with just motion blur and DoF before shading would be good enough, if every point/sample had time information (do the blur sampling with the visibility buffer?).
Then during shading, use the timestamp for light locations, animated textures etc.

This would at least partially solve the old problems with moving objects and lights, and would allow rendering water and reflective surfaces in a way that gives a somewhat accurate result.
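
A toy sketch of the per-sample time idea: jitter a normalized timestamp per visibility sample across the frame interval, then evaluate time-dependent inputs (here just a hypothetical linearly moving light) at that timestamp during shading. The structures and the light model are illustrative assumptions, not an existing renderer:

#include <cstdint>
#include <random>
#include <vector>

struct VisibilitySample {
    uint32_t surfaceId;
    float    time;       // normalized frame time in [0, 1)
};

struct Vec3 { float x, y, z; };

// Hypothetical animated light: linear motion over the frame interval.
Vec3 LightPositionAt(float t, Vec3 p0, Vec3 p1) {
    return { p0.x + (p1.x - p0.x) * t,
             p0.y + (p1.y - p0.y) * t,
             p0.z + (p1.z - p0.z) * t };
}

// Assign a stochastic timestamp to every visibility sample; motion blur then
// falls out of averaging the shaded results of samples with different times.
void AssignSampleTimes(std::vector<VisibilitySample>& samples, uint32_t seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> dist(0.0f, 1.0f);
    for (auto& s : samples)
        s.time = dist(rng);
}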
 