Nvidia DLSS 1 and 2 antialiasing discussion *spawn*

Those conditions seem pretty strict. That won’t play nice with a bunch of games, notably Doom Eternal.

From the GDC video: basically you do all rendering at a lower resolution, then apply a deep-learning algorithm to combine multiple frames into a new upscaled frame, and then run all post-processing and UI rendering at native resolution. Having less information per frame due to the lower resolution is partly offset by having more frames to work with and by running the post-processing at native resolution. Post-processing will be tricky, though, as something like the g-buffer is still at the lower resolution, which would require making the post-processing algorithms more flexible.
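
A minimal sketch of that frame ordering, assuming a hypothetical engine loop (all function names and the nearest-neighbour "upscale" are stand-ins for illustration, not the actual DLSS SDK):

```python
import numpy as np

RENDER_RES = (1080, 1920)   # internal render resolution (H, W)
OUTPUT_RES = (2160, 3840)   # native / output resolution (H, W)

def render_scene_low_res():
    # Stand-in for rasterizing the jittered low-resolution color buffer.
    return np.random.rand(RENDER_RES[0], RENDER_RES[1], 3).astype(np.float32)

def dlss_upscale(color_low, history):
    # Stand-in for the temporal upscaler: just a nearest-neighbour resize here.
    # The real thing combines the jittered frame with warped history frames.
    scale = OUTPUT_RES[0] // RENDER_RES[0]
    return np.repeat(np.repeat(color_low, scale, axis=0), scale, axis=1)

def post_process_and_ui(color_native):
    # Post-processing and UI composition happen at native resolution.
    return np.clip(color_native * 1.05, 0.0, 1.0)

history = None
for frame in range(3):
    low = render_scene_low_res()           # 1. render at the lower resolution
    native = dlss_upscale(low, history)    # 2. reconstruct a native-res frame
    final = post_process_and_ui(native)    # 3. post-processing + UI at native res
    history = native                       # history feeds the next frame
print(final.shape)                         # (2160, 3840, 3)
```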
 
apparently at least partly breaking DoF,
Haven't seen any evidence of that, not even with DLSS 1.
like causing ringing artifacts
You can reduce those significantly by scaling from 1440p instead of 1080p.
missing details
Native res with TAA is often missing significantly more details, especially during motion.

Once more, I must state that I haven't seen any of this criticism directed at driver sharpening filters like the ones developed by NVIDIA or AMD, despite them suffering from the same problems.

DLSS 2 comes down to a cost/benefit ratio: it has some drawbacks and some significant advantages, among them the massive performance uplift and increased detail. Native res with TAA also has drawbacks and advantages, so depending on your point of view they are either roughly equal at the moment, or DLSS is slightly better, and that is a significant achievement in and of itself. It only gets better from here.
 
Like I said: DLSS 2.0 does many things well, but for me it doesn't make up for the things it screws up in the process.
I'm not talking about the road there, I'm talking about the things DLSS 2.0 messes up, like causing ringing artifacts, overemphasized lines, apparently at least partly breaking DoF, missing details and specific objects turning into utter garbage
Perhaps we should ease up on the hyperboles. Alex's analysis has shown that 4K-DLSS and 4K-TAA trade blows in terms of artifacts. 4K-DLSS suffers from some oversharpening (which may be adjustable) and imperfectly reconstructs *some* edges. In contrast, 4K-TAA is less temporally stable and is worse at subpixel reconstruction in transparencies. FWIW neither DLSS nor TAA is an appropriate "ground truth" reference signal for the other (you'd need something like a hypothetical 64x supersampled image for that).

And this comparison is almost funny because one of the two competitors is 130% (or 2.3x) faster. I'm going to ask you to follow up on one of your own claims in a past post - you said you disliked DLSS so much that you would prefer to reduce settings "here and there" to achieve performance parity. Okay then, show us what settings you would reduce "here and there" to give 4K-TAA (or no-AA) that 2.3x boost for an iso-performance comparison vs. 4K-DLSS. Then we'll compare image quality and temporal stability.
 
Any ballpark ideas on what the hardware overhead is for achieving DLSS 2.0 (watts, transistors, memory)? Mainly curious whether there's a practical lower bound on what types of products this could be implemented in (5W mobile SoCs, in particular). Barring any future miracles in fabrication, it would be nice to have some light at the end of the tunnel for mobile hardware.
 
Well, running a model is significantly lighter than training one. Your phone uses machine learning all the time, e.g. listening for specific phrases, so we do see ML used on mobile phones and it's increasing. It's plausible we'll see it used for mobile games one day.
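
For a very rough sense of scale, a back-of-envelope sketch; the ops-per-pixel figure below is an assumption picked purely for illustration (the real network's cost isn't public):

```python
# All numbers here are assumed for illustration, not measured DLSS figures;
# the point is only how the inference cost scales with resolution and fps.
ops_per_pixel = 100e3        # assumed network cost per output pixel (FLOPs)
fps = 60

for name, pixels in [("1080p60", 1920 * 1080), ("4K60", 3840 * 2160)]:
    tflops = ops_per_pixel * pixels * fps / 1e12
    print(f"{name}: ~{tflops:.0f} TFLOPs/s of inference under these assumptions")
```

Whether something like that fits in a 5W SoC depends entirely on that per-pixel cost and on how much a dedicated NPU block can deliver per watt, so treat the constants as knobs rather than facts.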
 
Comparison against blurrier TAA, so same as the FF XV comparison but now with a sharpening effect added on top. Not sure if it's the sharpening that brings out lost texture details from before.

Wonder how well it'd do here, at the 6m32s comparison.

 
Not sure if it's the sharpening that brings out lost texture details from before
Here is a good preso on DLSS 1.0 - https://developer.nvidia.com/dlss-gdc-2019
In essence, it consists of 2 neural networks.

1. The first one performs jitter-aware, temporally stable anti-aliasing on the low-res frame and provides much better quality (stability and clarity) in comparison with traditional "hand-crafted" temporal AA, from what I've seen in Marco Salvi's presentations (he is the guy behind NN-based temporal AA at NVIDIA; you can google his presos, there are many of them). The TAA neural network was trained on downsampled 16K images (downscaled with Lanczos filtering for additional clarity and sharpness), hence the network also learns Lanczos-style sharpening this way. The mapping of the network is 1:1 input to output resolution, so it can't reconstruct texture details at higher output resolutions (1:4 input to output would likely require a much more complex neural net, which would not be able to run fast enough during the real-time inference phase).

2. The second network takes the low-res output of the first NN and upscales it to a higher resolution frame. I would call it statistics-based reconstruction, because NNs can learn and generalize well enough to reconstruct many object edges with different orientations at higher res. That's single-shot edge reconstruction; a NN can do well enough here, and you can try it yourself here - http://waifu2x.udp.jp/index.ru.html
This network will magnify any shimmering and flickering since it works on a single frame without any temporal feedback; you need very temporally stable input to produce stable output. Obviously, it can neither hallucinate nor reconstruct texture details in the higher resolution space. Hallucination (GAN-style NNs) creates different details from frame to frame, hence it is not applicable from either the art design or the image quality perspective. Reconstruction requires either temporal reuse (prior-frame knowledge) or additional buffers (but for edge reconstruction only). This two-stage structure is sketched below.
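
A very loose NumPy sketch, with simple filters standing in for the two networks (which aren't public), purely to show the data flow:

```python
import numpy as np

def stage1_antialias(current_low, prev_low, alpha=0.1):
    # Stand-in for the 1:1 jitter-aware temporal AA network: same resolution
    # in and out; here just an exponential blend with the previous frame.
    return alpha * current_low + (1.0 - alpha) * prev_low

def stage2_upscale(aa_low, factor=2):
    # Stand-in for the single-frame upscaling network: purely spatial, no
    # temporal feedback, so any flicker in its input simply gets magnified.
    return np.repeat(np.repeat(aa_low, factor, axis=0), factor, axis=1)

prev = np.random.rand(1080, 1920, 3).astype(np.float32)
cur  = np.random.rand(1080, 1920, 3).astype(np.float32)

aa   = stage1_antialias(cur, prev)   # 1080p in, 1080p out (no detail gain)
high = stage2_upscale(aa)            # 1080p in, 2160p out (edges only)
print(aa.shape, high.shape)          # (1080, 1920, 3) (2160, 3840, 3)
```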

The main thing that was fixed in DLSS 2.0 is reconstruction in the higher resolution space. Now there is just a single NN, which takes a jittered and warped high-res history frame (the previous frame warped via motion vectors, MVECs) and the jittered current frame upscaled to the higher res via integer scaling or some other method. Then the network blends the two frames, likely with per-pixel blending factors (unlike TAAU algorithms, which use a uniform value across the screen); the NN also replaces all the heuristics (neighborhood clamping and so on) and does it very well. Hence DLSS 2.0 is capable of reconstructing pixel-perfect higher resolution images; it's even capable of reconstructing subpixel details, as the screenshots show.
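
And a rough sketch of that single-network accumulation, again with stand-ins (the real network predicts the per-pixel blend weights and replaces the clamping heuristics itself; none of this is NVIDIA's actual code):

```python
import numpy as np

def warp_history(history, motion):
    # Stand-in reprojection: shift each pixel by its (integer) motion vector.
    h, w, _ = history.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(ys - motion[..., 1].astype(int), 0, h - 1)
    src_x = np.clip(xs - motion[..., 0].astype(int), 0, w - 1)
    return history[src_y, src_x]

def upsample_current(low, factor=2):
    # Stand-in for lifting the jittered low-res samples onto the high-res grid.
    return np.repeat(np.repeat(low, factor, axis=0), factor, axis=1)

def blend(current_up, history_warped, weights):
    # Per-pixel blend factors instead of one uniform value for the whole
    # screen; in DLSS 2.0 the network decides these weights.
    w = weights[..., None]
    return w * current_up + (1.0 - w) * history_warped

low     = np.random.rand(540, 960, 3).astype(np.float32)    # current frame
history = np.random.rand(1080, 1920, 3).astype(np.float32)  # previous output
motion  = np.zeros((1080, 1920, 2), dtype=np.float32)       # static scene
weights = np.full((1080, 1920), 0.1, dtype=np.float32)      # stand-in weights

out = blend(upsample_current(low), warp_history(history, motion), weights)
print(out.shape)   # (1080, 1920, 3) -- becomes next frame's history
```

The output of one frame becomes the warped history for the next, which is how detail accumulates over time.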
 

A game like Control isn't even the best case for DLSS 2.0. It really shines when small geometry must be reconstructed. Here is an example from Wolfenstein YB:
(screenshots: 1080p TAA, 4K DLSS Performance, 4K TAA)
 
You mean the windows surrounded by those horrendous overemphasized sills, or whatever they're called? Or the ringing artifacts in the wires? Or the bright white antennas, which shouldn't really look like that at that distance?
 

I remember the waifu2x project, but once DLSS was released it dawned on me that it was far simpler to train the NN on anime/manga images than to do it for a 70GB game and hope that it wouldn't lose the details in the scene. The GDC talk is a whopping 590MB, hard to use on Ubuntu.
I hadn't read about the TAA neural network before. Or "statistics-based reconstruction". Or MVECs, for that matter.

I'm not sure why a single NN across all games has become the better solution; the problem seems more suited to specialization and overfitting than to generalization.
 
Unless the game is ultra-stylized, like, say, cel shaded, I doubt the NN can learn much that is specific to any given game. Not enough dimensions to generalize beyond a couple of simple moving geometric shapes.
 

I wonder if anyone has tried to make a game with a similar idea to GauGAN, i.e. you draw just what you would like to have and let a neural network hallucinate the content: "here be water, here be rocks, ..." This is pretty cool:

http://nvidia-research-mingyuliu.com/gaugan
 
It's easily broken. We're a very long way from being able to just make convincing art on the fly. About the closest you can get I guess is procedural texture placements, whether algorithmic or NN.
 
There's a sharpening slider in Youngblood for those interested.

I've made some comparisons during my own playthrough.
Putting the slider on 2.0 (default for TAA?) gives sharpening halos with DLSS Quality.
On 1.0 the halos are gone, but the image looks softer than native-res TAA, I'd say.

You could probably find some middle ground between these numbers where the ringing will be less visible and the image will be sharp enough.
But generally speaking, from my Youngblood experience, DLSS (2.0) will still be noticeable (either from oversharpening or excessive blur) when playing on a desktop monitor.
When it is used on a (4K) TV though I'd imagine that most people will gladly trade some oversharpening for about twice as much performance. Isn't sharpening all the rage these days anyway?

Would be cool if NV would provide DLSS in native resolution though (i.e. "DLSS 2x"). It does work better than TAA as an AA solution for the most part, removing the blurring and ghosting associated with the latter.
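
As an aside on the halos: here's a generic unsharp-mask toy example (not Youngblood's actual filter) showing why a higher sharpening strength produces ringing around edges:

```python
import numpy as np

def blur3(x):
    # 3-tap box blur with edge clamping (1-D, just for the demo).
    padded = np.pad(x, 1, mode="edge")
    return (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0

def unsharp(x, strength):
    # sharpened = original + strength * (original - blurred). Large strength
    # overshoots on both sides of an edge, which is what reads as a halo/ring.
    return x + strength * (x - blur3(x))

edge = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=np.float32)   # a hard edge
print(np.round(unsharp(edge, 0.5), 2))   # mild: small over/undershoot
print(np.round(unsharp(edge, 2.0), 2))   # strong: big halo on both sides
```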
 
Have you played the game? That's how the windows are supposed to look when you get closer; the problem is, you think the TAA image is the ground truth, and it's not.
Yes, I have, though only relatively briefly; it wasn't my cup of tea as a game. TAA isn't the ground truth, but I'm pretty sure that at that distance they're not supposed to look like they do in the DLSS shot. When you get closer, sure, but not at that distance, regardless of how they look in the TAA shot.
 
If DLSS doesn't need training per game any more, how long until we get to use it on old titles? Like, SD titles upressed to HD. Can DLSS cope with pixelised text?
 