Nvidia DLSS 1 and 2 antialiasing discussion *spawn*

PS. upscaling it is, yeah calling that supersampling is taking the piss.

OK. There's a big difference between saying DLSS has poor performance, and saying it's a straight-up lie about what the algorithm is doing.
MfA's on the money here. DLSS is not doing supersampling, which means taking more data than there are pixels, i.e. more samples per pixel. DLSS is definitely undersampling.

The choice of calling it supersampling is marketing driven, because everyone knows supersampling is the best, most demanding AA technique. It seems a name deliberately chosen to suggest that you are using some form of supersampling technique.
 
DLUS would be more honest for an undersampling filter. Does it upscale or does it try to anti-alias an image?

PS. upscaling it is, yeah calling that supersampling is taking the piss.

PPS. If the filter isn't spatio-temporal, it's going to be severely restricted in introducing new high-frequency detail.

Indeed DLSS is not a supersampling technique, quite the opposite.
If you have aliasing artifacts at your output resolution, the worst thing you can do is reduce the resolution, as that will just introduce even more aliasing artifacts.
And that's exactly what DLSS does. Trying to make a bad situation better by first making it worse is just stupid.

As an example, you can subsample a circle to the point where it actually becomes a square. No matter how many supersampled circles you show to your neural network, it will never know how to turn this square back into a circle, because the square could have come from a circle, but could also just be a square.
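A minimal numpy sketch of that information loss (illustrative only; the resolution and radius are arbitrary). Once the samples are taken, a filled square and a circle produce the same data, so any reconstruction back to a circle is a guess informed by training rather than by the samples:

```python
import numpy as np

def rasterize_circle(res, radius=0.45):
    """Binary coverage mask of a centred circle at a given resolution."""
    ys, xs = np.mgrid[0:res, 0:res]
    cx = (xs + 0.5) / res - 0.5   # pixel centres, normalised to [-0.5, 0.5)
    cy = (ys + 0.5) / res - 0.5
    return (cx ** 2 + cy ** 2 <= radius ** 2).astype(float)

def point_subsample(img, out_res):
    """Undersample: keep one point sample per output pixel, no averaging."""
    step = img.shape[0] // out_res
    return img[step // 2::step, step // 2::step][:out_res, :out_res]

circle = rasterize_circle(256)       # high-resolution "ground truth"
tiny = point_subsample(circle, 2)    # 2x2 point samples: all land inside the circle
print(tiny)                          # [[1. 1.] [1. 1.]] -- indistinguishable from a square
```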

That said, DLSS is just one of many upscalers, and there are a lot of them, including ones that try to add detail, like fractal scalers, etc.
 
MfA's on the money here. DLSS is not doing supersampling, which means taking more data than there are pixels, i.e. more samples per pixel. DLSS is definitely undersampling.

The choice of calling it supersampling is marketing driven, because everyone knows supersampling is the best, most demanding AA technique. It seems a name deliberately chosen to suggest that you are using some form of supersampling technique.
You mean in the sense the algorithm is taking a 4K image and bringing it back down to 1440p? That would be traditional supersampling; if that is the definition, this algorithm is not doing that.

Definition from Wikipedia:
Supersampling is a spatial anti-aliasing method, i.e. a method used to remove aliasing (jagged and pixelated edges, colloquially known as "jaggies") from images rendered in computer games or other computer programs that generate imagery. Aliasing occurs because unlike real-world objects, which have continuous smooth curves and lines, a computer screen shows the viewer a large number of small squares. These pixels all have the same size, and each one has a single color. A line can only be shown as a collection of pixels, and therefore appears jagged unless it is perfectly horizontal or vertical. The aim of supersampling is to reduce this effect. Color samples are taken at several instances inside the pixel (not just at the center as normal), and an average color value is calculated. This is achieved by rendering the image at a much higher resolution than the one being displayed, then shrinking it to the desired size, using the extra pixels for calculation. The result is a downsampled image with smoother transitions from one line of pixels to another along the edges of objects.

The number of samples determines the quality of the output.
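As a concrete illustration of the quoted definition, a minimal supersampling sketch (the shade function is a hypothetical stand-in for a renderer; a simple box filter does the averaging):

```python
import numpy as np

def supersample(shade, width, height, n=4):
    """Render at n x n samples per output pixel, then box-filter back down.
    shade(x, y) returns a colour for normalised [0, 1) screen coordinates."""
    hi_w, hi_h = width * n, height * n
    ys, xs = np.mgrid[0:hi_h, 0:hi_w]
    img_hi = shade((xs + 0.5) / hi_w, (ys + 0.5) / hi_h)          # extra samples inside each pixel
    return img_hi.reshape(height, n, width, n).mean(axis=(1, 3))  # average each n x n block

# A hard diagonal edge aliases at 1 sample/pixel but gets smooth coverage values with more.
edge = lambda x, y: (y > x).astype(float)
jaggy  = supersample(edge, 64, 64, n=1)   # 1 sample per pixel: only 0s and 1s, jaggies
smooth = supersample(edge, 64, 64, n=4)   # 16 samples per pixel: fractional edge coverage
```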

Which is indeed how the algorithm is trained. The up-resolution to 4K is not supersampling by definition.

This is the first step of DLSS before the up-resolution step. You can run DLSS without up-resolution as well.
 
No, it is supersampling of sorts. It's bringing in more (hallucinated) samples than there are pixels. Even though they are hallucinations from the AI, they are samples that are not directly inferred from the existing pixels, like an algorithm would do, but rather from training and previous knowledge.

An example I can think of (probably not the best) is: I give you a text with many of the letters missing, and based on your training/knowledge/experience you can fill in the blanks. If you do a perfect job, the result will be indistinguishable from the real text. Your experience/training has provided the missing samples (supersampling). Without training/previous knowledge, there's no algorithm that can fill in the blanks based only on the letters that I provide you (undersampling).
 
An example I can think of (probably not the best) is: I give you a text with many of the letters missing, and based on your training/knowledge/experience you can fill in the blanks. If you do a perfect job, the result will be indistinguishable from the real text. Your experience/training has provided the missing samples (supersampling). Without training/previous knowledge, there's no algorithm that can fill in the blanks based only on the letters that I provide you (undersampling).

Isn't this example more like context-aware error correction? Humans are very good at this.
 
Added, and I'm curious what a simple bicubic or lanczos upscaler would make of the 1440p image.
The issue here is pretty interesting, because DLSS is trained from supersampled images, so that's something that's hard to account for in our discussion.

So if we work with 1080p TAA vs 1080p DLSS: Nvidia takes a 1080p image with no AA, renders the same frame at 4K, and supersamples it back down to 1080p. That supersampled image is what the model is trained against for the AA portion.
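A toy sketch of that training pair (the "renderer" here is just a placeholder shading function, not Nvidia's pipeline; the pair construction follows the description above):

```python
import numpy as np

def render(shade, w, h):
    """Toy renderer: one shading sample per pixel (stands in for a game frame)."""
    ys, xs = np.mgrid[0:h, 0:w]
    return shade((xs + 0.5) / w, (ys + 0.5) / h)

def box_downsample(img, factor):
    """Average factor x factor blocks: the 'supersample back down' step."""
    h, w = img.shape[0] // factor, img.shape[1] // factor
    return img.reshape(h, factor, w, factor).mean(axis=(1, 3))

# Hypothetical training pair, as described in the post:
scene = lambda x, y: (np.sin(40 * x) * np.cos(40 * y) > 0).astype(float)
x_input  = render(scene, 1920, 1080)                      # aliased 1080p, no AA (network input)
y_target = box_downsample(render(scene, 3840, 2160), 2)   # 4K supersampled down to 1080p (target)
```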

If you want to test Nvidia DLSS's performance:
You need to run the same image supersampled.
1440p supersampled from 4K
vs 1440p DLSS
And see how close DLSS is to the source.
Then the supersampled 1440p image needs to somehow go through a basic upscaler and be compared against 4K DLSS.
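One simple way to score that closeness would be a metric like PSNR (just an illustrative choice; the thread doesn't settle on a metric):

```python
import numpy as np

def psnr(reference, test, peak=1.0):
    """Peak signal-to-noise ratio between a reference and a test image, in dB.
    Higher means the test image is closer to the reference."""
    mse = np.mean((np.asarray(reference, dtype=np.float64)
                   - np.asarray(test, dtype=np.float64)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

# e.g. psnr(supersampled_1440_reference, dlss_1440_output)
#  vs  psnr(supersampled_1440_reference, bicubic_1440_output)
```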
 
An example I can think of (problably not the best) is: I give you a text with many of the letters missing and based on your training/knowledge/experience you can fill in the blanks. If you do a perfect job, the result will be indistinguisable from the real text. Your experience/training has provided the missing samples (supersampling). Without training/previous knowledge there's no algorithm that can fill in the blanks, only based on the letters that I provide you (undersampling).

Now do it with a neighbourhood of just 5 words ... there's only so much a local, non-temporal filter can do.
 
Now do it with a neighbourhood of just 5 words ... there's only so much a local, non-temporal filter can do.
AI Up-Resolution is a very successful ML application. The greatest restriction is how much processing time is allowed to complete the task.
That puts heavy restrictions on how many inputs can be processed and on the type of algorithms or number of policies it can run to get everything done within the budget (IIRC, <7 ms).
This is also an algorithm that works post-rendering, IIRC. Developers are not providing additional inputs into the algorithm except perhaps a motion vector.
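For scale, the rough frame-budget arithmetic, assuming a 60 fps target (the <7 ms figure is the poster's recollection, not a confirmed number):

```latex
t_{\text{frame}} = \frac{1}{60\ \text{fps}} \approx 16.7\ \text{ms},
\qquad
16.7\ \text{ms} - 7\ \text{ms} \approx 9.7\ \text{ms left for everything else in the frame.}
```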

I don't understand your rebuttal; missing input data is the problem we're trying to solve with these algorithms. The goal is to not have to generate that input data, because it's expensive to do so.
 
You mean in the sense the algorithm is taking a 4K image and bringing it back down to 1440p?
No. In the sense that numerically, if you want 4K, or 8 million pixels on screen, you have to render more than 4K, more than 8 million samples, to get supersampled data. That's mathematical. For 2x SSAA, you'd need to render 16 million samples. If you aren't generating more samples than pixels, you aren't supersampling.
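Spelling those numbers out:

```latex
3840 \times 2160 = 8{,}294{,}400 \approx 8.3\ \text{M pixels},
\qquad
2{\times}\ \text{SSAA}:\ 2 \times 8{,}294{,}400 = 16{,}588{,}800 \approx 16.6\ \text{M samples}.
```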

Inferring or hallucinating or reconstructing data from less data than pixels cannot ever be supersampling. It needs a different term. It doesn't matter how supersampled the source data is when training, as it's the step at the end generating the pixels that matter.

If DLSS can legitimately be called supersampling, then a game that shows pre-raytraced images on a bunch of in-game models could legitimately be called real-time raytracing. ;) DLSS is machine upscaling, and there shouldn't be any association with oversampling of the scene. Even if ML based upscaling ends up producing better results than supersampling, it should never be called supersampling as that's technically incorrect.

If you do a perfect job, the result will be indistinguishable from the real text. Your experience/training has provided the missing samples (supersampling)
That's redefining what supersampling is! ;) Supersampling is a method, not a result. If you use rasterising to produce a scene indistinguishable from the same scene raytraced, you haven't thus raytraced the scene. Supersampling is the process of taking more samples than there are pixels and averaging them to produce a pixel value. Anything that takes a number of samples less than or equal to the number of pixels of the display cannot be supersampling and should be called something else that clearly defines what it's doing. Deep Learning Reconstruction or Deep Learning Hallucination or Deep Learning Upscaling or Deep Learning Imagery Imbetterment, but not Deep Learning Super Sampling.

Of course Deep Learning Upscaling doesn't sound like it's doing as much as Deep Learning Super Sampling, regardless of results, and rather than sell a card whose secondary main feature is a form of upscaling that gamers associate with 'yuck', nV decided to associate the results with supersampling, which gamers associate with 'tasty'.
 
No. In the sense that numerically, if you want 4K, or 8 million pixels on screen, you have to render more than 4K, more than 8 million samples, to get supersampled data. That's mathematical. For 2x SSAA, you'd need to render 16 million samples. If you aren't generating more samples than pixels, you aren't supersampling.

Inferring or hallucinating or reconstructing data from less data than pixels cannot ever be supersampling. It needs a different term. It doesn't matter how supersampled the source data is when training, as it's the step at the end generating the pixels that matter.

If DLSS can legitimately be called supersampling, then a game that shows pre-raytraced images on a bunch of in-game models could legitimately be called real-time raytracing. ;) DLSS is machine upscaling, and there shouldn't be any association with oversampling of the scene. Even if ML based upscaling ends up producing better results than supersampling, it should never be called supersampling as that's technically incorrect.
But DLSS is an anti-aliasing method, as is supersampling.
The Deep Learning algo is trained off supersampled images and trained to recreate them using the input image. By definition, it cannot do better than the source, which is a supersampled image.
4K DLSS is misleading because it didn't supersample 4K to 8K and train off 4K supersampled images. It did its supersampling from 4K down to 1440p, then performed an up-resolution to 4K.
 
AI Up-Resolution is a very successful ML application.

Multi-frame superresolution using neural networks is a popular research area for video. AFAICS DLSS is single frame, which has some inherent limitations ... you can't really use the hallucination approach because it's very unlikely to be temporally stable.
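An illustrative sketch of the multi-frame idea being contrasted here (not DLSS's actual pipeline): reproject the previous frame's result with per-pixel motion vectors and blend, so each output pixel accumulates samples over time.

```python
import numpy as np

def temporal_accumulate(current, history, motion, alpha=0.1):
    """Blend the current frame with the motion-reprojected history buffer.
    motion[y, x] = (dx, dy) offset, in pixels, from this pixel back to where
    the same surface point was in the previous frame."""
    h, w = current.shape
    ys, xs = np.mgrid[0:h, 0:w]
    px = np.clip(np.round(xs + motion[..., 0]).astype(int), 0, w - 1)
    py = np.clip(np.round(ys + motion[..., 1]).astype(int), 0, h - 1)
    reprojected = history[py, px]    # nearest-neighbour fetch, for brevity
    # Exponential moving average: older samples keep contributing, which is
    # what gives temporal methods their extra effective samples per pixel.
    return alpha * current + (1 - alpha) * reprojected
```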
 
That's redefining what supersampling is! ;) Supersampling is a method, not a result. If you use rasterising to produce a scene indistinguishable from the same scene raytraced, you haven't thus raytraced the scene. Supersampling is the process of taking more samples than there are pixels and averaging them to produce a pixel value. Anything that takes a number of samples less than or equal to the number of pixels of the display cannot be supersampling and should be called something else that clearly defines what it's doing. Deep Learning Reconstruction or Deep Learning Hallucination or Deep Learning Upscaling or Deep Learning Imagery Imbetterment, but not Deep Learning Super Sampling.

Of course Deep Learning Upscaling doesn't sound like it's doing as much as Deep Learning Super Sampling, regardless of results, and rather than sell a card whose secondary main feature is a form of upscaling that gamers associate with 'yuck', nV decided to associate the results with supersampling, which gamers associate with 'tasty'.

Fair enough with regard to the actual and very specific definition of what the supersampling method is. However, I was going more by your loose definition here.

DLSS is not doing supersampling, which means taking more data than there are pixels, i.e. more samples per pixel. DLSS is definitely undersampling.

Which, again, means DLSS is taking far more data than the provided pixels in order to produce the end result. And that data is pixels, pixels provided by the ground-truth images during training.

EDIT: I'm not saying that the name is appropriate, BTW, but it's definitely not undersampling, since it's a case of providing more samples than there are pixels.
EDIT2: And for what it's worth, generally/in a loose sense, I also tend to consider temporal methods as supersampling, in the sense that they provide more samples than there are pixels, even though I know that supersampling is defined as a spatial AA method.
 
Multi-frame superresolution using neural networks is a popular research area for video. AFAICS DLSS is single frame, which has some inherent limitations ... you can't really use the hallucination approach because it's very unlikely to be temporally stable.
I believe motion vectors are an added input for DLSS, so I'm also unsure.
All AA techniques can produce artifacts or shimmering, though. It becomes a question of user preference, I think, as to which they prefer.
 
But DLSS is an anti-aliasing method, as is supersampling.
If just using antialiasing and not upscaling, it's still not supersampling but a different form of AA. MLAA could produce results exactly like SS on bold shapes, and it was using 'more information' than pixels as it was coded with identified pixel patterns, but it wasn't ever supersampling because it wasn't sampling the source scene at more samples than pixels. DLSS is not sampling the source scene at more samples than pixels. It's sampling the scene at the same number of samples as pixels, and then picking data from a library or whatever to try to match those samples to a reference image. That method may end up the best method ever for antialiasing, but it still ain't supersampling. ;)

The Deep Learning algo is trained off supersampled images and trained to recreate them using the input image. By definition, it cannot do better than the source, which is a supersampled image.
Potentially ML reconstruction could do better than supersampling. Indeed, DLSS may one day produce far better results than 2xSS which isn't great on edges.
 
If just using antialiasing and not upscaling, it's still not supersampling but a different form of AA. MLAA could produce results exactly like SS on bold shapes, and it was using 'more information' than pixels as it was coded with identified pixel patterns, but it wasn't ever supersampling because it wasn't sampling the source scene at more samples than pixels. DLSS is not sampling the source scene at more samples than pixels. It's sampling the scene at the same number of samples as pixels, and then picking data from a library or whatever to try to match those samples to a reference image. That method may end up the best method ever for antialiasing, but it still ain't supersampling. ;)

Potentially ML reconstruction could do better than supersampling. Indeed, DLSS may one day produce far better results than 2xSS which isn't great on edges.
Right, I don't disagree with this. I'm strictly talking about the naming here.
For analogy's sake: we have a race track and we have a car.
If I feed the AI data from the best Formula 1 driver driving that car around the lap vs a civilian driving the same car around that lap, we're going to see the AI drive very differently. It will attempt to reproduce the results it was trained on.

So F1AI vs CivilianAI. Those drivers aren't behind the wheel, but that's how it was trained. And the track time of the F1-trained AI would certainly be faster than the civilian-trained one.

We could do the same thing in rendering land and have it trained on MSAA, FXAA, SSAA, or TAA, and the deep learning would try to emulate whichever antialiasing method it was trained on.
In this case, Nvidia chose SSAA to be the output, thus I'm unsure whether it's technically misleading that they called it DLSS, as they used deep learning to emulate SSAA.
I need to be clear: it's not picking data from a library. I thought it was, but reading Nvidia's document, it's not. It's being trained to emulate SSAA.

f(x) = 1440SSAA(x), where x is the 1440p aliased image.
How close that gets to ground truth is the accuracy factor. The missing piece of the puzzle in our discussion is what SSAA technique Nvidia chose to deploy.
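Written as a training objective in the post's notation (the squared-error loss is my placeholder, not Nvidia's disclosed loss):

```latex
\theta^{*} = \arg\min_{\theta}\ \mathbb{E}_{x}\Big[\ \big\lVert f_{\theta}(x) - \mathrm{1440SSAA}(x) \big\rVert^{2}\ \Big],
\qquad x = \text{the 1440p aliased image.}
```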
 
Okay, my definition needed to be more precise about what data.

Yeah, I was going by number of samples > pixels == supersampling, my bad really. We are really talking about upsampling here, which would be the opposite of undersampling rather than supersampling. The name really is inappropriate then, in light of that.

Iroboto already mentioned another of the possible misunderstandings, in that I understand it as being an upsampled 1440p image, then upscaled to 4K. Inferring that it is supersampled 4K is wrong, but I don't think the name implies that? I think they've always been open about it being a lower resolution.
 
So, looks like Tensor Cores will be addressable through the recently released DirectML API, the same way RT Cores are addressable through DXR.


Microsoft has partnered with NVIDIA, an industry leader in both graphics and AI in our design and implementation of metacommands.

We are confident that when vendors implement the operators themselves (vs using our HLSL shaders), they will get better performance for two reasons: their direct knowledge of how their hardware works and their ability to leverage dedicated ML compute cores on their chips. Knowledge of cache sizes and SIMD lanes, plus more control over scheduling are a few examples of the types of advantages vendors have when writing metacommands. Unleashing hardware that is typically not utilized by D3D12 to benefit machine learning helps prove out incredible performance boosts.

[Image: perfchart.png (performance chart from the linked DirectX blog post)]

https://devblogs.microsoft.com/directx/gaming-with-windows-ml/
 