Nvidia DLSS 1 and 2 antialiasing discussion *spawn*

NVIDIA said they will improve the DLSS implementation in Metro and Battlefield. They probably had to rush it to meet some deadlines, and they will apply more training to get to FF15 levels.

Rush it? They've had multiple months now. If it takes multiple months to get satisfactory DLSS for each new game, that doesn't imply good things for DLSS.

Not to mention that only a select few games will ever get decent DLSS, as it has to be trained by NV for each game to get the best results.

Regards,
SB
 
Big surprise: a game with a quasi-infinite variety of viewpoints and a variety of different maps is harder to hallucinate sharply than a demo on rails.
(And FF15 had plenty of undersampling artefacts, as discussed before.)
DLSS is not an algorithm you write once that then has a deterministic outcome; it's highly data dependent, and the more data it needs to predict, the worse it gets on average.

I didn't talk about the undersampling. I talked about the sharpness of the picture.
 
Rush it? They've had multiple months now. If it takes multiple months to get satisfactory DLSS for each new game, that doesn't imply good things for DLSS.

Not to mention that only a select few games will ever get decent DLSS, as it has to be trained by NV for each game to get the best results.

Regards,
SB
Yeah, that's the big downside. A shame it can't process it locally and then upload the data to the cloud or something. Call it crowd-training or something, so that it just gets better over time the more people play it.
 
Sorry, but I don't think you understand how DLSS works. Training is a significant stage between the developers and Nvidia, run on their massive image-processing ML systems prior to actual implementation.
They also appear to do training on the fly, however.

This makes sense to me. According to nVidia, they get frame data from game devs and use that to improve their models. However, those models probably can't take into account all of the possible variations of settings and whatnot. And the models probably also aren't durable between levels/scenes in the games. So you end up with a set of learning models per game, models which make it easier for the on-the-fly learning to get to a good answer quickly. Which would explain why the effectiveness seems to not kick in until a little bit into a scene.

I know that nVidia reps have claimed that their learning setup produces actual pixel colors from the learning algorithm, but I really really doubt it. I just can't believe that a learning model like that could ever produce really good results.

At least with a system that decides how many samples to apply per-pixel, the worst-case behavior is tightly constrained.
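
To illustrate the "tightly constrained worst case" point, here's a trivial sketch (my own toy example, not any shipping technique): the per-pixel sample count is driven by a contrast estimate but hard-capped, so the most expensive possible frame is known up front.

```python
# Toy adaptive sampling with a hard cap -- the worst case is bounded by
# MAX_SAMPLES * num_pixels no matter what the scene looks like.
import numpy as np

MAX_SAMPLES = 8
BASE_SAMPLES = 1

def samples_per_pixel(local_contrast: np.ndarray) -> np.ndarray:
    """Map a per-pixel contrast estimate in [0, 1] to a clamped sample count."""
    extra = np.round(local_contrast * (MAX_SAMPLES - BASE_SAMPLES))
    return (BASE_SAMPLES + extra).astype(int).clip(BASE_SAMPLES, MAX_SAMPLES)

contrast = np.random.rand(1080, 1920)          # stand-in for an edge/contrast mask
counts = samples_per_pixel(contrast)
print("samples this frame:", counts.sum(), "<=", MAX_SAMPLES * counts.size)
```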
 
Rush it? They've had multiple months now. If it takes multiple months to get satisfactory DLSS for each new game, that doesn't imply good things for DLSS.
They are working on multiple titles simultaneously; there are over 30 titles with announced support, not to mention benchmarks and demos. And each title takes a variable amount of time and processing power.
 
The training is done on 64xSSAA rendering using NVIDIA's supercomputer, and the data for the end-user portion of DLSS is delivered with the drivers; you can't do that "on the fly".
Sure you can. I described how earlier. It takes a bit of extra memory bandwidth and storage, as you need to store some relevant portion of the inputs. The score derived is a quick calculation that requires the upscaler to do a small amount of extra work. This then gets pumped into the learning model as another data set.

The pre-training makes this better, but it can be done on the fly just as well as in the datacenter. The difference is that the datacenter training has to try to make one model to fit a huge number of different variables, while the on-the-fly training, if it's done, only has one set of settings, one resolution, and one type of scene to deal with at a time.

Basically, there's no way to explain this algorithm if it changes its result over time (as Tom's Hardware reported) except that it does some on-the-fly training. The reason for the pre-training is to get close, so that the on-the-fly training doesn't have as much to do. Without that training, a game might take minutes to converge on a decent result, and might take minutes again the next time there's a very large change in scene composition. The pre-training cuts that down dramatically. But I still think the on-the-fly training is almost certainly being performed.
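
To make that loop concrete, here's a rough sketch of what I mean. Every name in it (quality_score, update_fn, the replay buffer) is hypothetical; it's just the store-inputs / cheap-score / feed-back-in idea, not anything NVIDIA has documented.

```python
# Speculative sketch: buffer a portion of the upscaler's inputs, derive a cheap
# quality score for each, and periodically hand the (input, score) pairs back
# to the learning model as extra training data.
from collections import deque
import numpy as np

replay = deque(maxlen=4096)                       # "a bit of extra storage"

def quality_score(raw_tile: np.ndarray, upscaled_tile: np.ndarray) -> float:
    # Hypothetical cheap metric: how much detail the upscaler added vs. its input.
    return float(np.var(upscaled_tile) - np.var(raw_tile))

def on_frame(raw_tile: np.ndarray, upscaled_tile: np.ndarray) -> None:
    # The "small amount of extra work" per frame: score and stash the inputs.
    replay.append((raw_tile.copy(), quality_score(raw_tile, upscaled_tile)))

def refine_model(update_fn) -> None:
    # Periodically feed the buffered pairs into whatever update routine the
    # learning model exposes -- the "pumped into the learning model" step.
    for raw_tile, score in replay:
        update_fn(raw_tile, score)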
 
Sure you can. I described how earlier. It takes a bit of extra memory bandwidth and storage, as you need to store some relevant portion of the inputs. The score derived is a quick calculation that requires the upscaler to do a small amount of extra work. This then gets pumped into the learning model as another data set.

The pre-training makes this better, but it can be done on the fly just as well as in the datacenter. The difference is that the datacenter training has to try to make one model to fit a huge number of different variables, while the on-the-fly training, if it's done, only has one set of settings, one resolution, and one type of scene to deal with at a time.

Basically, there's no way to explain this algorithm if it changes its result over time (as Tom's Hardware reported) except that it does some on-the-fly training. The reason for the pre-training is to get close, so that the on-the-fly training doesn't have as much to do. Without that training, a game might take minutes to converge on a decent result, and might take minutes again the next time there's a very large change in scene composition. The pre-training cuts that down dramatically. But I still think the on-the-fly training is almost certainly being performed.
How exactly do you propose this "on-the-fly training" works when the end user doesn't render at native res with 64xSSAA? You only have the reference data to work with.
I'm pretty sure any possible upgrades come via future driver updates, if/when NVIDIA releases new reference data for end users. You can't train from lower-resolution scenes.
 
Basically, there's no way to explain this algorithm if it changes its result over time (as Tom's Hardware reported) except that it does some on-the-fly training.

How about temporal reprojection? Does it change over minutes or over seconds? Where did you get that idea from anyway?
 
How exactly do you propose this "on-the-fly training" works when the end user doesn't render at native res with 64xSSAA? You only have the reference data to work with.
I'm pretty sure any possible upgrades come via future driver updates, if/when NVIDIA releases new reference data for end users. You can't train from lower-resolution scenes.
By measuring contrast.

To train a learning model, you need three things:
1) Input data.
2) A compute process which includes the learning model.
3) A scoring system.

The trick for on-the-fly training is that scoring system. You could do it by using high-sample AA on a small fraction of pixels, and only using those pixels in training. But there are other hacks that might work really well too.
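
As a sketch of that kind of scoring system (purely speculative; render_supersampled is a placeholder for some expensive reference render): supersample only a tiny random fraction of pixels each frame and score the upscaled output against just those.

```python
# Hypothetical sparse scoring: compare the upscaled frame to an expensive
# reference at ~0.1% of pixels, and use that error as the training score.
import numpy as np

def sparse_reference_score(upscaled: np.ndarray,
                           render_supersampled,      # placeholder ground-truth renderer
                           fraction: float = 0.001) -> float:
    h, w = upscaled.shape[:2]
    n = max(1, int(h * w * fraction))
    ys = np.random.randint(0, h, n)
    xs = np.random.randint(0, w, n)
    reference = np.array([render_supersampled(y, x) for y, x in zip(ys, xs)])
    error = upscaled[ys, xs] - reference
    return float(np.mean(error ** 2))                # lower = closer to the reference
```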
 
By measuring contrast.

To train a learning model, you need three things:
1) Input data.
2) A compute process which includes the learning model.
3) A scoring system.

The trick for on-the-fly training is that scoring system. You could do it by using high-sample AA on a small fraction of pixels, and only using those pixels in training. But there are other hacks that might work really well too.


Granted, I'm no expert on the matter by any stretch of the imagination, but I'm not sure what you're suggesting would actually work either.
Regardless, let's assume it could work like you described; now we just have to wait and see if there's ever going to be anything like that. So far nothing has suggested any such possibility; in fact, everything NVIDIA and the game devs have said indicates there are no such elements involved.
 
Metro will get a day one patch which will improve DLSS over the review code:

https://www.metrothegame.com/news/patch-notes-summary/

I'm still surprised by the bad quality in Metro and Battlefield. The implementation in both games is far from FF15, which looks ultra sharp and crisp with DLSS...

FFXV has an art style conducive to deep-learning reconstruction: there are repeated patterns everywhere, and half the assets are rather low poly/detail, so the reconstruction needs less training and is more likely to be accurate. The more detail there is in a scene, however, and the more varied the scene, the worse deep learning becomes at reconstruction, as it needs to guess at ever more possibilities for what those pixels are supposed to be.
 
Granted, I'm no expert on the matter by any stretch of the imagination, but I'm not sure what you're suggesting would actually work either.
Regardless, let's assume it could work like you described; now we just have to wait and see if there's ever going to be anything like that. So far nothing has suggested any such possibility; in fact, everything NVIDIA and the game devs have said indicates there are no such elements involved.
Yeah, it's possible I've inferred more than is there. It doesn't matter all that much. But if it's working the way I think it is, then it's something that nVidia could easily keep secret.

I just don't think there's any way the algorithm can change over time if it's not doing something like this. There are all sorts of ways to deal with the issue for rapid on-the-fly training, such as using the large-scale learning algorithm to produce a simplified model set which has fewer tweakable parameters. Having fewer parameters that can change values accelerates learning.
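
A toy illustration of the "fewer tweakable parameters" idea, under my own assumptions (this is not NVIDIA's design): the big pre-trained weights stay frozen and only a handful of blend weights get nudged at runtime, so each online step touches 4 numbers instead of 64.

```python
# Speculative sketch: a frozen pre-trained matrix plus a tiny set of
# runtime-adjustable blend weights that an online gradient step can update.
import numpy as np

rng = np.random.default_rng(0)
W_frozen = rng.standard_normal((4, 16))   # shipped with the driver, never touched
blend = np.ones(4) / 4                    # the only runtime-tweakable parameters

def predict(x: np.ndarray) -> float:
    return float(blend @ (W_frozen @ x))

def online_step(x: np.ndarray, target: float, lr: float = 1e-3) -> None:
    """One cheap gradient step on 4 parameters instead of all 64."""
    global blend
    features = W_frozen @ x
    error = float(blend @ features) - target
    blend -= lr * error * features
```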
 
Yeah, it's possible I've inferred more than is there. It doesn't matter all that much. But if it's working the way I think it is, then it's something that nVidia could easily keep secret.

I just don't think there's any way the algorithm can change over time if it's not doing something like this. There are all sorts of ways to deal with the issue for rapid on-the-fly training, such as using the large-scale learning algorithm to produce a simplified model set which has fewer tweakable parameters. Having fewer parameters that can change values accelerates learning.
What if NVIDIA continues training on their servers and releases updates once in a while?
 
What if NVIDIA continues training on their servers and releases updates once in a while?
If my supposition is correct, the issue isn't so much training. The thing is, actually running the learning model on each pixel has to be fast. And I'm just skeptical that a fully pre-trained system can get there.

Here's the basic idea. Let's say that they have access to 16 input numbers when computing the DLSS result per-pixel and a simple learning model. With 16 input numbers, the learning algorithm will try to find some number of correlations between those 16 inputs and the final result. Each correlation factor increases the size of the model. For instance, if they just took the bare 2-point correlation (which has a different value for each pair of 16 inputs, including the self-correlation), then they've got 136 independent factors (N(N+1)/2). The model might even take into account 3-point correlations (which will have many, many more). But typically they'll discard factors which are too small in value to be meaningful, and only keep the top N factors (say, 50 or so). In that case, that means that in order to compute the final DLSS result, they've got to perform something like 50 scalar multiply-add operations to produce the answer.
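
Here's that back-of-the-envelope model written out (my reading of the paragraph above, not any known DLSS internal): 16 inputs give 16·17/2 = 136 pairwise factors, keep the ~50 strongest, and the per-pixel cost is roughly 50 multiply-adds.

```python
# Illustrative only: a per-pixel model built from the strongest 2-point
# correlation factors of 16 inputs.
import numpy as np
from itertools import combinations_with_replacement

N_INPUTS = 16
pairs = list(combinations_with_replacement(range(N_INPUTS), 2))
assert len(pairs) == N_INPUTS * (N_INPUTS + 1) // 2      # 136 factors

KEPT = 50
kept_pairs = pairs[:KEPT]                                # stand-in for "top 50 by magnitude"
weights = np.random.default_rng(1).standard_normal(KEPT) # learned offline

def dlss_pixel(inputs: np.ndarray) -> float:
    """~50 scalar multiply-adds per pixel: one per kept correlation factor."""
    feats = np.array([inputs[i] * inputs[j] for i, j in kept_pairs])
    return float(weights @ feats)
```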

And, well, that's probably not enough to get a good result. You could probably train a model like the above to work really well on a narrow set of different scene types. But there are a huge number. This is where deep learning comes in.

With deep learning, instead of just a single network, you split the network into multiple levels. At the lowest level you have a set of nodes which are used to compute the final numbers. The next level up decides which of those nodes to consider to be more important than others. Conceptually it's like the lower-level nodes compute the final results, while the higher-level nodes determine which set of low-level nodes to use. So it's kind of like the low-level nodes represent a class of models, while the high-level nodes determine which models to use.

More low-level nodes might represent good accuracy on certain scenes, while more high-level nodes represent flexibility in the number of different sorts of scenes you can render well.
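
A toy version of that split, purely to show the shape of it (nothing here is NVIDIA's architecture): low-level nodes each produce a candidate result, and a gating layer decides how much each candidate counts for the current pixel.

```python
# Two-level toy network: W_low nodes compute candidate results,
# W_gate nodes decide which candidates to trust for this input.
import numpy as np

rng = np.random.default_rng(2)
N_IN, N_LOW = 16, 8
W_low = rng.standard_normal((N_LOW, N_IN))    # low level: one candidate output per node
W_gate = rng.standard_normal((N_LOW, N_IN))   # high level: importance of each candidate

def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max())
    return e / e.sum()

def predict(x: np.ndarray) -> float:
    candidates = np.tanh(W_low @ x)           # "compute the final results"
    gates = softmax(W_gate @ x)               # "which set of low-level nodes to use"
    return float(gates @ candidates)
```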

The reason why on-the-fly training might be useful is that the pre-baked model has to cope with every single game situation that is ever thrown at the video card. The huge number of potential configurations, games, and situations within those games could easily cause an optimal learning algorithm for the above to explode drastically. You could gain a little bit of mileage by simply asking the game dev to tell you when you've loaded a particular level, so that nVidia just trains different learning models for each level, and switches the model as you progress throughout the game.

But it might be better to break up the info a bit: use the datacenter for doing the lion's share of effort in training the model. But let the video card take care of the rest by re-training some of the high-level nodes as it renders scenes. You'd have to take this re-training into account in the datacenter training, but it should increase the flexibility of the system as a whole, because it makes it so that the learning model doesn't really need to retain all of the information about all of the different types of models: it just kicks things off and lets the video card figure the rest out as it goes.

Basically, with on-the-fly training you could get away with a far, far smaller learning model, which means higher performance.
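
Sketching that division of labour under the same assumptions (again, my speculation): the driver ships frozen low-level weights, and the card only nudges the small high-level "which model to use" weights as real frames come in.

```python
# Speculative split: low-level weights are frozen (trained in the datacenter),
# only the high-level selection weights are updated on the fly.
import numpy as np

rng = np.random.default_rng(3)
N_IN, N_LOW = 16, 8
W_low = rng.standard_normal((N_LOW, N_IN))          # frozen, shipped with the driver
W_high = rng.standard_normal((N_LOW, N_IN)) * 0.1   # small, retrained on the card

def predict(x: np.ndarray):
    candidates = np.tanh(W_low @ x)                 # fixed low-level "models"
    gates = W_high @ x                              # trainable high-level selection
    return float(gates @ candidates), candidates

def online_update(x: np.ndarray, target: float, lr: float = 1e-3) -> None:
    """One cheap gradient step that touches only the high-level weights."""
    global W_high
    y, candidates = predict(x)
    W_high -= lr * (y - target) * np.outer(candidates, x)
```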

Of course, the above argument might not actually apply. It's conceivable I misunderstood what Tom's Hardware measured, with it taking a little bit for the model to "kick in". Perhaps different scenes in different games really are similar enough that a single model with no more than a couple hundred parameters or so can do the trick. But it seems unlikely.
 
The reason why on-the-fly training might be useful is that the pre-baked model has to cope with every single game situation that is ever thrown at the video card. The huge number of potential configurations, games, and situations within those games could easily cause an optimal learning algorithm for the above to explode drastically.

I agree with all these limitations of a ML done a priori solution. The difference is you take that to mean Nvidia's solution must be doing something better. I take it to mean Nvidia is just shoehorning a dead-end use for the tensor cores they've included in their GPUs.
 
DLSS Explained (super dumbed down explanation with a pinch of PR Boogaloo)
https://www.nvidia.com/en-us/geforce/news/nvidia-dlss-your-questions-answered/
Q: How does DLSS work?
A: The DLSS team first extracts many aliased frames from the target game, and then for each one we generate a matching “perfect frame” using either super-sampling or accumulation rendering. These paired frames are fed to NVIDIA’s supercomputer. The supercomputer trains the DLSS model to recognize aliased inputs and generate high quality anti-aliased images that match the “perfect frame” as closely as possible. We then repeat the process, but this time we train the model to generate additional pixels rather than applying AA. This has the effect of increasing the resolution of the input. Combining both techniques enables the GPU to render the full monitor resolution at higher frame rates.
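
Condensed into code, the recipe in that answer is essentially supervised training on (aliased, "perfect") frame pairs. The linear "model" below is a toy stand-in just to show the loop, not NVIDIA's network.

```python
# Toy supervised loop over paired frames: push the model's output toward the
# 64xSSAA / accumulation-rendered "perfect frame".
import numpy as np

rng = np.random.default_rng(4)
W = rng.standard_normal((256, 256)) * 0.01          # toy linear stand-in for the network

def train_step(aliased: np.ndarray, perfect: np.ndarray, lr: float = 1e-4) -> float:
    """One step: model(aliased) should match the perfect frame as closely as possible."""
    global W
    prediction = W @ aliased
    error = prediction - perfect
    W -= lr * (error @ aliased.T)                   # gradient of 0.5*||W a - p||^2
    return float(np.mean(error ** 2))

# Repeating the process with higher-resolution targets would correspond to the
# second stage the Q&A mentions, where the model generates additional pixels.
```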
 