Nvidia DLSS 1 and 2 antialiasing discussion *spawn*

NVIDIA added DLSS alongside RTX in the indie game Deliver Us the Moon. DLSS has been upgraded to a new version with three quality tiers: Performance, Balanced, and Quality, which control the rendering resolution DLSS works from.
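For anyone wondering what the tiers mean in practice, presumably each one just picks a lower internal render resolution that DLSS then upscales to the output. A rough sketch; the scale factors below are placeholders, not official NVIDIA numbers for this game:

Code:
// Illustrative sketch only: the per-tier scale factors are assumptions,
// not official NVIDIA numbers for this game.
enum class DlssTier { Quality, Balanced, Performance };

struct Resolution { int width, height; };

Resolution dlssRenderResolution(Resolution output, DlssTier tier)
{
    float scale = (tier == DlssTier::Quality)  ? 0.67f   // highest internal resolution
                : (tier == DlssTier::Balanced) ? 0.58f
                :                                0.50f;  // Performance: lowest
    return { int(output.width * scale), int(output.height * scale) };
}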
Yeah, I heard DLSS in the game is quite improved; good to see ongoing improvements! Can't wait for some reviews of the game.
 
The most recent implementation, in Control, was called DLSS but was just a post-process upscale done on the shader cores. Nvidia said more work had to be done to get a version using the tensor cores performant enough.
Will this be the first known run of DLSS on the tensor cores? Does that mean we're looking at an actual DNN model running in real time?
 
Will this be the first known run of DLSS on the tensor cores? Does that mean we're looking at an actual DNN model running in real time?
No, there are basically three stages for DLSS:
Before Control: running on the Tensor cores, with variable quality.
Control: running on the shader cores, with great quality.
After Control: running on the Tensor cores again, with great quality.

Have you tried it yet? Please give results! Super interested to hear more
I tried the game. At first glance I couldn't spot any difference between DLSS Quality and DLSS off; of course, more investigation is needed. NVIDIA says it has some advantages and disadvantages.

Pros:

[Screenshots: deliver-us-the-moon-fortuna-nvidia-dlss-comparison-001, -002, -003-v2]

Cons:

[Screenshots: deliver-us-the-moon-fortuna-nvidia-dlss-comparison-005, -006]
 
Will this be the first known run of DLSS on the tensor cores? Does that mean we're looking at an actual DNN model running in real time?
There were previous implementations, but IMO they were quite bad: at best they simply matched standard upscaling, and oftentimes they were quite a bit worse.

Hopefully some high-quality video comparisons emerge, as using screenshots to judge IQ is meaningless.
 
There were previous implementations, but IMO they were quite bad: at best they simply matched standard upscaling, and oftentimes they were quite a bit worse.
Hopefully some high-quality video comparisons emerge, as using screenshots to judge IQ is meaningless.
That's not what he asked! If he wanted your personal opinion I believe he would have so stated.
Are you in a position to comment on whether it is an actual DNN model running, or do you lack the knowledge to comment?
 
Will this be the first known run of DLSS on the tensor cores? Does that mean we're looking at an actual DNN model running in real time?
Weren't the previous implementations of "DLSS" outside of Control all using the tensor cores? It seems quite obvious to me that they were, given the difference in quality we're seeing, and the fact that certain resolutions were only supported on certain graphics cards in some games. I'm referring to the fact that DLSS at lower resolutions was disabled on higher-end cards, because the performance penalty meant it would be worse than just running the game at that lower resolution.

Anyway, whether it's a DNN done on the tensor cores or a shader-based algorithm doesn't particularly matter to me. The upscaling is impressive, and if we can expect this quality or higher in the future, plus these improvements to frame rates, I'm happy. We just need more implementations in more games.
 
Weren't the previous implementations of "DLSS" outside of Control all using the tensor cores? It seems quite obvious to me that they were, given the difference in quality we're seeing, and the fact that certain resolutions were only supported on certain graphics cards in some games. I'm referring to the fact that DLSS at lower resolutions was disabled on higher-end cards, because the performance penalty meant it would be worse than just running the game at that lower resolution.

Anyway, whether it's a DNN done on the tensor cores or a shader-based algorithm doesn't particularly matter to me. The upscaling is impressive, and if we can expect this quality or higher in the future, plus these improvements to frame rates, I'm happy. We just need more implementations in more games.
This information that they ever ran DLSS on CUDA is new to me, so I'll leave it at: I am now unsure.
CUDA cores can do the same thing as Tensor cores, just at a processing-speed disadvantage, with the flexibility of being able to run many different types of machine learning algorithms. Tensor cores are less flexible but much faster at what they do. If it was running on CUDA before, you could run a more complex network on the Tensor cores in the same time frame as on CUDA.
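To illustrate the split: from CUDA, the tensor cores are driven through the WMMA intrinsics, which only do the matrix math; all the surrounding control flow still runs on the regular CUDA-core path. A generic 16x16x16 FP16 tile with FP32 accumulation, nothing DLSS-specific:

Code:
#include <cuda_fp16.h>
#include <mma.h>
using namespace nvcuda;

// Generic tensor-core example (requires sm_70+), not anything from DLSS:
// one warp computes a 16x16x16 half-precision matrix multiply, accumulating
// into FP32. Loads, stores and any branching are ordinary CUDA-core work;
// only mma_sync runs on the tensor cores.
__global__ void wmma_tile(const half *a, const half *b, float *c)
{
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

    wmma::fill_fragment(c_frag, 0.0f);               // C = 0
    wmma::load_matrix_sync(a_frag, a, 16);           // load A tile (leading dim 16)
    wmma::load_matrix_sync(b_frag, b, 16);           // load B tile
    wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);  // C += A * B on the tensor cores
    wmma::store_matrix_sync(c, c_frag, 16, wmma::mem_row_major);
}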
 
This information that they ever ran DLSS on CUDA is new to me, so I'll leave it at: I am now unsure.
CUDA cores can do the same thing as Tensor cores, just at a processing-speed disadvantage, with the flexibility of being able to run many different types of machine learning algorithms. Tensor cores are less flexible but much faster at what they do. If it was running on CUDA before, you could run a more complex network on the Tensor cores in the same time frame as on CUDA.

As stated here back then, Control did not use tensor cores and the algorithm was not ML, but 'inspired by ML results', which I see as a marketing way of saying 'OK fine, we learned we don't need ML or tensor cores just to upscale, but let's stick with the term DLSS'.
But that's just my opinion / impression, and it doesn't really matter how people think of it or what terms they use.

However, your view on the technical side seems somewhat off to me, so I'll phrase it my way in the hope of being corrected if I'm wrong:
1. There is no DLSS running only on the Tensor cores. That's not possible, because tensors can only do math operations, no logic / branching. So every ML algorithm occupies CUDA cores, and tensors help with the bulk of the work.
2. It could be that their newest method does not use any kind of ML algorithm at all, but still uses tensors for acceleration. (Actually I guess that's the case. Matrix multiplications are useful for many things.)
3. I assume there exist low-precision data-type extensions for game APIs that indirectly expose tensors to any shader type.

IMO upscaling with a factor <= 2 is a (too simple) problem where handcrafted solutions should lead to better results at better performance (think of hints from temporal reconstruction, velocity vectors, etc.), and Control showed this is indeed the case (or at least it was back then).

Edit: ... said all this more than once, but it might help with technical misconceptions coming from marketing.
 
1. There is no DLSS running only on the Tensor cores. That's not possible, because tensors can only do math operations, no logic / branching. So every ML algorithm occupies CUDA cores, and tensors help with the bulk of the work.
Most ML algorithms will not branch. If branching were involved within the model itself, it would complete in different time frames. Running multiple policies and selecting the best one would involve branching, say, but I don't believe that is what is happening here. Neural networks are done with weights and activation functions (see the sketch below).

Decision trees are perhaps the only thing I can think of that branch (or have branch-like qualities) at the moment; nothing else really comes to mind.
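For what it's worth, a dense layer really is just a weighted sum plus an activation, with identical control flow for every input. A rough sketch; the kernel and names are purely illustrative, nothing from DLSS:

Code:
// Illustrative sketch only: one fully connected layer as a weighted sum plus
// a ReLU activation. Every thread runs the same instruction sequence; there
// are no data-dependent branches inside the model itself.
__global__ void dense_relu(const float *weights,  // [outputs x inputs], row-major
                           const float *bias,     // [outputs]
                           const float *in,       // [inputs]
                           float *out,            // [outputs]
                           int inputs, int outputs)
{
    int o = blockIdx.x * blockDim.x + threadIdx.x;
    if (o >= outputs) return;                     // bounds check, not model logic

    float acc = bias[o];
    for (int i = 0; i < inputs; ++i)              // weighted sum over the inputs
        acc += weights[o * inputs + i] * in[i];

    out[o] = fmaxf(acc, 0.0f);                    // ReLU: branch-free max with zero
}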

2. It could be that their newest method does not use any kind of ML algorithm at all, but still uses tensors for acceleration. (Actually I guess that's the case. Matrix multiplications are useful for many things.)
Possibly. It's not really DLSS then. It's just a different sampling method.

3. I assume there exist low-precision data-type extensions for game APIs that indirectly expose tensors to any shader type.

IMO upscaling with a factor <= 2 is a (too simple) problem where handcrafted solutions should lead to better results at better performance (think of hints from temporal reconstruction, velocity vectors, etc.), and Control showed this is indeed the case (or at least it was back then).

Possibly. That seems really specific to something GameWorks would support; I don't think it's something they would open to developers.
 
Most ML algorithms will not branch.
But you still need to do the math operations in order and handle the data flow, both handled by the CUDA cores?

Do you think the CUDA cores could be available to other tasks much of the time while running ML algorithms? Probably not more than usual, because Tensor ops are fast but still just basic operations, not whole algorithms.
And do you think most data can be kept in on-chip memory, so bandwidth demand is low?
(Knowing nothing about ML, but wondering if it's a good candidate for async background tasks)

I don't think it's something they would open to developers.
Yeah, I think they don't want / need to tell how their upscaling works because it's proprietary tech of theirs.
But what I see here is lots of confusion and wrong assumptions, like...
* Tensor cores are only for ML and work independently from the rest of the GPU.
* Tensor cores are useless if you do not run ML.
* Tensor cores can only be utilized by NV itself, or from CUDA at best, with other devs having no access at all.

I'm still unsure here myself (mostly about the last point), which should not be the case for a developer interested in GPUs.
I really dislike NV's marketing. Although it's effective, there should be more attention to clarifying things technically on their side.
If my actual understanding is right, it comes only from hearing how the GTX 1660 'shrank' its tensor cores so that only FP16 remained. That fact gave me the best impression of how things really are, finally making tensors more 'useful' than just a target of critique.

Oh, one more question. Off-topic, but I really think we want ML applications in games other than upscaling.
There is a very nice example of AI shown here, see videos at the bottom: https://www.quantamagazine.org/arti...ers-tool-use-in-hide-and-seek-games-20191118/
Those characters play hide-and-seek games in a 3D environment, using dynamic objects simulated by a physics engine, just like games work.
And they learned to use rigid bodies to block doors. Super awesome, but they do not mention performance requirements. What could we expect here?
 
That's interesting, and it could possibly be an amazing way of evolving a game's difficulty according to player skill/progress/learning curve.
 
Looks like DLSS is strong in Wolfenstein: Youngblood, sometimes reconstructing some elements even better than native resolution.
This is technically impossible - you can't improve over what the artist originally intended. Even if the results appear to look better to your eyes, they're not better, because they're wrong.
 
sometimes reconstructing some elements even better than native resolution.
I assume it is because TAA runs at higher quality at the lower resolution than it does at native 4K, which would make sense, because the higher the resolution, the less AA you need.
If TAA contributes to the input of DLSS, lower resolution but better AA should become better data at some point, explaining why the upscaled end result can beat native high res.
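For reference, the core of TAA is just accumulating jittered samples over time. In rough sketch form; illustrative only, real TAA also reprojects with motion vectors and clamps the history, and nothing here is taken from DLSS:

Code:
// Minimal temporal-accumulation sketch (illustrative only): blend the current
// jittered frame into a history buffer. Over many frames this converges toward
// a supersampled result, which is why low-res input with good TAA can carry
// more information than its pixel count suggests.
__global__ void taa_accumulate(const float3 *current,  // current frame sample
                               const float3 *history,  // reprojected previous result
                               float3 *output,
                               int pixel_count,
                               float blend)            // e.g. 0.1: 10% new, 90% history
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= pixel_count) return;

    output[i] = make_float3(
        history[i].x + blend * (current[i].x - history[i].x),
        history[i].y + blend * (current[i].y - history[i].y),
        history[i].z + blend * (current[i].z - history[i].z));
}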

But I don't know if TAA is used at all with DLSS, or if DLSS replaces it?
 
This is technically impossible - you can't improve over what the artist originally intended. Even if the results appear to look better to your eyes, they're not better, because they're wrong.
You can if the native res is combined with TAA, which doesn't represent the elements correctly. In the DF video, DLSS reconstructed a circle better than native res + TAA.
But I don't know if TAA is used at all with DLSS, or if DLSS replaces it?
No, it's immediately disabled upon selecting DLSS; same for Variable Rate Shading.
 