Nvidia DLSS 1 and 2 antialiasing discussion *spawn*

I think it's 100% marketing. They made a new card that outperformed the previous gen by 30%, so to fluff that up they created a new AA technique that ran way better on the new hardware (or conversely much worse on the previous gen, which isn't inconceivable considering Nvidia's past actions) to make the performance divide look much larger.

The fact that it isn't particularly good would give credence to the idea.
 
I would wait for proper tests with the native-resolution version before judging it a failed attempt.

If it's faster than normal TAA and handles noise better, even within a single frame, it or similar techniques have a place in games.
 
It's not a case of whether they have a place in games or not; it's a matter of whether that silicon is best spent on Tensor cores versus CUs. If denoising and upscaling can be handled better and more efficiently on shaders, it's a failed attempt (well, a shoe-horned attempt). And it would seem upscaling can be handled better on shaders, so now we have to see about denoising.
 
It just illustrates what Shifty Geezer was saying: it's just marketing hype on what else they did to make use of that non-gaming die space. (4K AA..?)

The RTX "Gaming" line is just leftover HPC dies, and in this case, they preferred to keep the same design rather than having a cut down "GTX" part without Tensor and RT cores because there wouldn't be enough of a difference from Pascal at 12nm. Keeping one design is probably cost effective for Nvidia as well, it's just they have to find uses for the Tensor and RT cores for general gaming applications. They didn't add Tensor and RT cores because games need those, they've added RTX and DLSS because they had to make use of the additional hardware to justify the cost increase.
 
I'm anxiously waiting to see what DLSS 2x is all about. As it stands, DLSS is interesting, but it's not something I'd choose over native resolution. I think solutions like dynamic resolution and checkerboard rendering are really great for TVs, where you're supposed to sit quite far away, but on PC... not so hot, at least from my perspective. I sit at my desk and have the monitor quite close, in which case I really don't want to see a blurry image. Whenever a game is not performing as well as I want it to, Resolution, Render Quality and Texture Quality are definitely the last options I'll tweak. I really love sharpness I guess :p
 
They didn't add Tensor and RT cores because games need those, they've added RTX and DLSS because they had to make use of the additional hardware to justify the cost increase.
By creating more costs and headaches to support RTX and DLSS across game developers, game engines, graphics APIs, driver support, supercomputer machine learning, constantly updated algorithms and much more besides? Come on!

The fact that NVIDIA pioneered DXR with Microsoft means that this was their intention from the start, and from long before that. You can actually see it if you trace back the origins of stuff like VXAO, VXGI and HFTS. These effects have some basic elements of ray tracing in them, and they were exclusive to NVIDIA. The company was heading in the direction of ray tracing in games long before it introduced RTX.

If denoising and upscaling can be handled better and more efficiently on shaders,
Common sense dictates that denoising can't be done better on shaders than on Tensor cores, since you are taking away shaders that could've participated in rasterization. Delegating denoising to tensor units frees those shaders to help with traditional gaming workloads and increase performance.
 
Common sense dictates that denoising can't be done better on shaders than on Tensor cores, since you are taking away shaders that could've participated in rasterization. Delegating denoising to tensor units frees those shaders to help with traditional gaming workloads and increase performance.
But you are using up silicon to implement those Tensor cores. As already mentioned, Insomniac's temporal injection gets great results from a 1.8 TF GPU that's also rendering the game. So let's say Insomniac are dedicating a huge 20% GPU budget to upscaling; that'd be 0.36 TF. Let's call it 0.4 TF. If the Tensor cores take up more die space than 0.4 TF of CUs would (which they do), they could be replaced with CUs and reconstruction performed on those CUs.

The same applies to denoising. If an algorithm can be implemented in shaders that is as or more effective per mm² of silicon than Tensor cores, then it would be better to increase the shader count rather than add Tensor cores. Tensor cores were invented for ML applications in the automotive industry and the like, where flexibility in learning and applying AI is the priority. That doesn't mean they are ideal for realtime graphics solutions. nVidia are pursuing that route and using their architecture that way, but there could well be better options built from the ground up for realtime graphics instead of being based on generic machine-learning solutions.
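
To put rough numbers on that argument, here's a back-of-the-envelope sketch of the comparison being made. The 20% reconstruction budget comes from the post above; the die-area figures are purely illustrative assumptions, not measured values.

```python
# Illustrative comparison: spend die area on Tensor cores, or on extra CUs
# that run reconstruction/denoising in compute. All area numbers are made up
# for the sake of the argument.

BASE_GPU_TFLOPS = 1.8          # the ~1.8 TF console GPU cited above
RECONSTRUCTION_BUDGET = 0.20   # assume a generous 20% of GPU time on reconstruction

compute_needed = BASE_GPU_TFLOPS * RECONSTRUCTION_BUDGET   # ~0.36 TF
print(f"Compute spent on reconstruction: ~{compute_needed:.2f} TF")

# Hypothetical die-area costs (mm^2), purely illustrative:
area_for_extra_cus    = 5.0    # area to add ~0.4 TF worth of CUs
area_for_tensor_cores = 20.0   # area occupied by the Tensor cores

if area_for_tensor_cores > area_for_extra_cus:
    print("Under these assumptions, extra CUs doing reconstruction in compute "
          "are the cheaper use of silicon.")
else:
    print("Under these assumptions, the dedicated Tensor cores earn their area.")
```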
 
Tensor cores are nearly 8x faster than one TPC (128 cores). So do you think they need 8x more space than a whole TPC?
And it doesn't make sense to use one implementation from one developer. Insomniac isn't going to go around implementing it in every engine.

Indie developers can go to nVidia and let them train DLSS for them. It saves them money and time.
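
To spell out the ratio being argued in this post: the 8x throughput figure is the claim made above, while the area fraction is purely an illustrative assumption.

```python
# Rough throughput-per-area comparison. The 8x figure is from the post above;
# the area fraction is a made-up assumption for illustration only.

tpc_area = 1.0                 # take one TPC's area as the unit
tensor_area_fraction = 0.25    # assume tensor units add ~25% of a TPC's area (assumption)

tpc_throughput = 1.0           # relative throughput of the 128 CUDA cores in a TPC
tensor_throughput = 8.0        # "nearly 8x faster" per the post (FP16 matrix math)

per_area_tpc = tpc_throughput / tpc_area
per_area_tensor = tensor_throughput / (tpc_area * tensor_area_fraction)
print(f"Throughput per area, plain TPC:    {per_area_tpc:.1f}x")
print(f"Throughput per area, tensor units: {per_area_tensor:.1f}x")
# Under these assumptions the tensor units would have to occupy more than 8x a
# TPC's area before extra CUs became the better per-area choice for this workload.
```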
 
But you are using up silicon to implement those Tensor cores. As already mentioned, Insomniac's temporal injection gets great results from a 1.8 TF GPU that's also rendering the game.
Great results in what exactly? Their game? This could be the only game where their technique proves to have great results. They might have needed deep changes in the pipeline to make their technique work well, or might have needed extensive hand tuning. It could also not work well in other types of games, like a shooter or a racing game. What's their limit on fps? Could it work above 60? 90? 120? Even their base resolution might be much higher than DLSS's for the tech to work well. Also, don't exclude the fact that their tech works within a fixed platform, where they know and control their budget quite well. That might not be as suitable in a variable environment like the PC.

But DLSS doesn't need any of that: it doesn't need modifications to the game engine, nor does it need specific titles to work well. It works at any fps/resolution, does the hand tuning automatically through AI, and it seems to upscale from 1440p to 4K with ease, giving the user a large fps increase.
 
Tensor cores were invented for ML applications in the automotive industry and the like, where flexibility in learning and applying AI is the priority. That doesn't mean they are ideal for realtime graphics solutions. nVidia are pursuing that route and using their architecture that way, but there could well be better options built from the ground up for realtime graphics instead of being based on generic machine-learning solutions.

I would think it’s the exact opposite. Given the dynamic nature of real-time rendering a generic ML solution would be far more attractive than having to write custom implementations for every engine/game/scene. Of course coming up with a truly generic solution is an immense challenge given the tremendous number of variables the model would need to handle.

The notion that DLSS was invented to justify the inclusion of tensor cores is a bit silly. Tensors obviously accelerate RT denoising so there is a very clear benefit already. Whether you can do upsampling without Tensors is sorta irrelevant as that’s not their primary purpose.

DLSS is just icing on the cake, another marketing checkbox for good measure. Given nVidia’s experience in professional image processing they likely realized very early on that ML techniques were applicable to real-time use cases as well. The preliminary results are at least as good as existing image space AA methods and by definition these ML models will improve over time.

I wouldn’t want to use it though, feels icky knowing that the image on my screen is upsampled. The console folks seem to have gotten used to it but it’s DLSS 2x or bust IMO.
 
Great results in what exactly? Their game? This could be the only game where their technique proves to have great results. They might have needed deep changes in the pipeline to make their technique work well, or might have needed extensive hand tuning. It could also not work well in other types of games, like a shooter or a racing game. What's their limit on fps? Could it work above 60? 90? 120? Even their base resolution might be much higher than DLSS's for the tech to work well. Also, don't exclude the fact that their tech works within a fixed platform, where they know and control their budget quite well. That might not be as suitable in a variable environment like the PC.
I have no idea. None of us does - the details aren't described. However, there are plenty of reconstruction techniques like checkerboarding where it's obvious you only need to expose some in-game data to a post-process compute solution to get good results. We know DLSS examines an image to determine which boundaries belong to which objects - with an ID buffer, you don't need that.

As such, no-one should be claiming DLSS is something wonderful until the alternatives are fully considered. The fact you've immediately dismissed out-of-hand the likelihood of compute-based solutions being able to generate comparable results shows a lack of open-mindedness in this debate. No-one should be supporting nVidia or poo-pooing DLSS until we've decent comparisons. The only thing I caution against is looking at nVidia's PR for clarification. However, what is certain is that good-quality reconstruction can be achieved through a not-too-significant amount of compute, and it's certain that dedicating the Tensor core silicon budget to CUs wouldn't have stopped the new GPUs using reconstruction on compute, and nVidia could have gone that route and provided their own reconstruction libraries just like their own AA solutions and own DLSS reconstruction solution.

I would think it’s the exact opposite. Given the dynamic nature of real-time rendering a generic ML solution would be far more attractive than having to write custom implementations for every engine/game/scene.
Why are people thinking all these solutions are customised to a specific game/engine/scene?? MLAA, checkerboarding, and KZ: Shadow Fall are all variations on a simple concept, adding more complexity and information processing with each evolution. You take a rendered pixel and examine its neighbours (spatial and temporal) to determine interpolation values. Now yes, Temporal Injection is presently an unknown, but there's little reason to think it's voodoo as opposed to just another variation on these evolving techniques. And if Temporal Injection is to be dismissed because we don't know how it works, fine - compare DLSS to known quantities like the best checkerboarding out there, such as Horizon Zero Dawn or whatever that landmark currently is.
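
For what it's worth, the "examine a pixel's neighbours" idea those techniques share fits in a few lines. This is only a toy greyscale sketch of the general concept (spatial fill plus a history sample clamped to the local neighbourhood), not Insomniac's, Guerrilla's or anyone else's actual implementation.

```python
import numpy as np

def reconstruct_missing_pixels(cur, rendered_mask, history):
    """Toy checkerboard-style reconstruction for a greyscale frame.

    cur           : H x W array; only pixels where rendered_mask is True were shaded
    rendered_mask : H x W bool array, True where this frame actually shaded a pixel
    history       : H x W array, last frame's output reprojected to this frame
    """
    out = cur.copy()
    h, w = cur.shape
    for y in range(h):
        for x in range(w):
            if rendered_mask[y, x]:
                continue  # pixel was rendered this frame, keep it
            # Spatial neighbours that were shaded this frame
            neigh = [cur[ny, nx]
                     for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                     if 0 <= ny < h and 0 <= nx < w and rendered_mask[ny, nx]]
            if not neigh:
                out[y, x] = history[y, x]
                continue
            spatial = sum(neigh) / len(neigh)
            # Clamp the temporal sample to the neighbourhood range to reject stale
            # history (the usual anti-ghosting trick), then blend. A real engine
            # would also consult motion vectors and an ID buffer here.
            temporal = min(max(history[y, x], min(neigh)), max(neigh))
            out[y, x] = 0.5 * spatial + 0.5 * temporal
    return out

# Tiny usage example: half the pixels shaded in a checkerboard pattern
rng = np.random.default_rng(0)
frame = rng.random((4, 4))
mask = np.indices((4, 4)).sum(axis=0) % 2 == 0
prev = rng.random((4, 4))
print(reconstruct_missing_pixels(frame, mask, prev))
```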
 
As such, no-one should be claiming DLSS is something wonderful until the alternatives are fully considered.
The fact you've immediately dismissed out-of-hand the likelihood of compute-based solutions being able to generate comparable results shows a lack of open-mindedness in this debate
I have not claimed DLSS is a wonderful ultimate solution, and I am not dismissing the other options; I am just responding to the claims that DLSS is a spur-of-the-moment creation and an excuse to put the tensor cores to work, a notion that I find extremely silly and childish.
what is certain is that good-quality reconstruction can be achieved through a not-too-significant amount of compute,
This needs engine support.
and nVidia could have gone that route and provided their own reconstruction libraries just like their own AA solutions and own DLSS reconstruction solution.
Again, this needs engine support; it needs developers to actively work on it and implement it in all of their engines and PC titles. And most developers don't actually bother. DLSS doesn't need that: it's an external solution outside of any game engine or specific implementation, which means it has the potential to achieve wide adoption with essentially no effort on the developers' side.

What NVIDIA is doing is just democratizing a checkerboard-like technique across a wide range of PC games. Then there is also the aspect of Super Sampling to achieve higher quality.
 
MechWarrior 5: Mercenaries Developer on NVIDIA RTX and NVIDIA DLSS: We’ll Get the Greatest Benefit from Doing Both
Sep 21, 2018

As promised, here’s another in-depth interview focused on the implementation of the brand new NVIDIA RTX (raytracing) and DLSS (deep learning supersampling) technologies featured on the new GeForce RTX cards.

This time around, we talked with Alexander Garden, Producer at Piranha Games, the studio behind MechWarrior 5: Mercenaries. The first single-player focused MechWarrior experience since 2002, this game was made with Unreal Engine 4 specifically for the PC platform. It’s now due sometime in 2019, meaning Piranha Games will have plenty of time to optimize both NVIDIA RTX and NVIDIA DLSS features ahead of launch.
https://wccftech.com/mechwarrior-5-mercenaries-dev-nvidia-rtx-dlss/


Weighing the trade-offs of Nvidia DLSS for image quality and performance
Sep 22, 2018
Eurogamer's Digital Foundry has produced an excellent dive into the tech with side-by-side comparisons of TAA versus DLSS in the two demos we have available so far, and Computerbase has even captured downloadable high-bit-rate videos of the Final Fantasy XV benchmark and Infiltrator demo that reviewers have access to. (We're uploading some videos of our own to YouTube, but 5-GB files take a while to process.) One common thread of those comparisons is that both of those outlets are impressed with the potential of the technology, and I count myself as a third eye that's excited about DLSS' potential.
https://techreport.com/blog/34116/w...nvidia-dlss-for-image-quality-and-performance
 
I have not claimed DLSS is a wonderful ultimate solution, and I am not dismissing the other options; I am just responding to the claims that DLSS is a spur-of-the-moment creation and an excuse to put the tensor cores to work, a notion that I find extremely silly and childish.
Why silly and childish? Do you consider it impossible as a business decision for nVidia to develop tech focussed on one (two) market(s) and then look to repurpose it for another instead of investing in proprietary solutions in that other market which lacks value in the former, larger markets?

To me, that makes a lot of business sense. Create a GPU/ML platform to sell to datacentres, professional imaging, automotive, etc, and then see how you can leverage it to sell to high-end gamers for high-profit flagship gaming cards. If the non-gaming extras can be marketed well, you can secure mindshare far more profitably than creating discrete silicon for both AI and gaming markets.

I see nothing silly or childish in that proposition. Doesn't mean it's true, but you'll have to explain to me why it's ridiculous to even consider.
 
To take a step back, even iPhones and Google Pixels nowadays have dedicated hardware for inference. There is clearly something happening industry-wide around various types of machine learning/neural networks. I suspect image/video editing is one of the more fruitful use cases. In a mobile context it's perhaps single pictures, while on desktop it could be scaled way beyond that, including realtime graphics.
 
I see nothing silly or childish in that proposition. Doesn't mean it's true, but you'll have to explain to me why it's ridiculous to even consider.
This is obvious, because it comes with a huge pile of baggage. The amount of headache the creation of DLSS would stir up is astounding on so many levels. There is no need for NVIDIA to even create DLSS if their intention is saving costs. The existence of tensor cores is already justified by ray tracing. You will have to extend that silly theory to RTX as well: So NVIDIA created RT cores to serve the professional markets, then decided hey! why not give RT cores to gamers as well! We will just implement RT cores forever in our gaming lines, enlarging our die sizes, increasing production costs, potentially affecting power consumption! We will be committed to enhancing them each gen as well. See how silly that sounds?

Fact is:
1-NVIDIA was already moving in the direction of ray tracing in games. (the creation of stuff like VXAO, HFTS, VXGI and DXR proves it)
2-RTX needed BVH acceleration and denoising acceleration.
3-RT cores are created, Tensor cores are repurposed from the Tesla line to serve RT cores as well.
4-DLSS is created to serve ray tracing as well by reducing the required resolution to play RT games.

What follows is the logical explanation for what happened: DLSS is repurposed to be used in regular games too, because not using it in these games would be a missed opportunity.
 
Common sense dictates that denoising can't be done better on shaders than on Tensor cores, since you are taking away shaders that could've participated in rasterization. Delegating denoising to tensor units frees those shaders to help with traditional gaming workloads and increase performance.

If you restricted denoising to neural networks, that would be true, as the tensor cores are faster than shaders at computing NNs.
Shaders, on the other hand, are not limited to NN denoising, and there may be equally good algorithms not based on NNs that run as fast or faster on shaders.

Also, using the tensor cores doesn't come for free: they require massive amounts of NN weight data, plus NN input and output data, competing with the shaders for register and cache bandwidth.
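
As a rough sense of scale (the layer sizes here are made-up, illustrative numbers, nothing from NVIDIA): even a small convolutional denoiser moves a lot of activation data per frame at 4K, and that traffic goes through the same memory hierarchy the shaders use.

```python
# Purely illustrative arithmetic for the bandwidth argument above.
width, height = 3840, 2160
pixels = width * height

# Hypothetical network: 5 conv layers, 32 feature channels, fp16 activations
layers, channels, bytes_per_value = 5, 32, 2

activations_per_frame = layers * pixels * channels * bytes_per_value
print(f"Activation traffic per frame: {activations_per_frame / 1e9:.2f} GB")
print(f"At 60 fps:                    {activations_per_frame * 60 / 1e9:.0f} GB/s")
```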

From the Turing whitepaper it looks even worse, as no shading or RT can happen while the tensor cores are active:
[Image: tft.png]
 
So NVIDIA created RT cores to serve the professional markets,
Yes, because they want them, benefit from them, and will pay for them.
...then decided hey! why not give RT cores to gamers as well!
Why not? It gets raytracing in the realtime domain and helps sell new hardware.
We will just implement RT cores forever in our gaming lines, enlarging our die sizes, increasing production costs, potentially affecting power consumption! We will be committed to enhancing them each gen as well. See how silly that sounds?
No, I don't. Rumours are the mid- and low-tier 2xxx devices will lack RTX. Regardless, raytracing performance seems fairly lacking, and it's incredibly expensive. There's a good argument that it's been released into the consumer space too early and nVidia could have left RT out of gaming for a few years, during which time they could probably develop custom denoising hardware that's far more efficient than the general-purpose Tensor cores. Tensor wasn't invented for denoising, but for generalised machine-learning, so it'd be very surprising if there's no better, more bespoke way to denoise.

As for increased costs, that doesn't matter if the profits go up accordingly. The profit per mm² for nVidia is probably higher for 20xx than 10xx, so it's better for them than removing these features and making smaller, less profitable parts.

Now compare that to the alternative which is designing both 20xx with RTX and Tensor for the professional markets, and then also developing a new GPU architecture optimised for gaming of no use to these professional markets. nVidia now have more development costs, production concerns (making gaming dies at less profit per mm² than pro silicon and smaller production numbers for both dies meaning higher relative costs), and extended support across more devices. From a business standpoint, it strikes me as far more sensible to take the pro part and find best-case application of it in gaming than to mess around with discrete pro and gaming parts.
 
Why not? It gets raytracing in the realtime domain and helps sell new hardware.
They could just as easily sell new hardware on the basis of performance alone. Split Turing like Pascal, pimp it up with moar cores and you have a recipe for success.

You still haven't countered the massively increased hassle of DLSS and RTX, which really spoils the cost-saving part. You still haven't countered the general movement in the direction of ray tracing with DXR and GameWorks, or the elegance of DLSS as an external, generic solution. Which makes this whole theory very weak at best.
There's a good argument that it's been released into the consumer space too early and nVidia could have left RT out of gaming for a few years
Why would they do that? Right when they have the hardware and software ready?
raytracing performance seems fairly lacking, and it's incredibly expensive.
As it should be, but if you're judging the performance by the early released titles or by the very first few months, then you are gravely mistaken. Things will improve with time, optimizations and each new GPU generation.

Now compare that to the alternative which is designing both 20xx with RTX and Tensor for the professional markets, and then also developing a new GPU architecture optimised for gaming of no use to these professional markets. nVidia now have more development costs, production concerns (making gaming dies at less profit per mm² than pro silicon and smaller production numbers for both dies meaning higher relative costs), and extended support across more devices. From a business standpoint, it strikes me as far more sensible to take the pro part and find best-case application of it in gaming than to mess around with discrete pro and gaming parts.
All of this has been done before, in Maxwell and Pascal, and NVIDIA made that model extremely profitable. If it ain't broke, don't fix it. It means they can control these variables much more efficiently than by creating huge dies for RTX and then manhandling the spread and adoption of it themselves, potentially risking their competitive edge with the huge dies and their attendant increased prices, increased manufacturing costs, increased power demands, etc.
 