NVidia Ada Speculation, Rumours and Discussion

I believe (I think they briefly mentioned this in the announcement somewhere as well?) that they want to push to implement all these features via Streamline, and in that case all of these features are just separate plugins for Streamline. So you wouldn't actually implement DLSS 3 as something by itself (frame generation, upscaling, and Reflex low latency all in one); you'd implement Streamline -> DLSS + Reflex + Frame Generation (or whatever it ends up being called). At the moment Streamline has 4 plugins: DLSS, Reflex, NRD, and NIS. I believe each of these plugins then has its own check for whether the hardware is supported, and the DLSS check doesn't distinguish versions.
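Roughly how that per-plugin pattern might look from the game side. This is purely illustrative: the names (streamline::init, is_feature_supported, Feature::...) are made-up stand-ins, not the actual Streamline SDK entry points, and the stubs just mimic the idea that every plugin performs its own hardware check.

```cpp
namespace streamline {

enum class Feature { DLSS, Reflex, NRD, NIS, FrameGeneration };

// Stub: pretend the interposer and its plugins loaded successfully.
inline bool init() { return true; }

// Stub: each plugin would run its own check here. There is no "DLSS 3" bundle,
// only individual features with individual hardware requirements.
inline bool is_feature_supported(Feature f) {
    switch (f) {
        case Feature::FrameGeneration: return false; // Ada-only in this pretend setup
        default:                       return true;  // any DLSS-capable RTX GPU
    }
}

inline bool enable_feature(Feature) { return true; }

} // namespace streamline

int main() {
    if (!streamline::init()) return 1;

    // The game asks per feature; it never asks "is DLSS 3 supported" as one unit.
    for (auto f : { streamline::Feature::DLSS, streamline::Feature::Reflex,
                    streamline::Feature::FrameGeneration }) {
        if (streamline::is_feature_supported(f))
            streamline::enable_feature(f);
    }
    return 0;
}
```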

DLSS 3 as a term seems more like a marketing/branding term in terms of how it's currently being conveyed.
Streamline won't solve user-facing options issues like the lack of a sharpening slider or some presets being locked out in some games.
It is certainly possible that some developer house would require (for whatever mysterious reason) FG to always be on and not allow you to disable it on Ada.
Not an Nvidia problem though.
 
If it's not available in a given game, the dev would have to force frame generation and Reflex on whenever you turn on DLSS, which would be bizarre. DLSS 3 is simply DLSS plus Reflex and Frame Generation.

I think it's pretty easy to see how it's going to play out, but if you need to see it in action in every game, I guess there's not much more to be said.

I'm not sure why you are saying I need to see it in action in every game. I neither claim that the option will be available in all games nor that it won't be available in some games.

Regards,
SB
 
What I'm saying is that it shouldn't even be a question. Maybe in some extremely specific and esoteric scenario, DLSS3 will be forced over DLSS2, but I doubt that will be the norm (or even happen at all). Is it possible? Yes. Is it likely? I don't think so.
 
I think there's a disconnect here on how you think DLSS3 works for game integration. The frame interpolation component is a subset which requires Ada. A game can have "DLSS3" integrated and allow Turing and Ampere GPUs to utilize the reconstruction component of "DLSS3" (aka DLSS2), but not the frame interpolation part. That only works, though, if the developer provides the different components of DLSS3 as separate toggles within the game. If a developer integrates DLSS2 only, then there's no issue, but also no access to new Ada features for the gamer.

At least that's my understanding of how it works from Nvidia's messaging.
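As a rough illustration of the "separate toggles" point (hypothetical names and capability flags, not Nvidia's actual SDK), the settings-menu logic would gate each component on its own check, so reconstruction stays selectable on older RTX cards while frame generation is greyed out:

```cpp
#include <cstdio>
#include <string>
#include <vector>

struct GpuCaps {
    bool supportsDlssSuperResolution; // Turing, Ampere, Ada
    bool supportsFrameGeneration;     // Ada only
    bool supportsReflex;              // GTX 900 series and newer
};

struct GraphicsOption { std::string name; bool available; };

// Reconstruction ("DLSS 2") stays selectable on older RTX cards, while the
// Ada-only frame interpolation gets its own, independently gated toggle.
std::vector<GraphicsOption> buildUpscalingMenu(const GpuCaps& caps) {
    return {
        { "DLSS Super Resolution", caps.supportsDlssSuperResolution },
        { "DLSS Frame Generation", caps.supportsDlssSuperResolution && caps.supportsFrameGeneration },
        { "NVIDIA Reflex",         caps.supportsReflex },
    };
}

int main() {
    GpuCaps ampere{ /*SR*/ true, /*FG*/ false, /*Reflex*/ true }; // e.g. an RTX 3080
    for (const auto& opt : buildUpscalingMenu(ampere))
        std::printf("%-22s %s\n", opt.name.c_str(), opt.available ? "selectable" : "greyed out");
    return 0;
}
```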
 
In Geekbench, the 4090 is 78% faster than the 3090, and 63% faster than the 3090 Ti.

Seriously impressive gen-on-gen performance uplift. With 24GB of VRAM on top of that, it’s not a bad deal, even for $1600.

It’s 2.6x the performance of a 3070 for 3.2x the price, but also 3x the VRAM.
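Quick sanity check on those ratios, assuming the $499 and $1599 launch MSRPs for the 3070 and 4090 and the stated 2.6x performance multiple:

```cpp
#include <cstdio>

int main() {
    const double perf_ratio  = 2.6;            // stated performance multiple vs a 3070
    const double price_ratio = 1599.0 / 499.0; // ~3.2x, assuming launch MSRPs
    const double vram_ratio  = 24.0 / 8.0;     // 3x

    // ~81%: slightly worse performance per dollar than a 3070, with 3x the VRAM.
    std::printf("price ratio: %.1fx, perf per dollar vs 3070: %.0f%%, VRAM: %.0fx\n",
                price_ratio, 100.0 * perf_ratio / price_ratio, vram_ratio);
    return 0;
}
```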
 

Something doesn't add up: the clock bump (at the reported boost clocks) plus the SM increase should place this well above 2x a 3090 Ti in raw compute. I wonder what's starving the GPU; maybe it's just diminishing returns after a point.

Edit: I guess this might not be a very representative benchmark though; the 3080 is 16% faster than a 2080 Ti in the same benchmark, whereas it usually lands around double that in games.
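Back-of-the-envelope numbers behind that expectation, assuming the official boost clocks (2.52 GHz for the 4090, 1.86 GHz for the 3090 Ti) and 2 FLOPs per core per clock; actual clocks tend to run higher in practice, which would push the ratio further past 2x:

```cpp
#include <cstdio>

int main() {
    const double tflops_4090   = 16384 * 2.52e9 * 2 / 1e12; // ~82.6 FP32 TFLOPS
    const double tflops_3090ti = 10752 * 1.86e9 * 2 / 1e12; // ~40.0 FP32 TFLOPS

    // ~2.06x in raw FP32 at official boost clocks, versus the ~1.63x seen
    // in the Geekbench result quoted above.
    std::printf("raw FP32 ratio: %.2fx\n", tflops_4090 / tflops_3090ti);
    return 0;
}
```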
 
The GPU is much wider: 128 SMs vs. 84 SMs. Something Nvidia mentioned with Hopper is that it is much harder for them to saturate wider GPUs:
The CUDA programming model has long relied on a GPU compute architecture that uses grids containing multiple thread blocks to leverage locality in a program. A thread block contains multiple threads that run concurrently on a single SM, where the threads can synchronize with fast barriers and exchange data using the SM’s shared memory. However, as GPUs grow beyond 100 SMs, and compute programs become more complex, the thread block as the only unit of locality expressed in the programming model is insufficient to maximize execution efficiency.
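For concreteness, the feature that passage is building up to is Hopper's thread block clusters, a new locality level between the thread block and the grid (CUDA 12, sm_90 only, so Ada itself doesn't expose it). A minimal sketch, assuming the cooperative_groups cluster API:

```cpp
// Compile with: nvcc -arch=sm_90 cluster_sketch.cu  (Hopper only; not available on Ada)
#include <cooperative_groups.h>
namespace cg = cooperative_groups;

// Two thread blocks per cluster, declared at compile time.
__global__ void __cluster_dims__(2, 1, 1) exchange(float* out) {
    cg::cluster_group cluster = cg::this_cluster();
    __shared__ float smem[128];
    smem[threadIdx.x] = static_cast<float>(cluster.block_rank());
    cluster.sync();  // barrier across both blocks in the cluster, not just this block

    // Distributed shared memory: read the peer block's shared array directly.
    unsigned peer = (cluster.block_rank() + 1) % cluster.num_blocks();
    float* peer_smem = cluster.map_shared_rank(smem, peer);
    out[blockIdx.x * blockDim.x + threadIdx.x] = peer_smem[threadIdx.x];
    cluster.sync();  // ensure peers finish reading before shared memory is released
}

int main() {
    float* out = nullptr;
    cudaMalloc(&out, 4 * 128 * sizeof(float));
    exchange<<<4, 128>>>(out);   // grid of 4 blocks -> 2 clusters of 2 blocks each
    cudaDeviceSynchronize();
    cudaFree(out);
    return 0;
}
```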
 
If we compare the Ampere results of the 3060 Ti, 3070, and 3070 Ti, all three of which have very high TFLOPS/bandwidth ratios:

GPU        GB 5 CUDA Score    TFLOPS    Memory Bandwidth (GB/s)
3060 Ti    128,321            16.2      448
3070       148,340            20.31     448
3070 Ti    165,866            21.75     608.3

Comparison            GB 5 CUDA Score Difference    TFLOPS Difference    GB 5 Score Gain per TFLOP
3070 - 3060 Ti        20,019                        4.11                 4,870
3070 Ti - 3070        17,526                        1.44                 12,170
3070 Ti - 3060 Ti     37,545                        5.55                 6,764

TFLOP scaling is very different in the scenario where memory bandwidth stays the same compared to the one where it increases a lot. The issue with Ada, of course, is that its memory bandwidth per TFLOP is much lower than Ampere's, which itself isn't great against Turing.

Without knowing the specifics of the benchmark it's hard to say. However, my understanding is that Geekbench's test suite seems to weight memory bandwidth on the higher side due to some subtests.
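The "score gain per TFLOP" column above is just the score delta divided by the TFLOPS delta; a quick check of the table's numbers (values copied from the tables, small rounding differences aside):

```cpp
#include <cstdio>

struct Card { const char* name; double score, tflops, bw_gbps; };

int main() {
    Card c3060ti{"3060 Ti", 128321, 16.20, 448.0};
    Card c3070  {"3070",    148340, 20.31, 448.0};
    Card c3070ti{"3070 Ti", 165866, 21.75, 608.3};

    auto gain_per_tflop = [](const Card& a, const Card& b) {
        return (b.score - a.score) / (b.tflops - a.tflops);
    };

    // Same bandwidth (448 GB/s): each extra TFLOP buys comparatively little.
    std::printf("3070 vs 3060 Ti:    %.0f points/TFLOP\n", gain_per_tflop(c3060ti, c3070));
    // Bandwidth jumps to 608 GB/s: each extra TFLOP buys far more.
    std::printf("3070 Ti vs 3070:    %.0f points/TFLOP\n", gain_per_tflop(c3070, c3070ti));
    std::printf("3070 Ti vs 3060 Ti: %.0f points/TFLOP\n", gain_per_tflop(c3060ti, c3070ti));
    return 0;
}
```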
 
The RTX 4080 12GB has half the compute performance and memory bandwidth, so it will be interesting to see if its score is half that of the 4090.
 
Nobody is assuming that it'll be locked in any game, just like no one is assuming that it will be available in all games.

Ultimately it is up to the developer to implement those options in their game. You can easily see that, on PC, user-accessible options to enable/disable or even adjust the quality settings of certain things may or may not be available. Some games allow you to select an ultra-widescreen resolution, some don't. Some allow you to enable or disable motion blur, some do not. Some allow you to enable or disable depth of field, some do not.

Some games allow you to enable or disable NV sharpness settings with DLSS 2, some do not.

The fact that you can do it in one game doesn't necessarily mean you'll be able to do it in all games.

Regards,
SB
Correct.
I was originally replying to an answer to this question (https://forum.beyond3d.com/threads/nvidia-ada-speculation-rumours-and-discussion.62474/post-2266291):
BTW, do we know if all DLSS3 games will support DLSS 2 for older gens? Or will we get screwed like "oh, you need a 4xxxx for DLSS in this game"?
To which the answer is: We don't know yet, as has been confirmed by the friendly discussion here.

Oh, yes.
Some Geekbench subtests either seem to suffer from driver difficulties or are almost completely bandwidth bound, e.g. SFFT, where (this singular instance of a) 4090 scores almost identically to the RTX 3090 Ti.
 
DLSS 3.0 just adds a new toggle for frame generation; DLSS upscaling is still available in much the same way. That was confirmed by Nvidia in one of their Q&As.
 
Would somebody like to compare and contrast AD102 to GA100? The 40MB L2 seems like an interesting precedent for one thing.
 
Looks like the OFA is a standalone hardware block that currently performs the same for all three GPUs. So that might answer some questions about how DLSS 3 frame generation will scale down to lower-end Ada.

To further aid encoding performance, Ada GeForce RTX 40 Series GPUs with 12 GB of memory or
more are equipped with dual NVENC encoders.

Interesting in terms of how that's phrased.
 
I'm looking forward to getting a 4090. I have a 2080 Ti and skipped the 30 series, first because of the inability to get one at launch, then, as they continued to climb in price, they simply became undesirable to me. The 2080 Ti was still performing perfectly fine in every game, so I used that time to upgrade other, non-PC aspects of my setup, like my chair, desk, speakers, peripherals, and other flourishes which improve the QoL of my setup.

I was prepared to go with an all-new AM5-based 7000 series AMD platform... however, now that we've seen AMD's upcoming CPUs, and we know where Raptor Lake will essentially fall... I've made the decision to simply upgrade my 3900X to a 5800X3D and allow AM5 to stabilize and mature a bit. Perhaps I'll catch the next cycle.

I'm ready for the 4090. Please for the love of god allow me to get an order in lol
 

I'm on a 3900X/2080 Ti too. And in all honesty there's zero reason to upgrade other than even higher fidelity and performance. The most logical upgrade would indeed be a 4090, but then a new CPU would be needed too so as not to bottleneck the thing too much. And yeah, while Ryzen 7000 is impressive performance-wise, I think Raptor Lake will stomp all over it.
The 5800X3D is the perfect gaming CPU indeed; it's like the thing was made for gaming.
 