NVidia Ada Speculation, Rumours and Discussion

Yes, but will it "feel" like 60 fps or will it still "feel" like 20/30 fps but look like 60 fps? The main input and game processing loop would still be at the lower raw rate, right?
Apparently, Nvidia's only answer to this is to include their Reflex low-latency tech to lower latency in the first place. That's fine, but it does nothing to help movement control when the game-engine fps is low.
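
To put rough numbers on the concern, a back-of-envelope sketch, assuming frame generation simply doubles the displayed rate while input stays on the engine loop:

```python
# Back-of-envelope: frame generation (assumed 2x) doubles what you see,
# but input is still sampled on the underlying engine loop.
engine_fps = 30
displayed_fps = engine_fps * 2

print(f"input sampled every {1000 / engine_fps:.1f} ms")    # 33.3 ms
print(f"frame shown every  {1000 / displayed_fps:.1f} ms")  # 16.7 ms
# Motion looks like 60 fps; controls still respond on a 30 fps cadence.
```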
 
TBH this frame interpolation tech is starting to sound like the turn-24fps-movies-into-60fps stuff (aka the much-derided "soap opera" mode) that's embedded in most TVs these days.
 
Currently, 24 Gbps is the highest announced speed for G6X, with the announcement being from April this year. Going by that, a 192-bit card can reach 576 GB/s. That's less than half the bandwidth per TFLOP of the ~21-TFLOPS 3070 Ti. Yes, caching can get you an extra mile or two, and SER will maybe have an effect as well (lessening the burden on the caches, so they can better serve other consumers). But all in all, I'm afraid, it's too little.
The 3070 Ti is a 256-bit 19 Gbps G6X card with 608 GB/s of bandwidth.
The 4080 12GB will be a 192-bit 21 Gbps G6X card with 504 GB/s of bandwidth.
That is a regression, but not "less than half", for a 40 TF card.
Still, there will certainly be some cases where the big L2 won't help; we've seen this on RDNA2 already.

edit: Seemingly, many vendors like KFA2, MSI and Zotac will continue to use 21 Gbps memory, so even less than calculated above: 504 GB/s.
22.5 Gbps will be used on the 4080 16GB only, AFAIK.
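
For reference, the arithmetic behind these bandwidth figures; a minimal sketch of the standard GDDR formula, with the card configs quoted above:

```python
# GDDR bandwidth in GB/s = (bus width in bits / 8) * data rate in Gbps
def bandwidth_gbs(bus_bits: int, gbps: float) -> float:
    return bus_bits / 8 * gbps

print(bandwidth_gbs(256, 19.0))  # 3070 Ti:               608.0 GB/s
print(bandwidth_gbs(192, 21.0))  # 4080 12GB:             504.0 GB/s
print(bandwidth_gbs(192, 24.0))  # 192-bit @ 24 Gbps G6X: 576.0 GB/s
print(bandwidth_gbs(256, 22.5))  # 4080 16GB:             720.0 GB/s
```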

Yes, but will it "feel" like 60 fps or will it still "feel" like 20/30 fps but look like 60 fps? The main input and game processing loop would still be at the lower raw rate, right?
Reflex is part of DLSS3 for a reason; they may do something with input through that.
But I suspect that the frame generation feature will work "well" only when your "native" framerate is above 60.
 
TBH this frame interpolation tech is starting to sound like the turn-24fps-movies-into-60fps stuff (aka the much-derided "soap opera" mode) that's embedded in most TVs these days.

It's not starting to sound like it; that's essentially what it is. It's derided in TVs because the quality is poor and it drastically alters the feel of a movie. In games we want higher fps, not lower.
 
I suppose the best test for DLSS 3.0 is going to be taking 20/30fps to ~60fps.

These 200+ frame rates in games like Spider-Man are all well and good, but with a frame rate that's already high before DLSS 3, I imagine it would be difficult to feel anything that doesn't feel quite right with DLSS 3.

Whereas we're all very good at feeling the difference between 20/30 fps and 60 fps, so I feel that's where we'll notice if something doesn't feel like it should with DLSS 3.

We'll have to see how the implementation holds up in practice, but in theory I see it as the opposite: frame multiplication is primarily aimed at high refresh rates. It solves (or at least greatly mitigates) the problem of the rest of the system (if not the game itself) limiting frame rates regardless of the GPU's capabilities. And being aimed primarily at high-refresh applications would itself essentially mitigate the latency issues.

120 fps+ (including taking advantage of 240 Hz or higher displays) would actually become doable outside of esports titles. Where this could ultimately lead, if it can be scaled up further, is a combination with OLED (or equivalent), or at least fast-enough LCD, displays at 480 Hz or higher, allowing essentially near-perfect motion quality without relying on strobing/BFI (which have their own issues).
 
It's not starting to sound like it; that's essentially what it is. It's derided in TVs because the quality is poor and it drastically alters the feel of a movie. In games we want higher fps, not lower.
Video frames do not have motion vectors in them.

Btw, I wonder how much of an impact mining had on the Lovelace memory subsystem design.
Narrower buses with bigger caches seem like a fine way to fight mining hash rates.
Alas, it doesn't look like this was needed.
 
Video frames do not have motion vectors in them.

Btw, I wonder how much of an impact mining had on the Lovelace memory subsystem design.
Narrower buses with bigger caches seem like a fine way to fight mining hash rates.
Alas, it doesn't look like this was needed.

I'd doubt it; there are other ways to tackle that if they wanted to. I actually wonder if there are any hash rate limitations at all. Also, the designs would likely have had to be very far along before the start of 2021.

The memory design just seems dictated by what's available. There's no way to increase actual raw bandwidth without going to HBM, and GDDR7 or whatever the next-gen memory technology turns out to be isn't here yet.
 
So, the RTX 4080 (12GB) has the same memory interface width as the GTX 1060 3GB? Nice move. OK, obviously the transfer rate will be much higher; still, it has to make do with roughly half the transfer rate of the TFLOPS-wise similarly configured RTX 3090 Ti. That's probably gonna be a rude awakening in some applications.
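
To put the "roughly half" into numbers, a quick sketch; the TFLOPS values are approximate boost-clock figures, so treat the ratios as ballpark:

```python
# Bandwidth vs. compute for the two cards (public specs; the TFLOPS
# numbers are approximate boost-clock figures).
cards = {
    "RTX 3090 Ti":   (1008, 40.0),  # 384-bit @ 21 Gbps GDDR6X
    "RTX 4080 12GB": (504, 40.1),   # 192-bit @ 21 Gbps GDDR6X
}
for name, (gbs, tflops) in cards.items():
    print(f"{name}: {gbs} GB/s, {gbs / tflops:.1f} GB/s per TFLOP")
# Similar compute, exactly half the raw bandwidth; the big L2 is what
# is supposed to bridge the gap.
```
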
Don't they have some big L3 'Infinity Cache' now too? I remember this from rumors, at least.
 
A bit more thinking about how bad the 4080 12GB looks in the stack. The SM ratio between the cut-down AD102 (128 SM 4090) and AD104 (60 SM 4080 12GB) is absolutely wild: 46.9% rounded up. Previous gens for perspective, applying that same ratio to each flagship's SM count:

3090 (82 SM) -> 38.4; the 3060 Ti has 38
2080 Ti (68 SM) -> 31.9; the 2060 has 30, the 2060 Super 34
1080 Ti (28 SM) -> 13.1; the 1060 6GB has 10, the 1070 15
980 Ti (22 SM) -> 10.3; the 960 has 8, the 970 13

So saying it should be a 4070 is actually generous; it's more like a 4060 Ti compared to every other gen's ratio, which is partially reaffirmed by the 192-bit bus. The 4070 is probably going to be 52 SM, or maybe 54 if they have a 4070 Ti, which would make the 4070 more like an x60 non-Ti vs. the x80 Ti/x90 of previous gens, purely by ratio. Yes, AD102 is a big leap over GA102, but you could park a bus in that gap. Business wants money etc., but $900 for a 4080 12GB which by their own slides is slower than a 3090 Ti in raster, making it maybe 15-20% faster than a 3080 for 30%/$200 more, two years later? Take a bow, 104, you have peaked. (The ratio math is sketched below.)
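
For anyone who wants to check the ratio math, a minimal sketch of the arithmetic behind the list above:

```python
# Apply the 4080 12GB / 4090 SM ratio (60/128, ~46.9%) to each previous
# generation's flagship to see which tier that cut would have landed on.
flagships = {"3090": 82, "2080 Ti": 68, "1080 Ti": 28, "980 Ti": 22}
ratio = 60 / 128
for name, sm in flagships.items():
    print(f"{ratio:.1%} of the {name}'s {sm} SM -> {sm * ratio:.1f} SM")
# 3090 -> 38.4 (3060 Ti: 38); 2080 Ti -> 31.9 (2060: 30, 2060S: 34);
# 1080 Ti -> 13.1 (1060 6GB: 10, 1070: 15); 980 Ti -> 10.3 (960: 8, 970: 13)
```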

I'm also confused by their "191 RT TFLOPS" claim for the 4090 (and the "200 RT TFLOPS" they gave in the presentation). 2.52 GHz * 512 tensor FP16 ops per SM per clock * 128 SM = 165 "RT TFLOPS", so for some reason there's a 15-20% modifier applied. They said SER was "up to a 2-3x increase in RT and 25% in overall game performance", so that's not it either.
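
The arithmetic in question, written out; the 512 ops per SM per clock is the assumption being questioned here:

```python
# The math being questioned: clock * ops per SM per clock * SM count.
boost_ghz = 2.52
tensor_fp16_ops_per_sm_clk = 512  # assumed figure from the post above
sm_count = 128
tflops = boost_ghz * tensor_fp16_ops_per_sm_clk * sm_count / 1000
print(f"{tflops:.0f} TFLOPS")  # -> 165; Nvidia claims 191, hence the
                               # unexplained 15-20% modifier
```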
 
One particularly impressive aspect of DLSS3 vs DLSS2 is its ability to scale at very high framerates. I guess there must be a limit to that, though. It'd be interesting to see what the cost, in ms, is of creating the new frame.

Also, what happens if it misses the window? Clearly it's not an exact doubling in all the examples we've seen, which means some pairs of frames must be getting a new AI-generated frame and others not. Isn't that going to lead to uneven frame pacing?
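
To illustrate the pacing concern with a toy example (all timings hypothetical, assuming a generated frame lands at the midpoint of each rendered pair that makes its window):

```python
# Toy illustration of the pacing worry: the engine renders at 25 fps
# (40 ms apart); frame generation inserts a midpoint frame for some
# pairs but misses the window on one of them.
rendered = [0, 40, 80, 120]          # engine frame timestamps, ms
generated_ok = [True, False, True]   # second pair misses its window

displayed = []
for i, ok in enumerate(generated_ok):
    displayed.append(rendered[i])
    if ok:  # insert a generated frame halfway between the pair
        displayed.append((rendered[i] + rendered[i + 1]) // 2)
displayed.append(rendered[-1])

intervals = [b - a for a, b in zip(displayed, displayed[1:])]
print(intervals)  # -> [20, 20, 40, 20, 20]: a visible hitch in pacing
```
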
I'm much more worried about the artifacts. Here's a 4-frame sequence from Digital Foundry's (NV-controlled) preview:
[Attached: four consecutive frames, dlss-artifaktisarja-1.jpg through dlss-artifaktisarja-4.jpg]
Can you guess which frames are scaled and which generated?
 
TBH this frame interpolation tech is starting to sound like the turn-24fps-movies-into-60fps stuff (aka the much-derided "soap opera" mode) that's embedded in most TVs these days.
I want ML motion blur as well. :)
Once we have upscaling, frame interpolation and motion blur, I would start to agree that tensor cores are worth it.
 
Reflex is part of DLSS3 for a reason; they may do something with input through that.
But I suspect that the frame generation feature will work "well" only when your "native" framerate is above 60.
Yeah, it's good that more games would pick up tech to lower input latency. There are some games with amazingly bad input lag.

However, it's something that really has to be felt and experienced first-hand, as so much of this varies by person. It would be ideal if there were kiosks at local stores so one could try it in different games, but that is a huge ask. The closest we'll probably get is a handful to a dozen different reviewers giving their hands-on write-ups.
 
 
Yes, but will it "feel" like 60 fps or will it still "feel" like 20/30 fps but look like 60 fps? The main input and game processing loop would still be at the lower raw rate, right?
That's what I'm interested in the most: how it feels, rather than how it looks.
 
Can you guess which frames are scaled and which generated?
Ugh, that's bad. Hallucination stroboscope expected.
They still have some work to do.
My proposal: interpolate all frames to get rid of the stroboscope. Though that's twice the cost for twice the soup. <:/
 