NVidia Ada Speculation, Rumours and Discussion

Yah, you're probably right that it's a design issue and not my colour blindness. The problem with colour blindness is you lose the ability to detect a lot of different hues. So any time I look at something and can't tell the difference I just assume it's because I can't detect the difference in hues, but in this case it was probably just the brightness. I can easily distinguish the 1st and the 4th from the rest, but 2 and 3 look the same, especially in the small boxes above.
Ah, ok. I can confirm: same experience here with the chart, and my hue sensitivity is well above the norm (at least it used to be).

To keep it simple, my way of dealing with color blindness is 'turn the image black and white, and if any information is lost, consider spending more work on it.'
Let me know if you think there's more to it. ; )
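If anyone wants to automate that check, here's a minimal sketch using Pillow; "chart.png" is just a placeholder for whatever image you want to test.

```python
from PIL import Image

# Convert a chart to grayscale to check whether it still reads with hue
# removed, i.e. whether any of its information is carried by hue alone.
# "chart.png" is a placeholder path.
img = Image.open("chart.png")
gray = img.convert("L")  # luminance-only version of the image
gray.show()              # if the series become indistinguishable here,
                         # the palette leans on hue and needs more work
```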
 
"The RTX 4080 16GB is 3x the performance of the RTX 3080 Ti on next gen content like Cyberpunk with RT Overdrive mode or Racer RTX - for the same price," an Nvidia spokesperson told Eurogamer today. "And the RTX 4080 12GB is 3x the performance of the RTX 3080 12GB for $100 less.
haha that's really desperate
 
What? 'Heatsink so heavy it affects gravity itself'???

And they use this slogan for, um, marketing their own product? Seriously?

My chair just broke from laughing.
 
nVidia has shown Cyberpunk to the press: https://wccftech.com/nvidia-geforce...k-2077-dlss-3-cuts-gpu-wattage-by-25-percent/

At 1440p with Psycho settings, ~60 fps at 2850 MHz...
That's pretty impressive!

At these settings, the GPU was running over 2.8 GHz, averaging around 2810-2850 MHz (min/max), and at 100% utilization the temperatures held steady between 50-55°C. That's a difference of up to 330 MHz versus the reference boost clock of 2520 MHz (a +13% increase), and the impressive part is that no overclocking was involved! This was all happening at stock. This is just one game & we can see even higher clock speeds in other games. The card has since been reported to run over 3 GHz with overclocking.
...
But now we have to talk about the performance with DLSS 3 enabled. The game was using a pre-release version of DLSS 3 so performance and settings will vary in the final version. As soon as the DLSS 3 setting is toggled on, the DLSS Frame Generation setting is also enabled. This was using the "Quality" preset and we once again saw full GPU utilization with over 2.8 GHz clocks but the temps were closer to 50C than 55C this time around (keep this in mind).

The NVIDIA GeForce RTX 4090 got a performance boost to 170 FPS on average (119 FPS 1% lows) with DLSS 3 enabled & an average latency of 53.00 ms. That's a 2x improvement in FPS and a 30% reduction in latency versus DLSS disabled.

But that's not all: using the latest PCAT tool, which comes with support for the PCIe 5.0 16-pin (12VHPWR) power plug, NVIDIA also provided the wattage figures with both DLSS disabled and enabled. With DLSS 3 disabled, the NVIDIA GeForce RTX 4090 graphics card consumed 461 Watts on average with a performance per watt (frames/joule) of 0.135 points. As soon as DLSS 3 was enabled, the GPU saw the wattage drop to 348W, a 25% reduction. This also increased the perf per watt to 0.513, an increase of 3.8x.

The power numbers are a seriously big deal, and one reason why this may be happening is that the load on the FP32 cores is moved to the tensor cores which run the DLSS algorithms. These cores are specialized for these tasks, and rather than brute-forcing the whole GPU, which results in a higher power draw, the tensor cores can process the data much faster and more efficiently, leading to lower power consumption. DLSS 3 can be a game changer in power efficiency and performance efficiency and we really can't wait to test this out for ourselves when we get our review samples.
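For anyone wanting to sanity-check the quoted figures, here's a quick back-of-the-envelope script. The 2520/2850 MHz, 461 W, 348 W and 170 fps numbers are taken straight from the article above; the 0.135 and 0.513 frames-per-joule values presumably come from PCAT's own telemetry averaging, so they won't reproduce exactly from the rounded headline numbers.

```python
# Back-of-the-envelope check of the numbers quoted above (all rounded).

ref_boost_mhz, observed_mhz = 2520, 2850
print(f"Clock delta: +{observed_mhz - ref_boost_mhz} MHz "
      f"({(observed_mhz / ref_boost_mhz - 1):.0%})")         # +330 MHz, ~13%

watts_dlss_off, watts_dlss3_on = 461, 348
print(f"Power reduction with DLSS 3: "
      f"{1 - watts_dlss3_on / watts_dlss_off:.0%}")          # ~25%, as stated

# Perf/W gain from the quoted frames-per-joule values:
fpj_off, fpj_on = 0.135, 0.513
print(f"Perf-per-watt gain: {fpj_on / fpj_off:.1f}x")        # ~3.8x, as stated

# Rough cross-check from the headline FPS (frames/joule = FPS / watts):
fps_dlss3_on = 170
print(f"170 fps / 348 W = {fps_dlss3_on / watts_dlss3_on:.2f} frames/joule")
```

The ~0.49 from the last line doesn't land exactly on the quoted 0.513, which is consistent with the per-joule figures being computed from raw telemetry rather than the rounded averages.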
 
Next gen content in an era of last gen games...

Trust us guys, it's really 3x faster once next gen content releases... really.

I'm still very skeptical of these numbers. Remember Ampere? There was some RT depth-of-field effect or something in the Marbles demo which resulted in a 4x RT performance increase over Turing, and they marketed the hell out of that. Turns out it was never used in any game, and Ampere only had a very minimal RT performance improvement over Turing in real games once you exclude the fact that Ampere is just faster in general.

I bet it's the same here. Only in specific circumstances will you get such a great speedup; the only difference compared to the RT DOF thing from Ampere is that now they force it into a real game (Cyberpunk).

I bet once real next gen content releases, you won't see as much of a speedup as in their specially created showcases (RT Overdrive mode, Racer RTX, etc.), because those real next gen games actually have to run on consoles and lower-end GPUs.
 
Not sure what point you're trying to make. Nvidia advertised SER performance increases in real games. If it relies on a proprietary extension it will see limited adoption in real games. The other examples you mentioned - mesh shaders, tessellation and VRS - are all available via standard APIs. Btw I'm pretty sure tessellation is widely used, it's just not a special thing anymore so nobody talks about it.
We're talking about a vendor here that swears by implementing AI/ML HW acceleration with a significant amount of die space for which there are no standard APIs ...

You think adding a couple more "proprietary extensions" is somehow going to stop them from getting developers on board to use them, no matter how much it sucks to maintain compatibility with said libraries?
 
Next gen content in an era of last gen games...
[...]
I bet once real next gen content releases, you won't see as much of a speedup as in their specially created showcases (RT Overdrive mode, Racer RTX, etc.), because those real next gen games actually have to run on consoles and lower-end GPUs.

That's true, but to their credit this is the only way to push the envelope. It can't just be about running console-level IQ at 1000 fps. What's missing is support for these generational features in the standard DirectX and Vulkan APIs. Given Nvidia's success with DLSS 2 they will likely get support from devs for DLSS 3 just fine. It's the other hardware features I'm worried about.
 
I don't even think devs have anything to do with DLSS 3. It looks like an isolated feature that will most likely work automatically and entirely on NV's side. Devs will just have to do their usual DLSS 2 / super resolution implementation part; the Reflex and interpolation pieces should be automatic. I mean, let's be real, it literally creates artifacts around fixed UI elements like gamepad icons. I don't even think it will use the actual motion vectors the in-game TAA/DLSS uses. It will most likely create its own motion vectors for whatever is on screen at presentation time. Otherwise why would it distort icons and UI elements? Doesn't make much sense.
 
We're talking about a vendor here that swears by implementing AI/ML HW acceleration with a significant amount of die space for which there are no standard APIs ...

Nvidia is making boatloads of money from their AI/ML hardware. I'm really struggling to understand your point with respect to Ada's SER implementation.

You think adding a couple more "proprietary extensions" is somehow going to stop them from getting developers on board to use them, no matter how much it sucks to maintain compatibility with said libraries?

The exchange in that thread points to UE5's physics implementation as the source of the problem. Doesn't really support your vilification of "proprietary extensions" as there is no industry standard physics api.
 
Nvidia is making boatloads of money from their AI/ML hardware. I'm really struggling to understand your point with respect to Ada's SER implementation.



The exchange in that thread points to UE5's physics implementation as the source of the problem. Doesn't really support your vilification of "proprietary extensions" as there is no industry standard physics api.
In no way was I criticizing "proprietary extensions", since that's just the trend ...

My argument is that proprietary extensions aren't going to be an issue at all, since they're just going to partner up with developers and make different rendering paths or do game-specific driver hacks in order to use these features. Having no standard APIs is not an impediment to them at all, as we see with DLSS ...

You have to use proprietary driver extensions like NVAPI to use DLSS anyway ...
 
I don't even think devs have anything to do with DLSS 3. [...] I don't even think it will use the actual motion vectors the in-game TAA/DLSS uses. It will most likely create its own motion vectors for whatever is on screen at presentation time. Otherwise why would it distort icons and UI elements?
UI does not have motion vectors, so it is subject to the whims of optical flow?

UI is usually done as a Post-Post-Process... The last thing with no motion vectors...
 
No need to speculate on motion vectors, what happens is pretty clear:

For each pixel, the DLSS Frame Generation AI network decides how to use information from the game motion vectors, the optical flow field, and the sequential game frames to create intermediate frames.
[attached image from the NVIDIA DLSS 3 article linked below]
From here: https://www.nvidia.com/en-us/geforce/news/dlss3-ai-powered-neural-graphics-innovations/

The optical flow is clearly necessary to handle all those image elements that are not accounted for by "traditional" motion vectors.
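To make that quote a bit more concrete, here's a toy per-pixel blend in the spirit of what's described. It's purely illustrative: the weighting "network" is replaced by an input array plus a hand-written fallback, and none of the names, shapes or thresholds reflect NVIDIA's actual implementation.

```python
import numpy as np

# Toy illustration of the quote above: the intermediate frame is assembled
# per pixel from (a) a candidate warped by game motion vectors and (b) a
# candidate warped by the optical-flow field, with a per-pixel weight choosing
# between them. In DLSS 3 that weight comes from the Frame Generation network;
# here it is just an input array and a crude disagreement heuristic.

def generate_intermediate_frame(prev_frame, next_frame, mv_warp, flow_warp, weight):
    # prev_frame, next_frame: (H, W, 3) real frames bracketing the new one
    # mv_warp, flow_warp:     (H, W, 3) warped candidates for the in-between frame
    # weight:                 (H, W, 1) per-pixel confidence in the MV-based warp
    candidate = weight * mv_warp + (1.0 - weight) * flow_warp
    # Where the two warps disagree badly (e.g. disocclusions, UI edges),
    # fall back to a plain blend of the real frames -- again just a stand-in.
    disagreement = np.abs(mv_warp - flow_warp).mean(axis=-1, keepdims=True)
    fallback = 0.5 * (prev_frame + next_frame)
    return np.where(disagreement > 0.5, fallback, candidate)

# Dummy usage with random data, only to show the shapes involved:
rng = np.random.default_rng(0)
H, W = 4, 4
frames = [rng.random((H, W, 3)) for _ in range(4)]
mid = generate_intermediate_frame(*frames, weight=rng.random((H, W, 1)))
```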
 
My argument is that proprietary extensions aren't going to be an issue at all, since they're just going to partner up with developers and make different rendering paths or do game-specific driver hacks in order to use these features. Having no standard APIs is not an impediment to them at all, as we see with DLSS ...

Yes, Nvidia has proven with DLSS that they can successfully push proprietary tech. It doesn't explain, though, why their SER implementation needs a proprietary extension while Intel's doesn't. Or am I mistaken, and Intel's thread sorting doesn't just work out of the box either?
 
From the press briefing:
In Cyberpunk 2077 with its new Overdrive graphics preset, which significantly dials up RT calculations per pixel, SER improves performance by up to 44 percent. NVIDIA is developing Portal RTX, a mod for the original game with RTX effects added; here, SER improves performance by 29 percent. It is also said to improve performance by 20 percent in the Racer RTX interactive tech demo we'll see this November. NVIDIA commented that there are various SER approaches and the best choice varies by game, so they exposed the shader reordering functionality to game developers as an API, giving them control over how the sorting algorithm works to best optimize their performance.

/edit:
German site computerbase writes that SER is always active.
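To illustrate what "control over how the sorting algorithm works" is buying: the point of SER is to regroup divergent ray hits by the shader they need before shading them, so the GPU runs coherent batches. Below is a toy CPU-side sketch of that idea only; the Hit class, material IDs and shading lambdas are made up, and this has nothing to do with the actual NVAPI/HLSL interface.

```python
from collections import defaultdict

# Toy illustration of the reordering idea behind SER: instead of shading hits
# in ray order (divergent: neighbouring rays may need different materials),
# bucket hits by the shader they need and process each bucket together.
# Everything here (Hit, material ids, shaders) is invented for the sketch.

class Hit:
    def __init__(self, ray_id, material_id, t):
        self.ray_id = ray_id
        self.material_id = material_id   # which shader this hit needs
        self.t = t                       # hit distance

def shade_unsorted(hits, shaders):
    # Baseline: shader chosen per hit in arrival order -> poor coherence on a GPU.
    return {h.ray_id: shaders[h.material_id](h) for h in hits}

def shade_reordered(hits, shaders):
    # "SER-like": group hits by material first, then shade each group together.
    buckets = defaultdict(list)
    for h in hits:
        buckets[h.material_id].append(h)
    results = {}
    for material_id, group in buckets.items():
        shade = shaders[material_id]
        for h in group:                  # on a GPU this group fills coherent warps
            results[h.ray_id] = shade(h)
    return results

# Example with two fake materials:
shaders = {0: lambda h: ("diffuse", h.t), 1: lambda h: ("glass", h.t)}
hits = [Hit(i, i % 2, 0.1 * i) for i in range(8)]
assert shade_unsorted(hits, shaders) == shade_reordered(hits, shaders)
```

The shaded results are identical either way; the benefit is purely in execution coherence, which is why it only shows up as a performance delta.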
 