NVidia Ada Speculation, Rumours and Discussion

Discussion in 'Architecture and Products' started by Jawed, Jul 10, 2021.

Tags:
  1. Qesa

    Qesa Newcomer

    Fiji was a ~50% bigger die than GM204, on the same node, and had expensive HBM and associated packaging.

    AD102 is only ~20% larger than Navi 21, and on a far less expensive process. So the die is certainly cheaper. It does have a wider memory bus, and fancy PAM4 signalling, but I'd be surprised if a 3090 Ti cost more to produce than a 6900 XT.

    EDIT: Also, there's a difference between undervolting and reducing TDP. Undervolting is removing the vendor's voltage safety margin for parametric variance, ageing, etc, and can reduce power draw without impacting performance (or even improving it). Reducing the TDP preserves all that and necessarily reduces performance alongside power. Igor was doing the latter, while the undervolting craze for late GCN was the former.
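    The distinction can be sketched with the usual dynamic-power approximation P ≈ C·V²·f (an assumption; the voltage/clock points below are illustrative, not measured from any actual card):

```python
# Why undervolting differs from TDP-capping: dynamic power scales roughly
# as C * V^2 * f, so cutting voltage at a fixed clock cuts power without
# touching performance. Figures below are hypothetical, not measured.
def dynamic_power(v_volts: float, f_mhz: float, c: float = 1.0) -> float:
    """Relative dynamic power for a given voltage and clock."""
    return c * v_volts ** 2 * f_mhz

stock = dynamic_power(1.081, 1860)      # hypothetical stock V/f point
undervolt = dynamic_power(0.950, 1860)  # same clock, safety margin removed
print(undervolt / stock)                # ~0.77: same performance, less power
```

    A TDP cap, by contrast, forces the card to drop clocks (and performance) whenever the limit is hit, while leaving the voltage margin in place.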
     
    Last edited: May 1, 2022
    DavidGraham and PSman1700 like this.
  2. Jawed

    Jawed Legend

    There's a whole page of graphs in the article that we've been discussing:

    Cool flagship instead of fusion reactor: the GeForce RTX 3090 Ti turns the efficiency list upside down with a 300-watt throttle and beats the Radeons | Page 7 | igor'sLAB (igorslab.de)

    It looks spiky on NVidia, as would be expected from the graph I showed earlier (16ms variance is quite a lot!). The average is obviously a lot better but the spikiness might be considered problematic.

    In general it seems that the 300W 3090 Ti suffers from a notable degradation in variance versus its "native" performance.

    I think if you've shelled out for a 3090 Ti you'd prefer better variance, which you get with the native configuration.

    I've looked at a few things by Igor in the past but it's clear to me now I should be paying much more attention.
     
    Krteq and Jensen Krage like this.
  3. trinibwoy

    trinibwoy Meh Legend

    That’s typical for a TDP-limited scenario, as clocks fluctuate a lot at the limit. Ideally you want to hit your stable clock with lots of TDP and temperature headroom. I set my 3090 to 375W but limit voltage and clocks to 775 mV / 1700 MHz. It ends up pulling ~330W with clocks rock solid at 1695 MHz.
     
    T2098, DegustatoR and PSman1700 like this.
  4. ninelven

    ninelven PM Veteran

    Again, not necessarily. Whether variance is better or worse depends on other factors, such as maximum frame times and whether these are consistently an issue or relatively isolated spikes. However, if a GPU's minimum framerate is higher than its competition's, then higher variance is probably a good thing, as it means better overall performance. One can always employ a frame rate limiter to get a more consistent experience (less variance), but nothing is going to raise the performance floor other than more performance.
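    A minimal sketch of that trade-off, with made-up frame times:

```python
# A frame limiter reduces variance but cannot raise the performance
# floor. Frame times in ms (hypothetical numbers for a fast-but-spiky GPU).
import statistics

frame_times = [8, 9, 8, 24, 8, 9, 8, 25, 9, 8]

def apply_frame_limiter(times_ms, limit_ms):
    """A limiter can only delay fast frames, never speed up slow ones."""
    return [max(t, limit_ms) for t in times_ms]

capped = apply_frame_limiter(frame_times, 16.7)  # ~60 fps cap

print(statistics.pstdev(frame_times))  # ~6.5 ms: high variance
print(statistics.pstdev(capped))       # ~3.1 ms: much more consistent...
print(max(frame_times), max(capped))   # ...but the worst frame is still 25 ms
```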
     
    pharma and PSman1700 like this.
  5. no-X

    no-X Veteran

    *{{citation needed}}
    AMD has supplied 6900 XTs since 2020. It took Nvidia more than a year to harvest enough fully-enabled GA102 GPUs usable for the RTX 3090 Ti. Does anybody really believe that these rare cherry-picked GPUs can be cheaper? The cooler of the RTX 3090 Ti is also bigger (more expensive). Not to mention 24 GB of >21Gbps GDDR6X supplied by a single vendor, compared to 16 GB of standard 16Gbps GDDR6.
     
  6. DegustatoR

    DegustatoR Veteran

    MEEE is a bit weird in how it works on Nv h/w, since these frametime spikes are happening only on Nvidia. This looks more like a renderer issue than a h/w one.

    The A6000 was launched in Oct 2020.
     
  7. pharma

    pharma Veteran

    Variance is commonplace and influenced by the game scene being benchmarked. I don't think I've seen a case where fluctuating variance can be considered an indicator of the overall game experience, much less a valid indicator of what to expect over a wide range of games.
     
    PSman1700 likes this.
  8. DavidGraham

    DavidGraham Veteran

    A fully enabled GA102 has been available since 2020. VRAM and coolers are not related to dies; the die of the 6900 XT is more expensive because it's made on a significantly more expensive node.
     
    Jensen Krage and PSman1700 like this.
  9. Qesa

    Qesa Newcomer

    So you can just assert that GA102 is far more costly to produce, but I need a citation to contradict it? That's not how this works.

    Also, as has been mentioned, 3090 Tis are not the top bin. That's the A6000 and A8000, which have been out for 18 months.
     
    Jensen Krage and PSman1700 like this.
  10. del42sa

    del42sa Newcomer

    so the 7nm process is significantly more expensive than the 8nm process in the case of the RX 6900 XT, but the 4nm/5nm process is cheaper than 7nm in the case of AD102?
     
    Krteq likes this.
  11. DegustatoR

    DegustatoR Veteran

    7nm? RDNA3 will use a mix of N5 and N6.
     
  12. del42sa

    del42sa Newcomer

    he said: "AD102 is only ~20% larger than Navi 21, and on a far less expensive process..."

    Navi 21, AFAIK, is RDNA2
     
    Krteq likes this.
  13. DegustatoR

    DegustatoR Veteran

    It's an obvious typo, they meant GA102, not AD102.
     
    Qesa, pharma and del42sa like this.
  14. del42sa

    del42sa Newcomer

    well then yes, but AD102 is going to have a similar die size, so confusing the two is more likely ...
     
  15. DegustatoR

    DegustatoR Veteran

    It's a bit premature to compare Lovelace to RDNA3 since a lot is still unknown about them both.
     
    Krteq and PSman1700 like this.
  16. del42sa

    del42sa Newcomer

    yes, but we are in the Ada speculation, rumours and discussion thread, right?
     
  17. Jawed

    Jawed Legend

    It's very odd.

    I'm not sure: aren't there higher RT settings than the ones he used? Maybe those would make AMD spiky?

    It'd be nice to see a deep dive into this, but it's unlikely to happen.
     
  18. DavidGraham

    DavidGraham Veteran

    There are: an Ultra RT setting is available above High, and an Extreme setting is available above Ultra for rasterization.

    Igor tested Ultra rasterization with High RT, a step down from the max settings in each category.
     
    PSman1700 and pharma like this.
  19. Dangerman

    Dangerman Newcomer

    I was doing some napkin math on AD102 at 2.8 GHz, since Greymon mentioned RDNA3's frequency: 2.8 (GHz) * 18432 * 2 = 103,219.2 GFLOPS = 103.2192 TFLOPS. That's 2.58x the FLOPS of the 3090 Ti by my calcs.

    I mean, 2.8 GHz may have to be the frequency of the "4090" at 600W at this point. And I wonder if the rumoured integration of Hopper architecture elements into Ada will boost performance in "RL"/gaming scenarios. All in all, I do expect a 600W "4090" at this point to at least double 3090 Ti performance at 4K, with insane power draw, on a TSMC N5/N4 process with Hopper architecture elements being shoved in.
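    The arithmetic checks out. A quick sketch (the 18432 FP32 lanes and 2.8 GHz are the rumoured AD102 figures from the post; the 3090 Ti reference of 10752 lanes at a 1.86 GHz boost clock is my assumption):

```python
# Peak FP32 throughput: lanes * clock * 2 (an FMA counts as 2 FLOPs).
def tflops(fp32_lanes: int, clock_ghz: float) -> float:
    return fp32_lanes * clock_ghz * 2 / 1000.0

ad102 = tflops(18432, 2.8)       # rumoured config and clock
rtx3090ti = tflops(10752, 1.86)  # assumed reference figures
print(ad102)              # ~103.2 TFLOPS
print(ad102 / rtx3090ti)  # ~2.58x
```

    Note this is peak throughput only; real-game scaling will be lower.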
     
  20. DegustatoR

    DegustatoR Veteran

    Nothing is being "shoved in".
     
    PSman1700 likes this.