NVidia Ada Speculation, Rumours and Discussion

I’m now realizing just how terrible a product the 3070 Ti is: 35% higher power for 5% more performance than the 3070. And that’s all due to the 8GB of GDDR6X, because the Ti doesn’t clock any higher than the 3070 with its 8GB of GDDR6.
So then compare with 3080, which has 50% more bandwidth and lots more SMs and only uses 30W more according to the spec.

3070Ti appears to be the GA104 version of 3090Ti... In both cases, a card that is cherry-picked to scoop up the most SMs regardless of power draw. A card that is positioned to eliminate AMD regardless of costs.

So these rumours regarding power can be interpreted as being for Ti cards, where board power has no relevance and customers are expected to be stupid with their money.

I'm not saying the power usage rumours are true, merely that "fully enabled" Ti cards are likely to distort the rumours beyond breaking point.
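
Back-of-envelope, for anyone who wants to check the arithmetic at the top of this post: a quick perf/watt comparison in Python using the published TBPs (220W/290W/320W) and the relative-performance figures quoted in this thread. The inputs are the thread's numbers, not fresh measurements.

```python
# Quick perf/watt check using the published TBPs and the relative
# performance figures quoted in this thread (inputs, not measurements).
cards = {
    # name: (relative performance vs 3070, board power in watts)
    "RTX 3070":    (1.00, 220),
    "RTX 3070 Ti": (1.05, 290),   # ~5% faster, per the post above
    "RTX 3080":    (1.30, 320),   # ~30% faster, figure used later in the thread
}

base_perf, base_power = cards["RTX 3070"]
for name, (perf, power) in cards.items():
    rel_ppw = (perf / power) / (base_perf / base_power)
    print(f"{name}: {power / base_power - 1:+.0%} power, "
          f"{perf - 1:+.0%} perf, {rel_ppw:.2f}x perf/watt vs 3070")
```

That works out to roughly 20% worse perf/watt for the 3070 Ti and about 10% worse for the 3080, relative to the 3070.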
 
I'm not saying the power usage rumours are true, merely that "fully enabled" Ti cards are likely to distort the rumours beyond breaking point.

Exactly why I expect a GDDR6 4070 to land closer to 200W than 300W. GDDR6X is the win-at-any-cost option, but it has terrible perf/watt.
 
GDDR interfaces don't scale with node, so there's no reason to think that TSMC is better than Samsung here. It seems likely that the libraries supplied to IHVs for GDDR don't materially vary just because of the fab. PHYs are like iron-welding, in contrast with compute, which is like micro-surgery.

I've just shown that 3080 performance per watt does not suffer from "GDDR6X" power consumption, so I think "X" power consumption is specifically a distraction.

I suspect it's better to think in terms of the power/performance curve, and that TSMC's curve is likely to be less punishing than Samsung's. What else on Samsung 10/8 was in the same ballpark as GA102? Nothing, I expect. Which other Samsung-fabbed chips run at around 400W?

TSMC is used to catering to a vast range of customers' power/performance trade-offs, while Samsung is only good at the lower end. So I think GA102 and down, at Samsung, are poor baselines for Ada.
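
To make the "less punishing curve" idea concrete, here's a toy sketch using the usual dynamic-power approximation P ≈ C·f·V². Both V/f curves below are invented purely for illustration; they are not real Samsung or TSMC data.

```python
# Illustrative only: dynamic power scales roughly as C * f * V^2, and the
# voltage needed to sustain a clock (the V/f curve) is what differs by
# process. Both curves below are invented, not real Samsung/TSMC data.
def power(f_ghz, v_of_f, c=1.0):
    return c * f_ghz * v_of_f(f_ghz) ** 2

punishing = lambda f: 0.65 + 0.10 * f ** 2   # voltage climbs steeply with clock
gentle    = lambda f: 0.65 + 0.07 * f ** 2   # same floor, shallower climb

for f in (1.5, 1.8, 2.1):
    p_bad, p_good = power(f, punishing), power(f, gentle)
    print(f"{f:.1f} GHz: {p_bad:.2f} vs {p_good:.2f} (ratio {p_bad / p_good:.2f})")
```

The point is just that a shallower V/f curve compounds quadratically, so the gap between processes widens exactly where a 400W-class part like GA102 has to operate.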
 
The biggest reason for the poor power efficiency of the current G6X is SEC 8N and the first-generation controller+PHY.
I think switching to TSMC can fix this problem easily.
I'm not sure where these takes are coming from.
From what I've seen of the 3070 vs 3070Ti wattage difference, it was mainly down to a different GA104 bin being used in the Ti version, not G6X.
G6X itself is more efficient than G6, but that's at the same clock. Clocked as it is in practice, it certainly consumes more than G6, but not enough to push the TBP up by more than 20W or so.
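
As a rough sanity check on that 20W figure, here's an estimate from energy per bit. The pJ/bit values are the ones Micron has cited in marketing material (~7.5 pJ/bit for G6, ~7.25 pJ/bit for G6X); treat them as assumptions, since real draw depends on utilisation, termination and clocks.

```python
# Rough DRAM-interface power estimate: energy per bit * delivered bandwidth.
# The pJ/bit values are Micron's oft-cited figures, treated as assumptions;
# real draw depends on utilisation, termination, clocks, etc.
PJ_PER_BIT = {"GDDR6": 7.5, "GDDR6X": 7.25}

def mem_power_w(bandwidth_gb_s, kind):
    bits_per_s = bandwidth_gb_s * 1e9 * 8
    return bits_per_s * PJ_PER_BIT[kind] * 1e-12

p_3070   = mem_power_w(448, "GDDR6")    # RTX 3070:    448 GB/s of GDDR6
p_3070ti = mem_power_w(608, "GDDR6X")   # RTX 3070 Ti: 608 GB/s of GDDR6X
print(f"3070 ~{p_3070:.0f} W, 3070 Ti ~{p_3070ti:.0f} W, "
      f"delta ~{p_3070ti - p_3070:.0f} W")
```

That's only ~8W of difference at full bandwidth, consistent with G6X alone not being able to explain a 70W TBP gap.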
 
I've just shown that 3080 performance per watt does not suffer from "GDDR6X" power consumption, so I think "X" power consumption is specifically a distraction.

I don't think you've shown that. The 3080 is about 30% faster than the 3070 at 45% more power. Also the 3080 data doesn't invalidate the fact that the 3070 Ti has atrocious perf/watt and is a much more relevant comparison point to the 3070. There's literally no difference between the 3070 and 3070 Ti aside from 4% more cores and GDDR6X. How could "X" power consumption be just a distraction? What's the other explanation?
 

Which makes it even more likely that GDDR6X is the culprit. TPU also saw similar behavior where the Ti clocks slightly lower than the 3070 at the same voltage.

https://www.techpowerup.com/review/nvidia-geforce-rtx-3070-ti-founders-edition/35.html
https://www.techpowerup.com/review/nvidia-geforce-rtx-3070-founders-edition/32.html
 
I don't think you've shown that. The 3080 is about 30% faster than the 3070 at 45% more power. Also the 3080 data doesn't invalidate the fact that the 3070 Ti has atrocious perf/watt and is a much more relevant comparison point to the 3070. There's literally no difference between the 3070 and 3070 Ti aside from 4% more cores and GDDR6X. How could "X" power consumption be just a distraction? What's the other explanation?
I'm not denying that the 3070Ti has atrocious performance per watt versus the 3070, at least according to the TPU relative performance list.

It seems the full-fat Tis are suffering from binning-related sub-optimisation. NVidia appears to have decided that "all SMs" in a SKU goes along with worse power. If, say, 10% of the chips have all SMs functional, NVidia doesn't want to reduce that population any further by restricting power.

The power usage may be sub-optimal in order to favour custom AIB cards. I dunno...
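
Here's a toy Monte Carlo of that binning argument. Every number in it is invented; the point is only that stacking a power cut-off on top of an "all 48 SMs functional" requirement multiplies down an already small bin.

```python
# Toy model of the binning trade-off described above. All numbers invented:
# the point is only that a power requirement stacked on a full-die
# requirement multiplies down an already small bin.
import random

random.seed(1)
N, SMS = 100_000, 48
P_SM_DEAD = 0.047      # per-SM defect probability, picked so ~10% of dies are full
full, full_and_efficient = 0, 0
for _ in range(N):
    all_sms_ok = all(random.random() > P_SM_DEAD for _ in range(SMS))
    leakage = random.lognormvariate(0, 0.25)   # die-to-die leakage spread
    if all_sms_ok:
        full += 1
        if leakage < 1.0:                      # hypothetical power cut-off
            full_and_efficient += 1
print(f"full dies:            {full / N:.1%}")
print(f"full AND low-leakage: {full_and_efficient / N:.1%}")
```

With those made-up distributions, a leakage cut-off halves an already ~10% full-die population, which is exactly the kind of loss NVidia might refuse to take.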

If we're going to measure the performance gap between the 3080 and 3070 (or 3070Ti), then we should probably use scenarios where the 3080 isn't CPU-bottlenecked, with maximum ray tracing. Otherwise the extra 50% of bandwidth gets to do nothing.

Though I admit it's hard to find a situation where the 3080 is more than 30% faster than the 3070 while the 3070 isn't running out of memory (e.g. discounting the 55% gap in Cyberpunk 2077 at 3840x2160). That comes back to something else in the architecture:

NVIDIA GeForce RTX 3070 Ti Founders Edition Review - Raytracing & DLSS | TechPowerUp

but there are a few tests there where 3080 is stretching its legs, e.g. Metro Exodus (though that's not Enhanced Edition).

Is 3070 the sweet spot in NVidia's Ampere GPUs for performance per watt?

NVIDIA GeForce RTX 3070 Ti FE Review: Inefficient side-grade with high power consumption as mining brake | Page 8 | igor'sLAB (igorslab.de)

From later in the review:

"The remaining up to 50 watts more compared to a GeForce RTX 3070 can’t be explained with simple logic. It can be assumed that the yield of fully functional chips is quite high by now and that they wanted to transfer as many chips as possible into the commodity cycle with slightly modified voltage/frequency curves."
 
Is 3070 the sweet spot in NVidia's Ampere GPUs for performance per watt?

Looks like it, although it seems GA102 is a bit more efficient with RT enabled.

"The remaining up to 50 watts more compared to a GeForce RTX 3070 can’t be explained with simple logic. It can be assumed that the yield of fully functional chips is quite high by now and that they wanted to transfer as many chips as possible into the commodity cycle with slightly modified voltage/frequency curves."

That's an interesting theory. TPU's 3070 Ti sample had the same voltage as their 3070 though. Maybe Nvidia is using leakier chips in the Ti.
 
It seems the full-fat Tis are suffering from binning-related sub-optimisation. NVidia appears to have decided that "all SMs" in a SKU goes along with worse power. If, say, 10% of the chips have all SMs functional, NVidia doesn't want to reduce that population any further by restricting power.
That's my take on 3070Ti power as well. It almost looked like there was some sort of shortage going on around its launch, which forced them to relax power binning for full-die GA104 to be able to ship more of them to the shelves. Weird, I know.
 
That's an interesting theory. TPU's 3070 Ti sample had the same voltage as their 3070 though. Maybe Nvidia is using leakier chips in the Ti.
Samsung parametric yields are simply horrendous. Very few full dies can work at the target efficiency at the desired voltage point. On a $1,000+ SKU with limited sales quantity, it's not a big issue. On a more mainstream product, you have to compromise a lot to get enough volume out of the door...
 
Samsung parametric yields are simply horrendous. Very few full dies can work at the target efficiency at the desired voltage point. On a $1,000+ SKU with limited sales quantity, it's not a big issue. On a more mainstream product, you have to compromise a lot to get enough volume out of the door...

Why is it specifically a problem for full dies? Shouldn’t the 3070 have the same issue with parametric yield variation?
 
Why is it specifically a problem for full dies? Shouldn’t the 3070 have the same issue with parametric yield variation?
Simply because on cut dies you can disable the worst block(s) (SMs, in Nvidia's case) to increase your parametric yields.

Edit: In fact, SEC 8N defect yields are very acceptable now (D0 below 0.12), so on GA104 the vast majority of dies can be used for the RTX 3070 Ti. But the irony of the parametric yield issue is that the full dies are not as good as the cut dies!
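
The "vast majority" claim checks out under the standard Poisson yield model, Y = exp(-D0·A), taking GA104 at its ~392mm² die size. The salvage step below assumes every defect lands in a disableable SM, which is a simplification.

```python
# Standard Poisson defect-yield model: Y = exp(-D0 * A).
from math import exp

D0 = 0.12        # defects per cm^2, the figure quoted above
A  = 392 / 100   # GA104 is ~392 mm^2, i.e. 3.92 cm^2

lam = D0 * A
full_die_yield = exp(-lam)           # zero defects: full 48-SM dies
one_defect     = exp(-lam) * lam     # exactly one defect on the die
print(f"defect-free (full) dies:      {full_die_yield:.1%}")
# Simplification: assumes every single defect lands in a disableable SM.
print(f"usable with one SM fused off: {full_die_yield + one_defect:.1%}")
```

So roughly 62% of dies come out defect-free and another ~29% are salvageable with one SM fused off. But defect yield says nothing about whether the full dies hit their voltage and leakage targets, which is the parametric problem described above.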
 
It's interesting that the GA104 and GA102 full dies in the 3070Ti and 3090Ti both have the worst gaming perf per watt in the Ampere line.

It is just the memory controller working with GDDR6X. There is a reason why nVidia has not used it on Quadro cards.

There weren't 2 GB GDDR6X dies until recently. GDDR6X did not have the capacity of GDDR6 for Quadro/ProViz.
 
There weren't 2 GB GDDR6X dies until recently. GDDR6X did not have the capacity of GDDR6 for Quadro/ProViz.
I was under the impression Micron didn't make ECC GDDR6X, which would make it a non-starter for professional use anyway.

EDIT: The protocol has error correction for transmission, but nothing to detect or correct a bit flip on the memory chip itself.
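
A conceptual sketch of that distinction in Python: zlib's CRC-32 stands in for the real GDDR6/6X EDC code, so this is an illustration of the principle rather than the actual protocol. It shows why link-level protection can't catch a flip that happens in the array itself.

```python
# Conceptual sketch: link-level CRC (like GDDR6/6X EDC) vs on-die ECC.
# zlib's CRC-32 stands in for the real EDC code; this is not the protocol.
import zlib

def transfer(data: bytes) -> bytes:
    """Simulate a bus transfer: the CRC only protects data in flight."""
    crc = zlib.crc32(data)
    # ... data crosses the bus; a wire error here would trip the check ...
    assert zlib.crc32(data) == crc, "transfer error detected, retry"
    return data

dram = bytearray(transfer(b"payload"))   # write: protected across the bus
dram[0] ^= 0x01                          # bit flips *in the array* afterwards
readback = transfer(bytes(dram))         # read: CRC computed over corrupted data
print(readback)                          # b'qayload' -- sails through the link CRC
```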
 
Stupid non-HW person question, but would it make sense to allow voltage to vary on a per SM/GPC/whatever level?
 