Nvidia Ampere Discussion [2020-05-14]

If you take the issue with the too-high clock speeds into consideration, it makes sense. Manufacturing chips is high-risk production: you only know what you get once you're in mass production, and by then it's too late. I think the clock-speed issue indicates that they had some really good samples in the beginning, and at that time they evaluated the highest clock they could run. After release the yields dropped, and that's why so many people have issues at the slightly higher clock speeds.

There is indeed an issue with production.
 
Using dynamic super resolution (8K) for 4K gameplay seems to me to be a legitimate use of 3090 graphics horsepower:


This is about 30 fps (but I don't think it's "Ultimate Quality"), and he says that the card is undervolted.

Compared with 4K:

https://www.techspot.com/review/2105-geforce-rtx-3090/

where we see 59 fps 1% lows and 79 fps average. I think we can say that the 8K DSR performance is not fillrate dominated: 8K is 4x the pixel count, so a purely pixel-bound workload would drop to roughly a quarter of the 4K framerate (about 20 fps from the 79 fps average), yet we're seeing around 30 fps.
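As a rough sanity check on that scaling argument (using the framerates quoted above and assuming a purely pixel-bound workload scales inversely with pixel count):

# Back-of-the-envelope check: if 8K DSR were purely fill-rate/pixel bound,
# the framerate should drop in proportion to the pixel-count increase.

px_4k = 3840 * 2160          # 4K pixel count
px_8k = 7680 * 4320          # 8K (DSR render resolution) pixel count

avg_fps_4k = 79              # TechSpot 4K average for the 3090
observed_fps_8k = 30         # roughly what the video shows

scale = px_4k / px_8k        # 0.25 -> 4x the pixels
expected_if_pixel_bound = avg_fps_4k * scale   # ~19.8 fps

print(f"Pixel ratio: {px_8k / px_4k:.0f}x")
print(f"Expected fps if purely pixel-bound: {expected_if_pixel_bound:.1f}")
print(f"Observed fps: {observed_fps_8k}")
# Observed ~30 fps is well above ~20 fps, so the 8K DSR result isn't purely
# fill-rate / pixel-count limited.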

HZD in that review linked above shows 18% better 1% low performance at 4K versus the 3080. It's worth remembering that HZD is one of the better instances of scaling versus the 2080 Ti: 55% in that review.

In the video it's also fun to see that VRAM usage is 18 GB or so.
 
They would never have gone with Samsung if the yields were bad to the point where the product couldn't even be produced. The internet is a hell of a fake-news factory.

Nvidia has been slow to adopt new manufacturing processes ever since they got burned on Fermi at 40 nm. But were they slow enough to know what production yields would look like before locking in wafer contracts? Samsung's process is "old", but it was never tested on chips the size of GA102.
 
Undervolting Radeons goes back at least as far as Cypress (HD 5870, 2009 launch). It became prevalent with Bitcoin mining: AMD's API for GPU control meant that miners did not need to mess about with tools like Afterburner, because the mining software integrated all the GPU controls.

Undervolt settings typically fail after a while: degradation in the chip means that, as time goes by, the undervolt will have to be scaled back. The GPU manufacturer has to balance lifetime chip degradation against the variability of the chips in the "silicon lottery" bins.
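A minimal sketch of that trade-off, with entirely hypothetical numbers and a crude linear aging model, just to illustrate why an aggressive undervolt that is stable on day one can stop being stable later:

# Illustrative only: hypothetical numbers, not real GPU data.
# A die needs at least v_min volts to be stable at a given clock, and
# v_min slowly creeps up as the silicon degrades with use.

def v_min_required(initial_v_min: float, years: float, aging_mv_per_year: float = 10.0) -> float:
    """Minimum stable voltage after `years` of use (hypothetical linear aging model)."""
    return initial_v_min + aging_mv_per_year / 1000.0 * years

stock_voltage = 1.00          # V, what the vendor ships (includes guardband)
undervolt = 0.90              # V, an aggressive user undervolt
initial_v_min = 0.88          # V, what this particular die needs on day one

for year in range(6):
    needed = v_min_required(initial_v_min, year)
    status = "stable" if undervolt >= needed else "UNSTABLE - scale the undervolt back"
    print(f"year {year}: needs {needed:.3f} V, undervolt at {undervolt:.2f} V -> {status}")

# The stock voltage stays stable for the whole period because the vendor's
# guardband absorbs both die-to-die variation and aging; an undervolt eats
# into exactly that margin.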

Googling reveals undervolt discussions for the HD 4870 too. Can't be arsed going further back in time.
 
By that logic, the past three generations of Radeon cards have had yield issues.
Your logic fails there.
Radeons haven't been crashing left and right at their default voltages and clocks, and AMD hasn't downclocked or undervolted them later on. What users do is their business.
GeForce RTX 30s, on the other hand, were crashing left and right when hitting around 2040-2050 MHz (which many did, at least on the launch voltage/clock profiles), which led to NVIDIA adjusting their voltage and clock-speed curves.
 
I still don't see how this is related to production or yields. NV being overly optimistic on the boost ceiling could lead to the same result on chips made at any factory. And there are no clear signs that this issue was due to clocks and not due to the power circuitry.
 
Firstly, how many cards were crashing? What's the percentage? If there are only 200 cards out in the wild, do you think the crashing percentage is the same as with 10,000 cards in the wild? Secondly, are you going on the record that 2 GHz was the default clock rate? Because nowhere was it ever stated that that was a normal frequency. That happened because of poor QA and a lack of communication between Nvidia and the AIBs.



https://www.guru3d.com/news-story/g...ely-due-to-poscap-and-mlcc-configuration.html
 
Enough cards for NVIDIA to adjust their voltage/clock curves.
It's not that 2 GHz is some sort of "default clock rate". The only official clocks are base and boost, which are nowhere near 2 GHz, but NVIDIA allows the cards to boost as high as they can go within the limits (voltage, heat, power).
With the original voltage/clock curves, cards started crashing around 2040 or 2050 MHz, which many could reach with stock settings.
The caps could have been part of the fault, but the difference between what's supposedly the worst and the best cap configuration is a couple of tens of MHz or so (der8auer tested this by swapping caps on the same card).
Also, none of the cards use POSCAPs; they're SP-Caps. The POSCAP term just stuck for whatever reason after the first reports used the wrong term.
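As a rough illustration of that boost behaviour (a toy model; the voltage, power, and limit values are made up, and real GPU Boost uses many more inputs): the card steps through a voltage/frequency table and settles on the highest step that still fits within its power, thermal, and voltage limits, so clocks above 2 GHz are reached opportunistically rather than being rated clocks.

# Toy model of opportunistic boost: pick the highest V/F step that fits the limits.
# Clock/voltage/power values are illustrative, not measured.

VF_TABLE = [  # (clock MHz, voltage V)
    (1440, 0.737),   # around the official base clock
    (1710, 0.800),   # around the official boost clock
    (1900, 0.900),
    (1965, 0.950),
    (2010, 1.000),
    (2055, 1.050),   # the region where launch cards reportedly fell over
]

POWER_LIMIT_W = 320
TEMP_LIMIT_C = 83

def estimated_power(clock_mhz: float, voltage: float, k: float = 145.0) -> float:
    """Very rough dynamic-power estimate: P ~ k * f * V^2 (f in GHz)."""
    return k * (clock_mhz / 1000.0) * voltage ** 2

def boost_clock(temp_c: float) -> int:
    """Highest table entry that stays within the power and temperature limits."""
    best = VF_TABLE[0][0]
    for clock, voltage in VF_TABLE:
        if temp_c < TEMP_LIMIT_C and estimated_power(clock, voltage) <= POWER_LIMIT_W:
            best = clock
    return best

print(boost_clock(temp_c=60))   # cool card -> 2010 MHz here, well above the rated boost clock
print(boost_clock(temp_c=85))   # over the thermal limit -> held at the 1440 MHz base step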
 
Good, so we agree that 2 GHz was not an official boost rate but rather the result of insufficient testing due to a lack of time. That has nothing to do with yields, and it doesn't even have anything to do with what I wrote. You set up a strawman and refuted it.

The crashing is fixed now, but people still undervolt Ampere because the stock frequency/voltage settings are still past the best-efficiency part of the curve. The same thing happened with Polaris and Vega, but nobody claimed yields were bad at TSMC/GloFo because of that.
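A rough sketch of why undervolting still pays off (hypothetical V/F points and an assumed dynamic-power model of roughly P ∝ f·V², so the last couple hundred MHz cost a disproportionate amount of power):

# Illustrative perf-per-watt along a made-up voltage/frequency curve.
# Performance is assumed to scale ~linearly with clock; dynamic power ~ f * V^2.

VF_POINTS = [  # (clock MHz, voltage V) - hypothetical values
    (1700, 0.800),
    (1850, 0.850),
    (1950, 0.950),
    (2050, 1.050),
]

def power(clock_mhz: float, voltage: float, k: float = 150.0) -> float:
    """Crude dynamic-power estimate in watts (f in GHz)."""
    return k * (clock_mhz / 1000.0) * voltage ** 2

for clock, voltage in VF_POINTS:
    w = power(clock, voltage)
    print(f"{clock} MHz @ {voltage:.3f} V: ~{w:5.1f} W, {clock / w:.2f} MHz/W")

# In this toy model, going from 1850 MHz to 2050 MHz buys ~11% more clock for
# roughly 70% more power, which is why undervolting (or dropping a clock bin)
# barely costs performance but cuts power and heat substantially.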
 
It's speculation at this point, but if a 3080 20 GB is a real thing, do you think that NVIDIA can EOL the 3090 24 GB? If there is a $500-600 difference between the two, I wonder how the 3090 can still sell.
 
No-one is claiming Ampere yields are bad because users are undervolting the cards.
The claim seems to be that yields are bad because:
- Lack of cards, which could be explained by a rushed launch if it's fixed quickly, but at the current rate it seems we're talking about a launch that was 3+ months premature, which seems too much for just a rushed launch.
- NVIDIA had to adjust clock/voltage curves to prevent cards from crashing post-launch (crashing occurred even on Founders Editions, so the blame can't be shifted to AIBs), which could indicate a yield issue with high-enough-grade chips, forcing them to use worse-binned chips than they normally would.
 
Not really. Clock ranges were essentially the same after the fix. Also, the issue was so widespread and repeatable that it couldn't be due to a few bad dies. It was clearly a software QC issue, and the hardware is fine.

I think it's been well documented by now that the issue was caused by sudden, short-lived spikes in clock speed to ~2050 MHz. Those spikes did nothing for performance but wrecked stability.
 
I'm beginning to believe that by the time RTX 30 is readily available, it might have lost much of its initial appeal.

Possibly. I know my enthusiasm has come way down. But that’s normal for any hyped product. Navi can either put a nail in Ampere’s coffin or generate even more demand for Nvidia’s cards depending on how it looks on the 28th.
 