Yep, quite happy now that I didn't get a 3090 in the first round. With the second round probably still a week or two away, I think I'd rather wait for what AMD has in store, or stay with my Vega 56 for another gen. Mostly playing Defense Grid and Talos Principle anyway... but then, Nov. 19th, Cyberpunk 2077 is coming.
This is turning into a pretty disastrous product launch.
Good thing is all the criminal resellers are now stuck with unsalable cards.
As it turns out, they're not even POSCAPs, they're SP-CAPs, which are both more expensive and better.
Yes, and there are also reported cases for MSI and EVGA models which used the same parts list as the Founders Edition, albeit not reported at the same frequency as for models using POSCAPs on the NVVDD rail. It may be the same issue with a higher error margin, or it may be an unrelated issue. Difficult to tell apart from PSU problems.
One unknown user (assuming it's not fake, Igor didn't link to a source) apparently got a Zotac GPU stable at boost clocks by replacing the POSCAPs on NVVDD with an MLCC group, which is a strong case in point. Assuming that user had sufficient knowledge of electrical engineering not to fall for PSU issues.
Might not be the worst of ideas.
More expensive than POSCAPs, but still cheaper and less suited for high-frequency applications than a whole array of 10 MLCCs.
Because their boost algorithm allows the GPU to boost that high at stock if the cooling and power limits allow it.
If the crashing happens at 2 GHz and above, that is already an overclock.
Why should Nvidia be mindful of it?
I want my i9 to hit 5.2 GHz; if it does not, I cannot blame Intel or Asus.
The initial report on the Nvidia forums was about a crash to desktop at a 2 GHz boost, hard wall.
And that also adds to that... Even though at least that part could be fixed with a driver / firmware update that puts a hard cap on boost clocks independently of base clocks. And probably also cuts all the "OC" models down to base clocks, which is where the legal issues arise.
Maybe it's not much, but it's such a bad sign that those cards should be replaced. How many months can they hold up until they get toasty or damaged? I mean, for someone who spent 800€-1500€ on a card, that's a pretty serious issue. They can tweak the BIOS, but those capacitors are still limited. That was caused by Nvidia's secrecy, which I am fine with, but it's obvious that they are failing and they must be replaced. Even if you stay below certain thresholds, it can get worse with more use.
They can hit near 2 GHz, just not sustained. They'll boost for a few frames depending on the workload, just long enough to crash. I imagine the boosting behaviour just needs to be tweaked. Sounds like more of a firmware issue to me than something where cards would need to be rebuilt.
Darn!!! I thought the ASUS TUF cards were the safest and best new Nvidia GPUs out there so far, and priced like the Founders Edition, the cheapest on the market. Disappointed.
Not many other options, even if that means that the planned launch stock has to be rebuilt (meaning at least 2 shipping round-trips before anything fixed ends up on the shelf), and all the broken models have to be flashed with a throttled firmware and rebranded.
I wouldn't be surprised at all if we were seeing the faulty 3080/3090 models again as "3070" or "3070 Ti", including their oversized PCBs, at most stripped of their coolers. I don't see any other option to prevent a total loss of the inventory.
As for the models already out there, a recall is the only option. Throttling the product when already owned by the customer would end up in lawsuits.
On the bright side, at least some AIBs (Asus, maybe others as well?) got it right before shipping their first batch.
For the remainder of the AIBs and Nvidia themselves, this is going to leave a huge dent in the financial projections for 2020.
Nice digest of the whole topic in the current state:
EDIT1:
https://forums.evga.com/m/tm.aspx?m=3095238
Official statement from EVGA: reviewers got faulty models, and all production units are supposed to be free of this specific issue. The claim that 1 MLCC group is sufficient still needs to be validated though, as failures are still being reported in the wild.
EDIT2:
And a couple of electrical engineers are voicing misgivings about MLCCs too, as they are prone to aging, voltage and temperature related issues. If the MLCC groups end up failing too (over time, as it's doubtful whether they have sufficient safety margin), then this may yet turn into a perfect disaster.
EDIT3:
As to why there is doubt about the safety margin: some vendors have only 220uF of capacitance per group, some 330uF, and some went for a more conservative 470uF per group. EVGA appears to be in the 220uF category. For comparison, the Founders Edition uses 470uF per group on the NVVDD rail and 220uF per group on the MSVDD rail. Asus models are all 470uF.
EDIT4:
And that's an Asus TUF failing for a reviewer. So apparently it's not all about the caps, even though they do play a role, as EVGA confirmed unambiguously. Some 20 series owners also report similar crashes with the 30 series launch drivers though, so it may as well be bad drivers as a cherry on top.
@Cyan It's definitely not good if cards are crashing, but I don't think we know that the cap selection is actually bad. Any card will crash if you push the frequency too high. If their boosting algorithm is a little too aggressive, that's all it'll take to make the card reset.
Igor's lab testing showed spikes of over 500W lasting <1ms. Could it be that the firmware or drivers are missing some hard limiter that would stop the card from going that high? If memory serves, Turing and Pascal wouldn't peak that high relative to their averages.
Short-term peaks were at 430-460 watts (depending on the card) with the 2080 Ti as well. Given its lower TDP rating, it did not peak substantially lower, relatively speaking.
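To put rough numbers on the "relatively speaking" part, here is a quick back-of-the-envelope sketch in Python. The peak figures (430-460 W for the 2080 Ti, a bit over 500 W for the 30 series) are the ones quoted in this thread. The TDP values are my assumptions from the reference spec sheets (250 W for the 2080 Ti, 320 W for the 3080), and the thread does not say which 30 series card produced the 500 W spike, so pairing it with the 3080 is an assumption too. `peak_ratio` is just a helper name made up for this example.

```python
def peak_ratio(peak_watts: float, tdp_watts: float) -> float:
    """Return the short-term peak power as a multiple of the rated TDP."""
    if tdp_watts <= 0:
        raise ValueError("tdp_watts must be positive")
    return peak_watts / tdp_watts


# (peak W, TDP W). Peaks are quoted from the thread; TDPs are ASSUMED
# reference values, and the 500 W spike is ASSUMED to belong to a 320 W card.
cards = {
    "2080 Ti, low end of range": (430.0, 250.0),
    "2080 Ti, high end of range": (460.0, 250.0),
    "30 series, assumed 320 W TDP": (500.0, 320.0),
}

ratios = {name: peak_ratio(peak, tdp) for name, (peak, tdp) in cards.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x TDP")
```

Under those assumptions the older card's spikes are at least as large relative to its rating as the new one's, which fits the point that peaking far above TDP for under a millisecond is not new with this generation.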