NVIDIA Kepler speculation thread

Well, if GK104 is close to 7970 stock performance with 50% less bandwidth, less power and a smaller die (I'll believe it when I see it - and I would like to), then "beat handily" is a bit of an understatement IMO.

Thinking wishfully ... maybe it's so good that Charlie applied negative bias and still ended up with a compliment?
 
What Charlie said may very well be true.

Even if Nvidia uses a Fermi-like design, and let's say that the 28nm process reduces the die size by 40% (similar to ATI's 28nm),
they could make a GTX 580 shrink in 306mm^2
and then decide if they want:
40% less power consumption or
40% more performance.

I don't see how they can screw things up.
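
The shrink arithmetic above can be sanity-checked with a rough sketch. The ~520mm^2 figure for the GTX 580 (GF110) die is an assumption brought in for illustration, not something stated in the post, and the names below are made up for the example.

```python
# Back-of-the-envelope die-shrink arithmetic.
# ASSUMPTION: the GTX 580 (GF110) die is roughly 520 mm^2; the post does not state this.
GF110_AREA_MM2 = 520.0
# From the post: 28nm is assumed to cut die size by 40%.
SHRINK_FRACTION = 0.40


def shrunk_area(area_mm2, shrink_fraction):
    """Return the die area left after removing the given fraction of area."""
    return area_mm2 * (1.0 - shrink_fraction)


if __name__ == "__main__":
    area = shrunk_area(GF110_AREA_MM2, SHRINK_FRACTION)
    print(f"Hypothetical GTX 580 shrink: about {area:.0f} mm^2")
```

Under that assumption the result lands in the same ballpark as the 306mm^2 quoted above.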
 
Let's assume that he's right for a change, and that GK104 is great in terms of perf/W and perf/mm2. ( I don't see what else the 'better at everything' could mean.)

Then he's essentially writing today that his last 20 months of Kepler reporting were 100% incorrect.

Only a month ago he was writing about how his moles were whispering stories about epic fail.

I guess that would settle things once and for all: 'semi' accurate would be a gross exaggeration. One can only hope. ;)
 
I've been pinching myself for a few minutes now, but it looks like Charlie is indeed saying very good things about Kepler: http://semiaccurate.com/2012/01/19/nvidia-kepler-vs-amd-gcn-has-a-clear-winner/

Considering the April - May timeframe it had better be better than whatever it is competing against or it ends up just repeating GF100 versus Cypress, but in a shorter time frame.

If it includes perf/mm^2 as well as power consumption advantages versus its direct competition I'll be suitably impressed.

And best of all...Nvidia FINALLY includes a DP connector. About freaking time. I wonder if it'll be able to support at least 3 monitors or if they are still stuck with only 2 simultaneous display outputs.

Then he's essentially writing today that his last 20 months of Kepler reporting were 100% incorrect.

Only a month ago he was writing about how his moles were whispering stories about epic fail.

Well he does mention that A2 silicon is what the production version will be. Perhaps A1 silicon was disastrous in performance or some other metric, hence the bad news from his moles.

Regards,
SB
 
Silent_Buddha said:
Well he does mention that A2 silicon is what the production version will be. Perhaps A1 silicon was disastrous in performance or some other metric, hence the bad news from his moles.
After all these years on this forum, I hoped that people here would understand that
A) metal spins are pretty much the rule in chip land
B) they fix minor functional issues

It's really hard to come up with a case where it can fix disastrous performance.
 
After all these years on this forum, I hoped that people here would understand that
A) metal spins are pretty much the rule in chip land
B) they fix minor functional issues

It's really hard to come up with a case where it can fix disastrous performance.

I could be wrong, but IIRC, R520 in the X1800 featured a quite disastrous problem in the metal layer that took them months to track down and fix.

Regards,
SB
 
I could be wrong, but IIRC, R520 in the X1800 featured a quite disastrous problem in the metal layer that took them months to track down and fix.
Yes. If I remember correctly, some via issue in a custom designed cell.

Most of the time, any metal spin fixes multiple problems that are each blocking issues in their own way: a hang, a functional mistake (in a GPU that could be the wrong pixel ending up on the screen etc.), a violation of a certain specification, etc. It's usually not the case that silicon is usable without those fixes: it may be perfectly valid to develop 99.9% of the software and do a major part of all validation. But it can also not be sold because, one way or the other, it would result in unacceptable behavior for the customer.

I've never worked on a chip that did not have at least 1 metal spin. I've also never worked on a chip where disastrous performance could be fixed with a metal spin. That's because disastrous performance is almost always an architectural issue. You can't fix something with half a billion gates by metal patching 100 of them. (And 100 gates is generally considered a high complexity metal ECO.)

Well he does mention that A2 silicon is what the production version will be. Perhaps A1 silicon was disastrous in performance or some other metric, hence the bad news from his moles.
All I'm saying is: if he mentions that A2 is required, jumping to the conclusion that A1 may have had disastrous performance doesn't make sense. There are 100 reasons for doing metal spins; disastrous performance is but 1 of them, and a very low probability one at that.
 
While not impossible, I'd be hard pressed to see how Nvidia could have better performance. Not only is the memory interface only 256b, but I sincerely doubt Nvidia will clock it any higher than 4.8GT/s and even that sounds high - 4.4GT/s is a much better estimate. That means around 150GB/s compared to ~270GB/s for AMD. That's a really big deficit to make up.
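
The bandwidth figures above follow from bus width times per-pin data rate. Here is a minimal sketch of that arithmetic; the function name is made up for illustration, the 384-bit width for Tahiti comes from later posts in this thread, and the 5.5GT/s rate for the 7970 is the figure quoted in replies here.

```python
def peak_bandwidth_gbs(bus_width_bits, data_rate_gtps):
    """Peak memory bandwidth in GB/s: bytes moved per transfer times transfers per second."""
    return bus_width_bits / 8 * data_rate_gtps


if __name__ == "__main__":
    # Speculated GK104 configurations from the post (256-bit bus).
    for rate in (4.4, 4.8):
        print(f"256-bit @ {rate} GT/s: {peak_bandwidth_gbs(256, rate):.1f} GB/s")
    # HD 7970 (Tahiti): 384-bit bus at 5.5 GT/s.
    print(f"384-bit @ 5.5 GT/s: {peak_bandwidth_gbs(384, 5.5):.1f} GB/s")
```

These are theoretical peaks only; they say nothing about how efficiently either architecture uses the bandwidth it has.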

I'm also skeptical that Nvidia's highest end part is only 256b wide. That sounds rather unusual, given their history. But perhaps I'll be surprised. My guess though is that there is some missing or inaccurate information - or perhaps NV is going to do a dual-die for graphics...

DK
 
He says every metric. Perf/W or perf/mm² are possible, but that does not mean that GK104 is able to challenge the 7970 in anything but some hand-picked scenarios. Maybe Charlie is just raising expectations so that the actual launch will feel like a disappointment.
There is only one thing wrong with Tahiti and this is the price. If NV launches the GK104 at 299-329$ (which equals the 560 2GB at launch) they would probably sell lots and lots of those cards.
 
While not impossible, I'd be hard pressed to see how Nvidia could have better performance. Not only is the memory interface only 256b, but I sincerely doubt Nvidia will clock it any higher than 4.8GT/s and even that sounds high - 4.4GT/s is a much better estimate.
Even if Nvidia gets the same 5.5GT/s, it will still be at a major BW deficit, but why do you think they won't be able to get that? If Kepler is as impressive as Charlie suggests, they must have made major changes. The weak points of Fermi were obvious: power, shader performance, MC speed. It's only logical that this is what they had to focus on.
 
Even if Nvidia gets the same 5.5GT/s, it will still be at a major BW deficit, but why do you think they won't be able to get that? If Kepler is as impressive as Charlie suggests, they must have made major changes. The weak points of Fermi were obvious: power, shader performance, MC speed. It's only logical that this is what they had to focus on.
Yeah, I guess Cypress is a very relevant point of reference here: When people first saw the specs, they assumed it had to be highly bandwidth-limited, as AMD basically doubled shaders, TMUs and ROPs (compared to RV770) - but kept the (comparatively small) 256bit interface @ slightly improved memory clocks.

End of the story was that tiny little 256-bit HD5870 basically devastated mumbo-jumbo 512-bit GTX285 in every single way possible - and the performance advantage even seemed to increase in typically bandwidth-intensive scenarios. Just look at what Cypress did to GT200b once you enabled 8xAA @ ultra-high resolutions ... it's not even funny: About the same (even slightly less) BW - but 50% more performance ...
 
End of the story was that tiny little 256-bit HD5870 basically devastated mumbo-jumbo 512-bit GTX285 in every single way possible
Don't forget that GTX285 used slow GDDR3. In fact HD 5870 had higher memory bandwidth than GTX 285. But GK104 with GDDR5@256bit MC cannot offer higher bandwidth than GDDR5@384bit MC.
 
Don't forget that GTX285 used slow GDDR3. In fact HD 5870 had higher memory bandwidth than GTX 285. But GK104 with GDDR5@256bit MC cannot offer higher bandwidth than GDDR5@384bit MC.
GTX285 -- 166 GB/s
HD5870 -- 153 GB/s

;)
 
It's a pretty bad comparison. The 285 had 8% more bandwidth but the 5870 was far ahead in all other metrics and built on the smaller 40nm process. GK104 is probably going to come in way behind Tahiti in several areas (bandwidth, texturing, flops) and they share the same process.
 