Nvidia BigK GK110 Kepler Speculation Thread

The 7870 LE is really scraping the bottom of the Tahiti barrel, like the old 4830/5830 before it. Take the absolute worst of the salvageable Tahiti dies, run as much voltage through them as you can, and you're there.

You just can't compare this to the 670 and expect a similar matchup next series. What you can expect is AMD's marketing to fuck it up horribly and make Nvidia's tier-below card look like it's good competition anyway.
 

My point was simply about raw power and what Nvidia and AMD can do with it. It's not about marketing, power efficiency or bottom-scraping ;)

Right now, AMD has almost 30% more compute power (4.3TF vs 3.3TF) and 50% more bandwidth than Nvidia, yet cannot translate that into even a 30% lead on average.
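
Just as a sanity check on those numbers, here's a rough back-of-the-envelope sketch in Python. The unit counts are the published ones; the boost clocks are typical quoted values and therefore an assumption on my part:

Code:
# Rough sanity check of the quoted spec gap: HD 7970 GHz Edition vs GTX 680.
# Clocks are typical advertised boost values (an assumption), not measured ones.

def tflops(alus, clock_ghz):
    return alus * clock_ghz * 2 / 1000.0  # 2 FLOPs per ALU per clock (FMA)

def bandwidth_gbs(bus_bits, gbps_per_pin):
    return bus_bits / 8 * gbps_per_pin

amd_tf, nv_tf = tflops(2048, 1.05), tflops(1536, 1.06)            # ~4.3 TF vs ~3.3 TF
amd_bw, nv_bw = bandwidth_gbs(384, 6.0), bandwidth_gbs(256, 6.0)  # 288 vs 192 GB/s

print(f"compute lead:   {amd_tf / nv_tf - 1:+.0%}")   # roughly +30%
print(f"bandwidth lead: {amd_bw / nv_bw - 1:+.0%}")   # +50%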
 
I think it depends a lot on what that compute power is actually being used for. Clearly AMD has a far bigger lead in a lot of compute benches.

In terms of gaming efficiency, then yes, I think Nvidia is very slightly ahead compared to last series; however, this was exacerbated by AMD being bandwidth-heavy. It's unlikely we will see any changes from them, whereas Nvidia will surely be upgrading the bus width at every level from the 660 upwards. When that happens, the perf/watt situation will swing back in favour of AMD.

The actual power penalty of carrying that extra bus (and extra memory, where applicable) is a lot higher than the performance gain it brings.
 
Gaming of course. I thought that much was clear.

Well it depends again. See Dirt Showdown as an example. :p

I would advise against believing Nvidia is about to deliver some kind of knockout blow, if that's what you're imagining is about to happen. AMD has a history of getting a lot out of the same process and it would be wise to keep that in mind. ;)
 
AMD was directly involved in writing the renderer of Dirt Showdown. That is a bad example, since they likely made sure it runs like crap on Nvidia hardware.

I don't believe in a knockout blow, but a healthy advantage. Nvidia still has some 60-80W of headroom (depending on application/game) before a 250W TDP, and they can increase bandwidth considerably - AMD cannot.
 
Nvidia will lose most of that 60-80W on the extra bus and memory alone. Tahiti was mediocre, AMD won't release two mediocre series in a row on the same process.

Truly... don't be surprised if it's a lot closer than you imagined. Nvidia should "win", but you'll be left with the usual hollow feeling that most "winners" of the past 3 years have given.
 
More like ~40%, going by Hardware.fr's numbers.

Still a pretty solid lead over the 7970 GHz Edition, though.

In strictly GPU-limited situations the 690 is faster relative to the 680 than that 1080p comparison implies. Hardware.fr's numbers have some scaling issues as well. IMO, if you only want to measure GPU power, then you should make sure the GPU actually has work to do, instead of being limited by something else.

I think pjbliverpool's number was pretty much spot on, or perhaps even slightly conservative. The suggested price is terrible, though.
 
Nvidia will lose most of that 60-80W on the extra bus and memory alone. Tahiti was mediocre, AMD won't release two mediocre series in a row on the same process.

Truly... don't be surprised if it's a lot closer than you imagined. Nvidia should "win", but you'll be left with the usual hollow feeling that most "winners" of the past 3 years have given.

The memory (if it is 3GB) will not use more power. As far as I know, only an increased number of memory chips will do that, and the memory bus itself doesn't consume that much power. Look at the K20X, which has 4TF and 24 memory chips on a 384-bit bus, and consumes 235W max. IIRC, 1GB of GDDR5 (4x 2Gbit chips) uses 8-10W.
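
Putting a rough number on that, using the 8-10W per GB figure above (the 2GB/256-bit baseline is just a hypothetical comparison point of mine):

Code:
# Back-of-the-envelope extra DRAM power for a wider memory config,
# assuming ~8-10 W per GB of GDDR5 (4 x 2 Gbit chips) as quoted above.

WATTS_PER_GB = (8.0, 10.0)          # low/high estimate

def mem_power(gigabytes):
    return tuple(gigabytes * w for w in WATTS_PER_GB)

narrow = mem_power(2)               # hypothetical 2 GB / 256-bit card
wide   = mem_power(3)               # hypothetical 3 GB / 384-bit card

lo, hi = (w - n for w, n in zip(wide, narrow))
print(f"extra DRAM power: ~{lo:.0f}-{hi:.0f} W")   # ~8-10 W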
 
Like that leak hinted, I think nVidia is going to price and market this as a special edition card, super expensive. This'll be faster and more expensive than the GTX 780 or AMD's next series. nVidia got to taste the sweet margins with GK104 and they aren't going to go back and sell these huge chips as mainstream GTX parts unless absolutely necessary. I believe the GTX 780 will probably be a 10 or 12 SMX part that is close to AMD's next top card in performance, but still not as fast as this Titan.
 

Which is still a strange move... it can work on the marketing side, but only so much. For a month or two, reviewers will praise the card in every review as a monster, but it will quickly turn out to be barely available and impossible to find. Plus, the price itself will put the card in its own league, and it will end up competing with different cards (what if you can buy two $300 cards and be as fast as this one, when you can't even find this $900 Titan?).

People were already complaining about the $550 price tags for "top single-GPU cards" last year; I'm not sure how a $900 GPU that is nearly impossible to find and buy will be received.

I take this Titan information with a bit of caution...

Now, the good thing for Nvidia is that whatever happens with AMD's performance versus their own series, they can release a refresh while waiting for Maxwell: a GTX 785.
 
AMD was directly involved in writing the renderer of Dirt Showdown. That is a bad example, since they likely made sure it runs like crap on Nvidia hardware.
The renderer had to be done several months before AMD learnt any details about GK104. Anyway, you compared GFLOPS and bandwidth, but ignored texturing and rasterizing power. The GTX 670 has 17% higher texturing power than Tahiti LE, so it should be faster in scenarios that aren't limited by arithmetic power.
 
Texturing power is hardly relevant today at these levels. The GTX560 Ti has more texel fillrate than the GTX580 and yet the latter is 40% faster. The GTX680 has 2.5x the texture fillrate of the GTX580 and is at best 50% faster under certain conditions. Shall I go on? :)
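
For reference, here are the texel fillrates behind those comparisons, as a quick sketch; TMU counts and reference clocks are the commonly listed ones, so treat the exact figures as approximate:

Code:
# Approximate texel fillrates: TMU count x reference core clock.
# Real cards (especially the GTX 680 with Turbo) will deviate somewhat.

def texel_fillrate_gtexels(tmus, clock_mhz):
    return tmus * clock_mhz / 1000.0

cards = {
    "GTX 560 Ti": (64, 822),
    "GTX 580":    (64, 772),
    "GTX 680":    (128, 1006),
}
for name, (tmus, mhz) in cards.items():
    print(f"{name}: ~{texel_fillrate_gtexels(tmus, mhz):.0f} GTexel/s")
# GTX 560 Ti ~53, GTX 580 ~49, GTX 680 ~129 (roughly 2.5-2.6x the GTX 580)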

The GTX670 and the 7870 LE have the same pixel fillrate. The 7970 GE and the GTX680 also have the same fillrate (both boost to about 1050 MHz).

If those scenarios you speak of exist, they are very very rare and hardly qualify for judging performance in an average of many games. What matters today is compute power and bandwidth.

As for Dirt Showdown:
Kepler isn't that different from Fermi. I'm not that naive to trust AMD (or Nvidia for that matter) to play fair when they get their hands on the code or can influence devs this heavily. Think HawX 2, Crysis 2 etc. They would be stupid not to take that opportunity to promote their own cards.
 
People were already complaining about the $550 price tags for "top single-GPU cards" last year; I'm not sure how a $900 GPU that is nearly impossible to find and buy will be received.

Well I think that like the 690, these will be easy to find after the initial wave of purchases is done.
 
The renderer had to be done several months before AMD learnt any details about GK104. Anyway, you compared GFLOPS and bandwidth, but ignored texturing and rasterizing power. The GTX 670 has 17% higher texturing power than Tahiti LE, so it should be faster in scenarios that aren't limited by arithmetic power.

The renderer seemed to be a last-minute add-on when Dirt Showdown shipped. Global Illumination, for example, was present only greyed-out in the game's menu and you could only enable it in the XML config file.

FWIW:
http://www.pcgameshardware.de/Dirt-...nchmarks-zur-Erweiterten-Beleuchtung-1018360/
There's a table at the end, showing the performance implications of Forward+ and Global Illumination separately.
 
In strictly GPU-limited situations the 690 is faster relative to the 680 than that 1080p comparison implies. Hardware.fr's numbers have some scaling issues as well. IMO, if you only want to measure GPU power, then you should make sure the GPU actually has work to do, instead of being limited by something else.

I think pjbliverpool's number was pretty much spot on, or perhaps even slightly conservative. The suggested price is terrible, though.

There are 2560×1600 numbers on this page as well. I got about 35% in 1080p and about 45% in 2560×1600, which is why I called it ~40%.

But look at it this way: the best GK110-based Tesla has 2688 shaders at 732MHz, for a TDP of 235W. Let's assume that this TDP would hold in games, not just in compute workloads that don't make much use of dedicated graphics hardware.

NVIDIA could enable the remaining SMX on a GeForce card. It may seem a bit unlikely because low-volume, high-margin products like Teslas are where you'd expect them to do that, but perhaps they're disabling one SMX for power more than for yields; or other financial/stock management reasons.

Now, let's say NVIDIA manages to enable this remaining SMX, and let's say it was disabled for yields, not power, and therefore enabling it does not result in a super-linear power increase. Just enabling this additional SMX takes us to about 250W with linear scaling (15/14 × 235). But a GPU is more than SMXs, so let's say NVIDIA can increase the clock speed a bit too, to 750MHz. Both NVIDIA and AMD seem to agree that 250W is the highest acceptable TDP on a single-GPU card, so I'll stop there.

Comparing this to the GTX 680, we get approximately:

Code:
(shaders)	×	(clocks)
(2880/1536)	×	(750/1070*) = 1.31, or a 31% improvement.
Maybe I'm being too conservative with clocks, maybe NVIDIA will manage to go a bit higher, perhaps with Turbo. So maybe the theoretical improvement is more like 35%, or even 40%. And that's an upper bound, assuming ideal scaling.
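
To make the sensitivity to the clock assumption explicit, here's the same naive estimate as a few lines of Python; it is purely the shader-count × clock model above and deliberately ignores bandwidth, ROPs and any architectural differences:

Code:
# Naive upper-bound scaling estimate vs GTX 680: shader ratio x clock ratio.
# 1070 MHz is the assumed typical sustained Turbo clock of the GTX 680.

GTX680_SHADERS, GTX680_CLOCK_MHZ = 1536, 1070
GK110_SHADERS = 2880                      # 15 SMX fully enabled

for gk110_clock_mhz in (732, 750, 800, 850):
    speedup = (GK110_SHADERS / GTX680_SHADERS) * (gk110_clock_mhz / GTX680_CLOCK_MHZ)
    print(f"{gk110_clock_mhz} MHz -> +{(speedup - 1) * 100:.0f}%")
# 732 MHz -> +28%, 750 -> +31%, 800 -> +40%, 850 -> +49%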

I don't think it's realistic to expect it to be anywhere near 50% faster.

*Typically, 680s tend to stay in rather high Turbo most of the time.
 
We'll see after the Titan release from 3rd party independent measurements how it stacks up against a GTX680 or anything else.

However, whether 14 or 15 clusters are enabled at the same theoretical frequency, neither performance nor power consumption/TDP will change noticeably in either case.

Extrapolating GK110 desktop performance from sterile unit counts compared to GK104 is somewhat nonsensical, since it would mean there's not a single difference between those two chips that could affect 3D performance. And if you have even a corner case of 3D with a pinch of compute added to the mix, it could get even more colourful.

What I personally want to see first is its MSRP. If the rumored $899 should be true, I'll have a damn hard time justifying its performance difference compared to a GK104 whether it's 30, 40 or even 50%.

Besides, I'm afraid what you boys with that funky speculation tend to forget is that any of us with some good benchmarking experience could create a benchmark parcours where you get varying degrees of differences (always within limits, of course). That said, I'll be extremely disappointed (all the above aside) if the "Titan" isn't at least up to twice as fast as a GTX 580; the point everyone is curious about is obviously how often that occurs exactly, but once you're there, the very same question applies to about every GPU out there compared to its direct predecessor.

Huge surprises like R300 vs. NV30 or G80 vs. R600 appear extremely rarely, and I'm not even sure we'll see any such cases anymore in the future. The only other thing I'd like to see from BOTH IHVs this round is far more reasonable 28nm GPU prices. Not possible? I couldn't give a rat's ass; I'll just get whatever tablet and play Angry Birds until I get the first epileptic signs.

***edit: I asked someone if he could have a look at what the chip stamp says on a Tesla K20X; it read 35th week of '12, A1, if that's of any help.
 
There are 2560×1600 numbers on this page as well. I got about 35% in 1080p and about 45% in 2560×1600, which is why I called it ~40%.

But look at it this way: the best GK110-based Tesla has 2688 shaders at 732MHz, for a TDP of 235W. Let's assume that this TDP would hold in games, not just in compute workloads that don't make much use of dedicated graphics hardware.

NVIDIA could enable the remaining SMX on a GeForce card. It may seem a bit unlikely because low-volume, high-margin products like Teslas are where you'd expect them to do that, but perhaps they're disabling one SMX for power more than for yields; or other financial/stock management reasons.

Now, let's say NVIDIA manages to enable this remaining SMX, and let's say it was disabled for yields, not power, and therefore enabling it does not result in a super-linear power increase. Just enabling this additional SMX takes us to about 250W with linear scaling (15/14 × 235). But a GPU is more than SMXs, so let's say NVIDIA can increase the clock speed a bit too, to 750MHz. Both NVIDIA and AMD seem to agree that 250W is the highest acceptable TDP on a single-GPU card, so I'll stop there.

Comparing this to the GTX 680, we get approximately:

Code:
(shaders)	×	(clocks)
(2880/1536)	×	(750/1070*) = 1.31, or a 31% improvement.
Maybe I'm being too conservative with clocks, maybe NVIDIA will manage to go a bit higher, perhaps with Turbo. So maybe the theoretical improvement is more like 35%, or even 40%. And that's an upper bound, assuming ideal scaling.

I don't think it's realistic to expect it to be anywhere near 50% faster.

*Typically, 680s tend to stay in rather high Turbo most of the time.

I think that, based on previous Teslas, it's hard to figure out GeForce clocks from them. I wouldn't be surprised if this card turns out to be closer to 300W, and in any case I expect the clocks to be higher than 750MHz, more like 850MHz and up with Turbo. Perhaps even close to 690 clocks and TDP.

1080p was only part of the problem I had with those results. Dirt Showdown doesn't scale at all, and Crysis 2 doesn't even use full Ultra settings at 1080p in their test, just to name a few. If one cherry-picks only the most demanding tests from there (which IMO makes sense for measuring absolute GPU power), you can see that the 690 is a lot faster than the 680 - almost double minus the clock difference - and 85% of that amounts to quite a bit. But yeah, there are too many moving parts and a rumour behind it all, so let's hope it comes out soon and we'll see for ourselves.
 
Texturing power is hardly relevant today at these levels. The GTX560 Ti has more texel fillrate than the GTX580 and yet the latter is 40% faster. The GTX680 has 2.5x the texture fillrate of the GTX580 and is at best 50% faster under certain conditions. Shall I go on? :)
This comparison clearly shows how wrong it is to look at one aspect and ignore the rest. AMD and Nvidia have slightly different ALU:TEX ratios, so at a similar gaming-performance level, AMD has a higher ALU rate and Nvidia has better texturing performance. That's nothing new; it was similar in the G92 vs. RV770 days.

The 7970 GE and the GTX680 also have the same fillrate (both boost to about 1050 MHz).
Well, the 7970 GE doesn't hit 1050 MHz very often, while many review samples of the GTX 680 run mostly around 1100 MHz. That can affect the result of your comparison by up to 10%.

If those scenarios you speak of exist, they are very very rare and hardly qualify for judging performance in an average of many games. What matters today is compute power and bandwidth.
So why did Nvidia decide to double texturing power with GK104 (compared to GF114/110)? TMUs aren't exactly small (cheap) units. :smile:
 

You would have to ask Nvidia that. Maybe 12 TMUs per SMX wasn't feasible, and instead of sticking with 8 they went to 16? That doesn't necessarily mean anything.

The rest of what you are saying is all true, but looking at benchmarks across different generations, I cannot see fillrates being that important on average. Compare the 7970 GE with the GTX 670 if you want clocks that are closer together: their fillrates are equal, but the 7970 GE is 20-30% faster on average depending on settings. That would not be possible if fillrate were that important.
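
For completeness, the pixel fillrates behind that comparison; ROP counts are published, but the boost clocks here are typical values and an assumption on my part:

Code:
# Approximate pixel fillrates: ROPs x typical boost clock.
# Actual clocks vary with Turbo/PowerTune, so these are ballpark figures.

def pixel_fillrate_gpix(rops, clock_mhz):
    return rops * clock_mhz / 1000.0

print(f"HD 7970 GHz Ed.: ~{pixel_fillrate_gpix(32, 1050):.1f} GPixel/s")  # ~33.6
print(f"GTX 670:         ~{pixel_fillrate_gpix(32, 980):.1f} GPixel/s")   # ~31.4
# Roughly comparable fillrates, yet the average performance gap is 20-30%.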
 