NVIDIA Kepler speculation thread

TDP is not 225W. Someone just added up the 2x6-pin connectors plus slot power and sold it as TDP. And even if the TDP were 225W, that doesn't equal consumption, which could be significantly lower.
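For reference, the back-of-the-envelope arithmetic behind that 225W figure (a sketch using the PCIe spec limits, nothing Kepler-specific):

# Maximum deliverable board power per the PCIe spec, in watts
PCIE_SLOT = 75                 # power available from the slot itself
SIX_PIN = 75                   # power per 6-pin PEG connector

max_board_power = PCIE_SLOT + 2 * SIX_PIN
print(max_board_power)         # 225 -- a ceiling on what the connectors can deliver, not a TDP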
 
Anyway, this situation seems very similar to late 2009. When AMD launched the HD 5800, Nvidia promised better pricing, better availability, better technologies, better performance... they would promise anything, including stopping global warming, to ruin a competitor's launch, while they had nothing in hand, just like now :)

nVidia hasn't promised a thing; they've actually been dead quiet. All they've said to date is that Kepler has up to 2.5x the DP perf/W of Fermi, and that claim was made a long time ago, well before Tahiti showed up. Or is Charlie now nVidia's mouthpiece? Haha, now that would be funny.

In any case there's no need to ruin the 7970 launch. You can barely find one in stock. If they start blabbing it will be when they actually feel threatened or insecure about Kepler's prospects or when they have something concrete to show.
 
There are a couple of different, IMO realistic, scenarios:

- GK104 beats GCN, but it isn't Tahiti it's beating; still, the conclusion can be drawn that if GK104 beats Pitcairn, high-end Kepler will beat Tahiti

- GK104 beats Tahiti, but is also the highest-end chip there will be for desktops (there were rumors of a "monster chip" coming later, but only for HPC and similar markets, not desktop)

The possibility that it's a midrange chip beating Tahiti of course exists, but the chances of that are extremely slim, considering how close the two companies have been in performance ever since the R300/R350 vs NV30/NV35 DX9 fiasco. The largest difference I can remember since then would be the 8800 GTX/Ultra vs the 2900 XT, and even then the 2900 XT still fought head to head on all fronts against the 8800 GTS, which was considered high-end.
 
NV needs to absorb as many customers as possible from Tahiti, and their target for GK104 was GF110+30%; whether they were able to reach that with their current silicon is still up in the air. Now, it may not beat a HD 7970 hands-down (maybe in some cases, and if it did, I would be extremely impressed), obviously, but that's not what this is about.

It has to replace GF110, it has to be very competitive or better in terms of TDP, performance, die size and it has to have a price that attracts a lot of customers.

The only thing that still needs to be taken care of is TSMC, whose 28HP process still seems to have very high variability and is also missing some production lines, which can only be installed so fast. That takes time, and NV has to wait for it. Orders should have been placed for March.

This is why Huang said that we, or rather "you, have to be patient about it". You have to be patient because it takes time for NV to finally have a product in hand, backed by enough volume from the foundry to make an impression and reach customers.

Also, I've heard that GF11x is targeted for desktops as well. That would make it even more interesting.
 
Even if Nvidia gets the same 5.5GT/s, it will still be at a major BW deficit, but why do you think they won't be able to get that? If Kepler is as impressive as Charlie suggests, they must have made major changes. The weak points of Fermi were obvious: power, shader performance, MC speed. It's only logical that this is what they had to focus on.

I suspect that AMD does custom design for their memory controller PHYs, whereas Nvidia is probably just using whatever is available at TSMC. This is probably the reason why AMD has always had significantly faster and denser memory interfaces.

It's possible that Nvidia decided to do custom design for their PHYs, but I'm skeptical. That seems out of character for Nvidia, as physical design has never been their forte. Given that assumption, I'd expect that Nvidia's memory interface runs at 4.4-4.8GT/s (which is still 10-20% faster than 40nm). As you pointed out though, even if they hit 5.5GT/s, that would put them significantly behind AMD.
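To put rough numbers on that (a quick sketch; the 256-bit width for GK104 is the rumored figure, not a confirmed spec):

# Peak memory bandwidth in GB/s from bus width and transfer rate
def bandwidth_gbs(bus_bits, gt_per_s):
    return bus_bits / 8 * gt_per_s

print(bandwidth_gbs(384, 5.5))   # 264.0 -- Tahiti (HD 7970)
print(bandwidth_gbs(256, 4.4))   # 140.8 -- low end of my GK104 guess
print(bandwidth_gbs(256, 5.5))   # 176.0 -- even if they match AMD's data rate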

So my guess is that this might be more of a high-end card, but there is another SKU which is even higher performance. Anyway, we'll see.

DK
 
There's pretty much zero chance that nVidia is building an HPC only card. That market doesn't produce enough revenue to justify a dedicated ASIC.

The most realistic scenario is that nVidia aren't magicians and they simply have a chip that can spar with Tahiti but at a much lower price.
 
I don't think that GK104 will beat Tahiti. But I think it'll make AMD slash 7900 prices by quite a considerable amount. And don't forget that GK104 isn't NV's top Kepler. Also, it's not the lowest.
 
TDP is not 225W. Someone just added up the 2x6-pin connectors plus slot power and sold it as TDP. And even if the TDP were 225W, that doesn't equal consumption, which could be significantly lower.

Yes, I know that, but the GTX 460 has a 160W TDP and the 560 Ti 175W; 225W for a hypothetical GTX 660 (rumors say there will be a GTX 660 Ti too...) is too much, IMHO... and what about the top one?
 
We don't know the TDP of GK104. All we know is that the card has two six pin connectors. What are you guys arguing over? :)
 
We don't know the TDP of GK104. All we know is that the card has two six pin connectors. What are you guys arguing over? :)


To be frank, we don't know that it has 2x 6-pin connectors; we have some source(s) claiming it does.
 
dkanter said:
I suspect that AMD does custom design for their memory controller PHYs, whereas Nvidia is probably just using whatever is available at TSMC.
Are you serious? TSMC hasn't offered this kind of IP in years. You can buy some of it from third parties, but it's never available in the aggressive process time frames that AMD and Nvidia need. Pretty much all major and not so major fabless silicon companies design their own cells except for the standard cell library: PLLs, A/D, D/A, pads etc.

There was a time when fabs provided extensive IO pad libraries, even for SDRAM interfaces, but a bare IO pad doesn't cut it anymore at today's speeds: companies use high-complexity macro blocks with calibration and retiming logic built in. This logic is intimately connected to the memory controller. You can't take off-the-shelf IP and expect it to work.

IP providers in general have quit this market anyway, forced or by choice: there's no money in design IP, ARM being the notable exception. (We're still waiting for even success story IMG to turn a sustainable profit.) This is even more so for analog IP.

As you pointed out though, even if they hit 5.5GT/s, that would put them significantly behind AMD.
Yes, but the question is one of the overall memory architecture, caches included.
And for the 7970, I've seen benchmarks where, with the MC clock drastically reduced to emulate a 256-bit MC, performance moved by only a few % for most (but not all) applications.
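The emulation works because peak bandwidth scales linearly in both bus width and clock, so clocking a 384-bit MC down to 2/3 speed gives the same bandwidth as a 256-bit MC at full speed (a sketch of the arithmetic, assuming the 7970's stock 5.5GT/s):

# Downclocking a 384-bit bus by 256/384 matches a 256-bit bus at full rate
full_rate = 5.5                          # GT/s, HD 7970 stock
emulated_rate = full_rate * 256 / 384    # ~3.67 GT/s
print(384 / 8 * emulated_rate)           # 176.0 GB/s
print(256 / 8 * full_rate)               # 176.0 GB/s -- identical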
 
Pretty much all major and not so major fabless silicon companies design their own cells except for the standard cell library: PLLs, A/D, D/A, pads etc.

Which raises the question of why AMD has been so much better at it for so long. Is it one of those ancient arts that you can't learn in grad school?
 
Yes, but the question is one of the overall memory architecture, caches included.
And for the 7970, I've seen benchmarks where, with the MC clock drastically reduced to emulate a 256-bit MC, performance moved by only a few % for most (but not all) applications.

So have I, but those benchmarks were limited to single-monitor setups and MSAA 4X, no 8X.

I suspect 5760×1200 with MSAA 8X would be a different story.
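Rough framebuffer arithmetic shows why (a sketch assuming uncompressed 32-bit color plus 32-bit depth/stencil per sample; real numbers with the hardware's compression would be lower):

# Approximate multisampled framebuffer footprint at Eyefinity resolution
width, height = 5760, 1200
samples = 8
bytes_per_sample = 4 + 4                 # 32-bit color + 32-bit depth/stencil

fb_bytes = width * height * samples * bytes_per_sample
print(fb_bytes / 2**20)                  # ~422 MiB of render targets to push around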
 
I don't think that GK104 will beat Tahiti. But I think it'll make AMD slash 7900 prices by quite a considerable amount. And don't forget that GK104 isn't NV's top Kepler. Also, it's not the lowest.

If it is close in TDP and area, it wouldn't matter much.
 
So have I, but those benchmarks were limited to single-monitor setups and MSAA 4X, no 8X.

I suspect 5760×1200 with MSAA 8X would be a different story.

Yeah, probably, but when talking about whether a lower-bandwidth GK104 could challenge Tahiti, those settings are pretty inconsequential for the vast majority of potential customers. AMD can't rely on Eyefinity users for its bread and butter, even at $550.
 
Which raises the question of why AMD has been so much better at it for so long. Is it one of those ancient arts that you can't learn in grad school?
I think Nvidia was caught off guard by RV770 using GDDR5, so Fermi was their first generation that really depended on GDDR5. It's really only one generation where the difference was very important. (A faster GDDR5 would not have lifted GT215 from its mediocrity, so there was no point speeding it up.)

There are much better explanations than lacking an ancient art. ;)

Spec priorities, engineering resources, time to market and the inability to accurately predict the future. For all their differences, I always find it striking how close AMD and Nvidia are for each generation.

Maybe, when initially architected, Fermi was supposed to be on the market 1.5 years earlier? In that case it would have made sense to spend less time on timing optimization, because high-speed GDDR5 wouldn't have been ready yet?

Maybe it was not that late, but high-speed GDDR5 came to market quicker than initially promised?

Maybe optimization was ongoing but aborted, because getting the product to market at a lower speed was considered a reasonable trade-off versus not having a product on the market at all?

Maybe they believed that their current speed was simply good enough?

Don't forget that speeding up your RAM interface is about more than getting the PHY to run faster. There's a huge amount of very complicated logic behind it to drive it. Even if that logic runs at 1/4 the speed, that's 1.375GHz when the interface is operating at 5.5GT/s. Speeding this up is a lot of effort.
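To make that concrete (a sketch; the 1/4 divider is just the illustrative ratio from above):

# Controller-side clock needed to feed a GDDR5 interface
def controller_clock_ghz(transfer_rate_gts, divider=4):
    return transfer_rate_gts / divider

print(controller_clock_ghz(5.5))   # 1.375 GHz at 5.5 GT/s
print(controller_clock_ghz(4.0))   # 1.0 GHz -- Fermi-class rates are far less demanding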
 
In any case there's no need to ruin the 7970 launch. You can barely find one in stock.
Here in Germany they're actually pretty available: not from every shop at every single point in time, but you could always get one, provided you were willing to shell out 500+ EUR.

It's possible that Nvidia decided to do custom design for their PHYs, but I'm skeptical. That seems out of character for Nvidia, as physical design has never been their forte.

Is there a large difference between going custom for PHYs and going custom for purely digital logic like their shaders? Your post makes it sound like there is.

Yes, but the question is one of the overall memory architecture, caches included.
And for the 7970, I've seen benchmarks where, with the MC clock drastically reduced to emulate a 256-bit MC, performance moved by only a few % for most (but not all) applications.

This for example:
http://www.computerbase.de/artikel/...deon-hd-7970/20/#abschnitt_384_bit_in_spielen

Their rating says it's a 14-15 percent gain for 50 percent more memory bandwidth.
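In other words, performance scaled at roughly 30% of the bandwidth increase (a quick sanity check on those numbers):

# How much of the extra bandwidth translated into performance?
bw_gain = 0.50               # 384-bit vs. emulated 256-bit
perf_gain = 0.145            # ComputerBase's ~14-15% average
print(perf_gain / bw_gain)   # ~0.29 -- most workloads aren't bandwidth-bound at these settings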
 
So have I, but those benchmarks were limited to single-monitor setups and MSAA 4X, no 8X.

I suspect 5760×1200 with MSAA 8X would be a different story.
Sure, but who cares? ;)
If you can get close to the competition's performance at a lower cost while still reaching 90% of your customers, that's a reasonable trade-off. AMD has been doing it for years.

The 384-bit interface for 7970 was dictated by HPC, IMO. That's where bandwidth is king.
 