NVIDIA Kepler speculation thread

I see different, I don't see better

It is very simple- Tahiti is bigger, more power hungry and offers much lower margins overall compared to GK104. Tahiti does the same as what the R600 did compared to 8800 GT (GTX 680). While Titan is the new damage aka 8800 Ultra
History repeats itself.

And yeah, architectures obviously have their weaknesses and strengths but AMD arranged the things in such a way that Nvidia with the smaller chip can achieve the same performance and gain noticeably higher margins...
 
Implementation != Architecture. Tahiti's trade-offs are very product specific, made to address different segments.

Look elsewhere and you'll see the performance/area or performance/watt relationship look very different. For instance, the top end of Pitcairn is not far from the performance of the bottom GK104 SKU; the top SKU for GK106 (a slightly larger die than Pitcairn) sits between the Pitcairn SKUs; and the bottom of GK106 is in a solution cost space close to Bonaire, which is significantly smaller.
 
It is very simple- Tahiti is bigger, more power hungry and offers much lower margins overall compared to GK104.

It's also faster. For ~25% more area you get ~15% more performance. As I mentioned before Titan has ~30% more performance for ~50% extra area. Those numbers look quite similar to me.
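As a quick sanity check of that comparison, the rough percentages above can be turned into perf/area ratios relative to GK104. This is only a sketch using the approximate figures quoted in this post; the helper name `perf_per_area` is made up for illustration. It works out to roughly 0.92x for Tahiti and 0.87x for Titan.

```python
# Relative figures quoted in the post, normalised to GK104 = 1.0.
# These are the poster's rough estimates, not measured data.
chips = {
    "Tahiti vs GK104": {"area": 1.25, "perf": 1.15},
    "Titan vs GK104": {"area": 1.50, "perf": 1.30},
}


def perf_per_area(perf, area):
    """Performance per unit of die area, relative to the baseline chip."""
    return perf / area


for name, c in chips.items():
    ratio = perf_per_area(c["perf"], c["area"])
    print(f"{name}: {ratio:.2f}x perf/area")
```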

Tahiti will have ~155 die candidates per wafer compared to GK104's ~195, but the market is so small that it barely matters.
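Die-candidate counts like these can be approximated with a common first-order formula: wafer area divided by die area, minus an edge-loss term. The sketch below assumes a 300 mm wafer and a 294 mm^2 GK104, and takes Tahiti as 25% larger per the figure above; it ignores scribe lines, edge exclusion and die aspect ratio, so it only lands in the same ballpark as the numbers quoted.

```python
import math


def die_candidates(die_area_mm2, wafer_diameter_mm=300.0):
    """First-order estimate of die candidates per wafer.

    gross     = wafer area / die area
    edge_loss = wafer circumference / sqrt(2 * die area)
    """
    gross = math.pi * (wafer_diameter_mm / 2) ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(gross - edge_loss)


gk104_area = 294.0                # assumed GK104 die size in mm^2
tahiti_area = gk104_area * 1.25   # assumption: "~25% more area"

print("GK104 candidates:", die_candidates(gk104_area))
print("Tahiti candidates:", die_candidates(tahiti_area))
```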
 
It is very simple- Tahiti is bigger, more power hungry and offers much lower margins overall compared to GK104. Tahiti does the same as what the R600 did compared to 8800 GT (GTX 680). While Titan is the new damage aka 8800 Ultra
History repeats itself.

And yeah, architectures obviously have their weaknesses and strengths but AMD arranged the things in such a way that Nvidia with the smaller chip can achieve the same performance and gain noticeably higher margins...

You should narrow down your argument to gaming, and even then you should narrow it down to 2012/13 and older titles, because heavy compute driven game engines are favouring Tahiti and let's not even mention compute outside of gaming.
 
You should narrow down your argument to gaming, and even then you should narrow it down to 2012/13 and older titles, because heavy compute driven game engines are favouring Tahiti and let's not even mention compute outside of gaming.

Which titles are "heavy compute driven"? Tahiti is not better than GK104 in that regard when we look at the most advanced and demanding titles today. Just compare the similarly specced GK104 and Tahiti LE and you'll see that they operate on the same performance level. If Tahiti is faster, it's due to 30% more GFLOPs and 33-50% more memory bandwidth, not due to the architecture itself.
 
One of the big visual differences between Tahiti and the smaller chips is the size of the GDDR interface. There are quite a few mm² just for the PHY.
If there were a hypothetical variant that dialed back on the interface, the perf/mm² might look a little better, although that bandwidth gets leveraged pretty heavily once things kick into gear.
 
Really?

I see different, I don't see better. There are legitimate arguments to indicate that the tradeoffs made for Kepler are a blind alley and they will need to bias back for future gen.

Would you care to elaborate? Or point me to some discussion of said arguments somewhere else.

I suspect you're referring to registers/caches, but I'd be interested to have your/AMD's perspective on this.
 
Kepler's exposure of instruction latency to the compiler might be something to quibble over.
At the same time, it also continues to maintain the leaky SIMT abstraction, which in turn may have something to do with not having an exposed scalar unit like GCN or an integer domain on a CPU.
These at least run counter to general computing norms.

The way shared memory and global memory are differentiated is different from CPUs and from GCN. I'm not certain from an ISA standpoint if that concept will persist. Bonaire's level adds a flat addressing mode that tries to handle regular and LDS data as part of a single space, although there are hazards the software must watch out for.
 
I'm not sure it really matters how many additional CUs you throw onto Tahiti. It's still going to have the same (or very slightly higher) memory bandwidth with current technology, so is Tahiti really going to be able to achieve 30% more performance with the same memory bandwidth?

Also there's a lot of talk about GK110 being "gaming focussed" but I thought GK110 included all the extra compute capability that was lacking in GK104. So it's pretty similar to Tahiti in that respect isn't it?
 
Would you care to elaborate? Or point me to some discussion of said arguments somewhere else.

I suspect you're referring to registers/caches, but I'd be interested to have your/AMD's perspective on this.

Dave drops some good hints every so often and it's worthwhile paying attention to them and reading between the lines. ;)

AMD is driving the industry towards compute driven games. This will also help them immensely with APUs as it helps to lessen the bandwidth penalty. They have the consoles, they have the game devs and they have the cards. Basically speaking, everything is already in place and Nvidia will have to do a 180 under these conditions.
 
You managed to put quite a lot of BS into such a short post there SB.

Really?

Pitcairn is 212 mm^2.
GK106 is 221 mm^2

GK106 is unable to match Pitcairn in either performance or power consumption under load. Now the difference isn't huge, but it is notable. If AMD had made a GK104 sized chip based on a similar design philosophy as Pitcairn it likely would outperform GK104 as well. Not by much, but still notable.

Regards,
SB
 
Dave drops some good hints every so often and it's worthwhile paying attention to them and reading between the lines. ;)

AMD is driving the industry towards compute driven games. This will also help them immensely with APUs as it helps to lessen the bandwidth penalty. They have the consoles, they have the game devs and they have the cards. Basically speaking, everything is already in place and Nvidia will have to do a 180 under these conditions.

What games are we talking about? As I've said, GCN is nothing special when it comes to the latest games. If Nvidia were to beef up GK104 to 2048 CUDA cores and give it 250-288 GB/s bandwidth, Tahiti wouldn't stand a chance.
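For reference, peak GDDR5 bandwidth is just bus width times per-pin data rate. The sketch below uses illustrative configurations (a 256-bit and a 384-bit bus, both at 6 Gbps per pin); these are assumptions for the arithmetic, not specs quoted in this thread, and `bandwidth_gb_s` is a made-up helper name.

```python
def bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s: bytes per transfer times per-pin rate."""
    return bus_width_bits / 8 * data_rate_gbps


# Illustrative configurations (assumed, not taken from the thread).
narrow = bandwidth_gb_s(256, 6.0)   # 256-bit bus at 6 Gbps per pin
wide = bandwidth_gb_s(384, 6.0)     # 384-bit bus at 6 Gbps per pin

print(f"256-bit @ 6 Gbps: {narrow:.0f} GB/s")
print(f"384-bit @ 6 Gbps: {wide:.0f} GB/s")
print(f"wide / narrow: {wide / narrow:.2f}x")
```

Under these assumptions, reaching the top of the 250-288 GB/s range means a wider bus or faster memory than a 256-bit, 6 Gbps setup provides.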
 
Really?

Pitcairn is 212 mm^2.
GK106 is 221 mm^2

GK106 is unable to match Pitcairn in either performance or power consumption under load. Now the difference isn't huge, but it is notable. If AMD had made a GK104 sized chip based on a similar design philosophy as Pitcairn it likely would outperform GK104 as well. Not by much, but still notable.

Regards,
SB

To be honest, GK106 is not far off depending on what review one looks at, and you also said that a 294mm2 uber Pitcairn could reach Titan in performance... That was by far the biggest issue I had with your post.

I do admit that I did undervalue the performance of Pitcairn a little bit, but mostly compared to Tahiti.
 
If AMD had made a GK104 sized chip based on a similar design philosophy as Pitcairn it likely would outperform GK104 as well. Not by much, but still notable.

So then why haven't they?

It's been over a year since GK104 was released, and with 28nm now mature, no magical Pitcairn-style GK104 has been seen.

Seems more like AMD knows reasons why it cannot or should not be done.
 
From a product perspective there is no need; Tahiti already fulfils that performance role in the gaming segment. Their point is illustrative: with a different emphasis on implementation choices, that type of performance level can be achieved at a smaller die size.
 
It is very simple- Tahiti is bigger, more power hungry and offers much lower margins overall compared to GK104.
It's a little bigger because it has compute features, not because of a fundamental architecture flaw.

It probably has lower margins, but given the high price they can still command (by historical standards) it should still be extremely profitable. Nvidia managed to be very profitable with Fermi, not exactly a poster child of area and power efficiency.

It's a bit more power hungry. Not a big deal.

Tahiti does the same as what the R600 did compared to 8800 GT (GTX 680). While Titan is the new damage aka 8800 Ultra
History repeats itself.
You're comparing Tahiti to R600? The chip that was 9 months late and didn't even come close to its competitor? Seriously?

And yeah, architectures obviously have their weaknesses and strengths but AMD arranged the things in such a way that Nvidia with the smaller chip can achieve the same performance and gain noticeably higher margins...
Dave mentioned some time ago that AMD is finally starting to see some traction in professional work. If this is due to GCN's DP support (not a bad guess, IMO) then the slightly lower margin on 7970 is a price well worth paying.

The only thing botched with 7970 was the launch wrt initial performance. Yeah, not a minor thing, but it has nothing to do with broken silicon or a broken architecture or implementation.

(Expecting an "I'm right no matter what you say. Point." answer)
 
How do people see Tahiti's compute capabilities stacking up against Titan?

Titan is a very impressive performer in certain CUDA workloads, but Tahiti delivers better overall performance in OpenCL benchmarks and applications. That could however be due to immature drivers from Nvidia or a deliberate lack of focus on OpenCL to extend the life of CUDA.

I've also heard it mentioned on this board that it is more difficult to extract maximum GPGPU performance from Kepler than it is on Fermi or GCN, but the rewards from doing so can be significant.
http://beyond3d.com/showpost.php?p=1711202&postcount=1238
http://beyond3d.com/showpost.php?p=1721584&postcount=1492
 