NVIDIA Kepler speculation thread

So the obvious question is: how does it compare to 3x1080p on the HD 7970, AND was that 29 FPS measured on the central display or on the surround displays? :D
 
The resolution only increases the pixel load, not the tessellation load, playing into the HD 7970's hands.

On the topic of Heaven 3.0: results I've seen in the lab indicate that the 3.0 version seems to be quite a bit "faster", i.e. it achieves higher fps - at least on some cards.
 
I thought very high resolutions were a strong suit of AMD cards in general. But then again, 22 fps would make it roughly 30% faster than a 580, so that would be about normal, I guess.
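Quick sanity check on the arithmetic there (my own back-of-the-envelope; the 580 figure isn't actually given in the thread): 22 fps / 1.3 ≈ 17 fps, so a ~30% gap implies a stock GTX 580 managing roughly 17 fps in the same run.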
 
This is curious btw
[Image: 3dm11 gtx 680.png]


Notice the default clocks?

edit:
GPU-Z obviously detects the "shader clocks" wrong, most likely just "new enough NVIDIA, so report GPU clock * 2".
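
My guess at the kind of fallback that would produce that reading, sketched in C - this is not GPU-Z's actual code, and the helper names are made up:

Code:
/* Hypothetical sketch of a "core clock * 2" fallback. On Fermi the hot
 * clock really was twice the core clock, so blindly applying the same
 * rule to Kepler would explain the bogus shader clock reading. */
#include <stdio.h>

/* made-up helper: pretend every recent NVIDIA device id matches */
static int is_fermi_or_newer(int device_id) { (void)device_id; return 1; }

static int guess_shader_clock_mhz(int device_id, int core_clock_mhz)
{
    if (is_fermi_or_newer(device_id))
        return core_clock_mhz * 2;   /* right for Fermi, wrong for Kepler */
    return core_clock_mhz;           /* older parts: no doubled shader domain */
}

int main(void)
{
    /* 1006 MHz is the GTX 680 base clock; such a tool would then show 2012 MHz */
    printf("reported shader clock: %d MHz\n", guess_shader_clock_mhz(0, 1006));
    return 0;
}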
 
DarthShader said:
So when are we going to see the fixed bigK, aka GK100? Or at least GK110? GK112?? ;)
Well, somebody once claimed that it is fundamentally impossible to have 4x distributed geometry processing without having massive latency and gargantuan power consumption. That same someone then inexplicably went on to prove this by pointing at, wait for it: a cell characterization problem that was fixed with a metal patch. (WTF?) Then he murmured something about crossbars too, which is funny, because I didn't really expect a crossbar that serves memory to have much to do with distributing geometry at the front end. Right? It even made him speculate that GK104 would only have 2x geometry, because that's what sensible people would do.

It all makes me wonder: if Nvidia had used a 2x configuration instead of 4x, how much lower do you think the power consumption of GK104 could have been?

What a missed opportunity...

Edit: I forgot the best one: the distributed geometry architecture is responsible for increased power consumption during compute. You can't make this up...
 
The resolution only increases the pixel load, not the tessellation load, playing into the HD 7970's hands.

On the topic of Heaven 3.0: results I've seen in the lab indicate that the 3.0 version seems to be quite a bit "faster", i.e. it achieves higher fps - at least on some cards.

Judging by the release notes, it was released not long ago. I think the previous version of that benchmark was 2.5. I would like to see 3.0 vs. 2.5 on both cards, just to see whether performance has changed.
 
This is curious btw
http://www.sf3d.fi/sites/default/files/3dm11 gtx 680.png
Notice the default clocks?
Or you might simply ask Techpowerup's w1zzard whether GPU-Z 0.5.9 fully supports Kepler yet. [Hint: it doesn't, as you have noted above wrt the shader clocks].


I thought very high resolutions were a strong suit of AMD cards in general.

Yes, they let AMD's higher pixel and texture fillrates, as well as its compute throughput, play to their strengths (and the front end and load distribution are less of a stumbling block). In this case though, with the card already limited by tessellation performance, the usual advantage could very well be amplified quite a bit.
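
Just to put rough numbers on the pixel-load side of that argument (back-of-the-envelope, assuming 3x1080p means 5760x1080 and reusing the 29 fps mentioned earlier):

Code:
/* Back-of-the-envelope pixel-load comparison: 3x1080p surround vs. a single
 * 1080p screen. Triangle/tessellation work per frame is unchanged by adding
 * monitors, so the extra load lands on fill and shading, not the front end. */
#include <stdio.h>

int main(void)
{
    const long single_screen = 1920L * 1080L;   /* ~2.07 Mpix per frame */
    const long surround      = 5760L * 1080L;   /* ~6.22 Mpix per frame */
    const double fps = 29.0;                    /* figure quoted earlier */

    printf("pixel load ratio: %.1fx\n", (double)surround / single_screen);
    printf("pixels shaded per second at %.0f fps: %.0f M\n",
           fps, surround * fps / 1e6);
    return 0;
}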
 
This is curious btw
[Image: 3dm11 gtx 680.png]


Notice the default clocks?

edit:
GPU-Z obviously detects the "shader clocks" wrong, most likely just "new enough NVIDIA, so report GPU clock * 2".

What I did notice is that the GPU score shown here is 66% higher than my stock GTX 570's! Wow! :oops:

If the performance difference holds up in real games as well, I may have been too quick to call the asking price ridiculous.

Btw a Zotac 680 has been listed for 507 euros.





Judging by the release notes, it was released not long ago. I think the previous version of that benchmark was 2.5. I would like to see 3.0 vs. 2.5 on both cards, just to see whether performance has changed.

According to a fellow GTX 570 user, these are his results for 2.5 vs. 3.0 with the 570 at stock.

[Images: Capture3.jpg, Capture4.jpg]
 
Well, somebody once claimed that it is fundamentally impossible to have 4x distributed geometry processing without having massive latency and gargantuan power consumption.
Latency? Crossbar interconnects are the first choice for low-latency communication between a moderately large number of clients. The problem with complex crossbars is the accumulation of hotspots due to signal crossings. In GF100 the distributed nature of both geometry processing (16x) and primitive setup (4x) called for a very dense wiring mesh. JHH said that this aspect of the architecture was the main reason for the product delays and metal re-spins. The other obstacle was the large transistor leakage variance.
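
To illustrate why a wide crossbar gets wiring-dense so quickly, here is a trivial crosspoint count - purely illustrative, nothing GF100-specific beyond the 16x/4x figures from the post:

Code:
/* A full crossbar between N producers and M consumers needs on the order of
 * N*M crosspoints, so widening either side grows the wiring mesh quickly.
 * The 16 -> 4 case mirrors the 16x geometry / 4x setup split mentioned above. */
#include <stdio.h>

static long crosspoints(int producers, int consumers)
{
    return (long)producers * consumers;
}

int main(void)
{
    printf(" 2 ->  4: %ld crosspoints\n", crosspoints(2, 4));
    printf("16 ->  4: %ld crosspoints\n", crosspoints(16, 4));
    printf("16 -> 16: %ld crosspoints\n", crosspoints(16, 16));
    return 0;
}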
 