NVIDIA Kepler speculation thread

Was there a VLIW rumor at some point?
They seem to be proposing LIW (2 DP FMAs, 1 LD/ST per clock) for Echelon/Einstein, which is IIRC targeted at something like the 10nm process timeframe, ~2017. (See the Maxwell thread, lots of interesting links in there). I suppose one could claim this kind of detail shouldn't come up unless implementation of at least some of those ideas was imminent. But I get the impression that holding one's breath on that one might be suicidal :).
 
GPUs have been multi-threaded for years.

Maybe there are some native Japanese speakers here who can translate it...

The writer is saying that it reminds him of Intel Hyperthreading, and how the utilisation of the CUDA cores has gone up, and the power efficiency has gone up.
- but I think it's unlikely that nV marketing is going to spill the beans before any official announcement next week.... so it's just general marketing flim-flam IMHO....

However, 1536 SPs with a hot-clock, even if it's not very hot, sounds like a real winner, but it may also be too good to be true....
:?:
 
Maybe too good to be true? :D
1536 SPs would mean they've gotten their shader units down to about 1/3rd the size, in addition to what 40nm > 28nm gives, and that while maintaining the hot clocks too? suuuure
 
Kyle_Bennett said:
GK104 will be 680.
Seeing some benchmarks show 7970 in the lead, mostly in tessellated situations there, other "normal" synthetic 3D benchmarks and gaming benchmarks show 680 faster. Driver advancements still happening as well for 680.

[H]...
 
[H] is benching their 680 and commented that it is faster than the 7970. The benches from last month were real.

edit: Man from atlantis beat me to it.
 
Yes, but do they work the same as Intel Hyperthreading? Honest question, I have no idea. It doesn't make any sense at all to talk about a Hyperthreaded CUDA core?

No it doesn't. Hyper threading works by having multiple active threads issue instructions to the same wide core with multiple execution units. Since each CUDA "core" has only one execution unit there's no chance of that happening.

Hyperthreading is used to maximize resource utilization of wide superscalar architectures. GPUs in general aren't superscalar and maximize utilization primarily through high thread parallelism.
 
No it doesn't. Hyper threading works by having multiple active threads issue instructions to the same wide core with multiple execution units. Since each CUDA "core" has only one execution unit there's no chance of that happening.

Hyperthreading is used to maximize resource utilization of wide superscalar architectures. GPUs in general aren't superscalar and maximize utilization primarily through high thread parallelism.

I had to register to correct you. GF114 is superscalar. GF114 was Fermi's mid range GPU, much like GK104 is to be Kepler's mid range, so there is reason to believe that GK104 is superscalar like GF114 and can potentially benefit from some sort of hyperthreading-like execution.
 
I had to register to correct you. GF114 is superscalar. GF114 was Fermi's mid range GPU, much like GK104 is to be Kepler's mid range, so there is reason to believe that GK104 is superscalar like GF114 and can potentially benefit from some sort of hyperthreading-like execution.

Correct was a wrong choice of words. Amendment is more accurate.
 
Fermi actually does issue instruction(s) from 2 threads (warps) simultaneously to one core (aka SM). So it does simultaneous multithreading (what Intel calls hyperthreading). AMD's VLIW architectures do fine-grained temporal multithreading of two threads (wavefronts switch every 4 cycles, i.e. after each instruction).
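
For anyone who wants to see the difference laid out, here is a toy Python sketch of the two schemes described above. It is only an illustration under simplifying assumptions: the function names and the timeline format are made up for this post, the 4-cycle figure is the one quoted above for AMD's VLIW parts, and the SMT side ignores dependencies, dual-issue and real warp timing.

```python
# Toy model only: each timeline entry is the tuple of threads that issue
# an instruction in that cycle. Names and format are hypothetical.

def temporal_interleave(n_instr, cycles_per_instr=4, threads=("wave0", "wave1")):
    """Fine-grained temporal multithreading.

    One thread owns the execution unit at a time, and ownership rotates
    to the next thread after every instruction (every 4 cycles here).
    """
    timeline = []
    for _ in range(n_instr):
        for t in threads:
            timeline.extend([(t,)] * cycles_per_instr)
    return timeline


def simultaneous_issue(n_instr, threads=("warp0", "warp1")):
    """SMT-style issue.

    Every cycle, each scheduler issues from its own thread, so two
    threads are active in the same cycle.
    """
    return [tuple(threads) for _ in range(n_instr)]


def max_threads_per_cycle(timeline):
    """Largest number of distinct threads issuing in any single cycle."""
    return max(len(cycle) for cycle in timeline)
```

In this model `max_threads_per_cycle` is 1 for the temporal scheme and 2 for the simultaneous one, which is the whole distinction being argued about.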
 
GPUs in general aren't superscalar and maximize utilization primarily through high thread parallelism.
GF104/GF114 (and all non-GF1x0 GPUs really) are specifically marketed as being 'superscalar' though. Then again what 'superscalar' actually means in reality when we're talking about GPU manufacturers who all have a habit of not respecting standard CPU terminology is very much up in the air...
 
I had to register to correct you. GF114 is superscalar.
No, it's not. Each SM has two schedulers both capable of dual-issue. It is a multiple issue SIMD (vector) architecture. There is nothing scalar about the current crop of nV GPUs.

The "super" part of superscalar designates the multiple issue character. But only a scalar architecture can be superscalar, a vector architecture cannot. If you want to invent that term, you could call it supervector or something, but multiple issue SIMD actually nails it.
 
GF104/GF114 (and all non-GF1x0 GPUs really) are specifically marketed as being 'superscalar' though. Then again what 'superscalar' actually means in reality when we're talking about GPU manufacturers who all have a habit of not respecting standard CPU terminology is very much up in the air...
Yes, exactly.
 
[H] is benching their 680 and commented that it is faster than the 7970. The benches from last month were real.

edit: Man from atlantis beat me to it.

Where did Kyle say he is benching the 680? I highly doubt he would be spilling the beans about his own testing, since he should be under NDA (which looks like it might happen starting next Monday).
 
If Kyle is benching then release is imminent. Hope it's not paper and that 680 can do surround on 1 card.
 
Where did Kyle say he is benching the 680? I highly doubt he would be spilling the beans about his own testing, since he should be under NDA (which looks like it might happen starting next Monday).

there is a link on the last page...
 
Where did Kyle say he is benching the 680? I highly doubt he would be spilling the beans about his own testing, since he should be under NDA (which looks like it might happen starting next Monday).

He didn't, and he's not. Both Kyle and Fuad are basically saying the same thing about performance however (I'm guessing from the same Chinese source) so consider it DOUBLE CONFIRMED. :p
 
Note: I'm not contesting the validity of what he is saying. Just the origin of the "data".

Who knows, but if Charlie, Fuad and Kyle are all saying very similar things then it's probably not very far from the truth. Either that or Nvidia has pulled off the mother of all smokescreens.

I can only believe so much that I want to believe. The logic is telling me that Nvidia is going to win this round by a large margin unfortunately. It might not be R300 in absolute performance terms but in terms of swing from AMD to Nvidia it might even surpass that.

If you're an AMD fanboy (like me) it might be time to start baking that humble pie. :p
 
Assuming for a second that these rumors of 1536 SP ALUs + hot clock and ~320mm2 die size are true, consider that the 7970 has 33% more ALUs and 50% wider memory bus in a die size only slightly larger than what's rumored for NV, and overclocks to > 1.1GHz on air.

I don't see how AMD can fail to win in the bandwidth limited case (e.g. at least some compute tasks). I guess NV could still win in other areas through better utilization, but I dunno - I get the feeling it's going to be a tight race.
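
The percentages above are easy to check with a few lines of Python. This is a rough sketch, not a performance prediction: the GK104 numbers are only the rumours from this thread, the 256-bit bus is what the "50% wider" comparison implies, and using the same 5.5 Gbps memory data rate for both cards is purely an assumption (it is the 7970's shipping speed; nothing in the thread gives GK104's).

```python
# Rumoured GK104 figures vs. shipping HD 7970 figures.
gk104 = {"alus": 1536, "bus_bits": 256}   # rumour; bus width implied by "50% wider"
hd7970 = {"alus": 2048, "bus_bits": 384}


def pct_more(a, b):
    """How many percent larger a is than b, rounded to a whole percent."""
    return round((a / b - 1) * 100)


def bandwidth_gb_s(bus_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_bits / 8 * data_rate_gbps


alu_advantage = pct_more(hd7970["alus"], gk104["alus"])
bus_advantage = pct_more(hd7970["bus_bits"], gk104["bus_bits"])

# ASSUMPTION: same 5.5 Gbps effective data rate on both cards.
ASSUMED_DATA_RATE_GBPS = 5.5
bw_7970 = bandwidth_gb_s(hd7970["bus_bits"], ASSUMED_DATA_RATE_GBPS)
bw_gk104 = bandwidth_gb_s(gk104["bus_bits"], ASSUMED_DATA_RATE_GBPS)

print(f"7970 ALU advantage: {alu_advantage}%")
print(f"7970 bus width advantage: {bus_advantage}%")
print(f"Peak bandwidth at equal data rate: {bw_7970} vs {bw_gk104} GB/s")
```

If NV ships faster memory than the 7970, the bandwidth gap shrinks accordingly, so the data rate is the number to watch.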
 