NVIDIA Kepler speculation thread

With boost always on, it would seem that it's hard to compare apples to apples with the 7970 since its specs are locked in.

Either way, the BF3 performance boost is still not good enough to justify an upgrade and really that's the only big game that needs help.
 
Looking at the GK104 shader arrangement and compute capabilities, GK110/GK100 is definitely a reality; it is definitely coming. The question is WHEN?
 
Lol I'm glad somebody said it. The coo-coo train is really rolling now. The fact that it doesn't make toast is probably cheating too.

You were the one advancing the whole "Nvidia already paid the compute penalty, now AMD has to too, finally making it fair" thing. What it appears is that Nvidia in fact rolled back some of that penalty, so their die size is not so amazing, and so you should give AMD some credit where it's due, but I'm sure you won't...

It's smart enough I'm sure. We only care about gaming performance really, and yes AMD was smart to do it all those years imo.

Funny thing is, even after slash-and-burn pricing (let's say they went to $399), AMD will still be charging way more than Cayman for a smaller die, and Nvidia obviously even more so.

Canucks had some weird benchmarks; it is pretty much the only review where the 7970 lost almost every single game, including some it won in other reviews.

They also had a weird conclusion, "680 wins in every metric including launch day availability", when this morning I checked and there are no 680s on Newegg, so um, presumptuous much? The Canucks guy is Nvidia-biased, I've noticed it before, like when he recently talked about the Ti 448 as if it were the best card of all time, even though, imo, it was a non-starter at $300 due to only 1GB of RAM.

Anyway, even the Hardware Canucks chart shows the 680 only +12% over the 7970, so it doesn't negate what I said. It does show the 7970 still sucking with AA...
 
You were the one advancing the whole "Nvidia already paid the compute penalty, now AMD has to too, finally making it fair" thing. What it appears is that Nvidia in fact rolled back some of that penalty, so their die size is not so amazing, and so you should give AMD some credit where it's due, but I'm sure you won't...

It's smart enough I'm sure. We only care about gaming performance really, and yes AMD was smart to do it all those years imo.
The way I see it, NV now has two architectures (or two variations of the same one), one for gaming and the other for compute. They started doing that with GF100/GF104, but many thought it was a temporary solution to address GF100's shortcomings. It is clear now that this was their plan from the start, and they intend to keep doing it until further notice.
 
Sorry Rangers, I can't keep up with all the gremlins you throw at me. I've always acknowledged Tahiti's compute burden - you said so yourself...

Do you not realize that what's most impressive about Kepler isn't how it fares against Tahiti but rather GF110/4 in perf/watt/mm?
 
The general takeaway for me is that, for regular consumers/gamers, Nvidia has an excellent gaming card that's cheaper, consumes less power, and takes the hassle out of overclocking by doing it automatically and dynamically, to the future chagrin of reviewers and theorists. ;)

Performance may be slightly above the 7970, but not to an extent that it matters in games. Other than absolute price ($499 still rather high), I don't think there's anything to really complain about, is there? Unless you run LuxMark-like loads on a regular basis?
 
Jawed was right about the greater compiler dependency though. He probably saw the white paper before starting that little diatribe :LOL: In any case it's obvious nVidia's static scheduling needs some work. It's only dual issue, dammit, how hard can that be? AMD had to deal with 2.5x that.
No I hadn't seen anything, it was a hunch based on the idea of a rejigged balance of ALUs and scheduling, simply in order to fit in all these ALUs.

I'm befuddled as to what the hell they've done, though. I'm off to play tanks, I'll let you guys work it out...
 
First people made fun of testing at 1080p, calling it BS for a card of this caliber. Then they speculated that, for sure, it would trail at 25x16. And now, with that out of the way too, they complain that a test was done at 25x16 (the resolution of choice, remember) when it should have been done at 1080p.


It's just that they're different people ;).



Anyway, great job done by nV with Kepler.

Now time for me to wait 3-6 months to see which turns out to be the fastest *passive* video card of this generation :D
 
I think the scheduler scans statically compiled latency numbers prior to picking a warp instruction for decode, possibly a counter indicating how close the next dependent instruction is in the warp's stream.
The instruction's latency field would then be used to update the table used for scheduling.

This covers statically determined ALU instruction latency, though other operations with variable latency may have kept their old schemes.
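Here's a toy sketch (in Python) of roughly what such a scheme could look like, just to make the idea concrete. Everything in it, the per-instruction latency field, the per-warp ready counter, the single issue slot, is my own illustration of the speculation above, not Kepler's actual ISA or scheduler.

from dataclasses import dataclass
from typing import List

@dataclass
class Instr:
    op: str
    latency: int  # compiler-encoded cycles before a dependent instruction may issue

@dataclass
class Warp:
    wid: int
    program: List[Instr]
    pc: int = 0
    ready_at: int = 0  # cycle at which this warp's next instruction may issue

def schedule(warps: List[Warp], cycles: int) -> None:
    """Each cycle, pick the first warp whose compiler-declared stall has expired."""
    for cycle in range(cycles):
        for w in warps:
            if w.pc < len(w.program) and cycle >= w.ready_at:
                instr = w.program[w.pc]
                print(f"cycle {cycle}: warp {w.wid} issues {instr.op}")
                # Use the instruction's latency field to update the table the
                # scheduler consults on later cycles (no hardware scoreboard).
                w.ready_at = cycle + instr.latency
                w.pc += 1
                break  # one issue slot per cycle in this toy model

if __name__ == "__main__":
    prog = [Instr("fma", 9), Instr("fma", 9), Instr("mov", 4)]
    schedule([Warp(0, list(prog)), Warp(1, list(prog))], cycles=24)

With two warps the toy just alternates between them as each one's declared stall expires, which is the whole point of pushing the dependency bookkeeping onto the compiler.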
 
According to Anandtech, GK104 has 8 dedicated FP64 CUDA cores per SMX and a 1/24 DP rate:
http://www.anandtech.com/show/5699/nvidia-geforce-gtx-680-review/2

But:
http://www.hardware.fr/articles/857-2/gk104-fermi-regime.html
http://translate.googleusercontent....e.html&usg=ALkJrhju_P-6dcWZ-XV4Dgf0gCSJVnumHQ
Each scheduler has its own registers (4096 x 32 bits) and its own group of four texture units, and can initiate the execution of two operations per cycle. However, it has to share the following resources with a second scheduler:

- SIMD0 unit, 32-way (the "cores"): 32 FP32 FMAs or 8 FP64 FMAs
- SIMD1 unit, 32-way (the "cores"): 32 FP32 FMAs
- SIMD2 unit, 32-way (the "cores"): 32 FP32 FMAs
- SFU unit, 16-way: 16 special functions or 32 FP32 interpolations
- Load/Store unit, 16-way, 32-bit
Who is right?
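For what it's worth, a quick back-of-the-envelope check of what each description would imply for the DP:SP ratio. The SMX layout assumed below (192 FP32 lanes, 4 schedulers forming 2 shared groups) is my own reading of the published specs and of the translation, so treat it as a sketch of where the discrepancy lies rather than an answer:

# Rough arithmetic on the implied GK104 DP rate under each description.
# Assumption (mine): one SMX = 192 FP32 FMA lanes, 4 schedulers in 2 shared SIMD groups.
fp32_lanes_per_smx = 192

# Anandtech: 8 dedicated FP64 units per SMX.
dp_anand = 8
print(f"Anandtech: {dp_anand}/{fp32_lanes_per_smx} = 1/{fp32_lanes_per_smx // dp_anand}")  # 1/24

# hardware.fr, as I read it: SIMD0 of each shared group can do 8 FP64 FMAs,
# and there are 2 such groups per SMX -> 16 FP64 FMAs per SMX.
dp_hwfr = 8 * 2
print(f"hardware.fr: {dp_hwfr}/{fp32_lanes_per_smx} = 1/{fp32_lanes_per_smx // dp_hwfr}")  # 1/12

So Anandtech's two numbers (8 dedicated FP64 cores, 1/24 rate) are at least self-consistent, while my reading of the hardware.fr layout would come out to 1/12; that, or the 8 FP64 lanes in SIMD0 are the same 8 dedicated units counted once per SMX.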
 
The reason Nvidia have dropped compute from GK104 is that GK110 is still on the way. I expect all of the compute will be back in, and it will depress overall performance in the chip, but it is clear that the 104 is almost purely game-focussed. Nvidia have completely outmanoeuvred AMD this gen by removing the compute from GK104 and pitching it to gamers, while leaving their big die for Tesla applications. AMD will have to make a similar move or be left behind next generation.
 
I'm befuddled as to what the hell they've done, though. I'm off to play tanks, I'll let you guys work it out...

Yeah, I agree. Top on my list -- 8 separate DP units?
Uhh.. Uhh.. Really? I hope that isn't true, because if it is, it suggests it's area/power cheaper to put in 8 single-cycle DP FPUs rather than the (smallish) wiring for (say) half/quarter-rate support plus scheduling. That's surprising to me, anyway.
 
The reason Nvidia have dropped compute from GK104 is that GK110 is still on the way. I expect all of the compute will be back in, and it will depress overall performance in the chip, but it is clear that the 104 is almost purely game-focussed. Nvidia have completely outmanoeuvred AMD this gen by removing the compute from GK104 and pitching it to gamers, while leaving their big die for Tesla applications. AMD will have to make a similar move or be left behind next generation.

It's not like the 680 has abandoned compute. It still does very well in a number of cases. It doesn't look like Nvidia has placed much priority on the areas where it doesn't, however.

I'm not sure how much AMD plans to follow. Its path has already been outlined, and compute/integration enhancements are in store.
The question is whether AMD intends to push ahead on facets of its graphics domain that it has modestly improved or tweaked: areas the 680 has exploited.
 