NVIDIA Kepler speculation thread

With boost always on, it would seem that it's hard to compare apples to apples with the 7970 since its specs are locked in.

Either way, the BF3 performance boost is still not good enough to justify an upgrade and really that's the only big game that needs help.
 
Looking at the GK104 shader arrangement and compute capabilities, GK110/GK100 is definitely a reality; it is definitely coming. The question is WHEN?
 
Lol I'm glad somebody said it. The coo-coo train is really rolling now. The fact that it doesn't make toast is probably cheating too.

You were the one advancing the whole "Nvidia already paid the compute penalty, now AMD has to too, finally making it fair" thing. What it appears is that Nvidia in fact rolled back some of that penalty, so their die size is not so amazing, and so you should give AMD some credit where it's due, but I'm sure you won't...

It's smart enough I'm sure. We only care about gaming performance really, and yes AMD was smart to do it all those years imo.

Funny thing is, even after slash-and-burn pricing (let's say they went to $399), AMD will still be charging way more than Cayman for a smaller die, and Nvidia obviously even more so.

Canucks had some weird benchmarks; it is pretty much the only review where the 7970 lost almost every single game, including some it won in other reviews.

They also had a weird conclusion, "680 wins in every metric including launch day availability", when this morning I checked and there are no 680s on Newegg, so um, presumptuous much? The Canucks guy is Nvidia-biased, I've noticed it before, like when he recently talked about the Ti 448 as if it were the best card of all time, even though, imo, it was a non-starter at $300 due to only 1GB of RAM.

Anyway, even the Hardware Canucks chart shows the 680 only +12% over the 7970, so it doesn't negate what I said. It does show the 7970 still sucking with AA...
 
You were the one advancing the whole "Nvidia already paid the compute penalty, now AMD has to too, finally making it fair" thing. What it appears is that Nvidia in fact rolled back some of that penalty, so their die size is not so amazing, and so you should give AMD some credit where it's due, but I'm sure you won't...

It's smart enough I'm sure. We only care about gaming performance really, and yes AMD was smart to do it all those years imo.
The way I see it, NV now has two architectures (or two variations of the same one), one for gaming and the other for compute. They started doing that with GF100/GF104, but many thought it was a temporary solution to address GF100's shortcomings. It is clear now that this was their plan from the start, and they intend to keep doing it until further notice.
 
Sorry Rangers, I can't keep up with all the gremlins you throw at me. I've always acknowledged Tahiti's compute burden - you said so yourself...

Do you not realize that what's most impressive about Kepler isn't how it fares against Tahiti but rather GF110/4 in perf/watt/mm?
 
The general takeaway for me is that, for regular consumers/gamers, Nvidia has an excellent gaming card that's cheaper, consumes less power, and takes the hassle out of overclocking by doing it automatically and dynamically, to the future chagrin of reviewers and theorists. ;)

Performance may be slightly above the 7970, but not to an extent that it matters in games. Other than absolute price ($499 still rather high), I don't think there's anything to really complain about, is there? Unless you run LuxMark-like loads on a regular basis?
 
Jawed was right about the greater compiler dependency though. He probably saw the white paper before starting that little diatribe :LOL: In any case it's obvious nVidia's static scheduling needs some work. It's only dual issue, dammit, how hard can that be? AMD had to deal with 2.5x that.
No I hadn't seen anything, it was a hunch based on the idea of a rejigged balance of ALUs and scheduling, simply in order to fit in all these ALUs.

I'm befuddled as to what the hell they've done, though. I'm off to play tanks, I'll let you guys work it out...
 
First people made fun of testing at 1080p, calling it BS for a card of this caliber. Then they speculated that, for sure, it would trail at 25x16. And now, with that out of the way too, they complain that a test was done at 25x16 (the resolution of choice, remember) when it should have been done at 1080p.


It's just that they're different people ;).



Anyway, great job done by nV with Kepler.

Now time for me to wait 3-6 months to see which turns out to be the fastest *passive* video card of this generation :D
 
I think the scheduler scans statically compiled latency numbers prior to picking a warp instruction for decode, possibly a counter indicating how close the next dependent instruction is in the warp's stream.
The instruction's latency field would then be used to update the table used for scheduling.

This covers statically determined ALU instruction latency, though other operations with variable latency may have kept their old schemes.
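Here's a toy sketch (in Python) of roughly what such a scheme could look like, just to make the idea concrete. Everything in it, the per-instruction latency field, the per-warp ready counter, the single issue slot, is my own illustration of the speculation above, not Kepler's actual ISA or scheduler.

from dataclasses import dataclass
from typing import List

@dataclass
class Instr:
    op: str
    latency: int  # compiler-encoded cycles before a dependent instruction may issue

@dataclass
class Warp:
    wid: int
    program: List[Instr]
    pc: int = 0
    ready_at: int = 0  # cycle at which this warp's next instruction may issue

def schedule(warps: List[Warp], cycles: int) -> None:
    """Each cycle, pick the first warp whose compiler-declared stall has expired."""
    for cycle in range(cycles):
        for w in warps:
            if w.pc < len(w.program) and cycle >= w.ready_at:
                instr = w.program[w.pc]
                print(f"cycle {cycle}: warp {w.wid} issues {instr.op}")
                # Use the instruction's latency field to update the table the
                # scheduler consults on later cycles (no hardware scoreboard).
                w.ready_at = cycle + instr.latency
                w.pc += 1
                break  # one issue slot per cycle in this toy model

if __name__ == "__main__":
    prog = [Instr("fma", 9), Instr("fma", 9), Instr("mov", 4)]
    schedule([Warp(0, list(prog)), Warp(1, list(prog))], cycles=24)

With two warps the toy just alternates between them as each one's declared stall expires, which is the whole point of pushing the dependency bookkeeping onto the compiler.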
 
According to Anandtech, GK104 has 8 dedicated FP64 CUDA cores per SMX and a 1/24 DP rate:
http://www.anandtech.com/show/5699/nvidia-geforce-gtx-680-review/2

But:
http://www.hardware.fr/articles/857-2/gk104-fermi-regime.html
http://translate.googleusercontent....e.html&usg=ALkJrhju_P-6dcWZ-XV4Dgf0gCSJVnumHQ
Each scheduler has its own registers (4096 x 32 bits) and its own group of four texture units, and can initiate the execution of two operations per cycle. However, it has to share the following resources with a second scheduler:

- SIMD0 unit, 32-way (the "cores"): 32 FP32 FMAs or 8 FP64 FMAs
- SIMD1 unit, 32-way (the "cores"): 32 FP32 FMAs
- SIMD2 unit, 32-way (the "cores"): 32 FP32 FMAs
- SFU unit, 16-way: 16 special functions or 32 FP32 interpolations
- Load/Store unit, 16-way, 32-bit
Who is right?
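For what it's worth, a quick back-of-the-envelope check of what each description would imply for the DP:SP ratio. The SMX layout assumed below (192 FP32 lanes, 4 schedulers forming 2 shared groups) is my own reading of the published specs and of the translation, so treat it as a sketch of where the discrepancy lies rather than an answer:

# Rough arithmetic on the implied GK104 DP rate under each description.
# Assumption (mine): one SMX = 192 FP32 FMA lanes, 4 schedulers in 2 shared SIMD groups.
fp32_lanes_per_smx = 192

# Anandtech: 8 dedicated FP64 units per SMX.
dp_anand = 8
print(f"Anandtech: {dp_anand}/{fp32_lanes_per_smx} = 1/{fp32_lanes_per_smx // dp_anand}")  # 1/24

# hardware.fr, as I read it: SIMD0 of each shared group can do 8 FP64 FMAs,
# and there are 2 such groups per SMX -> 16 FP64 FMAs per SMX.
dp_hwfr = 8 * 2
print(f"hardware.fr: {dp_hwfr}/{fp32_lanes_per_smx} = 1/{fp32_lanes_per_smx // dp_hwfr}")  # 1/12

So Anandtech's two numbers (8 dedicated FP64 cores, 1/24 rate) are at least self-consistent, while my reading of the hardware.fr layout would come out to 1/12; that, or the 8 FP64 lanes in SIMD0 are the same 8 dedicated units counted once per SMX.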
 
The reason Nvidia have dropped compute from GK104 is that GK110 is still on the way. I expect all of the compute will be back in, and it will depress overall performance in the chip, but it is clear that the 104 is almost purely game-focussed. Nvidia have completely outmanoeuvred AMD this gen by removing the compute from GK104 and pitching it to gamers, while leaving their big die for Tesla applications. AMD will have to make a similar move or be left behind next generation.
 
I'm befuddled as to what the hell they've done, though. I'm off to play tanks, I'll let you guys work it out...

Yeah, I agree. Top on my list -- 8 separate DP units?
Uhh.. Uhh.. Really? I hope that isn't true, because if it is, it suggests it's area/power cheaper to put in 8 single-cycle DP FPUs rather than the (smallish) wiring for (say) half/quarter-rate support plus scheduling. That's surprising to me, anyway.
 
The reason Nvidia have dropped compute from GK104 is that GK110 is still on the way. I expect all of the compute will be back in, and it will depress overall performance in the chip, but it is clear that the 104 is almost purely game-focussed. Nvidia have completely outmanoeuvred AMD this gen by removing the compute from GK104 and pitching it to gamers, while leaving their big die for Tesla applications. AMD will have to make a similar move or be left behind next generation.

It's not like the 680 has abandoned compute. It still does very well in a number of cases. It doesn't look like Nvidia has placed much priority on the areas where it doesn't, however.

I'm not sure how much AMD plans to follow. Its path has already been outlined, and compute/integration enhancements are in store.
The question is whether AMD intends to push ahead on facets of its graphics domain that it has modestly improved or tweaked: areas the 680 has exploited.
 