Lol I'm glad somebody said it. The cuckoo train is really rolling now. The fact that it doesn't make toast is probably cheating too.
The way I see it, NV now has two architectures (or two variations of the same one): one for gaming, the other for compute. They've been doing that since GF100/GF104. Many thought it was a temporary solution to address GF100's shortcomings, but it's clear now this was their plan from the start, and they intend to keep doing it until further notice.

You were the one advancing the whole "nvidia already paid the compute penalty, now AMD finally had to as well, making it fair" thing. What it appears is that Nvidia in fact rolled back some of that penalty, so their die size isn't so amazing, and so you should give AMD some credit where it's due, but I'm sure you won't...
It's smart enough, I'm sure. We only care about gaming performance really, and yes, AMD was smart to do it all those years imo.
No, I hadn't seen anything; it was a hunch based on the idea of a rejigged balance of ALUs and scheduling, simply in order to fit in all these ALUs. Jawed was right about the greater compiler dependency, though. He probably saw the white paper before starting that little diatribe. In any case, it's obvious nVidia's static scheduling needs some work. It's only dual issue, dammit, how hard can that be? AMD had to deal with 2.5x that.
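For fun, here's a toy Python sketch of what that static pairing problem looks like. Everything in it is my own illustration (the instruction format, the independence check, the greedy strategy), not anything from NVIDIA's actual compiler:

[code]
# Toy greedy static dual-issue scheduler, purely illustrative.
# Each instruction is (dest, src1, src2). We pair each instruction
# with the earliest later instruction that is safe to issue alongside it.

def independent(a, b):
    """True if b can issue with a: no RAW, WAW, or WAR hazard."""
    return (a[0] not in (b[1], b[2])       # RAW: b reads a's result
            and b[0] != a[0]               # WAW: both write same register
            and b[0] not in (a[1], a[2]))  # WAR: b overwrites a's source

def dual_issue(instrs):
    pending = list(instrs)
    bundles = []
    while pending:
        first = pending.pop(0)
        mate = None
        for i, cand in enumerate(pending):
            # The candidate must be independent of the first slot AND of
            # every instruction it gets hoisted over.
            if independent(first, cand) and all(
                    independent(skip, cand) for skip in pending[:i]):
                mate = i
                break
        second = pending.pop(mate) if mate is not None else None
        bundles.append((first, second))  # second=None wastes an issue slot
    return bundles

prog = [("r0", "r1", "r2"),  # r0 = r1 . r2
        ("r3", "r0", "r4"),  # reads r0, can't pair with the first
        ("r5", "r6", "r7")]  # independent, gets hoisted up alongside it
for pair in dual_issue(prog):
    print(pair)
[/code]

AMD's pre-GCN compilers had to fill five VLIW slots per bundle under the same sort of dependence constraints, which is where the 2.5x comes from.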
First, people made fun of 1080p testing as BS for a card of this caliber. Then they speculated that it would surely trail at 25x16. And now, with that out of the way too, they complain that a test was done at 25x16 (the resolution of choice, remember) when it should have been done at 1080p.
680 is a nice card. Good job NV.
Still slower clock for clock than 7970, plus broken CUDA stuff in half of the applications.
According to Anandtech, GK104 has 8 dedicated FP64 CUDA cores per SMX and a 1/24 DP rate:
http://www.anandtech.com/show/5699/nvidia-geforce-gtx-680-review/2
Who is right? (The quick arithmetic after the list below suggests the two descriptions don't even agree.)

Each scheduler has its own registers (4096 x 32-bit) and its own group of four texture units, and can initiate the execution of two operations per cycle. However, it has to share resources at this level with a second scheduler:

- SIMD0 unit, 32-wide (the "cores"): 32 FMA FP32 or 8 FMA FP64
- SIMD1 unit, 32-wide (the "cores"): 32 FMA FP32
- SIMD2 unit, 32-wide (the "cores"): 32 FMA FP32
- SFU unit, 16-wide: 16 special functions or 32 FP32 interpolations
- Load/Store unit, 16-wide, 32-bit
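A quick back-of-the-envelope check in Python, assuming the usual Kepler SMX layout of 192 FP32 cores and four warp schedulers sharing resources in two pairs (the FP64 counts are my reading of the two articles):

[code]
# Back-of-the-envelope DP-rate check for one GK104 SMX. Core counts are
# assumptions from the two write-ups; treat this as a sanity check.
from fractions import Fraction

FP32_CORES_PER_SMX = 192   # 6 x 32-wide FP32 SIMDs per SMX
SCHEDULER_PAIRS    = 2     # 4 warp schedulers, sharing resources in pairs

anand_fp64_per_smx = 8                    # Anandtech: 8 dedicated FP64 cores per SMX
hfr_fp64_per_smx   = 8 * SCHEDULER_PAIRS  # Hardware.fr: SIMD0 of each pair does 8 FMA FP64

print("Anandtech DP rate:  ", Fraction(anand_fp64_per_smx, FP32_CORES_PER_SMX))  # -> 1/24
print("Hardware.fr DP rate:", Fraction(hfr_fp64_per_smx, FP32_CORES_PER_SMX))    # -> 1/12
[/code]

So Anandtech's figure matches the 1/24 rate they quote, while the Hardware.fr description, if each scheduler pair really has its own FP64-capable SIMD0, would work out to 1/12. They can't both be right.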
The same thing a ~$229 GTX 460 did to a GTX 285.
what CUDA failures are you referring to specifically?
I'm befuddled as to what the hell they've done, though. I'm off to play tanks; I'll let you guys work it out...
The reason Nvidia have dropped the compute from GK104 is that GK110 is still on the way. I expect all of the compute will be back in that chip, and it will depress its overall performance, but it is clear that the 104 is almost purely game-focussed. Nvidia have completely outmanoeuvred AMD this gen by stripping the compute from GK104 and pitching it to gamers while leaving their big die for Tesla applications. AMD will have to make a similar move or be left behind next generation.
In fact, it seems that if AMD hadn't fubared the 7970's clocks (by Dave's admission), they might be right there.