That isn't true.
The loss of the hotclock makes that just a 1:2 ratio of ALU throughput. And the L2 cache bandwidth and the speed of atomics actually went up with GK104 compared to Fermi (bandwidth roughly +70% over GF110, roughly +150% over GF114; atomics quite a bit more). I don't think the culprit can be found in the 33% smaller L2.

As Alexko, among others, already noted, there is cache missing compared to GF110; but in that regard GK104 isn't significantly different from GF1x4. GK104 does have the downside that it packs far more SPs into each cluster than GF104 (192 vs. 48), which matters once you start thinking about cache amounts per SP.
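A quick sanity check of that 1:2 claim. The unit counts and clocks below are the launch-card figures for the GTX 580 and GTX 680 (assumed here, not stated in the post): GK104 triples the ALU count, but losing the hotclock leaves it at only about twice GF110's peak ALU throughput, against the roughly +70% L2 bandwidth mentioned above.

```python
# Peak ALU throughput ratio, GK104 (GTX 680) vs GF110 (GTX 580).
# Clocks/unit counts are assumed launch-card figures, for illustration only.

gf110_alus, gf110_shader_mhz = 512, 1544   # Fermi hotclock: ALUs at 2x core clock
gk104_alus, gk104_core_mhz = 1536, 1006    # Kepler: ALUs at core clock

ratio = (gk104_alus * gk104_core_mhz) / (gf110_alus * gf110_shader_mhz)
print(f"GK104 / GF110 peak ALU throughput: {ratio:.2f}x")  # ~1.95x
```

So 3x the ALUs ends up as roughly 2x the throughput once the hotclock is gone, which is what the 1:2 figure refers to.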
Isn't the LDS size [per thread-block] limited by the API-mandated exposure anyway? I'm not sure about OCL, but DC5.0 requires a fixed size of 32KBytes. Does OCL set only a lower limit?

The last point is a significant regression for compute (they should have doubled the L1/shared memory to 128kB; GCN has 64kB of dedicated shared memory [+16kB L1] for just 64 ALUs, not 48kB max for 192 ALUs; GK110 will probably do it). Connected to that, the load/store capabilities are roughly halved per ALU (though overall still a plus considering the clock speed). And then there's the much more static scheduling reducing performance (but that shouldn't have that big an effect).
"Isn't the LDS size [per thread-block] limited by the API-mandated exposure anyway? I'm not sure about OCL, but DC5.0 requires a fixed size of 32KBytes. Does OCL set only a lower limit?"

OpenCL requires a minimum of 32KB of LDS; Nvidia exposes 48KB in OpenCL.
"Isn't the LDS size [per thread-block] limited by the API-mandated exposure anyway? I'm not sure about OCL, but DC5.0 requires a fixed size of 32KBytes. Does OCL set only a lower limit?"

Does it matter?
The big chip would be targeting compute and professional graphics segments.
$1000 would be way too low for a top-end Quadro.
I wonder if nv will try a little experiment: release it as a Quadro-only product and see if the high-end gamers buy it.
So, 690 wasn't BigK, it was dual 680. How utterly boring.
"I wonder if nv will try a little experiment: release it as a Quadro-only product and see if the high-end gamers buy it."

They would have to release GeForce drivers for it. The Quadro drivers aren't exactly performant (never mind the update schedule).
Heck, it wouldn't even surprise me if BigK were relegated to the ultra-enthusiast (~$1000) segment when it launches in the 7xx series. With the chip's focus on prosumer/professional/HPC markets, the consumer space would just be there for inventory bleed-off and/or salvage parts, with Nvidia using smaller dies tailored for consumer use for everything from the 780 on down. I certainly wouldn't be surprised if Nvidia abandoned the big-die strategy for the consumer space.
I'm not sure if this has been pointed out before in the long Kepler thread… but someone at the SemiAccurate forums noted that the Kepler GPUs for the Oak Ridge upgrade will have 6 GB of memory. That seems to indicate GK110 will have either a 384-bit bus, or a 512-bit bus cut down to 384-bit on those particular cards.
GK110 doesn't sound like it'll appear for desktop all that soon.