"5th, we don't really have a real GCN driver yet, so let's hope that comes soon and we can see what we can really expect. It could still be at the current level, or we could get a nice boost."
What do you mean?
"The hot clock should give better perf/mm^2, not worse. Nvidia's warp size being smaller than AMD's wavefront likely contributes something, though."
AMD had better perf/mm^2 in the past, mainly because:
1) They didn't have all the GPGPU gunk that nVidia chose to put in
2) They didn't have a hot clock
With this generation
1) AMD has now gone down the GPGPU route, much like Fermi
2) nVidia has allegedly dropped the hot-clock
So, I would expect things to be much closer now
- though AMD would still have an advantage if they are on 28HPL, and NV is on 28HP...
Edit: Actually, that's a perf/w issue, not a perf/mm^2 issue, AFAIK
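For context on the "AMD had better perf/mm^2 in the past" point, a quick back-of-the-envelope comparison of the previous flagships. The specs below are approximate public figures, and peak flops per area is only a crude proxy for perf/mm^2:
```python
# Rough peak-flops-per-area comparison of the previous-generation flagships.
# Specs are approximate public figures; peak flops is a crude proxy for "perf".
chips = {
    # name: (ALUs, shader clock in GHz, die area in mm^2)
    "GTX 580 (GF110, hot clock)": (512, 1.544, 520),
    "HD 6970 (Cayman, VLIW4)":    (1536, 0.880, 389),
}

for name, (alus, clock_ghz, area_mm2) in chips.items():
    gflops = alus * clock_ghz * 2  # one FMA = 2 flops per ALU per clock
    print(f"{name}: ~{gflops:.0f} GFLOPS peak, ~{gflops / area_mm2:.1f} GFLOPS/mm^2")
```
That raw-density gap is what the "lean & mean" argument points at; achieved game performance per mm^2 was of course much closer than the peak-flops gap suggests.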
"Nvidia's warp size being smaller than AMD's wavefront likely contributes something, though."
I don't think this is that obvious. It will certainly decrease peak flops/area, but it should lead to higher utilization (depending on the workload, of course; dynamic branching etc. should be faster). I haven't seen any numbers for a conclusion either way, though.
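To make the utilization side of that concrete, here is a toy divergence model. The 32 and 64 are the real warp/wavefront widths; the branch pattern and thread count are invented purely for illustration:
```python
# Toy SIMT divergence model: each thread takes a branch based on its id.
# A warp/wavefront that diverges has to issue both paths, so wider groups
# tend to waste more lane-cycles. Everything here is illustrative only.

def lane_utilization(group_width, n_threads=1 << 16):
    # Hypothetical branch pattern: 48 of every 64 consecutive threads take it.
    taken = [(tid % 64) < 48 for tid in range(n_threads)]
    useful = issued = 0
    for i in range(0, n_threads, group_width):
        group = taken[i:i + group_width]
        paths = any(group) + (not all(group))  # 1 if coherent, 2 if divergent
        issued += paths * group_width          # lane-slots the SIMD spends
        useful += len(group)                   # lane-slots doing real work
    return useful / issued

for width in (32, 64):  # Nvidia warp vs AMD wavefront
    print(f"group width {width}: ~{lane_utilization(width):.0%} lane utilization")
```
With a fully coherent or fully random branch both widths come out the same; the narrower warp only wins when divergence happens at a granularity between the two widths, which is exactly the "depends on the workload" caveat.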
2nd, if you look in the GCN thread, lots of people were borderline orgasmic about GCN's ALU architecture, specifically around scheduling and how it is far simpler than Fermi but almost as functional.
This looks like a DDR3-equipped version, so probably for GK107-200. It wouldn't be surprising then if power draw were below HD 7750 level. Maybe the other versions need more PWM circuitry?
whitetiger said: "My hunch is that the hot clock cost them more than it's worth - that's why
1) they're dropping it this time round (assuming that's a correct rumor)
2) no one else does it - i.e. if it was such a good idea to run most of the chip at this obviously stressful frequency, then AMD would have done it by now."
The extra transistors due to the increased clock rate are more than made up for by halving the ALUs. The negative of the hot-clock concept is power, not area - though there could be other negatives, like complexity. There are a lot of bogus rumors floating about, but I have a feeling the removal of the hot clock is correct.
As discussed several pages back, the hot clock costs a lot of energy and transistors for all the additional pipelining necessary to get the clock speeds up.
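A back-of-the-envelope sketch of that trade-off: peak throughput is just ALUs x clock x 2, so halving the ALUs at double the clock keeps peak flops constant. The 1.5x area and 2.0x energy overheads per hot-clocked ALU below are invented placeholders, not measured values:
```python
# Back-of-the-envelope hot-clock trade-off: half the ALUs at twice the clock
# give the same peak flops. The overhead factors are invented placeholders
# chosen only to show the shape of the argument, not measured values.
BASE_CLOCK_GHZ = 0.75
AREA_OVERHEAD_HOT = 1.5    # assumed: extra pipeline registers etc. per hot-clocked ALU
ENERGY_OVERHEAD_HOT = 2.0  # assumed: energy per flop at the doubled clock

def design(n_alus, clock_ghz, area_factor, energy_factor):
    peak_gflops = n_alus * clock_ghz * 2  # one FMA = 2 flops
    area = n_alus * area_factor           # arbitrary area units
    return peak_gflops, peak_gflops / area, 1.0 / energy_factor

cold = design(1024, BASE_CLOCK_GHZ, 1.0, 1.0)
hot = design(512, BASE_CLOCK_GHZ * 2, AREA_OVERHEAD_HOT, ENERGY_OVERHEAD_HOT)
print("cold clock: %4.0f GFLOPS, %.2f GFLOPS/area, %.2f relative flops/J" % cold)
print("hot clock:  %4.0f GFLOPS, %.2f GFLOPS/area, %.2f relative flops/J" % hot)
```
Even with the assumed 50% area penalty per ALU, the hot design still comes out denser (half the ALUs, each 1.5x bigger) while paying more energy per flop - which is the "power, not area" point above; the real argument is over what the overhead factors actually are.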
whitetiger said: "2) no one else does it"
That's out of a representative equivalent sample of 1, right?
psurge said: "Is it reasonable to guess that the hot clock is a power problem?"
Depends on what you call 'a problem'. If your performance is entirely power constrained (as I believe was and is the case for the 550 mm^2 beasts), every little bit of improvement is going to help.
"I mean, it's not like these things are clocked at the 3GHz+ speed-racer CPU level. And as to the vector length and scoreboarding stuff... I'd love to see some numbers on energy consumption for the decode + issue part of a scoreboarded in-order pipeline relative to the energy burned in a single FMA."
Why compare it to a single FMA? (At least for the decode part.)
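Nobody here has those numbers, but the shape of the comparison is easy to sketch: front-end energy (decode/scoreboard/issue) is paid once per instruction, FMA energy once per lane, so vector width does the amortizing. The picojoule values below are pure placeholders, not measurements of any real pipeline:
```python
# Front-end (decode/scoreboard/issue) energy is paid per instruction; FMA energy
# is paid per lane, so SIMD width amortizes the front end. The pJ values are
# placeholders for illustration only - NOT measurements.
E_FMA_PJ = 10.0       # hypothetical energy of one FP32 FMA
E_FRONTEND_PJ = 40.0  # hypothetical decode + scoreboard + issue energy per instruction

for simd_width in (1, 16, 32, 64):
    per_lane_op = E_FMA_PJ + E_FRONTEND_PJ / simd_width
    overhead = (E_FRONTEND_PJ / simd_width) / E_FMA_PJ
    print(f"width {simd_width:2d}: {per_lane_op:5.1f} pJ per lane-op "
          f"({overhead:.0%} front-end overhead relative to the FMA)")
```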
"That's out of a representative equivalent sample of 1, right?"
You could also say that, right now, 50% of the contemporary, known add-on GPU architectures are using hot clocks. On average, of course...
"Here's my guess - the GK104 used TSMC's 28HP process
- however, they canned the GK100 as it used too much power
- and the GK114 & GK110 will move to the 28HPL process, just like AMD are using for Tahiti..."
How could they cancel a non-existing product?
Adding GPGPU functionality costs transistors, compared to AMD's previous lean & mean design
- it's a burden that nVidia have been carrying for a few generations, but now that AMD have designed a fairly comparable architecture, they also have to carry that burden.
That again is an assumption - how about quantifying what that burden is? It's easy to say "lean and mean", but it doesn't really mean anything.
Let's look at AMD's ALU scaling on VLIW4/5 designs. Now let's look at ALU utilization across current and near-term expected workloads (the "life" of the GPU, 2-3 years) and compare it to VLIW5. It's easy to say each ALU and its supporting infrastructure costs more and therefore it's "burdened with GPGPU like Nvidia", but that assumes (see the rough sketch below):
linear or near-linear scaling of ALUs
that DX11/compute shader workloads won't increase and thus decrease VLIW5/4 shader utilization
that NV and AMD made the same trade-offs
Now, I'm the furthest thing from an expert in this area, but I don't see anything in your argument that really convinces me.
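For what it's worth, here is a rough way to frame that utilization question. The slot-fill averages and relative per-ALU area factors are assumptions for illustration only (the ~3.4-of-5 figure is an often-quoted ballpark for VLIW5 on game shaders), not measured data:
```python
# Rough framing of the VLIW-vs-scalar utilization argument. The slot-fill
# averages and relative per-ALU area costs are assumptions for illustration,
# not measured data.
designs = {
    # name: (issue slots, assumed avg slots filled, assumed relative area per ALU)
    "VLIW5":  (5, 3.4, 1.00),
    "VLIW4":  (4, 3.3, 1.05),
    "scalar": (1, 1.0, 1.30),  # GCN/Fermi-style: more scheduling logic per ALU
}

for name, (slots, filled, rel_area) in designs.items():
    utilization = filled / slots
    perf_per_area = utilization / rel_area  # effective ops per unit of ALU area
    print(f"{name:6s}: {utilization:4.0%} slot utilization, "
          f"{perf_per_area:.2f} effective throughput per unit ALU area (relative)")
```
With these made-up inputs the scalar-style design roughly breaks even with VLIW4 on effective throughput per ALU area, which is really the point: the "GPGPU burden" conclusion hinges on utilization and per-ALU cost numbers nobody in this thread actually has.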