Perf/watt/IHV man hours/posts *split*

It was far more than 10%

http://www.anandtech.com/show/2977/...x-470-6-months-late-was-it-worth-the-wait-/19

And that's coming 6 months after the 5870 hit the market. The GTX 480 consumed roughly 60-100% more power than the 5870. I can't remember which site measured power consumption through the PCIe slot and the power connectors back then, which gave a more detailed power usage breakdown.
Here you go:
http://www.pcgameshardware.de/Grafikkarten-Grafikkarte-97980/Tests/Direct-X-11-von-Nvidia-743333/2/

In Race Driver: Grid, our then-benchmark of choice for power measurement (because it was closest to the GeForce and Radeon averages over all the games tested, wtf? ;)), the GTX 480 consumed 235 watts, the HD 5870 128 watts.

In Furmark it was 304 compared to 186. :)
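
For reference, here's the arithmetic on those figures as a quick sanity check (the wattages are the PCGH numbers quoted above; the snippet itself is just illustrative):

```python
# Percentage gap implied by the PCGH power measurements quoted above (watts).
measurements = {
    "Race Driver: Grid": {"GTX 480": 235, "HD 5870": 128},
    "Furmark":           {"GTX 480": 304, "HD 5870": 186},
}

for test, watts in measurements.items():
    delta = watts["GTX 480"] / watts["HD 5870"] - 1.0
    print(f"{test}: GTX 480 draws {delta:.0%} more than the HD 5870")
# Race Driver: Grid: GTX 480 draws 84% more than the HD 5870
# Furmark: GTX 480 draws 63% more than the HD 5870
```

Both deltas sit inside the "roughly 60-100% more" range mentioned above.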

I have no idea why we are discussing Fermi. Yes, AMD was earlier to the market, faster and used less power in 2009/10, today they are neither. Does this instil confidence in Vega?

It was probably brought to the table to show how much of a non-issue power consumption was back then and how it should be treated today. By both parties, with accusations on both sides of using double standards. ;)
 
We have seen metal spins in the past; never have we seen such an improvement from one.

Well, while I take this rumor with a lot of grains of salt, and I don't really trust the 50%, we still need to see what this "metal spin" brings or accounts for; nobody is saying it's all due to the metal spin:

- Improvements to the fabrication process (not to mention they could have fixed some abnormal behaviour that should not have been there in the first place).
- Finding and fixing leakage.
- Lower voltage needed, so less power, etc.
- Perhaps some tuning of the power control.
 
With all the smoke screens about "TFLOPS" (calculated based on the Boost/Turbo/Whatever clock, which cannot be sustained under really heavy load) and "TDP/TBP/TGP" (which are either ignored and exceeded or serve as a means of reducing boost, thus also reducing TFLOPS), IMHLO it is important not to lose track of what we are comparing here.
 
With all the smoke screens about "TFLOPS" (calculated based on the Boost/Turbo/Whatever clock, which cannot be sustained under really heavy load) and "TDP/TBP/TGP" (which are either ignored and exceeded or serve as a means of reducing boost, thus also reducing TFLOPS), IMHLO it is important not to lose track of what we are comparing here.
All you can do is map frequency-voltage-power where the frequency is locked/sustained, or, when left dynamic, the average (after all, overall power demand generally fluctuates on a 2-10 ms timescale and needs to be averaged and analysed in that context). But as shown by hardware.fr, there is a divergence between the most power-intensive software and lighter loads, which can muddy the context along with temperature.
However, there is still a trend that can be seen when looking at TPU/Tom's/PCPer/Hardware.fr, who all have pretty comparable methodologies.
Also, fitting with what you mentioned earlier, one cannot compare the system power report (GPU-Z/etc.) for GPUs against TDP, and this is further exacerbated by the fact that some cards (including the XFX GTR) cannot report the power usage of the auxiliary rail to system tools such as GPU-Z.

I think the original 110W-120W rumour/info came from the power demand of the 480 die, not technically its TDP/TBP, which AMD reports as 150W on its website (still low, though possible with the more dynamic nature now).
The 3xx range was generally spot on, and for extreme ones such as the 295X2 AMD's figures were rather conservative; the real TDP was actually a bit lower.

Cheers
 
Could you specify on which large chip we saw this?

Some CPUs in the Bulldozer family.
But practically all CPU architectures have enjoyed improved perf/watt ratings on later revisions. Usually the IHVs launch "new" models with either higher clocks at the same TDP or the same clocks at a lower TDP.
 
With all the smoke screens about "TFLOPS" (calculated based on the Boost/Turbo/Whatever clock, which cannot be sustained under really heavy load) and "TDP/TBP/TGP" (which are either ignored and exceeded or serve as a means of reducing boost, thus also reducing TFLOPS), IMHLO it is important not to lose track of what we are comparing here.

Separately, just to show this power demand fluctuation still existed even back with Hawaii/Tonga, albeit with much less advanced dynamic power management/tuning of the performance-thermal envelope.
The context here is more the TDP-power demand fluctuation than the frequency-voltage-thermal envelope, which is more simplistic with these generations. The point is that it is possible to average the frequency-power-voltage for Polaris GPUs to calculate TFLOPS, if one is careful about what is used and takes AMD's base-boost spec for the 'reference' design as a guideline. It still needs a bit of leeway, but it is a reasonable indicator.

380X averaging 191W.


 
Some CPUs in the Bulldozer family.
But practically all CPU architectures have enjoyed improved perf/watt ratings on later revisions. Usually the IHVs launch "new" models with either higher clocks at the same TDP or the same clocks at a lower TDP.

Yes we saw some go from 125W to 95W TDP, but I am not aware of any metal spin to go from 125W to below 95W.
 
Some CPUs in the Bulldozer family.
But practically all CPU architectures have enjoyed improved perf/watt ratings on later revisions. Usually the IHVs launch "new" models with either higher clocks at the same TDP or the same clocks at a lower TDP.
Isn't that usually like a 9-12 month gap rather than just 3.5 months when going from launch to an improved product on the same node?
In that short time I would expect yields to improve rather than the performance-power envelope.
Cheers
 
Yes we saw some go from 125W to 95W TDP, but I am not aware of any metal spin to go from 125W to below 95W.
It's a 30% boost. Not the rather ridiculous 50% that wccftech claims, but it's about the same order of magnitude.
Meanwhile, I remembered the TLB bug in the first Barcelona chips. BIOS updates had to disable some cache which resulted in up to 20% performance penalty. I guess you could call the revision that solved it a metal spin?

I just stated we've seen it happen before, just not with GPUs, because those usually get shorter lifetimes than CPUs. Or at least it did until Intel started their tick-tock schedules while the rest of the world got stuck with 32-28nm for 3-4 years.

Isn't that usually like a 9-12 month gap rather than just 3.5 months when going from launch to an improved product on the same node?
In that short time I would expect yields to improve rather than the performance-power envelope.
Cheers

Yes, and I think it's extremely unlikely that this would happen with both Polaris chips. Wccftech is probably playing the clickbait game yet again and I don't think they have any source at all. They probably just saw some forum posts about better overclocking results with later chips and made up a story about it.


It would be extremely funny if it turned out to be true, though. All those Ethereum miners who seemingly cockblocked the whole AMD FinFET lineup for months would probably tear their collective hair out in anger.
 
Well, while I take this rumor with a lot of grains of salt, and I don't really trust the 50%, we still need to see what this "metal spin" brings or accounts for; nobody is saying it's all due to the metal spin
It's just 15% higher clocks with 25% less power usage: 1.15/0.75 ≈ 153%, roughly what the Jayz video was showing, allowing for some binning. I'd still like to see a source or confirmation of an actual metal spin.
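
Just to spell out where that figure comes from, a minimal sketch using the rumoured deltas (the 15%/25% numbers are the rumour, not measurements):

```python
# Perf/W ratio implied by the rumoured changes: performance scales with clock,
# power drops by the claimed fraction. Values are the rumour, not measurements.
clock_gain = 0.15   # +15% clocks
power_cut  = 0.25   # -25% power

perf_per_watt = (1 + clock_gain) / (1 - power_cut)
print(f"Relative perf/W: {perf_per_watt:.0%}")   # ~153% of the original card
```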

Isn't that usually like a 9-12 month gap rather than just 3.5 months when going from launch to an improved product on the same node?
Normally, but it's not unreasonable if something was wrong with the chip. In most cases the issue would be fixed prior to release. The board definitely seemed more power hungry than expected at launch, going by all the marketing numbers. If a last-minute revision messed something up, the current situation is plausible.

It would be extremely funny if it turned out to be true, though. All those Ethereum miners who seemingly cockblocked the whole AMD FinFET lineup for months would probably tear their collective hair out in anger.
Or they'd buy up all the new cards for a better return as the cards probably paid for themselves by now.
 
Or they'd buy up all the new cards for a better return as the cards probably paid for themselves by now.

But the point is to make money. If they keep changing cards every quarter they won't start making money.
Well, but if they did that, at least they'd flood the second-hand market with cheap RX 480 cards...
 
Yes, and I think it's extremely unlikely that this would happen with both Polaris chips. Wccftech is probably playing the clickbait game yet again and I don't think they have any source at all. They probably just saw some forum posts about better overclocking results with later chips and made up a story about it.


It would be extremely funny if it turned out to be true, though. All those Ethereum miners who seemingly cockblocked the whole AMD FinFET lineup for months would probably tear their collective hair out in anger.
Made me smile thinking of that with the mining, but also wince at the effect it would have on AMD's sales channels and retail pricing.

Sorry if you knew, but Khalid seems to be basing this improvement on the embedded platform, which has a manufacturer spec of <95W and the same numbers as the discrete 480 in every other way.
The problem is that the previous-gen embedded Tonga E8950 is also the full 2048-shader part with a spec of <95W and 3 TFLOPS, while the 380X, released 6 weeks later, has a TDP of 191W and 3.97 TFLOPS.
Reducing the 380X clocks to 750MHz gives 3 TFLOPS, but the AMD embedded brochure states the Tonga E8950 operates at 1000MHz.
Even then, just like with current Polaris, lowering the clocks that far still does not get the TDP to within <95W.
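
For anyone wanting to check the arithmetic, a quick sketch of the TFLOPS maths, assuming the standard 2048 shaders for full Tonga and the usual 2 FLOPs (FMA) per shader per clock; the 970MHz value is simply what the 3.97 TFLOPS figure above implies:

```python
# FP32 TFLOPS = shaders * 2 FLOPs (FMA) per clock * clock.
# 2048 shaders is the full Tonga configuration; clocks as discussed above.
def tflops(shaders: int, clock_mhz: float) -> float:
    return shaders * 2 * clock_mhz * 1e6 / 1e12

print(tflops(2048, 970))    # ~3.97 TFLOPS -> the 380X figure quoted above
print(tflops(2048, 750))    # ~3.07 TFLOPS -> the ~3 TFLOPS figure at 750MHz
print(tflops(2048, 1000))   # ~4.10 TFLOPS -> what 1000MHz on the E8950 would imply
```

Which is exactly the mismatch: 1000MHz on a full 2048-shader part cannot give only 3 TFLOPS.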

Others have mentioned in the past that the E8950's reported clocks do not align with its actual TDP/TFLOPS, and I am wondering if we are seeing the same again with the Polaris embedded E9550 part, but this time with the data further skewed by how TDP-TFLOPS-frequency is calculated for Polaris.

The reason I am using the 380X is that it aligns with Khalid's narrative of comparing a comparable discrete GPU to the embedded part.
Still, this is the same guy who took the patent info being discussed in the other topic and wrote an article about it being applied to Vega, without regard for the fact that it is a patent that can be used in multiple ways, or that it may align more with Navi...
Pretty sure quite a few of us are rather cynical about all of this, just like you.
Cheers
 
We have, just not on GPUs.
The question that should be asked first: how can a metal spin change anything to lower power?

It may very well be that AMD found a way to become somewhat competitive at perf/W, pigs may even fly, but one does not lose power in the metal layers (unless AMD accidentally created humongous metal capacitors in the original 480 for no good reason whatsoever.)
 
It's a 30% boost. Not the rather ridiculous 50% that wccftech claims, but it's about the same order of magnitude.
Meanwhile, I remembered the TLB bug in the first Barcelona chips. BIOS updates had to disable some cache which resulted in up to 20% performance penalty. I guess you could call the revision that solved it a metal spin?

I just stated we've seen it happen before, just not with GPUs, because those usually get shorter lifetimes than CPUs. Or at least it did until Intel started their tick-tock schedules while the rest of the world got stuck with 32-28nm for 3-4 years.

No, as Barcelona needed fixes to the design and not just changes to the back-end-of-line (BEOL) processing. And even the step from 125W to 95W saw a new revision, which might have been more than a metal spin. (And the real difference was not 30W anyway, as the original revision was around 105-110W.) I would believe improvements of around 10%, but 50% is a fairy tale.
 
The question that should be asked first: how can a metal spin change anything to lower power?

It may very well be that AMD found a way to become somewhat competitive at perf/W, pigs may even fly, but one does not lose power in the metal layers (unless AMD accidentally created humongous metal capacitors in the original 480 for no good reason whatsoever.)
Fixing issues with ground loops, noise, and sensors would seem a possibility. The adaptive clocking and power supply calibration features were disabled as I recall. They may in fact have turned the metal layers into giant capacitors to counteract interference. Not uncommon for RF systems. That still won't consume much, if any, power, but it will influence a system that does. They'd only need a 25% voltage reduction to cut the power nearly in half. If running full throttle because the sensors weren't working the claims would be plausible.
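
As a rough sketch of why a voltage drop of that size gets you most of the way there, assuming the usual first-order scaling of dynamic power with the square of voltage (leakage and DVFS behaviour ignored, so purely illustrative):

```python
# Rough dynamic-power scaling: P ~ f * V^2, with frequency held constant here.
# Illustrative only; leakage and real DVFS behaviour will move the numbers around.
v_scale = 0.75                 # 25% lower voltage
p_scale = v_scale ** 2         # power relative to the original
print(f"Power at 75% voltage: {p_scale:.0%} of original")   # ~56%, i.e. close to half
```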
 
It's just 15% higher clocks with 25% less power usage: 1.15/0.75 ≈ 153%, roughly what the Jayz video was showing, allowing for some binning. I'd still like to see a source or confirmation of an actual metal spin.
Bear in mind you still need to add some more watts to the total of the Jayz video because there is also a voltage controller used in the XFX that cannot report the auxiliary/memory consumption to the system utilities.
I decided to look for videos with the XFX GTR that measure the Vcore/VDDC as a comparison to other GPUs and found one.
Test is the XFX 480 GTR and also Powercolor 480 Red Devil.
Comparable fps results with following VDDC:
XFX GTR 1.175V at 1400MHz
Red Devil 1.206V at 1360MHz.
What is interesting is that their results were 'comparable' (yeah, would be nice to see the end page, bah) in the Fire Strike benchmark, while the voltage difference is under 3%; the XFX may have had a higher clock, but it did not necessarily influence the result.
The 1st test is with the custom default clocks/fans, and the 2nd test is the one I am referencing, with 100% fans and the best OC.
The test starts at 4m55s; 7m8s gives the clock and VDDC measurements for that test.

So their XFX GTR max was 1400MHz, and the extra 40MHz did not provide any relevant gains over the Powercolor Red Devil. It does have pretty good voltage at 1400MHz, although that is under a 3% improvement over the Red Devil given that both had performance parity - it seems so, anyway, because they did not give us the end score (sigh), just the running fps, but then we are missing things from the Jayz vid as well.
TBH I think their test is as unreliable as Jayz's for reaching any realistic conclusion, especially considering Test 3 at 9m55s still had the same VDDC (ideally it would be nice to see the chart for this, as I may be reaching the wrong conclusions due to how the data is presented).
I cannot wait until one of the key OCers or the best sites for measuring and analysing power demand gets their hands on the card, but this vid test sort of balances out Jayz's.
Cheers
 
Fixing issues with ground loops, noise, and sensors would seem a possibility. The adaptive clocking and power supply calibration features were disabled as I recall. They may in fact have turned the metal layers into giant capacitors to counteract interference. Not uncommon for RF systems. That still won't consume much, if any, power, but it will influence a system that does. They'd only need a 25% voltage reduction to cut the power nearly in half. If running full throttle because the sensors weren't working the claims would be plausible.


Yeah, that is not what is going on.

Also, we don't even know if the patent for adaptive clocking and power supply calibration was even for Polaris, so how is the assumption that that is what is going on even valid?

We have never seen this type of power differential from a respin in the history of graphics, or CPU design for that matter, and CPU designers have much more control over what they are doing.

RF systems are not GPUs...
 
It's a 30% boost. Not the rather ridiculous 50% that wccftech claims, but it's about the same order of magnitude.
Meanwhile, I remembered the TLB bug in the first Barcelona chips. BIOS updates had to disable some cache which resulted in up to 20% performance penalty. I guess you could call the revision that solved it a metal spin?

I just stated we've seen it happen before, just not with GPUs, because those usually get shorter lifetimes than CPUs. Or at least it did until Intel started their tick-tock schedules while the rest of the world got stuck with 32-28nm for 3-4 years.

Yes, but those are mask-layer changes with much more control over what they are doing. GPUs tend not to be privy to those types of changes.

They are making changes internally to those CPUs; with later versions, even tick-tock at Intel, only about 20% of the microcode can be kept and 80% is new, which shows the amount of change that went on under the hood of those CPUs.
 
Also, we don't even know if the patent for adaptive clocking and power supply calibration was even for Polaris, so how is the assumption that that is what is going on even valid?
http://radeon.wpengine.netdna-cdn.c...is-Architecture-Whitepaper-Final-08042016.pdf
It's in the Polaris Whitepaper, Page 12. I believe many sites reported the calibration was broken at launch. It seems reasonable to assume a broken feature could be fixed with a revision, if that is what occurred. It also seems reasonable to assume a feature designed to reduce power consumption could reduce power consumption once fixed.

RF systems are not GPUs...
No, but highly sensitive circuits could be susceptible to the same kinds of interference. Likely not RF in this case, but the same techniques can help with ground isolation. In theory the chip has a mechanism to examine the slew rate of various clocks, likely FinFETs configured with various gains plus some logic. I'll admit I'm just guessing as to the cause, but there are only so many explanations for repeated metal spins, if that's what happened.

We have never seen this type of power differential from a respin in the history of graphics, or CPU design for that matter, and CPU designers have much more control over what they are doing.
By a respin, no, but we have with software changes. Just go find some benchmarks with power management disabled vs. enabled.
 