NVIDIA GF100 & Friends speculation

no-X said:
If you switch off the FurMark protection and use the GTX 480 cooler, the GTX 580 will consume more power, reach higher temperatures and make more noise than the GTX 480. So these "improvements" are quite debatable, at least when speaking at the GPU level.

There's nothing debatable about it. Remember the GTX580 has 16/15 * 772/700 = 17.6% more throughput. By all rights, it should consume at least 17.6% more power. It doesn't, even with the GTX 480 cooler.
 
There's nothing debatable about it. Remember the GTX580 has 16/15 * 772/700 = 17.6% more throughput. By all rights, it should consume at least 17.6% more power. It doesn't, even with the GTX 480 cooler.

Quick facts for the brainwashed: the 580 with OCP disabled consumes 350W.
[image: gtx580_power_gpuz.jpg]


The 480 in TPU's review topped out at what, 320W? So it's only a ~10% increase there.
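Putting those two numbers next to the 17.6% throughput figure from the post above (and treating TPU's 320W GTX 480 reading as a comparable baseline, which is an assumption, since the test setups differ):

```python
# Back-of-the-envelope comparison of the numbers quoted in this thread.
throughput_gain = (16 / 15) * (772 / 700) - 1   # extra SM and higher clock: ~17.6%
power_gain = 350 / 320 - 1                      # FurMark, OCP disabled: ~9.4%
perf_per_watt_gain = (1 + throughput_gain) / (1 + power_gain) - 1
print(f"throughput +{throughput_gain:.1%}, power +{power_gain:.1%}, "
      f"perf/W +{perf_per_watt_gain:.1%}")      # roughly +7.6% perf/W on these numbers
```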
 
There's nothing debatable about it. Remember the GTX580 has 16/15 * 772/700 = 17.6% more throughput. By all rights, it should consume at least 17.6% more power. It doesn't, even with the GTX 480 cooler.

To be fair, you have to use a fixed temp, assuming temp derating is similar (~2W/°C).

In fact, even at max power draw the GTX580 throttles in FurMark sooner than the GTX480 (97°C vs 105°C iirc).

Summing it up, the ~20% efficiency improvement goes down to ~0% (or, at the very least, to under 10%).


I don't know if the resulting higher efficiency is due to some process work, "fully enabled" die, power delivery redesign or a combination of these, but it's clearly not sufficient to imply a redesign.
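A minimal sketch of that normalization, pairing the power and temperature figures floating around in this thread (they weren't necessarily measured under the same conditions, so treat the output as illustrative only):

```python
def power_at_reference_temp(measured_watts, measured_temp_c,
                            reference_temp_c=90.0, derating_w_per_c=2.0):
    """Project a power reading onto a common GPU temperature, assuming a
    linear ~2W/degC derating that is identical for both cards."""
    return measured_watts - derating_w_per_c * (measured_temp_c - reference_temp_c)

# Illustrative pairing of the figures quoted in this thread:
gtx580 = power_at_reference_temp(350, measured_temp_c=97)    # -> 336W at 90 degC
gtx480 = power_at_reference_temp(320, measured_temp_c=105)   # -> 290W at 90 degC
print(f"GTX 580 draws {gtx580 / gtx480 - 1:.0%} more power at equal temperature")
```

On that pairing the power gap widens to roughly the size of the throughput gap, which is how the efficiency advantage mostly evaporates.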
 
To be fair, you have to use a fixed temp, assuming temp derating is similar (~2W/°C).

In fact, even at max power draw the GTX580 throttles in FurMark sooner than the GTX480 (97°C vs 105°C iirc).

Summing it up, the ~20% efficiency improvement goes down to ~0% (or, at the very least, to under 10%).


I don't know if the resulting higher efficiency is due to some process work, "fully enabled" die, power delivery redesign or a combination of these, but it's clearly not sufficient to imply a redesign.

Temp derating is a) nonlinear, so using a linear approximation is shaky, b) design dependent, so assuming it's the same to prove the design is still the same is circular reasoning.

I think we can all agree that GF110 is a minor reworking of GF100. Personally, I would be more comfortable calling it GF100b. But clearly, GF110 is a different chip than GF100 - it has a different die size and a few feature tweaks. Nvidia explicitly claimed they reworked the transistor voltage thresholds, and I see no reason to invent conspiracy theories to prove they didn't.
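To put a number on the nonlinearity point: leakage grows roughly exponentially with temperature, so a single W/°C figure is only a local slope. A toy model with invented constants (not measured GF100/GF110 values):

```python
def leakage_watts(temp_c, leakage_at_25c=30.0, doubling_interval_c=30.0):
    """Toy exponential leakage model: leakage roughly doubles every ~30 degC.
    The constants are invented for illustration, not measured for any chip."""
    return leakage_at_25c * 2 ** ((temp_c - 25.0) / doubling_interval_c)

for t in (70, 85, 100):
    slope = leakage_watts(t + 1) - leakage_watts(t)     # local "W per degC"
    print(f"{t} degC: leakage ~{leakage_watts(t):.0f}W, slope ~{slope:.1f} W/degC")
```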
 
Quick facts for the brainwashed: the 580 with OCP disabled consumes 350W.
[image: gtx580_power_gpuz.jpg]


The 480 in TPU's review topped out at what, 320W? So it's only a ~10% increase there.


Using Furmark to evaluate power consumption is misleading. All chip designers optimize their circuits for the common case, and then ensure that corner cases don't break the chip. Take a look at power consumption in real life games, and you'll see that GF110 is more power efficient than GF100. This isn't surprising - all Nvidia's efforts to improve GF100 have rightly been focused on games, not synthetic benchmarks like Furmark.
 
All chip designers optimize their circuits for the common case, and then ensure that corner cases don't break the chip.

Any reasonably recent CPU is quite thoroughly fine with violently running Linpack or all the other "burn" variations of it, without violating spec, and without relying on lame mechanisms like app-detection - is Furmark significantly different as a concept?
 
Any reasonably recent CPU is quite thoroughly fine with violently running Linpack or all the other "burn" variations of it, without violating spec, and without relying on lame mechanisms like app-detection - is Furmark significantly different as a concept?

I recall it wasn't too long ago that it was possible to kill a processor through voltage or heat under F@H/Linpack/IBT. AMD and Intel introduced throttling and thermal protection (hard off, clock throttling, etc.) to prevent this, then started putting better coolers and higher TDPs on chips. I think we're just seeing the start of a similar ramp-up of the same (or better) technologies.
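In rough pseudocode, that CPU-style protection boils down to a temperature-driven policy, something like the sketch below (thresholds made up; the real mechanisms live in hardware/firmware and use hysteresis):

```python
def thermal_protection(die_temp_c, throttle_at_c=95.0, trip_at_c=110.0):
    """Simplified CPU-style thermal protection: clock throttle first, hard off
    as a last resort. Thresholds are made up for illustration."""
    if die_temp_c >= trip_at_c:
        return "hard_off"         # thermal trip: cut power before the silicon cooks
    if die_temp_c >= throttle_at_c:
        return "throttle_clocks"  # reduce frequency/voltage until temperature recovers
    return "run_normally"
```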
 
Take a look at power consumption in real life games, and you'll see that GF110 is more power efficient than GF100. This isn't surprising - all Nvidia's efforts to improve GF100 have rightly been focused on games, not synthetic benchmarks like Furmark.
Power consumption depends (among other things) on GPU temperature, so it's related to cooling. Even this FurMark graph shows that the longer the application runs, the higher the power consumption gets. So it's really hard to judge the power efficiency of two GPUs equipped with significantly different coolers. Until somebody takes a GTX 480 cooler, mounts it on a GTX 580 and tests it in e.g. Crysis or Medal of Honor, I won't be convinced that the GPU is significantly more power-efficient.
 
I think it's ridiculous that this mechanism is relying on app detection, however. It needs to be general purpose - Nvidia is fighting the symptoms, not the actual issue.
 
I recall it wasn't too long ago that it was possible to kill a processor through voltage or heat under F@H/Linpack/IBT. AMD and Intel introduced throttling and thermal protection (hard off, clock throttling, etc.) to prevent this, then started putting better coolers and higher TDPs on chips. I think we're just seeing the start of a similar ramp-up of the same (or better) technologies.

Exactly. To be honest, Nvidia's choice of using OCCT/Furmark app detection strikes me as a complete kludge. The right thing to do is just to power-limit the chip dynamically: monitor how much power you're burning and then throttle softly and dynamically to make sure you stay within the power budget. App detection is the wrong solution to this problem; I think Intel CPUs have done the right thing here, and I expect GPUs to follow suit.
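Something along these lines, i.e. a closed loop on measured board power rather than an application blacklist; this is only a sketch of the idea, with made-up numbers, not how any shipping firmware actually does it:

```python
def power_limit_step(measured_watts, clock_mhz,
                     budget_watts=300.0, min_mhz=405.0, max_mhz=772.0, step_mhz=13.0):
    """One iteration of a soft power limiter: nudge clocks down while over budget,
    creep back up when there is headroom. All constants are illustrative."""
    if measured_watts > budget_watts:
        return max(min_mhz, clock_mhz - step_mhz)
    if measured_watts < 0.95 * budget_watts:
        return min(max_mhz, clock_mhz + step_mhz)
    return clock_mhz
```

The point is that this works for any workload, FurMark included, without ever needing to know the application's name.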
 
I think it's ridiculous that this mechanism is relying on app detection, however. It needs to be general purpose - Nvidia is fighting the symptoms, not the actual issue.

Doing app detection is a lot simpler, because it doesn't require any kind of monitoring hardware, which would be necessary for anything more sophisticated. I don't think they expected to run into such power issues when they started designing Fermi, so they didn't include it; plus, they had enough on their plate already, and were rather area-constrained. Perhaps that will change with Kepler or Maxwell.

That said, app detection seems to work fine, so far.
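For contrast, the detection approach boils down to little more than a name lookup, which is exactly why it is cheap; the sketch below is a caricature with made-up executable names, since the real driver check isn't public:

```python
# Caricature of driver-side app detection: no power telemetry needed, just a
# blacklist of executables known to draw extreme power. Names are made up.
POWER_VIRUS_BLACKLIST = {"furmark.exe", "occt.exe"}

def engage_power_cap(process_name: str) -> bool:
    return process_name.lower() in POWER_VIRUS_BLACKLIST
```

It also shows the weakness: a renamed binary, or a version the driver hasn't seen yet, sails straight past the cap.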
 
Doing app detection is a lot simpler, because it doesn't require any kind of monitoring hardware, which would be necessary for anything more sophisticated. I don't think they expected to run into such power issues when they started designing Fermi, so they didn't include it; plus, they had enough on their plate already, and were rather area-constrained. Perhaps that will change with Kepler or Maxwell.

That said, app detection seems to work fine, so far.

I appreciate that it is easier, Alex, but the current solution actually does seem to rely on monitoring hardware - which just stays inactive unless (particular versions of) Furmark or OCCT are detected. Various reviews have demonstrated/measured this, and here's an image from Hexus which highlights the hardware side of things:

[image: Power.jpg, Hexus shot highlighting the power-monitoring hardware]


What disadvantages would there be to simply having this on all the time, like in CPUs? (Back in the day, their thermal protection was external to the CPU die instead of internal as well.)
 
Power consumption depends (among other things) on GPU temperature, so it's related to cooling. Even this FurMark graph shows that the longer the application runs, the higher the power consumption gets. So it's really hard to judge the power efficiency of two GPUs equipped with significantly different coolers. Until somebody takes a GTX 480 cooler, mounts it on a GTX 580 and tests it in e.g. Crysis or Medal of Honor, I won't be convinced that the GPU is significantly more power-efficient.

As far as I can tell your argument appears to be 'cooler transistors use less power so the improved cooling on the GTX 580 makes the GF110 appear to have better perf/w than the 480'. Is that correct?
 
I don't know what is actually happening, but a reason that you may app-detect for something like this could be that the driver may need to poll the current-monitoring/regulator devices for activity, which can chew up CPU cycles; you certainly don't want to do this for all apps. The solution we implemented on Evergreen feeds directly into the microcontroller we have on the GPU, taking away any potential driver overhead and making the solution truly generic.
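To make the overhead point concrete, a driver-side implementation would have to keep a loop like the sketch below alive on the host (both callbacks are hypothetical, purely to show where the CPU cycles go), whereas routing the telemetry into the GPU's own microcontroller removes the host from the loop entirely:

```python
import time

def driver_polling_loop(read_board_power_watts, set_clock_limit,
                        budget_watts=300.0, poll_interval_s=0.1):
    """Hypothetical driver-side monitoring loop. Every iteration burns host CPU
    time; an on-GPU microcontroller removes this loop from the driver entirely."""
    while True:
        set_clock_limit(read_board_power_watts() > budget_watts)
        time.sleep(poll_interval_s)   # even at 10Hz this is steady background overhead
```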
 
They really are doing the detection in SW and looking for a specific application that causes overheating. That's like virus scanning. If you haven't seen the problem application before, you aren't protected!

CPUs have been using dynamic systems for a long time now that will prevent against arbitrary programs overheating the CPU. Intel designed such a system in Montecito (and had to disable it), Tukwila, Nehalem, Sandy Bridge, etc. and AMD has designed one in Llano, Bobcat and Bulldozer.

David


PS: Yes, cooler transistors leak less. So, all things being equal, if you improve cooling you will lower the power consumption of the chip. However, more cooling may use more system power (e.g. a bigger fan).
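That last tradeoff is easy to see with a toy model: a faster fan lowers die temperature and therefore leakage, but the fan itself draws power, so total system power has a sweet spot somewhere below maximum cooling. Every constant below is invented purely for illustration:

```python
def total_system_power(fan_duty, dynamic_watts=220.0):
    """Toy model of the cooling tradeoff. All constants are invented."""
    die_temp_c = 105.0 - 40.0 * fan_duty                  # faster fan -> cooler die
    leakage_w = 40.0 * 2 ** ((die_temp_c - 65.0) / 30.0)  # leakage falls as it cools
    fan_w = 25.0 * fan_duty ** 3                          # fan power rises steeply
    return dynamic_watts + leakage_w + fan_w

best_power, best_duty = min((total_system_power(d / 100), d / 100) for d in range(20, 101))
print(f"Lowest total power ~{best_power:.0f}W at ~{best_duty:.0%} fan duty")
```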
 