NVIDIA Fermi: Architecture discussion

Power consumption - the #1 reason not to build a supercomputer.

Yeah exactly, that doesn't fly at all for a couple reasons. Power consumption can't be high on their list of concerns given the perf/watt advantage of GPUs over CPUs. And why in the world would ORNL commit to such a large project without knowing what the power requirements would be? Maybe Nvidia could have promised something and broken that promise but it's hard to believe Nvidia would still be guessing at power consumption at this late stage.
 
Yeah exactly, that doesn't fly at all for a couple reasons. Power consumption can't be high on their list of concerns given the perf/watt advantage of GPUs over CPUs. And why in the world would ORNL commit to such a large project without knowing what the power requirements would be? Maybe Nvidia could have promised something and broken that promise but it's hard to believe Nvidia would still be guessing at power consumption at this late stage.

The deal wasn't announced yesterday. It's very plausible that when it was made, Nvidia wasn't 100% sure of the final FLOPS/W ratio.
 
The deal wasn't announced yesterday. It's very plausible that when it was made, Nvidia wasn't 100% sure of the final FLOPS/W ratio.

Let's take the rumoured 130W 3.33GHz Gulftown and assign it a generous 5 flops (scalar+SSE) per core per clock. With six cores that's roughly 100 GFLOPS, or 0.77Gflops/watt. Nvidia set power consumption of the 2.1TF S2050 at 900W. That's 2.3Gflops/watt, and that number is handicapped since it includes memory power consumption as well. 3x the flops/watt of the best-case CPU.

So given those numbers how big does the error in their estimation have to be to make Fermi unattractive due to power consumption?
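
(For anyone who wants to check that arithmetic, a quick sketch in Python; the six-core count and the per-clock reading of "5 flops per core" are assumptions on my part, not datasheet figures.)

```python
# Back-of-the-envelope perf/watt numbers from the post above.
# Assumptions: Gulftown treated as a 6-core part, "5 flops per core"
# read as 5 flops per core per clock.

def gflops_per_watt(gflops, watts):
    """GFLOPS divided by power draw in watts."""
    return gflops / watts

gulftown_gflops = 6 * 3.33 * 5   # ~100 GFLOPS
print(f"Gulftown: {gflops_per_watt(gulftown_gflops, 130):.2f} Gflops/watt")  # ~0.77
print(f"S2050:    {gflops_per_watt(2100, 900):.2f} Gflops/watt")             # ~2.33
```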
 
Let's take the rumoured 130W 3.33GHz Gulftown and assign it 7 flops (scalar+SSE) per core per clock. With six cores that's roughly 140 GFLOPS, or 1.08Gflops/watt. Nvidia set power consumption of the 2.1TF S2050 at 900W. That's 2.3Gflops/watt, and that number is handicapped since it includes memory power consumption as well. Twice the flops/watt of the best-case CPU.

So given those numbers how big does the error in their estimation have to be to make Fermi unattractive due to power consumption?

I would have to agree with your first comment on the "article". It just doesn't fly :)
And since there's no confirmation from either side, especially ORNL, it's highly doubtful that 1) ORNL really won't be using Fermi and 2) if indeed ORNL won't be using Fermi for their supercomputer, that the reasons for that are related to power consumption.
 
Let's take the rumoured 130W 3.33GHz Gulftown and assign it a generous 5 flops (scalar+SSE) per core per clock. With six cores that's roughly 100 GFLOPS, or 0.77Gflops/watt. Nvidia set power consumption of the 2.1TF S2050 at 900W. That's 2.3Gflops/watt, and that number is handicapped since it includes memory power consumption as well. 3x the flops/watt of the best-case CPU.

So given those numbers how big does the error in their estimation have to be to make Fermi unattractive due to power consumption?

900W is the "typical" power use. The C2070 has a 190W typical power draw and a 225W max power draw, so the ratio appears to be about 1.18.

So you can take that 900W figure and raise it to about 1065W if you want to compare it to Gulftown. Let's make it 1000W to account for memory.

But actually, that's not even the point. The point is that Nvidia likely promised a certain level of performance/W (let's call it X) for a certain price, let's call it Y.
If actual silicon only delivers 85% of X, then Oak Ridge may not be willing to pay more than 85% of Y. Perhaps that wasn't enough for Nvidia. Or maybe the ORNL decided to take a look at Larrabee instead, or wait a while for Nvidia to actually meet X and maybe pay 95% of Y due to delays. This is obviously just wild speculation, but I think it's plausible.
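
(A quick sketch of that scaling in Python; the 2.1TF throughput figure is carried over from earlier in the thread, and the 1000W round-down is the poster's memory allowance, not a measured value.)

```python
# Scaling sketch from the post above: apply the C2070's max/typical power
# ratio to the S2050's 900W "typical" figure, then round down for memory.

ratio = 225 / 190                 # ~1.18
s2050_max_w = 900 * ratio         # ~1065 W
print(f"scaled to max power: {s2050_max_w:.1f} W")

s2050_adjusted_w = 1000           # rounded down as a memory allowance
print(f"perf/watt at {s2050_adjusted_w} W: {2100 / s2050_adjusted_w:.2f} Gflops/watt")  # 2.10
```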
 
900W is the "typical" power use. The C2070 has a 190W typical power draw and a 225W max power draw, so the ratio appears to be about 1.18.
So you can take that 900W figure and raise it to about 1065W if you want to compare it to Gulftown. Let's make it 1000W to account for memory.

900 Watt / 4 = 225 Watt. :!:

But actually, that's not even the point.

Sure it's the point:
The supercomputer project was just killed for power reasons. Fermi power reasons. Whoops.
http://www.semiaccurate.com/2009/12/16/oak-ridge-cans-nvidia-based-fermi-supercomputer/
 
Lots of guessing....

The numbers are there plain to see. Not sure how you got that power consumption number but 900W already assumes 225W per board. The fact is that it's ridiculous to think that Fermi's perf/watt could be anything but fantastic compared to current options.

Even at your 85% number it's still better than everything else out there. And it's doubtful Nvidia is trying to make a big profit on this - their current goal is market penetration. I wouldn't be surprised if they're taking a loss or just breaking even.
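
(Rough numbers for that 85% scenario, using only figures already quoted in the thread; both Gulftown estimates are included since they differ between posts.)

```python
# The "85% of promised perf/watt" scenario versus the CPU estimates above.

promised  = 2.3                  # Gflops/watt estimate for the S2050 above
delivered = 0.85 * promised      # ~1.96 Gflops/watt

for label, cpu in [("Gulftown @ 5 flops/clock", 0.77),
                   ("Gulftown @ 7 flops/clock", 1.08)]:
    print(f"{label}: {delivered:.2f} vs {cpu:.2f} -> still {delivered / cpu:.1f}x ahead")
```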
 
Perhaps the overall wattage is the problem, rather than performance/W. Maybe Fermi-based Tesla was promised at a much lower overall wattage, something in the 150-160W range of previous models, IIRC.
 
Even at your 85% number it's still better than everything else out there.

Well, that depends on the SIMD utilization/efficiency, ease of programming, existing code bases etc. for the needed workloads. If it was already a close call between a GPU and a CPU solution, those 15% could be the tipping point.
The CPUs for a setup like this would also be sold at just around break-even, btw.
 
Perhaps the overall wattage is the problem, rather than performance/W. Maybe Fermi-based Tesla was promised at a much lower overall wattage, something in the 150-160W range of previous models, IIRC.

You're still assuming that the article is correct, but there's no proof of that.
 

Yes but Nvidia says that this 900W figure is "typical" while they say the individual card has a 190W typical power draw and 225W TDP. So either there's some kind of overhead somewhere, or they don't use those terms consistently.
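
(A quick sanity check of that mismatch; the four-board count is taken from the "900 Watt / 4" post above, not from a spec sheet.)

```python
# The apparent inconsistency: four boards at their "typical" draw don't
# add up to the S2050's quoted 900W typical figure, while four at TDP do.

boards            = 4
typical_per_board = 190   # W
tdp_per_board     = 225   # W

print(f"4 x typical draw: {boards * typical_per_board} W")   # 760 W
print(f"4 x TDP:          {boards * tdp_per_board} W")       # 900 W
print(f"gap vs the 900W 'typical' system figure: {900 - boards * typical_per_board} W")  # 140 W
```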

The numbers are there plain to see. Not sure how you got that power consumption number but 900W already assumes 225W per board. The fact is that it's ridiculous to think that Fermi's perf/watt could be anything but fantastic compared to current options.
Even at your 85% number it's still better than everything else out there. And it's doubtful Nvidia is trying to make a big profit on this - their current goal is market penetration. I wouldn't be surprised if they're taking a loss or just breaking even.

If by performance/watt you mean FLOPS/watt then yes, I agree. But my point is that Nvidia could very well have promised a certain level of performance for a certain level of power consumption but can't deliver. That doesn't mean Fermi is a poor choice, just not as good as it looked on paper, and that can be enough to cancel a project.
 
Yes but Nvidia says that this 900W figure is "typical" while they say the individual card has a 190W typical power draw and 225W TDP. So either there's some kind of overhead somewhere, or they don't use those terms consistently.

Because it's a whole system with a lot more stuff in it than just the cards - they do the same with the S1070.

If by performance/watt you mean FLOPS/watt then yes, I agree. But my point is that Nvidia could very well have promised a certain level of performance for a certain level of power consumption but can't deliver. That doesn't mean Fermi is a poor choice, just not as good as it looked on paper, and that can be enough to cancel a project.

Why do you keep bringing up Fermi's power use when it's not the point? Nobody will cancel a supercomputer because they need 17% more power...
 
If you have to care about the infrastructure required to power and cool an installation drawing hundreds of kilowatts to megawatts of power, where the proposed design was already near the redline, you could.
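
(Purely illustrative arithmetic; the 5 MW baseline below is a made-up example for scale, not an ORNL figure.)

```python
# What a power overrun means once you're at facility scale.
# NOTE: the 5 MW baseline is a hypothetical, not anything announced.

baseline_mw = 5.0
for overrun in (0.17, 0.25):
    total_mw = baseline_mw * (1 + overrun)
    extra_kw = (total_mw - baseline_mw) * 1000
    print(f"{overrun:.0%} overrun: {total_mw:.2f} MW total (+{extra_kw:.0f} kW to power and cool)")
```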
 
Because it's a whole system with a lot more stuff in it than just the cards - they do the same with the S1070.



Why do you keep bringing up Fermi's power use when it's not the point? Nobody will cancel a supercomputer because they need 17% more power...

Why not? What if it's 25% more power? What if they can't supply/cool that much? Or what if it has lower perf/W than planned and is 2 months late but the vendor can't give a significant discount to make up for it?
 
Let's take the rumoured 130W 3.33GHz Gulftown and assign it a generous 5 flops (scalar+SSE) per core per clock. With six cores that's roughly 100 GFLOPS, or 0.77Gflops/watt. Nvidia set power consumption of the 2.1TF S2050 at 900W. That's 2.3Gflops/watt, and that number is handicapped since it includes memory power consumption as well. 3x the flops/watt of the best-case CPU.

So given those numbers how big does the error in their estimation have to be to make Fermi unattractive due to power consumption?
Maybe they went with... dunno... maybe... their competitors? Even the good old RV770 gives out 5Gflops per watt :rolleyes: Or maybe they went with CELL. Lots of variables, not just Gulftown.
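
(For what it's worth, the perf/watt figures quoted in this thread side by side; precision isn't stated for all of them, so they aren't strictly comparable.)

```python
# Side-by-side of the perf/watt figures quoted in this thread, taken at
# face value (the RV770 number is the figure quoted above, precision unstated).

quoted = {
    "Gulftown (5 flops/clock estimate)": 0.77,
    "Tesla S2050 (900W typical)":        2.33,
    "RV770 (figure quoted above)":       5.0,
}
for name, gpw in sorted(quoted.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {gpw:.2f} Gflops/watt")
```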
 