Nvidia BigK GK110 Kepler Speculation Thread

There is a significant delta in power draw even from Tahiti to Pitcairn & Verde due to the maturity of the design rules. Processes themselves do not stay static either, even within the same node.

I don't really know what to make of this Dave. Clearly, you aren't implying that I should compare Tahiti, Pitcairn and Cape Verde power wise and marvel at the lower power figures for the latter two. Processes get better over time, yes sure. But we're not talking 5800 -> 6900 here, but 3800 (maxed @496 GFLOPS) and 4800 (maxed @1,360 GFLOPS).
 
There is a significant delta in power draw even from Tahiti to Pitcairn & Verde due to the maturity of the design rules. Processes themselves do not stay static either, even within the same node.

Yep, perf/w on Pitcairn is miles ahead of Tahiti and pretty much every other chip this generation so I wouldn't be surprised if AMD pulled a rabbit out of the 28nm hat later this year.

Charlie talked about a GK114 refresh last year but I honestly don't see the point, especially after the Titan release. I guess they could use it as a test vehicle for upcoming changes in Maxwell.
 
Ok so that was an oversimplification, but it's not like there is 6GB of data constantly being written to memory in the average game, at least not at 2560x1600 and lower. It's no more valid than Tahiti's 3GB being to blame for worse power draw than the 680.

It doesn't matter much, because VRAM is used in parallel. If you're only using 1/3 of the available memory, or 2GB out of 6GB, then every RAM chip is 33.3% used, no chip is idle.

What you're suggesting might work if, in this case, a GPU used only 4 chips at 100%, but that's not how it works, because it would mean using only 128 bits out of the 384-bit interface, thereby dividing bandwidth by 3.
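(A minimal sketch of that arithmetic, assuming a Titan-like 6 GB / 384-bit layout with 32-bit GDDR5 channels; the figures are illustrative only.)

```python
# Illustrative sketch: a partially-filled VRAM pool interleaved across all channels.
TOTAL_VRAM_GB = 6.0
BUS_WIDTH_BITS = 384
CHANNEL_BITS = 32                                # each GDDR5 channel is 32 bits wide
NUM_CHANNELS = BUS_WIDTH_BITS // CHANNEL_BITS    # 12 channels (24 chips in clamshell mode)

used_gb = 2.0

# Addresses are interleaved across every channel, so occupancy spreads evenly:
per_channel_occupancy = used_gb / TOTAL_VRAM_GB
print(f"Each channel is ~{per_channel_occupancy:.1%} occupied; none sits idle.")

# Confining the working set to 4 of the 12 channels would shrink the usable bus:
active_channels = 4
print(f"Only {active_channels * CHANNEL_BITS} of {BUS_WIDTH_BITS} bits active "
      f"-> {active_channels / NUM_CHANNELS:.0%} of peak bandwidth.")
```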
 
It doesn't matter much, because VRAM is used in parallel. If you're only using 1/3 of the available memory, or 2GB out of 6GB, then every RAM chip is 33.3% used, no chip is idle.

What you're suggesting might work if, in this case, a GPU used only 4 chips at 100%, but that's not how it works, because it would mean using only 128 bits out of the 384-bit interface, thereby dividing bandwidth by 3.

Ok I get it - idle was a bad choice of word - but if only 1/3rd of the total memory is being used then that's using less power than if all of it is being used.

I guess the power draw numbers themselves are pretty low even if all 6GB was 100% utilised. Basically, the extra (mostly underutilised) 3GB on Titan is contributing a couple of extra Watts in power (over the 7970) and not a lot more than that I'd imagine?

The low idle power draw of Titan even though it has 6GB vs 2GB on the 680 would seem to support the notion that memory is not really much of a consideration in terms of power draw.

[Attached graph: 53405.png]
 
Remember that whether RAM is occupied or not, it still needs to be refreshed. I think power use is more a function of the number of chips and traffic than occupancy.

Titan has 24 memory chips, while Tahiti has 12. That should have a measurable impact on power. Under load that might be around 10W or so. Obviously far less when idle.
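(Rough back-of-envelope implied by those numbers; the 10 W figure is just the ballpark above, not a measurement.)

```python
# Back-of-envelope: what a ~10 W load delta would imply per extra GDDR5 chip.
titan_chips, tahiti_chips = 24, 12
extra_chips = titan_chips - tahiti_chips        # 12 additional devices
load_delta_w = 10.0                             # ballpark figure from the post above
print(f"~{load_delta_w / extra_chips:.2f} W per additional chip under load")   # ~0.83 W
```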
 
I read an Anandtech article that suggested read and write ops were mostly the reason for power consumption on memory. I'll try to dig it up, was a while ago. It was just normal system RAM mind, so it might not be particularly valid anyway.
 
Ok I get it - idle was a bad choice of word - but if only 1/3rd of the total memory is being used then that's using less power than if all of it is being used.

I guess the power draw numbers themselves are pretty low even if all 6GB was 100% utilised. Basically, the extra (mostly underutilised) 3GB on Titan is contributing a couple of extra Watts in power (over the 7970) and not a lot more than that I'd imagine?

The low idle power draw of Titan even though it has 6GB vs 2GB on the 680 would seem to support the notion that memory is not really much of a consideration in terms of power draw.

[Attached graph: 53405.png]

What I am saying is that when it comes to power, we have three scenarios across the reviews: some show Nvidia using the same amount of power, some show it using less, and some show it using more.

If we take an average across all these reviews, we probably get something along the lines of the same amount of power, or slightly more or less than that. Factor in that Nvidia has twice as much memory, and the chip alone probably uses less. But let's give AMD the benefit of the doubt and say it uses the same.

I think a lot of the efficiency comes from being a smaller chip with less complexity. Also, Pitcairn didn't come that much later than Tahiti; it launched only two months after. Much of Pitcairn's efficiency comes from the fact that it basically sits at the sweet spot, beyond which adding units to the design yields less than proportional returns, which is why the gap between it and Tahiti is smaller than the difference in their unit counts would suggest. The same thing happened between the 5870 and the 6870. As a result, you get a huge boost in performance per watt from a significantly smaller die. You don't get the same effect when you add more to a big chip, though; that will typically bring efficiency lower. What you can do is lower clocks to offset the effects of having a larger die.

Let's be honest here. Even though you guys want AMD to have the performance crown, it would be a prideful and wasteful effort considering they don't have a strong professional market to make the R&D worth it (not to mention the wasted wafers).

AMD's likely and best course of action is just another revision/respin of the current 7970, a later stepping clocked higher with less headroom available, perhaps in the 1150-1200 MHz range. Nvidia is likely to do the same thing with GK114 and eventually GK110, so besides Titan they won't be at much of a disadvantage. AMD will likely gain more than Nvidia from this type of move because that 256-bit bus on GK1x4 is starting to catch up with them. I also think AMD planned this all along, which is why the 7970s were initially clocked so conservatively. This also lines up with the more believable rumors from Charlie.

It just doesn't make as much sense to spend such tremendous R&D when the market is shrinking.

AMD's game-bundling approach has been successful so far, so it might be more fruitful to spend money on marketing than on a vanity project like a Titan beater.
 
I read an Anandtech article that suggested read and write ops were mostly the reason for power consumption on memory. I'll try to dig it up, was a while ago. It was just normal system RAM mind, so it might not be particularly valid anyway.
When you read and write to DRAM memory, you need to open and close and precharge those particular pages, in addition to refreshing. But it doesn't matter which pages you open or close: you're going to consume the same amount of power if you open a limited subset of pages or pages that are all over the address map.
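(A toy model of that point; the coefficients are made-up placeholders, and the only purpose is to show that activity and refresh appear in the power equation while occupancy does not.)

```python
# Toy DRAM power model: background/refresh + activate/precharge + read/write I/O.
# Coefficients are placeholders, not datasheet values.
def dram_power_w(num_chips: int, act_pre_per_us: float, rw_bandwidth_gbs: float) -> float:
    background_w = 0.10 * num_chips         # refresh/standby, paid per chip regardless of contents
    act_pre_w = 0.02 * act_pre_per_us       # page open/close/precharge activity
    io_w = 0.05 * rw_bandwidth_gbs          # driving data over the bus
    return background_w + act_pre_w + io_w  # note: "how full the DRAM is" never appears

# Same chip count and traffic, different occupancy -> identical power in this model:
print(dram_power_w(num_chips=24, act_pre_per_us=100, rw_bandwidth_gbs=150))
```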
 
The question is just how much power is actually being used? Sure if the whole 6GB is occupied I can see there being a delta of 10W over 2GB, but in the average gaming scenario in which the cards are tested we're looking at <2GB utilisation on each card.

That AT graph I linked showing Titan's similar idle power draw: is the Titan still using all 24 memory chips, just at a very low level of occupancy? I honestly don't know the answer to that, but I'd assume that's the case unless something else changes from 2D to 3D.
 
Most of the power spent by a GDDR5 memory system is used driving the bus.

As a first order approximation, power consumption is proportional to bus width (assuming operating frequency stays the same).
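(As a worked example, with a placeholder per-pin figure rather than a datasheet value:)

```python
# First-order: interface power scales with bus width at a fixed per-pin data rate.
def interface_power_w(bus_width_bits: int, mw_per_pin: float = 25.0) -> float:
    # mw_per_pin is a placeholder, not a measured GDDR5 figure
    return bus_width_bits * mw_per_pin / 1000.0

for width in (256, 384, 512):
    print(f"{width}-bit: ~{interface_power_w(width):.1f} W ({width / 256:.2f}x the 256-bit case)")
```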

Cheers
 
Most of the power spent by a GDDR5 memory system is used driving the bus.

As a first order approximation, power consumption is proportional to bus width (assuming operating frequency stays the same).

Cheers

This was one of my main reasons for believing Tahiti has such underwhelming power characteristics compared to the 256-bit chips. I have to say that Titan appears to perform better than I had anticipated, but it's so hard to get a true figure because of the boost.

You could wonder just how well a 256-bit Titan chip would have performed vs the 680 though. A lot of those bigger (50%+) gains appear to be down to bandwidth and Nvidia seemed keen to ensure that the card was benched no lower than 1600p.
 
You could wonder just how well a 256-bit Titan chip would have performed vs the 680 though. A lot of those bigger (50%+) gains appear to be down to bandwidth and Nvidia seemed keen to ensure that the card was benched no lower than 1600p.
Lower resolutions would just make the benchmarks CPU-limited. The performance at lower resolutions would have essentially nothing to do with the relative bandwidth configuration.
 
AMD has one more problem: W9000 is basically a failure. You don't hear anything about its sales, and it basically loses or ties (HotHardware and Tom's Hardware; AMD made Anandtech postpone its review)
To be clear here, AMD did no such thing. W9000 was a personal screwup of mine. I am not a domain expert on professional graphics, and while AMD provided some guidance I wasn't able to put together something I was comfortable releasing. It's something where we might have to bring in outside help, especially if we want to do more than SPEC's workstation benchmarks.
 
Most of the power spent by a GDDR5 memory system is used driving the bus.

As a first order approximation, power consumption is proportional to bus width (assuming operating frequency stays the same).

Cheer
Something to keep in mind of course is that being able to reduce memory clockspeeds and voltages at idle makes quite a difference in idle power consumption. One of 4870's problems at idle was that it was running at full clocks and full power all the time. When AMD could step down clocks and voltages on 5870, coupled with the improvements to the core it made a dramatic difference; about 30W at the wall.

"With Cypress AMD has implemented nearly the entire suite of GDDR5’s power saving features, allowing them to reduce the power usage of the memory controller and the GDDR5 modules themselves. As with the improvements to the core clock, key among the improvement in memory power usage is the ability to go to much lower memory clock speeds, using fast GDDR5 link re-training to quickly switch the memory clock speed and voltage without inducing glitches. AMD is also now using GDDR5’s low power strobe mode, which in turn allows the memory controller to save power by turning off the clock data recovery mechanism. When discussing the matter with AMD, they compared these changes to putting the memory modules and memory controller into a GDDR3-like mode, which is a fair description of how GDDR5 behaves when its high-speed features are not enabled."
 
What I'd consider a low hanging fruit would be completely powering down all but one memory controller when in the lowest power mode.
 
To be clear here, AMD did no such thing. W9000 was a personal screwup of mine. I am not a domain expert on professional graphics, and while AMD provided some guidance I wasn't able to put together something I was comfortable releasing. It's something where we might have to bring in outside help, especially if we want to do more than SPEC's workstation benchmarks.

Sorry for the assumption. I remember comments saying the review was going to come the following week, but it never came. I thought this review would have been particularly important because it highlights the payoff of GCN vs VLIW; it would have been, at the very least, more important than reviews of higher-clocked versions of existing cards. So when it never arrived after months of waiting and no explanation was given, I thought something had happened.

Considering the lackluster results most websites got, and having seen AMD request that reviews be postponed because of issues before, I thought the same had happened here.
 
AMD has one more problem: W9000 is basically a failure. You don't hear anything about its sales, and in the reviews out there (HotHardware and Tom's Hardware; AMD made Anandtech postpone its review) it basically loses or ties against last gen's Quadro 6000. If AMD is serious about getting into the professional workstation market, it's going to have to do something serious about the design of GCN, and that is tied to designing a big chip. To justify the manufacturing cost and R&D, something that big, competing with Titan, needs to win not only on gaming performance but on application performance (not OpenCL benchmarks). With this I can imagine the chip getting even bigger.
And yet the workstation business is hitting record revenue numbers...
 
And yet the workstation business is hitting record revenue numbers...

Any hard numbers coming on that front? In terms of market share, units, revenue, or anything. I seem to recall AMD making that same comment several times, but technically it could be true going from $10 million/year to $10.01 million/year to $10.02 million/year, etc.

The reason I'm asking is that it doesn't really seem to have much of an impact on finances (going by quarterly reports) and it seems like a major area of potential growth for AMD.
 
It's not broken down from the GPU revenue externally. AFAIK unit share is at the highest it has ever been, and I believe this is the first time it has been mentioned as a highlight by the CEO in the quarterly conference call since we've been AMD.
 