NVIDIA GF100 & Friends speculation

This may come as a surprise, but the set of target apps is a moving goalpost. What's efficient for one set of workloads isn't necessarily efficient for workloads that have yet to be created. G71 was a great chip for the DX9 workloads of its time, but it would have a hard time running DX11 apps. Similarly, G80 was great for DX10, but would suffer under the more geometry-heavy world of DX11. And that doesn't even count all the CUDA-related features and workloads.

What's more, different things scale differently with different process nodes. Logic gates used to be expensive and wires ~free, for example, which pointed to one particular architecture as being efficient, whereas on a different process node logic is cheap and wires are expensive, and a different architecture is needed to be efficient there.
So what you're saying is that NVidia needed a ring bus for GF100?
 
Regarding the speculations about a mid-life kicker (whatever that may turn out to be): How much headroom is there on current process tech wrt die size?

I think I've read somewhere that there's a reticle limit at about 600 mm², and given that GF100 has already been measured at ~550 mm² (though an "official" number (GPU-Z?) says 526 mm²), the option of going much larger seems rather theoretical - despite my former speculations.
 
Regarding the speculations about a mid-life kicker (whatever that may turn out to be): How much headroom is there on current process tech wrt die size?

I think I've read somewhere that there's a reticle limit at about 600 mm², and given that GF100 has already been measured at ~550 mm² (though an "official" number (GPU-Z?) says 526 mm²), the option of going much larger seems rather theoretical - despite my former speculations.

I think you have to subtract ~0.5mm for the packaging of the die, so you end up with 22.9x23. When I said 24x24 way back when, I had the same numbers as you got now, rounded up (and no digital caliper ;)
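A quick back-of-the-envelope check of those numbers, as a Python sketch (the 23.4x23.5 caliper reading is just my guess consistent with the ~550 mm² measurement above, and the ~600 mm² reticle limit is the figure quoted in this thread):

Code:
# Rough die-size check based on the numbers discussed above.
measured_w, measured_h = 23.4, 23.5   # mm, a plausible caliper reading (~550 mm^2)
package_margin = 0.5                  # mm subtracted from each measured dimension

die_w = measured_w - package_margin   # 22.9 mm
die_h = measured_h - package_margin   # 23.0 mm
die_area = die_w * die_h              # ~526.7 mm^2, close to the "official" 526 mm^2

reticle_limit = 600.0                 # mm^2, approximate
print(f"die ~ {die_area:.1f} mm^2, headroom ~ {reticle_limit - die_area:.0f} mm^2")

So even taking the larger measurement at face value, there are only a few tens of mm² left before the reticle limit.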
 
Regarding the speculations about a mid-life kicker (whatever that may turn out to be): How much headroom is there on current process tech wrt die size?

I think I've read somewhere that there's a reticle limit at about 600 mm², and given that GF100 has already been measured at ~550 mm² (though an "official" number (GPU-Z?) says 526 mm²), the option of going much larger seems rather theoretical - despite my former speculations.

Seems like these were real since the packaging is the same:

http://en.expreview.com/2010/03/15/geforce-gtx470480-die-shots-exposed/6863.html

Apparently GF100 production silicon doesn't have any markings on it, or they've been polished away.
 
There are so many rumors floating around the internet about a GF104/GX2, which would make for a quite reasonable "mid-life kicker".

I don't think it'll come close to AMD's mGPU Antilles, but at least it would significantly increase performance over a GTX480 without necessarily drawing that much more power. I've never been too fond of mGPU/AFR-based solutions personally, but against a 480 I can't say I wouldn't prefer it.
 
There are so many rumors floating around the internet about a GF104/GX2, which would make for a quite reasonable "mid-life kicker".

I don't think it'll come close to AMD's mGPU Antilles, but at least it would significantly increase performance over a GTX480 without necessarily drawing that much more power. I've never been too fond of mGPU/AFR-based solutions personally, but against a 480 I can't say I wouldn't prefer it.

I don't really see that happening. Such a product would be equivalent to a pair of GTX 460s (1GB), drawing about 300W (just like the GTX 480) and performing a bit worse than a Crossfire of HD 5850s, or than an HD 5970. It would be nice compared to NVIDIA's current top-end offering, I guess, but AMD's? Not to mention, of course, the cost: 2 × 367mm² of silicon isn't exactly cheap.

Considering that Barts XT is likely to be about as fast as the GTX 460…
 
I don't really see that happening. Such a product would be equivalent to a pair of GTX 460s (1GB), drawing about 300W (just like the GTX 480) and performing a bit worse than a Crossfire of HD 5850s, or than an HD 5970. It would be nice compared to NVIDIA's current top-end offering, I guess, but AMD's? Not to mention, of course, the cost: 2 × 367mm² of silicon isn't exactly cheap.

Considering that Barts XT is likely to be about as fast as the GTX 460…

Hey, they can't lose the single and dual GPU performance crown two years in succession, without a fight!
 
I don't really see that happening. Such a product would be equivalent to a pair of GTX 460s (1GB), drawing about 300W (just like the GTX 480) and performing a bit worse than a Crossfire of HD 5850s, or than an HD 5970.

Considering that a GTX460 has a TDP of 160W, why would power consumption scale by nearly a factor of 2 for two chips on one PCB? And that's IHV-agnostic anyway; otherwise a 5970 (despite its cut-down Cypress chips) would end up with a significantly higher TDP than it has today.

The GT200b/GTX275 @ 633MHz has a TDP of 219W, while the GTX295 (576MHz per chip) has a TDP of 289W. Give or take, if I normalize those to the same frequencies I get an increase in power consumption of ~40% for the mGPU.

I'd even dare to say that a full GF104 at over 700MHz in an mGPU config could still end up under the 300W TDP ballpark.
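For what it's worth, the rough normalization behind that ~40% figure looks something like this (a sketch only; the linear power-vs-clock scaling is a crude layman's assumption, not anything official):

Code:
# Frequency-normalized GT200b comparison, assuming power scales ~linearly with core clock.
gtx275_tdp, gtx275_clk = 219.0, 633.0   # single GT200b
gtx295_tdp, gtx295_clk = 289.0, 576.0   # dual GT200b, per-chip core clock

# What a single GTX275-class chip might draw at the GTX295's clock.
gtx275_at_295_clk = gtx275_tdp * (gtx295_clk / gtx275_clk)   # ~199 W

increase = gtx295_tdp / gtx275_at_295_clk - 1.0
print(f"mGPU increase vs. one normalized chip: ~{increase:.0%}")   # ~45%

Depending on how exactly you normalize, it lands in the 40-45% range, which is the ballpark quoted above.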

It would be nice compared to NVIDIA's current top-end offering, I guess, but AMD's? Not to mention, of course, the cost: 2 × 367mm² of silicon isn't exactly cheap.

Considering that Barts XT is likely to be about as fast as the GTX 460…

If a full 104 dual chip config at higher frequencies could place itself somewhere between Cayman and Antilles performance-wise, then they'd obviously also adjust the MSRP right in between.

Of course I could be completely wrong, but I don't see anything anywhere that suggests that when an IHV plasters two chips on the same PCB, power consumption automatically (nearly) doubles.
 
Regarding the speculations about a mid-life kicker (whatever that may turn out to be): How much headroom is there on current process tech wrt die size?

I think I've read somewhere that there's a reticle limit at about 600 mm², and given that GF100 has already been measured at ~550 mm² (though an "official" number (GPU-Z?) says 526 mm²), the option of going much larger seems rather theoretical - despite my former speculations.

I don't think they need to go larger.
Within 600 mm² and using the GF104 architecture, I think they can easily beat GF100 in gaming performance.
Something like 40nm, 768 SPs, 32 ROPs, a 256-bit MC with 7.0 Gbps GDDR5, to be delivered by mid-2011.
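As a quick sanity check of that speculated memory config (a sketch; the 256-bit bus and 7.0 Gbps GDDR5 are just the speculation above):

Code:
# Bandwidth of a hypothetical 256-bit interface with 7.0 Gbps GDDR5.
bus_width_bits = 256
data_rate_gbps = 7.0                                   # effective, per pin

bandwidth_gb_s = bus_width_bits * data_rate_gbps / 8   # 224 GB/s
print(f"{bandwidth_gb_s:.0f} GB/s")                    # vs ~177 GB/s on a GTX 480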
 
Considering that a GTX460 has a TDP of 160W, why would power consumption scale by nearly a factor of 2 for two chips on one PCB? And that's IHV-agnostic anyway; otherwise a 5970 (despite its cut-down Cypress chips) would end up with a significantly higher TDP than it has today.

The GT200b/GTX275 @ 633MHz has a TDP of 219W, while the GTX295 (576MHz per chip) has a TDP of 289W. Give or take, if I normalize those to the same frequencies I get an increase in power consumption of ~40% for the mGPU.

I'd even dare to say that a full GF104 at over 700MHz in an mGPU config could still end up under the 300W TDP ballpark.

The GTX 275/295 is a bad example, because the GTX 275 was a "dirty" part, just like the HD 5830: it was slower and drew more power than the GTX 285.

GTX 285: 648/1476/1242MHz with 512-bit bus => 204W
GTX 275: 633/1404/1134MHz with 448-bit bus => 219W
GTX 295: 576/1242/999MHz with 448-bit bus => 289W (dual)

So if you compare the GTX 295 to the GTX 285, then you have about ~11% lower GPU clocks, ~16% lower shader clocks, ~20% lower memory clocks, and a 12.5% smaller bus, and of course 12.5% fewer memory chips, for a ~42% increase in TDP.

Plus:

HD 5870: 850/1200MHz => 188W
HD 5970: 725/1000MHz => 294W

So about ~15% lower GPU clocks, ~17% lower memory clocks, and a ~56% increase in TDP. Another way to look at it is that the HD 5970 is a dual HD 5850 with the same clocks, a few more SPs enabled and almost double the power.
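Restated as a small Python snippet (nothing new here, just the arithmetic behind the percentages above):

Code:
# Clock and TDP deltas for the two single- vs dual-GPU pairs above.
pairs = {
    "GTX 285 -> GTX 295": {"core": (648, 576), "mem": (1242, 999), "tdp": (204, 289)},
    "HD 5870 -> HD 5970": {"core": (850, 725), "mem": (1200, 1000), "tdp": (188, 294)},
}

for name, d in pairs.items():
    core_drop = 1 - d["core"][1] / d["core"][0]
    mem_drop = 1 - d["mem"][1] / d["mem"][0]
    tdp_rise = d["tdp"][1] / d["tdp"][0] - 1
    print(f"{name}: core -{core_drop:.0%}, mem -{mem_drop:.0%}, TDP +{tdp_rise:.0%}")
# GTX 285 -> GTX 295: core -11%, mem -20%, TDP +42%
# HD 5870 -> HD 5970: core -15%, mem -17%, TDP +56%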

Basically, most of the efficiency you gain in dual-GPU parts comes from the reduced clocks, and of course the reduced voltage: the combined effect means higher efficiency per GPU, so you don't double the TDP. But what happens if you don't reduce the clocks?

This happens:

[Image: power consumption comparison chart (consommation-full.png)]

http://www.pcworld.fr/article/comparatif-cartes-graphiques/consommation/497961/

Look at the HD 4870 X2's power consumption: it's 167W higher than the HD 4870's. Taking the PSU's efficiency into account (Seasonic M12D 850, so about 88%), that's about 147W more, which is very close to the HD 4870's TDP. In other words, it almost doubles. And there's probably some amount of throttling on top of it.
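In numbers (a sketch; the 88% efficiency is the rough figure above, and a simple linear conversion is assumed):

Code:
# Converting the measured wall-power delta into an estimated card-level delta.
wall_delta_w = 167.0       # HD 4870 X2 vs HD 4870, measured at the wall
psu_efficiency = 0.88      # rough efficiency of the Seasonic M12D 850 at that load

card_delta_w = wall_delta_w * psu_efficiency   # ~147 W
print(f"~{card_delta_w:.0f} W extra at the card level")   # very close to one HD 4870's TDP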

So yeah, considering the GTX 460's TDP of 160W, I doubt NVIDIA could increase clock speeds while adding a second GPU and remaining within 300W.

If a full 104 dual chip config at higher frequencies could place itself somewhere between Cayman and Antilles performance-wise, then they'd obviously also adjust the MSRP right in between.

I'm not sure releasing a top-of-the-line, super-high-end dual-GPU card that fails to outperform your competitor's is a very wise move, from a marketing point of view. It doesn't provide any kind of halo, and if anything it's just a display of technical inferiority.
 
So yeah, considering the GTX 460's TDP of 160W, I doubt NVIDIA could increase clock speeds while adding a second GPU and remaining within 300W.
My thoughts too. The GTX460 does, however, have some voltage headroom (as evidenced by its good OC capabilities), so it might be possible to fit two full chips (but not at higher clocks) into 300W with some binning, imho.
Still wondering when (if?) we'll see a full-chip GF104, btw. I guess around the release timeframe of Barts would make sense, but no rumors?
 
Look at the HD 4870 X2's power consumption: it's 167W higher than the HD 4870's. Taking the PSU's efficiency into account (Seasonic M12D 850, so about 88%), that's about 147W more, which is very close to the HD 4870's TDP. In other words, it almost doubles. And there's probably some amount of throttling on top of it.

You'll also have to take into account the higher system load a faster card produces; IOW, IMHLO, system-level measurements don't allow for such conclusions.
 
The GTX 275/295 is a bad example, because the GTX 275 was a "dirty" part, just like the HD 5830: it was slower and drew more power than the GTX 285.

GTX 285: 648/1476/1242MHz with 512-bit bus => 204W
GTX 275: 633/1404/1134MHz with 448-bit bus => 219W
GTX 295: 576/1242/999MHz with 448-bit bus => 289W (dual)

So if you compare the GTX 295 to the GTX 285, then you have about ~11% lower GPU clocks, ~16% lower shader clocks, ~20% lower memory clocks, and a 12.5% smaller bus, and of course 12.5% fewer memory chips, for a ~42% increase in TDP.

I never claimed more than a rough 40% increase in TDP. I obviously approximated a few things with that estimate in the first place.

Not that it has anything directly to do with the debate at hand, but who guarantees that the chips for the GTX285 weren't hand-selected anyway? If so, I wouldn't call a GTX275 chip "dirty", but rather the 285 chips the best out of a large bunch.

Plus:

HD 5870: 850/1200MHz => 188W
HD 5970: 725/1000MHz => 294W

So about ~15% lower GPU clocks, ~17% lower memory clocks, and a ~56% increase in TDP. Another way to look at it is that the HD 5970 is a dual HD 5850 with the same clocks, a few more SPs enabled and almost double the power.

With the only other difference being that the amount of onboard memory also plays a role. A 2GB 5870 has a TDP of 228W if memory serves well. That's a ~21% difference, plus of course the frequency difference, which might put the real difference more in the 50% ballpark.

So yeah, considering the GTX 460's TDP of 160W, I doubt NVIDIA could increase clock speeds while adding a second GPU and remaining within 300W.

160W + 56% ≈ 250W

Enabling the 8th cluster shouldn't change consumption by a margin worth mentioning, if at all. A reasonable increase in frequency of, say, 50MHz shouldn't boost the TDP far beyond the 175W mark (granted there's no problem with the chip itself), and since mGPUs have bandwidth to spare, I wouldn't say it needs a memory frequency increase.

Now add twice the amount of RAM and the overhead for whatever else, and I still have a hard time imagining anything damn close to 300W, let alone above it.
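One way to combine those rough numbers (a sketch; the ~175W full-chip guess and the +56% scaling factor are the assumptions above, with RAM/board overhead left as a small extra on top):

Code:
# Back-of-envelope mGPU TDP estimate following the reasoning above.
gtx460_tdp = 160.0          # GTX 460 1GB
full_chip_bump = 15.0       # 8th cluster + ~50 MHz -> roughly 175 W per chip
mgpu_scaling = 1.56         # the +56% seen going from HD 5870 to HD 5970

estimate = (gtx460_tdp + full_chip_bump) * mgpu_scaling   # ~273 W
print(f"~{estimate:.0f} W before RAM/board overhead")     # still some margin to 300 W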

I'm not sure releasing a top-of-the-line, super-high-end dual-GPU card that fails to outperform your competitor's is a very wise move, from a marketing point of view. It doesn't provide any kind of halo, and if anything it's just a display of technical inferiority.

It's better than nothing at all. I'm sure having just a GTX480 on display for X amount of time would be a better solution, both in terms of performance against Cayman and in terms of power consumption. If the GTX480 didn't burn a shitload of power, the picture would be a lot different. All a hypothetical GX2/104 would need is, at worst, power consumption comparable to a 480 and a noticeable performance advantage over the latter. That's a higher perf/W ratio than their existing high-end product, and it's definitely less embarrassing than nothing at all.

And yes, again, I might be completely wrong. I'm as sure as I can be that they had plans for a GX2/104, though of course I can't know its exact characteristics. However, there's also a chance they canceled it in the meantime. If so, I personally find it likelier that they did it because of upcoming products ending up too close to it than because of power consumption itself.
 
Basically, most of the efficiency you gain in dual-GPU parts comes from the reduced clocks, and of course the reduced voltage: the combined effect means higher efficiency per GPU, so you don't double the TDP. But what happens if you don't reduce the clocks?

You're missing a massive component...
 
I never claimed more than a rough 40% increase in TDP. I obviously approximated a few things with that estimate in the first place.

Not that it has anything directly to do with the debate at hand, but who guarantees that the chips for the GTX285 weren't hand-selected anyway? If so, I wouldn't call a GTX275 chip "dirty", but rather the 285 chips the best out of a large bunch.



With the only other difference being that the amount of onboard memory also plays a role. A 2GB 5870 has a TDP of 228W if memory serves well. That's a ~21% difference, plus of course the frequency difference, which might put the real difference more in the 50% ballpark.



160W + 56% ≈ 250W

Enabling the 8th cluster shouldn't change consumption by a margin worth mentioning, if at all. A reasonable increase in frequency of, say, 50MHz shouldn't boost the TDP far beyond the 175W mark (granted there's no problem with the chip itself), and since mGPUs have bandwidth to spare, I wouldn't say it needs a memory frequency increase.

Now add twice the amount of RAM and the overhead for whatever else, and I still have a hard time imagining anything damn close to 300W, let alone above it.



It's better than nothing at all. I'm sure having just a GTX480 on display for X amount of time would be a better solution, both in terms of performance against Cayman and in terms of power consumption. If the GTX480 didn't burn a shitload of power, the picture would be a lot different. All a hypothetical GX2/104 would need is, at worst, power consumption comparable to a 480 and a noticeable performance advantage over the latter. That's a higher perf/W ratio than their existing high-end product, and it's definitely less embarrassing than nothing at all.

And yes, again, I might be completely wrong. I'm as sure as I can be that they had plans for a GX2/104, though of course I can't know its exact characteristics. However, there's also a chance they canceled it in the meantime. If so, I personally find it likelier that they did it because of upcoming products ending up too close to it than because of power consumption itself.

Look at it this way: the HD 5970 is essentially a dual HD 5850, with a few more SPs enabled. And power went from 151W to 294W. What you're proposing for the GTX 460 is strictly equivalent: double everything, and possibly enable a few more SPs.

So if power went 151W -> 294W (+94.7%) for AMD, why should it be any different for NVIDIA? For what it's worth, 160 × 1.947 = 311.5W.
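Or in code form (just the arithmetic from the sentence above):

Code:
# Applying the HD 5850 -> HD 5970 power scaling to a GTX 460 1GB.
hd5850_tdp, hd5970_tdp = 151.0, 294.0
scaling = hd5970_tdp / hd5850_tdp        # ~1.947, i.e. +94.7%

gtx460_tdp = 160.0
print(f"{gtx460_tdp * scaling:.1f} W")   # ~311.5 W, over the 300 W limit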

You're missing a massive component...

Binning? They share a certain amount of circuitry, but I'm guessing that apart from VRMs, most of it is negligible. I can't really comment/speculate about binning without extensive internal data, which I don't have.
 
There could be a LOT that's different for NVIDIA, with the hot-clocked ALUs as a simple example; the fact that they're different architectures altogether is an even simpler one.

As for the Evergreen comparisons, I'll twist it the other way around: if a 2GB 5870 @ 850MHz has a TDP of 228W and I add a theoretical 50% to that, I end up at 342W. Minus the 14.7% frequency difference, that's about 292W.
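The same thing as a quick snippet (just the arithmetic above; the +50% is a purely theoretical figure):

Code:
# Reverse calculation: start from a 2GB HD 5870 and scale down to HD 5970 clocks.
hd5870_2gb_tdp = 228.0
theoretical_dual = hd5870_2gb_tdp * 1.5     # +50% -> 342 W
clock_scaling = 725.0 / 850.0               # HD 5970 vs HD 5870 core clock, ~ -14.7%

print(f"{theoretical_dual * clock_scaling:.0f} W")   # ~292 W, close to the 5970's 294 W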

That's the catch with dumb layman's math exactly: all roads can lead to Rome :p
 