NVIDIA GF100 & Friends speculation

So, GF104 will be launched in June/July and SI in the September/October timeframe? That would be interesting. A ~420mm² SI and a ~350mm² GF104 would mean switched roles: nVidia's sweet-spot product against ATi's monolith...

How so? Last I checked, GF104 was just supposed to be a lower-end derivative of GF100.
 
G92 was also introduced as a lower-end derivative, but its performance was very close to G80's... (that doesn't mean I have any info, of course...)
 
Well G92 benefited from a shrink to 65nm. I can't see something like that happening for GF104 if it is to be released this spring/summer.
 
You are right... G92 vs G94 would be a better example (33% fewer transistors, but only 20% less performance at the same configuration).
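
A quick back-of-envelope of that comparison (Python; the 33% and 20% figures are from the post above, the rest is plain arithmetic):

```python
# Back-of-envelope for G94 vs G92 at the same configuration.
# The 33% and 20% figures are from the post above; the rest is arithmetic.
transistor_ratio = 1.0 - 0.33   # G94 has ~33% fewer transistors than G92
perf_ratio = 1.0 - 0.20         # ...but only ~20% less performance

# Performance delivered per transistor, relative to G92.
print(f"G94 perf/transistor vs G92: {perf_ratio / transistor_ratio:.2f}x")  # ~1.19x
```

So the smaller chip actually extracts roughly 19% more performance per transistor, on these numbers.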
 
There won't be a Fermi II before 28nm, but it seems that there will be a GX2 based on a 256-bit chip before then. Whether that chip is GF104 or not, I don't know. But certainly given the time between GF100 and GF104 tape-out, you'd expect architectural tweaks. I could be horribly wrong, but I seem to remember the tape-out time between NV30 and NV35 to be equivalent to this; not that I expect changes anywhere near as drastic or as domain-focused as that, of course.
 
Yeah, well, about that GX2: I took a closer look at TechPowerUp's GTX 480 performance-per-watt graph the other day.

[Image: TechPowerUp performance-per-watt chart from the GTX 480 review (perfwatt.gif)]


There seems to be something horribly wrong with the whole thing. :oops: I don't know if I'm reading it right, but I'd swear that Nvidia's jump to 40nm with Fermi actually resulted in a worse performance/watt ratio than their 55nm GT200b! On the other hand, ATI made a huge leap forward. The name Evergreen seems well deserved. Hopefully the Northern/Southern Islands names won't mean the cores will have to be surrounded by water! :LOL:

So my conclusion is that the 300W PCIe spec limit for any one card will result in Nvidia actually being able to deliver less performance than before! I guess there will have to be some major changes on the power consumption front for a viable GX2 product to be possible.
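
To put that in numbers, a minimal sketch of the cap argument (the 300W figure is the PCIe limit mentioned above; the perf/watt values are hypothetical placeholders, not measurements):

```python
# Sketch of the 300W argument: under a fixed board-power cap, peak performance
# is bounded by perf/watt * cap. The perf/watt figures are made-up placeholders.
PCIE_CAP_WATTS = 300.0

def peak_perf(perf_per_watt: float, cap: float = PCIE_CAP_WATTS) -> float:
    """Best-case performance a single card can deliver under the cap."""
    return perf_per_watt * cap

old_ppw = 1.00   # hypothetical baseline architecture, performance units per watt
new_ppw = 0.90   # hypothetical: 10% worse perf/watt than the baseline

print(peak_perf(old_ppw))  # 300.0 performance units
print(peak_perf(new_ppw))  # 270.0 -- a worse ratio lowers the ceiling
```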
 
Look at ATI's past series, though. It had really bad performance per watt, so ATI really focused on that part of the design. Nvidia, on the other hand, has had two generations pretty much on par: the previous flagship, the GTX 280, is only 10% better than the GTX 480.
 
The GTX 280 was a 65nm product. The GTX 480 is a 40nm product. Shouldn't there be a huge difference in favor of the GTX 480? I mean, if the GTX 280 were manufactured at 40nm, wouldn't the performance ratio on that graph be closer to 200%?

As for ATI, the 4850 seems to be very close to the GTS 250, and those are the mainstream products that attract most buyers. Both are at 55nm.

Still, the GTX 295 was both faster and more power-efficient than the 4870 X2. So the conclusion should be that there is something really, really wrong with GF100.
 
If the GTX 480 had the same performance as the GTX 280, then yes, it would be. But it has much better performance, a lot of new hardware, and DX11. So the fact that it's only 10% worse isn't that bad. It could be much worse.
 
It isn't that bad??? :oops:

Let me repeat: the performance-to-consumption ratio of a 65nm product is actually BETTER than the performance-to-consumption ratio of the 40nm product, two full shrinks later. How is that not bad?

If this isn't bad, then what should ATI say about the 90% improvement of, say, the 5850 over the 4850? That it's a godlike improvement? The 5850 also has much better performance, a lot of new hardware, and DX11 as well!

There is no need for the hypothetical GTX 280 at 40nm to have the same performance as the GTX 480 to make a comparison of the performance-to-consumption ratios. In the same respect, the 5850 doesn't have the same performance either, yet it is 91% more efficient than the GTX 480.
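
For what it's worth, a ratio comparison really doesn't require equal absolute performance. A minimal sketch with made-up fps and power numbers, chosen only so the output lands near the 91% figure above:

```python
# Perf/watt comparisons don't require equal absolute performance.
# The fps and watt values are illustrative placeholders, not review data;
# only the resulting ~91% efficiency gap echoes the figure discussed above.
def perf_per_watt(fps: float, watts: float) -> float:
    return fps / watts

gtx480_eff = perf_per_watt(fps=100.0, watts=320.0)   # placeholder numbers
hd5850_eff = perf_per_watt(fps=90.0, watts=151.0)    # placeholder numbers

print(f"HD 5850 vs GTX 480 efficiency: {hd5850_eff / gtx480_eff:.2f}x")  # ~1.91x
```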
 
The simple fact of the matter is that there is never a linear relationship between performance and power consumption, so I don't see any reason whatsoever to worry about such things as the ratio between the two. Ratios are only reasonable when there's (at least approximately) a linear relationship.

But if nVidia puts out, for instance, a mainstream part soon using a smaller core, it's likely to have vastly better performance/power. SLI, by contrast, has somewhat lower performance/power because the power consumption is doubled while performance is not.
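
To put the SLI point in numbers, a minimal sketch assuming a hypothetical 1.7x scaling factor:

```python
# Dual-GPU sketch: power roughly doubles, performance scales by less than 2x.
# The 1.7x scaling factor is a hypothetical placeholder, not a measurement.
single_perf, single_power = 1.0, 1.0
sli_perf, sli_power = 1.7 * single_perf, 2.0 * single_power

print(single_perf / single_power)  # 1.00 for the single card
print(sli_perf / sli_power)        # 0.85 -- perf/watt drops ~15% in SLI
```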

It just makes more sense to me to look at the whole package: price, performance, power consumption, features, software compatibility, etc. and judge based on which product maximizes your own personal utility. Pulling out ratios that are meaningless from the get go doesn't help to elucidate this at all.
 
Generalizations are even more useless when you limit the metrics to a subset of the hardware's capabilities. There are workloads in which Fermi's performance justifies its higher power consumption over GT200. And that's even before considering the manufacturing aspect of it (I'm still not clear on how much of the power consumption issue lies at TSMC's feet). I don't expect GF104 to be the epitome of efficiency but it doesn't have to do much to improve on GF100.
 
The simple fact of the matter is that there is never a linear relationship between performance and power consumption, so I don't see any reason whatsoever to worry about such things as the ratio between the two. Ratios are only reasonable when there's (at least approximately) a linear relationship.

But if nVidia puts out, for instance, a mainstream part soon using a smaller core, it's likely to have vastly better performance/power. SLI, by contrast, has somewhat lower performance/power because the power consumption is doubled while performance is not.

It just makes more sense to me to look at the whole package: price, performance, power consumption, features, software compatibility, etc. and judge based on which product maximizes your own personal utility. Pulling out ratios that are meaningless from the get go doesn't help to elucidate this at all.

Well yes, the whole package is what matters, but the performance-to-power-consumption ratio is part of this package, and a very important part for me, since electricity is not exactly free.

Sorry, but I don't find these ratios meaningless. They clearly show what the architecture is capable of, or how stressed a core is. E.g., the 5850 has a better ratio than the 5870, which means Cypress gives its best ratio when clocked at 5850 levels rather than 5870 levels. So the more you push it, the less you get in return for the power it consumes.
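
That falls out of the usual first-order dynamic-power approximation (power scales roughly with frequency times voltage squared, and voltage rises alongside clocks). A sketch using the real 5850/5870 core clocks but assumed voltages:

```python
# First-order model: performance ~ f, dynamic power ~ f * V^2,
# so perf/watt ~ 1 / V^2 and pushing clocks (and voltage) up always hurts.
# Core clocks are the real 5850/5870 values; the voltages are rough assumptions.
def perf_per_watt(freq_mhz: float, volts: float) -> float:
    perf = freq_mhz                 # performance scales ~linearly with clock
    power = freq_mhz * volts ** 2   # dynamic power scales with f * V^2
    return perf / power

hd5850 = perf_per_watt(725.0, 1.09)   # assumed core voltage
hd5870 = perf_per_watt(850.0, 1.16)   # assumed core voltage

print(f"5850 vs 5870 perf/watt: {hd5850 / hd5870:.2f}x")  # ~1.13x in this model
```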

Anyway, this whole train of thought of mine started with the speculation about a GX2 product, which I don't see happening in Fermi's current state. If it is based on a totally new chip, though, then yes, a lot is possible.


Generalizations are even more useless when you limit the metrics to a subset of the hardware's capabilities. There are workloads in which Fermi's performance justifies its higher power consumption over GT200. And that's even before considering the manufacturing aspect of it (I'm still not clear on how much of the power consumption issue lies at TSMC's feet). I don't expect GF104 to be the epitome of efficiency but it doesn't have to do much to improve on GF100.

Agreed about the generalizations. I should have stated that it's pure framerate that interests me, but I thought that was clear, since we all know what TechPowerUp measured in their GTX 480 review.
 
Well yes, the whole package is what matters, but the performance-to-power-consumption ratio is part of this package, and a very important part for me, since electricity is not exactly free.
That just makes the total power consumption meaningful. The ratio isn't, really. Particularly because while the power consumption is linear with cost, performance is not linear with respect to enjoyment.
 
The simple fact of the matter is that there is never a linear relationship between performance and power consumption, so I don't see any reason whatsoever to worry about such things as the ratio between the two. Ratios are only reasonable when there's (at least approximately) a linear relationship.

That's debatable:

[Image: performance-per-watt chart at 1920x1200 (perfwatt_1920.gif)]


No, it's not perfectly linear, but the HD 5670, 5570, 5970, 5870, 5770, 5750 are very close to each other, with the 5850 being a good bit ahead of the pack, and the 5830 lagging far behind (but it's weirdly crippled and clocked pretty high).

Sure, performance/watt in itself doesn't tell you much unless something remains constant: it can be performance, power, die size...

And no matter how you look at it, the GTX 480's performance/power ratio is terrible: the 5970 does 60% better... and it's faster!
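
One handy consequence of those two numbers: since power equals performance divided by perf/watt, being both faster and 1.6x more efficient implies a lower power draw. A sketch, with the 5970's performance lead as a hypothetical placeholder:

```python
# Since power = performance / (perf/watt), the same holds for ratios.
# The 1.6x efficiency figure is from the chart above; the 1.15x performance
# lead is a hypothetical placeholder.
eff_ratio = 1.60    # 5970 perf/watt relative to the GTX 480
perf_ratio = 1.15   # hypothetical: 5970 performance relative to the GTX 480

print(f"5970 power vs GTX 480: {perf_ratio / eff_ratio:.2f}x")  # ~0.72x
```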
 
There are workloads in which Fermi's performance justifies its higher power consumption over GT200.
But those situations are pretty rare (cherry-picking, isn't it?). Fermi has higher power consumption even in the rest of the workloads, where GT200 performs only slightly worse. That's the problem.
 
Which do you consider rare and which are common? Is every game counted as a different workload or are "games" taken altogether just one entry? If the former then obviously the comparison is heavily skewed since games make up the vast majority of apps that can run on a GPU.

Cypress does look like a better jack-of-all-trades architecture, but the least of Fermi's worries is perf/watt relative to GT200. It does so much more than GT200 in terms of gaming and general compute that the metric is rather meaningless.
 
The group I work in has seen speedups from 1.5 to 4x with Fermi over GT200 for our applications in computer vision & speech recognition, with 2x being common - and that's without recoding anything (or using any double precision). We expect Fermi to do even better once we've optimized our code for its memory subsystem. Fermi, for us, is actually pretty nice, even in terms of perf/W.
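
As a rough sanity check on that perf/W remark, a sketch assuming (hypothetically) about 1.3x the board power for Fermi versus GT200:

```python
# Perf/watt gain = speedup / power ratio. The speedups are from the post above;
# the 1.3x board-power ratio (Fermi vs GT200) is a rough assumption.
speedups = [1.5, 2.0, 4.0]
power_ratio = 1.3

for s in speedups:
    print(f"{s:.1f}x speedup -> {s / power_ratio:.2f}x perf/watt")
```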
 
The group I work in has seen speedups from 1.5 to 4x with Fermi over GT200 for our applications in computer vision & speech recognition, with 2x being common - and that's without recoding anything (or using any double precision).
What are the reasons for the 4x gain?

Jawed
 