NVIDIA Fermi: Architecture discussion

I have no preference, I'm simply trying to expose his own biases. I already know he will NEVER admit he was wrong on any of his past predictions, be it the performance ones or the products.

I have lurked @ B3D since about the 2nd or 3rd iteration of the R600 prediction thread :LOL:, and I can't agree with your self-assessment.

Note I'm Australian, which means I go for the underdog, which is generally AMD :p
 
You are missing the most important point for the HPC market: power. It is probably the number one concern for that market. If NV disables a cluster or two for Fermi, that is one method of yield drop-out that they could use.

If it doesn't clock high enough, that is because of either timing problems or power draw: the first results in a crash, the second in out-of-spec power use. The first problem would work out fine for the HPC cards, but the second one won't. It is in NV's best interest to pick those cards for lower power.

Also, I am REALLY skeptical that there will be enough demand for $3K+ compute cards to soak up the imperfect GF100s from the desktop market. The market of "well funded HPC customers with money to piss away" is notably smaller than the market for gamers saving up their allowance.

For the Fermi to be a salvage part, it would have to be a salvage part for the '360' to be of any real value, and I doubt that there will be enough parts that fit a narrow enough bin to base a product on.

If I had to bet, I would say that the Fermi parts are the cherry-picked GF100s; after all, at 6-7 times the cost, I would suspect that the '380s' are the second-tier parts.

-Charlie
With perf/watt being so much better with GPUs in general than with CPUs, I think any vendor who can provide an x-factor against CPUs doesn't necessarily need to care whether that x equals 12 or 11 or 10 or even 9. And with each card possibly sporting a six- and an eight-pin connector next to the 75 watts from the PEG slot, staying inside the power budget of 225 watts would be nice but not a necessity IMHLO.

Sure, given the prices asked, one would expect that you'd get the cherry-picked chips with the higher price tags - but I guess a large portion of that higher price tag, and of the customers' willingness to pay the bills, comes from a) the software ecosystem support (which in the case of commodity graphics cards is only or mostly drivers) plus b) the real money professional users can save by changing their hardware, whereas gamers only get more FPS (or higher image quality).

And surely Nvidia will go out of its way to sell not "imperfect" GPUs, but "GPUs that fit a given spec".
 
Which is what the GTX 260 was to the GTX 280. And if you were around at the time of the 4870's launch on the B3D forums, the 260 was said to be DOA at its then launch price. And how well did that turn out for NV? :p

Again, I'm not assuming that what I called "GeForce 360" (384 SPs, etc, etc) is just a full Fermi chip, with some units disabled. I'm guessing that that part (with units disabled) is actually a GeForce 370, with 480 SPs, one or two ROPs disabled, etc.

I hope that this time, NVIDIA doesn't go around with "Core X" and/or "Reloaded" BS suffixes, to distinguish models from one another.
 
I have lurked @ B3D since about the 2nd or 3rd iteration of the R600 prediction thread :LOL:, and I can't agree with your self-assessment.

Note I'm Australian, which means I go for the underdog, which is generally AMD :p

Actually the underdog is nvidia :)

AMD is only an underdog to Intel.

Via is a real underdog; you ought to rush out and embrace them.
 
Of course not. This is a speculation thread, remember? :)
Right, now that you mention it… ;)

Seriously: we shouldn't forget the possibility that, unit for unit and clock for clock, NV's new arch could theoretically be slower than before. It wouldn't even be without precedent: GeForce FX.
 
Right, now that you mention it… ;)

Seriously: we shouldn't forget the possibility that, unit for unit and clock for clock, NV's new arch could theoretically be slower than before. It wouldn't even be without precedent: GeForce FX.

Sure, but why is there a need to always assume the worst for NVIDIA? Always assuming the worst, which is usually what Charlie does, isn't really logical.

Unless you want them to fail, one expects that a new architecture brings actual improvements over the previous one. GeForce FX was bad, yes, just like R600 was for ATI, but those failures can't be the starting point for speculation about a new part from either company.
 
Again, I'm not assuming that what I called "GeForce 360" (384 SPs, etc, etc) is just a full Fermi chip, with some units disabled. I'm guessing that that part (with units disabled) is actually a GeForce 370, with 480 SPs, one or two ROPs disabled, etc.

I hope that this time, NVIDIA doesn't go around with "Core X" and/or "Reloaded" BS suffixes, to distinguish models from one another.

If the Fermi (or its salvage part) is neck and neck with the 5870, then NV's margins in the consumer gaming high-end market are toast.
 
And how do you figure that?
A single HD 5870 is on average 40-50% faster than a GTX 285. Even if I don't consider the obvious architectural differences that should improve performance overall, a GT200 with over 50% more Stream Processors (384), more memory bandwidth (though even @ 320 bit, certainly not 50% more), more ROPs and more TMUs should on average be 50% faster than a GTX 285.

This is why (look below). ;)

Funny how you're assuming that nVidia's ALUs will scale much, much, much better than ATI's (in your case, ideally).

That's like counting architectural improvements twice, when you don't even know how Fermi's modifications will impact how it takes on the graphics pipeline at large. Oh well, people are already saying amen, so others assume it's true. :LOL:


At 4200 MHz (same as Rys' speculation), the 320-bit GTX 360 will have 6% more bandwidth than the 285.
And again, if this is a salvage part @ 575 MHz core (a reasonable spec wrt the GTX 260a) or so, you again have a whopping 6% more texel and pixel fillrate.

The only real gain here is the substantial shader power increase - it's 1.6x the base ALUs and, say, 1600 MHz (VERY optimistic imo for a first-run salvage part; I'd put my money on 1400 MHz instead) - 8.5% more shader clock vs the GTX 285.

73% more shader power but nearly flat fills besides Z. My money's on the 5870 being at least 10+% faster (and this is the part of me that's not bullish on future Catalyst releases - the other 99% of me is).
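For what it's worth, here's a quick back-of-envelope check of those deltas. It's only a sketch: the GTX 285 numbers are the public reference specs, while the "GTX 360" figures (384 ALUs, 1600 MHz shader, 4200 MHz effective GDDR5 on 320 bit) are this thread's speculation, not confirmed specs, and per-ALU changes in Fermi are ignored.

```python
# Back-of-envelope check of the quoted deltas.
# GTX 285: public reference specs. "GTX 360": speculated salvage-part numbers.

def bandwidth_gbs(bus_bits, effective_mhz):
    """Memory bandwidth in GB/s from bus width and effective data rate."""
    return bus_bits / 8 * effective_mhz * 1e6 / 1e9

gtx285 = {"alus": 240, "shader_mhz": 1476, "bw": bandwidth_gbs(512, 2484)}
gtx360_guess = {"alus": 384, "shader_mhz": 1600, "bw": bandwidth_gbs(320, 4200)}

bw_gain = gtx360_guess["bw"] / gtx285["bw"] - 1
alu_gain = (gtx360_guess["alus"] * gtx360_guess["shader_mhz"]
            / (gtx285["alus"] * gtx285["shader_mhz"])) - 1

print(f"bandwidth:      {bw_gain:+.0%}")   # roughly +6%
print(f"ALU throughput: {alu_gain:+.0%}")  # roughly +73%, ignoring per-ALU changes
```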

And they have been better. Without doubling G80/G92, GT200 was usually over 50% faster than G80 and G92, especially at higher resolutions.
From the HD 3870 to the HD 4870 (where the ALU count alone increased 2.5x), plus the TMU increase, etc., the HD 4870 at best doubled the performance of an HD 3870 at higher resolutions. And now we are seeing what happened from the HD 4870 to the HD 5870, where the latter is on average 50-60% faster than the HD 4870, even though it doubles almost every spec of the HD 4870.

So yes, I am assuming the ALUs will scale better on NVIDIA's architecture than on ATI's, since they have done so far.

I don't think so.
The scaling in NVIDIA parts has actually been lower than in ATI parts. E.g. G94 is almost half a G92, but it performs much better than half a G92. That is due to:

- non-linear ALU-count-to-performance scaling
- other portions of the GPU not being scaled in the same proportion (e.g. bandwidth in the G94-G92 case)
 
I've just shown that a comparison of the HD 5770 and HD 5870 is more useful, and from that you can clearly see ~80% scaling achieved instead of the 100% theoretical. The gap is Amdahl's law, basically - perhaps with some driver immaturity thrown in for good measure.

The fact that this gap is so large bodes ill for GF100...
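To put a number on that, here's a minimal Amdahl's-law fit, reading "~80% scaling" as roughly 1.6x achieved against the 2x theoretical (that reading, and the clean 2x resource ratio, are assumptions):

```python
# Rough Amdahl's-law fit for the HD5770 -> HD5870 comparison above.
n = 2.0              # resource ratio, HD5870 : HD5770 (units and bandwidth)
achieved = 0.80 * n  # ~1.6x measured speedup, under the 80%-scaling reading

# Amdahl: speedup = 1 / ((1 - p) + p / n). Solve for p, the fraction of
# frame time that actually scales with unit count.
p = (1 - 1 / achieved) / (1 - 1 / n)
print(f"implied scaling fraction of frame time: {p:.0%}")  # ~75%; the rest is fixed cost
```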

Not sure why. The scaling of two very similar AMD parts has little bearing on the scaling of a new NVIDIA architecture vs the old.

If the Fermi (or its salvage part) is neck and neck with the 5870, then NV's margins in the consumer gaming high-end market are toast.

Do you have numbers to back that up? I always see people talking about margins, but the actual financial results of the companies should trump assumptions and speculation, no? AMD's die size advantage looks to be smaller this round, so it's kinda hard to tell who will have better margins until we see Fermi performance and pricing. Last I checked, AMD was selling two RV870s and 2 GB of GDDR5 for $600.
 
Hahaha! .. no.. he asked for proof, not an article Fuad wrote because some store clerk mailed him "This week we sold three GTX 295s versus one HD 4870X2"!
Yeah, Fudzilla itself doesn't count as a source, and the story doesn't cite any source either! :LOL:

Remember, Fudzilla just told us yesterday that Fermi was coming in January... if they're right on that one, mebbe I'll give them a bit more credit on this. ;)
 
Do you have numbers to back that up? I always see people talking about margins, but the actual financial results of the companies should trump assumptions and speculation, no? AMD's die size advantage looks to be smaller this round, so it's kinda hard to tell who will have better margins until we see Fermi performance and pricing.
I think that's just based on the assumption that if you have a die size which is still ~50% larger than what your competitor has, and chip manufacturing costs are significant, this should probably make some difference to your margins if you're forced to sell at a similar price to the competition. Though maybe this time the PCB doesn't have to be more complex (the part with disabled units could have only a 256-bit memory bus; even with 320 bit it might not be much more complex, though it would need more memory chips). Other than the chip and the memory interface width, there just doesn't seem to be much reason why one or the other card would be cheaper to build.
Though I guess if the top part is fast enough that the lower part can have quite a few units disabled and still slightly beat the HD 5870, this would allow nvidia to charge quite a bit more for the top part, improving margins significantly without making it a niche product.
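As a rough illustration of why the die-size gap feeds into the margin argument, here's a dies-per-wafer estimate using the usual approximation. The ~334 mm² (Cypress) and ~480 mm² (GF100) figures are the sizes being thrown around in this thread, not confirmed, and the formula ignores defects and yield entirely:

```python
import math

def gross_dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
    """Common approximation for die candidates per wafer (ignores defects/yield)."""
    r = wafer_diameter_mm / 2
    return (math.pi * r ** 2 / die_area_mm2
            - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

cypress = gross_dies_per_wafer(334)  # rumoured RV870/Cypress die size
gf100 = gross_dies_per_wafer(480)    # rumoured GF100 die size

print(f"Cypress candidates per wafer: {cypress:.0f}")  # ~175
print(f"GF100 candidates per wafer:   {gf100:.0f}")    # ~117
print(f"ratio: {cypress / gf100:.2f}x")                # ~1.5x
```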
 
And, in most cases, even the GTS 250 doesn't drop to half the FPS of the GTX 285, except in corner cases where only bandwidth matters.

GTX 285 vs GTS 250

Fillrate: +76%
Texturing: +10%
Bandwidth: +126%
Flops: +51%

Performance advantage ranges from 50-80% depending on settings. How you can paint that as worse scaling than 4870->5870 is beyond me.
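Those percentages fall straight out of the reference specs, e.g.:

```python
# Reproducing the quoted GTX 285 vs GTS 250 deltas from the reference specs.
gtx285 = dict(rops=32, tmus=80, core=648, sps=240, shader=1476, bus=512, mem_eff=2484)
gts250 = dict(rops=16, tmus=64, core=738, sps=128, shader=1836, bus=256, mem_eff=2200)

def delta(metric):
    return metric(gtx285) / metric(gts250) - 1

print(f"pixel fill: {delta(lambda c: c['rops'] * c['core']):+.0%}")       # ~+76%
print(f"texturing:  {delta(lambda c: c['tmus'] * c['core']):+.0%}")       # ~+10%
print(f"bandwidth:  {delta(lambda c: c['bus'] * c['mem_eff']):+.0%}")     # ~+126%
print(f"flops:      {delta(lambda c: c['sps'] * c['shader'] * 3):+.0%}")  # ~+51% (MADD+MUL)
```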
 
Not sure why. The scaling of two very similar AMD parts has little bearing on the scaling of a new NVIDIA architecture vs the old.
This gap only gets bigger as the performance of the enthusiast class card increases. It doesn't matter who makes the card ;)

Things like amount of memory, cache size/behaviour, driver efficiency are in NVidia's control, so there are opportunities to minimise the problem. e.g. if setup rate is a serious bottleneck in HD5870 but NVidia tackles this head-on...

20% is a serious amount. It cautions that we should be thinking in terms of a similar shortfall needing to be applied to GF100. We can only wait to see what it turns out to be. It'd be silly to assume it's 0. I agree that neither GTX285 nor HD5870 are useful baselines. But if we cautiously assume 80% of what theoreticals indicate based on estimated unit counts and clocks, we'll be better armed to understand how it does scale when it appears.

I don't know of a basis for estimating this gap. HD5870 might be a refresh too far, and a completely revised architecture might fix it, e.g. bring the gap down to 5%.

Separately, D3D11 games might show a marked reduction in the size of this gap, simply because with things like TS and CS, considerably more-efficient use of memory is possible.

Jawed
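Put as a recipe, the suggestion above is just a discount factor on the paper speedup; a minimal sketch, where the 2.1x figure is a placeholder for "whatever the estimated unit counts and clocks indicate", not a claimed GF100 spec:

```python
# Cautious working estimate: take the theoretical speedup implied by the
# estimated unit counts and clocks, then discount it by ~20%.
theoretical_speedup = 2.1  # placeholder paper speedup over the previous generation
scaling_efficiency = 0.8   # the cautious factor argued for above

print(f"working estimate: {theoretical_speedup * scaling_efficiency:.2f}x")  # ~1.68x
```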
 
Do you have numbers to back that up? I always see people talking about margins but the actual financial results of the companies should trump assumptions and speculation no? AMD's die size advantage looks to be smaller this round so it's kinda hard to tell who will have better margins until we see Fermi performance and pricing. Last I checked AMD was selling 2 RV870's and 2G of GDDR5 for $600.

AMD will still have a ~50% die-size advantage. Not to mention that if the 5850 can turn a profit at $259 (full supply price), so can the 5870 in the worst case (if the 5850 is EOL'ed), but NV can't.
 
AMD will still have a ~50% die-size advantage. Not to mention that if the 5850 can turn a profit at $259 (full supply price), so can the 5870 in the worst case (if the 5850 is EOL'ed), but NV can't.

You may say that about the HD 5870 (assuming the worst case for NVIDIA), but definitely not about the HD 5850. It's a salvage part of the full RV870 chip, and NVIDIA certainly won't be competing with it using a salvage part of the full GF100 chip. I'm guessing a new chip for a "GeForce 350" or "GeForce 340", with 256 SPs, half the ROPs and half the TMUs on a 256-bit bus, is probably what will compete with it.

Maybe that's the second chip that will be released (and not the GeForce 360), according to Fudzilla. That would mean the launch covers both the high-end and the mid-range portions of the market.

Going out on a limb and using G94 as an example, which is roughly half of G92 in every way (just like I'm speculating this "GeForce 350" to be, when compared to the full GF100 chip):

G92 @ 65 nm = 324mm2
G94 @ 65 nm = 240mm2

Assuming GF100 @ 40 nm = 480mm2 (as seems to be the most common speculation)
The chip that powers the "GeForce 350" (GF104?) @ 40 nm should be around ~355mm2.

If it competes or even beats the HD 5850 overall, then this segment will surely be interesting to follow.
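For reference, the ~355 mm² guess above is just the G94 : G92 area ratio applied to the rumoured GF100 size; every input except the G92/G94 die sizes is speculation from this thread:

```python
# Applying the G94 : G92 area ratio to the rumoured GF100 die size.
g92_mm2 = 324
g94_mm2 = 240
gf100_mm2_rumoured = 480  # common speculation in this thread, not confirmed

ratio = g94_mm2 / g92_mm2                 # ~0.74
gf104_guess = gf100_mm2_rumoured * ratio  # hypothetical "GeForce 350" chip (GF104?)
print(f"estimated die size: ~{gf104_guess:.0f} mm2")  # ~356 mm2
```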
 