NVIDIA Fermi: Architecture discussion

AMD doesn't go "our cards pretty much suck, just buy Nvidia" when nvidia has the lead.

That's true but Nvidia is notorious for talking big and then not backing it up. Based on the very limited information we have so far it's hard to tell exactly how well Fermi will do vs GT200. No word on texturing so far. If they need to replace MADs with a MUL+ADD due to precision issues with FMA that could reduce Fermi's shader throughput too. Can't see them maxing out GDDR5 so the bandwidth advantage will be pretty limited given the narrower bus.

It's not an impossible task that they face though. The mighty 5970 is averaging around an 85% advantage over the GTX 285. So the only question is whether Fermi gets anywhere near 2xGT200 speeds. How they do that with only ~50% more off-chip bandwidth is beyond me. Hard to say what their chances are without knowing more about the arch.
 
Is it really a wait when AMD can't seem to get cards on shelves either?


Stock is just selling out as fast as it comes in. You can make a pre-order, and you'll still get your card a week later. That's what I did for a build about a week after launch. At no time were the cards listed as in-stock, but they still got sent out when it got to your turn in the queue. Sure, you can't buy a card today and have it arrive tomorrow, but it's not difficult to get a card in a week or so.
 
Stock is just selling out as fast as it comes in. You can make a pre-order, and you'll still get your card a week later. That's what I did for a build about a week after launch. At no time were the cards listed as in-stock, but they still got sent out when it got to your turn in the queue. Sure, you can't buy a card today and have it arrive tomorrow, but it's not difficult to get a card in a week or so.


That's true, but that just shows they aren't able to produce enough, and what does that come down to? Demand outstrips supply, but this was the first quarter we've seen increased GPU demand, and it's still lower than previous highs, so they still have major production issues.
 
One more reason to put up or shut up. Limited availability is still better than nothing.

Hmm, doesn't really matter I would say. nV wanted to launch in Nov, it slid till Dec, availability in Jan as we know for now. It's a 3 month difference from ATi's launch, what's the big deal? If we start seeing it slip to Q1 or Q2 availability, or AMD resolves their stock issues and nV still doesn't have a card, I can see a problem at that point.
 
Hmm, doesn't really matter I would say. nV wanted to launch in Nov, it slid till Dec, availability in Jan as we know for now. It's a 3 month difference from ATi's launch, what's the big deal? If we start seeing it slip to Q1 or Q2 availability, or AMD resolves their stock issues and nV still doesn't have a card, I can see a problem at that point.

AMD has been expected to fix the supply issues by Dec. So, I'd say it matters.
 
Hmm, doesn't really matter I would say. nV wanted to launch in Nov, it slid till Dec, availability in Jan as we know for now. It's a 3 month difference from ATi's launch, what's the big deal? If we start seeing it slip to Q1 or Q2 availability, or AMD resolves their stock issues and nV still doesn't have a card, I can see a problem at that point.

I'm talking about the pre-launch GF100 details NVIDIA is channeling; of course I can understand the PR necessity from NV's side to keep interest alive, it's just that I personally think it's completely redundant.

That's completely irrelevant to the above, but since you're going into another realm I'd say that NV would have launched even in late October if they could, and finally launching in January, for example, doesn't guarantee - as the situation looks right now - that GF100 availability will be any better than it is for AMD's Cypress and Hemlock GPUs at the moment. Under that reasoning you'd have to make fairly safe predictions about how availability will look for any future timeframe X before I could answer whether it "matters" or not.
 
That's true, but that just shows they aren't able to produce enough, and what does that come down to? Demand outstrips supply, but this was the first quarter we've seen increased GPU demand, and it's still lower than previous highs, so they still have major production issues.

Oh, I agree, but the situation isn't so bad that it's going to hurt AMD too much or that people are going to throw their hands in the air and decide to wait months longer for whenever Fermi is going to show up.

If people are using the 5800 constraint as some kind of excuse as to why the Fermi delay isn't really going to hurt Nvidia... well I don't think that's really the case at all. Fermi being very late does give a lot of the general/gaming market to the 5800 range, and that will hurt Nvidia.

It will be even worse by the end of the year when 5800 production picks up to feed the demand, and Fermi is still months away. AMD will make hay, and then cut prices if necessary.

Nvidia will either have to cut prices to painful levels, or have something quite remarkable in the mainstream version of Fermi. Or they are really only interested in the high margin HPC market and will focus their efforts there.
 
I'm not seeing any evidence so far that points to RV870 saturating the high-end market in the next few months. Certainly not to the point that G80 did, but time will tell.
 
Oh, I agree, but the situation isn't so bad that it's going to hurt AMD too much or that people are going to throw their hands in the air and decide to wait months longer for whenever Fermi is going to show up.

If people are using the 5800 constraint as some kind of excuse as to why the Fermi delay isn't really going to hurt Nvidia... well I don't think that's really the case at all. Fermi being very late does give a lot of the general/gaming market to the 5800 range, and that will hurt Nvidia.

It will be even worse by the end of the year when 5800 production picks up to feed the demand, and Fermi is still months away. AMD will make hay, and then cut prices if necessary.

Nvidia will either have to cut prices to painful levels, or have something quite remarkable in the mainstream version of Fermi. Or they are really only interested in the high margin HPC market and will focus their efforts there.


AMD has been expected to fix the supply issues by Dec. So, I'd say it matters.

To tell ya the truth, nV isn't launching because of TSMC's issues, nothing else to it right now. ;) TSMC has to fix AMD's supply issues and nV's issues as well. AMD has nothing much to do with their problems right now.
 
I'm talking about the pre-launch GF100 details NVIDIA is channeling; of course I can understand the PR necessity from NV's side to keep interest alive, it's just that I personally think it's completely redundant.

That's completely irrelevant to the above, but since you're going into another realm I'd say that NV would have launched even in late October if they could, and finally launching in January, for example, doesn't guarantee - as the situation looks right now - that GF100 availability will be any better than it is for AMD's Cypress and Hemlock GPUs at the moment. Under that reasoning you'd have to make fairly safe predictions about how availability will look for any future timeframe X before I could answer whether it "matters" or not.


Oh sorry, misunderstood where you were coming from.

Yes, that's my take on the availability issues as well.
 
That's true but Nvidia is notorious for talking big and then not backing it up. Based on the very limited information we have so far it's hard to tell exactly how well Fermi will do vs GT200. No word on texturing so far. If they need to replace MADs with a MUL+ADD due to precision issues with FMA that could reduce Fermi's shader throughput too.
Seems very unlikely. FMA is trickier to implement than MAD because it has more precision, and from the point of view of graphics you can't have "too much precision" due to FMA in comparison with MAD.

In R800 FMA is only available on the X,Y,Z,W lanes. All five lanes have MAD. Seems like a cost-cutting measure - the availability of FMA might have something to do with the DP implementation too, i.e. there are extra bits there anyway, so they got used for FMA.

Can't see them maxing out GDDR5 so the bandwidth advantage will be pretty limited given the narrower bus.
GT215 has 8 ROPs for its 128-bit GDDR5 bus. GF100 seems likely to have twice the ROPs per channel, i.e. 48. I expect it'll chew through the extra bandwidth quite happily.

It's not an impossible task that they face though. The mighty 5970 is averaging around an 85% advantage over the GTX 285.
Even more with 8xMSAA at 1920, which seems like a very reasonable setting, particularly as it's such a popular monitor size.

You could argue that GF100 only has to be faster than GTX295 to sell out, i.e. >30-50% faster.

So the only question is whether Fermi gets anywhere near 2xGT200 speeds. How they do that with only ~50% more off-chip bandwidth is beyond me. Hard to say what their chances are without knowing more about the arch.
With 8xMSAA I bet GF100 catches up - I think ROP efficiency is likely to be a high priority since it's clear it could be better. Also, there are a few new games between now and GF100's launch. By that time the reviews should be using fewer of the old games that "minimise" the differences with these newer cards. Also D3D11 introduces new kinds of bottlenecks for ATI and NVidia to compete on. Additionally, of course, D3D11 code can't be benchmarked against GTX285, so in that sense it's almost immaterial if GF100 is 30% or 80% faster...

Jawed
 
AMD does plenty of talking with their CPU side and I feel pretty sure it will permeate across.

Actually, the people who did that at AMD were promoted to the AMD Outer Mongolia research lab. It is across the street (such as it is) from the Intel Outer Mongolia research lab, where the people behind the P4 are toiling away on their next big idea.

If you didn't notice, AMD product direction has been taken over by many of the same people who fixed ATI.

-Charlie
 
Seems very unlikely. FMA is trickier to implement than MAD because it has more precision, and from the point of view of graphics you can't have "too much precision" due to FMA in comparison with MAD.

Perhaps, but several people have raised concerns about "too much precision" causing artifacts in certain situations.

GT215 has 8 ROPs for its 128-bit GDDR5 bus. GF100 seems likely to have twice the ROPs per channel, i.e. 48. I expect it'll chew through the extra bandwidth quite happily.

Oh I wasn't talking about its capacity to saturate the bus. I mean they won't be maxing out theoretical bandwidth of currently available GDDR5 modules. I think the best we can hope for is 4.9 Gbps which isn't too bad I guess since that puts it within a few % pts of HD5970's 256GB/s.
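Back-of-the-envelope, that looks something like this (a purely illustrative sketch; the 384-bit bus and the 4.9 Gbps figure are the assumptions above, not confirmed specs):

/* Rough bandwidth comparison.  The 384-bit GF100 bus and 4.9 Gbps GDDR5
 * figure are assumptions, not confirmed specs; HD5970 is 2 GPUs x 256-bit
 * at 4.0 Gbps. */
#include <stdio.h>

int main(void)
{
    double gf100  = 384.0 / 8.0 * 4.9;         /* 48 bytes/transfer x 4.9 Gbps ~ 235 GB/s */
    double hd5970 = 2.0 * (256.0 / 8.0) * 4.0; /* two GPUs at 128 GB/s each = 256 GB/s    */

    printf("GF100 (assumed) : %.0f GB/s\n", gf100);
    printf("HD5970 (total)  : %.0f GB/s\n", hd5970);
    return 0;
}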

Even more with 8xMSAA at 1920, which seems like a very reasonable setting, particularly as it's such a popular monitor size.

Yeah but given the obvious cliff at 8xAA for GT200 I'm not sure that's a valid "average case" setting.

By that time the reviews should be using fewer of the old games that "minimise" the differences with these newer cards. Also D3D11 introduces new kinds of bottlenecks for ATI and NVidia to compete on. Additionally, of course, D3D11 code can't be benchmarked against GTX285, so in that sense it's almost immaterial if GF100 is 30% or 80% faster...

There's no reason to expect that Fermi won't benefit from DX11 as much as RV870 does. In any case DX10/10.1 is still going to be the target for the majority of titles used in reviews in the next few months.
 
Yeah but given the obvious cliff at 8xAA for GT200 I'm not sure that's a valid "average case" setting.

8xAA is certainly as valid as 4xAA. But trying to make a general comparison between architectures while evaluating only one of the two always leaves a bitter taste of cherry-picking for me. You could also take 2560 with 8xAA, where through sheer lack of vidmem the GTX 295 merrily destroys itself.

If you're really into making single-number ratings out of your data sets, then you should maybe weight different resolutions and AA settings according to market penetration. I think when we did the last quickpoll in our community this year, there was a ~37 to ~18 percent "majority" for various 4xAA settings. Likewise, there were still almost twice as many people usually playing at 1680 compared to 1920.
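A toy sketch of what such a usage-weighted rating could look like (the fps numbers and weights below are made up purely for illustration, not our poll data):

/* Toy usage-weighted performance index: weight each tested setting by how
 * common it is among readers.  All fps values and weights are illustrative. */
#include <stdio.h>

int main(void)
{
    const char  *setting[] = { "1680 4xAA", "1680 8xAA", "1920 4xAA", "1920 8xAA" };
    const double fps[]     = {       95.0,        80.0,        60.0,        45.0 };
    const double weight[]  = {       0.45,        0.20,        0.25,        0.10 }; /* sums to 1 */

    double index = 0.0;
    for (int i = 0; i < 4; ++i) {
        printf("%-10s %5.1f fps  (weight %.2f)\n", setting[i], fps[i], weight[i]);
        index += weight[i] * fps[i];
    }
    printf("usage-weighted index: %.1f fps\n", index);
    return 0;
}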

But the trouble is getting the data for each specific target audience. I wouldn't be surprised, for example, if adoption of 30-inch LCDs is in the double-digit percentages among people buying or considering a multi-GPU graphics card.
 
Perhaps, but several people have raised concerns about "too much precision" causing artifacts in certain situations.
Were they graphics programmers?

Regardless, if the hardware can do FMA then MAD is a rounding after the MUL. Admittedly this would appear to be extra circuitry beyond just doing rounding after FMA completes, but if it's as important as you say then it'd be a small cost worth paying.

Me, I just don't see the issue, it's a complete mystery.
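For what it's worth, the difference being debated is just the intermediate rounding; a minimal toy example in C (my own values, nothing to do with any particular GPU) shows a single-rounded FMA and a twice-rounded MUL+ADD disagreeing in the last bit. Compile with contraction disabled so the compiler doesn't fuse the MUL+ADD itself, e.g. gcc -O0 -ffp-contract=off fma_vs_mad.c -lm:

#include <math.h>
#include <stdio.h>

int main(void)
{
    float a = 1.0f + ldexpf(1.0f, -12);     /* 1 + 2^-12     */
    float b = a;
    float c = -(1.0f + ldexpf(1.0f, -11));  /* -(1 + 2^-11)  */

    float mad   = a * b + c;      /* product rounded to float, then added   */
    float fused = fmaf(a, b, c);  /* exact product kept, one final rounding */

    printf("MUL+ADD: %.9e\n", mad);    /* 0.0                   */
    printf("FMA    : %.9e\n", fused);  /* 2^-24, about 5.96e-8  */
    return 0;
}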

Oh I wasn't talking about its capacity to saturate the bus. I mean they won't be maxing out theoretical bandwidth of currently available GDDR5 modules.
They're effectively the same, I don't know what you mean. If GF100 is under-clocked and/or doesn't get ROP efficiency gains, then yeah.

Yeah but given the obvious cliff at 8xAA for GT200 I'm not sure that's a valid "average case" setting.
That didn't stop comparisons of 4xMSAA being used in HD2900XT versus 8800GTX :p 4xMSAA isn't enough, particularly not for the big pixels on a 24" or 27" 1920 wide monitor.

There's no reason to expect that Fermi won't benefit from DX11 as much as RV870 does. In any case DX10/10.1 is still going to be the target for the majority of titles used in reviews in the next few months.
We'll see I guess - something similar happened with R520. If it had launched on time the game-mix would have made it look a lot worse than the game-mix did when it launched.

Jawed
 
Were they graphics programmers?

Regardless, if the hardware can do FMA then MAD is a rounding after the MUL. Admittedly this would appear to be extra circuitry beyond just doing rounding after FMA completes, but if it's as important as you say then it'd be a small cost worth paying.

I believe dkanter raised the issue at one point and also pointed out that Fermi did not have the option to do the intermediate rounding. Don't think he's a graphics programmer but nobody has stepped up to say that it's NOT an issue either.

They're effectively the same, I don't know what you mean. If GF100 is under-clocked and/or doesn't get ROP efficiency gains, then yeah.

How is it the same? Increasing efficiency doesn't guarantee that you won't be bandwidth limited.

That didn't stop comparisons of 4xMSAA being used in HD2900XT versus 8800GTX :p 4xMSAA isn't enough, particularly not for the big pixels on a 24" or 27" 1920 wide monitor.

Except 2900XT was slow at nearly everything, not just one setting.

8xAA is certainly as valid as 4xAA. But trying to make a general comparison between architectures while evaluating only one of the two always leaves a bitter taste of cherry-picking for me. You could also take 2560 with 8xAA, where through sheer lack of vidmem the GTX 295 merrily destroys itself.

Picking a setting where one architecture has an obvious performance cliff is cherry picking. Using 4xAA isn't cherry picking since performance at that setting typically scales in line with performance at other settings - 0xAA, different resolutions etc. 8xAA is the outlier.
 
If you're really into making single-number ratings out of your data sets, then you should maybe weight different resolutions and AA settings according to market penetration. I think when we did the last quickpoll in our community this year, there was a ~37 to ~18 percent "majority" for various 4xAA settings. Likewise, there were still almost twice as many people usually playing at 1680 compared to 1920.
Gamers are notorious for lagging on use of MSAA though. If they're bitten by the bad performance hit it has caused in the latest games over most of the last 8 or so years, then I admit you can't blame them for leaving it off. Alpha and specular aliasing are also such eyesores that MSAA can seem like it's not worth the bother - "I've turned on 4xMSAA, so now my game's slower, but the aliasing still hurts my eyes!"

But the trouble is getting the data for each specific target audience. I wouldn't be surprised, for example, if adoption of 30-inch LCDs is in the double-digit percentages among people buying or considering a multi-GPU graphics card.
Ooh, definitely agreed. Eyefinity's more of the same and it seems to defeat HD5970 quite easily. Though I think that's because it's a kludge dreamed up after the hardware was designed. Otherwise there'd be none of this fucking around with Dell mini-DP adaptors. But that's for another thread.

Jawed
 