GF100 evaluation thread

rpg.314 · Mar 30, 2010

FrameBuffer said:
"Availability was already known to be pushed to April, so what's your point ?" You mean that HARD Launch you proclaimed the GTX400 series was going to be ?? Apparently someone still hasn't figured out the difference between soft/hard and paper launches yet..

Facebook/Twitter launch more like it. :???:

rpg.314 · Mar 30, 2010

DemoCoder said:
That's true, but IIRC REYES relies on sub-pixel tessellation/micropolygons, which causes inefficiencies in current GPUs, so total ALU flops alone is not a good measure of how well you'll perform on REYES. (e.g. even if 90% of the time is spent in shaders, you could end up with inefficient pipeline usage where big chunks of the chip are idle waiting for work)

Could deferred shading mitigate the 4x fragment shading penalty with micropolygons?

rpg.314 · Mar 30, 2010

Chalnoth said:
That's possible, but it does depend upon how much die space that particular feature took up.

The int32 mul, which is essential to half rate dp in fermi, almost assuredly takes as much area as the fp32 mul unit.

The number of fp32 and int32 multipliers is matched.

KimB · Mar 30, 2010

rpg.314 said:
The int32 mul, which is essential to half rate dp in fermi, almost assuredly takes as much area as the fp32 mul unit.

The number of fp32 and int32 multipliers is matched.

Without knowing how much area the FP32 mul takes up as a fraction of die size, we don't know how much area the FP64 functionality takes up on top of that.

Bear in mind, by the way, that the cost to nVidia for going half-rate FP64 was almost certainly much less than ATI, for one simple reason: ATI has more math units in the same area, while nVidia relies upon more efficient execution on fewer units. The smaller number of units means that nVidia is spending more die area on getting the info to the math units themselves than ATI, which means increasing the size of those math units doesn't do as much.

aaronspink · Mar 30, 2010

Chalnoth said:
Without knowing how much area the FP32 mul takes up as a fraction of die size, we don't know how much area the FP64 functionality takes up on top of that.

Bear in mind, by the way, that the cost to nVidia for going half-rate FP64 was almost certainly much less than ATI, for one simple reason: ATI has more math units in the same area, while nVidia relies upon more efficient execution on fewer units. The smaller number of units means that nVidia is spending more die area on getting the info to the math units themselves than ATI, which means increasing the size of those math units doesn't do as much.

Yes, but it is likely that each nvidia ALU requires more area due to upsizing of transistors and/or increased pipe stages in order to run at ~2x the frequency. Quarter rate, the way ATI does it, should be implementable with almost zero area increase, while half rate requires 2x the multiplier area.

KimB · Mar 30, 2010

aaronspink said:
Yes, but it is likely that each nvidia ALU requires more area due to upsizing of transistors and/or increased pipe stages in order to run at ~2x the frequency. Quarter rate, the way ATI does it, should be implementable with almost zero area increase, while half rate requires 2x the multiplier area.

Well, if I'm reading the architecture specs right, ATI also has more than 3x the math units on a 33% smaller die (512 vs. 1600). Per die area, then, ATI has somewhere around 5x the math units. I seriously doubt that the optimizations for higher clock rate take up 5x the die area.
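For anyone who wants to check the arithmetic behind that "around 5x", here is a minimal Python sketch. It uses only the figures quoted above (512 vs. 1600 units, die roughly 33% smaller) and assumes the advertised "math unit" counts are comparable one-to-one, which is itself a simplification. The function name is just for illustration.

```python
def units_per_area_ratio(units_a, units_b, area_a_over_b):
    """Return how many times more units per die area chip A has than chip B.

    area_a_over_b is chip A's die area divided by chip B's die area.
    """
    return (units_a / units_b) / area_a_over_b


# Figures as quoted in the post: 1600 vs. 512 units, ATI die ~33% smaller.
raw_ratio = 1600 / 512                      # a bit more than 3x the units
density_ratio = units_per_area_ratio(1600, 512, 1 - 0.33)

print(f"unit count ratio:      {raw_ratio:.2f}x")
print(f"units per area ratio:  {density_ratio:.2f}x")
```

The second figure lands a little under 5x, so "somewhere around 5x" is a fair rounding given how rough the inputs are.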

psolord · Mar 30, 2010

Squilliam said:
The HD 5970 is more comparable to the GTX 480 than to any SLI or Crossfire setup.

The HD 5970 and the GTX 480 are actually similar on almost all counts. They both use similar power, they both have a single PCB, similar size, shape and noise, and the only difference is whether someone has any negative feelings towards multi-GPU setups for whatever reason.

However between the HD 5970 and other multi-gpu solutions there are a few caveats which must be overlooked when thinking about the overall market.

1. Does the person have multiple PCI-E slots and the right power supply?
2. Does the person run on a platform amenable to Sli/Crossfire? Note* The 1156 pin P55 platform does not have enough PCI-E lanes and Nvidia boards do not do crossfire and AMD boards do not do SLI which leaves only the 1366 X58 platforms open to both.
3. Is the case big enough with enough airflow for 400W worth of graphics cards? 300W is bad but 400W is pushing most enthusiast cases quite hard especially when that 125W CPU may also be pushing north of 150W overclocked.

If you can run an HD 5970 you can run a GTX 480 and vice versa. Arguably the only difference is personal preference in relation to how multi-gpu setups perform.

"Not enough" is not accurate. Anandtech's 5870 crossfire test, showed that the 16X+16X vs 8X+8X configurations, have only 2-7% performance difference.

There are motherboards like the MSI Big Bang Trinergy (not to be confused with the Fusion) which sports a NF200 chip that gives a full complement of 32 PCIe 2.0 Lanes which can be divided at 16X+16X for the graphics cards (there's also the option for 16X+8X+8X). It works for both SLI and Crossfire and I should know since I own it as well as two 5850s.

The funny thing is that it's cheaper than quite a few vanilla high end P55 mobos (vanilla=No NF200).

rpg.314 · Mar 30, 2010

Chalnoth said:
Without knowing how much area the FP32 mul takes up as a fraction of die size, we don't know how much area the FP64 functionality takes up on top of that.

FP ALUs certainly form a substantial portion of SMs prior to Fermi, though I don't have a hard number.

Chalnoth said:
Bear in mind, by the way, that the cost to nVidia for going half-rate FP64 was almost certainly much less than ATI, for one simple reason: ATI has more math units in the same area, while nVidia relies upon more efficient execution on fewer units. The smaller number of units means that nVidia is spending more die area on getting the info to the math units themselves than ATI, which means increasing the size of those math units doesn't do as much.

:|
OTOH, I think nvidia is paying a substantial penalty for half rate dp. It needs int32 mul that nobody else cares for, and certainly not in matched ratios. ATI's dp is as good as free as they reuse the xyzw lanes to make one dp mul for 4 sp mul's. ATI does quarter rate dp and that is the highest ratio you can do without paying dp cost somewhere.

Arty · Mar 30, 2010

Just wanted to throw this out since I was being called out. I picked a random review (Firingsquad) and, after running the numbers through, the results were:

GTX480 54% faster than GTX285
5870 60% faster than 4890

By the same yardstick that Cypress was concluded to be "meh", GF100 is less than even a "meh", in "WTF" territory, since it's less :!:

Unless the obvious double standards show up again.

KimB · Mar 30, 2010

rpg.314 said:
FP ALUs certainly form a substantial portion of SMs prior to Fermi, though I don't have a hard number.

That's the problem. Hard numbers are important here! If it only cost them, say, 5% of die area vs. doing it ATI's way, then it really isn't a big deal, is it?

rpg.314 said:
:|
OTOH, I think nvidia is paying a substantial penalty for half rate dp. It needs int32 mul that nobody else cares for, and certainly not in matched ratios. ATI's dp is as good as free as they reuse the xyzw lanes to make one dp mul for 4 sp mul's. ATI does quarter rate dp and that is the highest ratio you can do without paying dp cost somewhere.

It might be substantial as a fraction of the die area allocated to the math units themselves, but if nVidia's math units take up less than half the die area of ATI's, then it's a much lower total cost than if ATI had tried the same thing. That's what I'm trying to say.
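To put that argument in numbers, here is a minimal Python sketch. The die fractions and the 25% ALU growth are made up purely for illustration (nobody in this thread has hard numbers); the only point is that the same relative growth in the math units costs less total die area when those units are a smaller share of the die.

```python
import math


def die_area_increase(alu_fraction, alu_growth):
    """Fractional increase in total die area when the math units,
    which occupy alu_fraction of the die, grow by alu_growth."""
    return alu_fraction * alu_growth


# Hypothetical inputs, NOT measured values.
ALU_GROWTH = 0.25            # assume half-rate FP64 makes the ALUs 25% bigger
SMALL_ALU_SHARE = 0.20       # a design where ALUs are 20% of the die
LARGE_ALU_SHARE = 0.40       # a design where ALUs are 40% of the die

small_share_cost = die_area_increase(SMALL_ALU_SHARE, ALU_GROWTH)
large_share_cost = die_area_increase(LARGE_ALU_SHARE, ALU_GROWTH)

print(f"ALUs at 20% of die: +{small_share_cost:.1%} total die area")
print(f"ALUs at 40% of die: +{large_share_cost:.1%} total die area")
```

With these placeholder inputs the first design grows by about 5% and the second by about 10%, which is the "half the share, half the cost" relationship described above.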

Ninjaprime · Mar 30, 2010

Arty said:
Just wanted to throw this out since I was being called out. I picked a random review (Firingsquad) and, after running the numbers through, the results were:

GTX480 54% faster than GTX285
5870 60% faster than 4890

By the same yardstick that Cypress was concluded to be "meh", GF100 is less than even a "meh", in "WTF" territory, since it's less.

Unless the obvious double standards show up again.

... How much time after the 4890 launched did the 5870 launch? Now do the same for the GTX 285 vs the GTX 480. 4890 ---> 5870, ~6 months. GTX 285 ---> GTX 480, ~15 months. Maybe that's the problem.
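A rough way to fold the time gap into the comparison is a compound per-month rate. This is only a sketch: it takes the Firingsquad percentages quoted above and the approximate launch gaps at face value, and the function name is made up for illustration.

```python
def monthly_gain(speedup_pct, months):
    """Compound monthly performance gain implied by a total speedup
    (in percent) achieved over the given number of months."""
    return (1 + speedup_pct / 100) ** (1 / months) - 1


# Numbers as quoted in the thread (one review, approximate gaps).
ati_rate = monthly_gain(60, 6)     # 4890 -> 5870, ~6 months
nv_rate = monthly_gain(54, 15)     # GTX 285 -> GTX 480, ~15 months

print(f"4890 -> 5870:       {ati_rate:.1%} per month")
print(f"GTX 285 -> GTX 480: {nv_rate:.1%} per month")
```

On those inputs the ATI step works out to roughly 8% per month and the nVidia step to roughly 3% per month, so the gap in launch cadence matters a lot more than the 54% vs. 60% headline figures suggest.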

Arty · Mar 30, 2010

Ninjaprime said:
... How much time after the 4890 launched did the 5870 launch? Now do the same for the GTX 285 vs the GTX 480. 4890 ---> 5870, ~6 months. GTX 285 ---> GTX 480, ~15 months. Maybe that's the problem.

Didn't you know, time is irrelevant. (along with money, power, heat & noise)

Shame on ATI for launching so soon. Investing in AMD GPUs is a bad idea since they get outdated pretty quickly.

air_ii · Mar 30, 2010

Silus said:
Which proves my point ? They wanted to release it in October (before the HD 5970 was launched - released in November). If they had launched it in October, they would have the fastest graphics card. Since they didn't, they can only claim fastest GPU. Isn't that surprising

You seem to be forgetting (as usual) what ATI also said about RV870 and didn't deliver (performance wise). But obviously you only have problems with NVIDIA's PR

Do you really believe that back in September they didn't know they wouldn't make it for October launch? Yet all those statements quoted before were made after Cypress' launch.

Silus · Mar 30, 2010

air_ii said:
Do you really believe that back in September they didn't know they wouldn't make it for October launch? Yet all those statements quoted before were made after Cypress' launch.

AFAIK, Cypress != Hemlock, so fastest graphics card after Cypress is correct. After Hemlock, only fastest GPU makes sense...

Anyway, whatever...enough with this. It's quite pointless to discuss anything when the people involved have their minds set on something completely different.

dizietsma · Mar 30, 2010

I'm sure it is about conditioning in relation to "meeting expectations". The 58xx series met expectations on performance for the most part and also exceeded them on power and temperature. Therefore people were conditioned to expect lower temps and power on the smaller process, which is why they now object to the 4XX's extra temps and power. If the 58XX had still been as hot as the 4XXX series it wouldn't be such a big deal.

That's the problem when you come out of the blocks 2nd: you are a lot more likely to disappoint than to blow people away.

KimB · Mar 30, 2010

dizietsma said:
I'm sure it is about conditioning in relation to "meeting expectations". The 58xx series met expectations on performance for the most part and also exceeded them on power and temperature. Therefore people were conditioned to expect lower temps and power on the smaller process, which is why they now object to the 4XX's extra temps and power. If the 58XX had still been as hot as the 4XXX series it wouldn't be such a big deal.

That's the problem when you come out of the blocks 2nd: you are a lot more likely to disappoint than to blow people away.

Indeed, though I am also beginning to suspect that nVidia missed their performance target somewhat with this part. Here's hoping that later iterations of the chip have significantly higher performance per die area (and per watt...).

Silus · Mar 30, 2010

Chalnoth said:
Indeed, though I am also beginning to suspect that nVidia missed their performance target somewhat with this part. Here's hoping that later iterations of the chip have significantly higher performance per die area (and per watt...).

I think it's obvious that they did miss their targets. Clocks are certainly not high enough (especially compared to GT200b) and they certainly didn't want to come out with their flagship with units disabled. Fixing either would, without a doubt, have increased performance, had it been possible.
Still, power draw is the biggest problem I see that they need to address in future iterations. If the GTX 480 drew less power under load (let's say 50 W less), the results would be far more appealing.

Silus · Mar 30, 2010

Tech-Report has yet to publish their review. Was hoping to see it yesterday, but nothing...

Btw, is Rys working with Scott on this one ? (Since he had written a piece for TR a while ago)

rpg.314 · Mar 30, 2010

http://www.fudzilla.com/content/view/18277/34/

Back in October Nvidia has created a huge hype for its 512 shader compute chip and people were ordering these cards like there was no tomorrow.

Distributors and etail / retail stores have already ordered far more cards than partners can provide.

Our advice to anyone who's after Fermi cards is that you better be ready to buy one as soon as you see it listed.

If it's this limited, it makes me wonder how much truth there is in Charlie's yield claims, despite the ridiculously low yields.

KimB · Mar 30, 2010

rpg.314 said:
If it's this limited, it makes me wonder how much truth there is in Charlie's yield claims, despite the ridiculously low yields.

Yield numbers in the 70% range are what I've heard through the grapevine, and make far more sense economically than some of the more ridiculous estimates out there.

Whatddya think?

Yay! for both

480 roxxx, 470 is ok-ok

Meh for both

480's ok, 470 suxx

WTF for both
