NVIDIA Fermi: Architecture discussion

Why bring drivers into this? He said software, not drivers. I have seen several ECC implementations on GPUs, all of which did their job at the program level. Read this:

http://saahpc.ncsa.illinois.edu/09/sessions/day2/session2/Maruyama_presentation.pdf

Next time, try looking around a little; you will be surprised by what you find.

-Charlie

Do you think that a Fermi-based Tesla will use this application-based ECC checking? I reckon it will support ECC for all applications without requiring program modification.
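For anyone wondering what "application-based" or program-level ECC actually amounts to, here's a minimal sketch (my own illustration, not code from the linked slides and not anything NVIDIA ships): the application keeps its own redundancy - here one XOR parity word per 32-word block of device data - and recomputes and compares it itself, instead of the memory controller doing ECC transparently in hardware.

```cuda
// Minimal sketch of program-level error detection on a GPU.
// Purely illustrative: real software-ECC schemes (as in the linked work)
// use proper error-correcting codes, not bare parity.
#include <cstdio>
#include <cuda_runtime.h>

#define BLOCK_WORDS 32  // data words covered by one parity word

// Recompute the XOR parity of each block and count mismatches.
__global__ void check_parity(const unsigned int *data,
                             const unsigned int *parity,
                             int num_blocks,
                             int *error_count)
{
    int b = blockIdx.x * blockDim.x + threadIdx.x;
    if (b >= num_blocks) return;

    unsigned int p = 0;
    for (int i = 0; i < BLOCK_WORDS; ++i)
        p ^= data[b * BLOCK_WORDS + i];

    if (p != parity[b])
        atomicAdd(error_count, 1);  // some word in this block got corrupted
}

int main()
{
    const int num_blocks = 1024;
    const int n = num_blocks * BLOCK_WORDS;

    // Build test data and its per-block parity on the host.
    unsigned int *h_data = new unsigned int[n];
    unsigned int *h_parity = new unsigned int[num_blocks];
    for (int i = 0; i < n; ++i)
        h_data[i] = (unsigned int)i * 2654435761u;
    for (int b = 0; b < num_blocks; ++b) {
        h_parity[b] = 0;
        for (int i = 0; i < BLOCK_WORDS; ++i)
            h_parity[b] ^= h_data[b * BLOCK_WORDS + i];
    }

    unsigned int *d_data, *d_parity;
    int *d_errors, h_errors = 0;
    cudaMalloc(&d_data, n * sizeof(unsigned int));
    cudaMalloc(&d_parity, num_blocks * sizeof(unsigned int));
    cudaMalloc(&d_errors, sizeof(int));
    cudaMemcpy(d_data, h_data, n * sizeof(unsigned int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_parity, h_parity, num_blocks * sizeof(unsigned int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_errors, &h_errors, sizeof(int), cudaMemcpyHostToDevice);

    // In a real application you'd run this check after (or interleaved with)
    // your compute kernels, over the buffers you care about.
    check_parity<<<(num_blocks + 127) / 128, 128>>>(d_data, d_parity, num_blocks, d_errors);
    cudaMemcpy(&h_errors, d_errors, sizeof(int), cudaMemcpyDeviceToHost);
    printf("blocks with detected errors: %d\n", h_errors);

    cudaFree(d_data); cudaFree(d_parity); cudaFree(d_errors);
    delete[] h_data; delete[] h_parity;
    return 0;
}
```

A real software ECC scheme would use a correcting code (e.g. Hamming SEC-DED) rather than bare parity, but the division of labour is the same: the program, not the memory controller, owns the check - which is exactly why ECC "for all applications without modification" would have to be done in hardware.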
 
If they cannot hit their expected power and/or clock targets, then performance compared to Cypress will not be where NV would want it to be. And far worse, if their presumably big, fat die is not out of current-issue Cypress's reach by a healthy margin, then AMD might do to them what they did with RV790 to (most of) GT200b.

They don't have the option of just throwing another below-expectations part onto the market with no immediate successor in the pipeline, assuming they cannot go much bigger, denser or higher clocked on the current 40nm process and its options (of which I am not aware).

If yield is a critical issue for them and their coarse-grained redundancy (which relies on improving process maturity over time or on the availability of optical shrinks), then they're in it even deeper - at least from where I am sitting.

So they might need to do an A4 or, as Ailuros suggests, even a Bn, to improve on either or both fronts if A3 isn't up to scratch in terms of delivering their intended performance. With G80, they could afford to hold back a new stepping/model/respin - the Ultra - until just before the R600 launch. This time, a mediocre product, which might be on the market about six to eight months after the competition (I'm talking about Cypress here), will cost them dearly.
 
I think (but am not sure) that he's implying that if they were to go for a respin it would be a full respin. In that case I'd guess they'd be looking at a Bx-whatever rather than an "A4".

In any case, I'm not hearing anything reliable in the channel yet as to whether A3 is good enough to start production or not.

Basically, yes. If they can't get the clocks up after 3 tries for metal layer tweaks, a fourth probably won't produce miracles. It could, but I doubt it will in practice.

I am betting on a serious revamp of the architecture before a 28nm part. Either that, or NV will just suck it down in the benchmarks while proclaiming a crushing lead in some really odd benchmark that they do well in.

-Charlie
 
I have no reason to think that at all. These are just the ones that are documented - as of April 2009, mind you.

The relevant point however is that these are DP Xeons and not desktop CPUs. Do you think that is different now?

You don't think Intel would make a custom CPU (fused/specified, not silicon changes) for Google? You don't think they have special bins of Nehalems for HPC/dense server farm operations? You don't think they would make a special mobo for the same uses that have higher temp operations, more efficient components, and in general higher specced everything?

/me looks to his left and sees an HPC Nehalem rack server with all of that other than the specially fused Nehalems.

-Charlie
 
You don't think Intel would make a custom CPU (fused/specified, not silicon changes) for Google? You don't think they have special bins of Nehalems for HPC/dense server farm operations? You don't think they would make a special mobo for the same uses that have higher temp operations, more efficient components, and in general higher specced everything?

/me looks to his left and sees an HPC Nehalem rack server with all of that other than the specially fused Nehalems.

-Charlie

That certainly seems quite plausible. So basically you're saying you'd agree that the other guy's claim - that most server farms typically rely on budget single-socket desktop components - is unlikely?

Still, even lower-TDP, higher-efficiency components and the like don't do much to change the likelihood that any Nehalem-based solution will still be a factor or so off in computational density and power efficiency compared to a Fermi-based Tesla setup.
 
An A4 or B1 doesn't make sense at all. If A3 is crap, NVIDIA would concentrate on the refresh chip and release a "benchmark version" of A3 in limited quantities.

GF100 was planned for November 2009, so the refresh should already be planned for around summer.
And now for the really important question:
How much are these new gfx cards going to cost?
If Fermi is slower than Cypress it will cost less.
If Fermi is equal to Cypress it will cost almost the same.
If Fermi is faster than Cypress it will cost more.
 
Hopefully NV will make the smart move and price it the same, making Cypress obsolete, at least until AMD lowers prices. I think forcing them to do so would earn a lot of mind share, though.

But will they be able to do that? If GF100 launches in March, by then the HD 5870 could very well be priced around $300.
 
If you take GTX 295 performance as a target for Fermi (let's say that's realistic), what about $499? A 25% premium over the HD 5870 for the extra performance.
 
If you take GTX 295 performance as a target for Fermi (let's say that's realistic), what about $499? A 25% premium over the HD 5870 for the extra performance.

And there will be a GF100-based card directly targeting the HD 5870. They did the same with the 8800 GT (vs. the 3870) and the GTX 260-192 (vs. the 4870). I don't see why they wouldn't do it with GF100.
 
So they might need to do an A4 or, as Ailuros suggests, even a Bn, to improve on either or both fronts if A3 isn't up to scratch in terms of delivering their intended performance.
For yield, speed or power, an A4 won't do much. (I'm a bit on a mission to kill the idea in people's brains that metal spins have a lot of impact on those. ;)) A B spin is something else entirely, of course, but that would be a much longer-term solution.
 
As in it's the retail stepping for upcoming products, and there's no A4.
 
It's interesting looking at the clocks on the various NVIDIA 40nm parts, core/shader in MHz (in rough release order):
GT218: 589 / 1402
GT216: 625 / 1360
GT215: 550 / 1340
GT215 (OC): 585 / 1420
GF100 A1 (Hotboy): 495 / 1100
GF100 A2 (various): ??? / 1200-1350
GF100 A3 (NVIDIA): ??? / 1250-1400

They seem very clustered around 1300-1400 MHz for shaders (excluding GF100 A1, obviously). Assuming they keep the same core:shader ratio as GF100 A1, the core clock is looking like maybe 550-630 MHz? (Rough arithmetic below.)

Contrast that with the G9x generation, which covered 1375-1836 MHz.
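For what it's worth, the 550-630 guess falls straight out of the A1 core:shader ratio. Quick back-of-the-envelope check below - purely my own arithmetic on the numbers quoted above; the actual A2/A3 core clocks are unknown:

```cuda
// Rough estimate of GF100 core clock from the rumoured shader clocks,
// assuming the A1 core:shader ratio (495/1100) carries over to later spins.
#include <cstdio>

int main()
{
    const double core_a1   = 495.0;   // GF100 A1 core clock (MHz), from the list above
    const double shader_a1 = 1100.0;  // GF100 A1 shader clock (MHz)
    const double ratio     = core_a1 / shader_a1;  // ~0.45

    for (double shader = 1200.0; shader <= 1400.0; shader += 50.0)
        printf("shader %4.0f MHz -> core ~%3.0f MHz\n", shader, ratio * shader);
    // Prints roughly 540-630 MHz across the rumoured A2/A3 shader range.
    return 0;
}
```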
 