NVIDIA Fermi: Architecture discussion

Tech demos and benchmarks undoubtedly have their own value and deliver useful data. They shouldn't, however, influence someone's buying decision more than a long selection of game performance results (and yes, if the list of games is long enough you can include an equal number of titles that favor IHV X and an equal number that favor IHV Y), and in that regard I'm not excluding any benchmark, Futuremark's applications included.

After all I have the awkward tendency to buy a GPU as a mainstream consumer to play games on it and not endlessly masturbate at benchmark results.

Very true! Gaming is what matters most. But what happens when under-the-table agreements start entering the playing field and lawyers decide what I should play on what hardware?
 
Very true! Gaming is what matters most. But what happens when under-the-table agreements start entering the playing field and lawyers decide what I should play on what hardware?

Careful how I worded my proposal; I never said you should base your buying or gaming decisions on one or a handful of games with whatever awkward story behind them.

For the given case, today I'd buy a 58x0-whatever regardless of whether I can officially enable MSAA in Batman or not. I'd most likely use the same backdoor for enabling it as many others have and call it a day, instead of wasting energy and valuable time arguing about who the f is right or wrong in that one.
 
Um, I would buy the card that gives adequate performance at sufficient visual fidelity in the games I actually like to play, at the lowest cost, psolord. Why that card does so is irrelevant, whether it is due to sheer awesomeness, trickery, or something else. It is called satisficing.

edit upon reading his link:

Wow, I was unaware that Nvidia actually did the work for Eidos completely, as that link specifies. All AMD would have to do then, according to your link, is get off their lazy bums and write some code for Eidos. It is pretty sad that they haven't done that yet. That link diminishes AMD in my mind. I always assumed there was some other reason Eidos wasn't enabling it besides sheer laziness on AMD's part.
 
Hard to write code in a vacuum

And that's exactly why they should get in the game instead of shouting from the bleachers. This isn't about excusing Nvidia's behavior, but isn't the end goal to provide the best experience for ATI's customers? If that requires getting down in the trenches, then so be it.

What "coding" is necessary to enable MSAA in Batman:AA for ATi cards? Removal of a Vendor ID check? That's not coding.

You're doing the same thing ATi is doing. Blaming external parties for their woes.
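For illustration only, this is roughly the shape such a vendor ID gate takes in code. The DXGI calls are real, but the gating logic is a generic sketch and not Eidos' or Nvidia's actual implementation:

```cpp
#include <dxgi.h>
#include <cstdio>
#pragma comment(lib, "dxgi.lib")

// Illustrative only: read the primary adapter's PCI vendor ID via DXGI and
// gate a feature on it. Not the actual Batman:AA code, just the general idea.
int main()
{
    IDXGIFactory* factory = nullptr;
    if (FAILED(CreateDXGIFactory(__uuidof(IDXGIFactory), (void**)&factory)))
        return 1;

    IDXGIAdapter* adapter = nullptr;
    if (SUCCEEDED(factory->EnumAdapters(0, &adapter))) {
        DXGI_ADAPTER_DESC desc{};
        adapter->GetDesc(&desc);

        // Standard PCI vendor IDs: 0x10DE = NVIDIA, 0x1002 = AMD/ATI
        const bool exposeInGameMsaa = (desc.VendorId == 0x10DE);
        std::printf("VendorId 0x%04X -> in-game MSAA option %s\n",
                    desc.VendorId, exposeInGameMsaa ? "shown" : "hidden");
        adapter->Release();
    }
    factory->Release();
    return 0;
}
```

Whether removing such a check counts as "coding" is, of course, exactly the point being argued here.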

I wouldn't be so sure about this.

Well you can't be sure of anything at this point but I don't see much reason to be optimistic right now.
 
How do you see that? ATi could have written their own code and submitted it to Eidos; nV went and wrote their own code for it. Why do they have to support ATi cards, and test for them, when ATi should be fully capable of doing so? Instead ATi wanted Eidos to open up their code (and nV's), which Eidos' legal team said no to. Pretty straightforward English in those emails.
 
Funny that Ubisoft's legal team didn't stop a certain party from removing IHV specific features written by that same IHV from a certain game... :rolleyes:
 
Maybe GPUs live in an entirely different place, but whatever it is that you're describing is a complete fiction in my world.

The problems you tend to hit during silicon validation are not the ones that are easy to narrow down with a targeted test. Those are usually found and fixed during module simulations. You will typically rerun your chip-level verification suite on the real silicon, but that's the easy part: those tests are supposed to pass because they did so in RTL or on some kind of emulation device. The only reason you rerun them is to make sure that basic functionality is sane.

Hopefully you don't have any of those. What I was describing was logic bugs, and I do admit that those should be caught in simulation. I have heard of several cases where some escaped, especially considering the size of the GPU, and the complexity of a frame. Those are the quick and easy fixes that are nailed down fast.

The things you run into on the bench are hard-to-find corner cases. Something that hangs every 10 minutes. Or a bit corruption that happens every so often. They are triggered when you run real applications, system tests that are impossible to run even on emulation because it just takes too long. In telecom, this may be a prolonged data transfer that suddenly errors out or a connection that loses sync. In an SOC, some buffers that should not overflow suddenly do, or a crossbar locks up. A video decoder may hang after playing a DVD for 30 minutes. When these things trigger, a complicated dance starts to try to isolate the bugs. It can take days to do so. Very often there is a software fix, by setting configuration bits that, e.g., lower performance just a little and disable a local optimization, but sometimes there is not. These kinds of problems are present in every chip, but even if they're not, you need wall clock time to make sure they are not. Did I mention that sometimes you need to work around one such bug first before you can run other system tests?
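To make the "software fix by setting configuration bits" a bit more concrete, here is a minimal, entirely hypothetical sketch of the kind of chicken-bit workaround a driver or firmware might apply; the register offset and bit name are invented and would normally come from an errata sheet:

```cpp
#include <cstdint>

// Hypothetical post-silicon workaround applied by firmware or a driver.
// The register offset and bit are invented for illustration.
constexpr uint32_t CROSSBAR_CTRL    = 0x0420;   // invented MMIO offset
constexpr uint32_t DISABLE_PREFETCH = 1u << 3;  // invented "chicken bit"

// Pointer to the memory-mapped register block, set up during device init.
volatile uint32_t* g_mmio = nullptr;

void apply_a1_errata_workaround()
{
    // Disable the local prefetch optimization that triggers the rare
    // crossbar lockup seen on the bench: costs a little performance,
    // avoids waiting for a respin.
    uint32_t ctrl = g_mmio[CROSSBAR_CTRL / sizeof(uint32_t)];
    ctrl |= DISABLE_PREFETCH;
    g_mmio[CROSSBAR_CTRL / sizeof(uint32_t)] = ctrl;
}
```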

Yup, I totally agree, and I didn't mention this, but I wasn't trying to be exhaustive or write a book on the topic. There comes a point where you have to leave stuff out.

And then there's the validation of analog interfaces: making sure that your interface complies with all specifications under all PVT corners takes weeks even if everything goes right. (Corner lots usually arrive a week after the first hot lot, so your imaginary 2 week window has already gone down by half.) You need to verify the slew rates and driving strengths of drivers, the input levels of all comparators, the short-term and long-term jitter of all PLLs, etc.
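As a rough illustration of what "verifying under all PVT corners" involves on the bench, here is a hypothetical automation sketch; the corners, spec limits, and measurement stub are all invented, and a real setup would drive lab instruments instead:

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical bench sweep: check one analog parameter (driver slew rate)
// against invented spec limits across process/voltage/temperature corners.
struct Corner { std::string process; double vdd; double temp_c; };

double measure_slew_rate(const Corner& c)
{
    // Placeholder: a real bench would configure the pad and read a scope.
    return 3.0 + (c.vdd - 1.0) * 2.0 - (c.temp_c > 100.0 ? 0.3 : 0.0);
}

int main()
{
    const std::vector<Corner> corners = {
        {"SS", 0.95, 125.0}, {"SS", 0.95, -40.0},
        {"TT", 1.00,  25.0},
        {"FF", 1.05, 125.0}, {"FF", 1.05, -40.0},
    };
    const double spec_min = 2.0, spec_max = 4.0;  // V/ns, invented limits

    for (const auto& c : corners) {
        const double slew = measure_slew_rate(c);
        const bool pass = (slew >= spec_min && slew <= spec_max);
        std::printf("%s %.2fV %6.1fC: slew %.2f V/ns  %s\n",
                    c.process.c_str(), c.vdd, c.temp_c, slew,
                    pass ? "PASS" : "FAIL");
    }
    return 0;
}
```

Multiply that by every driver, comparator, and PLL on the chip, across every corner-lot part, and the weeks add up quickly.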

Yeah, this is why I mentioned non-logic bugs in the earlier post (not sure if it was this or another that you are referring to). That is where they are tearing their hair out now, or were if A2 fixes things.

The whole idea that you can do all that quickly and then do a respin in 2 weeks is laughable (and you'd be an idiot to do so anyway: you know there are stones left unturned if you do it too quickly). If everything goes well, 5 weeks is the bare minimum.

That does not mesh with the numbers I get from various companies. Not sure if the validation requirements are the same in telecom and GPUs, or what the level of resources thrown at the problem is, but I have been given the two week number by a lot of companies.

And what about the claim that you can fix logic bugs with just 2 metal layers (the real smoking gun, if ever there was one, that you really don't know what you're talking about, thank you)? This would mean that you're only changing the two upper metal layers out of 7 or 8, which is surprising because, as I'm sure you're aware, the wire density of M7 and M8 is low. Not a lot of interesting stuff happens at that level, so the chance of fixing anything remotely useful is very, very low.

Did I claim 2 layers? I do know about the density, I was referring to it more in reference to where the wafers were parked.

You usually get away with not having to touch M1, but in 99% of the cases, you don't even bother looking for a fix that doesn't touch M2. Respin time is 100% a function of the lowest layer you change. If you change M2, it doesn't matter that you also modify M3-M8. The majority of backup wafers are parked before M2 or V1.

OK, fair enough, I never found out where they were parked. I did ask, no one knew off the top of their head.

You may or may not have great insider info about tape-out dates and other dirty laundry. It's very entertaining, but please stay out of the kitchen?

I do have good knowledge there, watch the Fermi preso where JHH confirms it. :) I also admit I am not an EE, but I do know more than most who don't work in the field.

-Charlie
 
The truth is the truth. I can't change it.

If you would like to continue this discussion in its proper thread, please go to:

Batman: Arkham Asylum Demo and its amazing Nvidia ONLY MSAA technology!

What truth? Eidos locked them out. Eidos wanted them to come and verify AA worked on ATI hardware. AMD didn't send anyone. All Nvidia did was write code to enable AA in a game engine, UE3, that didn't have it beforehand. AMD is blaming Nvidia for something Eidos, not Nvidia, is doing.
 
What truth? Eidos locked them out. Eidos wanted them to come and verify AA worked on ATI hardware. AMD didn't send anyone. All Nvidia did was write code to enable AA in a game engine, UE3, that didn't have it beforehand. AMD is blaming Nvidia for something Eidos, not Nvidia, is doing.
What you're saying sounds good; it's just that it's factually incorrect.

Take this over to the proper thread and I'll explain to you why everything you said is patently false. :yep2:
 
Hopefully you don't have any of those. What I was describing was logic bugs, and I do admit that those should be caught in simulation. I have heard of several cases where some escaped, especially considering the size of the GPU, and the complexity of a frame. Those are the quick and easy fixes that are nailed down fast.
I'm talking about logic bugs also: it's easier to make something bug-clean at the block level than to do so at the chip level. But emulators are just not fast enough to hit enough corner cases by chance, so you just pray that you've covered everything. The size of the chip has little to do with it: it's about run time, which increases the chance that your logic hits a certain combination. It is very common for chips that are way smaller than GPUs to work fine for hours in a row in the lab and then just hang. It's impossible to run such tests in emulation, yet they're pretty much always pure logic bugs too. Almost as often, they can be worked around with firmware, but until you know that, you have to treat them as a bug that needs to be isolated and fixed.

Hoping that you'll find *and* fix these kinds of bugs in two weeks is just not realistic.

Yeah, this is why I mentioned non-logic bugs in the earlier post (not sure if it was this or another that you are referring to). That is where they are tearing their hair out now, or were if A2 fixes things.
What I'm trying to point out is that even if all analog cells are working 100% according to spec, you need weeks to find out that they do. It's not sufficient to bring up new silicon and see that it 'works'. You need to prove that the cells work under all corners with enough margin so that they also work when paired against a corner part of a companion chip (e.g. PCIe or RAM chips). It's a labor-intensive process that just needs its time.
You're suggesting that those non-logic bugs are a big, major problem that can wreck a schedule. They're no more so than logic bugs, because these problems are usually well isolated and every sane analog block has plenty of metal-programmed trimmers that can be adjusted to make fixes. A full base respin is almost never needed. It's a routine fix.

That does not mesh with the numbers I get from various companies. Not sure if the validation requirements are the same in telecom and GPUs, or what the level of resources thrown at the problem is, but I have been given the two week number by a lot of companies.
Telecom requirements are stricter than those for consumer products, but lab validation takes much longer. It's not unusual to do a first spin only after 3 months or so: the customer cares more about having some silicon that's good enough to start working on than about something that's perfect for production. First silicon may be in the critical path towards getting a product into full production, but the availability of production-quality silicon almost never is.

But two weeks after first silicon back is really insane. Let me restate again: corner lots only arrive at least one week after first silicon. You simply can't give silicon enough soak time in the lab to spin it that fast. And that's not even accounting for the fact that the last metal fix has to come in at least half a week before the second tape out, because LVS and DRC take at least that long.

Let's put this in a different way: if a company is spinning metal within 2 weeks of first silicon, it's unmistakeable proof that the initial silicon was so broken that you needed a quick fix just to be able to do enough of your regular validation work.

Did I claim 2 layers? I do know about the density, I was referring to it more in reference to where the wafers were parked.
You wrote: "From there, it is just making masks for a layer or two."
Since the number of layers is irrelevant with respect to the schedule (the only thing that counts is the lowest layer), I concluded that you were pointing to the upper two layers.

I do have good knowledge there, watch the Fermi preso where JHH confirms it. :) I also admit I am not an EE, but I do know more than most who don't work in the field.
I suppose learning about tape-out dates is a matter of having enough connections in Taiwan who're willing to talk. That's fine. It's up to the reader to decide if you and your sources can be trusted.
But when you make wild guesses about internal schedules that any engineer in the field knows are fantasy, you automatically taint those pieces of information that may actually be true. (The same argument can be made about the emotional detachment of the journalist, BTW.)
 
I agree with silent_guy that two weeks is crazy short for first-silicon debug. However, it might be enough time to verify whether a spin fixed the intended bugs. Aggressive management might enter risk production in this time frame, the assumption being that no new bugs were introduced by the fixes.
 
Hoping that you'll find *and* fix these kinds of bugs in two weeks is just not realistic.

Rather than make this into a massively long post, I will simply go by what I hear in the field, and what I have seen in practice in the GPU and CPU business. I know several GPU spins that have been in the 2 week range, and several, like Fermi and R600, that were longer.

Another thing I don't know if you are considering is that you can fix a lot of the corner cases in software, something that GPUs are notorious for doing. The verification standard is a lot lower than for CPUs or telecom-grade chips. The last ones I had full docs for were the Atari Jaguar chips, and they were buggy as hell, but it was worked around in practice. GPUs do a lot of the same.

-Charlie
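To illustrate the kind of software workarounds being described here, below is a hypothetical sketch of a per-revision workaround table such as a driver might carry; the device IDs, steppings, and flag names are all invented:

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical: a driver-side table of workarounds keyed by device ID and
// silicon stepping. All IDs, steppings, and flag names are invented.
struct Workaround {
    uint16_t device_id;
    uint8_t  revision;   // silicon stepping, e.g. 0xA1, 0xA2 (invented)
    uint32_t flags;      // bits consumed by lower driver layers
};

enum : uint32_t {
    WAR_SERIALIZE_BLIT  = 1u << 0,  // avoid a rare blit/3D race
    WAR_DISABLE_Z_COMPR = 1u << 1,  // sidestep a corruption corner case
};

static const std::vector<Workaround> kWorkarounds = {
    {0x1234, 0xA1, WAR_SERIALIZE_BLIT | WAR_DISABLE_Z_COMPR},
    {0x1234, 0xA2, WAR_SERIALIZE_BLIT},  // A2 silicon fixed the Z bug
};

uint32_t workaround_flags(uint16_t device_id, uint8_t revision)
{
    for (const auto& w : kWorkarounds)
        if (w.device_id == device_id && w.revision == revision)
            return w.flags;
    return 0;
}

int main()
{
    std::printf("A1 flags: 0x%08X\n", workaround_flags(0x1234, 0xA1));
    std::printf("A2 flags: 0x%08X\n", workaround_flags(0x1234, 0xA2));
    return 0;
}
```

Whether a given corner case can be hidden this way or needs a metal fix is exactly the judgment call that decides how long a respin really takes.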
 
It's unfortunate, Charlie, that you would think that you could drop your [removed colour] where real people are sitting.

I *personally* worked with the Jaguar and you are talking out of your [removed colour].

It had a few problems but you are massively overplaying them.

You are *way* out of your depth here.
 
I don't know anything about the Jaguar, but Charlie is correct that it's much easier to work around bugs in a GPU than many other types of chips. In this instance having a driver is a great thing for quick time to market. Of course not everything can be worked around via the driver in an acceptable manner.
 
Hai gais, what's going on. For the last time: if you want to bukake each other, do it in RPSC. No, really, go there and do whatever you want, but not here. If you feel one poster or another is wrong, correct him in a less colourful manner, without attacks. Now, back to discussing Fermi, kthx.
 