3D Chip Validation

swaaye

Here's something that I don't think anyone has asked before.

How do the IHVs validate the chips they manufacture? Do they test each function of the chips? I remember back in the day, with the original Radeons, people were having problems with the HyperZ functionality (I did): some cards did it perfectly, whereas others had visual artifacts due to Z problems.

It seems like it would be a huge task to try to make sure every chip works perfectly with every possible computation.
 
It is my understanding that every single ASIC produced is plugged into a socket on a ridiculously expensive machine that can run thousands and thousands of test vectors within a very short space of time.

The tests cover every functional block and based on their results, the ASICs can be quickly binned based on functionality and stability per clock.

MuFu.
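A rough sketch of the binning idea in Python. The clock steps, the bin names and the `passes_at` callback are all made up for illustration; real tester flows are proprietary and far more involved than this.

```python
from typing import Callable, Optional, Sequence, Tuple

def highest_stable_clock(passes_at: Callable[[int], bool],
                         clock_steps_mhz: Sequence[int]) -> Optional[int]:
    """Return the fastest clock at which the part passes its vectors, or None."""
    best = None
    for mhz in sorted(clock_steps_mhz):
        if passes_at(mhz):
            best = mhz
        else:
            break  # assumption: a part that fails at one speed also fails above it
    return best

def assign_bin(max_mhz: Optional[int], bins: Sequence[Tuple[int, str]]) -> str:
    """bins holds (minimum_mhz, name) pairs; the fastest matching bin wins."""
    if max_mhz is None:
        return "reject"
    for min_mhz, name in sorted(bins, reverse=True):
        if max_mhz >= min_mhz:
            return name
    return "reject"

# Hypothetical speed grades and a fake part that is stable up to 310 MHz.
BINS = [(325, "fast"), (275, "standard"), (250, "budget")]
STEPS = [250, 275, 300, 325, 350]

def fake_part(mhz: int) -> bool:
    return mhz <= 310

print(assign_bin(highest_stable_clock(fake_part, STEPS), BINS))
```

The fake part passes at 300 MHz and fails at 325 MHz, so it lands in the "standard" bin rather than being thrown away.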
 
The tests might be able to cover every functional block.

Or not. The idea is you want to test everything you can in the shortest amount of time.

However, many times things are not tested, either by schedule/design or by mistake.
 
Sorry... meant to say that they *can* cover every functional block. I assume the failure rate of certain cells is so insignificant that testing can be overlooked in order to save time.

MuFu.
 
Before the chip is made they run a large number of tests in software. This will cover a significant portion of logic functionality. Once the chips are back from the fab the chips can be tested much faster than they could through software simulation, but the software simulation assures that most of the chip will be functional.
 
3dcgi said:
Before the chip is made they run a large number of tests in software. This will cover a significant portion of logic functionality. Once the chips are back from the fab the chips can be tested much faster than they could through software simulation, but the software simulation assures that most of the chip will be functional.

But that doesn't cover manufacturing flaws.

swaaye: I'm not an expert in these matters, but MuFu's answer was pretty much correct. AFAIU, before the chip is even packaged, values (bit vectors) are fed in, and the values read out are compared to expected results. These vectors are chosen (often evolved) to catch as many defects as possible.

The registers (individual storage bits) of the chips are also often all connected together in long chains so it's possible to 'scan out' the entire contents of the chip.

Finally, AFAIU, some sections of the chips (e.g. RAMs) may have BIST (Built-In Self-Test), which reduces the need to rely on test vectors for those parts.
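To make the test-vector idea concrete, here is a toy Python sketch. The three-input circuit and the stuck-at-0 defect are invented for illustration; they are not taken from any real chip or tester.

```python
from itertools import product

def good_circuit(a: int, b: int, c: int) -> int:
    """Reference ('golden') behaviour: out = (a AND b) OR c."""
    return (a & b) | c

def faulty_circuit(a: int, b: int, c: int) -> int:
    """Same logic, but the AND gate's output is stuck at 0 (a made-up defect)."""
    and_out = 0  # stuck-at-0: the inputs a and b no longer matter
    return and_out | c

def failing_vectors(device, vectors):
    """Apply each input vector and keep those whose output differs from the expected one."""
    return [v for v in vectors if device(*v) != good_circuit(*v)]

all_vectors = list(product((0, 1), repeat=3))
print(failing_vectors(faulty_circuit, all_vectors))  # -> [(1, 1, 0)]
```

Only one of the eight possible inputs exposes this particular defect, which is why the vectors have to be picked with care when there is no time to try everything.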
 
Simon F said:
3dcgi said:
Before the chip is made they run a large number of tests in software. This will cover a significant portion of logic functionality. Once the chips are back from the fab the chips can be tested much faster than they could through software simulation, but the software simulation assures that most of the chip will be functional.

But that doesn't cover manufacturing flaws.

Right. That's why I said "logic" flaws. I wasn't saying that this testing is performed instead of what MuFu described. It is complementary.
 
3dcgi said:
Right. That's why I said "logic" flaws. I wasn't saying that this testing is performed instead of what MuFu described. It is complementary.
Sorry, I did not mean to contradict you. It's just that I interpreted swaaye's post as asking how IHVs guarantee that each individual chip is correct, as opposed to the design itself.

In retrospect, I can see how it could be read "how do you confirm the design itself is correct".
 
If we're talking about post-production tests on video cards, it's also worth mentioning that about 50 or so qualified samples are normally run continuously for about a month (a "control run") before they get the go-ahead for a full ramp.

At the end of the control run all the samples seem to disappear mysteriously. :LOL:

MuFu.
 
Most likely, these chips are also run through burn-in tests (though not for as long as the long-term tests, which take a month or more).
 
Does the burn-in involve standard operation or the kind of low-temperature/high-voltage 24hr runs favoured amongst overclockers?

Heard of some people actually cooking their CPUs in the oven, lol.

MuFu.
 
MuFu said:
Does the burn-in involve standard operation or the kind of low-temperature/high-voltage 24hr runs favoured amongst overclockers?

Heard of some people actually cooking their CPUs in the oven, lol.

MuFu.
Burn-in 'accelerates' wear on the chip (by raising the temperature, the voltage, or both). You can take the burn-in data and extrapolate the chip's failure rate over time. This is used to determine the MTBF of the chip, infant mortality, etc.

If your long-term burn-in tests show that you have a high infant mortality rate, you must burn in every chip at elevated temperatures to weed those out, or at least test at elevated temperatures.

I believe that overclockers "burning in" their chips are simply performing placebo functions.

Production engineers have quite a bit of work to do attempting to balance cost cutting and high yields.
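As an illustration of how accelerated burn-in data can be extrapolated to normal use, here is a small Python sketch using the Arrhenius model, which is a common choice for temperature acceleration. The post above does not say which model is actually used, so the model is an assumption, and so are the temperatures and the 0.7 eV activation energy. Voltage acceleration is not modelled at all.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def arrhenius_acceleration(use_temp_c: float, stress_temp_c: float,
                           activation_energy_ev: float) -> float:
    """How many times faster the part ages at the stress temperature than in use."""
    t_use = use_temp_c + 273.15        # convert Celsius to kelvin
    t_stress = stress_temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use - 1.0 / t_stress)
    return math.exp(exponent)

# Assumed numbers, for illustration only: 55 C in use, 125 C in the oven, Ea = 0.7 eV.
af = arrhenius_acceleration(55.0, 125.0, 0.7)
print(f"acceleration factor: {af:.1f}")
print(f"48 h of burn-in is roughly {48 * af / 24:.0f} days of normal use")
```

The result is very sensitive to the activation energy, which depends on the failure mechanism, so treat any number that comes out of this as an order-of-magnitude estimate.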
 
My scant experience in the ASIC business corroborates what others have already said.

From what I understand, TSMC/UMC's normal contract arrangement is the delivery of (untested) processed wafers. The burden is on the customer to deal with testing issues. Historically, TSMC/UMC's delivery terms give them a price advantage, as the foundry has no testing overhead.

Large fabless companies with very complex designs can't readily subcontract testing to third-parties, so they buy and operate their own test-equipment.

I believe LSI, TI and IBM contract for delivery of 'known-good dies' (maybe they will also deliver bare wafers). They handle manufacturing test (and even packaging) for the customer, and consequently they have specific guidelines for design submission and handoff. On the plus side, the customer is guaranteed fixed pricing per quantity of parts. On the minus side, the customer has historically paid a premium for the foundry to handle manufacturing test (as you would expect).

I feel compelled to point out that the test-methodologies and strategies employed by the likes of Intel, IBM, NVidia, ATI, etc. probably deviate far from the usual 'industry practice.' An Intel engineer told me their design-for-test methodology was every bit as 'custom' (i.e. non-standard) as their logic-design process... Nvidia and ATI no doubt invested $$$ into their respective chip testing-strategies, out of necessity.
 
i wonder, how does stuff like
Code:
 mov ax,13h   ; AH=00h (set video mode), AL=13h (320x200, 256 colours)
 int 10h      ; call the video BIOS
get tested on modern cards.
For instance, it would be interesting to know what percentage of the VESA BIOS code has changed on NV cards since the first RivaTNT.
 
i wonder, how does stuff like
Code:
 mov ax,13h   ; AH=00h (set video mode), AL=13h (320x200, 256 colours)
 int 10h      ; call the video BIOS
get tested on modern cards.
For instance, it would be interesting to know what percentage of the VESA BIOS code has changed on NV cards since the first RivaTNT.

Well, with XP supporting the use of VESA modes when drivers aren't loaded, that code should get exercised quite quickly.

Generally you make sure that you have tests for each piece of functionality. So to test the VGA support you will have a set of tests which do various VGA operations.

You can also use things like SciTech Display Doctor and test with older games. I think the WHQL test suite also has some VGA mode tests, but I can't remember for sure.

CC
 
"Design For Test" is part of the silicon design process.

I think the way it works is that there is actually a little extra logic in there that can have certain inputs fed in, and will test every single connection. You don't actually need to test a complete run. If parts A, B and C work, and the parts connecting A, B and C together work, then the entire component works.
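A toy Python model of that "little extra logic", assuming it takes the form of a scan chain (flip-flops that can be linked into one long shift register). The class, its method names and the tiny logic block are invented for illustration and are not any vendor's actual DFT scheme.

```python
class ScanChain:
    """Flip-flops linked into one shift register while scan mode is enabled."""

    def __init__(self, length: int):
        self.bits = [0] * length

    def shift(self, scan_in: int) -> int:
        """One scan clock: push a bit in at the head, return the bit leaving the tail."""
        scan_out = self.bits[-1]
        self.bits = [scan_in] + self.bits[:-1]
        return scan_out

    def load(self, pattern):
        """Shift a whole pattern in, last bit first, so that self.bits ends up equal to pattern."""
        for bit in reversed(pattern):
            self.shift(bit)

    def capture(self, logic):
        """One functional clock: every flop captures the output of the logic feeding it."""
        self.bits = logic(self.bits)

    def unload(self):
        """Shift the captured state out (zeros go in) and return it in flop order."""
        out = [self.shift(0) for _ in range(len(self.bits))]
        return out[::-1]

def wrap_xor(state):
    """Hypothetical logic block: each flop captures itself XOR its left neighbour."""
    return [state[i] ^ state[i - 1] for i in range(len(state))]

chain = ScanChain(4)
chain.load([1, 0, 1, 1])       # set every flop to a known value
chain.capture(wrap_xor)        # let the logic between the flops run for one clock
observed = chain.unload()      # read the whole state back out
expected = wrap_xor([1, 0, 1, 1])
print("pass" if observed == expected else "fail")
```

Because every flop can be set and read directly, each block of logic between flops can be checked on its own, without running a complete program through the chip.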
 
Squigs said:
"Design For Test" is part of the silicon design process.

I think the way it works is that there is actually a little extra logic in there that can have certain inputs fed in, and will test every single connection. You don't actually need to test a complete run. If parts A, B and C work, and the parts connecting A, B and C together work, then the entire component works.
Yes, though just to make sure nobody gets confused: just because it's manufactured correctly (which is what scan and those fancy expensive testers typically can test) does not mean that the part is designed correctly.
 