NVIDIA GF100 & Friends speculation

What would that prove other than showing it can't retire kernels out of order? There can still be 16 running in parallel even if the chip hangs.

If that is true then it proves it can't really run concurrent kernels. Everyone's first instinct to anything a graphics company claims about their hardware at this point is that they are lying.
 
If that is true then it proves it can't really run concurrent kernels. Everyone's first instinct to anything a graphics company claims about their hardware at this point is that they are lying.

This isn't clear. What do you expect to happen with 15 infinite loops and one regular kernel, concurrent or not?
 
RecessionCone's macro seems to be the closest thing to it. There's no single instruction that accomplishes what PSCAN does. Also, I was wrong about syncthreads_count, as it returns a single value to all threads - the count of true predicate evaluations. At first I thought it was similar to rank() from the patent, but meh.
I saw his post after I made my post last night. Pretty sweet, nice and simple.

Jawed
 
This would be real fun, because I read somewhere that GT200b is EOL. :LOL:

Lol, was it the GTX260/275/285 that were EoL? Because that would make sense if they were to be renamed to 3 series chips.

nVIDIA would still be making GT200b chips for a GTX3xx series mainstream, though the cost of selling such large GPU dies in the mainstream would be telling on the bottom line.
 
GPU shortage unlikely to be solved before May

http://www.digitimes.com/news/a20100226PD210.html

Although TSMC recently said the defect density of its 40nm technology has already dropped from 0.3-0.4 per square inch to 0.1-0.3, the sources pointed out that the improvement in overall yield still needs more time before catching up with market demand.
So a 12-inch wafer is ~113 square inches. Average defect counts were 34-45 per wafer, and are now down to 11-34 per wafer.

Jawed
http://www.techreport.com/discussions.x/18530
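The arithmetic above can be sketched quickly, assuming a standard 12-inch (300 mm) wafer:

```python
import math

# 12-inch (300 mm) wafer: radius 6 inches
wafer_area_in2 = math.pi * 6 ** 2          # ~113.1 square inches

# Digitimes defect densities, per square inch
old_density = (0.3, 0.4)
new_density = (0.1, 0.3)

old_defects = [d * wafer_area_in2 for d in old_density]   # ~34-45 defects per wafer
new_defects = [d * wafer_area_in2 for d in new_density]   # ~11-34 defects per wafer
print(old_defects, new_defects)
```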
 
from 0.3-0.4 per square inch to 0.1-0.3
Per square INCH? Not square centimeter? Couldn't it be a typo? Because e.g. 0.2 per sq. inch (0.031 per sq. cm) would mean >90% yields even for RV870... It wouldn't make sense to produce HD5830/HD5850/HD5870 in a 1:1:1 ratio...
 
Like no-x said, shouldn't it be per square centimeter?! I saw a presentation last week in which they said they were aiming at 0.1 defects per square centimeter for 2010/2011.
 
Per square INCH? Not square centimeter? Couldn't it be a typo? Because e.g. 0.2 per sq. inch (0.031 per sq. cm) would mean >90% yields even for RV870... It wouldn't make sense to produce HD5830/HD5850/HD5870 in a 1:1:1 ratio...

There are multiple models used for representing yield. TSMC uses Bose-Einstein IIRC (or at least they did so in some materials they put out at some point), and in that context the quoted numbers make sense. They also equate to rather poor yields still, depending on what we assume the process complexity factor to be.
 
TSMC Says Immersion Lithography Nearly Production Ready

This article from 2006:

http://www.design-reuse.com/news/12662/tsmc-immersion-lithography-nearly-ready.html

Immersion lithography systems use water, or a similar clear liquid, as an image-coupling medium. By placing water between the lithographic lens and the semiconductor, engineers can preserve higher-resolution light from the lens, enabling smaller, more densely-packed devices.

But liquid mediums present their own challenges, including defects such as bubbles, watermarks, particles, particle-induced printing defects, and resist residue. TSMC's R&D researchers resolved these issues by developing a proprietary defect-reduction technique that, on initial tests, produced less than seven immersion-induced defects on many 12-inch wafers, a defect density of 0.014/cm2. Some wafers have yielded defects as low as three per wafer, or 0.006/cm2. This compares to several hundred thousand defects produced by a prototype immersion scanner without these proprietary techniques and significantly better than published champion data in double digits.

TSMC's immersion lithography technology is targeted at TSMC's 45nm manufacturing process.
Now this article is talking about "immersion-induced" defects. I'm not sure what other kinds of defect mechanism apply to TSMC's current 40nm lines. If there are other defect mechanisms, then of course defect counts would increase.

Another variable here is that via defects probably aren't included in the defect count. I dunno.

Jawed
 
Like no-x said, shouldn't it be per square centimeter?! I saw a presentation last week in which they said they were aiming at 0.1 defects per square centimeter for 2010/2011.
That's it. I remember numbers like 0.4 and 0.2 per square centimeter being quoted in 2009... 0.2 defects per square centimeter means 54% fully working RV870s per wafer. That's more believable than 91%...

AlexV: Thanks. But that seems to be quite misleading. At least for me and other common users :smile:
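The 54% figure can be roughly sanity-checked with the simple Poisson yield model Y = exp(-A·D0) - one common assumption; the poster may have used a different model (e.g. Murphy, which gives a slightly higher number):

```python
import math

die_area_cm2 = 334 / 100.0        # Cypress/RV870: 334 mm^2 -> 3.34 cm^2
defect_density = 0.2              # defects per square centimeter

# Poisson yield model: probability a die catches zero defects
yield_poisson = math.exp(-die_area_cm2 * defect_density)
print(f"{yield_poisson:.1%}")     # ~51%, in the same ballpark as the quoted 54%
```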
 
Choosing the Best Yield Model for Your Product

http://www.semiconductor.net/article/203327-Choosing_the_Best_Yield_Model_for_Your_Product.php

"The Bose-Einstein model is the most optimistic, while the Seeds model is the most pessimistic "

Going back to the prior article I linked:

Bose-Einstein: Y = 1/(1+ADo)^N

where Y = yield, A = die area, and Do = defect density per unit area. For the Bose-Einstein model, N = process-complexity factor.
We don't know what N is for 40nm. N was quoted as being 11.5 and 15.5 for TSMC processes that I can't discern. The Semiconductor article indicates that N is the number of critical layers. The formula assumes the same defect density at each layer, which is not the case.

The Chip Design article has a nice description of the classes of mechanism that affect yield. It seems to me per square inch is correct in the Digitimes piece.

Jawed
 
Plugging in some numbers for Cypress (334mm² = 0.51in²), using 15.5 for N, for various defect densities per square inch:
  • 0.4 = 5.6%
  • 0.3 = 11%
  • 0.2 = 22.2%
  • 0.1 = 46.3%
Assuming 580mm² for GF100 (0.89in²):
  • 0.4 = 0.9%
  • 0.3 = 2.6%
  • 0.2 = 8%
  • 0.1 = 26.9%
Jawed
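The table above can be reproduced with a short sketch of the Bose-Einstein model, using the same inputs as the post (0.51 in² for Cypress, 0.89 in² for GF100, N = 15.5):

```python
# Bose-Einstein yield model: Y = 1 / (1 + A * D0)^N
# A = die area in square inches, D0 = defects per square inch,
# N = process-complexity factor (15.5 assumed, as in the post).
def bose_einstein_yield(area_in2, d0, n=15.5):
    return 1.0 / (1.0 + area_in2 * d0) ** n

for name, area in [("Cypress", 0.51), ("GF100", 0.89)]:
    for d0 in (0.4, 0.3, 0.2, 0.1):
        print(f"{name} @ {d0}/in^2: {bose_einstein_yield(area, d0):.1%}")
```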
 
That's it. I remember numbers like 0.4 and 0.2 per square centimeter being quoted in 2009... 0.2 defects per square centimeter means 54% fully working RV870s per wafer. That's more believable than 91%...

AlexV: Thanks. But that seems to be quite misleading. At least for me and other common users :smile:


This is one of the articles I got the numbers from, but then, it's probably completely off the mark given its age: http://ieuvi.org/TWG/Mask/2007/MTG071101/MaskTWGUpdate_071101.pdf
 
Plugging in some numbers for Cypress (334mm² = 0.51in²), using 15.5 for N, for various defect densities per square inch:

Jawed

Wouldn't defects tend to be more clustered, as they are not all random? So one die may end up with more than its fair share of defects. Also, are the percentages the rough number of "perfect" dies which can be fully enabled?
 
Plugging in some numbers for Cypress (334mm² = 0.51in²), using 15.5 for N, for various defect densities per square inch:
  • 0.4 = 5.6%
  • 0.3 = 11%
  • 0.2 = 22.2%
  • 0.1 = 46.3%
Assuming 580mm² for GF100 (0.89in²):
  • 0.4 = 0.9%
  • 0.3 = 2.6%
  • 0.2 = 8%
  • 0.1 = 26.9%
Jawed

Ouch! Thank god for redundancy, but still, ouch!
 
Wouldn't defects tend to be more clustered, as they are not all random? So one die may end up with more than its fair share of defects. Also, are the percentages the rough number of "perfect" dies which can be fully enabled?
Well, even if they're uncorrelated, purely by chance a significant fraction of them will be clustered.
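A tiny simulation illustrates that: even dropping defects onto dies uniformly at random, with no clustering mechanism at all, some dies typically collect more than one defect. The die and defect counts here are hypothetical round numbers (~160 Cypress candidates per wafer, 34 defects, from the figures earlier in the thread):

```python
import random
from collections import Counter

random.seed(0)
n_dies, n_defects = 160, 34   # hypothetical: ~160 candidate dies, 34 defects/wafer

# Drop each defect on a uniformly random die (no spatial correlation)
hits = Counter(random.randrange(n_dies) for _ in range(n_defects))

multi = sum(1 for c in hits.values() if c >= 2)   # dies soaking up 2+ defects
clean = n_dies - len(hits)                        # dies with no defect at all
print(f"dies with 2+ defects: {multi}, defect-free dies: {clean}")
```

Purely by chance, a handful of dies usually absorb multiple defects, which leaves slightly more defect-free dies than a naive "one defect kills one die" count would suggest.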
 
If that is true then it proves it can't really run concurrent kernels. Everyone's first instinct to anything a graphics company claims about their hardware at this point is that they are lying.
Maybe we have a different definition of concurrent. To me it means that multiple kernels can execute simultaneously. Even if they can't finish out of order it wouldn't mean they are lying. Also, I want to make it clear I'm not saying Fermi can't pass your test. I was just trying to ensure I understood its purpose.
 