Nvidia GT300 core: Speculation

Actually, he's saying they are parked before ANY of the metal layers are constructed. My question is whether there is any reason to construct one, a few, or most of the metal layers while not actually finishing the whole process. Based on your assumed ordering, I think it wouldn't make much sense, but I'd appreciate a more expert opinion.

Generally, you'll have wafers parked at a couple of different metal layers. Effectively, what you are doing with a metal respin in an ASIC flow is patching in bonus gates that have been automatically inserted wherever there is room. For most cell libraries, M1/M2 are generally intra-cell layers, so it isn't out of the ordinary to process most of the wafers up to M1/M2 planarization. Sometimes, depending on costs, you'll also park a couple of wafers at a higher metal layer as well. Parking pre-M1/M2 is generally not that beneficial, as if you have a problem in M1/M2 you likely have bigger issues.
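To make the parking logic concrete, here's a minimal sketch - my own simplification, not any particular fab's flow; the layer stack and the reuse rule are assumptions for illustration. The idea is just that parked wafers can only absorb a fix if every changed layer sits above the parking point:

Code:
def can_use_parked_stock(parked_after, layers_changed,
                         stack=("base", "M1", "M2", "M3", "M4", "M5", "M6")):
    # Parked wafers are usable only if every changed layer sits strictly
    # above the layer the wafers were parked at (assumed rule, simplified).
    parked_idx = stack.index(parked_after)
    return all(stack.index(layer) > parked_idx for layer in layers_changed)

# A metal-only ECO that rewires bonus gates on M3/M4 can reuse wafers
# parked after M2...
print(can_use_parked_stock("M2", {"M3", "M4"}))   # True
# ...but a fix touching M1 (or the base layers) cannot, which is why
# parking pre-M1/M2 buys you so little.
print(can_use_parked_stock("M2", {"M1", "M3"}))   # False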
 
I could be mistaken, but I seem to recall GTX295s at Newegg, ZZZ and Microcenter on launch day.
They were in short supply for months - don't try to pretend anything else happened. The A3 revision was required to fix GT200b for volume production.

Jawed
 
They were in short supply for months - don't try to pretend anything else happened. The A3 revision was required to fix GT200b for volume production.

Jawed
Simply looking at how many were available versus demand isn't enough to state unequivocally that there was a production problem. It may just be that demand was underestimated and it took time to ramp up.

To know there was a production problem, you would first have to have an idea of what kind of total production you'd get absent a problem, and then also be aware of how many total sales were made.
 
I've not seen any statement that NVidia is paying per good GT21x, so quote, please?
I don't have any public info to back that up directly, but read what I said regarding what TSMC claims: their margins will improve when yields improve. And NVIDIA was the company ramping the most wafers when TSMC made that claim. If a foundry is selling on a per-wafer basis, its margins do not magically improve with yields unless it is adjusting wafer prices to achieve something that has mostly the same effect as per-good-die (PGD) pricing!

Like GTX295 was a hard launch?

A3 was required for GT200b, and that's on a process that has been in full production since July 2007.
I think nobody's going to deny NV's execution on 65/55nm has been laughable. The point is here they're willing to allocate a lot more resources and potentially 'waste' a lot more money to get it right. They could screw it up massively again - I'll never deny that, it's obvious. But I think the question we need to ask is whether they've got a reasonable chance at delivering what Charlie still claims is basically impossible, and my point is they do and he'd be wise to hedge his bets even further - just as I did myself in this paragraph.

Generally, you'll have wafers parked at a couple of different metal layers. Effectively, what you are doing with a metal respin in an ASIC flow is patching in bonus gates that have been automatically inserted wherever there is room. For most cell libraries, M1/M2 are generally intra-cell layers, so it isn't out of the ordinary to process most of the wafers up to M1/M2 planarization. Sometimes, depending on costs, you'll also park a couple of wafers at a higher metal layer as well. Parking pre-M1/M2 is generally not that beneficial, as if you have a problem in M1/M2 you likely have bigger issues.
I was obviously aware of the first part, but the second part is very interesting and basically answers my question perfectly - cheers! :)
 
GTX295 was planned for a long time - GT200b doesn't magically get an exception as a viable chip just because it took so long to get it right.
GTX295 wasn't planned at all. They made it to take the performance crown back from the 4870X2.
The thing is, G200b B2 was already better than G200, and the only reason they went for B3 is the GTX295, which needed lower power consumption from G200b.
So you really can't say that they _needed_ B3 for G200b; they needed it for the top-end dual-GPU solution only. B2 was already faster than G200, which was faster than what AMD had at that moment (speaking of single-GPU cards, of course).
 
Yeah, I don't think it's possible to make new chips on the fly like that, even if they are relatively minor tweaks on a mature architecture.

If B2 was as good as you say, it would have been found in ALL kinds of products instead of taking the route it did: it showed up at the end of May (as the new Tesla), was announced as available in September (as Tesla), and found its way to the consumer market in November in the B3 revision.

We only saw a handful of review cards as B2.
 
Really, the question is about the meaning of "metal spin": what kinds of faults can you fix?
You can fix pretty much anything in metal, as long as it's a dumb bug that requires a gate here or there.

From what I can gather, the top of the chip is bumps and vias into a few layers of power redistribution, along with signals. I don't know how many layers that is - 3 or 4? Then there's logic. Then, presumably, cell connectivity (e.g. inter-transistor connections to make a memory cell). Then localised connectivity (e.g. a pipeline), then functional connectivity (e.g. an instruction decoder) and then inter-function connectivity (e.g. the L2->L1 data bus).
It's much less organized than that.

M1: intra-cell routing
Everything else: power and interconnect

Some blocks, like RAMs, will allocate additional metal layers, say, M2 and M3.

It's no problem to mix power and interconnect on the same metal layer. There are no hard-and-fast rules about which layer to put the power grid on, but on M1 you have the horizontal power rails that connect all cells on the same row to each other. You'll have some intermediate layer to connect that low-level grid vertically (M2 or M4, most likely) and then an additional coarse power grid at the higher levels.

For most cell libraries, M1/M2 are generally intra-cell layers, so it isn't out of the ordinary to process most of the wafers up to M1/M2 planarization. Sometimes, depending on costs, you'll also park a couple of wafers at a higher metal layer as well. Parking pre-M1/M2 is generally not that beneficial, as if you have a problem in M1/M2 you likely have bigger issues.
Most cell libraries only use M1 for intra-cell connect, so the most logical place to park is directly after M1, before the M1-M2 via. Even parking after M2 can seriously increase your ability to fix problems. The difference between parking after M1 and after M2 is only a handful of days in the fab, so in terms of risk/reward, parking after M2 isn't worth it.
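For a feel of the numbers, a back-of-envelope sketch in Python - all the cycle-time figures below are made up for illustration, since real fab times vary a lot by process and stack depth:

Code:
DAYS_FOR_BASE = 40          # assumed front-end (transistor) cycle time
METAL_LAYERS = 8            # assumed metal stack depth
DAYS_PER_METAL_LAYER = 2.5  # assumed fab days per metal layer

def days_to_finish(parked_after):
    # Remaining fab days for wafers parked after metal layer N.
    return (METAL_LAYERS - parked_after) * DAYS_PER_METAL_LAYER

full_flow = DAYS_FOR_BASE + METAL_LAYERS * DAYS_PER_METAL_LAYER
for n in (1, 2):
    print(f"parked after M{n}: {days_to_finish(n):.1f} days to finish, "
          f"vs {full_flow:.1f} starting from bare silicon")
# Parking after M2 only shaves DAYS_PER_METAL_LAYER more off the restart
# time, but forfeits any fix that needs to touch M2 itself.

Under these assumed numbers, parked stock turns a roughly two-month restart into a couple of weeks, while the M1-vs-M2 choice moves the needle by only a few days - which is the risk/reward point above.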
 
If a foundry is selling on a per-wafer basis, its margins do not magically improve with yields unless it is adjusting wafer prices to achieve something that has mostly the same effect as per-good-die (PGD) pricing!
What? You're saying that TSMC cannot improve margin on a mature node.

Jawed
 
What? You're saying that TSMC cannot improve margin on a mature node.
I never said that. I said a foundry cannot magically/automatically improve margins on a mature node exclusively by increasing yields! The only way to improve margins based on yields is by *increasing* wafer prices over time, which means explicitly telling your customers that future orders after their current pricing contract expires are actually going to be more expensive to reflect the greater 'value' of the process.

This is not the normal way to increase margins on a mature node; the normal way is to focus on efficiency and cost reduction while benefiting from lower levels of capital expenditure depreciation (the latter is an accounting effect, but it's a very important one!).

On new process nodes, increasing margins through yield improvement can be achieved either by explicitly increasing wafer prices (once again rather unlikely IMO) or by using a Per Good Die pricing mechanism, where the effective price of a wafer goes up over time because the foundry doesn't reduce the price of a good die as fast as yields are improving; the foundry is effectively subsidizing the initial ramp.
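To put numbers on the contrast, here's a toy model in Python. Every price, cost and die count below is a figure I invented purely for illustration; none of it is real TSMC or NVIDIA data:

Code:
WAFER_COST = 3000.0   # assumed foundry cost to process one wafer
GROSS_DIE = 100       # assumed die candidates per wafer
WAFER_PRICE = 5000.0  # assumed fixed per-wafer price to the customer
DIE_PRICE = 80.0      # assumed fixed per-good-die price to the customer

def per_wafer_margin(yield_frac):
    # Selling wafers outright: revenue is fixed regardless of yield, so
    # the foundry's margin is flat; every extra good die from better
    # yields goes to the customer, not the foundry.
    return (WAFER_PRICE - WAFER_COST) / WAFER_PRICE

def pgd_margin(yield_frac):
    # Per Good Die: revenue scales with good die, so if the die price
    # falls more slowly than yields rise, the foundry's margin improves.
    revenue = DIE_PRICE * GROSS_DIE * yield_frac
    return (revenue - WAFER_COST) / revenue

for y in (0.30, 0.45, 0.60):
    print(f"yield {y:.0%}: per-wafer {per_wafer_margin(y):+.1%}, "
          f"PGD {pgd_margin(y):+.1%}")

With these made-up numbers, at 30% yields the PGD foundry is under water (that's the "subsidizing the initial ramp" part) and its margin then climbs with every yield point, while the per-wafer margin never moves. That's exactly the behaviour I'm arguing TSMC's "margins improve when yields improve" claim implies.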

EDIT: Thanks silent_guy :) The M1/M2 clarity above is very interesting too, cheers.
EDIT2: Oh BTW, there is another way to make more out of a mature process node: shrink it! Believe it or not, I'm not kidding. Just read this PR: http://www.tsmc.com/tsmcdotcom/PRListingNewsAction.do?action=detail&newsid=3781&language=E - and FWIW, for the zero people who care on this forum, I am willing to bet ten bucks that CSR is (one of?) the early adopters of this slim node with their BC6110->BC6150 Bluetooth headset range. It's a real node that'll get real volume; rather insane if you ask me!
 
and FWIW, for the zero people who care on this forum, I am willing to bet ten bucks that CSR is (one of?) the early adopters of this slim node with their BC6110->BC6150 Bluetooth headset range.
Achieve negative audience: done! (j/k)
 
Achieve negative audience: done! (j/k)
Yeah, well, it's not like my audience seems much bigger when I tell mythical tales of 1GHz+ TMUs... So you can't blame me for not caring much whether anyone finds what I write interesting ;)
 
Yeah, well, it's not like my audience seems much bigger when I tell mythical tales of 1GHz+ TMUs... So you can't blame me for not caring much whether anyone finds what I write interesting ;)
What's so mythical about 1GHz+ TMUs? Don't ATI's TMUs already run pretty close to that frequency?
 
What's so mythical about 1GHz+ TMUs? Don't ATI's TMUs already run pretty close to that frequency?
Well, I was thinking of that as a bottom or average, not a top - like the 8500GT runs the shader clock at 900MHz, but the 8800 Ultra runs it at 1500MHz. So I'd definitely expect at least one NV DX11 SKU with TMUs running around 1250MHz on 40nm. I was mostly being conservative because I don't know how bad variability will be with such a big chip on 40nm - and I didn't want Jawed to do what he did with my RV770 info, claiming that even though I was right about 40 TMUs/800 SPs, it didn't really count because I claimed it would be a much bigger chip. Bah! :D

EDIT: Oh, and the original post obviously mentions the reason.
 
What's your reasoning for 1GHz+ TMUs on GT300? A shared clock domain with the shader cores? What else?
 
Yeah, well, it's not like my audience seems much bigger when I tell mythical tales of 1GHz+ TMUs... So you can't blame me for not caring much whether anyone finds what I write interesting ;)

Well hey, people showed interest, but then you gave us the silent treatment :) So I'll ask again: what makes you think the TMUs will be running that high?
 