Nvidia GT300 core: Speculation

I think GT300 is unlikely to launch at a higher price than the GTX280 did, for a couple of reasons:

- It will most likely launch after ATI's 58xx series boards are in retail, and unless it's twice as fast, they can't sell at such a high price when alternatives are available.

- The world has moved on, and whatever the arguments about how fast the high end PC gaming market is shrinking, it is shrinking. We're in a deep recession, and we're now living in an age where at the $100, $200 and $300 price points you can get amazing value. It's hard to believe people will pay much more than $500 for a top end part, since it will be competing with SLI / tri-SLI options at that point, and there is unlikely to be much DX11 software to really tempt people with at launch.
 
And wouldn't a launch with very limited quantities (given an adequate performance advantage, which is yet to be determined) actually lend itself to high pricing, reducing demand and upping the perceived in-stock-ness at the same time? ;)
 
While this makes sense from a pure economic perspective, it means that Nvidia would be lowering the value proposition at launch. That tends to enrage a lot of the "high volume talkers" on the forums who'd love nothing more than to own the GPU on day one but can't cross that price hurdle; it invites negative feedback from the reviewers and in general builds a lot of negative early buzz that the vendor then needs to build back from.
If I remember correctly, Nvidia actually provided refunds in the past when they had to quickly drop prices soon after launch, to avoid permanently pissing off the early adopters who would have the most to say during a time when everyone is eager to learn about the new generation of GPUs.

When you have a new GPU that's clearly faster than anything else out there you can price it relatively higher than the competition, but I just don't see the Nvidia single GPU launch doing well at $700 (which is what it would have to be to launch at 'more than the last generation'). ATI has too much to lose by pricing their GPU at $600+ at launch, so without that market parity, Nvidia would just look obnoxious by launching ridiculously high, limited quantities or no.

I think $500-600 is the upper limit on what the current market will bear given what we've heard about the relative performance of GT300. If GT300 is a single GPU that is 10x faster in raw throughput than the GTX 280, then I will be first in line to try and get it at launch, but I don't see that happening. There may be a $50 spike over MSRP in the first week (two at most), but that's not really Nvidia's fault.
 
I think what Neliz is trying to say is something like this:
Picture a time a couple of years before the G200 launch: the 65nm process is well known and the design is well advanced, but it's going to be a big chip.

NV get some intel about ATI: nothing new coming until RV770, plus a good estimate of the date it should arrive (maybe a hint about its size/performance?).
The ETA for RV770 is a fair time after the current ETA for G200.

They also see G92b on 55nm coming along nicely without any big hitches.

1 + 1 = Shift G200 to 55nm.
There is plenty of time & it'll help with yield/profit margins.

So they shift most of the G200 team to migrating the design over to 55nm, leaving enough people on the 65nm team to finish it off as a backup plan in case something crops up with 55nm.

Later they get more info about RV770, including a new, closer ETA & realise that the 55nm chip won't be ready in time.
The only option is to launch the 65nm version to make sure they have something in direct competition with RV770.
 
..., leaving enough people on the 65nm team to finish it off as a backup plan in case something crops up with 55nm.
There is no reason whatsoever to assume that whatever goes wrong with 55nm will not go wrong with 65nm. If at some point you decide to go straight to 55nm, you shut down the 65nm effort. It's as simple as that. Anything else would be a ridiculous waste of resources.

Over here, way, way too much emphasis is put on the difficulty of process shifts, as if it's some mysterious, hard-to-tame beast. It's a lot of work in terms of manpower, which is a major consideration, but other than that it's really not such a big deal.
 
I'd just like to give my impression of what it means to change processes:

1. In principle, if your chip has very loose tolerances, a process shrink should "just work". The difficulty comes in when you optimize a chip as much as possible for performance, which is obviously the case with 3D graphics. If, for instance, you instead had a pad-limited chip, you might engineer it with very loose tolerances so that a die shrink might be almost trivial.

2. One problem with die shrinks is that not everything shrinks by the same amount: if you compare the sizes of all of the various sorts of components on a 65nm process versus a 55nm process, they won't all come in at the ideal 55/65 linear ratio. So when you have a die shrink, circuits that worked fine on 65nm might no longer even fit on 55nm: they may have to be rerouted just due to the different relative sizes (see the back-of-envelope sketch after this list). If the components aren't packed so tightly together, of course, this is less likely to be so much of an issue.

3. Another problem is that when you perform a die shrink, all of the distances between the various components are reduced. This causes differences in electrical interference between channels, which means that some number of lines that had no problems at one die size will now have to be rerouted for the new die size. Again, if you don't pack things together tightly, it's not an issue. Obviously we want to pack things as tightly as possible with 3D graphics hardware.

4. Another issue is that as things get smaller, the timings of everything change. With the above issues, we may not only be talking about things getting smaller, but also about a number of the circuits being rerouted. This means that all of the new timings have to be taken into account so as to ensure that the whole chip can run at a certain clock speed. Again, if your tolerances are loose, this won't be an issue. But we want our 3D graphics chips to run as fast as possible.
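To put rough numbers on point 2, here's the back-of-envelope sketch referred to above: what an ideal, perfectly uniform optical shrink from 65nm to 55nm would give. The 576 mm^2 figure is just an illustrative die size; the whole point is that real components do not all scale by this factor.

```python
# Back-of-envelope: ideal, uniform ("optical") shrink from 65nm to 55nm.
# Purely illustrative -- in practice the various component types do NOT
# all scale by the same factor, which is why circuits can need rerouting.
old_node = 65.0  # nm
new_node = 55.0  # nm

linear_scale = new_node / old_node   # ~0.846: each dimension shrinks ~15%
area_scale = linear_scale ** 2       # ~0.72: ideal die area drops ~28%

print(f"Ideal linear scale: {linear_scale:.3f}")
print(f"Ideal area scale:   {area_scale:.3f}")

# Illustrative example: a 576 mm^2 die would, in the ideal case only,
# come out at roughly 412 mm^2.
print(f"576 mm^2 die -> ~{576 * area_scale:.0f} mm^2 (ideal case only)")
```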

Those are, as I see it, the main issues facing die changes. If we have large, complex chips, there are other issues to be considered:
1. When laying out a chip, engineers will make use of computer models of the process features as well as the physics involved in order to determine what sort of circuit will work and what won't. In reality, these computer models are not going to perfectly match the actual process. The exact dimensions of the process are not going to be known to perfect accuracy, which will make for some mistakes. And the model will need to use some approximations to the full physics calculations in order to make the computations tractable. As a result, these software validation tools won't be able to perfectly determine when a chip will work and when it won't. Every time you design a new chip, you will have to deal with some degree of uncertainty as to how good your computer model is.

Obviously as more chips are designed on the same process and more errors are found and subsequently corrected, the software model of the design process will get better and future chips will be easier to produce.

2. Each of the above issues has some percent chance of affecting any given circuit in the chip. The more complicated your chip is, the more parts of it will have difficulties when you go through a die shrink. Thus more complex chips tend to be more difficult to work with, in general.

So, in the end, as silent_guy mentions, if you want to design a chip, you really should pick a process and stick with it from start to finish. If you don't, you could be forced to perform major reconfigurations of the chip. And it would really be a tremendous investment in a company's limited man-hours to design a full product merely as a backup.
 
1. In principle, if your chip has very loose tolerances, a process shrink should "just work".
What do you mean by tolerances? There are blocks in a chip that have a lot of timing margin and others (most) that don't. But it doesn't really matter: shrinks almost always just work.
(Not clear what tolerances have to do with a chip being pad limited?)

2. One problem with die shrinks is that not everything shrinks by the same amount:
The time of pure optical shrinks of everything is long gone. Analog blocks either don't shrink, or you don't want them to in order to reduce risk and schedule impact. So you need to do some rework on your chip floor plan anyway. Pretty much everything after the initial netlist is different. Reusing existing wires or cell placement is a non-starter.

3. ... This causes differences in electrical interference between channels, which means that some number of lines that had no problems at one die size will now have to be rerouted for the new die size.
Non-issue due to point 2: everything is rerouted from scratch.

4. Another issue is that as things get smaller, the timings of everything change.
Not really an issue for a shrink: cell timings are usually slightly better and, more importantly, the timing constraints have already been fully debugged. So you get a smoother flow.

... As a result, these software validation tools won't be able to perfectly determine when a chip will work and when it won't.
In terms of total farm time, probably 95% is spent on digital simulations. It's still the place that's responsible for most of the bugs. Process technology is irrelevant here.

Even if there's a bug that's process related, it's pretty much always down to analog designer mistakes rather than the process itself. Say someone designs a DAC in 65nm and it works, by pure luck, in a condition that wasn't simulated. Then he ports it to 55nm, and this time it fails in that corner. Is that a case of "they're having problems with 55nm technology" or just a testing hole?

Pretty much all non-digital bugs are like this. Issues due to model inaccuracies are extremely rare.

Every time you design a new chip, you will have to deal with some degree of uncertainty as to how good your computer model is.
Only for a really immature process, in which case the fab will send updated Spice decks and libraries with revised characteristics (somehow always slower than initially thought.) Annoying, but hardly ever a concern.

... the software model of the design process will get better and future chips will be easier to produce.
Not really, because models are hardly ever the issue anyway.

... The more complicated your chip is, the more parts of it will have difficulties when you go through a die shrink.
Not in the way I think you meant it. The best estimator is the amount of unique (non-repetitive) placed gates. Larger chip. More gates. More effort, but largely linear.

And it would really be a tremendous investment in a company's limited man-hours to design a full product merely as a backup.
100% agreed.
 
What do you mean by tolerances? There are blocks in a chip that have a lot of timing margin and others (most) that don't. But it doesn't really matter: shrinks almost always just work.
(Not clear what tolerances have to do with a chip being pad limited?)
All sorts of tolerances. Timing, interference, capacitance, leakage, etc.

If a chip is pad limited, you could move circuits further apart and not lose much, which would make the chip more tolerant to changes in relative component size, as well as more tolerant to electrical interference. If you don't care much about the clock speed, then you also have much larger timing tolerances, obviously.

The time of pure optical shrinks of everything is long gone. Analog blocks either don't shrink, or you don't want them to in order to reduce risk and schedule impact. So you need to do some rework on your chip floor plan anyway. Pretty much everything after the initial netlist is different. Reusing existing wires or cell placement is a non-starter.
Wasn't aware of that. I figured that at least some components could be more-or-less copied over, as long as the tolerances on those components were relatively loose. But if today's chip layout tools are advanced enough, then this may not even be of any benefit.

Not really an issue for a shrink: cell timings are usually slightly better and, more importantly, the timing constraints have already been fully debugged. So you get a smoother flow.
Hmmm. I would have expected that the differences in relative size mean that if you want to get the most out of a die-shrunk architecture, you have to rework the timing to the new process. Naturally when you do a die shrink, timing in general gets easier, so you should have no problem running at the same or even a slightly higher clock speed. But there is likely a fair amount of work in getting the die-shrunk chip to operate at its optimum.

In terms of total farm time, probably 95% is spent on digital simulations. It's still the place that's responsible for most of the bugs. Process technology is irrelevant here.

Even if there's a bug that's process related, it's pretty much always down to analog designer mistakes rather than the process itself. Say someone designs a DAC in 65nm and it works, by pure luck, in a condition that wasn't simulated. Then he ports it to 55nm, and this time it fails in that corner. Is that a case of "they're having problems with 55nm technology" or just a testing hole?
Well, I was assuming that any chip design is going to go through a huge gauntlet of software tests before it reaches production. This *should* leave only model inaccuracies and layout bugs that could cause problems once a design has been sent off to the fab.

Only for a really immature process, in which case the fab will send updated Spice decks and libraries with revised characteristics (somehow always slower than initially thought.) Annoying, but hardly ever a concern.
But isn't that precisely the situation we're talking about? I mean, with 3D graphics hardware, aren't they working with the fab during the initial stages of application of a brand-new process, in order to get out their spiffy new chips on the smaller process as fast as possible? Sure, this should definitely not be an issue with any part which starts production when a process is already mature, but I have to imagine that for companies like ATI and nVidia that try to get the smaller parts out ASAP, they would probably have to deal with these issues quite frequently.

Not in the way I think you meant it. The best estimator is the amount of unique (non-repetitive) placed gates. Larger chip. More gates. More effort, but largely linear.
Whether or not the issue is linear depends upon what you're talking about. Detecting bugs in the layout should be more-or-less linear, provided that the layout of a complex chip has been effectively compartmentalized. The non-linear aspect comes in more with respect to the tolerances on all of the various components of the chip. That is to say, every time the fab writes a circuit to the silicon, there is going to be some amount of variance as to the precise layout. If you have a small, simple chip, then you can tolerate a relatively large error rate and still have lots of good chips. If your chip is large and complex, on the other hand, then you need a vastly lower error rate. So you might design a large, complex chip with much wider tolerances than you would design a smaller, simpler one.
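Just to put a toy number on the "large chips need a vastly lower error rate" part, here's a minimal sketch using the standard first-order Poisson yield model. The defect density is a made-up illustrative value, not anything specific to these chips:

```python
import math

def poisson_yield(defect_density_per_cm2: float, die_area_mm2: float) -> float:
    """First-order Poisson yield model: Y = exp(-D * A)."""
    area_cm2 = die_area_mm2 / 100.0
    return math.exp(-defect_density_per_cm2 * area_cm2)

D = 0.4  # defects per cm^2 -- invented value, for illustration only
for area_mm2 in (100, 300, 576):
    print(f"{area_mm2:>4} mm^2 -> yield ~ {poisson_yield(D, area_mm2):.0%}")

# ~67%, ~30%, ~10%: yield falls off exponentially with die area, so a big,
# complex die needs a much lower per-area error rate to yield acceptably.
```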

Where this has application to what I was talking about above is with respect to detecting issues with the software model of the process, which, as you mention, would only happen with a very immature process. Each error in the software model would have some probability of becoming apparent in some sort of design layout. The more complex your chip is, the larger space of possible layouts you'll explore, and thus the more likely you are to hit on an error in the model of the process.
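And a similarly toy way of looking at the "bigger design, more chances to hit a model error" argument, with completely invented probabilities and an independence assumption that is obviously a big simplification:

```python
# Toy illustration, invented numbers: if each unique circuit structure has an
# independent probability p of running into an inaccuracy in the process model,
# the chance of hitting at least one grows quickly with design complexity n.
p = 1e-5  # made-up per-structure probability
for n in (10_000, 100_000, 1_000_000):
    p_hit = 1 - (1 - p) ** n
    print(f"n = {n:>9,} -> P(at least one model-related surprise) ~ {p_hit:.1%}")
# ~9.5%, ~63.2%, ~100%
```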
 
http://www.semiaccurate.com/forums/showpost.php?p=1208&postcount=14


Don't know why the tape-out would, in itself, provide any information on specification.

Anyway, looks pretty unlikely for a 2009 release if it'll be August before it tapes out. I suppose key developers might get their hands on a prototype card in November.

Jawed

He could mean final frequencies, but even, say, 10% less than the originally projected frequencies won't make a huge difference if you really want to estimate any hypothetical performance.

My question would rather be: how the HELL is anyone able to estimate any kind of performance, not on ONE but on two next generation GPUs from different IHVs, without being completely sure what changes have occurred in A or B and how much they affect efficiency in real terms? OK, IHV A might claim 2x XXX and IHV B might claim 3x YYY; ask me if I trust either/or before I see independent third party measurements anyway. Unless of course my crystal ball has a very clear preference, in which case of course I'll never be disappointed at all.
 
Charlie doesn't seem to know much about GT300. He's been a bit more speculative than usual with his Nvidia bashing, and while he's gung-ho about Evergreen he still leaves room open for potential issues. So even the mighty Charlie doesn't seem to know what's going on.
 

Well, some sites have already speculated extensively on both next-generation D3D11 GPUs' unit counts and even frequencies. My problem is that I can't figure out, based on a bunch of sterile numbers, what each can or cannot achieve as long as I don't know what each unit is capable of, or, even better, any possible architectural changes on either side.

Under some twisted logic I could have figured out, years ago before its launch, that G80 was nothing but an 8-quad G7x with a wider bus, more RAM and D3D10 compliance, and then been left wondering afterwards where the hell all that added efficiency came from.

Or vice versa: the transistor count of GT200 before its launch could have led me to believe that it would end up a lot faster relative to G80 than it actually did.

If someone wants to bet on an advantage AMD might have in this round, I'd quickly estimate that NV might get in trouble in early 2010, especially with OEM/netbook deals, since I doubt they'll be ready with any relevant D3D11 offerings.
 
All sorts of tolerances. Timing, interference, capacitance, leakage, etc.
Except for timing, none of those matter during the PNR process. That sounds blunt, but it's basically true. SI and capacitance are intermediate products on the way to getting your next timing report. Leakage is just there; you can't do much about it at that stage of development anyway, other than being selective about putting fast cells in there.

If a chip is pad limited, you could move circuits further apart and not lose much...
Nobody does that. There's nothing to gain by it. As if by magic, pad limited chips always fill themselves up. "Let's increase that RAM, because we can."

... if you want to get the most out of a die-shrunk architecture, you have to rework the timing to the new process.
"Rework" makes it sound complicated. If the process are sufficiently faster, you'll update the clock parameters in the synthesis script, synthesize, write out new PNR timing constraint files and feed them into the placer. I make it sound easy and it usually is. We're talking a couple of days for the whole chip, a fraction of the overall backend time.

Well, I was assuming that any chip design is going to go through a huge gauntlet of software tests before it reaches production.
Definitely. Literally tens of thousands of tests. There's quite a bit of coverage overlap in there, but you can assume that each individual test also covers something specific. Even so, it's very easy to overlook something, especially pesky temporal interaction bugs. You have no idea how many hacks are present in real production silicon to work around logic bugs. For each bug that's fixed in a metal spin, there are easily 10 bugs that are worked around by software hacks. ("If we insert a couple of NOPs there, we'll avoid those two events happening in the same clock cycle and the state machine won't lock up after all. Phew!")
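Purely as a hypothetical illustration of the kind of workaround being described; the packet encodings, the "errata" and the helper below are all invented, not any real driver's:

```python
# Hypothetical sketch: pad the command stream with NOPs so two back-to-back
# commands can't reach the hardware in the same cycle and trip a (made-up)
# state-machine bug. Every opcode and the errata itself are invented.
CMD_NOP = 0x00000000    # invented encoding
CMD_FLUSH = 0x1A000001  # invented encoding
CMD_DRAW = 0x2B000007   # invented encoding

ERRATA_42_NOP_PADDING = 2  # "insert a couple of NOPs" between the pair

def emit_flush_then_draw(cmd_buffer: list, apply_errata_42: bool = True) -> None:
    cmd_buffer.append(CMD_FLUSH)
    if apply_errata_42:
        cmd_buffer.extend([CMD_NOP] * ERRATA_42_NOP_PADDING)
    cmd_buffer.append(CMD_DRAW)

buf = []
emit_flush_then_draw(buf)
print([hex(w) for w in buf])  # ['0x1a000001', '0x0', '0x0', '0x2b000007']
```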

This *should* leave only model inaccuracies and layout bugs that could cause problems once a design has been sent off to the fab.
Oh, how sweet that would be! ;)

But isn't that precisely the situation we're talking about? ...
In the case of a half-node shrink: definitely not. Parameters are very well controlled there. For a full node, I'd guess it's still only an annoyance, even if you're on the bleeding edge: by the time you're already in PNR, the parameter updates aren't that dramatic anymore. You'll see some disturbance in your timing report, which you fix during the next iteration.
I was talking more about the early stages of new designs: those who want to squeeze maximum performance out of upcoming 32nm processes are probably still seeing fairly large changes, but there's still plenty of time to design around those.

That is to say, every time the fab writes a circuit to the silicon, there is going to be some amount of variance as to the precise layout. If you have a small, simple chip, then you can tolerate a relatively large error rate and still have lots of good chips. If your chip is large and complex, on the other hand, then you need a vastly lower error rate. So you might design a large, complex chip with much wider tolerances than you would design a smaller, simpler one.
No, it doesn't work that way. Timing problems are localized and uncorrelated with overall die size. An electron doesn't really care whether it's flying around a die of 50 or 500mm2: the largest stretches of uninterrupted metal are measured in um. If your 50mm2 chip has issues with a particular timing path, it will wreck the yield in the slow corner no matter what, and it's going to have to be fixed before production.

When a timing path problem is detected, it's a safe bet that it will be due to one of two reasons:
- the timing script had a bug.
- some number was overlooked in a report or incorrectly waived as safe.

I can't remember the last time we had to say "we did everything right, but the model screwed us."
 
We have a weekend in Munich and led *garbled* interview with the technical director of Nvidia, which after a few beers......

I thought this would be a strong enough hint at the reliability of that article :LOL:
 