So has the talk of going multi-die.
Fair enough. We'll really have to see what happens. My belief is not that GX2-like solutions won't happen, but that they don't necessarily mean larger dies than R600 and G80 are impossible. I think we'll see both kinds of solution for quite some time.
Wafer diameters haven't grown beyond 300mm, so there are manufacturing parameters that have not scaled at all in recent years.
I'm walking a bit out of my comfort zone here, but there aren't really that many technology parameters that are tied to wafer diameter. There are mainly two reasons to go to larger wafers, both economic:
- increase fab capacity: higher die throughput per handled wafer
- reduce the amount of unusable wafer real estate: this becomes important as dies grow larger. Even with current large die sizes, it's still not that much of a factor, though it's definitely part of some equation in some cost calculation spreadsheet (the sketch below puts rough numbers on it).
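To put rough numbers on that second point, here's a minimal back-of-the-envelope sketch in Python. It uses the textbook dies-per-wafer approximation; the ~480 mm² die area is simply assumed as a G80-class size, and none of the numbers come from a fab.

```python
import math

def dies_per_wafer(wafer_diameter_mm, die_area_mm2):
    """First-order approximation: gross die count minus an edge-loss term
    for the partial dies lost along the wafer's circumference."""
    radius = wafer_diameter_mm / 2.0
    gross = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return gross - edge_loss

# Compare a ~480 mm^2 (G80-class) die on 200 mm vs 300 mm wafers.
for d in (200, 300):
    dpw = dies_per_wafer(d, 480)
    usable = dpw * 480 / (math.pi * (d / 2) ** 2)
    print(f"{d} mm wafer: ~{dpw:.0f} dies, ~{usable:.0%} of wafer area usable")
```

The edge-loss term grows with the circumference while the gross count grows with the area, so the usable fraction improves on the larger wafer (roughly 69% vs 79% in this toy example) and the per-wafer die count more than doubles.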
AMD's had known issues with its 65nm A64s.
I tend to conveniently ignore these kinds of issues.
They are really not much of a concern for vendors who use external fabs and standard-cell design. So let me clarify: worst-case electrical characteristics (which are always used to calculate the critical timing path of a chip) are really quite reliable, even at 65nm. Going forward, I don't expect major changes here.
I suspect this is because fabs, for one, tend to add a certain margin of error precisely to make sure customers get what they expect.
This is less of a concern for the CPU fabs of Intel and AMD, so they'll try to get closer to the limits of their process. As long as AMD continues to produce GPUs externally (which is obviously not a given), I'd like to stick with that model, and there I believe my argument still holds.
If there are multiple speed bins for R670/G92/..., it will be interesting to see how much the clocks differ from each other. They are really close for e.g. the 8800 Ultra/GTX/GTS, so there is clearly not yet a problem.
There is no expectation for this to be any easier at 45nm and below.
If wires continue to play a larger role in the overall delay, I expect speed variance actually to go down (just as we're already seeing now): wire delay is dominated by RC, which varies far less across process corners than transistor drive strength, and you can't significantly reduce wire delays by increasing voltages.
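A toy model makes the point; the corner spreads here (±15% for gate delay, ±5% for wire delay) are invented for illustration, not process data:

```python
# Assumed spreads: transistor drive strength varies a lot across process
# corners, interconnect RC much less.
GATE_SPREAD = 0.15   # assumed +/-15% gate-delay swing across corners
WIRE_SPREAD = 0.05   # assumed +/-5% wire-delay swing across corners

def slow_corner_penalty(wire_fraction):
    """Worst-case path delay relative to nominal, for a path whose nominal
    delay is split between gates and wires."""
    gate = (1.0 - wire_fraction) * (1.0 + GATE_SPREAD)
    wire = wire_fraction * (1.0 + WIRE_SPREAD)
    return gate + wire

for wf in (0.2, 0.5, 0.8):
    print(f"wire fraction {wf:.0%}: slow corner {slow_corner_penalty(wf) - 1:+.1%} vs nominal")
```

As the wire fraction climbs from 20% to 80%, the slow-corner penalty drops from +13% to +7%: the more a path is dominated by wires, the less its total delay spreads across corners.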
Fabs don't normally release the distribution data, but the rough idea was that worst-case variation was something close to linear for clock timing, and much worse for leakage between devices, on the order of 10x (all else being equal).
I agree that leakage variation can be quite high within the same process. Speed variation is much less so, once again keeping my more restricted rules in mind. Unlike the GPU world, there are a lot of silicon products where all chips have to run at the same speed (think cell phones, modems, TV chips, ...).
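A quick Monte Carlo sketch shows why those two spreads differ so much. The physical form is standard (subthreshold leakage scales exponentially with threshold voltage, delay only roughly as 1/(Vdd - Vt)), but every parameter value below is an assumption:

```python
import math
import random

random.seed(0)

VT_NOM, VT_SIGMA = 0.30, 0.02   # assumed threshold voltage and spread, volts
SLOPE = 0.038                   # ~n*kT/q, exponential slope of subthreshold leakage
VDD = 1.0                       # assumed supply voltage, volts

vts = [random.gauss(VT_NOM, VT_SIGMA) for _ in range(100_000)]
# Normalized to the nominal device:
leak = sorted(math.exp(-(vt - VT_NOM) / SLOPE) for vt in vts)
delay = sorted((VDD - VT_NOM) / (VDD - vt) for vt in vts)

lo, hi = len(vts) // 100, len(vts) - len(vts) // 100 - 1  # p1 and p99 indices
print(f"leakage spread (p99/p1): {leak[hi] / leak[lo]:.1f}x")
print(f"delay spread   (p99/p1): {delay[hi] / delay[lo]:.2f}x")
```

The same modest threshold-voltage spread yields a leakage spread on the order of 10x but a delay spread of barely 15%, which matches the asymmetry described above.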
Device variation was characterized as linear between devices, something like a factor of two between the extremes, ...
I assume you mean speed variation. That sounds about right. In reality, nobody really cares about the fast corner, except for hold-time violation checks. Everybody simply uses the slow corner, because fabs demand it.
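A minimal sketch of that sign-off logic, with invented numbers: setup slack is tightest at the slow corner and hold slack at the fast corner, so each check only needs its own worst-case corner.

```python
# Delay multipliers per corner (invented values).
CORNERS = {"slow": 1.25, "typical": 1.00, "fast": 0.80}

def slacks(clock_period_ns, long_path_ns, short_path_ns,
           t_setup=0.05, t_hold=0.05):
    """Print setup/hold slack for a longest and a shortest path at each corner."""
    for name, k in CORNERS.items():
        setup = clock_period_ns - long_path_ns * k - t_setup  # worst when slow
        hold = short_path_ns * k - t_hold                     # worst when fast
        print(f"{name:8s} setup slack {setup:+.3f} ns   hold slack {hold:+.3f} ns")

# ~525 MHz target, 1.4 ns nominal critical path, 0.1 ns nominal shortest path:
slacks(clock_period_ns=1.9, long_path_ns=1.4, short_path_ns=0.1)
```

Setup slack bottoms out at the slow corner (+0.100 ns here) and hold slack at the fast corner (+0.030 ns), which is exactly why signing off setup at the slow corner and hold at the fast one covers the extremes.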
Intel's success at getting quad core CPUs out over a year before AMD's single-die solution shows that there are benefits to multichip at 65nm.
I'd argue that this is more a matter of getting to market faster, just like the 7950GX2 was a nice way to crash the R580 party while the next big thing (in the same process!) was getting ready backstage.
Anyway, my main initial argument was that debugging chips with 1B transistors doesn't have to be a major burden. We've deviated quite a bit from that.