[*]"three" is a pretty funny number for a computing architecture. I wouldn't be surprised if each shader unit in R580 (and RV530) is actually composed of four quad-pipelines, with one dropped for the sake of redundancy. That's 25% redundancy, on a total of approximately 128M transistors (64 pipelines, 16 dropped for redundancy). If that's the case, then we prolly won't see a "36 pipeline" variant of R580.[/LIST]Obviously, all guesses... The point being that R580's "size" might look like a huge disadvantage, but there are hierarchies of fine-grained redundancy at play, and I expect that redundancy in R580 is significantly more advanced than R420. I also suspect that the "3:1" architecture of RV530 and R580 adds a significant layer of redundancy. The end-result being, perhaps, that practically every core comes out as a fully functional R580 (or RV530).
Jawed
Hmmm. There's definitely a possibility that there are a number of excess units for redundancy. But an overhead of 33% (16 spare pipelines on top of 48 working ones, because that's how you should look at it!) is out of the question.
Let's use some numbers from the past (I don't have the latest info). It is perfectly possible to get yields of >70% in a 180nm process for dies of 100mm2 *without* using any redundancy.
When you think about it, that's really staggering: if you have 300 dies per 8" wafer and 30% of them fail, that means that you have only around 100 defects per wafer. It's actually a bit more, because additional defects on an already-dead chip do not decrease yield any further in this case.
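To put a rough number on "a bit more": here's a small sketch that assumes defects land on dies independently, i.e. a plain Poisson model. That model is my assumption for illustration, not something a fab would use as-is, and the function name is made up.

```python
import math

def defects_per_wafer(yield_fraction, dies_per_wafer):
    """Estimate total defects per wafer from the zero-redundancy yield.

    Assumption: defects per die follow a Poisson distribution, so
    yield = P(0 defects) = exp(-mean), hence mean = -ln(yield).
    """
    mean_defects_per_die = -math.log(yield_fraction)
    return mean_defects_per_die * dies_per_wafer

# Naive count: every failed die has exactly one defect.
naive_estimate = (1 - 0.70) * 300

# Poisson count: some failed dies carry two or more defects.
poisson_estimate = defects_per_wafer(0.70, 300)

print(f"naive:   {naive_estimate:.0f} defects per wafer")
print(f"poisson: {poisson_estimate:.0f} defects per wafer")
```

With the 70% and 300-die figures this lands at roughly 90 versus 107 defects per wafer, so "around 100" is the right ballpark either way.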
Looking at the histogram of the number of defects per die, you'll see that 70% have 0 defects, 15% have 1 defect, 10% have 2 defects etc. (Just making up the numbers, but it's something like that.)
Using the defect/die distribution above, it means that adding just 1 redundant atomic block will increase the yield from 70% to 85%. 2 redundant blocks increase it from 85% to 95% etc. Clearly, the bang/buck for each additional redundant block goes down very quickly.
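The same arithmetic as a quick sketch, using the made-up histogram from above. It assumes every defect kills exactly one atomic block, any spare can replace any block, and defects in the spares themselves are ignored. The function name is hypothetical.

```python
def yield_with_spares(defect_histogram, spares):
    """Fraction of dies that work when up to `spares` blocks can be replaced.

    defect_histogram[k] is the fraction of dies with exactly k defects.
    A die survives if its defect count does not exceed the spare count.
    """
    if spares >= len(defect_histogram):
        raise ValueError("histogram has no data for that many defects")
    return sum(defect_histogram[: spares + 1])

# Made-up numbers from the post: 0, 1 and 2 defects per die.
# The remaining 5% of dies have 3 or more defects and are not modelled.
histogram = [0.70, 0.15, 0.10]

yields = [yield_with_spares(histogram, s) for s in range(3)]
gains = [after - before for before, after in zip(yields, yields[1:])]

for s, y in enumerate(yields):
    print(f"{s} spare block(s): {y:.0%} yield")
print("gain per extra spare:", [f"{g:.0%}" for g in gains])
```

The shrinking entries in `gains` are the diminishing bang/buck: the first spare buys 15 points of yield, the second only 10.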
Now this is an ideal case, where the core of the chip is nothing but exact copies of the same atomic blocks. (If you ever wonder why DRAM's have production yields above 98%, here's your answer.)
In the real, non-DRAM, world, you have different functional blocks. E.g. a number of identical DSPs, transmogrifiers, pixel shaders, what not. If you want redundancy there, you'll have to add redundancy for each of those individual units. Obviously, this complicates matters somewhat, but trust me that all big chip houses have software to calculate the !/$ ratio for adding or not adding redundancy for each of those units.
Defect density for 90nm must be quite a bit higher than 180nm. And a die of 350mm2 instead of 100mm2 also increases the number of defects per die, so a higher redundancy may be needed, but I'd be very *very* surprised if it's higher than, say, 10%.
Note also that for a chip of the same size, you need less redundancy if you have more but smaller atomic blocks instead of fewer but larger blocks, since in the former case you have a higher granularity of enabling or disabling a defective unit. The 580, with its 48 pixel shaders, is probably in that camp.
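For a feel of how much die area alone hurts, here's a rough sketch that scales the 100mm2 / 70% reference point up to 350mm2. It assumes a Poisson defect model and treats the defect density ratio between the two processes as an unknown knob (`density_ratio`, a made-up parameter), since I don't have real numbers for it.

```python
import math

def scaled_yield(ref_yield, ref_area_mm2, new_area_mm2, density_ratio=1.0):
    """Scale a reference yield to a different die area and defect density.

    Assumption: Poisson model, so mean defects per die = density * area
    and zero-redundancy yield = exp(-mean).
    Returns (mean defects per die, yield without any redundancy).
    """
    ref_mean = -math.log(ref_yield)
    new_mean = ref_mean * (new_area_mm2 / ref_area_mm2) * density_ratio
    return new_mean, math.exp(-new_mean)

# Same defect density as the old process (optimistic lower bound).
mean_same, yield_same = scaled_yield(0.70, 100, 350)

# Hypothetical: the newer process has twice the defect density.
mean_double, yield_double = scaled_yield(0.70, 100, 350, density_ratio=2.0)

print(f"same density:   {mean_same:.2f} defects/die, {yield_same:.1%} raw yield")
print(f"double density: {mean_double:.2f} defects/die, {yield_double:.1%} raw yield")
```

Even at unchanged defect density the raw yield drops below 30%, which is why some redundancy becomes attractive on a die this size.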
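The granularity argument in numbers, as a sketch under simplifying assumptions: defects per die are Poisson, each defect kills one block, and the whole die consists of identical repairable blocks. The 1.25 defects per die and the 95% target are hypothetical values picked for illustration.

```python
import math

def poisson_cdf(k, mean):
    """P(at most k defects) for a Poisson distribution with the given mean."""
    return sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1))

def spares_needed(mean_defects, target_yield):
    """Smallest number of spare blocks that reaches the target yield.

    Conservative: assumes every defect hits a different block.
    """
    spares = 0
    while poisson_cdf(spares, mean_defects) < target_yield:
        spares += 1
    return spares

def redundancy_overhead(num_blocks, mean_defects, target_yield):
    """Spare blocks as a fraction of the working blocks."""
    return spares_needed(mean_defects, target_yield) / num_blocks

MEAN_DEFECTS = 1.25   # hypothetical defects per die
TARGET = 0.95         # hypothetical yield target

spares = spares_needed(MEAN_DEFECTS, TARGET)
coarse = redundancy_overhead(4, MEAN_DEFECTS, TARGET)    # 4 large blocks
fine = redundancy_overhead(48, MEAN_DEFECTS, TARGET)     # 48 small blocks

print(f"spares needed: {spares}")
print(f"overhead with 4 blocks:  {coarse:.1%}")
print(f"overhead with 48 blocks: {fine:.1%}")
```

The number of spares depends only on the defect count, so cutting the same area into 48 blocks instead of 4 makes each spare twelve times cheaper.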
As for the multiple of 3 instead of a power of 2: nah, that's really not an issue. It means that some 2-bit busses will carry numbers from 0-2 instead of 0-3, which wouldn't make Claude Shannon proud in terms of information code density, but that's about all there is.
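For the curious, the code density loss is easy to quantify. This is just the ratio of the information needed to pick one of N values to the bus width actually used; the function name is made up.

```python
import math

def code_density(num_values):
    """Information content of one of `num_values` symbols divided by the
    number of whole bits needed to carry it on a bus."""
    if num_values < 2:
        raise ValueError("need at least two distinct values")
    bus_bits = (num_values - 1).bit_length()   # whole bits required
    return math.log2(num_values) / bus_bits

density_three = code_density(3)   # values 0-2 on a 2-bit bus
density_four = code_density(4)    # values 0-3 on a 2-bit bus

print(f"3 values on 2 bits: {density_three:.1%} of capacity used")
print(f"4 values on 2 bits: {density_four:.1%} of capacity used")
```

So a 2-bit bus carrying 0-2 wastes about a fifth of its capacity, which is noise compared to everything else on the chip.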