-> AlStrong thinking aloud, not intended as a rebuttal to anyone in particular, regarding yields & overheating due to the GPU.
Yield is the number of acceptable chips produced from a single wafer. In PC land it leads to speed binning, since certain chips simply cannot reach a given clock speed because of manufacturing issues with the transistors, be it improper doping or slight physical defects. For those chips the voltage/power needed to hit the target clock is out of spec, and then you have thermal issues, which are not the norm: some chips operate fine at a given speed, others require more power to accomplish the same.
In the case of Xenos, the mother-daughter die arrangement is not unlike Pentium D or Core 2 Quad: two chips are produced separately to increase the manufacturing yield.
The mother die of Xenos has 232 million transistors, so with a smaller die it should get higher yields per wafer and clock nicely (obviously, the two chips operate somewhat differently with respect to performance per watt, but 50-70 million transistors should be a big enough difference to entail lower thermal characteristics). The catch is that heat is also being produced by the 100 million transistor eDRAM die that sits nearby (i.e. the ROPs).
Putting the two close to one another just creates another problem, as it compounds the thermal issue, which should have been dealt with appropriately in the first place. But that does not necessarily make the yields of both dice worse (could be one or the other).
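To put rough numbers on the yield argument, here is a minimal sketch using the simple Poisson yield model (yield = exp(-area x defect density)). The defect density and both die areas are made-up values for illustration, not actual Xenos figures, and the function names are my own.

```python
import math

def poisson_yield(area_cm2, defects_per_cm2):
    # Simple Poisson model: probability that a die of this area has zero defects.
    return math.exp(-area_cm2 * defects_per_cm2)

def wafer_area_per_good_part(area_cm2, defects_per_cm2):
    # Silicon area you have to fabricate, on average, to get one working die.
    return area_cm2 / poisson_yield(area_cm2, defects_per_cm2)

# All three numbers are hypothetical, for illustration only.
D0 = 0.5             # defects per cm^2
MOTHER_AREA = 1.8    # cm^2
DAUGHTER_AREA = 0.7  # cm^2

# Two separate dice: a defect only kills the die it lands on.
split_cost = (wafer_area_per_good_part(MOTHER_AREA, D0)
              + wafer_area_per_good_part(DAUGHTER_AREA, D0))
# One big die with the same total area: any defect kills the whole thing.
mono_cost = wafer_area_per_good_part(MOTHER_AREA + DAUGHTER_AREA, D0)

print(f"two dice: {split_cost:.2f} cm^2 of wafer per good pair")
print(f"one die:  {mono_cost:.2f} cm^2 of wafer per good chip")
```

With these assumed inputs the split version burns noticeably less silicon per working part. The model ignores wafer edge loss, packaging yield and clock-speed binning, so treat it as a back-of-the-envelope check only.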
---------------
What I am not clear on is whether the RRoD is due to insufficient cooling, shoddy or inconsistent soldering, cheap components, poor application of thermal paste, or all of the above. Heck, I'm not clear on just how serious these RRoDs really are versus what the internet says.
If it's purely a cooling problem of the GPU, then some chips are clearly worse quality than others, and the speed binning is rather aggressive at 90nm for the combined package.
Assuming they don't mess with the voltages and power for every single box, the problem might not be related to the temperature so much as insufficient voltage. The console is supposed to be of constant construction in every aspect, and they would have to adjust and increase the voltage supply of the "worse-quality" GPUs, in which case, the temperature would increase... I'm not sure what their policy is on that.
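For a feel of why a voltage bump raises the temperature, here is a rough sketch using the usual approximation that dynamic power scales with V^2 x f. The 5% bump is a hypothetical figure, not anything known about the console, and leakage power is ignored.

```python
def relative_dynamic_power(v_new, v_ref, f_new=1.0, f_ref=1.0):
    # Dynamic power ~ C * V^2 * f; capacitance cancels out in the ratio.
    return (v_new / v_ref) ** 2 * (f_new / f_ref)

# Hypothetical: a "worse-quality" chip needs 5% more voltage at the same clock.
ratio = relative_dynamic_power(1.05, 1.00)
print(f"power relative to a nominal chip: {ratio:.4f}")
```

So under this approximation a small voltage increase costs roughly twice that percentage in extra heat the cooler has to handle.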
Of course... shoddy soldering would create its own set of problems, be it increased resistance due to insufficient solder or short circuiting due to too much solder. Melting seems improbable... even a low-melting lead-free solder such as indium-tin melts at about 118 C. *shrug*
Poor quality capacitors would just screw with the power regulation.