There's zero evidence that the chip has any such problems. I'll let you dig for evidence to the contrary - it's ridiculous to use this as the basis of any kind of argument.
A few days ago, somebody posted a link to graphs that illustrate performance vs memory clock speed. There were ugly effects in there with negative correlation. That's often a sign of chaos theory at work and very difficult to design away.
The unusually high performance drops when switching on AA are also suspicious.
A lot of resources are fighting in parallel for the ALUs and the memory controllers. It's extremely easy to overlook secondary effects during design that can cut a large slice off your theoretical performance (especially with decentralized bus architectures).
Several times, ATI has praised their ability to tweak their arbiters for individual performance cases, which has been interpreted by many as a great advantage. A different point of view would be that this exposes a significant weakness in their architecture. If you've ever been involved in the design of anything arbitration related, you'll know that software driver guys hate to mess with knobs they don't understand and usually leave them at one standard setting. It also gives little hope to those who want to see a game optimized that's not part of the top-20 marketing lineup.
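To make the "knobs" point concrete, here's a minimal sketch of a weighted arbiter, where the per-client weights are the knobs a driver would have to tune. Everything in it is an assumption for illustration: the client names, the weights, the smooth weighted round-robin scheme, and the premise that every client requests on every cycle. It is not a description of how R600's arbiters actually work.

```python
def arbitrate(weights, cycles):
    """Smooth weighted round-robin over clients that always request.

    weights: dict mapping a (hypothetical) client name to its integer weight.
    Returns a dict with the number of grants each client received.
    """
    credits = {name: 0 for name in weights}
    grants = {name: 0 for name in weights}
    total = sum(weights.values())
    for _ in range(cycles):
        # Every client earns credit in proportion to its weight.
        for name, weight in weights.items():
            credits[name] += weight
        # The client with the most credit wins this cycle and pays for it.
        winner = max(credits, key=credits.get)
        credits[winner] -= total
        grants[winner] += 1
    return grants


if __name__ == "__main__":
    # Hypothetical knob settings: texture fetches favored 3:1 over color writes.
    print(arbitrate({"texture": 3, "color": 1}, cycles=8))
```

Even in this toy version, the right weights depend entirely on the workload, which is why a setting tuned for one title says little about the next.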
About as ridiculous as ignoring the yield-advantages that R6xx's fine-grained redundancy affords ATI?
All we know is that they have 1 spare for every 16 ALUs. We've yet to see an example of other places with redundancy, unlike, say, an 8800GTS, which has both cluster and (a first) fine-grained MC redundancy. It's possible that R600 supports MC configurations of, say, 448 bits, but I doubt it. R600 doesn't seem to be able to use its full bandwidth anyway, so there's little reason not to improve yields this way.
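For a rough feel of what one spare per 16 ALUs buys, here's a minimal sketch using a plain binomial model. The assumptions are mine: a block is 16 active ALUs plus 1 spare, defects hit ALUs independently, and the per-ALU defect probability is made up. `block_yield` is a hypothetical helper, not anything published by ATI.

```python
from math import comb


def block_yield(active, spares, p_defect):
    """Probability that at least `active` of `active + spares` units are good.

    Assumes each unit is defective independently with probability p_defect.
    """
    total = active + spares
    p_good = 1.0 - p_defect
    # The block survives as long as the number of bad units fits in the spares.
    return sum(
        comb(total, bad) * p_defect**bad * p_good ** (total - bad)
        for bad in range(spares + 1)
    )


if __name__ == "__main__":
    p = 0.01  # hypothetical per-ALU defect probability
    print("no spare :", block_yield(16, 0, p))
    print("one spare:", block_yield(16, 1, p))
```

This only covers the ALU blocks themselves; it says nothing about defects landing anywhere else on the die.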
As we discussed a long time ago, it is nice to have extremely fine-grained redundancy, but it's more important first to make sure that as much of the chip's area as possible is potentially redundant, while still having a nice ratio of active vs disabled units. With the ALUs of R600 alone, the overall area part is not covered. And the R600 configuration with a 256-bit MC doesn't have a nice ratio.
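The area argument can be sketched with a simple Poisson defect model. This is a back-of-the-envelope illustration under my own assumptions: defects are spread uniformly over the die, any defect in uncovered area kills the chip, and defects in covered area are repaired with some fixed probability. The function name, parameters, and numbers are all hypothetical.

```python
from math import exp


def chip_yield(defects_per_die, covered_fraction, repair_success):
    """Estimate die yield when only part of the area is covered by redundancy.

    defects_per_die:  expected number of defects on the whole die (Poisson mean).
    covered_fraction: share of die area that redundancy can repair (0.0 to 1.0).
    repair_success:   chance that covered area with defects is still usable.
    """
    # Uncovered area must be completely defect free.
    uncovered_ok = exp(-defects_per_die * (1.0 - covered_fraction))
    # Covered area is fine if it is clean, or if its defects get repaired.
    covered_clean = exp(-defects_per_die * covered_fraction)
    covered_ok = covered_clean + (1.0 - covered_clean) * repair_success
    return uncovered_ok * covered_ok


if __name__ == "__main__":
    # Hypothetical: 0.5 expected defects per die, repairs succeed 90% of the time.
    for fraction in (0.0, 0.3, 0.6, 0.9):
        print(fraction, chip_yield(0.5, fraction, 0.9))
```

In this model the covered fraction sits in the exponent of the uncovered term, so extending coverage to more of the die moves the result more than making repair finer grained within an already covered block.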
With that in mind, I don't think there's a lot of regret at Nvidia about their redundancy strategy.