As we all know, the trick is to keep IPC up while getting clock speeds to ramp. Obviously Intel had little problem getting to the 4GHz+ mark, but was complete shite when it came to IPC.
I don't really know where the barrier is, but if I had to guess, it would be the logic routes / traces / something else other than process lithography. If Intel could mass produce a 3.73GHz CPU on 90nm, I doubt that AMD's 65nm node is actually behind that curve, so it has to be something in the layout or logic.
It's probably a combination of a lot of factors, including AMD's 65nm process.
The 10h pipeline is not significantly different from K7, a design that was created before the serious issues with leakage and signal integrity that popped up below the 130nm node.
Without a design that really took these new issues into account, the basic architecture would become a stumbling block.
Intel was in some ways right with the P4. While the thermal problems were a nightmare, it was clear that NetBurst was not timing or signaling limited.
Both Intel and IBM ramped up either totally new or significantly redesigned chips for 65nm, and they got respectable clocks for it, though IBM's different design constraints make it a difficult comparison.
AMD did not make such a redesign, and it suffered for it.
AMD's voltage scaling has been lackluster, and its power curve turns extremely steep very early, even compared to the paltry gains other manufacturers have made. Once again, that comes down to problems with signal integrity and leakage.
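To put rough numbers on why a steep voltage requirement hurts so much: switching power goes roughly as C * V^2 * f, so if every extra speed bin needs a voltage bump, power climbs a lot faster than the clock does. A minimal sketch, assuming made-up voltage/frequency points (these are not measured AMD figures) and ignoring leakage entirely, which would only make things worse:

```python
def dynamic_power(c_eff, volts, freq_hz):
    """Classic switching-power approximation: P = C * V^2 * f."""
    return c_eff * volts ** 2 * freq_hz


def relative_power(base, point):
    """Power at `point` divided by power at `base`.

    Each argument is a (volts, freq_hz) tuple. The effective capacitance
    is the same chip in both cases, so it cancels out of the ratio.
    """
    c_eff = 1.0  # arbitrary, cancels in the ratio
    return dynamic_power(c_eff, *point) / dynamic_power(c_eff, *base)


# Hypothetical operating points, chosen only for illustration.
base = (1.20, 2.0e9)   # 1.20 V at 2.0 GHz
fast = (1.35, 2.6e9)   # 1.35 V at 2.6 GHz

ratio = relative_power(base, fast)
print(f"clock up {fast[1] / base[1]:.2f}x, dynamic power up {ratio:.2f}x")
```

In this invented case a 30% clock gain costs about 65% more switching power, before leakage is even counted.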
The process isn't blameless, since it shouldn't have been such a barrier for K8's scaling at 65nm.
AMD's designs have not significantly changed the FO4 per pipeline stage, which means the logic depth per stage is roughly a constant and there is one fewer factor standing between final clocks and the quality of the process.
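The FO4 argument can be sketched in a few lines. The cycle time is approximately the number of FO4 delays per stage times the delay of one FO4 inverter on that process, so if the first number stays fixed, the clock moves only with the second. The stage depth and FO4 delays below are hypothetical values picked for round numbers, not figures for any real AMD part or process:

```python
def max_clock_ghz(fo4_per_stage, fo4_delay_ps):
    """Approximate clock ceiling from pipeline depth and process speed.

    cycle time (ps) = FO4 delays per stage * delay of one FO4 (ps)
    This ignores clock skew, jitter and wire delay, so it is an
    upper bound at best.
    """
    cycle_ps = fo4_per_stage * fo4_delay_ps
    return 1000.0 / cycle_ps  # 1000 ps per ns, so the result is in GHz


# Same design (fixed FO4 per stage) on a slower and a faster process.
# All three numbers are assumptions for illustration only.
STAGE_DEPTH_FO4 = 20
for fo4_ps in (25.0, 20.0):
    ghz = max_clock_ghz(STAGE_DEPTH_FO4, fo4_ps)
    print(f"FO4 = {fo4_ps} ps -> about {ghz:.2f} GHz")
```

With the stage depth held constant, the only way to a higher clock in this model is a faster FO4, which is why the process gets so much of the blame here.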
For part of the problem with clocking high, we can look at the gate oxide thickness, which regressed (got thicker) from 90nm to 65nm.
I ran across discussions suggesting that part of the reason for this regression was AMD's shift to 300mm (12 inch) wafers, where device variation worsened parametric yields across the larger wafers. The thicker oxide kept things in line, but at the cost of cutting off the top speed bins.
AMD's design has not changed enough to cope with a significantly more stringent manufacturing environment, and its process is not up to snuff.
A lot of this is the result of limited resources and serious missteps.