Take a look at actual Prescott vs Northwood benchmarks: at 3.2 GHz, Prescott generally falls behind Northwood by about 1-3 percent, so it definitely has a worse IPC at that clock speed. Many of the changes made in Prescott are meant to help clock speed but also hurt IPC: L1 and L2 cache latencies are doubled, the execution pipeline is 55% longer (from 20 to 31 stages), the integer ALUs are apparently no longer double-pumped, and many common FPU/SSE instructions have 1 clock of extra latency.
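For anyone who wants to check the arithmetic in that post, here is a rough sketch. It assumes the usual simplification that performance = IPC x clock, so at equal clocks a benchmark deficit is an IPC deficit. The function names are made up for illustration, and the 1-3 percent figures are simply the ones quoted above.

```python
def pipeline_growth(old_stages, new_stages):
    """Fractional increase in pipeline depth."""
    return (new_stages - old_stages) / old_stages


def relative_ipc(perf_ratio, clock_a_ghz, clock_b_ghz):
    """IPC of chip A relative to chip B, assuming performance = IPC * clock."""
    return perf_ratio * clock_b_ghz / clock_a_ghz


def break_even_clock(ipc_ratio, clock_b_ghz):
    """Clock chip A needs to match chip B, given A's IPC relative to B."""
    return clock_b_ghz / ipc_ratio


# 20 -> 31 stages, as quoted in the post
print(f"pipeline is {pipeline_growth(20, 31):.0%} longer")

# Prescott trailing Northwood by 1% and 3% at the same 3.2 GHz clock
for deficit in (0.01, 0.03):
    ipc = relative_ipc(1.0 - deficit, 3.2, 3.2)
    clock = break_even_clock(ipc, 3.2)
    print(f"relative IPC {ipc:.2f} -> needs about {clock:.2f} GHz to match")
```

Under that simplification, a 3 percent IPC deficit is made up by roughly 100 MHz of extra clock, which is why the scaling headroom matters more than the deficit itself.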
http://www.aceshardware.com/read.jsp?id=60000315
According to the first chart, the L1 and L2 cache latencies seem unchanged.
Yes, they made those changes for clockspeed, but the larger caches and improvements to HT do feed the CPU better. They also added more execution units and various other improvements in terms of scheduling and issuing.
One issue is the deeper pipes. The longer the pipeline the greater the amount of bandwidth needed to keep those pipes full. At the moment that doesn't seem to be an issue as Prescott doesn't appear to be suffering from bandwidth starvation, in fact seeming to use it more efficiently, but it's going to need a lot more as it scales. Then we may well see problems, because neither memory nor hard-drive speeds are scaling anywhere near as quickly as processors.
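The worry about memory not keeping up can be sketched with a textbook-style stall model. This is a toy under loud assumptions: memory latency is fixed in nanoseconds, every miss stalls the core for the full latency (no overlap, no prefetching), and the miss rate and latency numbers are invented for illustration, not measured on any real chip.

```python
def effective_ipc(core_ipc, misses_per_instruction, memory_latency_ns, clock_ghz):
    """IPC after memory stalls, assuming each miss blocks for the full latency."""
    core_cpi = 1.0 / core_ipc
    # A fixed latency in ns costs more cycles as the clock rises (1 GHz = 1 cycle/ns).
    stall_cycles_per_miss = memory_latency_ns * clock_ghz
    cpi = core_cpi + misses_per_instruction * stall_cycles_per_miss
    return 1.0 / cpi


# Hypothetical numbers: core IPC of 1.0, 2 misses per 1000 instructions, 100 ns memory
for clock_ghz in (2.0, 3.0, 4.0, 5.0):
    ipc = effective_ipc(1.0, 0.002, 100.0, clock_ghz)
    print(f"{clock_ghz:.1f} GHz: IPC {ipc:.2f}, throughput {ipc * clock_ghz:.2f} G instr/s")
```

In this toy model throughput still rises with clock, but each extra GHz buys less, because the same miss costs more cycles.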
Huh? Couched within your statements seems to be the claim that higher IPC doesn't require more bandwidth as well. Overall, the effect you SEEM to be aiming for with your post is bunk. More computation per unit time strongly correlates with the need for more data movement per unit time. With deeper pipelined architectures you can better hide memory latencies, relying less on faster memory architectures and more on speculative execution. Then again, not all software is suited to this.
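The point about computation and data movement can be put in a few lines. This is a minimal sketch, assuming a simple model where each instruction moves a fixed average number of bytes to or from memory; the function name and all the numbers are hypothetical.

```python
def memory_traffic_gb_per_s(ipc, clock_ghz, bytes_per_instruction):
    """Sustained memory traffic in GB/s under a fixed bytes-per-instruction model."""
    instructions_per_ns = ipc * clock_ghz  # 1 GHz = 1 cycle per ns
    return instructions_per_ns * bytes_per_instruction  # bytes/ns is the same as GB/s


# Two hypothetical designs with identical throughput (3 instructions per ns)
wide_and_slow = memory_traffic_gb_per_s(ipc=1.5, clock_ghz=2.0, bytes_per_instruction=0.5)
narrow_and_fast = memory_traffic_gb_per_s(ipc=0.75, clock_ghz=4.0, bytes_per_instruction=0.5)

print(wide_and_slow, narrow_and_fast)
```

Under this model the bandwidth demand depends only on the product of IPC and clock, so getting the same throughput through higher IPC does not reduce it.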
I still think a more balanced approach of increasing IPC and clockspeed would be the way to go.
Based on what? A gut feeling, which in turn is based on what? That it sounds reasonable if one disregards pragmatics? x86 code tends to yield few parallelizable instructions (usually about 3), which is why Intel is getting on the HT bandwagon: to stretch its execution resources further by extracting parallelism across threads.
Northwood's overall performance increase over Willamette is due to increasing both clock speed and IPC. Northwood's actually a very good example of how to improve the performance of a CPU since it hits all three of my points. The smaller process allowed for improved scaling, HT improved the IPC, as did the larger onboard cache, while increasing the FSB removed a system bottleneck. I differentiate that from the cache because it only involves changing settings, not the core.
Yes, but sustaining IPC at high clock rates is the real issue. Yes, at lower clock rates the Northwood is showing up the Prescott, point conceded; my predictions were otherwise. This, however, doesn't change the fact that at higher clock rates Northwood can't sustain its IPC while the Prescott can. Many a fool said Intel should have just extended the PIII architecture rather than going with the Willamette core. The issue with that is that an extended PIII architecture, say with improved scheduling, HT, issue, clock distribution, a longer pipeline and greater execution resources, is a new core. Why not make a new core? Oh, wait, they did! A PIII with improvements made only for clock doesn't make sense either, since your IPC would just drop like a rock too! The point is sustaining a level of IPC at higher clock rates. In other words, let the Prescott scale a bit, clock up a Northwood, and you'll see the Prescott's IPC fall off more gracefully by comparison.
NetBurst is dead? Back to the P6 microarchitecture??? Another testament to slowing CPU clock speeds...
That article says nothing about the configuration of this MPU, so one cannot say that NetBurst is dead. Who knows, the Banias might see a more gradual shift into a NetBurst-esque beast than the PIII to P4 transition was. In any case, there will be different computing needs by that time; computing pads will likely be all the rage, and thus power will be a larger factor in governing MPU design.