Should Intel discontinue the Netburst architecture?

It seems that the "Netburst" architecture was not fundamentally sound in the first place, even though the Pentium 4 has won more than its share of "Top CPU performance crowns". Am I alone in thinking that Intel should just discontinue the "Netburst" architecture and start from scratch? :?
 
...

It seems that the "Netburst" architecture was not fundamentally sound in the first place, even though the Pentium 4 has won more than its share of "Top CPU performance crowns". Am I alone in thinking that Intel should just discontinue the "Netburst" architecture and start from scratch?
It's not a question of "should", as Intel has already decided to drop NetBurst by 2007 and roll back to the P6 microarchitecture, according to Taiwanese motherboard vendors...
 
Cool! :D I didn't know that! :oops: Do you know what direction Intel is going to take in 2007? More efficient clock speeds? Multi-core CPUs?
 
Re: ...

Deadmeat said:
It seems that the "Netburst" architecture was not fundamentally sound in the first place, even though the Pentium 4 has won more than its share of "Top CPU performance crowns". Am I alone in thinking that Intel should just discontinue the "Netburst" architecture and start from scratch?
It's not a question of "should", as Intel has already decided to drop NetBurst by 2007 and roll back to the P6 microarchitecture, according to Taiwanese motherboard vendors...
Great news :D
Any link?
 
I don't have a link, but I also read that Intel's next core will be based more on the Pentium M than the Pentium 4.
 
It's still a ways off before netburst is EOL. It might not be until late 2006 or later.

I am curious to see how much the Pentium M has changed from the P6 form. I haven't seen any dissections of it yet, though it may be that the core is only distantly related to the P6 lineage.
 
...

I am curious to see how much the Pentium M has changed from the P6 form. I haven't seen any dissections of it yet, though it may be that the core is only distantly related to the P6 lineage.
The first Pentium Ms were being identified as Pentium IIIs by utilities, so go figure...
 
Re: ...

Deadmeat said:
The first Pentium Ms were being identified as Pentium IIIs by utilities, so go figure...

That's just because the Pentium M also uses 6 as its family ID. All P6 processors use 6 as their family ID.
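To illustrate the family ID point, here is a minimal sketch of how a utility might split the CPUID signature (EAX from leaf 1) into family, model, and stepping. The function name is made up for this example, and the three signatures are commonly reported values included as assumptions, not numbers from this thread.

```python
def decode_cpuid_signature(eax: int) -> dict:
    """Split a CPUID leaf-1 EAX signature into family, model, and stepping."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family is only added when the base family is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family
    # The extended model is only used for base families 6 and 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return {"family": family, "model": model, "stepping": stepping}


# Example signatures (assumed typical values, for illustration only).
SIGNATURES = {
    "Pentium III (Coppermine)": 0x686,
    "Pentium M (Banias)": 0x695,
    "Pentium 4 (Northwood)": 0xF27,
}

for name, signature in SIGNATURES.items():
    print(name, decode_cpuid_signature(signature))
```

A tool that looks only at the family field sees 6 for both the Pentium III and the Pentium M, so it needs the model field to tell them apart.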

The Pentium M is clearly derived from the P6 core. However, it has gone through a major revision, which may be more significant than the difference between the Pentium III and the original Pentium Pro.

This article talked about some key features of Pentium M.
 
The two biggest changes from P6 (PRO/2/3) to P-M are the bus interface (quad pumped like the P4) and the substitution of the decoder front-end with a simpler decoder+trace cache (again like the P4).

AFAIK the OOO execution core is a more or less unmodified P6.

I doubt Intel will use the P-M to compete against Athlon 64 (they would lose), but they might use something built along the same guidelines (much more focus on power).

Cheers
Gubbi
 
Merom and Conroe

I wouldn't quite count the netburst architecture out just yet. Tejas is shaping up to be a big letdown, but there are a few others waiting in the wings that I've heard rumors about.

Let me summarize what I can say.
- all future netburst will be prescott derivatives and not tejas
- L1 prefetcher, giving large performance gains
- approximately 20-30% ipc boost over prescott in Comanche
- approximately 50% ipc boost over prescott in gzip and I'd imagine it would be a similar increase for other compression tools as well.
- improved SMT, ability to put threads to sleep when they have a cache miss and need to go to memory, that way another thread will be able to get some useful work done.
- there has been talk of an integrated memory controller, but I'm unsure if that is actually going to pan out.

Tejas tidbits:
- Longer pipeline than prescott
- 10% ipc improvement over prescott
- runs really hot
- currently having a lot of trouble clocking high. Clocks more than 10% lower than prescott, though I'd assume this is going by just gate delay since they haven't taped out yet.

Overall it's actually still undecided whether or not netburst is going to be axed, even though it does look like it. Either way we still have another 2-3 netburst cores to look forward to before the expected transition happens, if it does.

I haven't been able to drag out of my source what exactly Merom, Conroe, or any of the other P6 based chips are shaping up to be, so sorry I can't fill you in there.

Edit: Removed a link that Kenneth already posted and I didn't notice.
 
Gubbi said:
The two biggest changes from P6 (PRO/2/3) to P-M are the bus interface (quad pumped like the P4) and the substitution of the decoder front-end with a simpler decoder+trace cache (again like the P4).

Actually just a minor correction, I don't believe P-M has a trace cache, yet. And if anything the decoder got more complex with the inclusion of the micro-op fusion.

Along with the better branch prediction and wider/faster bus, if Intel adds an on-die memory controller and AMD64 compatibility with the extra registers, then possibly adds a second core in order to continue to have "HyperThreading" (personal opinion here, but I don't think P6 is either deep enough or wide enough to make true SMT much use), I think Intel will certainly have a decent competitor for future K8 chips.

Edit: I forgot to include a link to a lot of good Banias info: http://arstechnica.com/cpu/004/pentium-m/pentium-m-1.html
 
Gubbi said:
The two biggest changes from P6 (PRO/2/3) to P-M are the bus interface (quad pumped like the P4) and the substitution of the decoder front-end with a simpler decoder+trace cache (again like the P4).

AFAIK the OOO execution core is a more or less unmodified P6.

Are you sure? In the article I linked in my previous post, it looks like the decoder remains almost the same as in P6 (it also has the 4-1-1 template, as in P6). Most of the modifications are in the OOO execution core, such as a new branch predictor, micro-ops fusion, and the dedicated stack engine, all of which are inside the execution core (except that micro-ops fusion is somewhat related to the x86 decoder). There is no trace cache in the Pentium M.

As for the netburst architecture, I actually think the Northwood core is pretty good. The double pumped ALU is amazing, with 0.5 cycle latency for many simple operations. The new Prescott core is very strange, with many questionable design decisions. IMHO if Intel had decided to just shrink the Northwood core to their 90nm process, it might simply perform better.
 
Trace Cache

Now I have an interesting question for anyone with more engineering knowledge than me. Do you think it would be worthwhile to add a trace cache (replacing the L1 instruction cache of course) to the P6 core?

I'd imagine it would lessen the impact of branch mispredicts even more. Weren't Tualatin and Coppermine somewhere around 12 pipeline stages? I don't remember exactly, and of course Banias and Dothan have extended that slightly. So a trace cache would effectively shorten the pipeline in a Tualatin/Coppermine to approximately 8 stages, and only a few more for the P-Ms.

So what does everyone think? Is P6 too short of a pipeline to even bother with a trace cache or would it help?
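The pipeline-length reasoning in this post can be put into a rough back-of-the-envelope model. This is only a sketch: it assumes the mispredict penalty is about equal to the number of pipeline stages that must be refilled, and the base CPI, branch fraction, and mispredict rate are made-up illustrative inputs, not measurements. The 12 and 8 stage figures are the guesses from the post itself.

```python
def cpi_with_mispredicts(base_cpi, branch_fraction, mispredict_rate, penalty_cycles):
    """Average cycles per instruction once branch mispredict stalls are added."""
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


# Hypothetical workload parameters (assumptions for illustration only).
BASE_CPI = 1.0          # CPI with perfect branch prediction
BRANCH_FRACTION = 0.20  # one instruction in five is a branch
MISPREDICT_RATE = 0.05  # 5% of branches are mispredicted

cpi_without_tc = cpi_with_mispredicts(BASE_CPI, BRANCH_FRACTION, MISPREDICT_RATE, 12)
cpi_with_tc = cpi_with_mispredicts(BASE_CPI, BRANCH_FRACTION, MISPREDICT_RATE, 8)

print(f"CPI with ~12-stage refill: {cpi_without_tc:.3f}")
print(f"CPI with ~8-stage refill:  {cpi_with_tc:.3f}")
print(f"Speedup from shorter refill: {cpi_without_tc / cpi_with_tc:.3f}x")
```

Under these made-up inputs the gain is only a few percent, because the penalty term is small compared with the base CPI. A workload with more or harder-to-predict branches would shift the result.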
 
pcchen said:
As for the netburst architecture, I actually think the Northwood core is pretty good. The double pumped ALU is amazing, with 0.5 cycle latency for many simple operations. The new Prescott core is very strange, with many questionable design decisions. IMHO if Intel had decided to just shrink the Northwood core to their 90nm process, it might simply perform better.

Once again here is some rumor and hearsay, but from what I've been told Prescott was supposed to be a very impressive performer with improved SMT over that of Northwood. What exactly happened, I have no idea. And I have to agree that lengthening by 50% the pipeline of a chip that is already scaling really well seemed just a bit crazy and possibly quite counterproductive. Now the chip needs to make up for the lost performance, as well as burning extra power in the extra stages.

Well, with any degree of luck a future stepping, in combination with the new socket, will take care of some of the heat problems, and hopefully there are still some hidden goodies inside the chips that they might turn on, besides the 64-bit parts, which we know are coming.
 
In P6 there are two pipeline stages for x86 decoding. It is probably not worthwhile to replace them with a trace cache. Note that the Northwood core spends about 96KB of SRAM on its 12K uOps trace cache. It could be better to use that SRAM for L1 cache rather than to shorten the pipeline by 2 stages. On the other hand, there are probably more than 5 stages before the TC in the Northwood core.
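As a quick check on the storage cost quoted above, the arithmetic can be written out. The 96KB and 12K figures come from the post; treating "K" as 1024 is an assumption, and the variable names are just for this example.

```python
# Figures quoted in the post; K is taken as 1024 (an assumption).
TRACE_CACHE_UOPS = 12 * 1024
TRACE_CACHE_SRAM_BYTES = 96 * 1024

bytes_per_uop = TRACE_CACHE_SRAM_BYTES / TRACE_CACHE_UOPS
bits_per_uop = bytes_per_uop * 8

print(f"Storage per cached uop: {bytes_per_uop:.0f} bytes ({bits_per_uop:.0f} bits)")
```

That works out to 8 bytes per stored micro-op, which is the overhead being weighed against spending the same SRAM on a conventional L1 cache.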

Of course, the P6 core suffers from the 4-1-1 template restriction. However, in most applications it's not the major bottleneck. Furthermore, it is always possible to design three complete x86 decoders, just like AMD did in K7.
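To make the 4-1-1 restriction concrete, here is a toy model of in-order decoding under a decoder template. It is a deliberate simplification: it ignores fetch alignment and instruction length limits, the function name and instruction stream are invented for the example, and the (4, 4, 4) template is a hypothetical stand-in for "three complete decoders", not a description of the real K7.

```python
def decode_cycles(uop_counts, template):
    """Count cycles needed to decode instructions in program order.

    uop_counts: micro-ops produced by each instruction, in order.
    template:   max micro-ops each decoder slot can emit per cycle,
                e.g. (4, 1, 1) for the P6-style restriction.
    """
    cycles = 0
    i = 0
    while i < len(uop_counts):
        cycles += 1
        if uop_counts[i] > template[0]:
            # Too complex for any decoder: assume it is handled alone
            # (e.g. by microcode) and costs one cycle in this toy model.
            i += 1
            continue
        for slot_limit in template:
            # Fill slots in order; stop at the first instruction that
            # does not fit the next slot, so it waits for slot 0.
            if i < len(uop_counts) and uop_counts[i] <= slot_limit:
                i += 1
            else:
                break
    return cycles


# Hypothetical instruction stream, given as micro-ops per instruction.
stream = [1, 2, 1, 1, 2, 2, 1, 4, 1, 1]

print("4-1-1 template:", decode_cycles(stream, (4, 1, 1)), "cycles")
print("4-4-4 template:", decode_cycles(stream, (4, 4, 4)), "cycles")
```

With mostly single-uop instructions the two templates decode at the same rate, which fits the point that the template is usually not the main bottleneck. The gap only opens up when multi-uop instructions land in the second or third slot.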
 
There are improvements to the Banias FPU as well, increasing FPU performance by 50%, IIRC.

The trace cache takes the x86 decoders out of the critical path and increases clock rate, nicely. The Northwood has 7 stages before the tc.

A smaller and faster L1 cache just makes a whole lot more sense than a larger one, at least in x86, where loads and stores plague the code thanks to the lack of registers. Yes, renaming does alleviate some of these issues to a degree.
 
Yes.

NetBurst is predicated on the idea that the best way to improve performance is to improve clockspeed. Any other deficiencies can be obviated by the simple process of raising speeds. Unfortunately this only works as long as you can continue to raise speeds. Prescott really needed to be introduced at 3.6 to 3.8GHz to really show Northwood its heels. It wasn't. What with heat, power, and scaling issues, Netburst is starting to run out of steam.

It's time to retire it.

They're already seeing limits now, and those limits are signs of things that just won't go away. Northwood may have taken the CPU crown, but now it's Prescott's turn-- and with the issues it's already had, Prescott may not be able to scale as high as Intel wants because no one will want a miniature sun in their desktop.

Intel's going to have to change their design strategy.
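The clock-speed argument above can be put in numbers with the usual approximation that performance scales with IPC times clock rate. This is only a sketch: the function name is invented, the 3.4GHz baseline is an assumed Northwood clock, and the 10% IPC deficit is a hypothetical figure, not a measurement of Prescott.

```python
def required_clock_ghz(baseline_clock_ghz, ipc_ratio, target_speedup=1.0):
    """Clock a new core needs to reach target_speedup over a baseline core.

    ipc_ratio is new-core IPC divided by baseline IPC; performance is
    approximated as IPC * clock.
    """
    return baseline_clock_ghz * target_speedup / ipc_ratio


# Hypothetical inputs (assumptions for illustration only).
NORTHWOOD_CLOCK_GHZ = 3.4
PRESCOTT_IPC_RATIO = 0.90  # assume 10% lower IPC from the longer pipeline

break_even = required_clock_ghz(NORTHWOOD_CLOCK_GHZ, PRESCOTT_IPC_RATIO)
ten_percent_faster = required_clock_ghz(NORTHWOOD_CLOCK_GHZ, PRESCOTT_IPC_RATIO, 1.10)

print(f"Clock to match the baseline:    {break_even:.2f} GHz")
print(f"Clock to beat it by 10 percent: {ten_percent_faster:.2f} GHz")
```

Any per-clock deficit has to be bought back with frequency before the new core shows a gain at all, which is why heat and power limits on clock speed hurt this design approach so much.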
 
So what about Intel's CPU code-named "Nehalem"? I thought Nehalem was Intel's first chip away from Netburst. Is Nehalem still on schedule? I was really looking forward to this chip.
 
kenneth9265_3 said:
So what about Intel's CPU code-named "Nehalem"? I thought Nehalem was Intel's first chip away from Netburst. Is Nehalem still on schedule? I was really looking forward to this chip.

Yes, Nehalem was supposed to be the first chip after netburst, but I don't think it ever got off the ground. The initial plans involved a chip far more complicated than netburst ever could have been... stuff like dynamic multithreading (making single threaded apps multi inside of the chip) and whatnot. But what I think Intel realized is that netburst is incredibly complicated and that making even simple little changes costs them a lot of time, money, and manpower. So now they're starting to look for simpler solutions such as multicore P6 and Itanium.

Edit: A little birdie has just told me that Nehalem is actually being restarted. They're still unclear as to what precisely they're going to do, so don't expect any major news for several more years.
 