Bob Colwell (chief Intel x86 architect) talk.

MfA said:
Those designs are made at the conceptual level. They decided to make the cuts, and then implement what they had left ... the implementation was an effort to make good on a poor design, at which they succeeded as well as could be expected.

Ahh, thanks MfA. I stand corrected.
 
Sure. But if you remember, the 2.0 GHz P4 wasn't released until quite a bit afterwards. If Intel had continued with the P3 line, they'd have been quite a bit faster than 1GHz by the time the 2GHz Willy was released.

Um... too bad your memory isn't good enough to remember that during that time they didn't have problems keeping up with AMD; in fact, they scaled back their roadmap when it came to speed bumps for that reason alone. Mind you, that was performance measured on RDRAM platforms and only comparing the flagships of each company. Not price/performance or anything like that.
 
I'll respond to this, but I'll be quite heavy handed in the editing.

WaltC said:
Entropy said:
In the context of the B3D forum, for people involved in gaming (which includes all of 3D gaming) his comment that Intel is really attentive to that group must be gratifying. However, he immediately followed up with "but you can't base a 30 billion dollar company on them" which should be a warning.

I'm not really sure what to make of that comment, as it's never really been obvious to me that Intel has ever at any time in its history been a company "based on" developing products primarily for people who play computer games...;)
To me the conclusion is pretty obvious - when asking the question "Is there any group of significant size that actively desires higher performance?", the only answer that crops up is "Gamers".

And when he says "but you can't base a 30 billion dollar company on them", he implies that Intel won't drive their mainstream CPU development based on the interests of that group. And now we see the first practical result of that - the P4 architectural branch is cut off. It will probably get a die shrink, but that's it. Such a roadmap change is no small thing, by the way; the entire industry, from memory makers to Dell, is part of that. And has probably driven it, in the case of Dell.

The short-term consequence of that roadmap change is that it will take a very long time before we see a factor-of-two performance improvement over the 3.06 GHz P4 - very far from the roughly 18 months it took during the 80s and 90s, up to the beginning of 2000. And if that means that gamers aren't happy, then so be it.

Incidentally, I feel that their "let's take the mobile chip and adapt it for the desktop" is a reasonable short-term approach. Taking a longer-term view, it might be reasonable to ask whether there have to be desktop chips at all. Intel could let their marketeers loose, simply declare "the era of ergonomic computing", say that all their mainstream chips would fit within a XXW power envelope, and bring out reference solutions that take advantage of that. World+dog would rejoice, new business opportunities all around. Not likely to happen though; the inertia in the industry is enormous.

And the general gist of the presentation really told the story - the age of pushing performance forward at the cost of other parameters is coming to a close for general-purpose computing. It's not over yet, and may never fully be, but other factors will get progressively more attention as soon as the marketeers figure out how to sell them, and they will find ways to sell those features, because, apart from gradually losing its attraction value, performance has already ceased to improve at the accustomed brisk pace.

Interesting that you'd use the word "age" in this context. It's actually more like "five years" from Intel's perspective as opposed to an "age," don't you think? Wasn't it 1999 in which the primary x86 workhorse for Intel was the PIII, which it struggled mightily to bump to 1GHz in response to the CPU performance of AMD's K7, which was nowhere near as dependent on ramping MHz clocks for its overall processing performance?
Of course competitive pressure enters into it, but there has always been some. In the early 80s it was the 68000 family, for instance; there was RISC; there have been x86 clones, et cetera. The point is that historically Intel competed on performance (and industry thumbscrews), not on power consumption, or the amount and quality of integrated functionality, et cetera. "Age" stretches back to the 8086, easily. The performance-vs-power issue simply wasn't the overwhelming concern it is today.
(By the way, I'd say that the K7 and the P3 were within spitting distance as far as IPC was concerned, though the nod would have to go to the K7.)

The Q&A session had a notable passage from 1.11.30 onward that made it very apparent that in Colwell's opinion x86 really carries a lot of baggage, and that it may not be able to compete quite as impressively against clean-sheet designs going forward. That's probably not much of an issue in PC space for compatibility reasons; for consoles, however, other rules apply. What will stagnating CPU speeds mean for the development of future PC graphics engines? And does this have any short term or long term implications for PC vs. console gaming?

Is it really "notable" that he'd say this, considering that Intel opposed x86-64 from the start, and was busy publicly telling the world that "Hey! If you run x86 software, then relax! You don't need 64-bit computing. But the good news is that when you get to Itanium you're going to love it!"...? Also, there's no doubt in my mind that by far the biggest piece of "baggage" relative to x86 that Intel would like to chuck is AMD...;)

Also, the last remarks you make as to general "graphics engines" and "consoles" and what you term "stagnating cpu speeds" by which I assume you mean "stagnating MHz clocks"--which as I pointed out does not have to mean "stagnating performance" at all--sound very much like you assume that he's speaking for AMD and everybody else. I don't think it would be wise to view any of his remarks outside of an Intel-specific context.

It is "notable". Particularly in view of the criticism he levelled at Itanium. They lost the XBox2 to another architecture, which is decent volume even in Intels book so it is not as if they do not try to compete with x86. In fact, he was so emphatic when he separated compatible and incompatible approaches that I wouldn't be terribly surprised to see an entirely new architecture out of Intel in less than 5 years. He really left very little doubt as to his personal opinion, and the man is a very senior employee at Intel. Does anyone honestly believe that Intel can see something like the Cell architecture, and not consider that it may be a good idea to have something more radical up their sleeve than on die x86 multiprocessing to counter with in case this first Cell processor turns out fine, and is easily scaleable to boot?

And no, I mean exactly what I wrote: stagnating speeds, including performance (as the architectures have been largely untouched since the introduction of the P4). How long has it been since the 1400 MHz K7 was introduced? I can almost go out and buy something twice as fast today, depending on what benchmarks you use. The thing is, CPU speeds used to double every 18 months, and we're nowhere near that pace today, nor does this stagnation look like a temporary hiccup.

He also remarked on how graphics processors are getting more programmable, and how "this hadn't gone unnoticed at Intel", a remark that's quite intriguing, and a bit disturbing if you happen to be a graphics IHV. (And of course he let slip the amount of i-cache on a gfx processor, unfortunately without saying which one.) What could he have meant by that remark?

I have no idea what he meant by it--just as I had no idea what he was talking about in saying that it should be understood that Intel couldn't be a company "based on" making products for computer gamers...:) (Since Intel never has been that--and he might be the only person alive confused about that, should he actually ever have thought that himself.)
Bob Colwell didn't strike me as particularly confused.
Anyway, I don't know what he meant by the remark - that's why it is interesting speculation fodder, no? Because it could mean a number of different things, some interesting, some not. For instance, two facts:
1. Intel is currently eating up the graphics market from below.
2. GPUs and CPUs do not compete in functionality.
Could either of these change in any way?

You seem to be of the opinion that he was spending this seminar making excuses for Intel. That's not my impression of it at all. So different people do indeed interpret things differently.
 
Chalnoth said:
Sure. But if you remember, the 2.0 GHz P4 wasn't released until quite a bit afterwards. If Intel had continued with the P3 line, they'd have been quite a bit faster than 1GHz by the time the 2GHz Willy was released.
Well, they tried to release faster 180nm P3's, but we all know what happened to them. Intel pushed the P3 until it broke (originally it was supposed to scale at a much more leisurely pace to around 700 MHz) and still couldn't touch either Athlons or P4's. I don't think that Intel could have gotten much more out of the P3 architecture without a full redesign like the PM, and there wasn't any time for that.

Now, what I would like to say is that not everything Intel did with the P4 was bad. There are definitely many things about the architecture that are very good for performance. But I think that every single one of its benefits could have been done better on a chip that was made for lower clocks and higher IPC.
I take the opposite viewpoint -- I don't think everything Intel did with the P4 was good (e.g. an L1 cache that sometimes behaves like it's direct mapped, and slow shifters). I think that the decision to go with a speed-demon design was one of several good ones available, and certainly the P4 matched the Athlon on the 180 nm process and often outperformed it on the 130 nm process. Sure, it's more dependent on optimised code, but Intel is big enough to force that on developers (just see what happened between IL-2 and IL-2 FB), and in the long run that's a good thing for everyone. Normally one would expect a 90 nm Netburst core to have competitive performance, but somewhere along the way some combination of core fixes, more use of automated design tools, mystery transistors, overestimating the effect of strained silicon, 90 nm leakage and 31 stages borked the Prescott.

But, the Intel brass has for a very long time believed that MHz is king, that people equate high frequency with high performance.
Yes, but that is only wrong if they don't deliver the performance, and the Alpha EV5 and P4 show that speed demons can work very well.
 
glappkaeft said:
Well, they tried to release faster 180nm P3's but we all know what happened to them. Intel pushed the P3 until it broke (originally it was supposed to scale at a much more leisurely pace to around 700 MHz) and still could’t touch neither Athlons nor P4's. I don't think that Intel could have gotten much more out of the P3 architecture without a full redesign like the PM and there wasn't any time for that.
It's not like they had to go the route of the P4, though. My point was that the P3 had much higher IPC than the P4. You'd think it would have been pretty straightforward for Intel to have either maintained or enhanced IPC for their next architecture. But no, they decided on "MHz or bust." This was a dead end, and they should have known it.

It's not that high frequency is a bad thing to pursue. It's that pursuing high frequency at the cost of IPC is bad. That's what Intel did, and now they're paying for it.
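For reference, the relation both sides of this IPC-versus-clock argument are implicitly using is the textbook one (a generic identity, not anything from Colwell's talk):

T_{\text{exec}} = \frac{N_{\text{instructions}}}{\text{IPC} \times f_{\text{clock}}}

So a deeper pipeline is only a win if the clock speed it buys more than offsets the IPC it gives up, which is exactly the bet the P4 design made.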
 
Entropy said:
To me the conclusion is pretty obvious - when asking the question "Is there any group of significant size that actively desires higher performance?", the only answer that crops up is "Gamers".

And when he says "but you can't base a 30 billion dollar company on them", he implies that Intel won't drive their mainstream CPU development based on the interests of that group.

Yes, but the problem is that no one ever thought that an urgent need to support computer gamers was the driving factor behind Intel's mad dash to MHz ramping since 1999, as announced when it boasted of eventually ramping the P4 to 10GHz...;) In fact, it was quite obvious what the actual pressure was--it was AMD in the form of the K7. So it's only if you want to avoid admitting the real reasons your company would embark on a course so insane as to announce the extreme MHz ramping of an existing architecture (P4), several years before you determine whether it is even technologically possible to do so, that you might want to say or imply something like "we were pushing P4 MHz ramps to support computer gaming, but have since realized that we can't base our $30-billion company on such a strategy." Heh...;)

I'm confident that at the time Intel announced its glowing, extreme MHz-ramp roadmap for the P4, Intel had no intention of ever basing its company on the needs of computer gamers, but instead was "basing the company" on a strategy it deemed necessary for competing with AMD. As such, computer gamers benefit peripherally from the robust performance competition we've seen between Intel and AMD over the last 5 years. But so do many other groups of computer users who run CPU-performance-dependent software within the markets AMD and Intel serve with their x86 CPUs: scientific computing, commercial 3D-rendering companies, companies who need powerful low-cost servers for a variety of functions, not the least of which is the Internet, etc. In short, computer gaming is but one market of many which demand the highest performance possible from an x86 CPU.

My point is that there isn't any factual foundation for an Intel employee to assert that Intel embarked, post-K7, on its wildly optimistic P4 MHz roadmap simply out of the company's perceived need to support "computer gaming," to the degree that the company was "based on" such a proposition. What Colwell might think would persuade anyone at all to consider this even fractionally likely is beyond me...;) It's a bizarre statement to make, considering all of the other, more pertinent issues Intel was facing at the introduction of the P4, which plainly were of much greater urgency to Intel than any overbearing need to get more frame rate out of Q3...;)


The short-term consequence of that roadmap change is that it will take a very long time before we see a factor-of-two performance improvement over the 3.06 GHz P4 - very far from the roughly 18 months it took during the 80s and 90s, up to the beginning of 2000. And if that means that gamers aren't happy, then so be it.

Right, these are consequences for Intel, not necessarily for anyone else. Again, we see Colwell sidestepping mention of AMD, because if x86 gamers should decide that Intel's approach to CPU performance in the short term isn't as attractive to them as the approaches to x86 performance offered by AMD, then it will be gamers "getting happy" with AMD, and it actually will be Intel which is to some degree made "unhappy" with the result. "So be it"...? I agree with you that this is the point Colwell was making, albeit as obliquely and vaguely as humanly possible....;)

Incidentally, I feel that their "let's take the mobile chip and adapt it for the desktop" is a reasonable short-term approach. Taking a longer-term view, it might be reasonable to ask whether there have to be desktop chips at all. Intel could let their marketeers loose, simply declare "the era of ergonomic computing", say that all their mainstream chips would fit within a XXW power envelope, and bring out reference solutions that take advantage of that. World+dog would rejoice, new business opportunities all around. Not likely to happen though; the inertia in the industry is enormous.

Well, of course it's a reasonable short-term approach for them, especially when you consider that what all of this means, the whole P4 roadmap dump in terms of the MHz ramp Intel announced at the launch of the P4, is that, short term, Intel has no other choice. That's what all of these "P4 roadmap dump" announcements are all about.

Of course competitive pressure enters into it, but there has always been some. In the early 80s it was the 68000 family, for instance; there was RISC; there have been x86 clones, et cetera. The point is that historically Intel competed on performance (and industry thumbscrews), not on power consumption, or the amount and quality of integrated functionality, et cetera. "Age" stretches back to the 8086, easily. The performance-vs-power issue simply wasn't the overwhelming concern it is today.

I think we need to stop here for a moment and ponder something. What the subject is here is Intel's abandonment of the P4 roadmap in terms of the MHz ramp they announced for the P4 just a few years ago. What you are describing, however, when you go back to the 80's and talk about MC68000s and so on, is Intel's entire involvement in the general cpu marketplace since the 8086. But Intel is not dumping its competitive involvement in the whole of the general cpu market, is it? Nope, all they are doing is dumping their original P4 MHz ramp as originally scheduled, and saying, "Sorry, folks! We're not going to take the P4 to 10GHz, or 7Ghz, like we said we were going to do originally--because we just can't do it from the standpoint of the technology we have at this time. We know it would have been swell if we'd known this when we made the original MHz ramping roadmap for P4, but--hey--we thought we could do it, you know? So give us a friggin' break, people. We may not even be able to take it to 5GHz, but we are sure going to try and hit decent yields at 4GHz--uh, sometime later this year, that is (fingers crossed.)" I just can't see how you'd characterize any of this as Intel saying something like: "Sorry, folks, but you know how we've been competing in the general cpu market since 8086? Well, we're dumping that strategy, and we're going into the golf-cart business, instead."

Plainly, the only thing Intel is dumping here is their former plan to ramp the P4 to 10GHz ultimately, as it looks as if maybe 4GHz is all she wrote for P4 at the present time, and Intel feels it is compelled to publicly announce it--so it seems that internally Intel is convinced of it (trust me when I say that at the time it shipped the 8086 Intel had no plans whatever about 10GHz cpus of any type, which is why this pertains to the P4 exclusively and no other Intel cpu architecture.) And of course, when people expect you to do one thing based on your previous statements, but you realize you cannot actually execute on your previous statements and you know you have to do something else, instead, it's usually good policy to announce that sort of thing publicly. Trying to spin it as some kind of "paradigm shift" in general cpu technology that you've "discovered" only recently is exactly what I'd expect Intel to do. What surprises me with respect to your posts here is your eagerness to simply believe what's being said on its face without any effort to analyze these kinds of spins either factually or historically for accuracy or relevancy.

(By the way, I'd say that the K7 and the P3 were within spitting distance as far as IPC was concerned, though the nod would have to go to the K7.)

I'd agree with you, and only add that Intel's problem with the P3 was MHz scaling versus the K7, so--presto--enter the P4 and the recently discarded "P4 to 10GHz or Bust" roadmap Intel presented at the time.

It is "notable". Particularly in view of the criticism he levelled at Itanium.

Kind of safe at this point to criticize Itanium, seeing as how everyone and his mama has been criticizing it for the last couple of years, and seeing how it seems to be pretty much a dead duck at this time. You've got to remember that Intel's strategy relating to getting people off of x86 and onto Itanium was formulated long, long before people discovered that Itanium was such a gosh-awful mess that nobody was interested in it (Intel actually was paying people to adopt it at one point), and that, by comparison, Itanium has actually made x86-64 look so much better, almost the proverbial "gift from the gods" in contrast. Kind of ironic how that strategy has backfired on Intel--it's an apt warning for ambitious young would-be-monopolists everywhere, I think...;) It's also suitable for children's nursery rhymes and other fables:

Humpty-Monopoly sat on a wall.
Humpty-Monopoly had a great fall!
And all Intel's horses and all Intel's men
couldn't put Humpty-Monopoly together again!

or,

"An Itanium by any other name would smell just as bad."

(This looks forward to future versions of Intel's "clean sheet" designs through which Intel hopes to once again imprint its designs on the general CPU markets, towards regaining its pre-K7 monopoly and killing off AMD, in the process sacrificing x86 at the altar of Beelzebub. I suspect Intel will need supernatural help of some kind to succeed in this endeavor.)

They lost the XBox2 to another architecture, which is decent volume even in Intel's book, so it is not as if they do not try to compete with x86.

Perhaps that's because 700MHz P3's are circa-1998 CPU technology which M$ can purchase for a song these days? Remember as well that it's not Intel who picks which CPU will sit in the xBox, but M$. Intel's current x86 offering, of course, isn't the P3 but the P4. It's just that now we know that the fabled 10GHz P4 MHz ramp isn't going to happen, after all. They are going to try and shoot for decent yields with the P4 at 4GHz, according to what they are presently saying.

In fact, he was so emphatic when he separated compatible and incompatible approaches that I wouldn't be terribly surprised to see an entirely new architecture out of Intel in less than 5 years. He really left very little doubt as to his personal opinion, and the man is a very senior employee at Intel. Does anyone honestly believe that Intel can see something like the Cell architecture, and not consider that it may be a good idea to have something more radical up their sleeve than on-die x86 multiprocessing to counter with, in case this first Cell processor turns out fine, and is easily scalable to boot?

Again, Itanium was an effort by Intel to usurp x86 and, not coincidentally, to conveniently kill off AMD at the same time (as AMD is strictly x86). It failed in that regard for many reasons, not the least of which is that Itanium benefits Intel far more than it does any of the prospective customers Intel was asking to buy it. With Itanium, Intel was seeking to serve Intel's interests; with x86-64, AMD was seeking to serve the economic and practical interests of the general CPU markets, and that's why the Itanium and A64 stories are a stark study in contrasts commercially: a failure and a success, respectively.

The problem with being accustomed to operating as a CPU monopoly, as Intel did for so many years, is that you become accustomed to considering the welfare of the company first and the welfare of your markets second, because your markets simply have had nowhere else to go besides you to buy the products they want. AMD has never operated under this kind of stagnating inertia, and so I think is blessed with a much clearer general understanding of the kind of products its markets wish to buy, and an understanding that you do not dictate to your markets, but rather, you seek to serve their interests by manufacturing products which fit their needs. I think it's obvious that there are highly placed people within Intel who simply do not understand that the issue isn't remotely about "clean sheet" CPU designs--we are long, long past that point in terms of the maturity of the general computer markets--as the effective penetration of the world markets by x86 is very deep, and companies are heavily invested in it in terms of hardware and software, not to mention familiarity and expertise. In short, they are no more likely to dump x86 and move to a CPU incompatible with their x86 software than they are to dump the current OS they've invested in to run a new OS which is incompatible with their base of invested software.

The problem for Intel with respect to this topic is that x86 technology is plainly getting the job done for the people who use it and are invested in it, and Intel will have to produce a whole lot more than a performance bump to get current x86-invested companies to undertake huge expenses relative to new hardware rollout, not to mention making huge investments in training and software. Itanium fell far short of those requirements, obviously, and merely served instead to point out how much better in a purely practical sense x86-64 was by comparison, from the point of view of the companies asked to buy A64 and/or Itanium.

And no, I mean exactly what I wrote: stagnating speeds, including performance (as the architectures have been largely untouched since the introduction of the P4). How long has it been since the 1400 MHz K7 was introduced? I can almost go out and buy something twice as fast today, depending on what benchmarks you use. The thing is, CPU speeds used to double every 18 months, and we're nowhere near that pace today, nor does this stagnation look like a temporary hiccup.

Refresher: In terms of MHz, how many years did it take Intel to go from the very slow clock speed of the original 8086 up to the 1 GHz P3? More than two decades, right--just to get to 1 GHz. How many years had AMD been in business prior to introducing the K7 in 1999? A long time, right? The K7 was AMD's first CPU that put the company into competition with Intel at all x86 CPU market price points. OK, now check the MHz and performance specs for both the P4 and the Athlon in the last 5 years, since 1999, and what we see is an *accelerating* rate of CPU performance between these two CPU companies, one that is at least double the total performance gain achieved by either company's products in the whole period prior to 1999.

So generally, the pace of CPU development ignited like a rocket booster in 1999. The fuel? Competition, baby...;) Out with the monopoly (and its slow, tedious, water-torture-like release of CPU tech, promulgated entirely on the milking paradigm by a fat and slothful international monopolist), and in with competition, spawned by an aggressive CPU company determined to take the game to Intel for a change, a company with both the balls and the brains to do it successfully, and a company entirely unfamiliar with the milking paradigm so beloved of monopolists everywhere...;)

Anyway, the truth is that, far from stagnating, in the last five years CPU performance has gone into orbit compared with the whole of the x86 period pre-K7, from the 8086 to the P3/K7. You have to remember that Intel loathes competition, is not yet acclimated to working in a competitive CPU landscape, and when it looked to Intel like AMD was going to beat it at its own x86 game, Intel tried to change the rules and "transition" the market to Itanium--but that has failed for all of the reasons I've already mentioned, and a few more besides.

In other words, what Colwell's telling you pertains specifically to Intel, not AMD, which ought to be clear as the cancelled P4 roadmap is Intel's cancellation and not AMD's (since of course AMD had no such P4 MHz roadmap to cancel, did it?) It's Intel decrying the fact that the P4 will never see the kind of MHz roadmap Intel announced for it--not AMD.

Of course, though, the pace and pressures of recent competition and the demands of technology have made it tougher going for both CPU companies--everybody knows that and has known it for a long time. But ask yourself whether it was really necessary for Intel to cancel a long-held MHz roadmap for the P4 simply to explain that CPU manufacturing is getting tougher. It's not as if any of that was a secret. I'm sure that from Intel's perspective things are much, much tougher than they ever were pre-K7, but you should also understand that things have always been much tougher for AMD than for Intel, and the difference is that whereas AMD has always worked hard under pressure in surmounting obstacles, the competitive-pressure aspect of the situation from Intel's perspective since 1999 is comparatively a brand-new experience for Intel, and for the people who work inside the company at every level...:)

Really, what was the sense in Intel announcing such an ambitious MHz ramp for P4 in the first place, since Intel did not know at the time whether it was doable, and now has admitted it never was? My opinion is that someone's finger hit the panic button at Intel in 1999, post K7, and the idea was born that "If we can take the P4 and ramp it in MHz through a succession of process reductions over time, we could conceivably push it to 10GHz theoretically, and AMD most likely would fall by the wayside in the process." This seemed to satisfy the panicked powers-that-be in the company (paranoia is a virtue at Intel), and the strategy was announced as the "P4 MHz ramping roadmap" we've all seen in various forms since 1999, which has now been aborted well short of fruition.

It's important to note, again, that AMD concentrated on a different strategy with K7 from the beginning, a strategy concerned with things like processing efficiencies and general IPC performance over future, process-dependent MHz ramps, which Intel planned into P4 with longer pipelines and lots of other things which assumed much higher MHz clocks would not be a problem (HyperThreading, etc.) So the strategies of the two companies have been very different during the last five years of this competitive phase, which underscores why it's important not to generalize and equate the things Intel spokesmen say as applicable outside of Intel, or the things AMD spokesmen say as applicable outside of AMD.

Bob Colwell didn't strike me as particularly confused.
Anyway, I don't know what he meant by the remark - that's why it is interesting speculation fodder, no? Because it could mean a number of different things, some interesting, some not. For instance, two facts:
1. Intel is currently eating up the graphics market from below.

Intel's IGPs so far are fairly unimpressive beside the middle and upper-end products by IHVs like ATi and nVidia. Intel doesn't make anything comparable to nV40/R420--at all--so no competition there. The gpu market is segmented in terms of price & functionality and performance just like all other peripheral markets. Intel's always done better in the low-cost, low-function, low-performance IGP market by virtue of the fact it has an economic edge in terms of producing its own core-logic chipsets and motherboards. In the discrete retail 3d gpu markets and system OEM markets for discrete peripherals, Intel simply isn't visible.

2. GPUs and CPUs do not compete in functionality.
Could either of these change in any way?

No, because the relationship between 3D GPUs and CPUs has always been symbiotic; they depend on each other within a system. GPUs are specialist processors dealing with pixels while CPUs are much more generalized, and the more complex and powerful each becomes, the less likely they are to merge, due to manufacturing and yield concerns if nothing else.

You seem to be of the opinion that he was spending this seminar making excuses for Intel. That's not my impression of it at all. So different people do indeed interpret things differently.

The difference between our perspectives, I guess, is that because he works for Intel I don't find it surprising that his view is decidedly Intel-centric, and I'd expect him to spin things into congruency with his own corporate perspectives and technical biases. As such, nothing he said surprised me, as I'd expect Intel to put on the best face it can in the situation...;) I guess I'm a bit surprised to see that you'd automatically assume that his viewpoint is an objective one unconnected with his employment, as I sure didn't see that as being evident.

My goodness, sorry for the length here...;)
 
Jesus Christ, that takes up about 46% of the scroll bar in my IE. O.O Very interesting though. Edit: That was 3608 words, Walt. I am truly impressed.
 
GPU manufacturers are competing for gamers, and the CPU is dragging them down compared to their main competitors ... consoles.

CPUs are doing occlusion culling now, and doing a lousy job. That has to move to the GPU sooner or later. Physics? Probably going to move to the GPU too. AI? Well, it doesn't need SIMD, so the CPU has a bit less of a disadvantage, but it is still massively parallel (each actor is independent in a given timestep; see the sketch after this post). For gaming, the CPU is a poor match for any computationally intensive task.

Maybe the symbiosis won't change, but then PC gaming will start bleeding even more players.
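To illustrate the "each actor is independent in a given timestep" point in MfA's post, here is a minimal C++ sketch (my own illustration, nothing from the talk): because no actor reads another actor's freshly written state within the step, the update can be spread across however many execution units the hardware offers, whether CPU cores or GPU-style SIMD lanes.

#include <algorithm>
#include <execution>
#include <vector>

struct Actor {
    float x, y;    // position
    float vx, vy;  // velocity
};

// Advance every actor by one timestep. Each element is updated
// independently of all the others, so the standard library is free
// to parallelize and vectorize the loop (std::execution::par_unseq).
void step(std::vector<Actor>& actors, float dt) {
    std::for_each(std::execution::par_unseq, actors.begin(), actors.end(),
                  [dt](Actor& a) {
                      a.x += a.vx * dt;
                      a.y += a.vy * dt;
                  });
}

A real AI or physics step is obviously far richer than this, but the data-parallel shape is the same, which is why it maps so naturally onto wide hardware.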
 
When you look at the computation market, GPU manufacturers have targeted the floating point performance segment far more than the CPU manufacturers.

I agree that pretty much all of the computation-intensive tasks will eventually move to what is now considered the GPU. This includes all the physical simulation, collision detection, 3D graphics (including all the culling), etc.

This means the GPU of the future will need to gradually transform into a massively parallel general purpose floating point vector processor that can be programmed using standard programming languages like C++. This in turn means general purpose addressing, branching, and stack management. Something that looks more like a FASTMATH processor than a current GPU. However, one major difference compared to a CPU is that the vast majority of the transistors will be used for actual logic with large numbers of floating point ALUs rather than for on-die cache. Instead future GPUs/vector processors will likely continue to rely on very high external memory bandwidths with a small amount of on-die cache.

It also means scaling up frequencies significantly using dynamic logic.
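To make the "standard C++ on a massively parallel floating-point processor" idea a bit more concrete, here is a minimal sketch (my own illustration, assuming only that such a chip ships with an ordinary C++17 toolchain) of the kind of streaming kernel described above: lots of independent floating-point work and almost no data reuse, so external memory bandwidth and ALU count matter far more than a large on-die cache.

#include <algorithm>
#include <execution>
#include <vector>

// y = a*x + y over large arrays. Every element is touched exactly once,
// so a big cache buys little; throughput is set by floating-point ALU
// count and external memory bandwidth -- the balance described above.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    std::transform(std::execution::par_unseq,
                   x.begin(), x.end(),   // first input range
                   y.begin(),            // second input range
                   y.begin(),            // output, written in place
                   [a](float xi, float yi) { return a * xi + yi; });
}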
 
WaltC said:
In other words, what Colwell's telling you pertains specifically to Intel, not AMD, ...
This is the central point your post is trying to argue.
I'm claiming that you are wrong. The problem of limited performance scaling with shrinking feature size is not Intel's alone. I can see why you'd like to kick Intel in the shin for the P4, but this problem is deeper than a particular x86 implementation.

SA said:
This means the GPU of the future will need to gradually transform into a massively parallel general purpose floating point vector processor that can be programmed using standard programming languages like C++. This in turn means general purpose addressing, branching, and stack management. Something that looks more like a FASTMATH processor than a current GPU. However, one major difference compared to a CPU is that the vast majority of the transistors will be used for actual logic with large numbers of floating point ALUs rather than for on-die cache. Instead future GPUs/vector processors will likely continue to rely on very high external memory bandwidths with a small amount of on-die cache.
Now, compare the above to the Sony patent on the Broadband Engine. The traditional "CPU" and "GPU" division is demonstrably not the only option. There are more ways than one to skin this particular cat. I would further contend that Intel is very aware of this. The interesting question is - what will they do about it? Nothing or something? And if something, how? What can be done within the current PC paradigm? Outside it?


Drifting back to the question of whether this is the breaking of a trend of escalating power draws for PCs, it remains to be seen whether Intel's move to a core originally intended for portable applications is just to enable further scaling. Will we see 120W desktop variants of these cores? This quote, which I nicked from The Inquirer, from an Intel financial Q&A gives hints:
Otellini Q&A said:
He said: "If we had not started making this right hand turn we would have run into a power wall which the graphics guys are seeing today. We think we can still deliver more performance inside a better thermal envelope."

While four core server multiprocessors are a way away, putting together such cores to create 16-way server cores wasn't beyond Intel's vision.

Will notebooks and desktops merge? Otellini said there's been a lot of speculation about that. He said: "The feature set for multicore gives us the ability over time to converge our cores much more synergistically than we have in the past. That doesn't mean the same chip will go into different form factors. You'll see more common architecture but not the same implementation of the chip. It will happen in the continuity of time".
 
CPU and then a math coprocessor--we've been there before; they could merge.

I feel, however, that there will come a time when a motherboard will have very fast interfaces to which one can attach processors geared towards differing workloads.

I'm leaning towards the latter because we can start affording that sooner than we can afford the transistors in a single package to support a unified solution.

If I understand things correctly, logic-heavy chips have sprawling networks for power and signalling; these large networks use the interconnect more heavily, which, as the feature size decreases, presents greater resistance. Won't all of this require greater voltage to propagate a signal and thus increase power quadratically, as opposed to smaller PUs and smaller networks at higher frequencies?
 
Saem said:
If I understand things correctly, logic-heavy chips have sprawling networks for power and signalling; these large networks use the interconnect more heavily, which, as the feature size decreases, presents greater resistance. Won't all of this require greater voltage to propagate a signal and thus increase power quadratically, as opposed to smaller PUs and smaller networks at higher frequencies?
I think the voltage that a processor can use is limited by the transistors that are used.
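For reference on the "quadratic" part of Saem's question, the standard first-order CMOS switching-power relation (a generic textbook formula, not specific to any particular process) is:

P_{\text{dynamic}} \approx \alpha \, C_{\text{switched}} \, V_{dd}^{2} \, f

So any extra supply voltage needed to drive long, resistive interconnect is paid for quadratically in dynamic power, on top of whatever static leakage the process adds; and, as MfA notes, the usable supply voltage is itself bounded by what the transistors can tolerate.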
 
SA -
I hope that your description of a "massively parallel GPU of the future" is coupled with a language that allows the software developer to directly specify parallelism to the compiler. Keeping the syntax similar to C/C++/Java/C# is IMO a good idea, but to my mind using any of those languages as they stand would be like having an old grannie drive a Porsche.

Somewhat OT, an interesting link:
http://wavescalar.cs.washington.edu/

exciting times ahead...
 
SA said:
When you look at the computation market, GPU manufacturers have targeted the floating point performance segment far more than the CPU manufacturers.

I agree that pretty much all of the computation-intensive tasks will eventually move to what is now considered the GPU. This includes all the physical simulation, collision detection, 3D graphics (including all the culling), etc.

This means the GPU of the future will need to gradually transform into a massively parallel general purpose floating point vector processor that can be programmed using standard programming languages like C++. This in turn means general purpose addressing, branching, and stack management. Something that looks more like a FASTMATH processor than a current GPU. However, one major difference compared to a CPU is that the vast majority of the transistors will be used for actual logic with large numbers of floating point ALUs rather than for on-die cache. Instead future GPUs/vector processors will likely continue to rely on very high external memory bandwidths with a small amount of on-die cache.

It also means scaling up frequencies significantly using dynamic logic.

If I understand you correctly, ATi is already on the right track, because they have licensed the Fast14 process for dynamic logic from Intrinsity, which is also producing or licensing the FastMATH processor. So it could very well be that they have licensed the FastMATH processor too.

IMHO there could even be another path to high-performance computing for massively parallel multimedia tasks:

http://stretchinc.com/

They embed an FPGA, called the ISEF, within a RISC processor so that they can program the FPGA with C/C++. The performance of this combination seems to be very high. An S5000 @ 300MHz programmed in C++ is faster than a FastMATH processor @ 2GHz programmed in assembler in the EEMBC telemark benchmark.
The only drawback I see at the moment is the slow switching between different algorithms, due to the need to reprogram the ISEF/FPGA.
 
FPGAs are hugely inefficient as far as power consumption and area use are concerned when working with wide words. Anything suited to our needs would still need specialized circuitry for arithmetic, datapaths, register sets and caches, IMO. The overhead of implementing any of these in an FPGA is unrealistic.

There might, though, be levels below simple multi-core chips with relatively standard processors/shader units which could work well, something like MOVE. I still think condensed graphs might translate well to hardware.
 
Chalnoth said:
It's not like they had to go the route of the P4, though. My point was that the P3 had much higher IPC than the P4. You'd think it would have been pretty straightforward for Intel to have either maintained or enhanced IPC for their next architecture. But no, they decided on "MHz or bust." This was a dead end, and they should have known it.

It's not that high frequency is a bad thing to pursue. It's that pursuing high frequency at the cost of IPC is bad. That's what Intel did, and now they're paying for it.

I think we more or less agree on that. I just don't think it was a bad idea at 130 nm and earlier, and it should have worked decently enough at 90 nm, but all this Intel talk about super-scaling 10 GHz P4's is looking more and more like someone's bad acid trip.
 
Entropy said:
This is the central point your post is trying to argue.
I'm claiming that you are wrong. The problem of limited performance scaling with shrinking feature size is not Intel's alone. I can see why you'd like to kick Intel in the shin for the P4, but this problem is deeper than a particular x86 implementation.

What I'm simply trying to point out is that AMD realized these things long ago when it was designing the original K7 core, and so AMD never embarked on the same kind of "less-efficient-but-clocked-to-the-stratosphere" approach to the Athlon that Intel announced early on for the P4. The roadmap cancelled here is the P4's--not the Athlon's--so that should tip you off to the fact that the problems faced by AMD with the Athlon as to its particular roadmap, and the problems faced by Intel with respect to the P4 roadmap, are entirely *different* sets of problems, because the strategies behind the roadmaps are different, and the CPU architectures themselves are different. AMD never at any time announced a 7-10GHz ultimate production target for the K7; Intel did for the P4, and so obviously these differing roadmap strategies produced different kinds of problems for each company to solve.

Of course, as I said, everybody knows CPU manufacturing is getting tougher--that's not news--and it was certainly something generally well known prior to Intel's P4 MHz-ramp roadmap being cancelled. What's going to count in the future is how these companies respond to the problems and challenges they face ahead. This does not simply boil down to a matter of technology--it boils down to the strategies companies employ to deal with those problems--which in turn becomes much more a matter of judgment than one of pure technology. As you say, there are always several ways to skin the cat...;) What counts in the end, though, is whether or not one company chooses a better method of skinning the cat than another. With the introduction of the Athlon, AMD embarked on one method of skinning the cat (architecturally driven increases in processing efficiency); with the P4, Intel embarked on another (a less efficient architecture driving performance through MHz ramps achieved through process reductions), and it certainly seems to me that what's been cancelled here is the P4 strategy--not the Athlon strategy, right?

In fact, Dothan and Itanium also eschew the strategy of driving performance through the MHz ramping of less efficient architectures, don't they? So does IBM's G5, Sun's SPARC, etc. In other words, what AMD's done relative to its Athlon strategy is actually much more common in cpu design and manufacturing than was the P4's MHz-driven strategy, even within the Intel family of cpus itself. Cancellation of the P4 roadmap tells us only about problems Intel had with its P4 MHz-driven roadmap, which Intel has found to be insurmountable. However, they are changing strategies to bring their x86 design strategy in line with the strategy used by AMD for x86 cpu design and manufacturing--just because the problems are tougher these days doesn't mean Intel's giving up--it's just shifting gears and changing strategies.
 
WaltC said:
In fact, Dothan and Itanium also eschew the strategy of driving performance through the MHz ramping of less efficient architectures, don't they? So does IBM's G5

I agree with most of what you're saying, but I wouldn't lump IBM's G5 into this.

It's focused more on floating-point performance and parallel processing than on MHz, as seen by the fact that they are barely able to reach 2GHz.

Speng.
 
Let me clarify my positions below.

The Pentium 4

I really don't know what happened, but there are several levels of possible problems, like:
a - the basic idea of using long pipelines to implement a speed demon
b - the conceptual phase of the design based on "a"
c - the implementation phase of the design

It looks like the conceptual phase of the P4 design had some modifications of the initial idea, and the implementation phase had problems like:
- few people (3) really understood the entire design
- too large a development team
- the tools were probably the best possible, but not good enough to help with such a complex design

The point is I am not sure that a long pipeline is a bad idea if you:
- have a better ISA
- don't have heavy logic overhead for Hyper-Threading, IA-32e, DRM, etc.
- have better tools to develop it

Maybe if Intel had redesigned/improved the Northwood core they could have:
- implemented a faster 1MB L2 cache (~5% performance increase)
- implemented a larger L1 micro-ops cache
- implemented a larger L1 data cache
- added a second FPU
- gained some speed without the Hyper-Threading logic overhead

This could:
- have "only" around 80 million transistors
- give more chips per wafer and better yields
- run cooler
- have higher IPC

Then a 2.8GHz P4 could have 3.2GHz-level performance and 1.8A-level heat dissipation.
And a 4GHz P4 could be as fast as a 5GHz Prescott.
This could be a winner in the current PC market :)

edited: also, we don't know what impact a low-latency on-chip memory controller could have with a long-pipeline CPU.

The wintel PC model

Independently of the P4's success or failure, IMHO the current Wintel personal computer model is old and inefficient, and it is time to change to something better.
We need some alternatives urgently. Maybe I will post a new thread about it.
 