Bob Colwell (chief Intel x86 architect) talk.

nutball said:
/me checks price of 3.4GHz Pentium 4
/me checks price of 1.7GHz Pentium 4

I'm gonna have to disagree with that!
One thing you haven't paid attention to is that most PCs are sold in the low-end range, where the processors are all about the same price (usually ~50-100 USD), and adding a second processor would be a significant cost.

Not to mention Intel is the one that wants to sell those 3.4GHz Pentium 4's at a premium (to offset the huge R&D costs). If the market went dual/multi-processing for performance, they'd be less able to. The high-speed chips themselves aren't any more expensive to manufacture in most cases.

And you didn't address this bit:

If the software solution was so easy, it would have been done already. Why aren't we running Microsoft Word on Beowulfs-Of-486s-In-A-Box?
And the answer again is: it'd be vastly, vastly more expensive than single-processor designs. Remember that most computers purchased are low-end.
 
Diplo said:
Can anyone expand on the 'chaos-theory-like' behaviour? In the future are we going to see embedded chips in, say, aircraft that suddenly start behaving unpredictably because the complexity is beyond our individual comprehension? I'm almost seeing some kind of Terminator-style future where machines build machines and no human can understand what they are doing anymore....
The answer is simply to build parts out of multiple simpler components.
 
nutball said:
That was what I was getting at responding to Chalnoth -- building multi-processor computers is pretty easy. If it was as simple as building a two-processor PC and adding the '--parallelise_my_code_please' flag to the compiler to get a factor 2 speed-up in MS Word, it would already have been done; we'd all be using parallel desktop applications on multi-processor desktops right here, right now.
Well, also don't forget that speed-up in MS Word is no reason to purchase a high-end PC. Simple burst-processing tasks really don't need it. As far as the types of things your typical consumer might get into, I'd say it's the media-type stuff, like MPEG-4 encoding, that may be more likely. I would think it'd be pretty trivial to write an MPEG-4 encoding algorithm that would work very well on multiple CPUs.
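To make that concrete, here's a rough sketch of the idea (illustrative Python only; encode_chunk is a made-up stand-in, not any real codec API): chop the frame sequence into independent chunks, roughly one GOP each, and hand the chunks to separate processors.

from multiprocessing import Pool

def encode_chunk(frames):
    # Stand-in for a real MPEG-4 encoder: just packs the fake frames into bytes.
    return b"".join(f.to_bytes(4, "little") for f in frames)

def parallel_encode(frames, chunk_size=30, workers=2):
    # Each chunk is independent, so chunks can be encoded in parallel and the
    # resulting bitstreams concatenated in order afterwards.
    chunks = [frames[i:i + chunk_size] for i in range(0, len(frames), chunk_size)]
    with Pool(workers) as pool:
        return b"".join(pool.map(encode_chunk, chunks))

if __name__ == "__main__":
    fake_frames = list(range(300))            # stand-in for 300 video frames
    print(len(parallel_encode(fake_frames)))  # 1200 bytes of "encoded" output

A real encoder complicates this a bit (rate control across chunk boundaries, for instance), but the basic shape is exactly this kind of split-and-merge.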
 
Chalnoth said:
Well, also don't forget that speed-up in MS Word is no reason to purchase a high-end PC.

That's true (at the moment, though I'm sure Microsoft are busy adding new "features" that will make it burn 10 times the compute power in the next version and the one after!).

As far as the types of things your typical consumer might get into, I'd say it's the media-type stuff, like MPEG-4 encoding, that may be more likely. I would think it'd be pretty trivial to write an MPEG-4 encoding algorithm that would work very well on multiple CPUs.

I have a feeling that NVIDIA/ATI and Creative Labs might have a view on where the transistors for media processing are going to be living!

It's an interesting point to ponder whether we'd be better off with dedicated hardware for MP3 and MPEG-2/4 encoding and decoding. Obviously dedicated sub-systems could be much simpler than a general-purpose CPU of equivalent performance. On the other hand, the dedicated hardware will be wasted when the user isn't doing media stuff, whereas the transistors in a general-purpose processor will more than likely always be used for something.
 
Chalnoth said:
The answer is simply to build parts out of multiple simpler components.
Isn't that what a CPU in essence is? A load of very simple transistors? The trouble is, when you put lots of simple components together you have to have something that controls them so they can operate together. Once that happens you have something that quickly becomes complex and less predictable.
 
Diplo said:
Isn't that what a CPU in essence is? A load of very simple transistors? The trouble is, when you put lots of simple components together you have to have something that controls them so they can operate together. Once that happens you have something that quickly becomes complex and less predictable.
That's not the point. You put them together in a simple way as well. This is why I still think that parallelism is the future for CPUs. All of the interesting tasks that actually take a significant amount of processor power can be made to work well with parallelism.
 
nutball said:
Chalnoth said:
Well, also don't forget that speed-up in MS Word is no reason to purchase a high-end PC.
That's true (at the moment, though I'm sure Microsoft are busy adding new "features" that will make it burn 10 times the compute power in the next version and the one after!).
Well, one thing about burst processing is that a dual-processor system may actually reduce delays, because the chance that both processors are tied up at the same time is smaller than the chance that the single processor is tied up in a one-processor system.
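As a toy illustration of that (numbers entirely made up): if each processor happens to be busy at any given instant with probability p, independently, then a new burst only has to wait when every processor is busy.

# Toy model with made-up numbers: chance that a new burst of work has to wait
# because every processor is already busy, assuming each processor is busy
# with probability p independently of the others.
p = 0.20
for cpus in (1, 2, 4):
    print(cpus, "CPU(s) -> wait probability", round(p ** cpus, 4))
# 1 CPU: 0.2, 2 CPUs: 0.04, 4 CPUs: 0.0016

The independence assumption is generous, but it shows why a second processor can help responsiveness more than raw throughput figures suggest.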

Regardless, it doesn't really matter whether you think it's feasible or not to develop software for parallel processors. Parallelism will soon be the only way to advance silicon-based processor designs.
 
Chalnoth said:
Regardless, it doesn't really matter whether you think it's feasible or not to develop software for parallel processors. Parallelism will soon be the only way to advance silicon-based processor designs.

I know, that's what I've been getting at! That's what worries me about the future! I do parallel programming for a living, and I know for a fact that it's not as easy as you seem to want to make it sound!

Some things just don't scale well onto multiple processors by the very nature of what they are. Yeah, I know you've provided a bunch of examples of stuff which will scale, but there are also a bunch of other examples of things that don't!

Fundamentally, certain operations within computer applications have to occur in the correct order. Correct ordering implies synchronisation. Synchronisation implies finite scalability; it's that simple.
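A rough way to put a number on that is Amdahl's law: if some fraction of the work is inherently serial because of that ordering, the achievable speed-up is capped no matter how many processors you add. A quick illustrative calculation (Python, with a made-up 10% serial fraction):

def amdahl_speedup(serial_fraction, n_cpus):
    # Amdahl's law: the serial part always runs at full length; only the rest divides by n.
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cpus)

for n in (2, 4, 16, 1024):
    print(n, round(amdahl_speedup(0.10, n), 2))
# 2 CPUs: 1.82x, 4 CPUs: 3.08x, 16 CPUs: 6.4x, 1024 CPUs: 9.91x -- never better than 10x.

Even a modest serial fraction puts a hard ceiling on what extra processors can buy you.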

The whole point I've been trying to make (and apparently failing) is that if parallel programming was easy we'd all be doing it, and the chip guys wouldn't have spent billions of dollars scaling single-processor performance to ludicrous clock-speeds. The truth is it's easier to solve the hardware problem than the software problem, because in a good fraction of use cases there is no software solution.
 
Heh. Thought you guys would enjoy it. I was amazed at how straightforward and clear he was - very geek to geek, and with a strong sense of technology integrity shining through.

In the context of the B3D forum, for people involved in gaming (which includes all of 3D gaming), his comment that Intel is really attentive to that group must be gratifying. However, he immediately followed up with "but you can't base a 30 billion dollar company on them", which should be a warning. And the general gist of the presentation really told the story - the age of pushing performance forward at the cost of other parameters is coming to a close for general-purpose computing. It's not over yet, and may never fully be, but other factors will get progressively more attention as soon as the marketeers figure out how to sell them, and they will find ways to sell those features, because apart from gradually losing its attraction value, performance has already ceased to improve at the accustomed brisk pace.

The Q&A session had a notable passage from 1.11.30 onward that made it very apparent that in Colwell's opinion x86 really carries a lot of baggage, and that it may not be able to compete quite as impressively with clean-sheet designs going forward. That's probably not much of an issue in PC space for compatibility reasons; for consoles, however, other rules apply. What will stagnating CPU speeds mean for the development of future PC graphics engines? And does this have any short-term or long-term implications for PC vs. console gaming?

He also remarked on how graphics processors are getting more programmable, and how "this hadn't gone unnoticed at Intel", a remark that's quite intriguing, and a bit disturbing if you happen to be a graphics IHV. (And of course he let slip the amount of i-cache on a gfx processor, unfortunately without saying which one.) What could he have meant by that remark?

Regarding power draw, in Dave Baumann's interview with Dave Orton, Orton said:
"DB: At the end of the day though, is there really the desire to continue with that – is the drive there to keep pushing that type of model?
DO: There’s always the debate of who steps down first. I think what’s going to happen is we’re going to hit a power limit, so through other innovations and technologies we have to manage efficiencies."
which implies (to me) that nVidia and ATI are also feeling that they are closing in on the end of an evolutionary branch. While gfx is very amenable to parallel processing, and thus can make very productive use of additional die area/transistors, power concerns won't let the current trends scale at the pace of the past. So what paths are going to be followed? Selling HDTV capabilities? Low power draw/silence? Video encoding features? Or will it be business as usual, only with more power management technology thrown at the problem?

Anyway, I felt that if someone of Bob Colwell's caliber speaks about the state of computing, in a way that was just recently backed up by Intel scrapping their entire P4 roadmap (!!!), then maybe even the graphics nerds will take notice, as his words weigh infinitely heavier than those of an anonymous "Entropy". Times they are a-changing, although to what degree remains to be seen. Maybe it would be smart to ask ourselves how this is likely to affect the graphics business?
 
Entropy said:
(And of course he let slip the amount of i-cache on a gfx processor, unfortunately without saying which one.) What could he have meant by that remark?

Yeah, that made me sit up and listen. He was talking about NVIDIA in that sentence, though you're right he didn't really say which GPU.

which implies (to me) that nVidia and ATI are also feeling that they are closing in on the end of an evolutionary branch. While gfx is very amenable to parallel processing, and thus can make very productive use of additional die area/transistors, power concerns won't let the current trends scale at the pace of the past. So what paths are going to be followed? Selling HDTV capabilities? Low power draw/silence? Video encoding features? Or will it be business as usual, only with more power management technology thrown at the problem?

I wonder what the true upper limit of the power envelope is (say, for the hard-core gamer market). Even if this latest generation of GPUs is pushing the limits of the maximum size of a manufacturable chip, there's always the multi-chip option to get round that.

But the overall power consumption of the card would still climb and climb, so where is the practical limit? 200W? 300W? 400W?

Would the idea of going very, very wide (say 4 x 16-pipeline chips on a card) and dropping the core frequency address the power issue?
 
nutball said:
I wonder what the true upper limit of the power envelope is (say, for the hard-core gamer market). Even if this latest generation of GPUs is pushing the limits of the maximum size of a manufacturable chip, there's always the multi-chip option to get round that.

But the overall power consumption of the card would still climb and climb, so where is the practical limit? 200W? 300W? 400W?

Would the idea of going very, very wide (say 4 x 16-pipeline chips on a card) and dropping the core frequency address the power issue?

I suppose this really depends on how exotic and far-out you want to look. Using diamond for semiconductors would let you run chips *really* hot, for example. For silicon, though, I don't know. I'm pretty impressed that the current breed of chips work as well as they do (let alone at all).

Nite_Hawk
 
Entropy said:
in Colwell's opinion x86 really carries a lot of baggage, and that it may not be able to compete quite as impressively with clean-sheet designs going forward.
Very true. I would argue this is the case for the Windows operating system, too. This is the big price we pay for backwards compatibility, but surely at some point this model has to be broken?

Would it be possible to scrap the x86 architecture and start from a clean sheet? The only way I can see this happening is some quantum leap in processing power that enables future CPUs to efficiently emulate x86 code in parallel with whatever new architecture evolves. Rather like, I suppose, the way the AMD 64-bit chips handle 32-bit code. Would this be possible? Or is there another way forward?
 
Nite_Hawk said:
But the overall power consumption of the card would still climb and climb, so where is the practical limit? 200W? 300W? 400W?
That's a tricky question.
Do the high-end parts have to pay for themselves? Or does their value as PR devices and technology drivers render their immediate return on investment irrelevant?

My personal belief is that the IHVs will continue to push the envelope as long as it makes sense for them to do so. And that this is likely to be further than such a product merits in and of itself, mostly for PR reasons. There is an undeniable value to having your brand at the top of every performance chart. Just look at the Intel EE processors - they don't have to actually sell a single processor, as long as reviewers use them in their shootouts. Which of course is an indication of when the bubble will burst - when reviewers say "To hell with these expensive, noisy powerhogs. I'll ignore the boutique cards, and just review the parts that the public are interested in actually buying". I doubt this will ever happen though.
And since the product doesn't have to make practical sense, nor really pay for itself in terms of sales, it is pretty much impossible to make any valid predictions about power limits.

However, the climbing gfx power requirements are at odds with the direction the rest of the industry would like to go in terms of ergonomics, and completely at odds with the ever-growing market share of portables. The extreme focus on these top-of-the-line cards seems odd to me.
 
Diplo said:
Bjorn said:
Yep. Look at the Itanium f.e.
I'm not so sure that's the best example to choose ;)

Agreed. The only two obviously redeeming features of Itanium (the IPF architecture) are the 128 FP registers, which make it very well suited for numerical work, and the ALAT, which gives the compiler/programmer explicit control over the discovery of memory aliasing.

Almost every other feature in that architecture is/will be baggage. Not as serious as the baggage the x86 carries, but baggage nevertheless.

Cheers
Gubbi
 
nutball said:
It's an interesting point to ponder whether we'd be better off with dedicated hardware for MP3 and MPEG-2/4 encoding and decoding. Obviously dedicated sub-systems could be much simpler than a general-purpose CPU of equivalent performance. On the other hand, the dedicated hardware will be wasted when the user isn't doing media stuff, whereas the transistors in a general-purpose processor will more than likely always be used for something.

That fits my theory that things will become more and more specialized in what they do in the future. In an ideal case we would reach a point where you could upgrade what you wanted, when you wanted, over a longer time period than now, such as GPUs and their RAM actually being interchangeable... of course that would mean that performance would always be less than maximum, so I don't know...
 
Chalnoth said:
That's not the point. You put them together in a simple way as well. This is why I still think that parallelism is the future for CPUs. All of the interesting tasks that actually take a significant amount of processor power can be made to work well with parallelism.

Parallelism is the past, present, future, and the infinite void. :rolleyes:

Aaron Spink
speaking for myself inc.
 
nutball said:
The whole point I've been trying to make (and apparently failing) is that if parallel programming was easy we'd all be doing it, and the chip guys wouldn't have spent billions of dollars scaling single-processor performance to ludicrous clock-speeds. The truth is it's easier to solve the hardware problem than the software problem, because in a good fraction of use cases there is no software solution.

It has never been easier to solve the hardware problem than the software problem. The software designers just haven't tried as hard. This can generally be put down to the fact that they have CS degrees vs EE degrees. You can see the correlation by looking at the bug rates of software and hardware. ;)

Seriously though, there are software solutions that will work for the issues that matter, and there are hardware approaches that can reduce or eliminate the synchronization issues. In general they haven't been done in the past because there wasn't a need, or it was the path of higher resistance.

Over the past 20-30 years the hardware designers were willing to take the weight on their shoulders and move performance forward. This gave software designers free rein to do things that frankly sometimes didn't make a lot of sense. It also allowed software designers to be aloof and less knowledgeable than they really needed to be. This is going to change. Software designers will have to deal with being much more exposed to the parallelism in the hardware than they were in the past. They will have to think about the impact their chosen algorithm will have on performance.

The reality is that we really are on the verge of a great leap in hardware performance; it will just take the software guys another 5-10 years to catch up with where it is going.

Aaron Spink
speaking for myself inc.
 
nutball said:
Would the idea of going very, very wide (say 4 x 16-pipeline chips on a card) and dropping the core frequency address the power issue?

In general you get cubic reductions in power with a reduction in frequency (and a matching reduction in voltage)... This assumes that you are operating within linear regions wrt Vt.

Doing some back-of-the-envelope calculations, a 4-die R420 at 330 MHz vs. a 525 MHz X800 XT would have roughly the same power and 2.5x the performance.

All for 4x the cost.
And we haven't dealt with memory power. Or inefficiencies.

So while doable, it is probably not economically viable for the vast majority of the enthusiast market.
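For the curious, the arithmetic behind that estimate looks like this (same assumptions as above: power roughly proportional to f^3 once voltage tracks frequency, throughput proportional to total clock, memory power ignored):

# Back-of-the-envelope check of the 4-die estimate above. Assumes power scales
# roughly with f^3 (frequency plus the matching voltage reduction) and that
# throughput scales linearly with total clock. Memory power is ignored.
single_die_clock = 525.0    # MHz, the single-chip part
multi_die_clock = 330.0     # MHz, each of the four dies
dies = 4

relative_power = dies * (multi_die_clock / single_die_clock) ** 3
relative_perf = dies * (multi_die_clock / single_die_clock)
print(round(relative_power, 2), round(relative_perf, 2))    # ~0.99 and ~2.51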

Aaron Spink
speaking for myself inc
 