Bob Colwell (chief Intel x86 architect) talk.

Discussion in 'Architecture and Products' started by Entropy, May 13, 2004.

  1. MfA

    Legend

    Joined:
    Feb 6, 2002
    Messages:
    7,610
    Likes Received:
    825
    Diminishing returns have only really set in around a decade ago. Until recently there was no existing market with quick enough uptake to justify investing real money in parallel architectures; GPUs and consoles are changing that ... once both of those, and the PCs developed using the console chips, make a big enough dent in people's willingness to shell out for new "C"PUs, even the desktop processor manufacturers will have to buckle.

    The only applications the majority of us need lots of cycles for have plenty of parallelism.
     
  2. Entropy

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,360
    Likes Received:
    1,377
    Generally speaking, gfx chips already trade off clock speed for parallelism. That's why we have these huge die-area, 16-pipe chips running at relatively modest clocks. This works very well in graphics, while being quite a lot trickier for general purpose code.

    Since it parallelizes nicely, graphics has been limited mostly by how many transistors you could cram onto a die with sufficient yield per wafer that it could be sold at a profit. The problem for that model now is that power draw isn't dropping with feature size the way it used to, so the payoff for moving to finer lithography will be more limited than it used to be. Graphics will still stand to gain a lot by going to finer lithography though, more than CPUs for sure.
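    The yield constraint mentioned here is often approximated with the simple Poisson defect model, yield ≈ e^(−defect density × die area). A minimal sketch, with purely illustrative numbers (not from the post):

```python
import math

# Poisson yield model: the fraction of good dies falls exponentially
# with die area, which is what caps the transistor budget of a big GPU.
# defect_density in defects/cm^2, area in cm^2; numbers are illustrative.
def yield_fraction(defect_density, area_cm2):
    return math.exp(-defect_density * area_cm2)

small = yield_fraction(0.5, 1.0)  # modest die
large = yield_fraction(0.5, 3.0)  # big 16-pipe GPU die
assert large < small  # bigger dies yield worse at the same defect density
```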

    CPUs are likely to move to on-chip parallelism as well, but the performance payoff will be more limited in general, particularly as we move from two to 16, 64, et cetera. That is, there is some low-hanging fruit to be picked early on in the game, but as long as we are talking vanilla SMP and single-user machines, the payoff isn't likely to be impressive. Then again, the advances from improving lithographic technique don't look likely to stagger us with their pace either. We've had a factor of two in three years now - almost twice as long as it used to take.

    Windows and x86 aren't exactly a dream team to base a massively parallel architecture on. For single-user machines, you'd like your parallel processors to work efficiently towards solving a single problem fast - more of a supercomputer scenario, as opposed to the server scenario where you can run relatively independent processes on different processors.
     
  3. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,928
    Likes Received:
    230
    Location:
    Seattle, WA
    I don't think so. I think we'll see little gain at first, but as software gets used to the idea (with parallelism in the range of 2-4), developers will get used to writing parallel code, and we'll start to see massive improvements going forward.

    Remember that for any processing-intensive code today, there are many separate pieces that need to be processed. Of course you won't be able to make as good use of the parallelism as a GPU does, but it'll definitely be a huge improvement.
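    The kind of coarse 2-4-way parallelism being described can be sketched as independent work items handed to a small pool. The function and workload below are illustrative stand-ins, not anything from the post:

```python
# Sketch: independent pieces of work divided across a small pool,
# the 2-4-way parallelism early multi-core CPUs would expose.
from concurrent.futures import ThreadPoolExecutor

def process(item):
    # Stand-in for an independent, processing-intensive task.
    # (For CPU-bound Python work you'd use processes rather than
    # threads; the programming model is the point here.)
    return item * item

items = list(range(8))
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(process, items))
```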
     
  4. pascal

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    1,968
    Likes Received:
    221
    Location:
    Brasil
    IIRC the P4 had three generations, and since the beginning heat and performance were concerns. It has some valid points, like improved bandwidth (compared to the P3), but something went wrong with the hyperpipeline.

    IMHO the P4 design was a failure from the beginning. I don't know if the problem is at the conceptual level or the implementation level, but my guess is it is the latter.

    Now Intel has only the redesigned P6 core to start again from. They can probably do something quickly, redirecting it for better performance, like higher memory bandwidth next year (800MHz or 1066MHz).
    CPU parallelism just happened because the cores are small, so it doesn't cost much to put two cores on the same die. Definitely it will be much better than Hyper-Threading :)

    Software will slowly start to use it, but I don't expect much. PC software in general is low-quality, hype-driven software. I really dream about a good, clean, light RTOS with a high-quality human interface and tools.

    But the redesigned P6 core is not the ultimate core. Maybe in the future we will see something (a new core) much better: extremely superscalar, improved FPUs, maybe something like a vector unit.

    Also, memory latency and bandwidth have to be improved. The multipliers are too high. IMHO this is what is holding CPUs back today. Cray said that you cannot fake bandwidth.
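    The "multiplier" concern can be made concrete with the standard average-memory-access-time formula, AMAT = hit time + miss rate × miss penalty. A quick sketch with illustrative numbers (not measurements of any real chip):

```python
# As core clocks ramp against flat DRAM latency, the miss penalty
# measured in CPU cycles grows, so even a small miss rate dominates.
# All figures below are illustrative.
def amat(hit_cycles, miss_rate, miss_penalty_cycles):
    return hit_cycles + miss_rate * miss_penalty_cycles

# A 2% miss rate with a 300-cycle penalty adds 6 cycles of stall
# on top of a 2-cycle cache hit, quadrupling the effective cost.
print(amat(2, 0.02, 300))
```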

    I don't know if the hyperpipeline will be used again with a new design and better project management, but CPU design costs too much and the business value is too high.

    Another future possibility is having the CPU, some generalized GPU, and an on-chip memory controller on the same chip, sharing a large, fast, low-latency UMA 8) As for power consumption, heat generation and dissipation: consoles like the PS2 and GameCube should be the model.

    IMHO we need some kind of PC-console hybrid:
    cool, quiet, low-heat, efficient, low-cost, visualization-driven, flexible, and with a lot of power.
     
  5. MfA

    MfA
    Legend

    Joined:
    Feb 6, 2002
    Messages:
    7,610
    Likes Received:
    825
    You think that it could have been running at an even higher clock with a better implementation, and that would have saved it?

    I think we need such a hybrid too; maybe if enough people buy IBM's workstation we will get it. Because x86 won't get out of the chicken-and-egg problem anytime soon, and graphics card developers can only do so much while a large part of the cost of a PC still goes to supporting some archaic wide superscalar monstrosity, even if they start supporting more generic processing.
     
  6. pascal

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    1,968
    Likes Received:
    221
    Location:
    Brasil
    Higher clock, no, but some kind of improved Northwood core with lower heat and higher instructions per cycle maybe could have saved it.

    It is a shame we are still using x86 in the 21st century. I hope the best for IBM too.

    Who else could do this hybrid for us?
     
  7. MfA

    Legend

    Joined:
    Feb 6, 2002
    Messages:
    7,610
    Likes Received:
    825
    I was talking about the Cell workstation.

    The problem isn't x86; PPC is no better ... which is not to say I think they are bad, I just think that wide superscalar implementations make no sense. A simple in-order, dual-issue architecture with a scalar and a SIMD pipeline is what is needed for parallel-oriented processors (and both x86 and PPC could do that decently enough; not ideal, but as x86 has proven, you can do well enough with a non-ideal ISA).

    If NVIDIA or ATI merged with someone with an x86 license (and preferably a fab), they might be able to pull it off.
     
  8. Entropy

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,360
    Likes Received:
    1,377
    From a computer architecture point of view, I find it extremely disturbing to see 256 MB of very fast RAM that only the GPU can use (and in practice typically doesn't need!). Imagine the positive overall effect if the CPU designers knew they would have access to such resources.

    I think you put your finger on Microsoft's Great Fear. Look at the PS2. Look at a PC box. Imagine a near future where someone could buy a new gaming console, hook it up, and play games, watch films, and, if they like, access the net and do small-time utility computing tasks. Just how interested will consumers be in a bulky, noisy PC that, adding insult to injury, is more expensive?

    Of course, consumers already have that option today, but this generation it turned out that the consoles simply removed PC upgrade impetus. Which is bad enough, in Microsoft's view. (The first vision has a better chance of becoming reality next time around, since in five years or so quite a few people will have TV screens that can do a decent job as monitors as well, unlike the situation for the PS2.)

    The console doesn't have to replace the PC completely - simply turn it into a utilitarian commodity, and the PC has lost much of the battle for consumer dollars. I would contend that this has already happened to a large degree. The PC can fight back by going to extremes - $500 gfx cards drawing over 100W aren't an option for consoles; on the other hand, the PC market for such beasts isn't exactly huge by either console or total PC volume standards.

    IMO, the best path for the PC is to take a leaf from the console book, exactly as pascal describes above. It would mean taking an initial step backwards in power, but two steps forward in ergonomics and fitting into people's lives. It would also be perceived as something new and a positive change - something the PC sorely needs. And it would retain all the traditional advantages of PCs, such as upgradeability and flexibility in peripherals and software, et cetera.

    Intel scrapping the P4 roadmap shows that even they aren't hell-bent on following the beaten track. The question remains - when will the gfx IHVs follow suit? The trick, as I see it, is daring to take that initial step backwards in performance, and finding ways of effectively selling the advantages. I think it would be successful. It could be argued that consumers already favour such a path, as demonstrated by the continuing growth of portable market share.
     
  9. pascal

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    1,968
    Likes Received:
    221
    Location:
    Brasil
    I know.

    I agree. I was just pointing out how locked to the past we are. We have to do away with most of the legacy, for both practical and spiritual reasons, clearly signalling a new direction.

    Agree.

    [high speculation mode on] Some more possibilities.
    Nvidia buys SGI.
    A recognized brand worldwide. Lots of IP. Some engineering. Some corporate recognition. MIPS IP.

    Now SGI starts to sell some kind of open hybrid with improved MIPS CPUs, fast UMA, a generalized GPU, DVD, etc. for the desktop (gaming, SOHO and corporate) and the living room. Maybe two internal slots for expansion/flexibility.

    They also deliver HW & SW tools for integrating/scaling it in some way.

    IIRC Jim Clark wanted to go to the consumer level.
     
  10. pascal

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    1,968
    Likes Received:
    221
    Location:
    Brasil
    Me too. Also, the communication between the CPU and GPU could be improved a lot. And the idea of using generalized shader units could be realized more easily.
     
  11. overclocked

    Veteran

    Joined:
    Oct 25, 2002
    Messages:
    1,317
    Likes Received:
    6
    Location:
    Sweden
    A very logical person, from the impression I got.
    Stands with both feet on the ground.
     
  12. WaltC

    Veteran

    Joined:
    Jul 22, 2002
    Messages:
    2,710
    Likes Received:
    8
    Location:
    BelleVue Sanatorium, Billary, NY. Patient privile
    That's fine, but never lose sight of the fact that he's an Intel geek....;) (Not an independent geek. Heh...;))

    I'm not really sure what to make of that comment, as it's never really been obvious to me that Intel has ever at any time in its history been a company "based on" developing products primarily for people who play computer games...;)

    It's an interesting comment, though, and kind of an odd one, if we assume, which I do, that Colwell has not spent his time with Intel believing that the company was "based on" creating products for computer gamers...;) I refuse to believe that he's just now found out that Intel was "based on" some fundamentally different concerns, and made these remarks out of his shock at this discovery...;)

    Perhaps it's a back-door apology offered in advance for Intel being unable to push x86 performance much further, which he assumes will be of concern to computer gamers, and he wants them to know that while Intel appreciates their business it's important that gamers remember that Intel has other fish to fry...?

    Still, I'm not convinced he meant to say that, either, exactly. Just an odd remark in this context, I think, as I really don't know a soul who's ever thought that Intel was based on creating products for computer gaming.

    Interesting that you'd use the word "age" in this context. It's actually more like "five years" from Intel's perspective, as opposed to an "age," don't you think? Wasn't it 1999 when the primary x86 workhorse for Intel was the PIII, which it struggled mightily to bump to 1GHz in response to the cpu performance of AMD's K7, which was nowhere near as dependent on ramping MHz clocks for its overall processing performance? Prior to AMD's introduction of the Athlon, it's absolutely certain that Intel was in no MHz-ramp rush whatever, as the company routinely released new models of older cpus clocked 50-75MHz higher than the last one, with somewhat large gaps of time in between, in a lazy, unconcerned fashion befitting a confident monopolist. Intel found it couldn't ramp the PIII much in MHz in response to the K7, though, and then the P4 made its debut.

    Remember in the beginning how Intel talked often in glowing terms about ramping the P4 to "10 GHz," eventually? And Intel was saying things like this without the slightest clue that it could take the P4 to 10GHz in the first place, and I found it amazing that people gave it any credence at all at the time. Interestingly enough, you never heard similar talk out of AMD at the time about the future MHz performance of Athlon, because AMD was too busy looking for ways to increase processing performance other than in ramping an architecture to 10GHz. It strikes me that Intel is only now apprehending the "core" notions AMD was working with when it was designing the original K7 prior to introducing it.

    The thing about performance in cpus is that there are other ways to describe it apart from MHz...;) It's heartening to see that Intel is finally acknowledging this publicly.

    (It's also a bit silly and utterly facetious, too, since Intel has always known this basic fact--Itanium proves the premise conclusively. But then, so does Athlon, G5, etc., and of course Intel doesn't like to talk much about that.)

    Is it really "notable" that he'd say this, considering that Intel opposed x86-64 from the start, and was busy publicly telling the world that "Hey! If you run x86 software, then relax! You don't need 64-bit computing. But the good news is that when you get to Itanium you're going to love it!"...? Also, there's no doubt in my mind that by far the biggest piece of "baggage" relative to x86 that Intel would like to chuck is AMD...;)

    Also, the last remarks you make as to general "graphics engines" and "consoles" and what you term "stagnating cpu speeds" by which I assume you mean "stagnating MHz clocks"--which as I pointed out does not have to mean "stagnating performance" at all--sound very much like you assume that he's speaking for AMD and everybody else. I don't think it would be wise to view any of his remarks outside of an Intel-specific context.

    "Clean sheet" just sounds so "clean," doesn't it? It surely sounds better than saying "An architecture wholly incompatible with the entire world- wide x86 software market," no doubt. That's the other kind of "baggage" Intel should have considered--the hundreds of millions of dollars, if not billions of dollars, that companies and individuals have invested in x86 software in the last few years. That's definitely not the kind of "excess baggage" companies and individuals consider disposable, is it? Obviously not...;)

    I have no idea what he meant by it--just as I had no idea what he was talking about in saying that it should be understood that Intel couldn't be a company "based on" making products for computer gamers...:) (Since Intel never has been that--and he might be the only person alive confused about that, should he actually ever have thought that himself.)

    Why should IHVs be concerned at all about an off-the-cuff remark which says, essentially, absolutely nothing?....;) Intel talked about 10GHz for the P4, too, which was a lot more specific than this remark, and nothing came out of that, either. Intel similarly did a lot of talking about Rdram, etc. that proved wholly inaccurate. I think what will "concern IHVs" coming from Intel is when Intel introduces and markets retail 3d chips & reference designs competitive with theirs--that's when I think they'll be concerned. Intel got into the retail 3d-chip business a few years ago, briefly, in trying to prove the validity of the AGP bus's practical application for 3d gaming, got their socks knocked off by local-bus products from 3dfx and nVidia, and they took their marbles and went home as I recall.

    First, I think that Intel needs to say something specific here , and then Intel needs to do what it said, and then it will be the proper time for any parties in the competitive landscape to become "concerned." I think you are stretching his remark here way out of context.

    Just goes to show how differently people interpret things - I didn't see it as "backing up" Intel's ditching of the P4 roadmap at all; I saw it as an apology for Intel being unable to bring its original proclamations as to the inevitable MHz ramp for the P4 to fruition, and an acknowledgement that the expectation had never been sound from the beginning. The problem with it, of course, as I point out in response to your remarks above, is that a whole lot of relevant information as to the innovations and directions of companies other than Intel has been of signal importance in the scheme of things, and Intel spokespersons almost always talk about what Intel is doing as if nobody else existed. It's certainly a convenient security blanket to wrap up in, but I doubt it has much substance in the way of effective insulation properties...;) There's a whole world of technology out there aside from Intel, but sometimes I wonder if Intel itself won't be the very last party to realize it.
     
  13. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,928
    Likes Received:
    230
    Location:
    Seattle, WA
    Oh, I definitely think it was at the conceptual level. Higher MHz at the expense of IPC is a dead end. Intel should have known this.
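    The tradeoff being argued comes straight from the classic "iron law" of processor performance: execution time = instructions / (IPC × clock), so delivered performance scales with the IPC × clock product, not clock alone. A quick sketch with made-up figures (not real P4 or Athlon numbers):

```python
# Iron law of performance: time = instructions / (IPC * clock_hz).
# Trading IPC away for MHz only wins if the IPC * clock product grows.
# The figures below are illustrative, not measured from any real CPU.
def exec_time(instructions, ipc, clock_hz):
    return instructions / (ipc * clock_hz)

n = 1_000_000_000  # one billion instructions
deep_pipeline = exec_time(n, ipc=0.8, clock_hz=3.0e9)  # high MHz, low IPC
wide_core = exec_time(n, ipc=1.5, clock_hz=2.0e9)      # lower MHz, higher IPC
assert wide_core < deep_pipeline  # the higher IPC*clock product finishes first
```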
     
  14. Saem

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    1,532
    Likes Received:
    6
    The P4 wasn't a bust from the beginning. Prescott is the problem: they should have attacked IPC, but instead they went after clock rate TOO aggressively. Sustaining the clock rate, or taking small bumps, while pursuing things like TLP more aggressively would have been smarter.

    But to say it's a bust is pretty stupid considering the performance results achieved. You can go on about the K8, but it's more reliant on exotic materials for clock bumps, considering its solution is SOLELY to attack serial performance.
     
  15. ChronoReverse

    Newcomer

    Joined:
    Apr 14, 2004
    Messages:
    245
    Likes Received:
    1
    Eh? AFAIK, the K8 design was meant to increase both IPC (such a vague concept) and frequency headroom over the K7 design.
     
  16. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,928
    Likes Received:
    230
    Location:
    Seattle, WA
    Sure it was. The first P4s often couldn't outperform the 1GHz P3, let alone the Athlons of the time.

    There were some benefits to be had on the bandwidth side, but I don't see how that had anything to do with the Pentium4 architecture. That had more to do with the bus architecture: a bus which could have been strapped to any CPU architecture.
     
  17. glappkaeft

    Newcomer

    Joined:
    Jul 20, 2002
    Messages:
    17
    Likes Received:
    0
    Yes, but that was 1.5 GHz P4s running code most often compiled for 486 and Pentium 1 processors. With today's codebase I'd much rather have a 2.0 GHz Willy P4 over a 1.0 GHz P3. What I don't understand is how Intel went from the Northwood (IMO the overall best 130 nm processor) to the Prescott. With all the signs of accelerating leakage on the 130 nm process and beyond, why go with a 31-stage design and double the number of core transistors?
     
  18. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,928
    Likes Received:
    230
    Location:
    Seattle, WA
    Sure. But if you remember, the 2.0 GHz P4 wasn't released until quite a bit afterwards. If Intel had continued with the P3 line, they'd have been quite a bit faster than 1GHz by the time the 2GHz Willy was released.

    Now, what I would like to say is that not everything Intel did with the P4 was bad. There are definitely many things about the architecture that are very good for performance. But I think every single one of its benefits could have been done better on a chip designed for lower clocks and higher IPC.

    But, the Intel brass has for a very long time believed that MHz is king, that people equate high frequency with high performance.
     
  19. Vince

    Veteran

    Joined:
    Apr 9, 2002
    Messages:
    2,158
    Likes Received:
    7
    Implementation level... I wish I could post Boggs's actual comments from Micro-33, but my laptop is... er... broken; so this shall have to do:


    The rest of the article talks more about the cuts; if you have time it's an interesting read. Sorry I can't give you the actual presentation, but it's around the net somewhere and if I could find it... ;)
     
  20. MfA

    Legend

    Joined:
    Feb 6, 2002
    Messages:
    7,610
    Likes Received:
    825
    Those designs are made at the conceptual level. They decided to make the cuts, and then implemented what they had left ... the implementation was an effort to make good on a poor design, at which it succeeded as well as could be expected.
     