Bob Colwell (chief Intel x86 architect) talk.

Discussion in 'Architecture and Products' started by Entropy, May 13, 2004.

  1. Vince

    Veteran

    Joined:
    Apr 9, 2002
    Messages:
    2,158
    Likes Received:
    7
    Ahh, thanks Mfa. I stand corrected.
     
  2. Saem

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    1,532
    Likes Received:
    6
    Um... too bad your memory isn't good enough to remember that during that time they didn't have problems keeping up with AMD; in fact, they scaled back their roadmap when it came to speed bumps for that reason alone. Mind you, that was performance measured on RDRAM platforms, and only comparing the flagships of each company. Not price/performance or anything like that.
     
  3. Entropy

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,360
    Likes Received:
    1,377
    I'll respond to this, but I'll be quite heavy handed in the editing.

    To me the conclusion is pretty obvious - when asking the question "Is there any group of significant size that actively desires higher performance?", the only answer that crops up is "Gamers".

    And when he says "but you can't base a 30 billion dollar company on them", he implies that Intel won't drive their mainstream CPU development based on the interests of that group. And now we see the first practical result of that - the P4 architectural branch has been cut off. It will probably get a die shrink, but that's it. Such a roadmap change is no small thing, by the way; the entire industry, from memory makers to Dell, is part of it. Dell has probably even helped drive it.

    The short-term consequence of that roadmap change is that it will take a very long time before we see a factor-of-two performance improvement over the 3.06 GHz P4 - very far from the 18 months it roughly took during the 80s and 90s, up to the beginning of 2000. And if that means that gamers aren't happy, then so be it.

    Incidentally, I feel that their "let's take the mobile chip and adapt it for the desktop" is a reasonable short-term approach. Taking a longer-term view, it might be reasonable to ask whether there have to be desktop chips at all. Intel could let their marketeers loose, simply declare "the era of ergonomic computing", say that all their mainstream chips would fit within a XXW power envelope, and bring out reference solutions that take advantage of that. World+dog would rejoice, new business opportunities all around. Not likely to happen though; the inertia in the industry is enormous.

    Of course competitive pressure enters into it, but there has always been some. In the early 80s it was the 68000 family, for instance; then there was RISC, there have been x86 clones, et cetera. The point is that historically Intel competed on performance (and industry thumb screws), not on power consumption or on the amount and quality of integrated functionality. "Age" stretches back to the 8086, easily. The performance-versus-power issue simply wasn't the overwhelming concern it is today.
    (By the way, I'd say that the K7 and the P3 were within spitting distance as far as IPC was concerned, though the nod would have to go to the K7.)

    It is "notable". Particularly in view of the criticism he levelled at Itanium. They lost the XBox2 to another architecture, which is decent volume even in Intel's book, so it is not as if they do not try to compete with x86. In fact, he was so emphatic when he separated compatible and incompatible approaches that I wouldn't be terribly surprised to see an entirely new architecture out of Intel in less than 5 years. He really left very little doubt as to his personal opinion, and the man is a very senior employee at Intel. Does anyone honestly believe that Intel can see something like the Cell architecture and not consider that it may be a good idea to have something more radical up their sleeve than on-die x86 multiprocessing to counter with, in case this first Cell processor turns out fine, and is easily scalable to boot?

    And no, I mean exactly what I wrote. Stagnating speeds, including performance (as architectures are largely untouched since the introduction of the P4). How long ago was the 1400 MHz K7 introduced? I can almost go out and buy something twice as fast today, depending on what benchmarks you use. Thing is, CPU speeds used to double every 18 months, and we're nowhere near that pace today, nor does this stagnation look like a temporary hiccup.
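
    (As a quick sanity check on that pace - purely illustrative Python, assuming a clean 18-month doubling period and roughly three years since the 1400 MHz K7:)

```python
def scaling_factor(months_elapsed, doubling_period_months=18):
    """Expected speedup if performance doubles every fixed period."""
    return 2 ** (months_elapsed / doubling_period_months)

# ~36 months at the historical pace would have meant ~4x, not the
# roughly 2x actually available today:
print(round(scaling_factor(36), 1))  # -> 4.0
```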

    Bob Colwell didn't strike me as particularly confused.
    Anyway, I don't know what he meant by the remark - that's why it is interesting speculation fodder, no? Because it could mean a number of different things, some interesting, some not. For instance, two facts:
    1. Intel is currently eating up the graphics market from below.
    2. GPUs and CPUs do not compete in functionality.
    Could either of these change in any way?

    You seem to be of the opinion that he was spending this seminar making excuses for Intel. That's not my impression of it at all. So different people do indeed interpret things differently.
     
  4. glappkaeft

    Newcomer

    Joined:
    Jul 20, 2002
    Messages:
    17
    Likes Received:
    0
    Well, they tried to release faster 180nm P3's but we all know what happened to them. Intel pushed the P3 until it broke (originally it was supposed to scale at a much more leisurely pace to around 700 MHz) and still couldn't touch either Athlons or P4's. I don't think that Intel could have gotten much more out of the P3 architecture without a full redesign like the PM, and there wasn't any time for that.

    I take the opposite viewpoint – I don't think everything Intel did with the P4 was good (e.g. an L1 cache that sometimes behaves like it's direct mapped, and slow shifters). I think that the decision to go with a speed-demon design was one of several good ones available, and certainly the P4 matched the Athlon on the 180 nm process and often outperformed it on the 130 nm process. Sure, it's more dependent on optimised code, but Intel is big enough to force that on developers (just see what happened between IL-2 and IL-2 FB), and in the long run that's a good thing for everyone. Normally one would expect a 90 nm Netburst core to have competitive performance, but somewhere along the way some combination of core fixes, more use of automated design tools, mystery transistors, overestimating the effect of strained silicon, 90 nm leakage and 31 pipeline stages borked the Prescott.

    Yes, but that is only wrong if they don't deliver the performance, and the Alpha EV5 and the P4 show that speed demons can work very well.
     
  5. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,928
    Likes Received:
    230
    Location:
    Seattle, WA
    It's not like they had to go the route of the P4, though. My point was that the P3 had much higher IPC than the P4. You'd think it would have been pretty straightforward for Intel to have either maintained or enhanced IPC for their next architecture. But no, they decided on "MHz or bust." This was a dead end, and they should have known it.

    It's not that high frequency is a bad thing to pursue. It's that pursuing high frequency at the cost of IPC is bad. That's what Intel did, and now they're paying for it.
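
    (Delivered performance is roughly IPC times clock, which is why trading one away for the other can cancel out - a toy Python illustration with made-up numbers, not measured data:)

```python
def relative_performance(ipc, freq_ghz):
    """Throughput scales with instructions-per-clock times frequency."""
    return ipc * freq_ghz

# A high-IPC design at a modest clock can tie a lower-IPC design
# that needs a much higher clock just to keep pace:
high_ipc = relative_performance(ipc=1.0, freq_ghz=1.4)
low_ipc = relative_performance(ipc=0.7, freq_ghz=2.0)
print(high_ipc == low_ipc)  # -> True
```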
     
  6. WaltC

    Veteran

    Joined:
    Jul 22, 2002
    Messages:
    2,710
    Likes Received:
    8
    Location:
    BelleVue Sanatorium, Billary, NY. Patient privile
     
  7. I.S.T.

    Veteran

    Joined:
    Feb 21, 2004
    Messages:
    3,174
    Likes Received:
    389
    Jesus christ, that takes up about 46% of the scroll bar on my IE. O.O Very interesting though. Edit: That was 3608 words, Walt. I am truly impressed.
     
  8. MfA

    MfA
    Legend

    Joined:
    Feb 6, 2002
    Messages:
    7,610
    Likes Received:
    825
    GPU manufacturers are competing for gamers, and the CPU is dragging them down compared to their main competitors ... consoles.

    CPUs are doing occlusion culling now, and doing a lousy job. That has to move to the GPU sooner or later. Physics? Probably going to move to the GPU too. AI? Well, it doesn't need SIMD, so the CPU has a bit less of a disadvantage, but it is still massively parallel (each actor is independent within a given timestep). For gaming, the CPU is a poor match for any computationally intensive task.
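
    (That per-timestep independence is exactly what makes the work parallel - a minimal Python sketch of the snapshot-update pattern, with made-up actor state:)

```python
from concurrent.futures import ThreadPoolExecutor

def update_actor(state):
    """One actor's next state depends only on the previous timestep."""
    position, velocity = state
    return (position + velocity, velocity)

def step(actors):
    # Read the old snapshot, build a new one: no cross-actor
    # dependency, so every update in the step can run concurrently.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(update_actor, actors))

print(step([(0, 1), (10, -2)]))  # -> [(1, 1), (8, -2)]
```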

    Maybe the symbiosis won't change, but then PC gaming will start bleeding even more players.
     
  9. SA

    SA
    Newcomer

    Joined:
    Feb 9, 2002
    Messages:
    100
    Likes Received:
    2
    When you look at the computation market, GPU manufacturers have targeted the floating point performance segment far more than the CPU manufacturers.

    I agree that pretty much all of the computation-intensive tasks will eventually move to what is now considered the GPU. This includes all the physical simulation, collision detection, 3d graphics (including all the culling), etc.

    This means the GPU of the future will need to gradually transform into a massively parallel general purpose floating point vector processor that can be programmed using standard programming languages like C++. This in turn means general purpose addressing, branching, and stack management. Something that looks more like a FASTMATH processor than a current GPU. However, one major difference compared to a CPU is that the vast majority of the transistors will be used for actual logic with large numbers of floating point ALUs rather than for on-die cache. Instead future GPUs/vector processors will likely continue to rely on very high external memory bandwidths with a small amount of on-die cache.
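
    (The workload described here - wide floating-point math streamed over large arrays with little cache reuse - is essentially the classic SAXPY pattern; a tiny illustrative Python version:)

```python
def saxpy(a, x, y):
    """y <- a*x + y elementwise: pure streaming floating-point work.
    Every element is independent, so a wide vector machine can chew
    through many per clock, and there is little reuse for a cache."""
    return [a * xi + yi for xi, yi in zip(x, y)]

print(saxpy(2.0, [1.0, 2.0], [10.0, 20.0]))  # -> [12.0, 24.0]
```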

    It also means scaling up frequencies significantly using dynamic logic.
     
  10. Entropy

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    3,360
    Likes Received:
    1,377
    This is the central point your post is trying to argue.
    I'm claiming that you are wrong. The problem of limited performance scaling with shrinking feature size is not Intel's alone. I can see why you'd like to kick Intel in the shin for the P4, but this problem is deeper than any particular x86 implementation.

    Now, compare the above to the Sony patent on the Broadband Engine. The traditional "CPU" and "GPU" division is demonstrably not the only option. There are more ways than one to skin this particular cat. I would further contend that Intel is very aware of this. The interesting question is - what will they do about it? Nothing or something? And if something, how? What can be done within the current PC paradigm? Outside it?


    Drifting back to the question of whether this is the breaking of the trend of escalating power draw for PCs, it still remains to be seen if Intel's move to a core originally intended for portable applications is just to enable further scaling. Will we see 120W desktop variants of these cores? This quote I nicked from The Inquirer, from an Intel financial Q&A, gives hints:
     
  11. Saem

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    1,532
    Likes Received:
    6
    CPU and then a math co-processor - we've been there before; they could merge.

    I feel that there will come a time when a motherboard will have very fast interfaces to which one can attach processors geared towards differing workloads, however.

    I'm leaning towards the latter because we can afford that sooner than we can afford the transistors in a single package to support a unified solution.

    If I understand things correctly, logic-heavy chips have sprawling power and signalling networks. These large networks use the interconnect more heavily, and as feature size decreases the interconnect presents greater resistance. Won't all of this require greater voltage to propagate a signal, and thus increase power quadratically, compared to smaller processing units and smaller networks at higher frequencies?
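
    (A quick back-of-the-envelope on that quadratic effect, using the classic CMOS switching-power model P ≈ C·V²·f, with purely illustrative normalized numbers:)

```python
def dynamic_power(capacitance, voltage, frequency):
    """Classic CMOS switching-power model: P ~ C * V^2 * f."""
    return capacitance * voltage ** 2 * frequency

# Normalize everything to 1.0, then raise the supply voltage by 20%
# at the same frequency: power goes up 44%, i.e. quadratically in V.
base = dynamic_power(1.0, 1.0, 1.0)
print(round(dynamic_power(1.0, 1.2, 1.0) / base, 2))  # -> 1.44
```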
     
  12. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,928
    Likes Received:
    230
    Location:
    Seattle, WA
    I think the voltage that a processor can use is limited by the transistors that are used.
     
  13. psurge

    Regular

    Joined:
    Feb 6, 2002
    Messages:
    955
    Likes Received:
    52
    Location:
    LA, California
    SA -
    I hope that your description of a "massively parallel GPU of the future" is coupled with a language that allows the software developer to directly specify parallelism to the compiler. Keeping the syntax similar to C/C++/Java/C# is IMO a good idea, but to my mind using any of those languages as they stand would be like having an old grannie drive a Porsche.
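
    (The distinction in question, sketched in Python rather than a real parallel C dialect - a serial loop hides iteration independence from the compiler, while an explicit map states it outright:)

```python
# A C-style serial loop: the compiler must prove on its own that
# iterations are independent before it can parallelize them.
def serial_scale(xs, k):
    out = []
    for x in xs:
        out.append(x * k)
    return out

# An explicitly data-parallel form: the programmer declares that every
# element maps independently, so hardware is free to run them at once.
def parallel_scale(xs, k):
    return list(map(lambda x: x * k, xs))

print(parallel_scale([1, 2, 3], 10))  # -> [10, 20, 30]
```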

    Somewhat OT, an interesting link :
    http://wavescalar.cs.washington.edu/

    exciting times ahead...
     
  14. mboeller

    Regular

    Joined:
    Feb 7, 2002
    Messages:
    923
    Likes Received:
    3
    Location:
    Germany
    If I understand you correctly, ATi is already on the right track, because they have licensed the Fast14 process for dynamic logic from Intrinsity, which is also producing or licensing the FastMATH processor. So it could very well be that they have licensed the FastMATH processor too.

    IMHO there could even be another path to high-performance computing for massively parallel multimedia tasks:

    http://stretchinc.com/

    They embed an FPGA, called ISEF, within a RISC processor so that they can program the FPGA with C/C++. The performance of this combination seems to be very high. An S5000 @ 300MHz programmed in C++ is faster than a FastMATH processor @ 2GHz programmed in assembler in the EEMBC telemark benchmark.
    The only drawback I see at the moment is the slow switching between different algorithms, due to the need to reprogram the ISEF/FPGA.
     
  15. MfA

    MfA
    Legend

    Joined:
    Feb 6, 2002
    Messages:
    7,610
    Likes Received:
    825
    FPGAs are hugely inefficient in power consumption and area use where working with wide words is concerned. Anything suited to our needs would still need specialized circuitry for arithmetic, datapaths, register sets and caches, IMO. The overhead of implementing any of these in an FPGA is unrealistic.

    There might be levels below simple multi-core chips with relatively standard processors/shader units which could work well, though - something like MOVE. I still think condensed graphs might translate well to hardware.
     
  16. glappkaeft

    Newcomer

    Joined:
    Jul 20, 2002
    Messages:
    17
    Likes Received:
    0
    I think we more or less agree on that. I just don't think it was a bad idea at 130 nm and earlier, and it should have worked decently enough at 90 nm, but all this Intel talk about superscaling 10 GHz P4's is looking more and more like someone's bad acid trip.
     
  17. WaltC

    Veteran

    Joined:
    Jul 22, 2002
    Messages:
    2,710
    Likes Received:
    8
    Location:
    BelleVue Sanatorium, Billary, NY. Patient privile
    What I'm simply trying to point out is that AMD realized these things long ago when it was designing the original K7 core, and so AMD never embarked on the same kind of "less-efficient-but-clocked-to-the-stratosphere" approach to the Athlon that Intel announced early on for the P4. The roadmap cancelled here is the P4's--not the Athlon's--so that should tip you off to the fact that the problems faced by AMD with the Athlon roadmap and the problems faced by Intel with the P4 roadmap are entirely *different* sets of problems, because the strategies behind the roadmaps are different, and the cpu architectures themselves are different. AMD never at any time announced a 7-10GHz ultimate production target for the K7; Intel did for the P4, and so obviously these differing roadmap strategies produced different kinds of problems for each company to solve.

    Of course, as I said, everybody knows cpu manufacturing is getting tougher--that's not news--and it was certainly well known before Intel's P4 MHz-ramp roadmap was cancelled. What's going to count in the future is how these companies respond to the problems and challenges they face ahead. This does not simply boil down to a matter of technology--it boils down to the strategies companies employ to deal with those problems--which in turn is much more a matter of judgment than of pure technology. As you say, there are always several ways to skin the cat...;) What counts in the end, though, is whether one company chooses a better method of skinning the cat than another. With the introduction of the Athlon, AMD embarked on one method (architecturally-driven increases in processing efficiency); with the P4, Intel embarked on another (less-efficient architectures driving performance through MHz ramps achieved through process reductions), and it certainly seems to me that what's been cancelled here is the P4 strategy--not the Athlon strategy, right?

    In fact, Dothan and Itanium also eschew the strategy of driving performance through the MHz ramping of less efficient architectures, don't they? So does IBM's G5, Sun's SPARC, etc. In other words, what AMD's done relative to its Athlon strategy is actually much more common in cpu design and manufacturing than was the P4's MHz-driven strategy, even within the Intel family of cpus itself. Cancellation of the P4 roadmap tells us only about problems Intel had with its P4 MHz-driven roadmap, which Intel has found to be insurmountable. However, they are changing strategies to bring their x86 design strategy in line with the strategy used by AMD for x86 cpu design and manufacturing--just because the problems are tougher these days doesn't mean Intel's giving up--it's just shifting gears and changing strategies.
     
  18. speng

    Regular

    Joined:
    Nov 25, 2002
    Messages:
    454
    Likes Received:
    5
    I agree with most of what you're saying, but I wouldn't lump IBM's G5 into this.

    It's focused more on floating-point performance and parallel processing than on MHz, as seen by the fact that it's barely able to reach 2GHz.

    Speng.
     
  19. pascal

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    1,968
    Likes Received:
    221
    Location:
    Brasil
    Let me clarify my positions below.

    The Pentium 4

    I really don't know what happened, but there are several levels at which problems were possible:
    a - the basic idea of using long pipelines to implement a speed demon
    b - the conceptual phase of the design, based on "a"
    c - the implementation phase of the design

    Looks like the conceptual phase of the P4 design had some modifications of the initial idea, and the implementation phase had problems like:
    - few people (3) really understood the entire design
    - too large a development team
    - probably the tools were the best possible, but not good enough to help with such a complex design

    The point is, I am not sure a long pipeline is a bad idea if you:
    - have a better ISA
    - don't have heavy logic overhead for hyperthreading, i32e, DRM, etc...
    - have better tools to develop it

    Maybe if Intel had redesigned/improved the Northwood core they could have:
    - implemented a faster 1MB L2 cache (5% performance increase)
    - implemented a larger L1 micro-ops cache
    - implemented a larger L1 data cache
    - added a second FPU unit
    - gained some speed without the hyperthreading logic overhead

    This could:
    - have "only" around 80 million transistors
    - give more chips per wafer and better yields
    - run colder
    - have higher IPC

    Then a 2.8GHz P4 could have had 3.2GHz performance with 1.8A GHz heat dissipation.
    And a 4GHz P4 could have been as fast as a 5GHz Prescott.
    This could have been a winner in the current PC market :)

    edited: also, we don't know what impact a low-latency on-chip memory controller could have with a long-pipeline CPU.

    The wintel PC model

    Independently of the P4's success or failure, IMHO the current wintel Personal Computer model is old and inefficient, and it is time to change to something better.
    We need some alternatives urgently. Maybe I will post a new thread about it.
     