Intel Larrabee set for release in 2010

Discussion in 'Architecture and Products' started by B3D News, Jun 22, 2007.

  1. Rys

    Rys Graphics @ AMD
    Moderator Veteran Alpha

    Joined:
    Oct 9, 2003
    Messages:
    4,182
    Likes Received:
    1,579
    Location:
    Beyond3D HQ
    Graphics application programmers using it for something mass-market will never program the chip in x86 directly, so I think that's moot. The graphics programming landscape shifted from raw, low-level coding to high-level languages some time ago, and I can't see it going back.

    As for HPC, that space has always had a multitude of architectures and ISAs to consider depending on what's going on, so x86 isn't a killer feature there either. NVIDIA offer a GPU programming model that's not unlike what HPC has to deal with already, using extended C no less (which is a killer feature in my opinion), and it's only going to improve in terms of ease and hardware features to help over time.

    So as a parallel programmable architecture, I don't think being x86 is a big advantage for HPC either.
     
  2. Hannibal

    Newcomer

    Joined:
    Mar 19, 2007
    Messages:
    16
    Likes Received:
    0
    Oh yeah, I'm sure they'll upgrade to a real ISA, and they'll keep pushing CUDA and other tools... but that won't stop them from being a fabless semi company with a niche, boutique ISA that's in direct competition with a commodity x86 part from Intel.
     
  3. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    What's a real ISA?

    Apart from that, I partially agree with the rest. NVIDIA will probably need to address CUDA's shortcomings (it's not very easy or straightforward to code for) in the next couple of years if they want to be competitive with Intel, programming-model wise.
    Can NVIDIA do that? No doubt about it; they certainly will.
     
  4. Hannibal

    Newcomer

    Joined:
    Mar 19, 2007
    Messages:
    16
    Likes Received:
    0
    My point is not that people will program these chips in assembler. My point is that x86 compatibility brings with it a ton of tools and binaries that can run on the chip with zero tweaking or recompilation. It just works. That's why x86 continues to be compelling in whatever new form factor or usage scenario Intel can shoehorn it into, from UMPC to HPC.

    Are you honestly suggesting that x86 isn't making major headway in the HPC market?

    Also, for what it's worth, Intel showcased a set of C extensions (Ct) for data-parallel (i.e. Larrabee) programming... and of course, the tools being x86 and all, it was pretty painless for them to get it all up and going... the same is true for the OpenMP stuff they were demoing across the room... again, if you're just extending an x86 app or tool, you have to do so much less work.
     
  5. BRiT

    BRiT (>• •)>⌐■-■ (⌐■-■)
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    20,551
    Likes Received:
    24,483
    As for Sweeney, he said dedicated video hardware would be obsolete years ago, circa the Voodoo 2 timeframe. He also said NV30 rocked his world and would crush the competition. We know what happened there. He's been more off about future hardware trends than on.
     
  6. Rys

    Rys Graphics @ AMD
    Moderator Veteran Alpha

    Joined:
    Oct 9, 2003
    Messages:
    4,182
    Likes Received:
    1,579
    Location:
    Beyond3D HQ
    Point taken, but I was coming at it from a 3D graphics standpoint, where Intel will have to provide the expected programming infrastructure there, mostly on Windows.

    Of course not. What I'm suggesting is that x86 is far from being the major player in HPC, and I can't see why yet another player with C programming for their hardware (on x86 hosts let's not forget) can't make headway into HPC too.

    Good, I'm happy to hear that. But then I also don't think it's that big of a deal when your programming model supports C and requires you run on x86 anyway, which is the case for NVIDIA and CUDA. They're in the same boat there, from a programmer's perspective, as Intel and Larrabee (IMO).

    I'm thinking about it as someone who might want to develop and deploy an HPC app targeted at a massively parallel FP machine. I don't think the difference there is in Larrabee's favour in any way, but I'll wait and see how Intel expect folks to program the thing in a parallel fashion (for non-graphics apps) before I truly make my mind up.

    Edit: Bear in mind that I'm coming at it from the perspective of Larrabee being parked on an add-in board (if that wasn't obvious). Obviously if it's the main processor, the dynamic changes a bit.
     
  7. Arun

    Arun Unknown.
    Legend

    Joined:
    Aug 28, 2002
    Messages:
    5,023
    Likes Received:
    302
    Location:
    UK
    I agree with that, although then we're just back in the domain of uncertainty.
    If Larrabee is optimal or near-optimal at FP-heavy x86 applications without recompiling or modifying a single line of code (and without code morphing), then I will eat my hat, and I can assure you that many other posters on this board would also be willing to do so.

    Being near-optimal without recompilation would indeed be a killer feature. But if you need to recompile (which I'm sure you will), x86 compatibility has essentially become as much of a gimmick as it was on the original Itanium. In fact, I'd be tempted to point out a few other parallels there, but no matter.
    Well, that's nice. I'd be more impressed if they were also under NDA for NVIDIA and AMD GPUs in that timeframe (whatever that may be), however, and could compare the two. Because as it is, that's roughly similar to being impressed by NV30's NDA'd specs when ATI had just released the original Radeon.
    Real ISAs are always (and will always be) inferior to schemes such as NVIDIA's PTX when the one-time compile/optimization cost is not a problem, which it obviously is not for data-parallel HPC applications. If you think otherwise, I'd gladly counter any argument you might have there! :)
     
  8. nutball

    Veteran Subscriber

    Joined:
    Jan 10, 2003
    Messages:
    2,496
    Likes Received:
    983
    Location:
    en.gb.uk
    And NVIDIA have a 3+ year head-start in getting people using their C extensions.

    The HPC crowd don't need binary compatibility, we just recompile our codes. Yes x86 is making headway but the x86-ness isn't the thing that makes it attractive, it's the price-performance ratio.

    This is very true.

    OpenMP is a mixed bag in my experience. My personal feeling is that one of its major failings is that it's too easy to use (paradoxical as that may seem). It has some serious shortcomings in certain areas (e.g. memory placement). It's a great way to scale from 1 to 10 threads; it's not so hot going from 10 to 100. IMO it hides too much from the programmer for it to be viable for extreme scalability on a NUMA architecture.
     
  9. Voltron

    Newcomer

    Joined:
    May 25, 2004
    Messages:
    192
    Likes Received:
    3
    "niche fabless" has nothing to do with market acceptance of products.

    Also, perhaps you have noticed that NVIDIA has sold half a billion chips in the past 10 years and is now selling at a rate of over 100 million chips a year and growing pretty rapidly. So I really don't think "niche" is applicable.

    If you are implying all fabless semiconductor companies are "niche", I am not sure you are correct there either. Just maybe the economics that TSMC brings to the table are far superior to Intel's, in spite of Intel's process lead. Indeed, that is what allows NVIDIA to build comparatively larger chips than Intel and earn almost equivalent gross margins at somewhere between 1/5 and 1/10 the price of an Intel chip on average (as you are surely aware, memory and other components make up a substantial portion of a graphics board's cost). Those are very powerful economics to deal with. So Intel has some advantages (at least for now) by operating fabs, but NVIDIA has a big one by not operating a fab.
     
  10. Hannibal

    Newcomer

    Joined:
    Mar 19, 2007
    Messages:
    16
    Likes Received:
    0
    Voltron,

    I didn't say or imply that fabless == "niche." I said that their /ISA/ is niche.

    The fabless part becomes a problem when you're trying to design a really complex, high-performance part and make it fit someone else's process. This is where Intel as an IDM has an advantage: they design processors for their specific process technology. Their fab engineers and architects are under one corporate roof, so to speak, and they share a lot of specific knowledge that lets the architects design things that Intel can produce in volume at good yields. But this foundry vs. IDM tangent is off-topic...
     
  11. Voltron

    Newcomer

    Joined:
    May 25, 2004
    Messages:
    192
    Likes Received:
    3
    True - I think I read that with a bit of dyslexia.

    But Hannibal, you did bring up NVIDIA being fabless as a point. And while in-house manufacturing clearly has its advantages, it would be naive to think that NVIDIA and TSMC aren't working extremely hard to close that gap. Meanwhile, in spite of those advantages, the proof of the power of NVIDIA and TSMC's cost advantage is in these companies' margins, financial statements, and ASPs. So people can talk all they want about Intel this and Intel that, but economically the evidence is there.
     
  12. Hannibal

    Newcomer

    Joined:
    Mar 19, 2007
    Messages:
    16
    Likes Received:
    0
    Of course it won't be optimal without a recompile. Indeed, it has been known for a while that Larrabee programming is x86 + some GPU-specific extensions. So certainly nothing will run optimally on Larrabee without a recompile any more than code written for SSE2 hardware will run "optimally" on SSE4 hardware without a recompile. But that's not the point.

    The point is that you don't /have/ to do a recompile to get the world's largest installed base of software and tools up and running on it. You only do a recompile when you need a specific piece of new functionality or a performance boost.

    As I said in my response to Rys, it's all about the x86 tool chain, the relative painlessness of extending it to support new ISA features, and the fact that the same hardware will run both legacy code (albeit sub-optimally) and code that has been recompiled with the extended tools optimally.

    You guys are looking at this from a lone developer's point of view, but I'm talking about the wider x86 ecosystem picture.

    Ultimately, x86 becomes attractive in any given niche--whether it's HPC or ultra-mobile--at the exact moment that you're no longer at a real performance disadvantage for using it. Now, this is a subtly different claim than saying that x86 has some inherently attractive features from an individual code geek's point of view--it doesn't. But what it has is enormous scale, because it's backed by this huge installed base of tools and expertise.

    So the moment that Moore's Law makes it possible to use x86 in an area without suffering too badly from a relative performance standpoint, then it becomes a compelling choice for these scale-based, ecosystem reasons.

    I'm not really sure how to respond to this... other than that I just completely disagree. I mean, nobody really /wants/ to use an intermediary ISA, or JIT, or anything like this, if they could just as easily use a product that natively implements the world's most popular ISA. I'd love to hear your arguments in favor of investing, say, a substantial portion of a large company's developer resources in a proprietary, intermediary ISA when there's an x86 solution that gets you, say, 80% of the way there.

    I think the only time in the history of computing that the industry has looked at x86 on the one hand and a JIT or BT solution on the other (where you code to an intermediary ISA that's not actually implemented in hardware) and said, "I'll take the non-x86 ISA" is with Java.
     
  13. Hannibal

    Newcomer

    Joined:
    Mar 19, 2007
    Messages:
    16
    Likes Received:
    0
    Understand that I'm speaking strictly long-term here. I don't think NVIDIA is going to go under next month or next year. When Larrabee comes out, even if Intel does knock everyone's socks off with some kind of RTRT + raster badness to the point that every gamer on earth must immediately rush out and buy an Intel GPU, it's going to take the market a while to figure out that the tectonic plates have shifted, and that a "GPU" that's really a many-core x86 part is fundamentally a game-changer.

    I mean, all the RISC workstation vendors didn't go out of business when Intel launched the PPro. Of course, I think that GPUs vs. Larrabee will play out on a much more compressed time horizon than RISC vs. x86 did.
     
  14. Arun

    Arun Unknown.
    Legend

    Joined:
    Aug 28, 2002
    Messages:
    5,023
    Likes Received:
    302
    Location:
    UK
    Well, if you don't care about the overall performance or the vector FPUs, why not just run your code on a vanilla ARM core? :) I get your point, but you're falling in the same trap that plagued the Itanium design team, IMO. Binary compatibility is only appealing if it delivers 'good enough' performance. For a chip that is exclusively aimed at high-performance workloads (whether that is HPC or Graphics), there is no such thing as 'good enough'.

    That is one thing I completely and utterly agree with. The tremendous investments in toolchains for x86 are certainly an advantage, although I'd also like to point out that it's not perfect yet for multithreaded applications, and that debugging programs with tens of threads can be a nightmare right now IMO. I would certainly hope and expect this to be much easier in 2009+ than today, however.

    Which corresponds to the exact second when performance becomes 'good enough', because x86 can never be optimal. While this indeed makes it attractive for the ultra-mobile market in the long-term, the very definition of HPC tends to be that there is no such thing as 'good enough'. The only reason (except the toolchains) why x86 is attractive in HPC today is that it has better economies of scale in terms of *production* and R&D. You know, the exact same ones GPUs also enjoy today...

    Moore's Law implies nothing regarding relative performance penalties. If you are 50% less efficient, that won't magically change when you're thinking 32B transistors vs 16B transistors compared to when it was 32M vs 16M. As such, x86 only becomes attractive when it either has economies of scale (for production + R&D) that other solutions do not enjoy, or when performance has become 'good enough'. In the case of traditional GPUs vs x86, neither of these potential advantages exists.

    That is correct, but if and only if perf/$ and perf/watt are roughly similar.

    First, let me counter some of the negative aspects you're pointing out. It should be noted that PTX is not proprietary, so whether AMD and Intel support it is really up to them. In the end, there is nothing that prevents interested parties from writing an efficient PTX-to-x86 converter. And if NVIDIA feels that would actually put their hardware in a good light, they could even easily do it themselves.

    Really, your entire argument there is based around three points, so I'll answer them one by one: a) x86 has a much better toolchain today. b) x86 is easier than the alternatives because everyone is used to it. c) JIT-like techniques have nearly never worked before, so why would they suddenly make sense?

    - A: NVIDIA and AMD have every interest in the world to invest aggressively to reduce the gap there between now and 2009/2010. I doubt they'll get there, but I don't think anyone can deny that it will be less of a problem (or advantage, from Intel's point of view) in that timeframe.
    - B: Everyone is used to the latest architectures implementing x86, not the ISA itself. Optimizing for Larrabee and optimizing for Conroe are such fundamentally different tasks that you'll basically have to relearn everything, as far as I can tell. Abrash might be at an advantage here for various reasons, but I'm very skeptical about the rest of us.
    - C: Traditional JIT languages only execute each code fragment a small number of times. Just doing the final stages of optimizations and compilation before running the program on a GPU is not the same thing at all, because that exact same code will be run thousands, or millions, or even billions of times. The overhead is pretty much negligible, and you can gain a lot from that extra bit of optimization.

    In the end, I think many of your arguments pretty much fly out the window when you consider how large the GPGPU market will likely be by 2H09, because Intel won't have anything to compete with that before then. We'll see how fast that goes, though.
     
  15. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    I agree with the premise that Larrabee will be a serious threat to GPUs moving into HPC, or rather, the segment of HPC GPUs are initially targeting.

    GPUs are going into the cheap flops segment, the segment that cheap x86 clusters have basically conquered.

    x86 compatibility can be a powerful draw in some cases, though it is mitigated by the fact that HPC has a lot less inertia than x86's desktop stronghold.

    There are things x86 compatibility requires or has brought along in the periphery that GPUs in HPC will have to combat.

    Exceptions:

    I've seen anecdotal evidence that precise or relatively precise exceptions (or just having exceptions at all) are apparently very interesting to those who work in HPC.
    x86 compatibility does enforce such capability, though in theory any other full-blooded ISA would as well.

    Exceptions and other non-performance features are signs of an architecture that is designed with the idea that what it computes matters.
    GPUs have a well-known legacy for not being all that rigorous, and the first instantiations of their GPGPU products aren't far enough away from that legacy.

    Flexibility:
    Larrabee can simply do more than GPUs can. The benefits of x86 compatibility mean the cores are capable of existing independently of a master CPU: Larrabee can be its own master.
    Some of the released slides indicate that this will be the case for some potential systems.
    That would make Larrabee an almost drop-in replacement for some applications, though the performance drop without at least a quick recompilation would be horrendous.
    Then again, a little performance is infinitely more than none, which is what GPUs can do on their own.

    It's also where Larrabee can bring a cost advantage. GPUs will not escape the necessity of having the CPU along as a master device.
    There are likely workloads where Larrabee could dispense with a separate CPU and either not care or actually gain performance.

    If the desire is for cheap flops, and this is the market GPUs are targeting, it may help to cut out the middleman.

    It may be possible for Larrabee to position itself in other segments GPUs cannot touch. That means even if GPUs hold their position in the cheap-flops sector, Larrabee can strike from safe harbor on the other side.
    This assumes Intel's other chip designs don't get in the way.

    Efficiency:
    This is a wild card, and not a guarantee for Larrabee.
    There are signs that it will have an edge.
    Taking Folding@home as an example: it was pointed out that flops results for GPU work units were in some ways inflated versus those run on Cell (an architecture somewhat closer to Larrabee). GPUs in FAH do a lot of throwaway computation, while Larrabee might be able to get away with a little less. This is something that is going to be much more critical in the future, and in my next point.

    If Larrabee can manage greater efficiency, and GPUs have some pretty drastic fall-off in non-ideal workloads, then something like a factor of two or three shortfall in peak performance may not be enough to keep GPGPUs ahead.

    If Larrabee is capable of twice the SP flops that it has DP flops, then there may not be a gap at all.

    Power Power Power Power:
    Intel has a history with power management.
    Intel has had a power-aware philosophy beaten into it in the last few years.
    GPUs currently are very coarse in their management (hey we have 2 speed modes!).
    Regardless of peak performance, power is currently one of the most dominant factors in high-performance design, if not the most dominant.
    I have not been impressed with some of the indications of the attitude various GPU execs have about that subject.
    If they have changed their minds in recent months, they are still several years late to the party.
    GPUs are efficient compared to hefty high-IPC cores trying to spit out peak flops. They will not be so lean if faced with a design that is far more power-aware than they are.
    Whether Larrabee will be power efficient is a fine question. It may very well not be.
    If GPUs progress as they have done, it won't matter because they'll be no better at best and likely worse.

    Intel:
    A company with the size and resources of Intel can handle long protracted fights, can establish or force beachheads, and can suffer more setbacks.

    Unknown spoilers:
    The x86 ISA.
    Without knowing more, we don't know how much of a penalty Larrabee will pay for working with this ISA.

    The vector extensions.
    Without knowing more, we won't know what omissions might hold the chip back.

    Larrabee's design quirks:
    We may find out its early rendition will be an interesting design, but with some set of failings and shortfalls. For any decent new beginning I bet heavily this will be the case.

    Larrabee's focus:
    Design is still king, even with a process lead and massive engineering resources.
    Choices in Larrabee's design will do more to help or hinder it than having an in-house fab will.

    GPUs:
    A lot of my speculation is predicated on the idea that GPU designers don't change anything. This is incredibly unlikely.
    They have been getting feedback for some time, so they should be smart enough to know to work on fixing their shortcomings.

    Time:
    Intel's timing will be important.
    Too much time means GPUs have time to evolve.
    Too little, and Larrabee will be rushed and come out in an environment that still favors GPUs or standard x86.

    Intel:
    A company with the size and resources of Intel has a lot of other concerns.
    Competition from its own chip lines will be a threat to Larrabee's viability.
    Intel also has a spotty history in sticking to its guns after an initial setback.
     
  16. Dave Baumann

    Dave Baumann Gamerscore Wh...
    Moderator Legend

    Joined:
    Jan 29, 2002
    Messages:
    14,090
    Likes Received:
    694
    Location:
    O Canada!
    http://www.beyond3d.com/content/interviews/18/4

    Feb-2004

     
  17. Geo

    Geo Mostly Harmless
    Legend

    Joined:
    Apr 22, 2002
    Messages:
    9,116
    Likes Received:
    215
    Location:
    Uffda-land
    Mmmm, but isn't that his second (or later) shot at providing a timeframe for that prediction to come true? Sort of like how commercially viable nuclear fusion seems to always be 10 years away whenever you check? :smile:

    Tho Fusion and Larrabee are better ammunition for him than anything he had in 2004. . . or 2000. . .or 1998. . .or whenever he made that prediction for the first time.
     
  18. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    If power constraints remain as they are, Sweeney's Age of Convergence will be a short one.

    If his time frame were to come true, I predict that by 2020 we'll have some newfangled "graphics-specific CPU" and everyone will be singing the praises of some "revolutionary" separate and specialized silicon.
     
  19. BRiT

    BRiT (>• •)>⌐■-■ (⌐■-■)
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    20,551
    Likes Received:
    24,483
    Yes. The time I was referring to was way back when, not the more recent 2004 timeframe. The first time Sweeney said those things was around 1998-99. He even repeated it on the Motley Fool boards -- before B3D was even around.

    A snippet of Sweeney's "The Sky is Falling" proclamation from 11/11/1999, archived on GameSpy
     
    #39 BRiT, Jun 30, 2007
    Last edited by a moderator: Jun 30, 2007
  20. Geo

    Geo Mostly Harmless
    Legend

    Joined:
    Apr 22, 2002
    Messages:
    9,116
    Likes Received:
    215
    Location:
    Uffda-land
    B3D is older than that. :grin:
     