Larrabee: 16 Cores, 2GHz, 150W, and more...

Discussion in 'Architecture and Products' started by B3D News, Jun 1, 2007.

  1. B3D News

    B3D News Beyond3D News
    Regular

    Joined:
    May 18, 2007
    Messages:
    440
    Likes Received:
    1
    It is amazing how much information is out there in the wild, when you know where to look. TG Daily has just published an article partially based on a presentation they were tipped off about, and which was uploaded on the 26th of April. It reveals a substantial amount of new information, which we will not focus on analysing right now, so we do encourage you to read it for yourself.

    Read the full news item
    NOTE: Please discuss the architectural info on Larrabee in this thread, and the industry speculation in TG Daily's article in the other thread.
     
  2. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,004
    Likes Received:
    2,509
    Location:
    Well within 3d
    What of the possibility of a 3 operand x86 MADD?
    It would fit mathematically with the 1 TF, clock speed target, and vector width.

    edit:

    Just reread slide 31.

    Larrabee is listed as having 8-16 DP flops per cycle per core.
    That seems counter to what has been rumored previously.

    It also seems like Larrabee can handle non-SSE ops as well.
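
    For reference, the arithmetic behind the MADD suggestion can be sketched as below. This is only a back-of-envelope check, not something taken from the slides: it assumes 16 cores at 2 GHz (the thread title), a 512-bit vector unit holding 16 single-precision lanes, and one multiply-add per lane per cycle counted as 2 flops. The function name `peak_gflops` is made up for illustration.

```python
def peak_gflops(cores, clock_ghz, vector_bits, element_bits, flops_per_lane):
    """Theoretical peak in GFLOPS: cores x clock x lanes x flops per lane per cycle."""
    lanes = vector_bits // element_bits
    return cores * clock_ghz * lanes * flops_per_lane

# Assumed figures: 16 cores, 2 GHz, 512-bit vectors, 32-bit floats, MADD = 2 flops.
larrabee_sp = peak_gflops(16, 2.0, 512, 32, 2)
print(larrabee_sp)  # 1024.0, i.e. roughly 1 TF
```

    Without a fused multiply-add (1 flop per lane per cycle) the same assumptions give only half of that, which is why a MADD fits the 1 TF figure.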
     
    #2 3dilettante, Jun 1, 2007
    Last edited by a moderator: Jun 1, 2007
  3. Arun

    Arun Unknown.
    Moderator Legend Veteran

    Joined:
    Aug 28, 2002
    Messages:
    5,013
    Likes Received:
    263
    Location:
    UK
    Yeah, I was thinking a bit about that. I think what might actually make more sense for Intel is to be 3-wide: INT, ADD|MUL, ADD|MUL. That way you can keep everything at 2 operands, and you have a fair bit of flexibility anyway. If DP ran at half speed for ADD and quarter speed for MUL, it would also explain why it sports 8-16 DP flops/cycle, rather than just 8 or 16. That would also imply SP would be 32/cycle, but that's just speculation.
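
    A quick sketch of how that speculation yields the 8-16 range. Everything here is an assumption for illustration: a 512-bit vector with 16 SP lanes, two FP pipes that can each issue an ADD or a MUL per cycle, and DP rates expressed as a fraction of the SP rate. None of the names come from Intel material.

```python
SP_LANES = 512 // 32  # 16 single-precision lanes in an assumed 512-bit vector

def flops_per_cycle(pipe_ops, rate):
    """Sum per-cycle flops over the FP pipes; rate maps op -> fraction of SP speed."""
    return sum(SP_LANES * rate[op] for op in pipe_ops)

sp_rate = {"ADD": 1.0, "MUL": 1.0}
dp_rate = {"ADD": 0.5, "MUL": 0.25}  # speculated: half-speed ADD, quarter-speed MUL

print(flops_per_cycle(["ADD", "MUL"], sp_rate))  # 32.0 SP flops/cycle
print(flops_per_cycle(["MUL", "MUL"], dp_rate))  # 8.0  DP, both pipes multiplying
print(flops_per_cycle(["ADD", "ADD"], dp_rate))  # 16.0 DP, both pipes adding
```

    Under these assumptions a mix of one ADD and one MUL lands in between, at 12 DP flops/cycle.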

    One thing I'd like to highlight is that slide 17 uses a standard CSI x86 infrastructure with no traditional CPU on any of the sockets. Considering each chip apparently has only one kind of core, this implies that each core is able to run a modern operating system.
     
  4. 3dilettante

    Legend Alpha

    It seems Larrabee will be running full x86 threads. It seems curious, though, that the L1 latency is only 1 cycle.

    Gesher has a 3 cycle latency, so I wonder how they can reduce it so much for Larrabee.

    If Larrabee's vector unit has ADD and MUL pipes, it would explain why a unit that is characterized as being 512 bit can support a throughput that would require 1024 bits per cycle per operand.
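
    The 1024-bit figure can be checked with a couple of lines. This assumes the upper bound of 16 DP flops per cycle from slide 31, that each flop consumes one 64-bit element per operand, and that separate ADD and MUL pipes would each be 512 bits wide. The helper name is hypothetical.

```python
def operand_bits_per_cycle(flops_per_cycle, element_bits):
    """Bits one source operand must supply per cycle to feed the given flop rate."""
    return flops_per_cycle * element_bits

needed = operand_bits_per_cycle(16, 64)  # 16 DP flops/cycle, 64-bit doubles
pipes = needed // 512                    # how many 512-bit pipes that corresponds to
print(needed, pipes)  # 1024 2
```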
     
  5. JHoxley

    Regular

    Joined:
    Oct 18, 2004
    Messages:
    391
    Likes Received:
    35
    Location:
    South Coast, England
    :shock: How long before some smartass tries to run 16 different OS's on this kit...?
     
  6. Arun

    Arun Unknown.
    Moderator Legend Veteran

    Well, the CSI version has 24 cores according to the same slide, actually, and 96 threads. Assuming that proper virtualization is supported (which I kind of doubt, but heh), that would be quite a sight to behold indeed, hehe.
     
  7. ERK

    ERK
    Regular

    Joined:
    Mar 31, 2004
    Messages:
    287
    Likes Received:
    10
    Location:
    SoCal
    from the presentation

    I fail to understand how a multiplier (no matter how many bits) can take 150,000 gates to implement.
     
  8. santyhammer

    Newcomer

    Joined:
    Apr 22, 2006
    Messages:
    85
    Likes Received:
    2
    Location:
    Behind you
    Wasn't that the 80-cores one? From 80 to 16...
     
  9. Pressure

    Veteran Regular

    Joined:
    Mar 30, 2004
    Messages:
    1,274
    Likes Received:
    221
    The 80-core processor wasn't x86-capable.
     
  10. INKster

    Veteran

    Joined:
    Apr 30, 2006
    Messages:
    2,110
    Likes Received:
    30
    Location:
    Io, lava pit number 12
    The "Terascale" project had little to do with "Larrabee", other than both being multi-core CPU designs. The "Terascale" did use 80 cores on a single die alright..., but each of them had a very simple, non-x86 design.
    This is the magic of Larrabee, a possible "x86 chip optimized for 3D graphics and GPGPU apps in need of floating point power" of some sort.
    Kind of like a Cell BE, but with a more homogeneous design (i.e., all cores have a similar design, whereas Cell has a main general purpose PPC core and eight -PS3 has seven, of course- specialized cores).
     
    #10 INKster, Jun 1, 2007
    Last edited by a moderator: Jun 2, 2007
  11. AnarchX

    Veteran

    Joined:
    Apr 19, 2007
    Messages:
    1,559
    Likes Received:
    34
    Very interesting view of Intel's GP"GPU" and CPU future:
    [IMG]

    :grin:
     
  12. pakotlar

    Banned

    Joined:
    Mar 19, 2004
    Messages:
    805
    Likes Received:
    17
    this is their gpu
     
  13. psurge

    Regular

    Joined:
    Feb 6, 2002
    Messages:
    938
    Likes Received:
    35
    Location:
    LA, California
    Well, for one thing Larrabee has twice the cycle time, so in absolute terms it's not that huge a reduction (dunno if that's particularly relevant...). Still, didn't the P4 have a 2 cycle load-use latency on its L1 (possibly enabled by the small L1 size)? Does anyone know how Core 2 does in this area?
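
    To put the "absolute terms" point in numbers, here is a minimal sketch. It assumes Larrabee at 2 GHz with a 1 cycle L1 and Gesher with a 3 cycle L1; the 4 GHz Gesher clock is only inferred from "twice the cycle time" and should be treated as an assumption.

```python
def latency_ns(cycles, clock_ghz):
    """Convert a latency in clock cycles to nanoseconds at the given clock."""
    return cycles / clock_ghz

larrabee = latency_ns(1, 2.0)  # 1 cycle at 2 GHz
gesher = latency_ns(3, 4.0)    # 3 cycles at an assumed 4 GHz
print(larrabee, gesher, gesher / larrabee)  # 0.5 0.75 1.5
```

    So under these assumptions the gap is 1.5x in wall-clock time rather than the 3x the cycle counts suggest.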
     
  14. Nick

    Veteran

    Joined:
    Jan 7, 2003
    Messages:
    1,881
    Likes Received:
    17
    Location:
    Montreal, Quebec
    3 cycles. But L1 is twice the size.
     
  15. Pete

    Pete Moderate Nuisance
    Moderator Veteran

    Joined:
    Feb 7, 2002
    Messages:
    4,853
    Likes Received:
    232
    You meant homogeneous, of course.
     
  16. INKster

    Veteran

    :oops: :D
     
  17. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    7,583
    Likes Received:
    703
    Location:
    Guess...
    So Gesher's not out for another 3 years and Intel are already releasing slides showing how slow it will be? Lol :lol:

    Actually, at the moment Gesher interests me a lot more than Larrabee. Sure, it's only got 1/5th the floating point performance, but 224 GFLOPS DP is nothing to sneeze at and I assume it would be double that in SP. Plus Gesher should have outlandish single-threaded performance, which is something that seriously worries me with Larrabee.
     
  18. santyhammer

    Newcomer

    At this point I'm a little confused...

    [IMG]

    Photo taken from http://xtreview.com/addcomment-id-2572-view-Intel-larrabee-and-processors-sandy-bridge-(Gesher).html
    And http://xtreview.com/images/davis.pdf

    So Gesher is basically an advanced ClearSpeed copro card/Torrenza and Larrabee will be a socketable CPU? Then that picture is wrong? It seems Larrabee is the GPU then?? (See the AV decoder, display input, etc... it looks like a graphics card...)

    Is Larrabee some general x86 cores (for example 4) + some SIMD-math-dedicated cores (12-20), or just a CPU with 16-24 general purpose x86 cores? Am I going to be able to program it using OpenMP then?

    The presentation is too confusing (it mixes 80-cores + Gesher + Larrabee >< + GPU + !!$##@). I think the 80-cores is, indeed, the GPU. Gesher is oriented like Torrenza copros plugged in using PCIe, and Larrabee is a 16-24 core general purpose x86 socketable CPU, but who knows...!

    And perhaps Larrabee and Gesher will be general purpose, so they can be used as a CPU, GPU, socketable copro, PCIe card or whatever. Multiple configurations of the same thing.

    Btw, I saw Intel could use GDDR5 for this.

    And to finish a thought... are we going to return to software rendering then?
     
    #18 santyhammer, Jun 2, 2007
    Last edited by a moderator: Jun 2, 2007
  19. INKster

    Veteran

    I think that last slide with the green PCB schematics must be a fake and/or misinterpreted from official info.
    The Intel slide above it clearly mentions "CSI" as the bus/connector of choice for both "Larrabee" and "Gesher", but this one says "PCI Express Gen II" (2.0).
    Last time I checked they were not one and the same thing, unless CSI has something to do with "Geneseo".
     
  20. Farhan

    Newcomer

    Joined:
    May 19, 2005
    Messages:
    152
    Likes Received:
    13
    Location:
    in the shade
    Yes, that seems hueg liek xbox. What are all those gates doing? :shock:
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.