Larrabee at Siggraph

Discussion in 'Architecture and Products' started by nAo, Jun 2, 2008.

  1. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    A minor note: according to this thread on the Ace's Hardware forums, Larrabee's vector ISA is called LRBni, and it's different from AVX, but I guess we already knew that.
     
  2. bowman

    Newcomer

    Joined:
    Apr 24, 2008
    Messages:
    141
    Likes Received:
    0
    That's not the final name, it's 'Larrabee New Instructions', similar to 'Prescott New Instructions', 'Katmai New Instructions' and so on.
     
  3. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    The initials fit.

    The forking of vector sets is unfortunate. It sounds like Larrabee's vectorization capabilities are better than AVX, while AVX has a niftier and possibly more extendable encoding.
     
  4. Killer-Kris

    Regular

    Joined:
    May 20, 2003
    Messages:
    540
    Likes Received:
    4
    More than anything, such an old chip would be an excellent starting point. How many transistors did the original Pentium have, about 3M? Now how many of those can you fit into the G80's transistor budget, and those would be truly scalar units :shock:!!! Of course that's a poor comparison, because adding the vector extensions, MT, x86-64, and an L2 cache would leave it a little more bloated than 3M transistors per core, but it gets the ball rolling.
     
  5. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    Going by 1.4 billion for Nvidia's chip, and sandpile.org's listing of 4 million for the P54, 350 original Pentiums would fit.

    They wouldn't be too useful, since they lack the other 26 million transistors Larrabee's cores have, and there's no other logic or interconnect to actually talk to them.
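
    As a rough check of the arithmetic above, here is a minimal Python sketch. The 1.4 billion and 4 million figures are the ones quoted in this post; the 30 million per Larrabee core is the unsourced 4 + 26 million total implied here, and the function name is purely illustrative. The results are upper bounds only, since they assign no transistors to interconnect, memory controllers, or other shared logic.

```python
# Back-of-the-envelope core counts from transistor budgets.
# All figures are the ones quoted in the thread, not verified data.

GPU_BUDGET = 1_400_000_000    # "1.4 billion for Nvidia's chip"
P54_CORE = 4_000_000          # sandpile.org figure quoted for the P54
LARRABEE_CORE = 30_000_000    # assumed: 4M + "the other 26 million"


def cores_that_fit(chip_transistors: int, core_transistors: int) -> int:
    """Whole cores that fit if the entire budget went to cores (hypothetical helper)."""
    return chip_transistors // core_transistors


print(cores_that_fit(GPU_BUDGET, P54_CORE))       # 350
print(cores_that_fit(GPU_BUDGET, LARRABEE_CORE))  # 46
```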
     
  6. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    Half of that 30M transistor budget should be cache..
     
  7. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    How useful would it be without that cache?
     
  8. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    LOL. I was hinting at the fact that caches are denser than logic, that's it!
     
  9. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    True, but I was basing my numbers off transistor count, not area. So I still think the 350 cores would be a little useless, unless we say everything but the cores takes 0 transistors... ;)
     
  10. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    Agreed. 32+ cores is more likely
     
  11. bowman

    Newcomer

    Joined:
    Apr 24, 2008
    Messages:
    141
    Likes Received:
    0
    Larrabee cores are 30M? Where did you hear this?
     
  12. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    Where was the source on the 30 million number?
    It sounds reasonable, but I can't find the source of the figure.
    I didn't see it in the larrabee pdf, but I may have overlooked it.
     
  13. WaltC

    Veteran

    Joined:
    Jul 22, 2002
    Messages:
    2,710
    Likes Received:
    8
    Location:
    BelleVue Sanatorium, Billary, NY. Patient privile
    Thanks...;) Took the words right out of my mouth. I think that for the sake of "conceptual simplicity" some of us just might be emphasizing the "simplicity" notion a tad too much. Things like cache and the glue logic to make the whole shebang work probably would take at least--oh, at least a dozen or so transistors, I should think...;)
     
  14. ArchitectureProfessor

    Newcomer

    Joined:
    Jan 17, 2008
    Messages:
    211
    Likes Received:
    0
    Not the current tools, but slightly updated versions of the current tools.

    As Larrabee's vectors are easier than current x86 vectors (SSE), modifying the Intel C compiler (ICC) to support these new vectors shouldn't require a total re-write of the whole compiler. In fact, the existing auto-vectorization should map well to Larrabee. With a bit more tweaking, the ICC compiler should be able to vectorize even more loops (for example, ones with conditionals) to create more efficient vector code. The rest of the non-vector aspects of the compiler's code generation would be basically unchanged.

    So, not zero effort in updating the tools, but much easier than making an entire EPIC compiler for a new ISA.
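
    To illustrate what vectorizing a loop with a conditional means, here is a small pure-Python model, not actual compiler output. It contrasts a scalar loop containing a branch with a version that computes every lane of a chunk and then uses a per-lane mask to decide which results are kept. The lane count of 16 and both function names are assumptions made for this sketch; nothing here reflects how ICC itself is implemented.

```python
# Model of if-conversion: a branch inside a loop becomes a per-lane mask.
# LANES = 16 is an assumption for illustration only.

LANES = 16


def scalar_loop(a, b):
    """Reference loop with a data-dependent branch."""
    out = list(a)
    for i in range(len(a)):
        if a[i] > 0:
            out[i] = a[i] * b[i]
    return out


def masked_loop(a, b):
    """Same result, processed in chunks of LANES with a mask and a select."""
    out = list(a)
    for start in range(0, len(a), LANES):
        idx = range(start, min(start + LANES, len(a)))
        mask = [a[i] > 0 for i in idx]    # vector compare
        prod = [a[i] * b[i] for i in idx] # computed for every lane
        for lane, i in enumerate(idx):
            # masked write: keep the old value where the mask is off
            out[i] = prod[lane] if mask[lane] else out[i]
    return out


data_a = [x - 20 for x in range(40)]
data_b = [2] * 40
print(masked_loop(data_a, data_b) == scalar_loop(data_a, data_b))
```

    The point of the sketch is only that the conditional no longer controls the loop's flow; it controls which lanes get written, which is the property that makes such loops candidates for vector code.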
     
  15. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    I hope the Intel guys are working on a Larrabee implementation of the Ct language; unfortunately, they don't mention it at all in their SIGGRAPH paper.
     
  16. heliosphere

    Newcomer

    Joined:
    Jun 15, 2005
    Messages:
    142
    Likes Received:
    15
    They do mention Ct, but they don't mention working on an implementation if that's what you meant.
     
  17. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
  18. bowman

    Newcomer

    Joined:
    Apr 24, 2008
    Messages:
    141
    Likes Received:
    0
    Oh! Hm. Anyone know the transistor count for 256KB of cache, PCI-E bus and some memory controllers? :razz: Time to bust out the calculator..
     
  19. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,579
    Likes Received:
    4,799
    Location:
    Well within 3d
    Can't speak to the controllers, but 256KiB of 6T SRAM is about 12.5 million transistors.

    edit: this doesn't count any cache tags, just data arrays
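
    The figure above can be reproduced with a few lines of Python. This is a minimal sketch assuming exactly six transistors per stored bit and counting only the data array, as the edit notes; the function name is illustrative.

```python
# Transistors in the data array of a 6T SRAM (tags and periphery excluded).

def sram_data_transistors(capacity_bytes: int, transistors_per_cell: int = 6) -> int:
    """One cell per bit, transistors_per_cell transistors per cell (hypothetical helper)."""
    return capacity_bytes * 8 * transistors_per_cell


print(sram_data_transistors(256 * 1024))  # 12582912
```

    That is 12,582,912 transistors, which the post rounds to about 12.5 million.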
     
  20. ArchitectureProfessor

    Newcomer

    Joined:
    Jan 17, 2008
    Messages:
    211
    Likes Received:
    0
    Roughly speaking, cache tags are around 10% or less of the size of the data array.

    The details: For 64-byte blocks, the worst-case tag overhead is around 12.5%, which is a full 64-bit tag for a 64-byte (512-bit) block. But assuming a 48-bit physical address space and a 6-bit block offset, you're down to 42 bits, or about 8%. For the 256KB cache, the index would likely be around 10 bits, which leaves a 32-bit tag, so now you're down to only about 6% overhead for the tags.
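
    The three cases above can be worked through in Python. This is a sketch under the post's own assumptions (64-byte blocks, 48-bit physical addresses, a 10-bit index); the helper names are made up for illustration, and valid, dirty, or replacement-state bits are ignored.

```python
# Tag overhead = tag bits per block / data bits per block.

def tag_bits(address_bits: int, block_bytes: int, index_bits: int = 0) -> int:
    """Address bits left for the tag after removing block offset and index."""
    offset_bits = block_bytes.bit_length() - 1  # log2 for power-of-two sizes
    return address_bits - offset_bits - index_bits


def tag_overhead(bits_in_tag: int, block_bytes: int = 64) -> float:
    """Tag size as a fraction of the block's data bits."""
    return bits_in_tag / (block_bytes * 8)


worst_case = tag_overhead(64)                                # full 64-bit tag
no_index = tag_overhead(tag_bits(48, 64))                    # 42-bit tag
with_index = tag_overhead(tag_bits(48, 64, index_bits=10))   # 32-bit tag

for label, value in [("worst case", worst_case),
                     ("48-bit address", no_index),
                     ("plus 10-bit index", with_index)]:
    print(f"{label}: {value:.2%}")
```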
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.