Samsung HBM-PIM - Processing in Memory

Discussion in 'Graphics and Semiconductor Industry' started by Jawed, Feb 17, 2021.

  1. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,286
    Likes Received:
    1,551
    Location:
    London
    Wow, literally in the memory:

    Samsung's New HBM2 Memory Has 1.2 TFLOPS of Embedded Processing Power | Tom's Hardware

    Now it can be argued that this much compute is barely worth bothering with (it's only FP16 FLOPS being counted).

    The cost/packaging/thermal constraints of HBM and the 20nm node being used here seem to indicate this is a prototyping sample for partners and I suppose there'll be a couple of years of testing...
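    Rough numbers: the 1.2 TFLOPS figure can be sanity-checked against the 300 MHz PCU clock Samsung reported. A minimal sketch (treating the whole stack as one aggregate pool of FP16 lanes is an illustrative assumption, not Samsung's stated organization):

```python
# Back-of-envelope: what 1.2 FP16 TFLOPS implies per clock cycle.
# 300 MHz is the reported PCU clock; the aggregate-pool view below
# is an assumption for illustration only.
tflops = 1.2e12                     # reported FP16 ops/s
clock_hz = 300e6                    # reported PCU clock
ops_per_cycle = tflops / clock_hz   # 4000 FP16 ops/cycle across the stack
```

    So the headline number comes from very wide, relatively slow SIMD sitting next to the DRAM arrays, not from a fast clock.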
     
  2. Tkumpathenurpahl

    Tkumpathenurpahl Oil Monsieur Geezer
    Veteran Newcomer

    Joined:
    Apr 3, 2016
    Messages:
    1,797
    Likes Received:
    1,801
    As an HBM fanboy, I find this enormously exciting
     
  3. HLJ

    HLJ
    Regular Newcomer

    Joined:
    Aug 26, 2020
    Messages:
    355
    Likes Received:
    589
    Important detail:
    This cuts the memory capacity in half per chip... performance is never free ;)
     
    Lightman and pharma like this.
  4. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,286
    Likes Received:
    1,551
    Location:
    London
    By a quarter.
     
  5. iceberg187

    Regular

    Joined:
    Jul 31, 2006
    Messages:
    645
    Likes Received:
    87
    Location:
    Hobart, Indiana
  6. HLJ

    HLJ
    Regular Newcomer

    Joined:
    Aug 26, 2020
    Messages:
    355
    Likes Received:
    589
    "Naturally, making room for the PCU units reduces memory capacity — each PCU-equipped memory die has half the capacity (4Gb) per die compared to a standard 8Gb HBM2 die. To help defray that issue, Samsung employs 6GB stacks by combining four 4Gb die with PCUs with four 8Gb dies without PCUs (as opposed to an 8GB stack with normal HBM2)."
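    A quick sketch of the stack arithmetic in that quote (Gb = gigabits, GB = gigabytes):

```python
# Capacity math from the quoted article: a PIM stack mixes four 4Gb
# PCU-equipped dies with four plain 8Gb dies.
pim_die_gb = 4                                      # Gb per PCU-equipped die
plain_die_gb = 8                                    # Gb per standard HBM2 die
pim_stack_gb = 4 * pim_die_gb + 4 * plain_die_gb    # 48 Gb total
pim_stack_GB = pim_stack_gb / 8                     # 6 GB per PIM stack
normal_stack_GB = 8 * plain_die_gb / 8              # 8 GB per normal 8-die stack
loss = 1 - pim_stack_GB / normal_stack_GB           # 0.25
```

    Which matches the "by a quarter" figure above: 6 GB versus 8 GB per stack.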

     
  7. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,286
    Likes Received:
    1,551
    Location:
    London
    If you'd said die, I wouldn't have posted :) You said chip though.
     
    HLJ likes this.
  8. HLJ

    HLJ
    Regular Newcomer

    Joined:
    Aug 26, 2020
    Messages:
    355
    Likes Received:
    589
    Ahhh :D
     
  9. ToTTenTranz

    Legend Veteran

    Joined:
    Jul 7, 2008
    Messages:
    12,048
    Likes Received:
    7,009
    FP16 calculations in the HBM don't seem very useful for GPUs, and if the target is AI applications, I wonder if it wouldn't be better served by Samsung's own NPUs, which consist of CPU cores and MAC engines.
    At first glance this looks more like an academic exercise, but I can't figure out whether Samsung intends to sell these half-RAM/half-FP16-ALU dies as they are right now.

    For GPUs, I wonder if they could put e.g. ROPs in there instead of the FP16 ALUs. It would be akin to Xenos' eDRAM die.
    It would allow for a more modular approach to memory channels on a GPU, as total bandwidth is often scaled according to the number of ROPs enabled (and vice versa).
     
  10. Rootax

    Veteran Newcomer

    Joined:
    Jan 2, 2006
    Messages:
    2,044
    Likes Received:
    1,474
    Location:
    France
    So the FP16 ALUs get about as low-latency access to the RAM as possible?
     
  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.