Samsung HBM-PIM - Processing in Memory

Discussion in 'Graphics and Semiconductor Industry' started by Jawed, Feb 17, 2021.

  1. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,286
    Likes Received:
    1,551
    Location:
    London
    Wow, literally in the memory:

    Samsung's New HBM2 Memory Has 1.2 TFLOPS of Embedded Processing Power | Tom's Hardware

    Now it can be argued that this much compute is barely worth bothering with (it's only FP16 FLOPS being counted).

    The cost/packaging/thermal constraints of HBM and the 20nm node being used here seem to indicate this is a prototyping sample for partners and I suppose there'll be a couple of years of testing...
     
  2. Tkumpathenurpahl

    Tkumpathenurpahl Oil Monsieur Geezer
    Veteran Newcomer

    Joined:
    Apr 3, 2016
    Messages:
    1,809
    Likes Received:
    1,813
    As an HBM fanboy, this excites me enormously
     
  3. HLJ

    HLJ
    Regular Newcomer

    Joined:
    Aug 26, 2020
    Messages:
    385
    Likes Received:
    633
    Important detail:
    This cuts the memory amount in half per chip... performance is never free ;)
     
    Lightman and pharma like this.
  4. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,286
    Likes Received:
    1,551
    Location:
    London
    By a quarter.
     
  5. iceberg187

    Regular

    Joined:
    Jul 31, 2006
    Messages:
    647
    Likes Received:
    88
    Location:
    Hobart, Indiana
  6. HLJ

    HLJ
    Regular Newcomer

    Joined:
    Aug 26, 2020
    Messages:
    385
    Likes Received:
    633
    "Naturally, making room for the PCU units reduces memory capacity — each PCU-equipped memory die has half the capacity (4Gb) per die compared to a standard 8Gb HBM2 die. To help defray that issue, Samsung employs 6GB stacks by combining four 4Gb die with PCUs with four 8Gb dies without PCUs (as opposed to an 8GB stack with normal HBM2)."
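    The stack arithmetic in that quote can be sketched quickly (die counts and capacities as quoted; this also explains why the overall hit is a quarter, not half):

    ```python
    # Capacity arithmetic for the quoted HBM-PIM stack, in gigabits (Gb).
    pim_die_gb = 4    # PCU-equipped die: half the capacity of a standard die
    plain_die_gb = 8  # standard HBM2 die without PCUs

    # HBM-PIM stack: four PCU dies combined with four plain dies
    pim_stack_gb = 4 * pim_die_gb + 4 * plain_die_gb  # 48 Gb
    pim_stack_GB = pim_stack_gb // 8                  # 6 GB

    # Conventional 8-high stack of standard 8Gb dies
    normal_stack_GB = 8 * plain_die_gb // 8           # 8 GB

    # Per-stack reduction: 8 GB -> 6 GB, i.e. 25%
    reduction = 1 - pim_stack_GB / normal_stack_GB
    print(pim_stack_GB, normal_stack_GB, reduction)   # 6 8 0.25
    ```

    So each PCU-equipped die loses half its capacity, but the stack as a whole only loses a quarter.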

     
  7. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,286
    Likes Received:
    1,551
    Location:
    London
    If you'd said die, I wouldn't have posted :) You said chip though.
     
    HLJ likes this.
  8. HLJ

    HLJ
    Regular Newcomer

    Joined:
    Aug 26, 2020
    Messages:
    385
    Likes Received:
    633
    Ahhh :D
     
  9. ToTTenTranz

    Legend Veteran

    Joined:
    Jul 7, 2008
    Messages:
    12,072
    Likes Received:
    7,034
    FP16 calculations in the HBM don't seem very useful for GPUs, and if the target is AI applications, I wonder whether it would be better served by Samsung's own NPUs, which consist of CPU cores and MAC engines.
    At first glance this looks more like an academic exercise, but I can't figure out whether Samsung intends to sell these half-RAM/half-FP16-ALU dies as they are right now.

    For GPUs, I wonder if they could put e.g. ROPs in there instead of the FP16 ALUs. It would be akin to Xenos' eDRAM die.
    It would allow for a more modular approach to memory channels on a GPU, as total bandwidth is often scaled according to the number of ROPs enabled (and vice versa).
     
  10. Rootax

    Veteran Newcomer

    Joined:
    Jan 2, 2006
    Messages:
    2,059
    Likes Received:
    1,493
    Location:
    France
    So the FP16 ALUs have about as low-latency access to the RAM as possible?
     

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.