NVIDIA COPA - Composable On-Package Architecture

Discussion in 'Architecture and Products' started by Jawed, Jun 27, 2021.

  1. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,472
    Likes Received:
    1,833
    Location:
    London
    This seems to be purely for data centre compute:

    2104.02188.pdf (arxiv.org)

    "In this work, we demonstrate that diverging architectural requirements between the HPC and DL application domains put converged GPU designs on a trajectory to become significantly under-provisioned for DL and over-provisioned for HPC. We propose a new composable GPU architecture that leverages emerging circuit and packaging technologies to provide specialization, while maintaining substantial compatibility across product lines. We demonstrate that COPA-GPU architectures can enable selective deployment of on-package cache and off-chip DRAM resources, allowing manufacturers to easily tailor designs to individual domains."

    So the focus here is that the cache and memory controllers are interchangeable, collectively called the Memory System Module (MSM). They attach to the GPU Module (GPM) using custom links.

    Along the way I learnt that 826mm² appears to be the current reticle limit.
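
    To make the composition idea concrete, here's a minimal Python sketch - my own naming and illustrative figures, not the paper's - of one compute die being paired with different memory systems per domain:

    Code:
        # Hypothetical sketch of the COPA idea: one reusable GPU Module (GPM)
        # composed with interchangeable Memory System Modules (MSMs).
        # All names and numbers here are illustrative, not from the paper.
        from dataclasses import dataclass

        @dataclass(frozen=True)
        class GPM:
            sms: int            # the compute die, shared across the product line

        @dataclass(frozen=True)
        class MSM:
            name: str
            l3_cache_mb: int    # on-package SRAM cache (0 = none)
            hbm_sites: int      # off-chip DRAM stacks
            hbm_bw_tbps: float  # aggregate DRAM bandwidth

        @dataclass(frozen=True)
        class CopaGPU:
            gpm: GPM
            msm: MSM

        gpm = GPM(sms=128)  # design the expensive part once...

        hpc = CopaGPU(gpm, MSM("HBM-only", l3_cache_mb=0,    hbm_sites=6,  hbm_bw_tbps=3.0))
        dl  = CopaGPU(gpm, MSM("HBML+L3",  l3_cache_mb=1024, hbm_sites=10, hbm_bw_tbps=5.0))

        # ...then compose: HPC keeps a lean converged memory system, DL gets
        # a big L3 and more HBM without respinning the compute die.
        print(hpc, dl, sep="\n")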
     
    Krteq, nnunn, DegustatoR and 3 others like this.
  2. xpea

    Regular Newcomer

    Joined:
    Jun 4, 2013
    Messages:
    480
    Likes Received:
    598
    Oh, nice find!

    That's what NV came up with as they struggle to move beyond the GPU culture they created in the enterprise market. In an era of specialized accelerators (GPU, AI/ML, FPGA, DPU/IPU, VPU), it's more and more difficult to make one arch perform well on every workload. As I said in the other topic, pure AI/ML players are putting more and more pressure on NVDA's dominance, and some compute gurus inside the green team are pushing for disruptive pure-AI silicon.

    BTW I like this diagram:

    [Image: GPM package options - 2021 NVIDIA Research]

    A good summary of the packaging options available for next-gen silicon.
     
  3. trinibwoy

    Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    11,625
    Likes Received:
    2,492
    Location:
    New York
    It’s strange that Nvidia acknowledges the obvious deficiencies of a jack-of-all-trades design yet insists on reusing the same fundamental compute architecture for both HPC and DL. Their DL competition is highly customized in both compute and memory systems and this proposal only addresses the latter. Seems like a losing bid.
     
  4. CarstenS

    Legend Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,648
    Likes Received:
    3,627
    Location:
    Germany
    I guess it all comes down to finding the right point in time when the paths must diverge. They used to sell their big-iron chips in consumer GeForces for a couple of years until GM200, then with P100 moved to a diverged approach for consumer and HPC - with the HPC part retaining the ability to be sold as a high-end consumer part, like a gen later with Titan V.
     
    trinibwoy likes this.
  5. ToTTenTranz

    Legend Veteran

    Joined:
    Jul 7, 2008
    Messages:
    12,702
    Likes Received:
    7,705
    So their plan is to get more flexible in the memory department to address different DL and HPC markets, but not to split the actual GPU development?

    I wonder how many people are using the GA100 for graphics, considering it still has a whopping 128 ROPs and 864 TMUs. At least they didn't put RT cores in it.
     
  6. CarstenS

    Legend Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,648
    Likes Received:
    3,627
    Location:
    Germany
    I think in this paper, they are exploring the possibilities for a flexible memory interface/configuration. Once you've got that down, you can design all your (large, expensive, MCM) GPUs with those interfaces and put in between whatever is all the rage of the day.
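
    A toy sketch of that (my own abstraction, nothing from the paper): the compute die only ever talks to one memory interface, so you can drop an L3 module in between without touching the GPU design:

    Code:
        # Toy model of "fixed interface, swappable memory system" - my own
        # abstraction, not NVIDIA's. The GPM only sees MemorySystem, so an
        # on-package L3 can be composed in front of DRAM transparently.
        from typing import Protocol

        class MemorySystem(Protocol):
            def read(self, addr: int) -> bytes: ...

        class HBM:
            def read(self, addr: int) -> bytes:
                return b"\x00" * 32                  # stand-in for a DRAM access

        class L3FrontedHBM:
            """A toy on-package cache sitting in front of any MemorySystem."""
            def __init__(self, backing: MemorySystem):
                self.backing = backing
                self.cache: dict[int, bytes] = {}    # naive and unbounded - a sketch

            def read(self, addr: int) -> bytes:
                if addr not in self.cache:           # miss: fetch from DRAM
                    self.cache[addr] = self.backing.read(addr)
                return self.cache[addr]

        def gpm_kernel(mem: MemorySystem) -> None:
            mem.read(0x1000)    # the compute side is oblivious to what's behind the link

        gpm_kernel(HBM())                # converged / HPC-style config
        gpm_kernel(L3FrontedHBM(HBM()))  # DL-style config with an L3 in between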
     
  7. Bondrewd

    Veteran Newcomer

    Joined:
    Sep 16, 2017
    Messages:
    1,546
    Likes Received:
    749
    858mm², actually.
     
  8. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,472
    Likes Received:
    1,833
    Location:
    London
    NVidia is pretty safe with its software infrastructure for, let's say, five years.

    This snippet is pretty worrying:

    "Figure 12 shows that doubling and quadrupling the number of baseline GPU-N instances (2× GPU-Ns and 4× GPU-Ns) results in mean 29% and 43% performance gains respectively for our training workloads. We find that a DL-optimized HBML+L3 COPA-GPU configuration (with 27% performance gain) provides similar levels of performance to 2× GPU-Ns, yet should cost significantly less than buying and hosting 2× larger installations of traditional GPU-Ns.

    [...]

    HBML+L3 integrates 1.6× more HBM memory, resulting in total aggregate cost lower than 2× of GPU-N. Thus, DL-optimized COPA-GPUs will provide substantially better cost-performance at scale, saving on not just overall GPU cost but additional system-level collateral such as datacenter floorspace, CPUs, network switches, and other peripheral devices."

    It demonstrates that in DL, scaling by using more GPUs is almost a dead end - and you can bet NVidia's customers have noticed. Sure, adding 1GB of L3 cache and much more HBM bandwidth is a radical, difficult change, but NVidia's competitors are doing radical, difficult things too.
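
    A quick back-of-envelope with the quoted Figure 12 numbers makes the point - note the cost column is my assumption, since the paper only bounds the COPA part's aggregate cost below 2x GPU-N:

    Code:
        # Perf figures are from the quoted Figure 12; costs are normalised to
        # one GPU-N and assumed by me (the paper only says HBML+L3 costs <2x).
        configs = {
            "GPU-N":        (1.00, 1.00),
            "2x GPU-N":     (1.29, 2.00),  # quoted +29% for twice the GPUs
            "4x GPU-N":     (1.43, 4.00),  # quoted +43% for four times the GPUs
            "HBML+L3 COPA": (1.27, 1.50),  # quoted +27%; 1.5x cost is my guess
        }
        for name, (perf, cost) in configs.items():
            print(f"{name:>13}: perf/cost = {perf / cost:.2f}")
        # Multi-GPU scaling roughly halves, then quarters, perf-per-cost,
        # while the COPA config stays close to the single-GPU baseline.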

    Much as patent documents usually reveal only a sliver of the future - the products are far more complex than any single document can hint at - I think it's safe to assume that NVidia is also planning far more radical things inside the GPM.

    Perhaps processing in memory is where DL is headed.
     
  9. ToTTenTranz

    Legend Veteran

    Joined:
    Jul 7, 2008
    Messages:
    12,702
    Likes Received:
    7,705
  10. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,472
    Likes Received:
    1,833
    Location:
    London
    So, erm, Hopper being "chiplet" based might be where we start to see COPA?

    The thread indicates that there's a growing consensus that Hopper is data centre (Lovelace is gaming).

    Videocardz's spin-off article refers back to tweets from May about the configuration of GH100:

    NVIDIA Hopper GPU rumored to tape out soon - VideoCardz.com

    So I suppose we now need to be aware of NVidia codenames for products that use the "composable" concepts. Hopper might be the architecture of the chiplets, but something else might be the name of the AI accelerator and something else again might be the name of the non-AI accelerator.

    So a collection of codenames related to Hopper could be the first real clue that it is at the centre of a COPA-based family of products.

    Of course, it might be too early for anything COPA-based.
     
    Lightman likes this.