AMD CDNA Discussion Thread

Discussion in 'Architecture and Products' started by Frenetic Pony, Nov 16, 2020.

  1. OlegSH

    Regular

    Joined:
    Jan 10, 2010
    Messages:
    797
    Likes Received:
    1,622
    For sure, all on-chip networks have variable latencies, nobody calls this NUMA.
     
    DavidGraham and pharma like this.
  2. Bondrewd

    Veteran

    Joined:
    Sep 16, 2017
    Messages:
    1,682
    Likes Received:
    846
    Bad news: A100 is way less funny than usual.
     
    Tarkin1977 likes this.
  3. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,055
    Likes Received:
    3,110
    Location:
    New York
    I'm sure you're right but there's so much hype around SYCL right now as the one language to rule them all. But there was a lot of hype for OpenCL at one time too so....

    It's then still an open question of what stack people will use to get the most out of Frontier and El Capitan. ROCm seems very raw still. Ironically the "Radeon" in ROCm doesn't really fit any more.
     
    #263 trinibwoy, Nov 9, 2021
    Last edited: Nov 9, 2021
    PSman1700 likes this.
  4. BRiT

    BRiT (>• •)>⌐■-■ (⌐■-■)
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    20,502
    Likes Received:
    24,397
    Can folks remain civil and stop with the petty bickering that does not improve any discussions.
     
    Krteq, Malo, Lightman and 2 others like this.
  5. Granath

    Newcomer

    Joined:
    Jul 26, 2021
    Messages:
    80
    Likes Received:
    81
  6. Esrever

    Regular

    Joined:
    Feb 6, 2013
    Messages:
    846
    Likes Received:
    647
    Without any Nvidia exascale supercomputers being built, how is anyone even going to validate any of the things being said without any data?
     
  7. Leoneazzurro5

    Regular

    Joined:
    Aug 18, 2020
    Messages:
    335
    Likes Received:
    348
    Well, if we ever get a number for sustained FLOPs then we could speak about efficiency. Quite frankly, even if the FP64 efficiency of MI200 were half that of A100, MI200 would still be more efficient at FP64, and I think that figure is not beyond reach.
     
  8. xpea

    Regular

    Joined:
    Jun 4, 2013
    Messages:
    551
    Likes Received:
    782
    Location:
    EU-China
    Wow this one is priceless...
    As of today, A100 HPC/ML scaling performance is very well known, as it's the undisputed leader in this field. Just look at the TOP500 list and the MLPerf website, or simply google A100 benchmarks, there are hundreds of pages...
    On the other side, except for a few selected benchmarks from AMD's presentation yesterday, we have nothing/nada/zip yet about MI200. I mean from an unbiased source in the real world...
     
    pharma likes this.
  9. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,055
    Likes Received:
    3,110
    Location:
    New York
    Didn't AMD share a bunch of FP64 benchmarks showing MI200 well ahead?

    It's crazy that a dual-die MI200 only has 7% more transistors than A100? A100's count is probably inflated due to having 40MB L2 cache vs 16MB on MI200. I assume that's what those tweets are referring to.
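
A rough way to sanity-check how much of the transistor count the L2 gap could explain is to count only the SRAM data cells. This is a sketch under loud assumptions: 6 transistors per bit (a classic 6T cell), the thread's "MB" read as MiB, no tags, ECC or control logic counted, and an A100 total of 54.2 billion transistors taken from publicly quoted specs rather than from this thread.

```python
MIB = 1024 * 1024  # bytes per MiB (assumption: the thread's "MB" means MiB)

def sram_data_transistors(capacity_mib: int, transistors_per_bit: int = 6) -> int:
    """Transistors in the data array only, assuming a 6T SRAM cell."""
    return capacity_mib * MIB * 8 * transistors_per_bit

# L2 sizes as quoted in the post above: 40MB (A100) vs 16MB (MI200).
extra = sram_data_transistors(40) - sram_data_transistors(16)

# Assumption: publicly quoted A100 total, not a figure from this thread.
A100_TOTAL_TRANSISTORS = 54.2e9

print(f"extra L2 data-array transistors: {extra / 1e9:.2f}B")
print(f"share of A100 total: {extra / A100_TOTAL_TRANSISTORS:.1%}")
```

Under these assumptions the estimate is a lower bound on the cache cost, since real arrays carry tags, ECC and peripheral logic on top of the data cells.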
     
    Lightman likes this.
  10. Bondrewd

    Veteran

    Joined:
    Sep 16, 2017
    Messages:
    1,682
    Likes Received:
    846
    Truly since NV won exactly 0 bids.
    Yea.
    It just overall has more memories per SM.
    192KiB L1/shmem slab versus 16+64 for CDNA2.
     
    Krteq and Lightman like this.
  11. pharma

    Veteran

    Joined:
    Mar 29, 2004
    Messages:
    4,887
    Likes Received:
    4,534
    Take a look outside the US, specifically in the EU.

    We will know soon enough whether it's a stunt and whether that's the reason to move on to MI300 asap.
     
    #271 pharma, Nov 10, 2021
    Last edited: Nov 10, 2021
  12. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,976
    Likes Received:
    5,211
    No more than 2.5X ahead, despite being theoretically almost 5X faster. Some benches don't even advance beyond the 1.6X margin.
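
The gap between the theoretical and the measured lead can be turned into a relative utilization figure. A minimal sketch, assuming the "almost 5X" theoretical ratio is rounded to 5.0 and taking the 2.5X and 1.6X speedups quoted above at face value; `relative_utilization` is a hypothetical helper, not anything from AMD's or Nvidia's material.

```python
def relative_utilization(measured_speedup: float, peak_ratio: float) -> float:
    """Fraction of the theoretical peak advantage that shows up in a benchmark.

    A value of 0.5 means the faster-on-paper part reaches half the
    peak-relative efficiency of the baseline on that workload.
    """
    return measured_speedup / peak_ratio

PEAK_RATIO = 5.0  # assumption: rounds the "almost 5X" theoretical figure

for speedup in (2.5, 1.6):
    util = relative_utilization(speedup, PEAK_RATIO)
    print(f"{speedup}X measured -> {util:.0%} of the theoretical lead")
```

Read this way, a 2.5X result corresponds to roughly half the relative efficiency of the baseline while still being 2.5X faster in absolute terms.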
     
  13. Bondrewd

    Veteran

    Joined:
    Sep 16, 2017
    Messages:
    1,682
    Likes Received:
    846
    The only quasi-announced EU exascale uses SiPearl + Ponte Vecchio...
    You can't win a Summit successor with a 'stunt'.
    The 'reason' is AMD cranking ~6Q prod to prod in DC GPUs.
    Been like that since Vega20.
     
  14. DavidGraham

    Veteran

    Joined:
    Dec 22, 2009
    Messages:
    3,976
    Likes Received:
    5,211




     
    #274 DavidGraham, Nov 10, 2021
    Last edited: Nov 10, 2021
    xpea and pharma like this.
  15. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,055
    Likes Received:
    3,110
    Location:
    New York
    Presumably the people ordering these things aren’t idiots and MI200 had to be more than a benchmark queen to get the nod. But stranger things have happened.
     
    Lightman likes this.
  16. pharma

    Veteran

    Joined:
    Mar 29, 2004
    Messages:
    4,887
    Likes Received:
    4,534
    That's what I would assume. But the tweet above implies AMD is still not including pertinent specs in the white paper, which is somewhat unusual at this late date.
     
  17. Lurkmass

    Regular

    Joined:
    Mar 3, 2020
    Messages:
    565
    Likes Received:
    711
    The source language doesn't matter as much as a cross-vendor intermediate bytecode representation, or the lack thereof. Even if vendors did agree on a common source language like SYCL, its progress would stall thereafter, since vendors can't see eye to eye on what the supported driver ingestion format should be. AMD and Intel will never agree to support PTX assembly as the standard format for compute kernel binaries, since it's a sub-optimal abstraction for their hardware in terms of performance. Nvidia will never agree to accept any other format either, because they don't want to make the software ecosystem advantage they've built up over the years redundant. Why should they force themselves onto a level playing field with others when they can stay on top?

    If the industry did start participating in SYCL technical specifications, it would end up in the same deadlock that OpenCL was mired in. If adoption of SYCL is contingent on other corporations making compromises to their own detriment for the greater goal of portability, then OpenCL is the example of what results from a lack of compromise ...

    You won't be thrilled with the answer, but developers will have to use the ROCm stack regardless because it's the most production-ready option on AMD HW. Mesa's clover project isn't functional yet. Others can try to make their own stack by looking at ROCm itself, but that's far from ideal: public documentation is bad and the code is hard to follow without being a former AMD employee, so they may still have to do some reverse engineering despite it being an open source project. ROCm is the only viable option left standing because AMD officially supports it and it's part of their long-term corporate responsibility, so if they don't ditch it by the end of this decade then ROCm will eventually reach maturity ... (ROCm was only made public a little over 5 years ago with the initial release)

    Whether developers will want to maintain compatibility with multiple compute stacks is another problem altogether but given the politics they'll have no choice but to bite the bullet if they want to expand their customer base ...
     
    Lightman, trinibwoy and xpea like this.
  18. xpea

    Regular

    Joined:
    Jun 4, 2013
    Messages:
    551
    Likes Received:
    782
    Location:
    EU-China
    Nearly every week a government supercomputer is installed with A100 somewhere around the world...
    Nov 1st, Texas Advanced Computing Center (TACC):
    https://www.hpcwire.com/2021/11/01/tacc-unveils-lonestar6-supercomputer/
    Oct 18th, UAE National Center for meteorology:
    https://www.hpcwire.com/off-the-wir...ecasting-with-new-supercomputer-built-by-hpe/
    Sept 28th, Department of Energy’s National Nuclear Security Administration Tri-Lab CTS-2
    https://www.hpcwire.com/2021/09/28/nnsa-selects-dell-for-40m-cts-2-commodity-computing-contract/
    Sept 16th, Queen Máxima of the Netherlands
    https://www.hpcwire.com/off-the-wir...s-inagurates-supercomputer-for-dutch-science/

    and so on and so on...
     
    DegustatoR likes this.
  19. Bondrewd

    Veteran

    Joined:
    Sep 16, 2017
    Messages:
    1,682
    Likes Received:
    846
    Yeah but it's the replacement for Summit, aka a pretty balanced system, so ain't no way it actually sucks™ at real sciences.

    Where's the NV exascale?

    Even Intel won (effectively) 2 systems.
     
  20. troyan

    Regular

    Joined:
    Sep 1, 2015
    Messages:
    603
    Likes Received:
    1,122
    AMD has shown improvements as low as 1.4x over A100. One GCD is nearly as big as GA100 (*) while offering barely better FP64 performance with less cache, on-chip bandwidth and off-chip interconnect.

    I'm curious about the PCIe version. Let's see what AMD can do with 300W.

    *
     