AMD: Navi Speculation, Rumours and Discussion [2019]

Discussion in 'Architecture and Products' started by Kaotik, Jan 2, 2019.

  1. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,733
    Likes Received:
    2,563
    Location:
    Finland
    There's no overlap; Arcturus is in a completely different league with its 128 CUs compared to Navi 12's 40 (20 dual).
    It could be built for a specific customer's needs too, just like Vega 12 was tailored for Apple.
     
  2. 3dilettante

    Legend Alpha

    Joined:
    Sep 15, 2003
    Messages:
    8,277
    Likes Received:
    3,531
    Location:
    Well within 3d
    Arcturus (possibly the MI100) from what I've gleaned from various articles and code commits is a compute-oriented product, perhaps HPC-targeted more so than usual.
    It has 128 CUs and no graphics command processor. Perhaps there is some hint to what limits GCN's scaling in the removal of the graphics command processor, while scaling compute. Perhaps there's a limit to how much the control logic can directly control for a single graphics context, while a compute device could scale out the number of ACEs with no expectation that they act in concert like a graphics card would.

    On top of that, there's an apparently new class of acceleration unit in addition to the vector hardware, perhaps some sort of large matrix multiply unit that might extend the machine learning instructions or general math capabilities for large compute. While I'd need to hunt down the reference, there's some code written to the effect that some portion of the clocking capability for boosting has been disabled, since there's going to be a lot of data movement and highly utilized silicon even at more modest clocks.
    For instruction generation in the compiler, there is advice that while it is possible to issue vector instructions in parallel with the new accelerator instructions, it's discouraged due to the likelihood that the chip will throttle.

    It will probably aim for lower clocks than prior GPUs, since it's going to have a lot more silicon active over a broad chip.
    As for why it is released after RDNA, perhaps there are factors like the HPC contracts AMD is touting that would need the higher peak throughput but with somewhat reduced risk by sticking with a more familiar ISA and base architecture. RDNA has some unappetizing silicon bugs at this point, and its software support for compute is not good.
    The broader wavefronts and more coarse batch requirements may also be more acceptable for workloads dominated by very large matrix multiplies.

    There are elements in RDNA that I would imagine could improve on Vega, but that might not be sufficient until RDNA is more mature.
     
    w0lfram, Lightman, Leovinus and 2 others like this.
  3. CarstenS

    Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    4,875
    Likes Received:
    2,181
    Location:
    Germany
    With 25 TFLOPS (FP32) basically required for MI100 and 128 CUs à 64 ALUs given, Arcturus would need to clock slightly north of 1.5 GHz, not unusually low for a Vega-GPU.
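
The estimate above can be checked with a few lines. This is a minimal sketch, assuming the usual convention that each ALU retires one FMA (counted as 2 FLOPs) per clock; the function name and its defaults are illustrative only and do not come from any AMD tool.

```python
def required_clock_ghz(target_tflops, compute_units,
                       alus_per_cu=64, flops_per_alu_per_clock=2):
    """Clock (GHz) needed to reach a peak FP32 throughput target.

    Assumes one FMA (2 FLOPs) per ALU per clock, the common way
    peak GPU throughput is quoted.
    """
    total_alus = compute_units * alus_per_cu
    flops_per_clock = total_alus * flops_per_alu_per_clock
    return target_tflops * 1e12 / flops_per_clock / 1e9


total_alus = 128 * 64                      # 8192 ALUs for 128 CUs
clock = required_clock_ghz(25, 128)        # 25 TFLOPS target from the post
print(f"{total_alus} ALUs need {clock:.3f} GHz for 25 TFLOPS FP32")
```

Under those assumptions the result is roughly 1.53 GHz, which matches "slightly north of 1.5 GHz".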
     
    Lightman likes this.
  4. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    10,617
    Likes Received:
    5,180
    Will Arcturus have half rate FP64?

    Regardless, perhaps all Arcturus discussions should be in the Vega thread and not this one.
     
    w0lfram likes this.
  5. ToTTenTranz

    Navi 12 is looking stranger each time there's news about it.
    On one hand it's using two HBM2E stacks, so we should expect around 820GB/s of bandwidth from it, which is >80% more than Navi 10's 256-bit 14Gbps GDDR6 provides. On the other hand, it's still a relatively narrow GPU with only 20 WGPs like Navi 10.
    And then there are those tests showing very slow core clocks at 1.15GHz.

    If this were using a single HBM2E stack, then I'd say we were looking at Vega 12's successor for MacBooks, with a single HBM2E stack offering considerably better performance even with the low core clocks.
    With 2x HBM2E stacks this will be a bandwidth monster glued to a relatively tiny GPU that is moreover clocked really low.

    Can HBM2E clock significantly lower (e.g. lower than 2.4Gbps/pin) to enable significantly lower voltages? Otherwise Navi 12 only makes sense if it's clocked at astronomical and unforeseen speeds like 2.5GHz, but that would make Big Navi rather redundant.
     
    Lightman likes this.
  6. Kaotik

    The ES boards have used 2 Gbps and 2.4 Gbps HBM2E, so not quite 820GB/s, only 512-614GB/s. (Of course this doesn't mean the final product couldn't be higher; that ES board was a mere 200W board.)
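
The bandwidth figures traded in these posts all come from bus width times per-pin data rate. A minimal sketch, assuming a 1024-bit interface per HBM2E stack and the 256-bit 14Gbps GDDR6 configuration quoted for Navi 10; the helper name is made up for illustration.

```python
def bandwidth_gb_s(bus_width_bits, gbps_per_pin):
    """Peak memory bandwidth in GB/s: bits on the bus * Gbps per pin / 8."""
    return bus_width_bits * gbps_per_pin / 8


HBM_BITS_PER_STACK = 1024   # assumption: standard HBM2/HBM2E stack width
STACKS = 2

# Two-stack HBM2E at the pin rates mentioned in the thread.
hbm = {rate: bandwidth_gb_s(STACKS * HBM_BITS_PER_STACK, rate)
       for rate in (2.0, 2.4, 3.2)}

# Navi 10 reference point: 256-bit GDDR6 at 14 Gbps.
navi10 = bandwidth_gb_s(256, 14.0)

for rate, bw in hbm.items():
    print(f"2 stacks @ {rate} Gbps: {bw:.1f} GB/s ({bw / navi10 - 1:+.0%} vs Navi 10)")
```

With these inputs, 2.0 and 2.4 Gbps give 512 and 614.4 GB/s, while the roughly 820GB/s figure needs 3.2 Gbps per pin; Navi 10 sits at 448 GB/s.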

    late edit:
    My memory is short but bad, but wasn't there talk about the Navi with DLops already right after Navi 10 launch?
     
    #1806 Kaotik, Feb 17, 2020
    Last edited: Feb 17, 2020
    ToTTenTranz likes this.
  7. Bondrewd

    Regular Newcomer

    Joined:
    Sep 16, 2017
    Messages:
    620
    Likes Received:
    293
    It's a premium mobile and DC inferencing chip in one package.
    Quite literally in Navi whitepaper(s).
     
    #1807 Bondrewd, Feb 17, 2020
    Last edited: Feb 17, 2020
    Kaotik likes this.
  8. 3dilettante

    The GFX1011 "NaviDL" device listed in that github commit looks to be associated with the older commit for GFX1011 AND GFX1012: https://github.com/llvm-mirror/llvm/commit/eaed96ae3e5c8a17350821ae39318c70200adaf0.
    That brings various dot product instructions into GFX10, whereas there is a slightly differently numbered set for Vega 20.

    GFX1011 is also the Navi version that doesn't have the FeatureLdsMisalignedBug flag, but lists all the other bugs that might have been called teething pains for GFX10.
    This family variant has the FeatureDoesNotSupportXNACK flag, which is present for all non-APU products.

    GFX1011 does have a smattering of error strings related to BVH instructions, perhaps as errors in their use or some kind of ISA conflict with an unspecified nearby variant with BVH instructions.
     
    w0lfram and Radolov like this.
  9. w0lfram

    Newcomer

    Joined:
    Aug 7, 2017
    Messages:
    213
    Likes Received:
    38
    So how big does AMD need to go with "Big Navi"? What die size makes sense for RDNA2? What size is Navi 12 rumored to be, and how much bigger is it than Vega 20 at 331mm^2?

    What does "bigger" mean in terms of what Navi needs more of? I think for games you need more TMUs, ROPs, etc. I don't know anymore; RDNA changes things up and is about feeding the engine unfettered. RDNA2 is the full architecture and is said to be more efficient at crunching games. I think we all expect this, but to what degree? How far advanced is RDNA2? (A 25% uplift from architecture alone, RDNA1 to RDNA2?)

    Seems like RDNA2 is catering to DX12 and Vulkan and will have a robust front end, sitting on new fabric. And a few patents?

    For argument's sake, if you add 50% to Navi 10's 252mm^2 die area, you get about 380mm^2. What are we looking at with 7nm+ (w/ HBM2E)?

    50% more CUs ??
    ROPS ?
    TMUs ?
     
  10. Kaotik

    Navi 12 should be really similar in size to Navi 10, since they're essentially the same chip minus memory controller and added DL ops in 12.
     
  11. Bondrewd

    They're two, and they're big and ungodly big.
    More of everything.
     
  12. TheAlSpark

    TheAlSpark Moderator
    Moderator Legend

    Joined:
    Feb 29, 2004
    Messages:
    21,150
    Likes Received:
    6,486
    Location:
    ಠ_ಠ
    hm... the main core area of Navi 10 is in the 150mm^2 range. The 4x64-bit MCs are about 50mm^2 altogether. The uncore stuff is about 50mm^2.

    I guess if they just did a naive doubling of everything (512-bit, 40WGP, 128ROPs, 4SE) then the die size would be in the 450mm^2 range :?:
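
The 450mm^2 guess can be reproduced from the three area figures in this post. This is a rough sketch: the split into "core", "memory controllers" and "uncore" uses the approximate numbers above, and treating the uncore as not doubling is an assumption inferred from the 450mm^2 result rather than something stated outright.

```python
# Approximate Navi 10 area breakdown from the post, in mm^2.
navi10_area = {"core": 150, "memory_controllers": 50, "uncore": 50}


def scaled_die_area(area, scale, fixed=("uncore",)):
    """Scale every block by `scale` except those listed in `fixed`.

    Keeping the uncore fixed is an assumption; it is what makes a
    naive doubling land near 450 mm^2.
    """
    return sum(size if name in fixed else size * scale
               for name, size in area.items())


baseline = sum(navi10_area.values())
doubled = scaled_die_area(navi10_area, 2)
print(f"Navi 10 estimate: {baseline} mm^2, naive doubling: {doubled} mm^2")
```

The baseline sums to 250mm^2, close to Navi 10's quoted 252mm^2, and doubling the core and memory controllers gives 450mm^2. Doubling the uncore as well would give 500mm^2 instead.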
     
  13. yuri

    Newcomer

    Joined:
    Jun 2, 2010
    Messages:
    197
    Likes Received:
    172
    The ungodly big one sounds interesting. Besides, Intel's 500W Xe needs some competition.
     
    Lightman likes this.
  14. Kaotik

    I doubt they'll be doing 2 "big chips", but who knows, there's three Navi 2x chips.
    That's what Arcturus is for ;)
     
  15. ToTTenTranz

    From what I've heard about how difficult GDDR6 traces can be, 512bit of that memory could be very hard to achieve.

    Besides, Big Navi should be getting into the price point where HBM2E is worth implementing, especially with the clock speeds and memory density attainable by the newly produced stacks from Samsung and SK Hynix. They could get up to 48GB and 920GB/s on just 2 stacks.
     
  16. Bondrewd

    Unfortunately AMD has no plans for 500W boards so far.
    Arcturus is 8k ALU@300W.
    Oh you should never doubt her majesty.
    I have some baaaaaaaad~ news for you.
     
  17. ToTTenTranz

    What are they?
     
    Cuthalu likes this.
  18. Bondrewd

    What DRAM vendors tell you is total bullshit.
    The fastest and densest shit you're getting this year is 8-Hi@2.4Gbps.
    Kinda low-key fucks every acc vendor on the market, but it can't be helped.
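
To put the two claims side by side (48GB and 920GB/s on two stacks versus 8-Hi at 2.4Gbps), here is a rough sketch. It assumes 16Gbit DRAM dies and a 1024-bit interface per stack; both values are assumptions for illustration, and the function names are hypothetical.

```python
def stack_capacity_gb(stack_height, gbit_per_die=16):
    """Capacity of one HBM stack in GB (assumes 16 Gbit dies by default)."""
    return stack_height * gbit_per_die / 8


def pin_rate_for_bandwidth(target_gb_s, stacks, bits_per_stack=1024):
    """Per-pin data rate (Gbps) needed to hit a bandwidth target."""
    return target_gb_s * 8 / (stacks * bits_per_stack)


STACKS = 2

# Claimed ceiling for this year: 8-Hi stacks at 2.4 Gbps.
cap_8hi = STACKS * stack_capacity_gb(8)
bw_8hi = STACKS * 1024 * 2.4 / 8

# What the announced 48GB / 920GB/s on two stacks would require.
dies_for_48gb = 48 / STACKS * 8 / 16
rate_for_920 = pin_rate_for_bandwidth(920, STACKS)

print(f"8-Hi @ 2.4 Gbps, 2 stacks: {cap_8hi:.0f} GB, {bw_8hi:.1f} GB/s")
print(f"48 GB needs {dies_for_48gb:.0f}-Hi stacks; 920 GB/s needs {rate_for_920:.2f} Gbps/pin")
```

Under these assumptions the 8-Hi, 2.4Gbps parts top out at 32GB and about 614GB/s on two stacks, while the announced numbers imply 12-Hi stacks running at about 3.6Gbps per pin.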
     
    PSman1700 likes this.
  19. ToTTenTranz

    Where are you getting this info from? Do you have any sources for that?


    I don't have anything against Big Navi using GDDR6, especially with speeds reaching 18Gbps in the near future (or is that a lie too?).
    I just pointed out the newest HBM2E spec announcements as good opportunities for a high-end graphics card.
     
    nnunn likes this.
  20. Bondrewd

    People.
    I can't point my finger at exact people, that would be very indecent of me.
    Yeah, but that's just JEDEC spec, not the actual parts available from DRAM vendors.
     