AMD: Navi Speculation, Rumours and Discussion [2019-2020]

Discussion in 'Architecture and Products' started by Kaotik, Jan 2, 2019.

Thread Status:
Not open for further replies.
  1. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,059
    Likes Received:
    3,119
    Location:
    New York
    Any chance AMD will drop the confusing “dual compute unit” terminology for RDNA2? It seems the two CUs share an L0 instruction cache and scalar data cache but all other resources are CU specific. Not sure those two caches are worth the confusing name.
     
  2. Actually, they don't share the L0. Each WGP can access double the LDS because the LDS is part of the WGP.
    More info on WGP mode from AMD's RDNA whitepaper.
     
    Lightman likes this.
  3. JoeJ

    Veteran

    Joined:
    Apr 1, 2018
    Messages:
    1,523
    Likes Received:
    1,772
    However, the work-group processor mode allows using larger allocations of the LDS to boost performance for a single work-group

    I wonder if a CU can mix workgroups that use lots of LDS with others that use only a little? Probably.
    I also wonder how this compares with NV, which seems to have more LDS in general (Ampere increased it once more).
     
  4. szatkus

    Newcomer

    Joined:
    Mar 17, 2020
    Messages:
    38
    Likes Received:
    26
    People with uncommon names (well, uncommon where they currently live) sometimes go by a more familiar name. When I was working with Koreans, some of the more prominent people used English names as their first names, because English is cooler or something. Considering his name could be confusing for people in his vicinity, he may have just said something like "You can call me David", and that's that.

    It may also be because of etymology: some of my English and American clients call me Thomas because that would be my name in English. I don't know if that's the case for Devinder.
     
  5. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    10,245
    Likes Received:
    4,465
    Location:
    Finland
    I'm aware of some people doing this, even going all official like Jen-Hsun switching to Jensen, but Devinder is and has been Devinder everywhere but the WCCFtech article.
     
    Deleted member 90741 likes this.
  6. There is a good description of what it is from @bridgman

    https://www.phoronix.com/forums/for...x/856534-amdgpu-questions?p=857850#post857850
     
    #2226 Deleted member 90741, Jun 4, 2020
    Last edited by a moderator: Jun 4, 2020
  7. BRiT

    BRiT (>• •)>⌐■-■ (⌐■-■)
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    20,516
    Likes Received:
    24,424
    It's a shame that only WCCFtech can get things right ...
     
  8. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,059
    Likes Received:
    3,119
    Location:
    New York
    That's true for HPC parts but RDNA and "gaming" Turing have the same LDS:ALU ratio of 1KB per FP ALU. They also share the same 4KB LDS per block/workgroup.

    Support for higher maximum LDS allocations per workgroup makes sense but Nvidia seems to take a simpler approach. Still not clear to me why a dual CU isn't just a CU with 4 32-wide SIMDs with mode toggles for GCN compatibility. Maybe that's exactly what it is and the terminology is just wonky.

    "Turing allows a single thread block to address the full 64 KB of shared memory. To maintain architectural compatibility, static shared memory allocations remain limited to 48 KB, and an explicit opt-in is also required to enable dynamic allocations above this limit."
     
    JoeJ likes this.
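The Turing quote in the post above boils down to two limits: 48 KB for static allocations and 64 KB per thread block, with an opt-in for anything above 48 KB. A minimal C sketch of that rule follows. The names (`demo_classify_shmem`, `DEMO_*`) are hypothetical, and treating the opt-in threshold as applying to the static plus dynamic total is an assumption, not something the quote spells out.

```c
#include <assert.h>
#include <stddef.h>

/* Limits taken from the quoted Turing text; names are hypothetical. */
#define DEMO_STATIC_LIMIT ((size_t)48 * 1024) /* static allocations */
#define DEMO_BLOCK_LIMIT  ((size_t)64 * 1024) /* whole thread block */

enum demo_shmem_verdict {
    DEMO_SHMEM_OK,            /* fits without any opt-in */
    DEMO_SHMEM_NEEDS_OPT_IN,  /* fits, but only with the explicit opt-in */
    DEMO_SHMEM_REJECTED       /* exceeds a hard limit */
};

/* Classify a shared-memory request for one thread block.
 * Assumption: the 48 KB opt-in threshold applies to static + dynamic. */
enum demo_shmem_verdict demo_classify_shmem(size_t static_bytes,
                                            size_t dynamic_bytes)
{
    /* Sanity check on the constants the comparisons below rely on. */
    assert(DEMO_STATIC_LIMIT <= DEMO_BLOCK_LIMIT);

    if (static_bytes > DEMO_STATIC_LIMIT)
        return DEMO_SHMEM_REJECTED;
    /* static_bytes <= 48 KB here, so the subtraction cannot underflow. */
    if (dynamic_bytes > DEMO_BLOCK_LIMIT - static_bytes)
        return DEMO_SHMEM_REJECTED;
    if (static_bytes + dynamic_bytes > DEMO_STATIC_LIMIT)
        return DEMO_SHMEM_NEEDS_OPT_IN;
    return DEMO_SHMEM_OK;
}
```

Under these assumptions a block asking for 64 KB of dynamic shared memory lands in the opt-in case, while a 49 KB static array is rejected outright.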
  9. 3dcgi

    Veteran Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    2,493
    Likes Received:
    474
    They do share the instruction cache, just not the vector L0.

    Yes
     
    Alexko, Pete, BRiT and 4 others like this.
  10. Thanks to all for the clarity on the ACE nomenclature and cache structure.

    For the more interesting part :embarrased:

    Code:
    drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
    case CHIP_SIENNA_CICHLID:
        adev->gfx.me.num_me = 1;
        adev->gfx.me.num_pipe_per_me = 2;
        adev->gfx.me.num_queue_per_pipe = 1;
    What is the main reasoning behind the increase of the GFX pipe count to 2? Could API calls theoretically be dispatched across multiple pipes?

    I see the MES also got a bunch of updates for Sienna. It would be really interesting to see this put into action, hopefully with the upcoming HW scheduling in Windows (which seems not to be active yet, based on some reports).
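To make the three quoted fields concrete, here is a rough C sketch that multiplies them into a count of graphics queue slots. The struct and function names are hypothetical stand-ins, not the real amdgpu types, and the product is only an upper bound on slots: whether the driver enables every slot is not something the quoted snippet shows.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the three fields quoted above; the real
 * amdgpu structure has many more members. */
struct demo_gfx_me_config {
    unsigned num_me;
    unsigned num_pipe_per_me;
    unsigned num_queue_per_pipe;
};

/* Upper bound on graphics queue slots: one per (ME, pipe, queue). */
unsigned demo_gfx_queue_slots(const struct demo_gfx_me_config *cfg)
{
    assert(cfg != NULL);
    return cfg->num_me * cfg->num_pipe_per_me * cfg->num_queue_per_pipe;
}

/* Values copied from the quoted CHIP_SIENNA_CICHLID case. */
struct demo_gfx_me_config demo_sienna_cichlid_config(void)
{
    struct demo_gfx_me_config cfg = { 1, 2, 1 };
    return cfg;
}
```

With the quoted values this gives two slots, one per pipe, compared with one for a 1 x 1 x 1 configuration.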
     
  11. Krteq

    Newcomer

    Joined:
    May 5, 2020
    Messages:
    149
    Likes Received:
    263
    Some new "Sienna Cichlid" related commits in the RadeonSI Mesa driver

    EDIT: Corrected Phoronix link.

    Some interesting stuff:

    ac_gpu_info.c
    Code:
    if (info->chip_class >= GFX10_3)
        info->max_wave64_per_simd = 16;
    else if (info->chip_class == GFX10)
        info->max_wave64_per_simd = 20;
    else if (info->family >= CHIP_POLARIS10 && info->family <= CHIP_VEGAM)
        info->max_wave64_per_simd = 8;
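As a rough illustration of what the per-SIMD limits above mean per WGP, the following C sketch reproduces the three quoted branches with a hypothetical, simplified enum and scales them by a SIMD count. The count of four SIMD32s per WGP is an assumption taken from the discussion in this thread, and generations not covered by the quoted excerpt deliberately return 0 rather than a guessed value.

```c
#include <assert.h>

/* Hypothetical, simplified tags; the real Mesa chip_class and family
 * enums have many more values. */
enum demo_gen {
    DEMO_POLARIS_TO_VEGAM,
    DEMO_GFX10,
    DEMO_GFX10_3,
    DEMO_OTHER
};

/* Wave slots per SIMD, using only the values in the quoted excerpt. */
unsigned demo_max_waves_per_simd(enum demo_gen gen)
{
    switch (gen) {
    case DEMO_GFX10_3:          return 16;
    case DEMO_GFX10:            return 20;
    case DEMO_POLARIS_TO_VEGAM: return 8;
    default:                    return 0; /* not covered by the excerpt */
    }
}

/* Wave slots per WGP for a given SIMD count (assumed 4 on RDNA). */
unsigned demo_max_waves_per_wgp(enum demo_gen gen, unsigned simds_per_wgp)
{
    assert(simds_per_wgp > 0);
    return demo_max_waves_per_simd(gen) * simds_per_wgp;
}
```

Assuming four SIMDs per WGP, the quoted change takes a WGP from 80 wave slots on GFX10 to 64 on GFX10_3.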
    
     
    #2231 Krteq, Jun 8, 2020
    Last edited by a moderator: Jun 8, 2020
    Pete, Lightman and BRiT like this.
  12. szatkus

    Newcomer

    Joined:
    Mar 17, 2020
    Messages:
    38
    Likes Received:
    26
    I CTRL-Fed the RDNA whitepaper.

     
    Lightman likes this.
  13. Krteq

    Newcomer

    Joined:
    May 5, 2020
    Messages:
    149
    Likes Received:
    263
    So, according to that commit, Sienna has a 16-entry wavefront controller per SIMD, right?
     
  14. szatkus

    Newcomer

    Joined:
    Mar 17, 2020
    Messages:
    38
    Likes Received:
    26
    Another interesting bit.
    Code:
    if (ASICREV_IS_SIENNA_M(chipRevision))
    {
        m_settings.supportRbPlus   = 1;
        m_settings.dccUnsup3DSwDis = 0;
    }
    I figured RB could mean a rendering backend.
     
    w0lfram likes this.
  15. szatkus

    Newcomer

    Joined:
    Mar 17, 2020
    Messages:
    38
    Likes Received:
    26
    Yeah, that's my conclusion as well. I guess 20 was too much for Navi, so they've shrunk it to shave off some transistors.
     
  16. trinibwoy

    trinibwoy Meh
    Legend

    Joined:
    Mar 17, 2004
    Messages:
    12,059
    Likes Received:
    3,119
    Location:
    New York
    Or they’ve managed to reduce pipeline latency and/or increase ILP such that 16 wavefronts per SIMD is enough to hide typical latencies.

    For reference, Turing allocates 8 wavefronts per SIMD, down from 16 in Volta/Pascal. Ampere is back up to 16.
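The latency-hiding argument above can be put into numbers with a deliberately simple model, sketched here in C. It assumes every wave issues a fixed number of cycles of independent work and then stalls for a fixed latency; the function name and all cycle counts are made up for illustration and are not measurements of any GPU.

```c
#include <assert.h>

/* Simplified model: each wave issues work_cycles of independent work,
 * then stalls for latency_cycles. The SIMD stays busy when the other
 * waves together cover the stall:
 *     (n - 1) * work_cycles >= latency_cycles
 * so n = ceil(latency_cycles / work_cycles) + 1. */
unsigned demo_waves_to_hide_latency(unsigned latency_cycles,
                                    unsigned work_cycles)
{
    assert(work_cycles > 0); /* avoid division by zero */
    unsigned covering = (latency_cycles + work_cycles - 1) / work_cycles;
    return covering + 1;     /* covering waves plus the stalled one */
}
```

In this model a 380-cycle stall with 20 cycles of work per wave needs 20 waves, and cutting the stall to 300 cycles brings that down to 16. More independent work per wave (higher ILP) lowers the requirement in the same way.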
     
    #2236 trinibwoy, Jun 8, 2020
    Last edited: Jun 8, 2020
    Lightman, pharma, Alexko and 2 others like this.
  17. Radolov

    Newcomer

    Joined:
    Jul 30, 2019
    Messages:
    12
    Likes Received:
    13
    Does the M in ASICREV_IS_SIENNA_M mean that it will be a mobile part?
     
  18. szatkus

    Newcomer

    Joined:
    Mar 17, 2020
    Messages:
    38
    Likes Received:
    26
    I hadn't even noticed it.

    Code:
    #define ASICREV_IS_VEGA10_M(r)         ASICREV_IS(r, VEGA10)
    #define ASICREV_IS_VEGA10_P(r)         ASICREV_IS(r, VEGA10)
    
    Vega 10 has never been released as a mobile part, right? There are some chips with V. I really can't find the logic behind it; maybe Value, Mid and Performance?

    Edit: oh, and Vega M is apparently P.
    Code:
    #define ASICREV_IS_VEGAM_P(r)          ASICREV_IS(r, VEGAM)
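The point about the suffixes can be shown with a small C sketch: if the `_M` and `_P` macros expand to the same range check, they can never disagree for any revision, so the suffix carries no hardware distinction by itself. Everything prefixed `DEMO_` here is hypothetical, including the revision range and the shape of the range check; the real header may define `ASICREV_IS` differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical revision range, chosen only for illustration. */
#define DEMO_VEGA10_MIN 0x01
#define DEMO_VEGA10_MAX 0x14

/* Assumed shape of the check: true when r lies in [MIN, MAX). */
#define DEMO_ASICREV_IS(r, chip) \
    ((r) >= DEMO_##chip##_MIN && (r) < DEMO_##chip##_MAX)

/* Mirrors the quoted pattern: both suffixes expand to the same check. */
#define DEMO_ASICREV_IS_VEGA10_M(r) DEMO_ASICREV_IS(r, VEGA10)
#define DEMO_ASICREV_IS_VEGA10_P(r) DEMO_ASICREV_IS(r, VEGA10)

/* Returns whether rev is in the demo Vega 10 range, and checks that
 * the _M and _P variants agree on the answer. */
bool demo_is_vega10(unsigned rev)
{
    bool as_m = DEMO_ASICREV_IS_VEGA10_M(rev);
    bool as_p = DEMO_ASICREV_IS_VEGA10_P(rev);
    assert(as_m == as_p); /* identical expansions cannot differ */
    return as_m;
}
```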
     
  19. Radolov

    Newcomer

    Joined:
    Jul 30, 2019
    Messages:
    12
    Likes Received:
    13
    I found an old tweet by komachi which says that M stands for "Mainstream", unless things have changed. But it could indicate that Sienna Cichlid may not be the "Big Navi" that we're looking for. ¯\_(ツ)_/¯
     
  20. szatkus

    Newcomer

    Joined:
    Mar 17, 2020
    Messages:
    38
    Likes Received:
    26
    After seeing the 128-bit bus, I never thought it was. I wonder why they pushed this one into the driver before Big Navi.
     