AMD Vega 10, Vega 11, Vega 12 and Vega 20 Rumors and Discussion

Discussion in 'Architecture and Products' started by ToTTenTranz, Sep 20, 2016.

  1. Radolov

    Joined:
    Jul 30, 2019
    Messages:
    9
    Likes Received:
    13
    Thanks for mentioning it. I should've gone to Specsavers. :)
    I did search for "acc", but I did it in the 908 target for some odd reason. ¯\_(ツ)_/¯
     
  2. pTmdfx

    Regular Newcomer

    Joined:
    May 27, 2014
    Messages:
    275
    Likes Received:
    175
    One RDNA 2 bit that interests me is "CPU can cache GPU memory". APU hardware has been claimed to support coherent accesses to pageable system memory (well... these accesses dodge all levels of GPU caches though). So this new claim does sound like enabling CPU cache coherent access to SVM buffers allocated in GPU local memory! That's the missing piece once promised in the good old FSA 2012 roadmap. :-D

    If you have a device-local buffer, you definitely mean to take advantage of the GPU local bandwidth. But given that these buffers would be cacheable by CPU cores, naively speaking the GPU would need to probe (and be probed by) CPUs and GPU neighbours for all the read/write traffic, along with GPU atomics having to work with MDOEFSI states. :runaway:

    I figured this might be the root cause of the RDNA-CDNA architecture split in the end. Thinking deeper about it, they would likely have to put in at least an IF Home Coherence Controller to serve neighbour memory requests (and probably GPU system-coherent atomics, if GPU L2 will not cache system coherent lines). Probe filters would have to be enabled for optimal local access bandwidth & energy efficiency, because snooping the entire system of 10 NUMA nodes (2 CPUs + 8 GPUs) for all requests is not a sustainable idea. Moreover, I wouldn't be surprised if they wanted to allow GPU L2 to hold system coherent cache lines, e.g. for reducing traffic via write combining. This would then require GPU L2 to either serve probes directly, or have extras like shadow tags to absorb the traffic.
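
    To put rough numbers on the probe filter argument, here is a toy sketch in Python. It assumes a simple directory that records which nodes may hold each cache line, and it only counts probes. The names (`ProbeFilter`, `broadcast_probes`, `simulate_local_stream`) are made up for illustration, and nothing here models the actual Infinity Fabric protocol or MDOEFSI state transitions.

```python
from collections import defaultdict

NUM_NODES = 10  # 2 CPUs + 8 GPUs, the system size used in the post


class ProbeFilter:
    """Toy directory: for each cache line, the set of nodes that may hold it."""

    def __init__(self):
        self.sharers = defaultdict(set)

    def read(self, node, line):
        # Only nodes recorded as holders (other than the requester) are probed.
        probes = len(self.sharers[line] - {node})
        self.sharers[line].add(node)
        return probes

    def write(self, node, line):
        # Other holders are probed for invalidation; the writer becomes sole holder.
        probes = len(self.sharers[line] - {node})
        self.sharers[line] = {node}
        return probes


def broadcast_probes(num_requests, num_nodes=NUM_NODES):
    """Without a filter, every request snoops every other node."""
    return num_requests * (num_nodes - 1)


def simulate_local_stream(num_lines, gpu_node=2):
    """One GPU streams through its own local buffer that nobody else has touched."""
    pf = ProbeFilter()
    return sum(pf.read(gpu_node, line) for line in range(num_lines))
```

    In this toy model, a GPU streaming 1000 untouched local lines sends no probes with the filter, against 9000 with broadcast snooping. Probes only appear once another node has actually cached a line.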

    The sad fact is that all of this is irrelevant to consumer GPUs for the time being, and hence the split makes sense. No major consumer platform (MSFT/AAPL/Android) seems to have an incentive to push heterogeneous computing in the consumer/mobile world. XSX/PS5 are likely not touching this either. I can only hope NG consoles in 3-5 years might pick up the torch on the consumer computing front, since they have been avid fans of APUs. :-|
     
    #6002 pTmdfx, Apr 3, 2020
    Last edited: Apr 3, 2020
  3. yuri

    Newcomer

    Joined:
    Jun 2, 2010
    Messages:
    202
    Likes Received:
    177
  4. Arnold Beckenbauer

    Veteran

    Joined:
    Oct 11, 2006
    Messages:
    1,430
    Likes Received:
    362
    Location:
    Germany
    Is this the MacBook Pro's GPU from 2018/2019?
     
  5. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,874
    Likes Received:
    2,808
    Location:
    Finland
    2019 only I think? But yes.
     
  6. iMacmatician

    Regular

    Joined:
    Jul 24, 2010
    Messages:
    781
    Likes Received:
    211
  7. ethernity

    Newcomer

    Joined:
    May 1, 2018
    Messages:
    23
    Likes Received:
    30
    The "A" is for Accumulator. It is used for accumulating results during matrix FMA.
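
    As a rough illustration of what "accumulating results during matrix FMA" means, here is a minimal Python sketch. It assumes the usual D = A*B + C form, with the accumulator playing the role of C and D. The function names `matrix_fma` and `tiled_matmul` are hypothetical, and this does not model any real instruction or register file.

```python
def matrix_fma(a, b, acc):
    """Return acc + a @ b for small matrices stored as lists of lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [acc[i][j] + sum(a[i][k] * b[k][j] for k in range(inner))
         for j in range(cols)]
        for i in range(rows)
    ]


def tiled_matmul(a, b, tile_k):
    """Multiply a @ b by feeding slices of the inner dimension through
    matrix_fma, carrying the partial sums in one accumulator."""
    rows, inner, cols = len(a), len(b), len(b[0])
    acc = [[0] * cols for _ in range(rows)]  # the accumulator starts at zero
    for k0 in range(0, inner, tile_k):
        a_tile = [row[k0:k0 + tile_k] for row in a]
        b_tile = b[k0:k0 + tile_k]
        acc = matrix_fma(a_tile, b_tile, acc)  # partial products are added in
    return acc
```

    The accumulator is the only state carried between steps: each call adds one slice's partial products to it, so the full product is available once the loop finishes.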
     
  8. del42sa

    Newcomer

    Joined:
    Jun 29, 2017
    Messages:
    182
    Likes Received:
    106