AMD Vega 10, Vega 11, Vega 12 and Vega 20 Rumors and Discussion

Discussion in 'Architecture and Products' started by ToTTenTranz, Sep 20, 2016.

  1. Radolov

    Newcomer

    Joined:
    Jul 30, 2019
    Messages:
    12
    Likes Received:
    13
Thanks for mentioning it. I should've gone to Specsavers. :)
I did search for "acc", but I did it in the 908 target for some odd reason. ¯\_(ツ)_/¯
     
  2. pTmdfx

    Regular Newcomer

    Joined:
    May 27, 2014
    Messages:
    353
    Likes Received:
    300
One RDNA 2 bit that interests me is "CPU can cache GPU memory". APU hardware has been claimed to support coherent accesses to pageable system memory (well... these accesses dodge all levels of the GPU caches, though). So this new claim does sound like enabling CPU cache-coherent access to SVM buffers allocated in GPU local memory! That's the missing piece once promised in the good old FSA 2012 roadmap. :-D

If you have a device-local buffer, you presumably meant to take advantage of the GPU's local bandwidth. But given that these buffers would be cacheable by CPU cores, naively speaking the GPU would need to probe (and be probed by) the CPUs and neighbouring GPUs for all read/write traffic, along with GPU atomics having to work across MDOEFSI states. :runaway:

I figured this might be the root cause of the RDNA-CDNA architecture split in the end. Thinking deeper about it, they would likely have to put in at least an IF Home Coherence Controller to serve neighbours' memory requests (and probably GPU system-coherent atomics, if the GPU L2 will not cache system-coherent lines). Probe filters would have to be enabled for optimal local-access bandwidth and energy efficiency, because snooping an entire system of 10 NUMA nodes (2 CPUs + 8 GPUs) for every request is not a sustainable idea. Moreover, I wouldn't be surprised if they also wanted to allow the GPU L2 to hold system-coherent cache lines, e.g. to reduce traffic via write combining. That would then require the GPU L2 to either serve probes directly, or grow extras like shadow tags to absorb the probe traffic.

The sad fact is that all of this is irrelevant to consumer GPUs for the time being, and hence the split makes sense. No major consumer platform (MSFT/APPL/Android) seems to have an incentive to push heterogeneous computing in the consumer/mobile world. XSX/PS5 are likely not touching this either. I can only hope next-gen consoles in 3-5 years might pick up the torch on the consumer computing front, since they have been avid fans of APUs. :-|
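For a concrete feel of the software side of that claim, here's a minimal host-side sketch in OpenCL 2.0 terms. The capability flags and calls are the real SVM API; what's hypothetical is the payoff, i.e. that such a fine-grained buffer could physically live in GPU local memory instead of pinned system RAM, which is exactly what "CPU can cache GPU memory" would enable:

```cpp
// Sketch: requesting a fine-grained, atomics-capable SVM buffer via OpenCL 2.0.
// Error handling trimmed for brevity.
#include <CL/cl.h>
#include <cstdio>

int main() {
    cl_platform_id platform;
    cl_device_id device;
    clGetPlatformIDs(1, &platform, nullptr);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, nullptr);

    cl_int err;
    cl_context ctx = clCreateContext(nullptr, 1, &device, nullptr, nullptr, &err);

    // Does the device claim fine-grained SVM with cross-device atomics?
    cl_device_svm_capabilities caps = 0;
    clGetDeviceInfo(device, CL_DEVICE_SVM_CAPABILITIES, sizeof(caps), &caps, nullptr);
    if (!(caps & CL_DEVICE_SVM_FINE_GRAIN_BUFFER) || !(caps & CL_DEVICE_SVM_ATOMICS)) {
        puts("no fine-grained SVM + atomics on this device");
        return 0;
    }

    // CPU and GPU may both cache this allocation; keeping their copies
    // coherent is the platform's job (the probes and probe filters above).
    int* counter = static_cast<int*>(clSVMAlloc(
        ctx,
        CL_MEM_READ_WRITE | CL_MEM_SVM_FINE_GRAIN_BUFFER | CL_MEM_SVM_ATOMICS,
        sizeof(int), 0));

    *counter = 42;   // direct CPU store, no map/unmap round trip

    clSVMFree(ctx, counter);
    clReleaseContext(ctx);
    return 0;
}
```

Today a driver typically satisfies that flag combination by placing the buffer in system memory; the interesting change would be satisfying it from local memory at full bandwidth.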
     
    #6002 pTmdfx, Apr 3, 2020
    Last edited: Apr 3, 2020
  3. yuri

    Regular Newcomer

    Joined:
    Jun 2, 2010
    Messages:
    255
    Likes Received:
    246
    Lightman likes this.
  4. Arnold Beckenbauer

    Veteran

    Joined:
    Oct 11, 2006
    Messages:
    1,519
    Likes Received:
    466
    Location:
    Germany
    Is this the MacBook Pro's GPU from 2018/2019?
     
  5. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    9,384
    Likes Received:
    3,396
    Location:
    Finland
    2019 only I think? But yes.
     
  6. iMacmatician

    Regular

    Joined:
    Jul 24, 2010
    Messages:
    787
    Likes Received:
    215
    TheAlSpark, Kaotik and Lightman like this.
  7. ethernity

    Newcomer

    Joined:
    May 1, 2018
    Messages:
    55
    Likes Received:
    103
    The "A" is for Accumulator. It is used for accumulating results during matrix FMA.
     
  8. del42sa

    Newcomer

    Joined:
    Jun 29, 2017
    Messages:
    185
    Likes Received:
    107
  9. CarstenS

    Legend Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,176
    Likes Received:
    2,704
    Location:
    Germany
Since it has not been mentioned, and it's probably more GCN than RDNA, here goes.
Papermaster apparently confirmed Arcturus as Instinct MI100 for 2H20:
     
    Lightman, BRiT, pharma and 1 other person like this.
  10. Lurkmass

    Newcomer

    Joined:
    Mar 3, 2020
    Messages:
    180
    Likes Received:
    173
It's gfx908 specifically; for comparison:

MI50/MI60: gfx906
MI25: gfx900
MI6/MI8: gfx803

    RX 5700 XT: gfx1010
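If you want to check what your own card reports, recent ROCm builds expose the gfx name through HIP's device properties. A quick sketch, assuming the gcnArchName field of current headers (older releases only had the numeric gcnArch):

```cpp
// Print the gfx target of every visible GPU, e.g. "gfx1010" on an RX 5700 XT.
#include <hip/hip_runtime.h>
#include <cstdio>

int main() {
    int count = 0;
    hipGetDeviceCount(&count);
    for (int i = 0; i < count; ++i) {
        hipDeviceProp_t prop;
        hipGetDeviceProperties(&prop, i);
        printf("device %d: %s\n", i, prop.gcnArchName);
    }
    return 0;
}
```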
     
    Lightman likes this.
  11. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    9,384
    Likes Received:
    3,396
    Location:
    Finland
And it lacks the "3D pipeline", as per an AMD Linux patch
     
    Lightman likes this.
  12. CarstenS

    Legend Veteran Subscriber

    Joined:
    May 31, 2002
    Messages:
    5,176
    Likes Received:
    2,704
    Location:
    Germany
A bit late now, but apparently HBM(2) was not so inexpensive to have on gaming cards after all:
https://newsroom.intel.com/press-kits/architecture-day-2020/
In the Architecture Day stream (timestamp 1:26:48), Raja talks (with a smile) about still having scars on his back from trying to bring expensive memory like HBM to gaming at least twice.
     
  13. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    11,541
    Likes Received:
    6,321
I believe Fiji was a pipecleaner for HBM. AMD co-financed and co-developed HBM for years, so they had to use it sometime/somewhere to prove the concept; that's why it was used in Fiji despite the capacity limit. My guess is he's talking about Vega 10 and Kaby Lake-G.


As for Vega 10, there are a lot of clues pointing to Raja / RTG planning for the chip to clock a whole lot higher than it ever did. At an average 1750MHz (basically the same as GP102, with a similar size and a supposedly similar 16FF-class process), a full Vega 10 at the standard ~1.05V vcore would have been sitting closer to the 1080 Ti (like the Radeon VII does), which at the time sold for more than $700.
Even their HBM2 clocks fell short of predictions, as Micron (edit: SK Hynix, with whom AMD developed HBM and who would probably have supplied the memory significantly cheaper than Samsung) couldn't supply standard 2Gbps HBM2 to them, and only Samsung got close at the time.

Had Vega 10 clocked like AMD planned from the beginning, they'd have had 64 CUs @ 1750MHz and 512GB/s of bandwidth (not to mention some stuff that didn't work out as planned, like the primitive shaders), with a performance level that would have allowed them to sell the card for over $700. Instead they had to market the card against the GTX 1080 for less than $500, which in turn gave them much lower profit margins.
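(Back-of-envelope check of those figures; this is just arithmetic, not an AMD spec sheet:)

```cpp
// Sanity-check the 512GB/s and the implied FP32 throughput.
#include <cstdio>

int main() {
    // Two HBM2 stacks, 1024-bit interface each, at the 2.0Gbps/pin speed
    // that SK Hynix couldn't deliver in volume at the time:
    constexpr double bus_bits = 2 * 1024;
    constexpr double gbps_per_pin = 2.0;
    constexpr double bw_gbs = bus_bits * gbps_per_pin / 8.0;  // = 512 GB/s

    // 64 CUs x 64 FP32 lanes x 2 ops per clock (FMA) x 1.75 GHz:
    constexpr double tflops = 64 * 64 * 2 * 1.75e9 / 1e12;    // ~14.3 TFLOPS

    printf("bandwidth: %.0f GB/s, fp32: %.1f TFLOPS\n", bw_gbs, tflops);
    return 0;
}
```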

Of course, shortly after Vega came out, the crypto craze took off, ballooning the prices of every AMD card out there, so in the end it didn't turn out so bad.


So, to get to my point: I think Raja's mistake wasn't implementing HBM in consumer cards; it was implementing HBM in consumer cards that failed to meet their performance targets. I guess if Pascal chips had hit a power consumption wall above ~1480MHz, Nvidia's adoption of GDDR5X would have been considered a mistake as well. Though a lesser one, since they could always have scrapped the GDDR5X versions and used GDDR5 for everything, of course.
It was a problem of implementation cost vs. average selling price of the final product. Apple seems pretty content with HBM2 on their exclusive Vega 12 and Navi 12 laptop GPUs, for example.
     
    #6013 ToTTenTranz, Aug 14, 2020
    Last edited: Aug 14, 2020
  14. Bondrewd

    Veteran Newcomer

    Joined:
    Sep 16, 2017
    Messages:
    1,042
    Likes Received:
    441
But Vega 20 also clocked like a turd even with a shrink.
They just fucked up.
Hynix.
It was Hynix.
Micron did HMC and didn't even enter the HBM race until about last year.
     
  15. yuri

    Regular Newcomer

    Joined:
    Jun 2, 2010
    Messages:
    255
    Likes Received:
    246
    Hmm, nope.

HBM gen1 failed horribly due to the capacity limit at the time: the Hawaii refresh had 8GB, but the shiny HBM high-end got 4GB. Fiji was more like an engineering sample which simply had to be shipped to cover R&D, as you mentioned.

HBM gen2 was IMO also a huge fail, since they bet the whole Vega roadmap on it. Vega 10 was a horribly bottlenecked, bugged fireball. Vega 11 (the Polaris replacement) got canned completely. Vega 12 was an Apple exclusive. Kaby G got EOLed pretty quickly. Dual Vega 10 was canned. Vega 10 Nano was canned.

Vega 20 with HBM gen2 allowed AMD to finally refresh their aging HPC offerings, so I guess that one wasn't so bad. However, dual Vega 20 was just an Apple exclusive again...
     
    digitalwanderer and DavidGraham like this.
  16. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    11,541
    Likes Received:
    6,321
HBM2 is very successful, and it's present in over a dozen different products from AMD, Nvidia, NEC, Intel and maybe more, all of them with very high profit margins.

If HBM2 had been a huge fail, Intel and Micron wouldn't have scrapped HMC in order to use and fab HBM2.
     
    Lightman, no-X and Leovinus like this.
  17. yuri

    Regular Newcomer

    Joined:
    Jun 2, 2010
    Messages:
    255
    Likes Received:
    246
Well, the context was AMD introducing expensive HBM tech to the consumer market. Neither Nvidia, NEC, nor Intel (besides the very short-lived Kaby G) employs HBM in their consumer-oriented products.
     
  18. Bondrewd

    Veteran Newcomer

    Joined:
    Sep 16, 2017
    Messages:
    1,042
    Likes Received:
    441
    Would not matter if the perf and the margins were there.
    Soon.
    We don't have other choices.
     