AMD Vega 10, Vega 11, Vega 12 and Vega 20 Rumors and Discussion

Radolov · Mar 14, 2020

CarstenS said:
AccVGPRs were mentioned before, for example in the first of your additional links. In fact, they were mentioned as far back as July 2019.

Thanks for mentioning. I should've gone to Specsavers.

I did search for "acc", but I did it in the 908 target for some odd reason. ¯\_(ツ)_/¯

pTmdfx · Apr 3, 2020

One RDNA 2 bit that interests me is "CPU can cache GPU memory". APU hardware has been claimed to support coherent accesses to pageable system memory (well... these accesses dodge all levels of GPU caches though). So this new claim does sound like enabling CPU cache coherent access to SVM buffers allocated in GPU local memory! That's the missing piece once promised in the good old FSA 2012 roadmap.

If you have a device-local buffer, you definitely meant to take advantage of the GPU local bandwidth. But given that these buffers would be cacheable by CPU cores, naively speaking it would need to probe (and be probed by) CPUs and GPU neighbours for all the read/write traffic, along with GPU atomics having to work with MDOEFSI states. :runaway:

I figured this might be the root cause of the RDNA-CDNA architecture split in the end. Thinking deeper about it, they would likely have to put in at least an IF Home Coherence Controller to serve neighbour memory requests (and probably GPU system-coherent atomics, if GPU L2 will not cache system coherent lines). Probe filters would have to be enabled for optimal local access bandwidth & energy efficiency, because snooping the entire system of 10 NUMA nodes (2 CPUs + 8 GPUs) for all requests is not a sustainable idea. Moreover, I wouldn't be surprised if they wanted to allow GPU L2 to hold system coherent cache lines, e.g. for reducing traffic via write combining. This would then require GPU L2 to either serve probes directly, or have extras like shadow tags to absorb the traffic.
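To put rough numbers on the snooping argument, here is a toy Python sketch comparing broadcast probing with a directory-style probe filter. It is only an illustration under loud assumptions: the `ProbeFilter` class, its `access` method and the "every access takes exclusive ownership" rule are made up for this example and do not describe AMD's actual Infinity Fabric implementation.

```python
# Toy model: count probes per memory request in a 10-node system
# (2 CPUs + 8 GPUs, as in the post). All names here are hypothetical.

NODES = 10


def broadcast_probes(num_requests, nodes=NODES):
    """Without a probe filter, every request snoops every other node."""
    return num_requests * (nodes - 1)


class ProbeFilter:
    """Hypothetical directory at the home node: line address -> nodes caching it."""

    def __init__(self):
        self.sharers = {}      # line address -> set of node ids
        self.probes_sent = 0

    def access(self, requester, line):
        holders = self.sharers.get(line, set())
        # Probe only the nodes the directory believes hold the line.
        self.probes_sent += len(holders - {requester})
        # Simplification: every access takes exclusive ownership of the line.
        self.sharers[line] = {requester}


# GPU node 2 streams over 1000 of its own lines twice; nobody else touches them.
pf = ProbeFilter()
for _ in range(2):
    for line in range(1000):
        pf.access(2, line)

print(broadcast_probes(2000))  # probes if every request is broadcast
print(pf.probes_sent)          # probes with the directory filtering them
```

In this toy case the purely local traffic generates no probes at all with the filter, which is the "optimal local access bandwidth" point: only lines that are actually shared cost fabric traffic.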

The sad fact is that all these are irrelevant to consumer GPUs for the time being, and hence the split makes sense. No major consumer platform (MSFT/AAPL/Android) seems to have an incentive to push heterogeneous computing in the consumer/mobile world. XSX/PS5 is likely not touching this either. I can only hope NG consoles in 3-5 years might pick up the torch on the consumer computing front, since they have been avid fans of APUs. :-|

yuri · Apr 15, 2020

Disassembly of Radeon Pro Vega 20 (ES?)

It contains some nice pictures of the weird Vega 12 die.

Arnold Beckenbauer · Apr 15, 2020

Is this the MacBook Pro's GPU from 2018/2019?

Kaotik · Apr 15, 2020

Arnold Beckenbauer said:
Is this the MacBook Pro's GPU from 2018/2019?

2019 only I think? But yes.

iMacmatician · Apr 16, 2020

Vega 20 (CUs) initially wasn't available in the 2018 15" MacBook Pros, but a silent update in November 2018 added them as BTO options.

Deleted member 90741 · Apr 21, 2020

Radolov said:
There was a patch on Arcturus talking about something new called "AccVGPRs". Previously there have been mentions of AGPRs, but to my knowledge it has never been clarified what the "A" stood for. Is it safe to assume that it stands for "Accelerator"?

The "A" is for Accumulator. It is used for accumulating results during matrix FMA.
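A minimal sketch of what "accumulating" means here, in plain Python rather than anything resembling the real ISA. The `matrix_fma` helper and the 2x2 tiles are made up for illustration; the only point is that the accumulator tile stays in place while A/B tiles stream through D = A*B + C.

```python
def matrix_fma(a, b, acc):
    """Return acc + a @ b for small list-of-lists matrices (D = A*B + C)."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [acc[i][j] + sum(a[i][k] * b[k][j] for k in range(inner))
         for j in range(cols)]
        for i in range(rows)
    ]


# The accumulator starts at zero and is fed back in on every step.
acc = [[0.0, 0.0], [0.0, 0.0]]
tiles = [
    ([[1.0, 2.0], [3.0, 4.0]], [[1.0, 0.0], [0.0, 1.0]]),
    ([[1.0, 1.0], [1.0, 1.0]], [[2.0, 0.0], [0.0, 2.0]]),
]
for a_tile, b_tile in tiles:
    acc = matrix_fma(a_tile, b_tile, acc)

print(acc)  # [[3.0, 4.0], [5.0, 6.0]]
```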

del42sa · May 13, 2020

https://videocardz.com/newz/amd-ann...ing-16gb-hbm2-memory-and-infinity-fabric-link

AMD Radeon PRO VII for professionals

CarstenS · Jun 18, 2020

Since it has not been mentioned and it's probably more GCN than RDNA here goes.
Papermaster apparently confirmed Arcturus as Instinct MI100 for 2H20:

https://twitter.com/x/status/1273292081205657600

Lurkmass · Jun 18, 2020

CarstenS said:
Since it has not been mentioned and it's probably more GCN than RDNA here goes.

It's gfx908 specifically and for comparison:

MI50/MI60: gfx906
MI25: gfx900
MI6/MI8: gfx803

RX 5700 XT: gfx1010

Kaotik · Jun 19, 2020

And it lacks a "3D pipeline", as per the AMD Linux patch

CarstenS · Aug 14, 2020

A bit late now, but apparently, HBM(2) was not so inexpensive to have on gaming cards after all:
https://newsroom.intel.com/press-kits/architecture-day-2020/
There's Raja in the Architecture Day stream talking (with a smile) about still having scars on his back for trying to bring expensive memory like HBM to gaming at least twice (timestamp 1:26:48).

Deleted member 13524 · Aug 14, 2020

CarstenS said:
There's Raja in the Architecture Day stream talking (with a smile) about still having scars on his back for trying to bring expensive memory like HBM to gaming at least twice (timestamp 1:26:48).

I believe Fiji was a pipecleaner for HBM. AMD co-financed and co-developed HBM for years so they had to use it sometime/somewhere to prove the concept, so that's why it was used in Fiji despite the capacity limit. My guess is he's talking about Vega 10 and Kaby Lake G.

As for Vega 10, there are a lot of clues pointing to Raja / RTG planning for the chip to clock a whole lot higher than it ever did. At an average 1750MHz (basically the same as the GP102, with similar size and a supposedly similar 16FF process), a full Vega 10 with standard ~1.05V vcore would have been sitting closer to the 1080 Ti (like the Radeon VII does), which at the time sold for more than $700.
Even their HBM2 clocks came up shorter than they predicted, as ~~Micron~~ edit: SK Hynix (with whom AMD developed HBM, and who would probably have supplied the memory for significantly less than Samsung) couldn't supply standard 2Gbps HBM2 to them, and only Samsung got close at the time.

Had Vega 10 clocked like AMD planned since the beginning, they'd have 64 CUs @ 1750MHz and 512GB/s bandwidth (not to mention some stuff that didn't work out as they planned, like the primitive shaders) with a performance level that would have allowed them to sell the card for over $700. Instead they had to market the card against the GTX 1080, for less than $500, which in turn gave them much lower profit margins.
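For anyone who wants to check the arithmetic behind those targets, a quick back-of-the-envelope sketch in Python. The helper names are made up, and the inputs that are not stated in the post are assumptions: a 2048-bit memory bus (two 1024-bit HBM2 stacks), 64 FP32 lanes per CU, and 2 FLOPs per lane per clock for an FMA.

```python
def hbm_bandwidth_gb_s(bus_width_bits, pin_rate_gbps):
    """Peak bandwidth in GB/s: bus width (bits) x per-pin rate (Gbps) / 8."""
    return bus_width_bits * pin_rate_gbps / 8


def fp32_tflops(cus, clock_mhz, lanes_per_cu=64, flops_per_lane=2):
    """Peak FP32 throughput in TFLOPS (assumed 64 lanes/CU, FMA = 2 FLOPs)."""
    return cus * lanes_per_cu * flops_per_lane * clock_mhz * 1e6 / 1e12


# Assumed 2048-bit bus at the standard 2Gbps HBM2 rate mentioned above.
print(hbm_bandwidth_gb_s(2048, 2.0))          # 512.0

# 64 CUs at the hypothetical 1750MHz target: roughly 14.3 TFLOPS.
print(round(fp32_tflops(64, 1750), 1))
```

The same helper also shows how sensitive the bandwidth figure is to the pin rate: every 0.1Gbps short of 2Gbps costs about 25.6GB/s on a bus of that assumed width.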

Of course, shortly after Vega came out, the crypto craze went up, ballooning the prices of every AMD card out there, so in the end it didn't go so bad.

So just to get to my point: I think Raja's mistake was not to implement HBM in consumer cards. It was to implement HBM in consumer cards that failed to meet their performance targets. I guess if Pascal chips had hit a power consumption wall above ~1480MHz, their adoption of GDDR5X would have been considered a mistake as well. Though a lesser one, since they could always scrap the GDDR5X versions and use GDDR5 for everything, of course.
It was a problem of implementation cost vs. average selling price of the final product. Apple seems to be pretty content with HBM2 on their exclusive Vega 12 and Navi 12 laptop GPUs, for example.

Bondrewd · Aug 14, 2020

ToTTenTranz said:
Had Vega 10 clocked like AMD planned since the beginning

But Vega20 also clocked like turd even with a shrink.
They've just fucked up.

ToTTenTranz said:
as Micron (with whom AMD developed HBM and would probably supply them the memory for significantly cheaper than Samsung)

Hynix.
It was Hynix.
Micron did HMC and didn't even enter the HBM race until like last year.

yuri · Aug 14, 2020

ToTTenTranz said:
I think Raja's mistake was not to implement HBM in consumer cards. It was to implement HBM in consumer cards that failed to meet their performance targets.

Hmm, nope.

HBM gen1 failed horribly due to the capacity limit at that time - the Hawaii refresh had 8GB, but the shiny HBM high end got 4GB. Fiji was more like an engineering sample which simply had to be shipped to cover R&D, as you mentioned.

HBM gen2 was IMO also a huge fail, since they bet the whole Vega roadmap on that. Vega 10 was a horrible bottlenecked bugged fireball. Vega 11 (Polaris replacement) got canned completely. Vega 12 was an Apple exclusive. Kaby G got EoLed pretty quickly. Dual-Vega 10 was canned. Vega 10 Nano was canned.

Vega 20 with HBM gen2 allowed AMD to finally refresh their older HPC offerings. So, I guess, that one wasn't that bad. However, dual-Vega 20 was just an Apple exclusive again...

Deleted member 13524 · Aug 14, 2020

yuri said:
HBM gen2 was IMO also a huge fail

HBM2 is very successful and it's present in over a dozen different products from AMD, nvidia, NEC, Intel and maybe more. All of which with very high profit margins.

If HBM2 had been a huge fail, Intel and Micron wouldn't have scrapped HMC to use and fab HBM2.

yuri · Aug 14, 2020

ToTTenTranz said:
HBM2 is very successful and it's present in over a dozen different products from AMD, nvidia, NEC, Intel and maybe more. All of which with very high profit margins.

If HBM2 had been a huge fail, Intel and Micron wouldn't have scrapped HMC to use and fab HBM2.

Well, the context was AMD introducing expensive HBM tech to consumer market. Neither nVidia, NEC, nor Intel (besides the very short lived Kaby G) employ HBM in their consumer-oriented products.

Bondrewd · Aug 14, 2020

yuri said:
Well, the context was AMD introducing expensive HBM tech to consumer market.

Would not matter if the perf and the margins were there.

yuri said:
employ HBM in their consumer-oriented products.

Soon.
We don't have other choices.
