AMD Vega 10, Vega 11, Vega 12 and Vega 20 Rumors and Discussion

Pressure · Nov 6, 2018

331mm2 (13.2B transistors) for VEGA20. That's quite small! It supports PCIe 4.0 and ECC as well.

Geeforcer · Nov 6, 2018

DavidGraham said:
AMD is comparing V100 PCI-E (250W) vs MI60 (300W). Both cards trade blows in a number of benchmarks. Though they avoided the comparison with the 10% faster V100 NVLink (300W).

I thought Volta could do over 1K images per second in ResNet-50.

HKS · Nov 6, 2018

Geeforcer said:
I thought Volta could do over 1K images per second in ResNet-50.

My guess is that AMD isn't using the Tensor Cores.

DavidGraham · Nov 6, 2018

Geeforcer said:
I thought Volta could do over 1K images per second in ResNet-50.

HKS said:
My guess is that AMD isn't using the Tensor Cores.

Yes they are not.

In May 2017, V100 did ~600 images per second.

https://devblogs.nvidia.com/inside-volta/

In May 2018, NVIDIA made software improvements, and V100 now does 1075 images per second.

We have achieved record-setting ResNet-50 performance for a single chip and single server with these improvements.

A single V100 Tensor Core GPU achieves 1,075 images/second when training ResNet-50, a 4x performance increase compared to the previous generation Pascal GPU.

https://devblogs.nvidia.com/tensor-core-ai-performance-milestones/

Deleted member 13524 · Nov 6, 2018

Rootax said:
1.8ghz at 300w ? We're not that far under Vega 10, I was expecting better, even if it's not "just" a die shrink...

Well I'm not disappointed with the clocks, because Pro cards traditionally use conservative clocks, but I am with that 300W TDP.

It seems that at least in GCN a high double precision throughput causes a significant impact in power efficiency.
For example the R9 290X (1:8 FP64) has a 250W TDP with a 1000MHz core clock, whereas the similar FirePro W9100 (1:2 FP64) has a 275W TDP with a 930MHz core clock. Both run at the same memory clock speeds.
7% lower core clock for 10% higher TDP.
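
For anyone who wants to check those two percentages, here is a minimal Python sketch that uses only the clock and TDP figures quoted above. The helper name `pct_change` is just for illustration.

```python
# Figures quoted in the post: core clock in MHz, board TDP in W.
r9_290x = {"clock_mhz": 1000, "tdp_w": 250}
w9100 = {"clock_mhz": 930, "tdp_w": 275}

def pct_change(new, old):
    """Relative change of `new` versus `old`, in percent."""
    return (new - old) / old * 100.0

clock_delta = pct_change(w9100["clock_mhz"], r9_290x["clock_mhz"])
tdp_delta = pct_change(w9100["tdp_w"], r9_290x["tdp_w"])

print(f"core clock: {clock_delta:+.1f}%  TDP: {tdp_delta:+.1f}%")
```

This only compares rated numbers; it says nothing about how much of the TDP gap comes from FP64 rate versus other board differences.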

Tahiti OTOH had similar clocks/TDP between the consumer and pro cards, with AMD maintaining the same 1:4 FP64 throughput between the two.

yuri said:
Seeing the predictions "north of 2GHz"... oh my, god. Vega simply has to be clocked to ~1.2GHz to be effective.

Predictions of north of 2GHz were always made regarding a hypothetical consumer gaming product, not with workstation/datacenter products.
The Vega Frontier averages 1.27GHz. Vega 64 in performance mode moves up to ~1.5GHz and Vega 64 LC to around 1.6GHz.
That's an 18% difference in average clocks, and the FE even has a higher TDP than the Vega 64.

How much is 1.8GHz with 18% higher clocks? 2.124GHz
Sure sounds like "north of 2GHz" to me.
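
As a rough sketch of that extrapolation in Python, assuming the 18% comes from the Frontier Edition average (1.27GHz) versus Vega 64 in performance mode (~1.5GHz), and that the same uplift is applied to the MI60's 1.8GHz figure:

```python
fe_avg_ghz = 1.27      # Vega Frontier Edition average clock, from the post
vega64_avg_ghz = 1.5   # Vega 64 in performance mode, from the post
mi60_peak_ghz = 1.8    # MI60 "up to" clock

# Consumer-over-pro clock uplift observed on Vega 10.
uplift = vega64_avg_ghz / fe_avg_ghz - 1.0

# Same uplift applied to a hypothetical consumer Vega 20.
projected_ghz = mi60_peak_ghz * (1.0 + uplift)

# The post rounds the uplift to 18% before multiplying.
rounded_ghz = mi60_peak_ghz * 1.18

print(f"uplift: {uplift:.1%}")
print(f"projected: {projected_ghz:.3f} GHz (rounded uplift: {rounded_ghz:.3f} GHz)")
```

It is a straight proportional scaling and ignores power and voltage limits entirely.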

And this is of course looking at a hypothetical consumer release, which may very well not happen.

yuri said:
How would a 7nm Fiji do? The same I guess.

Just in case this is not pure trolling:

1 - Fiji only supports HBM1, so 4GB would still be the limit, at 512GB/s and without HBCC
2 - Fiji has a 1:16 FP64 throughput so even if it clocked at 1.8GHz the FP64 throughput would be less than 1 TFLOPs, so a 7nm Fiji would have worse FP64 performance than the very old Hawaii Pro cards.
3 - Fiji has a 1:1 FP16 throughput, so half of a Vega 20 at ISO clocks
4 - Fiji has a 1:1 INT8 throughput, so 1/4 of a Vega 20 at ISO clocks
5 - Fiji's I/O is limited to 16x PCIe 3.0. That's 32GB/s bi-directional. Vega 20 has 16x PCIe 4.0, 64GB/s bi-directional to CPU/RAM, plus an inter-GPU connection of 200GB/s.
6 - No hardware virtualization on Fiji

So no. A 7nm Fiji wouldn't do "the same".
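
A quick sanity check of point 2, as a Python sketch. The shader count (4096 for a full Fiji) and the 2 FLOPs per shader per clock convention are assumptions brought in for the calculation; only the 1:16 ratio and the 1.8GHz clock come from the list above.

```python
FIJI_SHADERS = 4096          # assumption: full Fiji stream processor count
FLOPS_PER_SHADER_CLOCK = 2   # assumption: one fused multiply-add counted as 2 FLOPs
clock_ghz = 1.8              # hypothetical 7nm clock, from the post
fp64_ratio = 1 / 16          # Fiji's FP64:FP32 rate, from the post

# shaders * FLOPs/clock * GHz gives GFLOPS; divide by 1000 for TFLOPS.
fp32_tflops = FIJI_SHADERS * FLOPS_PER_SHADER_CLOCK * clock_ghz / 1000.0
fp64_tflops = fp32_tflops * fp64_ratio

print(f"FP32: {fp32_tflops:.2f} TFLOPS, FP64: {fp64_tflops:.2f} TFLOPS")
```

Under those assumptions the FP64 figure stays below 1 TFLOPS, in line with the claim.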

DavidGraham · Nov 7, 2018

ToTTenTranz said:
Predictions of north of 2GHz were always made regarding a hypothetical consumer gaming product, not with workstation/datacenter products.
Vega 64 on performance mode moves up to ~1.5GHz

Vega MI25 has basically the same 1500MHz clock as Vega 64. So consumer Vega is the same as datacenter Vega. Makes no difference whatsoever.

Worse yet, officially AMD mentions MI60 to have UP TO 1.8GHz clock, which means this is by no means a fixed clock. It probably goes under that.

ToTTenTranz said:
How much is 1.8GHz with 18% higher clocks? 2.124GHz

Looking at current scaling, MI60 vs MI25, we get 20% more FLOPS. For basically the same TDP. If consumer Vega 20 vs Vega 10 are subjected to the same treatment, we are looking at 350w to 400w of power consumption for the hypothetical consumer Vega 20 @2.1Ghz clock.
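
The 20% figure can be reproduced with a small Python sketch, assuming both MI25 and MI60 have 4096 stream processors and peak at 1.5GHz and 1.8GHz respectively. The shader count is an assumption not stated above, and the function name is illustrative.

```python
SHADERS = 4096  # assumption: same stream processor count on MI25 and MI60

def peak_fp32_tflops(shaders, clock_ghz):
    """Peak FP32 rate, counting one FMA per shader per clock as 2 FLOPs."""
    return shaders * 2 * clock_ghz / 1000.0

mi25 = peak_fp32_tflops(SHADERS, 1.5)
mi60 = peak_fp32_tflops(SHADERS, 1.8)
gain_pct = (mi60 / mi25 - 1.0) * 100.0

print(f"MI25: {mi25:.1f} TFLOPS, MI60: {mi60:.1f} TFLOPS, gain: {gain_pct:.0f}%")
```

With equal shader counts the FLOPS gain is just the clock ratio. The 350W to 400W estimate is a separate guess about power scaling and is not modelled here.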

ToTTenTranz said:
And this is of course looking at a hypothetical consumer release, which may very well not happen.

AMD denied any consumer Vega 20 to be released for gamers, this thing is strictly datacenter.

AMD has made it clear that their 7nm Vega graphics cards are not designed for gaming applications, instead acting as a way for the company to enter the lucrative server/datacenter and machine learning markets.

https://www.overclock3d.net/news/gp..._vega_instinct_mi60_and_mi50_graphics_cards/1

w0lfram · Nov 7, 2018

DavidGraham said:
Vega MI25 has basically the same 1500MHz clock as Vega 64. So consumer Vega is the same as datacenter Vega. Makes no difference whatsoever.

Worse yet, officially AMD mentions MI60 to have UP TO 1.8GHz clock, which means this is by no means a fixed clock. It probably goes under that.

Looking at current scaling, MI60 vs MI25, we get 20% more FLOPS. For basically the same TDP. If consumer Vega 20 vs Vega 10 are subjected to the same treatment, we are looking at 350w to 400w of power consumption for the hypothetical consumer Vega 20 @2.1Ghz clock.

AMD denied any consumer Vega 20 to be released for gamers, this thing is strictly datacenter.

https://www.overclock3d.net/news/gp..._vega_instinct_mi60_and_mi50_graphics_cards/1

How many people are overclocking their machine learning cards..?

You keep conflating two different things in your arguments... business cards and consumer gaming cards. You assume that because Vega20 on 7nm for business makes for a horrible gaming GPU, a gaming GPU built on the same 7nm process would somehow be horrible..? Undoable..?

I am not sure of the points you are posing.

There is an uptick in performance that AMD got from moving from the GlobalFoundries process to TSMC's process, which had nothing to do with 7nm scaling but with how well TSMC's nodes work with Vega's uarch. Vega's design doesn't break down as the MHz add up, the design gets better. And if it is able to reach, or is designed around, 2GHz, then where is the problem..?

Mind you, these just-announced server/machine learning "MI60" cards do 1.8GHz... what about a 7nm design made specifically for gaming @ 2GHz..?

Geeforcer · Nov 7, 2018

I think the point is fairly obvious: AMD's Instinct product line has been clocked at pretty much the same frequencies as the consumer cards based on the respective chips, which contradicts the notion of “+20% for gaming Vega 20”, academic as it may be, as it is surely not happening.

Esrever · Nov 7, 2018

ToTTenTranz said:
Well I'm not disappointed with the clocks, because Pro cards traditionally use conservative clocks, but I am with that 300W TDP.

It seems that at least in GCN a high double precision throughput causes a significant impact in power efficiency.
For example the R9 290X (1:8 FP64) has a 250W TDP with a 1000MHz core clock, whereas the similar FirePro W9100 (1:2 FP64) has a 275W TDP with a 930MHz core clock. Both run at the same memory clock speeds.
7% lower core clock for 10% higher TDP.

I think the TDP for the whole card includes the memory, and the W9100 has 2x the memory of the 290X, which could easily account for the 25W difference. This is also something to consider for the MI60 vs MI25. 32GB of HBM with the corresponding memory interface probably uses more power than the 16GB of HBM in the MI25.

This makes it look like the MI60 uses less power at 1.8GHz than the MI25 at 1.5GHz:

Also, if it is true and if the chart is to scale, 7nm at ~1.2GHz is where they quote half the power compared to 14nm.

gamervivek · Nov 7, 2018

DavidGraham said:
Vega MI25 has basically the same 1500MHz clock as Vega 64. So consumer Vega is the same as datacenter Vega. Makes no difference whatsoever.

Worse yet, officially AMD mentions MI60 to have UP TO 1.8GHz clock, which means this is by no means a fixed clock. It probably goes under that.

Looking at current scaling, MI60 vs MI25, we get 20% more FLOPS. For basically the same TDP. If consumer Vega 20 vs Vega 10 are subjected to the same treatment, we are looking at 350w to 400w of power consumption for the hypothetical consumer Vega 20 @2.1Ghz clock.

AMD denied any consumer Vega 20 to be released for gamers, this thing is strictly datacenter.

https://www.overclock3d.net/news/gp..._vega_instinct_mi60_and_mi50_graphics_cards/1

I'm not sure how P6/P7 works with MI25 cards, but if the peak engine clock of MI25 is 1500Mhz, it's a bit behind the peak clock of 1630Mhz on Vega64. A little underwhelming for the new process anyway.

I think the power consumption would be even higher for Vega20@2.1Ghz, if it can even hit it in normal conditions.

The gaming version can be a 1080Ti competitor but not anything more.

Entropy · Nov 7, 2018

gamervivek said:
I'm not sure how P6/P7 works with MI25 cards, but if the peak engine clock of MI25 is 1500Mhz, it's a bit behind the peak clock of 1630Mhz on Vega64. A little underwhelming for the new process anyway.

I think the power consumption would be even higher for Vega20@2.1Ghz, if it can even hit it in normal conditions.

The gaming version can be a 1080Ti competitor but not anything more.

Why would there be a gaming version at all? The fp64 performance of this card is wasted there, or put in other words, it is inefficient by design with gaming workloads. Nothing AMD has said as far as I have seen has indicated that Vega20 is targeting the gaming market.

gamervivek · Nov 7, 2018

Hawaii also had FP64. I think the probability of getting a prosumer version like Frontier Edition is much higher. I just don't see AMD selling it all that much for GPGPU unless it's a really expensive pipe-cleaner.

Deleted member 13524 · Nov 7, 2018

Entropy said:
Why would there be a gaming version at all

To compete with the 1080Ti and 2080.

Entropy said:
The fp64 performance of this card is wasted there

As it was for Hawaii, yet they launched 5 consumer SKUs with that chip.

Entropy said:
it is inefficient by design with gaming workloads.

It is?
I look at the diagram and see ROPs, DSBR, geometry engines... What use are those in datacentres?

Entropy said:
Nothing AMD has said as far as I have seen has indicated that Vega20 is targeting the gaming market.

It was a datacentre focused event. Nothing AMD said about Zen2 indicated that it was coming for the consumer. When nvidia showed GV100 there were no hints at releasing consumer/gaming versions either, yet Titan V exists.

Entropy · Nov 7, 2018

ToTTenTranz said:
To compete with the 1080Ti and 2080.

As it was for Hawaii, yet they launched 5 consumer SKUs with that chip.

True. However, this time, to my knowledge, AMD have not mentioned gaming at all, and they have said that they have Navi coming for gaming at 7nm.

beyondtest · Nov 7, 2018

Hi. I'm fairly new to stuff like ML INT4, INT8 etc.

I read that MI60 is only 118 TOPS and 260 TOPS for Volta in INT4.

If I am understanding it correctly, AMD does not want to price it and would be willing to negotiate with the buyer.

How important are these TOPs?

troyan · Nov 7, 2018

Volta's Tensor Cores don't support INT. This is a new feature for Turing. But there is still support for 4x INT8.

beyondtest · Nov 7, 2018

troyan said:
Volta's Tensor Cores don't support INT. This is a new feature for Turing. But there is still support for 4x INT8.

Thanks. How different is 4x INT8 vs INT4?

Samwell · Nov 7, 2018

~~Volta has no Int4, only INT8. As for Int8, we're at 120 TOPs INT8 for Volta vs 60 INT8 TOPS for V20.~~ But Volta isn't really for the INT8/INT4 workload; it's more for FP32 and FP16. Turing in the Tesla T4 is Nvidia's solution for inference (where you use INT8/INT4), with 130 INT8 TOPS and 260 INT4 TOPS at 70W vs V20's 120 TOPS at 300W.
So AMD's solution for inference isn't really competitive.
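
To put that in perf-per-watt terms, here is a rough Python sketch using only the figures in this post. The 120 TOPS for V20 is read here as an INT4 number, which is an assumption based on the ~118 TOPS INT4 figure mentioned earlier in the thread.

```python
# Peak throughput and board power as quoted in the post.
t4 = {"int4_tops": 260, "watts": 70}
v20 = {"int4_tops": 120, "watts": 300}  # assumption: the 120 TOPS figure is INT4

def tops_per_watt(card):
    """Peak INT4 TOPS divided by rated board power."""
    return card["int4_tops"] / card["watts"]

t4_eff = tops_per_watt(t4)
v20_eff = tops_per_watt(v20)

print(f"T4: {t4_eff:.2f} TOPS/W, V20: {v20_eff:.2f} TOPS/W, "
      f"ratio: {t4_eff / v20_eff:.1f}x")
```

These are peak paper numbers divided by TDP, so real inference efficiency will differ.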

AMD's strength with V20 is its DP and FP32 performance. There they can compete pretty well with Volta and might sell some cards. Also, code which needs a lot of bandwidth will run great with this 1TB/s HBM.

Edit: I think troyan is right, Int8 isn't on tensor cores in volta.

beyondtest · Nov 7, 2018

Samwell said:
~~Volta has no Int4, only INT8. As for Int8, we're at 120 TOPs INT8 for Volta vs 60 INT8 TOPS for V20.~~ But Volta isn't really for the INT8/INT4 workload; it's more for FP32 and FP16. Turing in the Tesla T4 is Nvidia's solution for inference (where you use INT8/INT4), with 130 INT8 TOPS and 260 INT4 TOPS at 70W vs V20's 120 TOPS at 300W.
So AMD's solution for inference isn't really competitive.

AMD's strength with V20 is its DP and FP32 performance. There they can compete pretty well with Volta and might sell some cards. Also, code which needs a lot of bandwidth will run great with this 1TB/s HBM.

Edit: I think troyan is right, Int8 isn't on tensor cores in volta.

Is V20 same as the MI60?

I listened to this for a bit:

So the MI60 can be used for rendering, RT and denoising. Could AMD release gaming drivers for it?

Zaphod · Nov 7, 2018

beyondtest said:
So the MI60 can be used for rendering, RT and denoising. Could AMD release gaming drivers for it?

Q: But can it run Crysis?
A: No.

At present, AMD haven't even announced support for Windows at all.

https://www.amd.com/en/products/professional-graphics/instinct-mi60
