AMD Vega 10, Vega 11, Vega 12 and Vega 20 Rumors and Discussion

P100 is last year's architecture. Would be more interesting to see how it compares to V100.

Don't expect to see any V100 soon (end of Q3 for Tesla/DL, maybe in the form of a V102 starting in 2018), and nobody has performance figures or benchmarks for it. If you can get me a sample of it, I will gladly run the test for you.
 
But this is not a consumer product...looks like we won't be seeing "regular" Vega before this fall.
As sad as I find this personally, AMD needs all the money it can get. And that's in the professional market, where, with HBCC, there are more market segments in which Vega would be able to make a difference and thus warrant investing money for those target audiences. The gaming market normally tops out at 700 US-$.
 
Contrary to the render, I'm pretty sure the card in Raja's hands had 1× 8-pin and 1× 6-pin.

Impressive numbers shown there, for sure.


I would not bet on the 8-Hi stacks. It cannot be ruled out yet that Vega can talk to four stacks - maybe the Vega sample shown at the Radeon Tech Summit was even only the smaller Vega 11, or not the fully equipped version of Vega 10. Smoke screens and mirrors. :)

Well, as I wrote above: could AMD have developed two interposers, one with 4 stacks of RAM and one with 2 (used on the MI25)? Maybe that would even reduce the cost for the gaming one.
 
I think Vega either has insanely low yields or AMD is selling them like hotcakes for all the professional products, or both tbh. It seems like there won't be consumer Vega for a while with how little of it was shown. Since frontier edition is coming end of June, it looks like consumer cards might be even later.
 
I think Vega either has insanely low yields or AMD is selling them like hotcakes for all the professional products, or both tbh. It seems like there won't be consumer Vega for a while with how little of it was shown. Since frontier edition is coming end of June, it looks like consumer cards might be even later.

Rumors (well, leaks for some) point to a showcase of AIB cards on May 31st and availability on June 5th.
 
I would not bet on the 8-Hi stacks. It cannot be ruled out yet that Vega can talk to four stacks.
Well, as I wrote above: could AMD have developed two interposers, one with 4 stacks of RAM and one with 2...
And developing 2 different interposers / reference designs plus gutting half the memory channels for consumer products would be better than just using 8-Hi stacks?


- maybe the Vega sample shown at the Radeon Tech Summit was even only the smaller Vega 11, or not the fully equipped version of Vega 10. Smoke screens and mirrors. :)
Honestly, I think that would be too good to be true. We haven't seen anyone from AMD mentioning more than one Vega chip yet. In January 2016 Raja was already saying there'd be a Polaris 10 and a Polaris 11.


I think Vega either has insanely low yields or AMD is selling them like hotcakes for all the professional products, or both tbh. It seems like there won't be consumer Vega for a while with how little of it was shown. Since frontier edition is coming end of June, it looks like consumer cards might be even later.
Vega cards are scheduled to be formally announced at Computex on May 31st.
This was an analyst presentation, so the fact that there were even a couple of new products shown was quite a surprise.
They showed the enterprise Vega for the analysts because it's a higher margin product. Vega for consumers will be shown in 2 weeks.

Plus, AMD doesn't have a history of getting professional graphics cards to market first. On the contrary. The only exception I can remember was Tonga, but by then AMD was still clearing Tahiti inventory, IIRC.
 
And developing 2 different interposers / reference designs plus gutting half the memory channels for consumer products would be better than just using 8-Hi stacks?
If the alternative was not being able to get to 16 GiB because of a lack of 8-hi stacks: Yes. Talking about risk management, which seems crucial for AMD.
 

http://pro.radeon.com/en-us/vega-frontier-edition/
http://pro.radeon.com/en-us/frontier/
 
16GB with 480GB/s bandwidth - 2 stacks.

Well, perhaps. What if the interposer and memory IO are a "split design", in two variants for the two configurations of HBM stacks (2 or 4), with the count of IO lanes unchanged and merely the interposer routing them differently?
 
16GB with 480GB/s bandwidth - 2 stacks.

Well, perhaps. What if the interposer and memory IO are a "split design", in two variants for the two configurations of HBM stacks (2 or 4), with the count of IO lanes unchanged and merely the interposer routing them differently?
It's far, far more likely that Hynix is doing 8-Hi stacks than that they'd be using HBM in "clamshell" mode with half the IO speed.

edit: for the lazy, the memory speed is 1.92 Gbps, so practically it has to be 2 Gbps chips, "which can't be available" according to some
 
P100 is last year's architecture. Would be more interesting to see how it compares to V100.
I think DeepBench is FP32-only for now, while Nvidia was pushing P100 towards FP16; its FP32 peak performance is 9.3 TFLOPS, as it is probably fair to assume they are using a PCIe P100.
If using FP32, one might as well get a Tesla P40 (full GP102) with just over 12 TFLOPS and 250W instead of the P100, unless one is investing in NVLink in a 4-GPUs-per-CPU-socket (2S) node implementation with the P100.
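
Just to show where those peak numbers come from - a rough sketch, assuming the publicly listed shader counts and boost clocks for the PCIe P100 and the P40 (actual DeepBench throughput will land well below these theoretical peaks):

Code:
# Back-of-the-envelope FP32 peak: 2 FLOPs per FMA x shader count x clock.
def peak_fp32_tflops(shaders, boost_ghz):
    return 2 * shaders * boost_ghz / 1000.0

print(peak_fp32_tflops(3584, 1.303))  # Tesla P100 (PCIe): ~9.3 TFLOPS
print(peak_fp32_tflops(3840, 1.531))  # Tesla P40: ~11.8 TFLOPS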

Still, at a quick glance it could be a tentatively good result for AMD even just from an FP32 perspective, but critically it comes down to the details of how the environments were set up, whether it can perform better with mixed precision, and we need other DL applications/frameworks benchmarked with it (if DeepBench is still FP32-only).
Last I read from Baidu was:
Both forward and backward operations are tested. This first version of the benchmark will focus on training performance in 32-bit floating-point arithmetic. Future versions may expand to focus on inference workloads as well as lower precision arithmetic.
We will use vendor supplied libraries even if faster independent libraries exist or faster results have been published. Most users will default to the vendor supplied libraries and as such the vendor supplied libraries are most representative of users' experience.
TBH I would be a bit leery of Nvidia setting up the benchmark environment for a competitor's hardware for comparison, and I feel the same way about AMD setting up Nvidia's P100 CUDA-library environment.
Fingers crossed they release other results such as AlexNet or ResNet, and some run just on their own hardware; it would then be easier to get comparable results.
Cheers

Edit:
Corrected comment about Frontier being Mi25.
 
If the alternative was not being able to get to 16 GiB because of a lack of 8-hi stacks: Yes. Talking about risk management, which seems crucial for AMD.
It was mentioned before, but here's the reference:
http://pro.radeon.com/en-us/vega-frontier-edition/



16GB at ~480GB/s.

(Bonus points for what seems to be an AiO watercooled version in gold)

Unless they're clocking a 4-stack HBM2 configuration at lower values than HBM1, it looks pretty clear to me that it's two 8-Hi stacks rated at ~1.9 Gbps.

Neither 1.9 Gbps nor 8-Hi are in SK Hynix's catalog. What a miracle.


So the memory speed is ~1.88 Gbps. Does anyone know if the HBM2 is underclocked from 2 Gbps or is it rated at a lower speed? (These possibilities are not mutually exclusive….)

My guess is they used 2 Gbps HBM2 chips but brought the clocks a bit lower for the 8-Hi stacks (more dies per stack = more heat, higher leakage?).
Cards with 2*4-Hi HBM2 may be seeing the chips at the full 2Gbps.
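
For what it's worth, the pin rate falls straight out of the quoted bandwidth - a quick sketch, assuming the standard 1024-bit interface per HBM2 stack and the ~480 GB/s figure from the Frontier Edition page:

Code:
# Per-pin data rate implied by total bandwidth and stack count.
# Each HBM2 stack exposes a 1024-bit interface.
def pin_rate_gbps(bandwidth_gbs, stacks, bits_per_stack=1024):
    return bandwidth_gbs * 8 / (stacks * bits_per_stack)

print(pin_rate_gbps(480, 2))  # ~1.88 Gbps per pin with two stacks
print(pin_rate_gbps(480, 4))  # ~0.94 Gbps per pin with four stacks, below even HBM1's 1 Gbps

So two stacks land right at the ~1.88 Gbps figure, while four stacks would have to run below HBM1 speeds.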
 
So the memory speed is ~1.88 Gbps. Does anyone know if the HBM2 is underclocked from 2 Gbps or is it rated at a lower speed? (These possibilities are not mutually exclusive….)
Hm? I counted 1.92 Gbps

edit: nevermind, tired it seems, 1.88 Gbps it is
 