We're not talking perf/W efficiency here. We're talking architectural efficiency.
What's the difference? Are you using "architectural efficiency" to mean perf/mm², leaving TDP aside?
If so, has anyone done a Vega 10 vs GP102 comparison at iso clocks? Downclock a Titan X to, say, 1400MHz, do the same with a Vega 64, and see how they compare?
Last time I saw something like that, Polaris 10 actually came very close to a GP104 at iso clocks for core and memory.
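An iso-clock comparison like that boils down to normalizing measured performance by core clock. A minimal sketch of that normalization, where all the frame rates and the 1400MHz lock are hypothetical placeholders, not benchmark results:

```python
# Hypothetical sketch of an iso-clock comparison: normalize measured frame
# rates by core clock to estimate per-clock ("architectural") throughput.
# All numbers below are made-up placeholders, not real benchmark data.

def perf_per_clock(fps, core_mhz):
    """Frames per second per MHz of core clock."""
    return fps / core_mhz

# Placeholder inputs: both cards locked to the same 1400 MHz core clock.
titan_x_fps = 100.0   # hypothetical measurement
vega64_fps = 90.0     # hypothetical measurement

ratio = perf_per_clock(vega64_fps, 1400) / perf_per_clock(titan_x_fps, 1400)
print(f"Vega 64 reaches {ratio:.0%} of Titan X per-clock performance")
```

With both cards locked to the same clock the normalization cancels out, but the same helper also works when the two cards can't hold exactly the same frequency.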
Do you have any reason at all to believe that the compute-specific extra features of GP100 have a negative impact on its graphics performance? Is there a negative to having larger register files? To larger caches? To having NVLink?
They at least have an impact on the clocks GP100 can achieve at a given TDP compared to GP102. According to NVIDIA's own whitepapers, GP100's peak FP32 throughput is 10.6 TFLOPs (56 SMs @ 1480MHz) with a 300W TDP, whereas GP102 gets about 20% more at 250W. This obviously has an impact on its graphics performance.
So the answer to your question is yes: GP100's 1/2-rate FP64 + 2×FP16 + larger caches + NVLink etc. do in fact have a negative impact on gaming performance.
They're not responsible for decreasing IPC; they're responsible for decreasing clocks at iso TDP.
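The whitepaper figures follow from the usual peak-throughput formula (SMs × FP32 lanes per SM × 2 FLOPs per FMA × clock). A quick sketch, where the GP100 numbers match the text but the GP102 configuration (30 SMs of 128 lanes, i.e. the full Titan Xp die) and its 1582MHz boost clock are my assumptions:

```python
# Back-of-the-envelope peak FP32 throughput, whitepaper style:
# TFLOPs = SMs * FP32 lanes per SM * 2 (FMA) * clock.
# GP100 figures are from the text; the GP102 SM count and boost clock
# (full Titan Xp die at its official boost) are assumptions.

def peak_fp32_tflops(sms, lanes_per_sm, clock_mhz):
    return sms * lanes_per_sm * 2 * clock_mhz * 1e6 / 1e12

gp100 = peak_fp32_tflops(56, 64, 1480)    # GP100: 56 SMs x 64 FP32 lanes
gp102 = peak_fp32_tflops(30, 128, 1582)   # assumed full GP102 at 1582 MHz

print(f"GP100: {gp100:.1f} TFLOPs")
print(f"GP102: {gp102:.1f} TFLOPs ({gp102 / gp100 - 1:+.0%} vs GP100)")
```

At the official boost clock this comes out around +15% for GP102; the ~20% figure in the post presumably reflects the higher clocks GP102 cards sustain in practice.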
Because that's really what this is about: people claiming that Vega's gaming performance is lackluster because it's focusing on compute.
There's a number of reasons why Vega isn't reaching the same gaming performance as GP102 at iso TDP:
1 - GlobalFoundries' 14LPP is substantially less efficient than TSMC's 16FF+ (from the posts of experts in this forum, there's at least a 20% difference in power consumption at iso clocks).
2 - As Raja confirmed two weeks ago, some of the features aren't implemented in the driver yet (his statement implies they will be, and so do @Rys ' statements so far). Perhaps this discussion will be different when DSBR gets enabled even in automatic mode, since it'll affect both geometry performance and effective bandwidth.
3 - Also as mentioned by Raja in the same tweet, the Infinity Fabric being used in Vega 10 wasn't optimized for consumer GPUs, and that also seems to be holding the GPU back (maybe by limiting clocks at iso TDP). Why use IF in Vega 10 at all? Perhaps because iterating IF in Vega 10 was an important stepping stone toward optimizing the implementation for Navi, or even Vega 11 and Raven Ridge. Perhaps HBCC was implemented around IF from the start. Perhaps Vega SSG doesn't have a dedicated PCIe controller for its SSDs and IF is being used to implement one in Vega.
4 - Compute-oriented features like 2×FP16, larger caches and HBCC prevent Vega 10 from achieving higher clocks at iso TDP, just like what happens with GP100.
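Point 1 can be made concrete with a rough estimate of what a ~20% power gap costs in clocks. A hedged sketch, using the common P ∝ f³ rule of thumb (valid only when voltage scales roughly with frequency; the exponent is a modeling assumption, not a measured number):

```python
# Rough sketch: if 14LPP burns ~20% more power than 16FF+ at iso clocks,
# estimate the clock deficit at iso TDP assuming P scales with f^3
# (a common rule of thumb when voltage tracks frequency; the exponent
# is an assumption, not a measurement).

power_penalty = 1.20  # ~20% more power at iso clocks (forum estimate)
clock_deficit = 1 - (1 / power_penalty) ** (1 / 3)
print(f"~{clock_deficit:.0%} lower clocks at the same TDP")
```

Under that assumption a 20% power penalty translates into only a ~6% clock deficit at iso TDP, so the process gap alone doesn't explain the whole difference; that's why points 2 to 4 matter too.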