AMD Vega 10, Vega 11, Vega 12 and Vega 20 Rumors and Discussion

Given the modest difference in base clocks between Polaris and Vega FE, plus the CU counts and the bandwidth numbers, something in the range of two 580/570s seems rather close to what Vega has in raw counts, with the exception of the geometry front ends. Power consumption also seems to be in the neighborhood of 2x Polaris.
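As a quick sanity check on that 2x framing, here's the back-of-envelope math from the public spec-sheet numbers (boost clocks, so treat the figures as approximate):

```python
# Rough FP32 throughput from public specs: CUs x 64 lanes x 2 ops (FMA).
# Boost clocks used; real sustained clocks will differ.
def tflops(cus, clock_mhz):
    return cus * 64 * 2 * clock_mhz / 1e6

print(f"Vega FE:   {tflops(64, 1600):.1f} TFLOPS, 484 GB/s")            # ~13.1
print(f"2x RX 580: {tflops(2 * 36, 1340):.1f} TFLOPS, {2 * 256} GB/s")  # ~12.3
```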

That shouldn't be all there is to the story, which leaves me puzzled at the moment.
 
Surely some features are not activated at the moment, and the drivers are probably operating in a fallback mode. But then it is madness to sell these cards now and let unregulated, unsupported benchmarks fill the internet.

Maybe we are lucky, and gaming Vega will bring the full driver and show 50% better performance while happily taking the performance crown. However, whoever is responsible for the FE launch at AMD should be looking for a new job.
 
Tiled rasterization alone could do that if it isn't currently enabled. The question is how many of the new features are enabled right now. All benches so far look like Fiji at higher clocks. That would seem to indicate that none of the new features are enabled, except packed math and HBCC, which AMD has demoed.

What's kind of surprising is that the 8-Hi stacks aren't overheating, as far as I can tell. Or they are, and ECC is correcting for it.

Even if AMD demoed it in a hands-off session, I wouldn't assume it is operating in a non-contrived scenario. HBCC was demoed in an artificially hobbled system, and that's the one case where we know anything was supposedly working.
Packed math might have some credence because somebody other than AMD has used it on the PS4 Pro, but that's on something other than Vega.
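For reference, the packed-math idea itself is trivial to illustrate: two FP16 values share one 32-bit register lane, and a single packed instruction operates on both halves, doubling the FP16 rate. A minimal NumPy sketch of the layout (just the concept, nothing Vega-specific):

```python
import numpy as np

# Two FP16 values packed into one 32-bit lane; a packed-math ALU would
# process both halves with one instruction (2x the FP32 rate for FP16).
a = np.array([1.5, -2.25], dtype=np.float16)  # one packed register lane
b = np.array([0.5, 4.0], dtype=np.float16)
lane = a.view(np.uint32)                      # both halves in one 32-bit word
print(hex(lane[0]))                           # 0xc0803e00: -2.25 | 1.5
print(a * b)                                  # the "packed" result: [0.75 -9.0]
```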

As far as HBM2 goes, ECC is an option in the standard that isn't promised for this range of products. ECC is also not intended for thermal correction.

HBM's actual compensation is, first, entering a significantly higher refresh rate above 85°C, which can be a constraint on the GPU as well.
The controller should try things like throttling or inserting dead cycles to keep temperatures in line; otherwise perf/W and performance can degrade, due to the power cost of the refreshes and the inability to get as many accesses through.

If it gets hotter, the next provision is an emergency thermal trip sensor, although my interpretation is that this isn't error correction so much as a last-ditch shutdown. It might be of higher importance with 8-Hi stacks and in server environments, and might have an impact in the Pro.
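To put a rough number on why the high-temperature refresh mode hurts, here's a back-of-envelope estimate of the bandwidth lost to refresh. The timings are placeholder DDR-class figures (tREFI around 7.8 µs, halving above 85°C; tRFC of a few hundred nanoseconds), not values from the HBM2 spec, so only the shape of the cost is meaningful:

```python
# Placeholder DDR-class timings, NOT HBM2 spec values: the point is just
# that halving the refresh interval doubles the fraction of time lost.
T_RFC = 350e-9                        # time one refresh stalls a channel
for label, t_refi in (("below 85C", 7.8e-6), ("above 85C", 3.9e-6)):
    overhead = T_RFC / t_refi         # fraction of time spent refreshing
    print(f"{label}: {overhead:.1%} of bandwidth lost to refresh")
```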
 
Maybe we are lucky, and gaming Vega will bring the full driver and show 50% better performance while happily taking the performance crown.
Ugh... the "Gaming" FE drivers are dated 6/27/2017, and RX is supposed to launch sometime before 8/3/2017. That leaves more or less a month for the final phase of driver development, which has been ongoing for years...

Do you really think a month of software development will bring 50% more performance?
 
Let me put it this way: my confidence in AMD PR is so low that I would not rule out the option that they have disabled features until the launch of RX Vega, but I would also not rule out that they have made a "poor Volta" video for a chip that barely beats the 1080.
 
So many questions about why it is underperforming. Is it just me, or does all the "new stuff" presented a few months ago (primitive discarding, etc.) in fact need custom programming, and so will never be used on PC? Like the tessellation unit on R600? So we're basically left with an OC'd Fiji with a better L2 cache system?
 
When did NV really get their big perf-per-watt advantage? Was it when they released their first product with colour compression (Fermi, as far as I'm aware), or when they released their first product with the ROP caches backed by L2 (Maxwell)?
How about: when they massively overhauled the SM architecture, with all the known changes pointing to significantly improved efficiency?

Unless colour compression just sucked in Fermi/Kepler and then magically became super awesome in Maxwell, because reasons.
And why not? Many things have transitioned from blah to super awesome at some point.
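For what it's worth, the basic delta colour compression idea is simple enough to sketch in a few lines. This is purely a toy illustration of the general technique, not AMD's or NVIDIA's actual scheme; the win is in the memory traffic saved whenever a tile compresses:

```python
import numpy as np

# Toy delta colour compression: store one anchor pixel plus narrow
# per-pixel deltas, falling back to raw storage when the deltas don't
# fit. Real schemes are far more elaborate; this shows only the idea.
def compressed_bits(tile, delta_bits=4):
    anchor = int(tile.flat[0])
    deltas = tile.astype(np.int64) - anchor
    if np.abs(deltas).max() < (1 << (delta_bits - 1)):
        return 32 + delta_bits * (tile.size - 1)   # anchor + deltas
    return 32 * tile.size                          # uncompressed fallback

flat = np.full((8, 8), 0x80, dtype=np.uint32)      # flat-coloured tile
noisy = np.random.randint(0, 256, (8, 8)).astype(np.uint32)
print(compressed_bits(flat), "vs", 32 * flat.size, "raw bits")   # 284 vs 2048
print(compressed_bits(noisy), "bits (likely falls back to raw)")
```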
 
So many questions about why it is underperforming. Is it just me, or does all the "new stuff" presented a few months ago (primitive discarding, etc.) in fact need custom programming, and so will never be used on PC? Like the tessellation unit on R600? So we're basically left with an OC'd Fiji with a better L2 cache system?

This is Raja saying, about a month ago, that the geometry pipeline improvements won't take any extra work to use.

 
So many questions about why it is underperforming. Is it just me, or does all the "new stuff" presented a few months ago (primitive discarding, etc.) in fact need custom programming, and so will never be used on PC? Like the tessellation unit on R600? So we're basically left with an OC'd Fiji with a better L2 cache system?

I think AMD's relationship with devs has improved drastically since the R600 days (hey, it's an AMD/ATI launch, expect R600 & R300 references to go through the roof). It also helps that the game studios rely on only a few engine developers. (So not all is lost?)
 
I handed the board to him myself earlier today in Helsinki, over a beer. The joys of looking after Game Engineering for Europe. Full architecture details will come out later, can't spoil that, but it is FL 12_1 top tier for everything.

Wait, Vega FE in Helsinki, Finland? Hold my beer :)
 
I would wait for mass-produced cards; early silicon tends to get the shaft when it comes to power consumption. Still, the performance per clock doesn't seem promising.
 
So many questions about why it is underperforming. Is it just me, or does all the "new stuff" presented a few months ago (primitive discarding, etc.) in fact need custom programming, and so will never be used on PC?
Not at all. The primitive shaders need specific support, as automatic merging through the driver will likely result in pretty limited advantages. But the main improvement of the completely new raster engines with the tile-based rasterization (where AMD claimed also to do HSR within the bins before rasterization; the L2 backing of the ROPs to store framebuffer tiles plays its part in the backend) should work completely transparently to the app/game. After all, it also does so in Maxwell and Pascal, right?
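To make the "transparent to the app" point concrete, here's a toy sketch of binning plus per-bin HSR. It's my own illustration of the general technique, not Vega's actual DSBR: the same triangles go in either way, and only the shading order and the fragment count change, which is exactly why the application never has to know:

```python
# Toy binned rasterizer with per-bin hidden surface removal: my own
# sketch of the general idea, not Vega's actual DSBR hardware. Triangles
# are first sorted into screen tiles; each tile then resolves depth
# locally before shading, so every pixel is shaded once and the
# framebuffer tile can live in an on-chip cache (cf. L2-backed ROPs).
TILE = 8

def edge(a, b, p):
    # signed area of (a, b, p); >= 0 means p is left of the edge a -> b
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def covers(verts, p):
    a, b, c = verts
    return edge(a, b, p) >= 0 and edge(b, c, p) >= 0 and edge(c, a, p) >= 0

def bin_triangles(prims):
    # conservative bounding-box binning into TILE x TILE screen tiles
    bins = {}
    for verts, z in prims:
        xs = [v[0] for v in verts]; ys = [v[1] for v in verts]
        for bx in range(int(min(xs)) // TILE, int(max(xs)) // TILE + 1):
            for by in range(int(min(ys)) // TILE, int(max(ys)) // TILE + 1):
                bins.setdefault((bx, by), []).append((verts, z))
    return bins

def shade_bin(bx, by, prims):
    shaded = covered = 0
    for y in range(by * TILE, (by + 1) * TILE):
        for x in range(bx * TILE, (bx + 1) * TILE):
            hits = [z for verts, z in prims if covers(verts, (x + .5, y + .5))]
            covered += len(hits)          # what immediate mode would shade
            if hits:
                _ = min(hits)             # HSR: only the nearest survives
                shaded += 1
    return shaded, covered

# Two overlapping CCW triangles at different depths:
prims = [(((0, 0), (8, 0), (0, 8)), 1.0),
         (((0, 0), (7, 0), (0, 7)), 0.5)]
shaded = covered = 0
for (bx, by), ps in bin_triangles(prims).items():
    s, c = shade_bin(bx, by, ps)
    shaded += s; covered += c
print(f"shaded {shaded} fragments instead of {covered}")  # 36 vs 64
```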
 
Not at all. The primitive shaders need specific support, as automatic merging through the driver will likely result in pretty limited advantages. But the main improvement of the completely new raster engines with the tile-based rasterization (where AMD claimed also to do HSR within the bins before rasterization; the L2 backing of the ROPs to store framebuffer tiles plays its part in the backend) should work completely transparently to the app/game. After all, it also does so in Maxwell and Pascal, right?

Only if it works like the iteration used in Maxwell and Pascal and doesn't have some hidden bottleneck.
 
Not at all. The primitive shaders need specific support, as automatic merging through the driver will likely result in pretty limited advantages. But the main improvement of the completely new raster engines with the tile-based rasterization (where AMD claimed also to do HSR within the bins before rasterization; the L2 backing of the ROPs to store framebuffer tiles plays its part in the backend) should work completely transparently to the app/game. After all, it also does so in Maxwell and Pascal, right?

Right. So it's even more troubling for me right now. If it's not a driver problem, then something is screwed up. It's performing like an OC'd Fiji (and in some benchmarks you could argue that a Fiji at 1600 MHz would do better).
 
Only if it works like the iteration used in Maxwell and Pascal and doesn't have some hidden bottleneck.
I remember AMD saying it works without application support. And what would that "hidden bottleneck" be?


Right. So it's even more troubling for me right now. If it's not a driver problem, then something is screwed up. It's performing like an OC'd Fiji (and in some benchmarks you could argue that a Fiji at 1600 MHz would do better).
That it should work without application support doesn't mean it doesn't need driver support. All tiling rasterizers have a fallback to conventional rasterization. One can simply switch the binning off in the driver (or, the other way around: the driver has to configure it properly for it to work).

I have no clue whether there are switched-off features in the current driver release, but I wouldn't rule it out either.
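As a trivial sketch of what "switched off in the driver" could look like (the flag name is invented for illustration), the point is just that the fallback path is always there, so shipping with binning unconfigured would look exactly like Fiji:

```python
# Hypothetical driver-profile switch; "dsbr_enabled" is an invented name.
# Binning must be configured explicitly; the conventional immediate-mode
# path remains as the fallback if it isn't.
def choose_raster_path(driver_profile):
    if driver_profile.get("dsbr_enabled", False):
        return "binned"       # tile-based path, needs driver setup
    return "immediate"        # Fiji-like behaviour if left unconfigured

print(choose_raster_path({}))                      # -> immediate
print(choose_raster_path({"dsbr_enabled": True}))  # -> binned
```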
 
I would wait for mass-produced cards; early silicon tends to get the shaft when it comes to power consumption. Still, the performance per clock doesn't seem promising.

They _are_ mass-produced. It's not like 16/14 nm FinFET is an unknown variable at this point. Nor are the circuits on the Vega dies hand-carved by specially employed virgins from Venus or something. We're talking about a shipping product here, not some super-early pre-alpha stuff (which might have been valid for the prototype shown back in December in Sonoma).
 