AMD Vega Hardware Reviews

GF's process likely plays a primary role in why Zen lags in clock speed despite having roughly as many pipeline stages as its competition. Quirks like the size of the CCX and other implementation choices were likely influenced by the limits on complexity and performance of the process they were implemented on.

That Zen managed to hit a decent optimization point as a highly clocked CPU doesn't mean a GPU would be as successful; AMD's description of the process tweaks for its prior APUs suggests it could be the opposite. If AMD had to choose where its priorities lay, it would make sense to tilt the tables in favor of the higher-priced CPU.

That's not to say that the process is the deciding factor for Vega, just that a GPU's situation can be significantly different from a CPU's on the same node.

Aren't CPUs less efficient due to their "ultra high" frequencies? Even when Zen's frequencies are behind Intel's, we are talking about 3.5-4 GHz ranges vs Vega's 1.5-1.7 GHz. As I look at it, AMD engineered Zen so well that even while lacking process efficiency compared to its competition (which has an advantage over everyone else in this regard), it manages to beat it anyway. In Vega we see a difference in every single metric that is almost more than a generation away, and I don't mean raw power or area but even transistor count... 12.5B vs 7.2B is roughly 74% more transistors. If AMD needs ~74% more transistors just to get the performance of Nvidia's last generation (Volta is almost out), then AMD has a big problem.
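For reference, the arithmetic behind that comparison, using the figures in the post: (12.5 − 7.2) / 7.2 ≈ 0.74, i.e. Vega 10's 12.5B transistors are roughly 74% more than the competing chip's 7.2B.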
 
Aren't CPUs less efficient due to their "ultra high" frequencies? Even when Zen's frequencies are behind Intel's, we are talking about 3.5-4 GHz ranges vs Vega's 1.5-1.7 GHz.
There are different physical implementation choices between dense ASICs and high-speed CPUs.
When AMD presented its details on the process customizations for Carrizo, they said as much for items such as the choice of pitches for the various metal layers. The CPU-specific process had a small number of small-pitch wires, and then successive layers that rapidly grew in size. The GPU process had metal layers that grew gradually in size.

This matters based on the electrical needs of the device. CPUs that need to send signals far and quickly benefit from the low resistance of the larger but less dense metal layers, while the denser GPUs that generally have shorter distances to travel opt for the ability to cram more wires in.
The choice in transistor properties and tweaks to the process can change the suitability of a process for something like high-speed digital performance, which tends to be leakier and less dense, versus a device focused on density and modest speeds.
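A toy first-order model shows why the metal stack matters: delay over a long distributed-RC wire grows with resistance per unit length, which drops as the wires get wider and thicker. This is only an illustrative sketch; the pitch, resistivity, and capacitance numbers below are assumptions, not Carrizo process data.

# Distributed-RC estimate: delay ~ 0.5 * r * c * L^2, with r and c per mm.
# All numbers are illustrative assumptions, not real process figures.
def wire_delay_ps(length_mm, width_um, thickness_um,
                  rho_ohm_um=0.02, c_ff_per_mm=200.0):
    r_ohm_per_mm = rho_ohm_um * 1000.0 / (width_um * thickness_um)  # R = rho * L / (w * t)
    c_f_per_mm = c_ff_per_mm * 1e-15
    return 0.5 * r_ohm_per_mm * c_f_per_mm * length_mm ** 2 * 1e12   # seconds -> picoseconds

# A fat, coarse-pitch "CPU style" wire vs a narrow, dense "GPU style" wire
# driving the same 2 mm route: the wide wire is roughly an order of magnitude faster.
print(wire_delay_ps(2.0, width_um=0.40, thickness_um=0.40))  # ~50 ps
print(wire_delay_ps(2.0, width_um=0.10, thickness_um=0.15))  # ~530 ps

The trade-off is that the coarse-pitch layers fit far fewer wires per millimeter of routing, which is exactly the density a GPU wants.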

For Carrizo, the process was tweaked to be a sort of compromise between the two, and Carrizo's clock speed continued the gradual decline of the Bulldozer variants, which had started on a high-performance SOI process.

Saying one device type does well on a given node doesn't necessarily mean a device with very different needs will readily succeed.
 
Apparently Newegg has taken to only selling the cards with the monitor bundle. If you look at the bundle page they have three brands of the air cooled cards in stock, but none individually.
 
Apparently Newegg has taken to only selling the cards with the monitor bundle. If you look at the bundle page they have three brands of the air cooled cards in stock, but none individually.
Definitely a way to deter the miners.
 
The wafer shots are not detailed enough to show it, but does this mean you believe there are PHYs for off-die links in Vega 10?

Well, I am saying that Vega is a repeatable die, and that AMD's future with the Vega architecture is to do with Vega what they did with Zen: eventually placing two GPUs within one SoC, using Infinity Fabric.

No need for a PLX or cross-board interconnect; the fabric will allow two GPUs to share the same L2 memory using the HBCC. The problem right now is that two Vegas would use a lot of wattage. How far AMD has to downclock to get two GPUs under control is the big issue. But connecting them is a straightforward affair.

I am sure these multi-GPU designs are already taped out and prototyped. Perhaps AMD is waiting for 7nm Navi to release the mainstream consumer version, but I am comfortable we will see a downclocked high-end 14nm RX Vega x2 some time in 2017. We may even see x3 and x4 designs under Navi too...

Given Infinity Fabric's robust nature, this is inevitable.
 
Aren't CPUs less efficient due to their "ultra high" frequencies? Even when Zen's frequencies are behind Intel's, we are talking about 3.5-4 GHz ranges vs Vega's 1.5-1.7 GHz.
Not from the frequencies as much as from the other hardware needed to facilitate them: caches, prediction, low-latency memory. Parts the GPU skips by overlapping threads. Consider that a 16c/32t Threadripper has roughly the same amount of cache as a 4096-core Vega. The distribution is different, but it shows where each architecture is built out.

No need for a PLX or cross-board interconnect; the fabric will allow two GPUs to share the same L2 memory using the HBCC. The problem right now is that two Vegas would use a lot of wattage. How far AMD has to downclock to get two GPUs under control is the big issue. But connecting them is a straightforward affair.
Only if AMD included enough lanes for the fabric. We really need someone to check an SSG for a PLX bridge to get an idea. I'd speculate Vega has 64 lanes for SSG or Infinity Fabric with APUs or x2 variants. Still missing a die shot or a closer inspection of the pro cards to know for sure.

The wattage isn't an issue considering how high above nominal voltages they are running. In power saving mode, getting two on a board in the same power envelope isn't unreasonable. It almost seems like the goal, but we haven't seen it. Might be a matter of ironing out driver issues first, as that seems to be the issue with Vega.
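As a rough back-of-the-envelope (the scaling factors are assumptions for illustration, not measured Vega numbers): dynamic power scales roughly with f·V², so dropping clocks ~15% and core voltage ~10% gives about 0.85 × 0.9² ≈ 0.69 of the original power. Two GPUs downclocked that way would draw around 1.4× a single stock card rather than 2×, and deeper undervolting from the generous stock voltages closes the gap further.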
 
@CarstenS computerbase is saying the primitive shader is still inactive, and they apparently talked with AMD about it. Is this last part a translation error or have they really confirmed this with AMD?
 
Funny, you are the second person who asks me about Computerbase - which I haven't worked for in about 12 years now. :)
But as a non-involved native German speaker:

If you refer to this paragraph „Problem: Auch dieses Funktion ist im Treiber derzeit noch deaktiviert. Vega nutzt aktuell also nur die traditionelle Pipeline. Wann sich das ändert? AMD nennt keinen Termin.“ (roughly: "Problem: this function, too, is currently still deactivated in the driver. So Vega currently only uses the traditional pipeline. When will that change? AMD gives no date."), the start is very clear in German: it is currently deactivated in the driver. The last part where AMD is mentioned just says „AMD does not give a date“ - no indication whether they are going by publicly available information or whether they actually talked with AMD about this issue. You could interpret from the wording that they talked to AMD, but they do not state this explicitly.
 
If you had asked me after Polaris whether AMD could make a bigger mess of a launch, I would have said no.

But obviously they are capable of doing that. The whole Vega thing is a giant PR disaster.


Yes, but communication and marketing aside, both Vega cards are actually pretty good for their price. Even more so when we consider the current state of mid-range prices with Polaris 10 and GTX 1070 cards.

There's really nothing terribly bad with the cards.
In power saving mode, performance stays practically the same while consuming much less power, with lower heat output and noise. This should have been the default setting, and I wonder if it was their original idea, had the launch drivers been more mature and the pressure to match the GTX 1080 not crept in too much.

Reference coolers are pretty good too. They're blowers using vapor chamber heatsinks and good quality fans. It's nothing like the Hawaii launch that was tainted by their terrible coolers.
 
Compared to the competition it still does not look good. Even in power saving mode, it still consumes more than the vanilla NV competition and is a bit slower, or you go for an NV custom design which uses about the same amount of power but is clearly faster.
 
@CarstenS computerbase is saying the primitive shader is still inactive, and they apparently talked with AMD about it. Is this last part a translation error or have they really confirmed this with AMD?
That can be a tricky question. Assuming Windows combines stages like Linux does, they always use a primitive shader; the question is what it's doing. Running the default path, throwing in optimizations like deferred attribute interpolation, culling to varying degrees, or running something unique at 17 primitives per clock or whatever the number is up to now.

At the very least it works on the energy benchmark, so it's enabled to some degree. Being widely enabled where it makes sense and optimally tuned is another matter.

There's really nothing terribly bad with the cards.
In power saving mode, performance stays practically the same while consuming much less power, with lower heat output and noise. This should have been the default setting, and I wonder if it was their original idea, had the launch drivers been more mature and the pressure to match the GTX 1080 not crept in too much.
I have the same sentiments. They should have shipped it in the low-power state and acknowledged there is driver work to do, leaving the cards as strong overclockers. Unless AVFS or something else will change that situation shortly. Not that there has been an in-depth tech review yet, but covering just what is, isn't, and will be working would be a nice start.

Compared to the competition it still does not look good. Even in power saving mode, it still consumes more than the vanilla NV competition and is a bit slower, or you go for an NV custom design which uses about the same amount of power but is clearly faster.
Doesn't look good, but that's not surprising with a lot of features needing work and tuning to be done.
 
If you had asked me after Polaris whether AMD could make a bigger mess of a launch, I would have said no.

But obviously they are capable of doing that. The whole Vega thing is a giant PR disaster.

One would think, but the share price is up and AMD got a huge write up today claiming their CPUs were going to take lots of market share (I'll find the link later).
 
Compared to the competition it still does not look good. Even in power saving mode, it still consumes more than the vanilla NV competition and is a bit slower, or you go for an NV custom design which uses about the same amount of power but is clearly faster.

I'm not quite sure I would agree with that. Vega 56 when undervolted and overclocked appears to pretty clearly outperform an overclocked 1070:
[Chart: Vega 56 undervolted and overclocked vs. overclocked GTX 1070]

However, the drivers are still so raw that only limited undervolting and overclocking testing has been done so far. The undervolting testing I've seen suggests that Vega 56 should be able to deliver ~15-20% gains from overclocking while undervolting to around 230 watts or even slightly less, depending on silicon-lottery results (see GN's review article and undervolting livestream). That is even before considering how much Vega's half-finished drivers might be leaving on the table right now.
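A quick sanity check on what that would mean for efficiency (the baseline is an assumption for illustration, not a measurement): if a stock Vega 56 also sat around 230 W, then a 15-20% performance gain at roughly unchanged power is a straight 15-20% perf/W improvement; if the undervolted card ends up slightly below 230 W while stock sits higher, the perf/W gain is larger still.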
 
Don't confuse Primitive Shaders and DSBR.
I'm assuming the primitive shaders are partially responsible for setting up the bins, although the lines are a bit blurry and not well documented. Need to check the Linux notes, but I thought they merged the first stages into a single shader, which I think are surface and primitive, with the DSBR picking up from there. Frustum culling would ideally happen prior to tessellation, when desirable, which isn't indicated, but it couldn't be done safely without some vertex shading. The DSBR probably relies on a primitive shader to pack primitives or patches to facilitate sorting and maybe compress data, but that occurs transparently through the driver. Speculating on that, as it's something the driver would insert, but roughly sorting patches would be much easier than triangles, for example, as they'd be related spatially.
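As a rough illustration of the "some vertex shading before culling" point (the names and structure here are hypothetical, not AMD's actual primitive-shader code): a position-only transform has to run first so the frustum test can be done in clip space, and then whole primitives can be conservatively dropped before any heavier shading or binning.

# Toy sketch: conservative frustum culling after a position-only vertex transform.
# Hypothetical illustration only; not AMD's primitive-shader implementation.
import numpy as np

def position_only_vs(positions, mvp):
    """Transform object-space positions (N, 3) into clip space (N, 4)."""
    homo = np.hstack([positions, np.ones((positions.shape[0], 1))])
    return homo @ mvp.T

def frustum_cull(clip_verts, triangles):
    """Drop triangles whose three vertices all fall outside the same clip plane."""
    x, y, z, w = clip_verts[:, 0], clip_verts[:, 1], clip_verts[:, 2], clip_verts[:, 3]
    # outside[i, p] is True when vertex i fails clip plane p (OpenGL-style -w..w depth).
    outside = np.stack([x < -w, x > w, y < -w, y > w, z < -w, z > w], axis=1)
    kept = []
    for tri in triangles:
        if not outside[list(tri)].all(axis=0).any():
            kept.append(tri)  # survives: not entirely outside any single plane
    return kept

Doing the same thing at patch granularity before tessellation is the cheaper win described above, since one culled patch can remove many post-tessellation triangles.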

One would think, but the share price is up and AMD got a huge write up today claiming their CPUs were going to take lots of market share (I'll find the link later).
"Our industry checks show improving mindshare/shelf-space for AMD's new Ryzen desktop-PC processors, incl. 30-50% share at prominent e-tailors, well ahead of AMD's 11% current desktop unit share,"
http://www.cnbc.com/2017/08/15/amd-to-surge-more-than-40-percent-analyst.html
 
I'm assuming the primitive shaders are partially responsible for setting up the bins, although the lines are a bit blurry and not well documented. Need to check the Linux notes, but I thought they merged the first stages into a single shader, which I think are surface and primitive, with the DSBR picking up from there. Frustum culling would ideally happen prior to tessellation, when desirable, which isn't indicated, but it couldn't be done safely without some vertex shading. The DSBR probably relies on a primitive shader to pack primitives or patches to facilitate sorting and maybe compress data, but that occurs transparently through the driver. Speculating on that, as it's something the driver would insert, but roughly sorting patches would be much easier than triangles, for example, as they'd be related spatially.
Ok.
 