AMD Vega Hardware Reviews

This is interesting. I have never heard of anything like this happening on NVIDIA cards (or any other cards for that matter), though I do remember reading that GDDR5 had a feature like that.
GDDR5 and HBM2 have that as part of their specification; HBM1 notably did not. Not all cards implement the full specification, though. For example, the first use of GDDR5 in the HD 4xxx series didn't.
http://www.anandtech.com/show/2841/12
Overclocking attempts that previously would push the bus too hard and lead to errors now will no longer do so, making higher overclocks possible. However this is a bit of an illusion as retransmissions reduce performance. The scenario laid out to us by AMD is that overclockers who have reached the limits of their card’s memory bus will now see the impact of this as a drop in performance due to retransmissions, rather than crashing or graphical corruption.
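
To make the retry behavior concrete, here's a toy Python sketch of the idea (not the actual GDDR5/HBM2 link protocol; the CRC and error model are just stand-ins): a burst that fails its EDC check gets resent instead of being delivered corrupted, so pushing the bus too far shows up as retries and lost bandwidth rather than corruption or crashes.

```python
import random
import zlib

def send_burst(payload: bytes, error_rate: float) -> tuple[bytes, int]:
    """Toy model of a memory burst on an overstressed bus: data may arrive
    corrupted, but a CRC sent alongside it lets the controller detect the
    error and retransmit instead of returning bad data."""
    crc = zlib.crc32(payload)
    retries = 0
    while True:
        received = bytearray(payload)
        if random.random() < error_rate:        # a bit error on the link
            received[0] ^= 0x01
        if zlib.crc32(bytes(received)) == crc:  # EDC check passes
            return bytes(received), retries
        retries += 1                            # retry costs bandwidth, not correctness

data, retries = send_burst(b"burst data", error_rate=0.3)
print(f"delivered intact after {retries} retries")
```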

As mentioned above, bandwidth also depends on what clock the infinity fabric interconnect is running at. It may not be directly proportional to memory clock.
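
For reference, the simple bandwidth math only looks at the memory side; a rough sketch is below (the bus width and clock are Vega 64's advertised figures, and it deliberately ignores whatever the fabric clock does to realized bandwidth):

```python
# Theoretical HBM2 bandwidth from the memory side only. Observed bandwidth can
# additionally be gated by the Infinity Fabric clock, which this ignores.
bus_width_bits = 2048   # two HBM2 stacks, 1024 bits each (Vega 64)
mem_clock_mhz  = 945    # advertised memory clock, double data rate

gb_per_s = bus_width_bits / 8 * mem_clock_mhz * 2 / 1000
print(f"theoretical bandwidth ≈ {gb_per_s:.0f} GB/s")   # ≈ 484 GB/s
```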
 
Infinity Fabric in Vega runs on its own clock; it's not tied to anything.
 
In Vega running the primitive shader is not required. You can still run a VS like with previous architectures.
From an implementation point of view, how do they differ?

No matter what they're called, it's still just instructions executing on the same shader cores?

Is it wrong to look at it as coalescing multiple smaller shaders together and getting some optimizations out of it?

What prevented AMD (and Nvidia) from doing that in the past?
 
From an implementation point of view, how do they differ?

No matter what they're called, it's still just instructions executing on the same shader cores?

Is it wrong to look at it as coalescing multiple smaller shaders together and getting some optimizations out of it?

What prevented AMD (and Nvidia) from doing that in the past?
A Vertex Shader thread doesn't have access to neighboring vertices like a primitive shader or geometry shader. Nvidia and AMD have supported a "fast path GS" which gives access to all vertices of a primitive but doesn't support amplification. The primitive shader uses a different data path through the hardware than the VS. The whitepaper says "Primitive shaders will coexist with the standard hardware geometry pipeline rather than replacing it" and it's alluding to the different data path.

I wouldn't say anything strictly prevented an IHV from doing this before now.

There's a high-level pipeline diagram on page 6 of the whitepaper that shows the concept of reducing the number of shader stages and culling before attribute shading. Coalescing multiple shader stages is part of the Next Generation Geometry Engine, and the primitive shader is one of those stages. Technically you can use the GS to implement some of the primitive shader functionality, but the way the GS API is specified is not the best approach. Calling the new stage a primitive shader highlights that it combines the functionality of the VS and GS even if the user doesn't specify a GS. A primary feature is giving the shader access to connectivity data without requiring a GS.
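
A toy sketch of the "cull before attribute shading" idea, in Python rather than any real shading language (the pipeline split and function names here are illustrative, not how the hardware or API actually looks):

```python
# Contrast a classic pipeline, where every vertex's attributes are shaded before
# primitives are assembled and culled, with a primitive-shader-style pipeline that
# sees whole triangles (connectivity) and can cull them before attribute shading.

def shade_position(v):        # cheap: only what culling needs
    return v["pos"]

def shade_attributes(v):      # expensive: normals, UVs, colors, ...
    return {**v, "shaded": True}

def is_culled(tri):           # e.g. back-facing; needs all three vertices
    (x0, y0), (x1, y1), (x2, y2) = (shade_position(v)[:2] for v in tri)
    signed_area = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    return signed_area <= 0

def classic_pipeline(triangles):
    # Attribute shading happens per vertex, before any per-primitive culling.
    shaded = [[shade_attributes(v) for v in tri] for tri in triangles]
    return [tri for tri in shaded if not is_culled(tri)]

def primitive_shader_pipeline(triangles):
    # Culled triangles never reach attribute shading at all.
    survivors = [tri for tri in triangles if not is_culled(tri)]
    return [[shade_attributes(v) for v in tri] for tri in survivors]

tris = [[{"pos": (0, 0)}, {"pos": (1, 0)}, {"pos": (0, 1)}],   # front-facing
        [{"pos": (0, 0)}, {"pos": (0, 1)}, {"pos": (1, 0)}]]   # back-facing, culled
print(len(primitive_shader_pipeline(tris)), "triangle(s) survive culling")
```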
 
From FE overclocking, those results were a fluke, as the card starts inserting NOPs and otherwise idling. So clocks go higher but performance decreases. You have to actually test something to confirm the overclock is productive. Pascal does something similar.
Are you referring to memory or engine overclocking? Memory overclocking can lead to errors and more repeated transactions, but NOPs aren't inserted for engine overclocking. Fixed function logic will straight up fail if a signal doesn't complete the path by the next clock edge. Parts of the design with EDC or ECC may be fault tolerant and retry, but most of a design won't consist of this logic.
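
To put a rough number on the "signal has to complete the path by the next clock edge" point, here's a back-of-envelope model; the delay values are made up purely for illustration:

```python
# A path fails timing when its total delay exceeds one clock period.
clk_to_q_ns   = 0.05   # delay out of the launching flip-flop
logic_path_ns = 0.55   # combinational logic on the critical path
setup_ns      = 0.05   # setup time of the capturing flip-flop

min_period_ns = clk_to_q_ns + logic_path_ns + setup_ns
f_max_ghz = 1.0 / min_period_ns                 # ≈ 1.54 GHz for these numbers

target_ghz = 1.7                                # an "overclock" past the critical path
print(f"f_max ≈ {f_max_ghz:.2f} GHz; "
      f"{target_ghz} GHz meets timing: {1.0 / target_ghz >= min_period_ns}")
```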
 
That's fine; my point is that this chip is a major letdown from an engineering standpoint. This sucks for everyone except NVIDIA.

Is the 56 cheaper than the 1070? I can't find it listed on Newegg right now. And Vega 64 is not cheaper than the 1080; that is complete BS.


The 56 is $400 when it releases later this month. The 64 was $500 before selling out, and that would put it $40 cheaper than the GeForce 1080 on Amazon.
 
The primitive shader uses a different data path through the hardware than the VS. The whitepaper says "Primitive shaders will coexist with the standard hardware geometry pipeline rather than replacing it" and it's alluding to the different data path.

I wouldn't say anything strictly prevented an IHV from doing this before now.
The distinction is just the scope of the shader, as I understand it. Same hardware being used, without any special new functions. Maybe that 4SE one in the ISA, but I'm guessing we haven't seen all the instructions yet. A primitive shader would be inclusive of geometry and vertex shaders. Simply ignoring the greater scope, they would be identical. Unless AMD saves overhead with the old way, I'd speculate everything becomes a primitive shader for the purpose of optimization. Coexistence only from the programmer's perspective. From the driver's point of view, the hardware and programmable stages would be the same shader, just with internal functions added to the beginning and end. It's only when rasterization starts that another pipeline stage truly kicks in, as lots more threads are spawned. Will be interesting to see where they go with this in regards to dynamic memory allocations in shaders.

Are you referring to memory or engine overclocking? Memory overclocking can lead to errors and more repeated transactions, but NOPs aren't inserted for engine overclocking. Fixed function logic will straight up fail if a signal doesn't complete the path by the next clock edge. Parts of the design with EDC or ECC may be fault tolerant and retry, but most of a design won't consist of this logic.
Engine overclocking in a throttled environment. Doing nothing as a means to conserve energy in the presence of a thermal or power limit without adjusting clockspeeds. Not as effective as changing the clocks, but situationally useful. Some overclocking tests have resulted in higher core clocks, but had no appreciable impact on performance. Entirely separate issue from timing constraints.
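
A toy model of that effect, with made-up numbers: once the power limit is the binding constraint, raising the reported clock just inserts more idle cycles, so "useful" throughput stays flat.

```python
def effective_clock(clock_ghz, watts_per_ghz, power_limit_w):
    # Fraction of cycles that can do real work before the power limit bites;
    # the rest are idle/NOP cycles even though the reported clock stays high.
    duty_cycle = min(1.0, power_limit_w / (clock_ghz * watts_per_ghz))
    return clock_ghz * duty_cycle

for clk in (1.5, 1.6, 1.7):   # illustrative numbers only
    print(clk, "GHz reported ->", round(effective_clock(clk, 200, 300), 2), "GHz effective")
```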
 
Review: Asus Radeon RX Vega 64 Strix Gaming
One worry we have is with respect to pricing; the reference card is already £100 dearer than expected and having it undergo the Asus Strix Gaming treatment is likely to further inflate pricing by, perhaps, another £75. The AMD Radeon RX Vega 64, no matter if it is presented well, really isn't a £600-plus GPU, so we hope that Asus/AMD can work on pricing when this model becomes available in a month's time.
http://hexus.net/tech/reviews/graphics/109078-asus-radeon-rx-vega-64-strix-gaming/
 
However the drivers are still so raw that only limited undervolting and overclocking testing has been done so far.

I think the drivers are a problem for Vega as much as the hardware.

Actually, Nvidia has always been much better in the driver department. Their features just work.

Some observations:
-If you have the iGPU enabled, you can't change clock and voltage settings on Polaris. Even with Afterburner or other third-party utilities it's disabled. With Nvidia cards you can.
-With Nvidia drivers, you don't need to restart the PC. It doesn't ask for anything. I find that quite impressive, since nearly every installation on Windows requires a reboot, even with regular software!
-Certain features like AA can be forced on with Nvidia drivers, but work intermittently or not at all in the case of AMD (and Intel). Nvidia drivers really do override application settings. As for the other two vendors, I'm not sure what they are doing.
-AFAIK Nvidia's drivers also use less CPU?
 
AMD Radeon RX Vega 64 review
The real problem isn't the previous generation AMD GPUs, it's Nvidia's GTX 1080. There is nothing here that would convince an Nvidia user to switch to AMD right now. Similar performance, similar price, higher power requirements—it's objectively worse. In fact, the power gap between Nvidia and AMD high-end GPUs has widened in the past two years. GTX 980 Ti is a 250W product, R9 Fury X is a 275W product, and at launch the 980 Ti was around 5-10 percent faster. Two years later, the performance is closer, nearing parity, so AMD's last high-end card is only moderately worse performance per watt than Nvidia's Maxwell architecture. Pascal versus Vega meanwhile isn't even close in performance per watt, using over 100W (more than 60 percent) additional power for roughly the same performance.

It's difficult to point at any one area of Vega's architecture as the major culprit for the excessive power use. My best guess is that all the extra scheduling hardware that helps AMD's GCN GPUs perform well in DX12 ends up being a less efficient way of getting work done. It feels like Vega is to Pascal what AMD's Bulldozer/Piledriver architectures were to Intel's various Core i5/i7 CPUs. Nvidia has been able to substantially improve performance and efficiency each generation, from Kepler to Maxwell to Pascal, and AMD hasn't kept up with the pace.

http://www.pcgamer.com/amd-radeon-rx-vega-64-review/
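
Quick sanity check on the "over 100W / more than 60 percent" figure in the quoted review, using typical rated board-power numbers rather than measured draw (so treat these wattages as assumptions):

```python
gtx_1080_w = 180.0   # assumed typical board power
vega_64_w  = 295.0   # assumed typical board power

extra_w   = vega_64_w - gtx_1080_w           # ~115 W
extra_pct = 100.0 * extra_w / gtx_1080_w     # ~64 %
print(f"{extra_w:.0f} W extra, {extra_pct:.0f}% more power for similar performance")
```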
 
It's difficult to point at any one area of Vega's architecture as the major culprit for the excessive power use. My best guess is that all the extra scheduling hardware that helps AMD's GCN GPUs perform well in DX12 ends up being a less efficient way of getting work done.

Hm, I don't really know anything about anything really, but from a strictly layperson's perspective it seems likely there's no single cause of Vega's power draw. Would scheduling really account for a hundred watts of extra dissipation? You'd have to have the world's most overengineered scheduler to hit such numbers, methinks. (Also comparatively the most underperforming one, seeing as Vega, on a much bigger die, doesn't perform a whole lot better than the GF1080, or even any better at all in certain titles.)

My analysis - for what it's worth:
NV/Jen-Hsun said several years ago now (Kepler era, maybe?) that moving data around in a GPU costs more power than doing calculations on said data. He also said around Pascal's release (probably the reveal conference) that they'd taken a lot of care in laying out the chip and routing data flow, or words to that effect. Spared no expense, most likely (or very little anyhow, as Pascal reportedly cost a billion buckaroos to develop, IIRC).

We know AMD doesn't have billions and billions to spend, and what money they do have must be shared with the console SoC and x86 CPU divisions. They've also had issues hiring and retaining highly qualified staff, and from what I've read here and in other places, laying out modern microchips is very difficult work, perhaps amongst the most difficult? As a result, doesn't it seem the chances are fairly high that Vega isn't nearly as efficiently laid out as it could have been, and that much power is spent/lost just on shuffling bits around the die? The hardware units themselves might also be less efficiently designed than NV's.

Anyway, I was quite prepared for Vega being a power-hungry mother. I've been using R9 290X and 390X cards for years already, and they're quite hoggy as it is. I don't really care about power, to be honest. It's not as important to me as raw performance, features and overall capabilities. On that front, Vega does pretty well; I like that it seems to be the first fully DX12-capable chip ever. It's also a fast chip in absolute terms, even though it is not the fastest (or even consistently faster than the GF1080). Price/performance might be hella dodgy though if the "stealth" price increase ends up being the new status quo from now on.

From pre-release numbers of the ASUS Strix board, this will be the new measuring stick for Vega. Forget the default AMD OEM blower cards - they've always sucked really bad. (Well, err...) The Strix is noticeably faster, way quieter and runs around 10C cooler, even though it sometimes seems to draw even more power. :p

Nobody should settle for anything less.
 
Hm, I don't really know anything about anything really, but from a strictly layperson's perspective it seems likely there's no single cause of Vega's power draw. Would scheduling really account for a hundred watts of extra dissipation? You'd have to have the world's most overengineered scheduler to hit such numbers, methinks. (Also comparatively the most underperforming one, seeing as Vega, on a much bigger die, doesn't perform a whole lot better than the GF1080, or even any better at all in certain titles.)

My analysis - for what it's worth:
NV/Jen-Hsun said several years ago now (Kepler era, maybe?) that moving data around in a GPU costs more power than doing calculations on said data. He also said around Pascal's release (probably the reveal conference) that they'd taken a lot of care in laying out the chip and routing data flow, or words to that effect. Spared no expense, most likely (or very little anyhow, as Pascal reportedly cost a billion buckaroos to develop, IIRC).

We know AMD doesn't have billions and billions to spend, and what money they do have must be shared with the console SoC and x86 CPU divisions. They've also had issues hiring and retaining highly qualified staff, and from what I've read here and in other places, laying out modern microchips is very difficult work, perhaps amongst the most difficult? As a result, doesn't it seem the chances are fairly high that Vega isn't nearly as efficiently laid out as it could have been, and that much power is spent/lost just on shuffling bits around the die? The hardware units themselves might also be less efficiently designed than NV's.

Anyway, I was quite prepared for Vega being a power-hungry mother. I've been using R9 290X and 390X cards for years already, and they're quite hoggy as it is. I don't really care about power, to be honest. It's not as important to me as raw performance, features and overall capabilities. On that front, Vega does pretty well; I like that it seems to be the first fully DX12-capable chip ever. It's also a fast chip in absolute terms, even though it is not the fastest (or even consistently faster than the GF1080). Price/performance might be hella dodgy though if the "stealth" price increase ends up being the new status quo from now on.

From pre-release numbers of the ASUS Strix board, this will be the new measuring stick for Vega. Forget the default AMD OEM blower cards - they've always sucked really bad. (Well, err...) The Strix is noticeably faster, way quieter and runs around 10C cooler, even though it sometimes seems to draw even more power. :p

Nobody should settle for anything less.
AMD probably uses way more automated design/layout tools than NVIDIA or Intel. As a result their GPUs appear to be way less efficient at shuffling bits.

Also aren't modern Intel iGPUs fully DX12 compliant? Not that it really matters; the majority of PC gamers have NVIDIA cards (~64% according to Steam), so devs have to code with that in mind. Kind of a shame but this is reality.

P.S. does Vega do conservative rasterization? I see a bunch of talk about FP16 but conservative rasterization is much more important (and has been supported by NV since Maxwell).
 