AMD Vega 10, Vega 11, Vega 12 and Vega 20 Rumors and Discussion

As previously stated by AMD to PCPer and GamersNexus, the Vega FE driver already has all the gaming optimizations there will be up until the RX release. So the RX driver really could have had almost nothing new to bring to the table in that regard. AMD already implied not to expect great differences due to the activation of DSBR.

Who said memory bandwidth needed a driver to be corrected?

If RX Vega performs pretty much exactly like Vega FE, then RX Vega will perform clock-for-clock pretty much exactly like Fiji (as Vega FE did when GN tested them). Considering that Fiji did zero primitive culling and had no TBR at all, RX Vega performing identically clock-for-clock either indicates that primitive culling, DSBR and HBCC are worth literally 0 fps in gaming and every second that RTG has spent working on those features has been thrown away, or that there is a serious bottleneck elsewhere in Vega that is holding its performance back to merely tying Fiji clock-for-clock.

Tom's Hardware tested Vega FE in a bunch of pro 3D rendering workloads and found Vega FE was trading blows with a Quadro P6000 in Creo 3.0, Solidworks 2015, and 3ds Max, so that seems to suggest the bottleneck shouldn't be geometry performance.

On the other hand, B3D Suite testing on Vega FE does seem to indicate a ~20-30% regression vs. Fiji in raw memory bandwidth, effective texture bandwidth, and texture fill rate. This led me to infer that Vega appears to be memory bandwidth bottlenecked in gaming (and ETH mining). I assumed that it would not be logical for RTG to deliberately design in such a regression, and that Vega had been delayed enough that any hardware issue should have been caught and corrected, thus I am left to figure there must be some sort of BIOS or driver issue that RTG may be able to correct to eliminate at least some of the memory bandwidth regression vs. Fiji, and that reducing that regression would simultaneously increase gaming performance and ETH hashrate. The reports of RX Vega's hashrate doubling compared to Vega FE had led me to think that RTG might have fixed the memory bandwidth regression (and that would have been consistent with the leaked Vega 56 benchmarks), but this TimeSpy score does not appear to support the idea that the memory bandwidth regression vs. Fiji has been corrected.

Bear in mind I'm still a total novice in understanding GPUs so I might be totally misunderstanding the situation.
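
To spell out what I mean by "clock-for-clock" above, here is a minimal sketch of the normalization; the frame rates and clocks in it are made-up placeholders, not GN's measurements.

```python
def per_clock_perf(fps, clock_mhz):
    """Frames per second delivered per MHz of sustained core clock."""
    return fps / clock_mhz

# Placeholder numbers only, not benchmark results.
fiji_fps, fiji_clock = 60.0, 1050    # hypothetical Fury X game result
vega_fps, vega_clock = 80.0, 1400    # hypothetical Vega FE game result

ratio = per_clock_perf(vega_fps, vega_clock) / per_clock_perf(fiji_fps, fiji_clock)

# A ratio near 1.0 means the new features (culling, DSBR, HBCC) are adding
# nothing per clock; a ratio well above 1.0 would show real per-clock gains.
print(f"Vega/Fiji per-clock ratio: {ratio:.2f}")
```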
 
Yes, indeed.

As previously stated by AMD to PCPer and GamersNexus, the Vega FE driver already has all the gaming optimizations there will be up until the RX release. So the RX driver really could have had almost nothing new to bring to the table in that regard.

AMD already implied not to expect great differences due to the activation of DSBR.

Who said memory bandwidth needed a driver to be corrected?

Nope, GamersNexus reports that DSBR and some power features are off! So not everything is activated.
http://www.gamersnexus.net/news-pc/3005-amd-moving-away-from-crossfire-with-rx-vega


Also, the bandwidth is really strange. It should be the same as Fiji's, but it's less. How can that be?
http://www.tomshardware.com/news/amd-radeon-rx-vega-64-specs-availability,35112.html
And the rasterizer should also help bandwidth performance!
 
Also, the bandwidth is really strange. It should be the same as Fiji's, but it's less. How can that be?
Vega has half as many stacks, and they are not clocked twice as high. That's the explanation for the difference in peak bandwidth from Fiji.
As for why they couldn't reach 2.0 Gbps with HBM2, it hasn't been explained. It could be a matter of the DRAM devices not being able to reach those speeds at the currently offered DRAM process or chip revisions. Some of the reported memory voltages seem like they are being pushed.

Even if they were at 2.0 Gbps, not every source of latency for DRAM is measured in terms of clock cycles. Having half the channels can mean there are more chances for bank conflicts or incurring turnaround latency.
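
Just to put numbers on the "half the stacks, not twice the clock" point, a quick sketch of the peak-bandwidth arithmetic (using the commonly reported 1.0 Gbps per pin for Fiji's HBM1 and 1.89 Gbps for Vega 64's HBM2):

```python
def peak_bw_gb_s(stacks, bus_bits_per_stack, gbps_per_pin):
    """Peak DRAM bandwidth in GB/s: total bus width (bits) * data rate / 8."""
    return stacks * bus_bits_per_stack * gbps_per_pin / 8

fiji   = peak_bw_gb_s(stacks=4, bus_bits_per_stack=1024, gbps_per_pin=1.0)   # HBM1
vega64 = peak_bw_gb_s(stacks=2, bus_bits_per_stack=1024, gbps_per_pin=1.89)  # HBM2

print(fiji, vega64)       # 512.0  483.84
print(1 - vega64 / fiji)  # ~0.055 -> about 5.5% lower peak than Fiji
```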
 
1.89 does not equal 2.
Beyond that, there are multiple ways to penalize DRAM unrelated to speed.
OK, 1.89 is not 2, but it's also not 1. Vega should theoretically be about 5% behind Fiji. But in real measurements (Beyond3D suite) it's 50%. Ten times more? Really?

Because you have half the interface width, managing the bandwidth should be easier on Vega. Also, there should be a new compression algorithm in Vega, but nothing has shown up so far.
 
I thought this was already discussed: the Beyond3D suite isn't completely right on the bandwidth test for Vega. It seems something is bugging it a little bit? I think we were talking about it something like 10 pages ago (or more, or less, I haven't checked which page).
 
I have a simple question.
When discussing "bandwidth", between which components within a GPU does the back-and-forth bandwidth matter, and at what bit width?

Memory?
Generally memory, but it may be PCIe or an interconnect. Obviously there are a lot more buses, but in the case of a GPU or throughput processor they are always sized to the computational parts, which leaves memory as the primary topic.

That has been known for a while, and most FP64 tasks are accelerated by generic clusters because memory quickly becomes a factor. So the work falls to the giant CPU-based racks that make up most supercomputers. With all the cores and I/O on Epyc, that will likely be much of the FP64 focus, although Vega 20 is coming with FP64.
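
A rough sketch of why the buses end up sized to the compute; the clock here is an assumed round number and the bandwidth approximate, so treat the result as order-of-magnitude only:

```python
# Arithmetic intensity needed to keep a Vega-64-class chip fed from DRAM.
alus = 4096                   # FP32 lanes
flops_per_lane_per_clk = 2    # an FMA counts as two FLOPs
clock_ghz = 1.5               # assumed sustained clock
peak_gflops = alus * flops_per_lane_per_clk * clock_ghz   # ~12,288 GFLOPS

peak_bw_gb_s = 484            # ~2 HBM2 stacks at 1.89 Gbps

# FLOPs that must be done per byte fetched from DRAM to avoid being
# bandwidth-bound (ignoring caches and data reuse).
print(peak_gflops / peak_bw_gb_s)   # ~25 FLOPs per byte
```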

Curiously, I didn't find anything when searching for "crypto" or "mining". Weren't there mining-specific instructions?
I'm pretty sure there was a slide claiming that.
The crypto instructions tend to be binary operators and integer operations for performing checksums and addressing. So they do have other uses, just not normally in graphics. They are useful for memory management, sorting, and error checking.
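
To give a feel for that instruction mix, here's a generic sketch of the rotate/xor/add style of integer work that hashing and checksumming reduce to; this is not the actual GCN instruction sequence, just an illustration.

```python
MASK32 = 0xFFFFFFFF

def rotr32(x, n):
    """Rotate a 32-bit value right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK32

def mix(state, word):
    """One toy mixing round: add, xor, rotate - pure integer ALU work."""
    state = (state + word) & MASK32
    state ^= rotr32(state, 7)
    state = (state + rotr32(state, 13)) & MASK32
    return state

def toy_checksum(words, seed=0x9E3779B9):
    state = seed
    for w in words:
        state = mix(state, w)
    return state

print(hex(toy_checksum([0xDEADBEEF, 0xCAFEBABE, 0x12345678])))
```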
 
OK, 1.89 is not 2, but it's also not 1. Vega should theoretically be about 5% behind Fiji. But in real measurements (Beyond3D suite) it's 50%. Ten times more? Really?
Is this going by the 8x bandwidth tests?

DRAM utilization can fall if there are other differences, such as the level of tuning for the controllers and device latencies. If the controller's scheduling is not able to avoid device penalties, DRAM performance can drop pretty quickly.
The compression pipeline also has a limited capacity for tracking compression data. If its metadata cache is thrashed, the pipeline itself adds DRAM traffic and delays processing of compressed data, since no access can complete until the compression data is loaded.

A thrashing case can hurt a system with fewer channels more. AMD's Fury launch materials indicated that the loaded latency of Fury's memory channels could be better than GDDR5's despite the low clock speed, because conflicting access patterns could be spread over more independent channels.

Because you have half the interface width, managing the bandwidth should be easier on Vega.
With the same or heavier load and equivalent access patterns, that statement is often not true.
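
A toy model of that point, heavily simplified (uniform random addresses, a fixed penalty when consecutive accesses hit the same channel, and purely illustrative channel counts), just to show the direction of the effect when you halve the channel count under the same load:

```python
import random

def avg_access_cost(channels, accesses=100_000, base=1.0, conflict=3.0, seed=0):
    """Average cost per access when back-to-back hits to one channel pay a penalty."""
    rng = random.Random(seed)
    last = None
    total = 0.0
    for _ in range(accesses):
        ch = rng.randrange(channels)
        total += base + (conflict if ch == last else 0.0)
        last = ch
    return total / accesses

print(f"16 channels: {avg_access_cost(16):.3f}")   # ~1.19
print(f" 8 channels: {avg_access_cost(8):.3f}")    # ~1.38
```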
 
The mining instructions (in Nvidia and AMD) are hashing instructions. They are not directly aimed at mining (but they are used by blockchain mining).
 
RX Vega performing identically clock-for-clock either indicates that primitive culling, DSBR and HBCC are worth literally 0 fps in gaming and every second that RTG has spent working on those features has been thrown away
Vega FE performed between the 1070 and the 1080 except in a few games. We have indications that RX Vega is around the 1080, slightly below or slightly above, which is probably due to higher clock speeds plus some limited driver improvements.

Nope, GamersNexus reports that DSBR and some power features are off! So not everything is activated.
http://www.gamersnexus.net/news-pc/3005-amd-moving-away-from-crossfire-with-rx-vega
We know it's off; we also know AMD doesn't expect great things from its activation, in regard to games anyway. Even their marketing slides didn't indicate that. So I don't really see the point of making a big deal of things AMD themselves are not.
 
Vega FE performed between the 1070 and the 1080 except in a few games. We have indications that RX Vega is around the 1080, slightly below or slightly above, which is probably due to higher clock speeds plus some limited driver improvements.


We know it's off; we also know AMD doesn't expect great things from its activation, in regard to games anyway. Even their marketing slides didn't indicate that. So I don't really see the point of making a big deal of things AMD themselves are not.

It depends on the game, I would guess. It could be enough to move some games from losing to a 1080 to matching it. Will be interesting to see what happens.
 
VERY INTERESTING!!!!!

Sorry, it just bursts out occasionally when I'm not technically allowed to say anything. My apologies, I'm just a bit excited. :)

I think you need another one of these.

:)
 