As you said, the TMUs (GCN3/4) can already read (sample) variable-width DCC targets, meaning that the DCC metadata needs to be loaded into the L1 and L2 caches. I imagine Vega's ROPs (tiny L1 ROP caches) and TMUs (CU L1 caches) load the DCC metadata from the shared L2 cache. I would assume that you need to flush both the ROP L1 caches and the CU L1 caches when transitioning a render target to readable. But since these caches are inclusive (L2 also holds the L1 lines), this flush never causes any memory traffic. This should be very fast compared to the current ROP cache + L2 cache flush.
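The "no memory traffic" claim can be sketched with a toy model of an inclusive two-level hierarchy. This is a hypothetical simplification (the class and its fields are made up for illustration, not any real driver or hardware interface): because inclusion guarantees that L2 holds a copy of every L1 line, flushing L1 is just an invalidate and produces zero writebacks.

```python
# Toy model of an inclusive L1/L2 hierarchy (hypothetical sketch).
# Flushing L1 is invalidate-only: inclusion means L2 already has
# every L1 line, so no writeback to memory is required.

class InclusiveHierarchy:
    def __init__(self):
        self.l1 = {}          # addr -> data (small per-unit cache)
        self.l2 = {}          # addr -> data (shared, inclusive of L1)
        self.mem_writes = 0   # count of writebacks to memory

    def fill(self, addr, data):
        # A fill populates L2 first, then L1, enforcing inclusion.
        self.l2[addr] = data
        self.l1[addr] = data

    def flush_l1(self):
        # Every L1 line must already be present in L2.
        for addr in self.l1:
            assert addr in self.l2   # inclusion property
        self.l1.clear()              # invalidate; mem_writes unchanged

h = InclusiveHierarchy()
h.fill(0x100, "dcc-meta-A")
h.fill(0x140, "dcc-meta-B")
h.flush_l1()
assert h.mem_writes == 0 and not h.l1 and 0x100 in h.l2
```

Contrast this with a writeback flush of the L2 itself, where dirty lines do have to travel to memory, which is why the full ROP cache + L2 flush is the slow path.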
I see that there would be an effective capacity benefit to having the ROP L1s participate in the compression scheme, but it would be a change, since the current method seems to keep the compression in the writeback path of the ROP caches back to memory.
I could see it being more difficult to perform operations that write data across the parallel units of the RBE when the alignment and content at a ROP's specific position can require varying amounts of work to read+decompress, modify, and then write+recompress (rinse and repeat).
Structuring hardware to somehow work natively on compressed data would be an interesting exercise, but generally I'd expect it would just decompress before working on the data--which means there's storage on the order of the uncompressed line somewhere.
Some special cases might be easier to manage, like an all-zero or all-one color, but even then, after one operation there's a decent chance that the output will require storage on the order of the original L1 lines to hold it while the compression is reapplied.
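The round trip above can be illustrated with a deliberately trivial compression scheme (entirely made up for this sketch, not the real DCC format): a uniform block compresses to a single value, but one blended pixel breaks uniformity, so the intermediate buffer has to hold the full uncompressed footprint while recompression runs.

```python
# Hypothetical sketch of read+decompress, modify, write+recompress.
# The toy scheme stores uniform blocks as one value and everything
# else raw; it stands in for a real lossless color compressor.

BLOCK_PIXELS = 16

def compress(pixels):
    # Uniform fast path vs. raw fallback.
    if len(set(pixels)) == 1:
        return ("uniform", pixels[0])
    return ("raw", list(pixels))

def decompress(block):
    kind, payload = block
    if kind == "uniform":
        return [payload] * BLOCK_PIXELS
    return list(payload)

def blend_one(block, index, color):
    # The intermediate 'pixels' buffer is the full uncompressed line:
    # this is the "storage on the order of the original" in the text.
    pixels = decompress(block)
    pixels[index] = color
    return compress(pixels)

cleared = compress([0] * BLOCK_PIXELS)   # easy all-zero special case
assert cleared[0] == "uniform"
touched = blend_one(cleared, 3, 255)     # a single write to the block
assert touched[0] == "raw"               # uniformity lost, full storage
```

Even starting from the cheapest possible state (a cleared block), a single modification is enough to push the block off the fast path.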
The variability in compression would present an interesting wrinkle for the L2 and for cache flushes, since behavior would vary with the policies of the RBE caches and with when a line's compressed footprint becomes known.
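One way to picture that wrinkle, under the assumption (mine, not the source's) that compression happens at eviction: the bytes a dirty line contributes to a flush are only known once it is compressed, so total writeback traffic is data-dependent rather than a fixed lines-times-size product.

```python
# Toy illustration (hypothetical policy): a dirty RBE line's writeback
# footprint is determined at eviction time, when it gets compressed.

LINE_BYTES = 64

def compressed_footprint(pixels):
    # Same toy scheme as before: uniform lines shrink to a small
    # header + value, mixed lines write back at full size.
    return 8 if len(set(pixels)) == 1 else LINE_BYTES

dirty_lines = [
    [0] * 16,            # cleared line: compresses well
    [0] * 15 + [255],    # one touched pixel: full footprint
    [7] * 16,            # uniform non-zero: compresses well
]
writeback = sum(compressed_footprint(p) for p in dirty_lines)
assert writeback == 8 + LINE_BYTES + 8   # 80, not 3 * LINE_BYTES
```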