AMD Vega 10, Vega 11, Vega 12 and Vega 20 Rumors and Discussion

The cited statement is literally saying it is not being exposed to devs. It was noted that GFX9 merged several internal setup stages, seemingly combining the stages that generate the vertices needing position calculation with the stages that process and set them up. That sounds like it has to do with primitive shaders, or is what marketing decided to call primitive shaders.
The prior shader stages didn't look like they were exposed to developers, so AMD might not be driven to expose the new internal stages all that quickly. Some of the work that occurs hooks into parts of the pipeline that have been protected from developers so far.
Not currently exposed, but exposure was a goal when it was introduced. I'm assuming the current implementation is rather limited, to get a working product out; the only improvements being the culling and the deferred interpolation Mantor mentioned in connection with discard, without outright saying it. The merged stages would be an opportune time and place.

How much if any gets exposed could be interesting, but might come down to not wanting to expose some capabilities if Nvidia was working on similar designs. Or just a matter of many moving pieces to complete first.

The stats I've seen for the gaming customer base may be out of date by now, but most systems they'd be selling to would be less well-endowed than a Vega RX. I do not think they are going to abandon those systems, and as such counting on a Vega-specific feature outside of coincidental use seems unwise.
Just providing an example, but those games feature mechanics that Bethesda has deemed popular. That's a large part of why people still play Fallout. I'm not implying Bethesda will abandon those systems, but that, thanks to mods, gamers readily push the limits. While not required, unified memory and paging would be a huge gain for their active fans.

I would like to see an interview or some statement about Bethesda pushing for a leading-edge feature like HBCC or similar tech, given their open-world engine isn't from this decade.
They are tight-lipped about upcoming products, but take a look at the paid mods they are adding even for old games. Old engines should work with HBCC transparently, and future titles could be built with it in mind. Their games are too prone to gamers piling objects in one area.

I'd be highly surprised if they don't have a new engine in the works, considering the long-term strategic partnership they announced with AMD. Especially as AMD is well beyond Nvidia with low-level capabilities that will be more widespread when products release. Bethesda has used the same engine since Oblivion, with modifications along the way. Making a new one on low-level APIs doesn't seem far-fetched, as the new APIs are well suited to open worlds with many objects.

If this isn't a two-workgroup solution, then AMD may have noted from analyzing the triangle sieve method that it could take that shader and the regular vertex shader and throw it all in the same bucket, then try to use compile-time analysis to hoist the position and culling paths out of the original position+attribute loop.
Perhaps a single shader removes overhead and makes it more universally beneficial than it was on the PS4. If not, then going by the Linux patches it may be that it's not optional like it was for the PS4.
Compile-time generation of multiple kernels to choose from for GPU-driven rendering seems a possibility; runtime analysis could even be considered. Some of this change may come from the HBCC and dynamic resources, which are usually what throttles pipelines, but it may also be binning for workgroup distribution. Setting the stage for capabilities we haven't seen.
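For illustration only, here is a rough CPU-side sketch of the hoisting idea described above: run a cheap position-only pass first and cull, then do the full position+attribute work only for triangles that survive. The types, stubbed shader functions, and culling test are all made up; the real compiler and hardware path isn't public.

```cpp
// Hypothetical sketch of "hoist the position/culling path out of the
// position+attribute loop": positions are computed first so culled triangles
// never pay for attribute work. All names here are illustrative.
#include <array>
#include <vector>

struct Vertex       { float px, py, pz, pw; float attrs[8]; };
struct Position     { float x, y, z, w; };
struct ShadedVertex { Position pos; float attrs[8]; };

// Stand-ins for the position-only and full shader paths.
static Position computePositionOnly(const Vertex& v) {
    return { v.px, v.py, v.pz, v.pw };                      // real code would transform here
}
static ShadedVertex computeFull(const Vertex& v) {
    ShadedVertex s{ computePositionOnly(v), {} };
    for (int i = 0; i < 8; ++i) s.attrs[i] = v.attrs[i];    // stand-in for attribute shading
    return s;
}

// Signed screen-space area; non-positive means back-facing or degenerate.
static float signedArea(const Position& a, const Position& b, const Position& c) {
    float ax = a.x / a.w, ay = a.y / a.w;
    float bx = b.x / b.w, by = b.y / b.w;
    float cx = c.x / c.w, cy = c.y / c.w;
    return (bx - ax) * (cy - ay) - (cx - ax) * (by - ay);
}

std::vector<std::array<ShadedVertex, 3>>
shadeSurvivingTriangles(const std::vector<std::array<Vertex, 3>>& tris) {
    std::vector<std::array<ShadedVertex, 3>> out;
    for (const auto& t : tris) {
        // Pass 1: position-only path, used purely to decide whether to cull.
        Position p0 = computePositionOnly(t[0]);
        Position p1 = computePositionOnly(t[1]);
        Position p2 = computePositionOnly(t[2]);
        if (signedArea(p0, p1, p2) <= 0.0f)
            continue;                            // culled: attribute work never runs
        // Pass 2: full work, deferred until the triangle is known to contribute.
        out.push_back({{ computeFull(t[0]), computeFull(t[1]), computeFull(t[2]) }});
    }
    return out;
}
```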
 
BTW, did not see it mentioned anywhere, but: The review driver for RX Vega lists an MI25 x2.
 
That looks consistent with the slides that mentioned a Vega10x2 product, although there is tension between AMD's naming scheme and what could be achieved with two GPUs.

The Instinct line has naming based on FP16 throughput. One would think two Vega chips could get more than 25 TFLOPS and so warrant a different name, but hitting full single-GPU throughput on both chips could be prohibitive power-wise.
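For reference, the back-of-the-envelope math behind the naming: Vega 10 has 4096 stream processors, an FMA counts as two operations, and packed FP16 doubles the rate over FP32. The ~1.5 GHz clock below is just an assumed figure to show where the 25 in MI25 comes from.

```cpp
// Rough FP16 throughput arithmetic behind the Instinct naming.
// The clock is an assumption for illustration, not a spec.
#include <cstdio>

int main() {
    const double shaders   = 4096;   // Vega 10 stream processors
    const double clock_ghz = 1.5;    // assumed boost clock
    const double fma_ops   = 2;      // FMA = multiply + add
    const double fp16_rate = 2;      // packed FP16 doubles throughput vs FP32

    const double tflops_fp16 = shaders * clock_ghz * fma_ops * fp16_rate / 1000.0;
    std::printf("one GPU : %.1f TFLOPS FP16\n", tflops_fp16);        // ~24.6 -> "MI25"
    std::printf("two GPUs: %.1f TFLOPS FP16\n", 2.0 * tflops_fp16);  // ~49 -> "MI50" if both sustain full clocks
    return 0;
}
```

Sustaining that clock on two dies in one card is where the power question comes in.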
 

Calling it MI50 could be considered deceptive because of the split memory pools?
 
Given the possible power consumption of two GPUs able to hit ~25 TF each, it might be something like an MI35 or MI40 rather than an MI50. I'm not versed on what objections deep-learning packages might have to split memory. I would have thought software development in that field would be sophisticated enough and parallel enough to handle it.
 
More because of the power draw than the memory pools. They could connect via Infinity Fabric with lots of bandwidth, but for Instinct it's more a density play for servers. MI40@300W seems more appropriate.

I would have thought software development in that field would be sophisticated enough and parallel enough to handle it.
With HBCC paging each separately it probably "just works" as those jobs go really wide with unified memory. I'm not sure DL has a lot of sharing that couldn't trivially be worked around.
 
Does anyone know why AMD still doesn't expose programmable sample positions in Direct3D 12? Why do they underpromote such a wonderful feature so much? They've supported it since... SM2.0b/3.0 GPUs?
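For context, the API side of this already exists in D3D12 (a feature tier check plus SetSamplePositions on the command list); the question is about the driver exposing the tier. A minimal sketch, assuming a device and an ID3D12GraphicsCommandList1 created elsewhere:

```cpp
// Minimal D3D12 usage sketch: check the programmable sample positions tier,
// then set a custom 4x pattern. Device and command list creation omitted.
#include <d3d12.h>

bool TrySetCustomSamplePositions(ID3D12Device* device, ID3D12GraphicsCommandList1* cmdList) {
    D3D12_FEATURE_DATA_D3D12_OPTIONS2 opts2 = {};
    if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS2,
                                           &opts2, sizeof(opts2))) ||
        opts2.ProgrammableSamplePositionsTier ==
            D3D12_PROGRAMMABLE_SAMPLE_POSITIONS_TIER_NOT_SUPPORTED) {
        return false;   // the driver does not expose the feature
    }

    // Four custom positions for 4x MSAA, in 1/16-pixel units (valid range -8..7).
    D3D12_SAMPLE_POSITION positions[4] = { {-6, -2}, {2, -6}, {6, 2}, {-2, 6} };
    cmdList->SetSamplePositions(4, 1, positions);   // 4 samples per pixel, 1-pixel pattern
    return true;
}
```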
 
AMD marketing with Vega has taken a much more sinister turn.

[image attachment]


Red Drum.
Reddrum
Red rum.
Redrum.
MURDER!
 
A batch is built up in the pipeline until some condition is met. It could be hitting storage limits, a total number of triangles evaluated, a total number of contributing triangles, or some other case such as:
Sequential primitives are captured until a predetermined condition is met, such as batch full condition, state storage full condition, or a dependency on previously rendered primitives is determined, according to an embodiment.
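To make the conditions concrete, here's a toy model of that accumulation loop; the limits and the dependency test are made up, and the real closing conditions and values are internal to the hardware.

```cpp
// Toy model of the batching described above: primitives accumulate until a
// closing condition fires. Limits and the dependency flag are illustrative.
#include <cstddef>
#include <vector>

struct Primitive { int stateId; bool dependsOnPriorResults; };
struct Batch     { std::vector<Primitive> prims; int distinctStates = 0; };

constexpr std::size_t kMaxPrimsPerBatch  = 512;  // "batch full" (made-up value)
constexpr int         kMaxStatesPerBatch = 16;   // "state storage full" (made-up value)

std::vector<Batch> buildBatches(const std::vector<Primitive>& stream) {
    std::vector<Batch> batches(1);
    int lastState = -1;                           // state of the previous primitive in the open batch
    for (const Primitive& p : stream) {
        Batch* b = &batches.back();
        bool newState  = (p.stateId != lastState);
        bool mustClose =
            b->prims.size() >= kMaxPrimsPerBatch ||                   // batch full
            (newState && b->distinctStates >= kMaxStatesPerBatch) ||  // state storage full
            p.dependsOnPriorResults;                                  // depends on earlier rendering
        if (mustClose && !b->prims.empty()) {
            batches.emplace_back();               // close the batch, start a new one
            b = &batches.back();
            lastState = -1;
            newState  = true;
        }
        if (newState) { b->distinctStates++; lastState = p.stateId; }
        b->prims.push_back(p);
    }
    return batches;
}
```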


The shader is updating a counter per-pixel and each pixel is reading from it.
This is all theorizing why the triangle test might not capture Vega's tiling in the same manner as Pascal or Maxwell.
It could very well be inactivated, but it might have a hard time showing itself since everything perfectly overlaps and reads the same data.
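To picture the test being discussed: every triangle that touches a pixel bumps that pixel's counter and reads back the value, so the order in which triangles reach a pixel becomes visible. A trivial single-threaded model of that (on the GPU the update would be an atomic in the pixel shader):

```cpp
// Trivial model of the counter test: each triangle covering a pixel increments
// the pixel's counter and reads the old value. With perfect overlap every
// triangle sees the same pattern, so reordering (binning) has little to reveal.
#include <cstdio>
#include <vector>

int main() {
    const int pixels = 16, triangles = 3;        // tiny made-up framebuffer
    std::vector<int> counter(pixels, 0);
    std::vector<int> lastValueSeen(pixels, 0);

    for (int t = 0; t < triangles; ++t) {        // every triangle covers every pixel
        for (int p = 0; p < pixels; ++p) {
            lastValueSeen[p] = counter[p]++;     // "shader" updates and reads the counter
        }
    }
    std::printf("pixel 0 saw value %d on the last triangle\n", lastValueSeen[0]);
    return 0;
}
```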
I found this in the Vega whitepaper. Is this the patent from AMD you talked about?

Some draw calls also include many small instances (i.e., they render many similar versions of a simple object). If an instance does not include enough primitives to fill a wavefront of 64 threads, then it cannot take full advantage of the GPU's parallel processing capability, and some proportion of the GPU's capacity goes unused. The IWD can mitigate this effect by packing multiple small instances into a single wavefront, providing a substantial boost to utilization.
http://radeon.com/_downloads/vega-whitepaper-11.6.17.pdf
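Back-of-the-envelope on the whitepaper's point, with made-up numbers: a 24-vertex instance on a 64-lane wavefront wastes most of the wave unless instances get packed together.

```cpp
// Illustrative utilization math for the small-instance packing the whitepaper
// describes. Instance size and count are made up; wave size is 64 on GCN/Vega.
#include <cstdio>

int main() {
    const int waveSize         = 64;   // threads per wavefront
    const int vertsPerInstance = 24;   // hypothetical small instance
    const int instances        = 8;

    // One instance per wavefront: most lanes sit idle.
    const double unpacked = 100.0 * vertsPerInstance / waveSize;

    // Packing instances back to back into wavefronts.
    const int totalVerts  = vertsPerInstance * instances;
    const int wavesNeeded = (totalVerts + waveSize - 1) / waveSize;
    const double packed   = 100.0 * totalVerts / (wavesNeeded * waveSize);

    std::printf("unpacked: %d waves at %.1f%% lane utilization\n", instances, unpacked);  // 8 waves, 37.5%
    std::printf("packed  : %d waves at %.1f%% lane utilization\n", wavesNeeded, packed);  // 3 waves, 100%
    return 0;
}
```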
 

The dependency claim is a case where the GPU cannot combine things, while combining small instances is a case where it can.

The whitepaper's description is a case where small instances are doing the same work and do not affect each other, so they can be packed together to increase utilization. A serial dependency is where a primitive's output depends on what another one does, so they cannot be combined or batched without producing a different and possibly wrong result.
 
AMD Radeon RX Vega 10 chips differ - physically and quite significantly

After further inspection I also noticed that the GPU used on our Vega 64 sample differs from other reviewers'. Now here's the thing: my Dutch colleagues over at Tweakers.net noticed something similar. It seems that AMD is fabbing Vega at two different locations, and there is a significant enough physical difference:
http://www.guru3d.com/news-story/am...iffer-physically-and-quite-significantly.html
 
Seems odd not to have any markings on the chips themselves, no? Or are they now on the perimeter of the whole package? Why didn't the articles just look at those? :rolleyes:

What does the QR code tell?
 
Seems odd not to have any markings on the chips themselves, no? Or are they now on the perimeter of the whole package? Why didn't the articles just look at those?

techPowerUp noted that their Vega 64 sample (with the black substrate) has "Made in Taiwan" on the shim while the Vega 56 (without it) has "Made in Korea".

https://www.techpowerup.com/reviews/AMD/Radeon_RX_Vega_56/images/gpu.jpg
https://www.techpowerup.com/reviews/AMD/Radeon_RX_Vega_64/images/memory.jpg
 
Looks like a spacer or filler was inserted to avoid crushing(?) the chips. May have been a more recent modification, but the different fabs are interesting. May indicate more production than expected.
 
Here in France, it seems that the Vega 64 "standard" is around 619 euro or more, with 2 games. No "card alone" stuff... Pretty stiff...
 