AMD Vega 10, Vega 11, Vega 12 and Vega 20 Rumors and Discussion

IIRC AMD said at the December event that the drivers were "based off Fiji with few tweaks and a debug layer, early alpha", but that was early December, not CES


I find that hard to believe given how much the shader array has changed. But then again, has it really changed to the degree suggested by what they have shown so far, or is it just a modification? Not sure; we need more info, but as it stands it looks like they have made quite a few changes in pretty low-level areas.
 
Assuming it is on-demand page migration, it should work differently though (push/copy vs pull/paging). Everything would be like automatic tiled resources, probably except render targets. Subsets of the resources should be swapped in only on demand (upon page fault when accessed, or a prefetch hint is given). The current model still requires manual management with assumptions in the size of VRAM.
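
To make the pull/paging idea concrete, here is a toy sketch of demand-paged VRAM. Everything in it is an assumption for illustration (the 64 KiB page size, the LRU eviction policy, the class and method names); nothing here is confirmed about how Vega's memory controller actually works.

```python
from collections import OrderedDict

PAGE_SIZE = 64 * 1024  # hypothetical page granularity in bytes


class DemandPagedVram:
    """Toy model: VRAM as an LRU cache of pages pulled in on first access."""

    def __init__(self, vram_pages):
        self.vram_pages = vram_pages   # how many pages fit in local memory
        self.resident = OrderedDict()  # page id -> True, ordered by recency
        self.faults = 0

    def _touch(self, page):
        """Make a page resident; return True if it had to be migrated in."""
        if page in self.resident:
            self.resident.move_to_end(page)    # hit: refresh recency
            return False
        if len(self.resident) >= self.vram_pages:
            self.resident.popitem(last=False)  # evict least recently used page
        self.resident[page] = True
        return True

    def access(self, resource, offset):
        """GPU access: the page is migrated only if it is not resident."""
        if self._touch((resource, offset // PAGE_SIZE)):
            self.faults += 1

    def prefetch(self, resource, offset):
        """Prefetch hint: make the page resident without counting a fault."""
        self._touch((resource, offset // PAGE_SIZE))


vram = DemandPagedVram(vram_pages=2)
vram.access("texture", 0)              # fault: page 0 pulled in
vram.access("texture", 100)            # hit: same page
vram.prefetch("texture", PAGE_SIZE)    # hint: page 1 resident, no fault
vram.access("texture", PAGE_SIZE)      # hit thanks to the prefetch
vram.access("texture", 2 * PAGE_SIZE)  # fault: page 2 in, page 0 evicted
print(vram.faults, list(vram.resident))
```

Only the touched subsets of the resource ever occupy VRAM, which is the contrast with the current push/copy model where the application sizes its working set against an assumed VRAM capacity.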

Although the real question is whether it is that simple... AFAIU it can be done rather easily with the abstractions and coherency guarantees the graphics stack provides, as long as the GPU address translation hierarchy is architected to handle it. But probably not for compute (HSA/OCL), especially for HSA, which requires agents to share/mirror the process VAS.

Sounds like a fit for Linux's HMM effort though.


Interesting, didn't think about it that way!
 
Performance-wise, it doesn't look that way, at least so far. TechReport had this to say about its Doom 4K performance:

Did the 1080 and Titan X also have all their vents taped shut? :p

Though that performance might not sound so impressive, it's worth noting that all of the demo system's vents (including the graphics card's exhaust) were taped up, and it's quite likely the chip was sweating to death in its own waste heat.

It's an engineering sample, we can't tell much about Vega yet.

Regards,
SB
 
One of the few things we can tell is its minimum performance, even though AMD chose these games for specific reasons and not at random.
 

There's one thing about Doom that troubles me a little: they were using Vulkan, and Vulkan needs a few optimisations for specific GPUs. Have the developers already set up a specific path for Vega? This far from launch, we can only expect performance to increase anyway (but of course I'm speculating here).
 
I don't know AMD's relationship with Valve, but something like that would require AMD to disclose very deep parts of its architecture very early to many people.
 
Maybe the higher ops count is about removing sources of stalls. Another possibility is that a CU could issue more than one instruction from a wavefront, possibly from more than one category in the absence of a dependence, such as issuing a vector op and a scalar op together.
It could still be 64 lanes though, given that FMA and MAD are usually counted as two ops.
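
The counting convention can be sketched as below. This is only the arithmetic, assuming one FMA/MAD per lane per clock; the `pack_factor` parameter is a hypothetical knob for packed FP16 and says nothing about how the hardware issues instructions.

```python
def ops_per_clock(lanes, ops_per_instruction=2, pack_factor=1):
    """Peak ops per clock for one CU under the usual marketing convention.

    ops_per_instruction=2 counts an FMA/MAD as a multiply plus an add.
    pack_factor is a hypothetical multiplier for packed math
    (for example 2 for two FP16 values per 32-bit lane).
    """
    return lanes * ops_per_instruction * pack_factor


fp32_ops = ops_per_clock(64)                        # 64 lanes, FMA = 2 ops
packed_fp16_ops = ops_per_clock(64, pack_factor=2)  # same lanes, packed FP16
print(fp32_ops, packed_fp16_ops)  # 128 256
```

So a doubled ops figure does not by itself imply more than 64 lanes.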
 
Major leaks by videocardz.

Vega 10
- 14nm (GlobalFoundries / Samsung)
- 1H 2017
- 2-stack HBM2, up to 16GB at 512GB/s
- 64 NCU, 4096 ALUs
- 12 TFLOPs FP32, meaning ~1.5GHz
- 1/16 rate DP, 2x rate FP16
- 225W TDP

Vega 10X2
- Dual Vega 10
- 2H 2017
- 4-stack HBM2, 1TB/s
- 300W TDP

Vega 20
- 2H 2018
- 7nm
- 4-stack HBM2, 1TB/s, ECC support.
- 64 CU
- 1/2 rate DP, 2x rate FP16
- "Peer-to-peer GPU communication - xGMI"
- PCI-E 4.0
- 150-300W
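
As a sanity check on the leaked Vega 10 numbers, the implied clock and per-stack bandwidth can be worked out as follows. The assumptions are that an FMA counts as two ops and that each HBM2 stack uses the standard 1024-bit interface; the function names are made up for this sketch.

```python
def implied_clock_ghz(tflops, alus, ops_per_alu_per_clock=2):
    """Clock needed to reach a given FP32 rate, counting FMA as two ops."""
    return tflops * 1e12 / (alus * ops_per_alu_per_clock) / 1e9


def per_pin_gbps(stack_gbytes_per_s, bus_width_bits=1024):
    """Per-pin data rate, assuming a 1024-bit interface per HBM2 stack."""
    return stack_gbytes_per_s * 8 / bus_width_bits


vega10_clock = implied_clock_ghz(12, 4096)
vega10_per_stack = 512 / 2  # GB/s per stack for the 2-stack configuration
print(f"{vega10_clock:.3f} GHz")                         # 1.465 GHz
print(vega10_per_stack, per_pin_gbps(vega10_per_stack))  # 256.0 2.0
```

That matches the "~1.5GHz" estimate, and 256GB/s per stack is consistent with the 4-stack, roughly 1TB/s figures listed for Vega 10X2 and Vega 20.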


Biggest news for now is that Vega 10 is not the chip that will be made on TSMC. Maybe that's what Polaris 12 is about?

HBM2 stacks in Vega 10 seem to be 8-Hi, at least for the 16GB versions.
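
The 8-Hi inference is just capacity arithmetic. The sketch below assumes 8Gb (1GB) HBM2 dies, which the leak itself does not state; the 8GB case is a hypothetical configuration included for comparison.

```python
def dies_per_stack(total_gbytes, stacks, die_gbits=8):
    """Stack height needed for a given capacity, assuming die_gbits per die."""
    per_stack_gbytes = total_gbytes / stacks
    return per_stack_gbytes * 8 / die_gbits


print(dies_per_stack(16, 2))  # 8.0 -> 8-Hi for the 16GB version
print(dies_per_stack(8, 2))   # 4.0 -> 4-Hi for a hypothetical 8GB version
```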
The "4-stack" mention in Vega 10X2 could point to it being two GPUs sharing the same interposer. I wonder if this could lead to a unified memory pool. They did talk a lot about the memory controller.
Though Vega 20 is the one claiming peer-to-peer GPU communication, so that's the most probable place we could see such a feature.

Vega 20 may actually be a substantially smaller GPU than Vega 10. 7nm vs. 14nm, same amount of CUs, but it should be able to clock higher.


EDIT: Roadmap also mentions a Vega 11 coming up in 2017, for FP16 and FP32 compute.



 
"4-stack" mention in Vega 10X2 could point to it being two GPUs sharing the same interposer.
Or it means literally four stacks.

I wonder if this could lead to a unified memory pool. They did talk a lot about the memory controller.
But the point being touted for the memory controller is virtual address space and fine-grained data movement, not a unified physical pool. The bullet point being taken off your list (hardware page management support) kinda reflects this.

Seems like they are following Nvidia's path in relying on the address translation hierarchy for de facto cache management. It would be interesting to see how it fits into their HSA-influenced ROCm stack.

Though Vega 20 is the one claiming peer-to-peer GPU communication, so that's the most probable place we'll see such a feature.
Since it specifically says xGMI for peer-to-peer link, what is being claimed is more likely dedicated links for inter-GPU traffic like NVLink.
 
But 4 GMI links (the amount shown in Greenland's slides) would amount to 100GB/s with a reportedly low latency. This is a whole lot more than what PCI-Express will be able to pass, and perhaps it could be more than enough for e.g. streaming texture data between GPUs without having to duplicate the data in each GPU's VRAM (or HBC as they will call it now?).
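
For scale, here is the comparison against PCI-Express as plain arithmetic. The 100GB/s figure is the rumored aggregate from the slides; the PCIe numbers are theoretical per-direction rates for an x16 link with 128b/130b encoding, before protocol overhead.

```python
def pcie_gbytes_per_s(gt_per_s, lanes=16):
    """Theoretical per-direction PCIe bandwidth with 128b/130b encoding."""
    return gt_per_s * (128 / 130) * lanes / 8


gmi_total = 100.0  # GB/s, rumored aggregate for 4 GMI links
gmi_links = 4
per_link = gmi_total / gmi_links

pcie3_x16 = pcie_gbytes_per_s(8.0)   # PCIe 3.0: 8 GT/s per lane
pcie4_x16 = pcie_gbytes_per_s(16.0)  # PCIe 4.0: 16 GT/s per lane

print(f"per GMI link: {per_link:.1f} GB/s")
print(f"PCIe 3.0 x16: {pcie3_x16:.2f} GB/s")
print(f"PCIe 4.0 x16: {pcie4_x16:.2f} GB/s")
print(f"4 GMI links vs PCIe 4.0 x16: {gmi_total / pcie4_x16:.1f}x")
```

Under these assumptions four links would carry roughly three times what a PCIe 4.0 x16 slot can in one direction.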
 
"4-stack" mention in Vega 10X2 could point to it being two GPUs sharing the same interposer.
Unless the die Raja was holding isn't Vega 10, there won't be two dice on an interposer. Vega 11 would be the only case where two on an interposer might be practical. Vega 10 at 500mm2+ along with the RAM rules that out; the package would be enormous. It would be a traditional dual-GPU card that should be able to share data, and a density option for enterprise.

It could still be 64 lanes though, given that FMA and MAD are usually counted as two ops.
No reason they couldn't build upon those instructions either. 4 operands for FMA4, if they added that, definitely makes the packed math easier.
 
None of the Vega products quite match the features attributed to Greenland via LinkedIn or the HPC APU rumors. The supposed lead chip has 1/2 rate DPFP, with the "APU" using it having PCIe 3.0. I suppose there could be a wrinkle related to the CPU component that could explain that, although why AMD would hold back on double-precision for compute products if it were there is unclear.
 
I guess we always assumed Zen and Vega would be separate chips for the APU. Given their mesh, really big APUs might make sense. There could be a Zen APU with a bunch of half rate DP Vega cores that wouldn't technically be a Vega variant. No reason it couldn't get turned into a discrete C/GPU. Zen cores might actually be interesting in a FP64 scientific environment with the prediction. Definitely makes virtualization easier when sharing a supercomputer.

If Vega 11 is the consumer variant, I can't imagine FP64 was added there. Vega 20 in theory has FP64, but is a ways off.
 
I'm really not quite sure about the "2018-2019" roadmap. Why keep Vega 10 if you have Vega 20 on 7nm (which seems to only add DP and GMI)? Of course, without the context of those internal slides (if they really came from AMD), it's a bit hard to know whether the roadmap is really accurate.
 