AMD: Navi Speculation, Rumours and Discussion [2019-2020]

Since this is an AMD thread, how far behind Intel and Nvidia are on implementing PCI Express 4.0 isn't directly germane. The fact of the matter is that AMD has PCI Express 4.0 available to the public around two years after the specification was finalized. I suspect the same or a quicker turnaround on PCI Express 5.0 from AMD.
 

It's entirely germane given that this is a GPU thread, and most people with gaming machines have Intel systems. So in the majority of cases, people buying an AMD GPU will not be able to use PCIe 4.0.

Besides, my original question was about NVLink/NVSwitch competitors. PCIe 4.0 is irrelevant because that's not even what AMD uses for inter-GPU communication.
 

You might want to read about CXL: https://www.anandtech.com/show/1406...w-industry-high-speed-interconnect-from-intel

At its heart, Compute Express Link (CXL) will initially begin as a cache-coherent host-to-device interconnect, focusing on GPUs and FPGAs. It will use current PCIe 5.0 standards for physical connectivity and electrical standards

https://www.anandtech.com/show/14213/compute-express-link-cxl-from-nine-members-to-thirty-three

And AMD is also now involved: https://www.anandtech.com/show/1466...ium-a-coherent-inteldeveloped-pcie-50based-io
 
AMD will likely get their high-speed interconnect from the Compute Express Link (CXL) consortium, which they joined last month.
Compute Express Link (CXL) is an open industry standard interconnect offering high-bandwidth, low-latency connectivity between host processors, systems and devices such as accelerator cards, memory buffers, and smart I/O devices. Designed to address the increasing demands of high-performance computational workloads, CXL targets heterogeneous processing and memory systems across a range of high-performance computing applications by enabling coherency and memory semantics between processors and systems. This is increasingly important as processing data in Artificial Intelligence and Machine Learning requires a diverse mix of scalar, vector, matrix and spatial architectures across a range of accelerator options.
https://www.guru3d.com/news-story/amd-joins-consortium-for-cxl-interconnect-based-on-pci-e-5.html
 
Thanks for the replies. It sounds like with CXL they'd get to 64 GB/s bidirectional, which is still less than half of Nvidia's. I really want AMD to succeed, but the interconnect is really important for my applications. Given the timelines of future announcements, that appears to be at least three years behind.
 
CXL is dumb and doesn't support P2P either way.
IFIS v2 in Vega 20 is already 100 GB/s per link and works in whatever topology you want, assuming you have enough ports.
 
AMD will likely get their high-speed interconnect from the Compute Express Link (CXL) consortium, which they joined last month.

https://www.guru3d.com/news-story/amd-joins-consortium-for-cxl-interconnect-based-on-pci-e-5.html
Except that AMD already has their high-speed interconnect, as @Bondrewd pointed out. AMD was also already part of all the other "next-gen interconnect" consortia (CCIX, Gen-Z, OpenCAPI), so AMD joining CXL doesn't actually mean much at this point.
 
CXL is dumb and doesn't support P2P either way.
IFIS v2 in Vega 20 is already 100 GB/s per link and works in whatever topology you want, assuming you have enough ports.

The AnandTech article on the MI60 said it was 50 GB/s in each direction and a ring topology. Can you point me to somewhere that says it's what you're describing?
 
Oh you bet it does.
Like jeez, have you seen Rome?

Which is 100 GB/s bidirectional, and IF works in whatever topology you want: rings, meshes, whatever.

From the AnandTech article: "Notably, since there are only 2 links per GPU, AMD’s topology options will be limited to variations on rings. So GPUs in 4-way configurations won’t all be able to directly address each other." So it's clearly not rings/meshes/whatever, unless they're wrong. And if we're talking about bandwidth between cards, that's still 1/3 of the V100's 300 GB/s total. This is the point I keep coming back to: the V100 is approaching two years old now, and there will likely be an update next year with Turing professional cards. So AMD needs a huge leap in both interconnect bandwidth and topology to even compete on that front.

Maybe they've made the trade-off that compute is more valuable than interconnect.
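
To make the ring limitation concrete, here is a minimal sketch (assuming 4 GPUs with 2 links each, wired as a ring, as the quoted article describes) showing that opposite GPUs have no direct link:

# Hypothetical sketch: 4 GPUs, 2 links each, connected as a ring 0-1-2-3-0.
# Each GPU reaches only its two neighbours directly; opposite GPUs
# (0<->2 and 1<->3) have to hop through a neighbour.
n_gpus = 4
ring = {g: {(g - 1) % n_gpus, (g + 1) % n_gpus} for g in range(n_gpus)}
print(ring)          # {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(2 in ring[0])  # False -- GPU 0 cannot address GPU 2 directly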
 
Notably, since there are only 2 links per GPU, AMD’s topology options will be limited to variations on rings. So GPUs in 4-way configurations won’t all be able to directly address each other
That's the specific IFIS implementation, not IF in general.
And if we're talking about bandwidth between cards, that's still 1/3 of the V100's 300 GB/s total.
2/3 akshually, being 100 GB/s × 2 vs 50 GB/s × 6.
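
As a quick sanity check on that correction (a sketch using the per-link figures quoted in this thread: two 100 GB/s Infinity Fabric links on the MI60 vs six 50 GB/s NVLink 2.0 links on the V100):

# Aggregate per-card bandwidth from the bidirectional per-link numbers above.
mi60_total = 2 * 100   # GB/s: two Infinity Fabric links at 100 GB/s each
v100_total = 6 * 50    # GB/s: six NVLink 2.0 links at 50 GB/s each
print(mi60_total, v100_total, round(mi60_total / v100_total, 2))  # 200 300 0.67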
 
Said Footnote:

2. As of Oct 22, 2018. Radeon Instinct™ MI50 and MI60 “Vega 7nm” technology based accelerators are PCIe® Gen 4.0* capable providing up to 64 GB/s peak theoretical transport data bandwidth from CPU to GPU per card with PCIe Gen 4.0 x16 certified servers. Previous Gen Radeon Instinct compute GPU cards are based on PCIe Gen 3.0 providing up to 32 GB/s peak theoretical transport rate bandwidth performance. Peak theoretical transport rate performance is calculated by Baud Rate * width in bytes * # directions = GB/s per card

PCIe Gen3: 8 * 2 * 2 = 32 GB/s

PCIe Gen4: 16 * 2 * 2 = 64 GB/s

Radeon Instinct™ MI50 and MI60 “Vega 7nm” technology based accelerators include dual Infinity Fabric™ Links providing up to 200 GB/s peak theoretical GPU to GPU or Peer-to-Peer (P2P) transport rate bandwidth performance per GPU card. Combined with PCIe Gen 4 compatibility providing an aggregate GPU card I/O peak bandwidth of up to 264 GB/s. Performance guidelines are estimated only and may vary. Previous Gen Radeon Instinct compute GPU cards provide up to 32 GB/s peak PCIe Gen 3.0 bandwidth performance. Infinity Fabric™ Link technology peak theoretical transport rate performance is calculated by Baud Rate * width in bytes * # directions * # links = GB/s per card

Infinity Fabric Link: 25 * 2 * 2 = 100 GB/s

MI50/MI60 each have two links:

100 GB/s * 2 links per GPU = 200 GB/s
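
For reference, the footnote's formula translated into a small sketch (the helper name peak_bw is mine; the x16 interfaces count as 2 bytes wide, i.e. 16 lanes × 1 bit per transfer, which is what makes the numbers above work out):

# Peak theoretical transport rate per the footnote:
# baud rate (GT/s) * width in bytes * directions * links = GB/s per card.
def peak_bw(baud_gt_s, width_bytes=2, directions=2, links=1):
    return baud_gt_s * width_bytes * directions * links

print(peak_bw(8))                           # PCIe Gen3 x16:  32 GB/s
print(peak_bw(16))                          # PCIe Gen4 x16:  64 GB/s
print(peak_bw(25, links=2))                 # 2x IF Link:    200 GB/s
print(peak_bw(16) + peak_bw(25, links=2))   # aggregate:     264 GB/s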
 

Thanks. Is it also the case that it's NOT full mesh? It seems hard to find info on that.
 
CXL is dumb and doesn't support P2P either way.
IFIS v2 in Vega 20 is already 100 GB/s per link and works in whatever topology you want, assuming you have enough ports.

Yah, there's a bunch of new interconnect standards. CXL seems the least interesting and most backwards-looking of all of them, which is why, when it was announced, Intel was the only actual hardware company in the "consortium".

Regardless, I'd think the question is how many workloads even saturate PCI Express 3.0 now, let alone 4.0.
 
Storage workloads can easily do that.
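
For instance (a back-of-the-envelope sketch; the ~3.5 GB/s per Gen3 NVMe drive and ~16 GB/s per direction for PCIe 3.0 x16 are assumed round numbers, not measurements):

# Rough estimate: how many NVMe drives it takes to fill a PCIe 3.0 x16 link.
pcie3_x16 = 16.0   # GB/s per direction, approximate usable bandwidth (assumed)
nvme_drive = 3.5   # GB/s sequential read, typical Gen3 x4 SSD (assumed)
print(pcie3_x16 / nvme_drive)   # ~4.6 drives saturate one direction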
 