^^ THIS. Also, in contrast to NVIDIA's A100, this isn't primarily a matrix/AI GPU, it's a vector GPU. As a result, the AI enhancements are very modest compared to the A100 even when combining the two dies... however, vector throughput is through the roof. AMD and NVIDIA are diverging here: AMD still isn't fighting NVIDIA in the AI market, it's focusing solely on the HPC market.
I suspect the two-year-old A100 still beats the MI200 in large AI/ML workloads where interconnect performance is the bottleneck. As I said previously, AMD saw a business opportunity in the traditional (and dying) FP64 HPC market, with the government exascale race and before Hopper availability. They executed well. Kudos to Lisa Su.
It simply shows its GCN roots and the efficiency limitations of that old architecture. It's much more power-efficient to use ML, tensor units, and lower precision to solve large scientific problems. Even traditional FP64 workloads, like weather simulation, are migrating to ML. FP64 is now a niche, and AMD must move quickly to a new arch. Maybe MI300... Power consumption is 560W and it's liquid-cooled, which is curious considering the 6nm process.
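To illustrate the "lower precision for scientific problems" point: the trick (the idea behind benchmarks like HPL-AI) is to do the expensive solve in low precision on the fast matrix units, then recover full FP64 accuracy with a few cheap refinement steps. A minimal sketch in NumPy, simulating the low-precision hardware with float32 (everything here is illustrative, not a real GPU code path):

```python
import numpy as np

# Mixed-precision iterative refinement sketch: solve Ax = b using a
# "fast" low-precision solve, then polish the answer in FP64.
rng = np.random.default_rng(0)
n = 256
A = rng.standard_normal((n, n)) + n * np.eye(n)  # well-conditioned system
b = rng.standard_normal(n)

# Low-precision solve (float32 stands in for FP16/BF16 matrix hardware).
A32 = A.astype(np.float32)
x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)

# FP64 refinement: compute the residual in full precision, correct with
# another cheap low-precision solve. A few iterations recover FP64 accuracy.
for _ in range(5):
    r = b - A @ x  # residual in FP64
    dx = np.linalg.solve(A32, r.astype(np.float32)).astype(np.float64)
    x += dx

rel_residual = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
print(rel_residual)  # near FP64 machine precision
```

Most of the FLOPs land in the low-precision solves; only the residual updates need FP64, which is why tensor-heavy GPUs can win at nominally-FP64 problems.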
On a final note, am I the only one disappointed by the MI200? We all know it was on a tight schedule to win the exascale race, but still, except for the packaging (not even proprietary to AMD; the equivalent InFO-L is available at TSMC), the MI200 brings nothing new. The two-year-old A100 is more feature-packed. No sparsity! Few and slow interconnect links, and so on... In fact, it has huge flaws, as we can see in AMD's own promoted typical 4+1 (GPU+CPU) HPC topology, where not even all GCDs are linked! Dropping from a claimed 3.2TB/s to a mere 100GB/s bidirectional will look ugly in real-world performance with large datasets... It's no surprise that all AMD benchmarks are a single MI250X vs a single A100. I guess NVIDIA will fire back soon to show how A100 scaling beats the MI250X in bandwidth-limited scenarios. Maybe even something new in a few hours at GTC 2021...
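Back-of-envelope on why that topology hurts, using only the two figures from the post (3.2TB/s claimed on-package bandwidth vs an assumed ~100GB/s bidirectional off-package link):

```python
# Ratio of on-package memory bandwidth to a single off-package link,
# using the figures quoted above (the 100 GB/s link rate is as stated
# in the post, not an official spec).
hbm_bw_gbs = 3.2 * 1000   # 3.2 TB/s claimed aggregate bandwidth, in GB/s
link_bw_gbs = 100          # GB/s bidirectional off-package link

ratio = hbm_bw_gbs / link_bw_gbs
print(f"{ratio:.0f}x gap between local and remote bandwidth")
```

A ~32x cliff between local and remote data means any large dataset that has to cross GPUs spends most of its time waiting on the link, which is exactly the scaling scenario single-GPU benchmarks never show.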