AMD CDNA: MI300 & MI400 (Analysis, Speculation and Rumors in 2024)


Chips and Cheese is also comparing the MI300X primarily to the PCIe version of the H100, which is the weakest version of the H100 with the lowest specs.

Chips and Cheese also mentions getting specific help from AMD with its testing, but doesn't appear to have received equivalent input from Nvidia, so there could be some bias in the benchmark results.

The introduction says, "We would also like to thank Elio from NScale who assisted us with optimizing our LLM runs as well as a few folks from AMD who helped with making sure our results were reproducible on other MI300X systems." No mention is made of any consultation with any Nvidia folks, and that suggests this is more of an AMD-sponsored look at the MI300X.

 
I got the same feeling that Tom's Hardware did when I read the C&C article. It is similar to the HPC performance tit-for-tat between AMD and Nvidia a few months ago, with optimizations applied to only one side. I think AMD sponsored a couple more tests earlier this year with AI partners using similar optimizations.

If these claims had any credibility, why avoid MLPerf, and why did AMD recently refuse to let Tiny Corp use the MI300X in MLPerf testing?
Can't blame marketing for these decisions ...
 
So you got the wrong feeling. They explained that AMD did not provide them with any special optimizations, just result validation.
 
The biggest issue I have with this article is that they used generic vLLM rather than TensorRT-LLM or LMDeploy, which are much faster on Nvidia accelerators. AMD is clearly behind this article, and they are not in a good spot right now with all their recent benchmark shenanigans...

PS: a recent test that compares different LLM inference backends:
 
It is a bit weird to receive only AMD optimizations from Nscale when they could have provided the same for Nvidia. Granted, Nscale is a primary AMD partner, and though they do offer access to Nvidia hardware, they did not offer any Nvidia optimizations for the testing. It could have something to do with an increased marketing effort around their huge MI300X purchase earlier this year.

Let's see if C&C provides balanced testing/optimizations with the Nvidia contacts provided to them by Tom's Hardware.
 
But they said that no software optimizations were provided. Vanilla vLLM for both vendors.
 
"NScale who assisted us with optimizing our LLM runs as well as a few folks from AMD who helped with making sure our results were reproducible on other MI300X systems."

Also, why would you need AMD engineers to help make sure your results were similar to those already published by AMD?
 
The guys from AMD only checked whether the results were consistent.
 
Is that your speculation, or is it from the article? Why do the results need to be consistent with AMD's published numbers?
What happens if the results are not consistent? Are these independent or managed test results?
 
That's what C&C explained in the comments. It seems the wording in the article was unfortunate. But you know, it's a small site, privately run, right?
 
Instinct “annual cadence”:
MI300X / CDNA3: 23Q4
MI325X / CDNA3: 24Q4
MI350X / CDNA4: assumed 25Q4, and seemingly a rehash of CDNA3 plus an extra set of matrix instructions
MI400 / “CDNA Next”: assumed 26Q4

While for RDNA’s (lack of) cadence:
RDNA3: 22Q4
RDNA4: rumoured late 2024
RDNA5: rumoured late 2025
UDNA6: eh, logically late 2026 by extrapolation?

So there is a possibility of “CDNA Next” turning out to be the so-called “UDNA 6” with dates seemingly lining up. Let’s see if the upcoming event will reflect that.
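
For what it's worth, the "dates lining up" extrapolation can be sketched in a few lines of Python. This is only a rough illustration: the function names are made up, "late 2024" and "late 2025" are assumed to mean Q4, and the rule (repeat the most recent gap) is just one way to read the lists above, not anything AMD has stated.

```python
# Rough sketch: extend each release list by repeating its most recent gap.
# Helper names are hypothetical; "late 20xx" is ASSUMED to mean Q4.

def quarter_index(label: str) -> int:
    """Convert a label like '23Q4' into a running count of quarters."""
    year, quarter = label.split("Q")
    return (2000 + int(year)) * 4 + (int(quarter) - 1)

def quarter_label(index: int) -> str:
    """Inverse of quarter_index: running count back to a 'YYQn' label."""
    year, quarter = divmod(index, 4)
    return f"{year % 100:02d}Q{quarter + 1}"

def extrapolate(known: list[str], steps: int) -> list[str]:
    """Project future releases by repeating the last observed gap."""
    idx = [quarter_index(k) for k in known]
    gap = idx[-1] - idx[-2]
    return [quarter_label(idx[-1] + gap * n) for n in range(1, steps + 1)]

# Instinct: MI300X (23Q4), MI325X (24Q4) -> projected MI350X, MI400
instinct = extrapolate(["23Q4", "24Q4"], steps=2)
# RDNA: RDNA3 (22Q4), RDNA4 (assumed 24Q4), RDNA5 (assumed 25Q4) -> "UDNA6"
radeon = extrapolate(["22Q4", "24Q4", "25Q4"], steps=1)

print("Instinct:", instinct)
print("Radeon:  ", radeon)
print("Same quarter:", instinct[-1] == radeon[-1])
```

Under those assumptions both lines end up in the same quarter, which is all the "lining up" argument amounts to. If you averaged the RDNA gaps instead of repeating the last one, the projection would slip later, so the conclusion depends on the rumoured dates holding.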
 
That is certainly a possibility... or it could mean that they are only now starting to flesh out UDNA and add it to their roadmap, and that it is still more than 4-5 years out. If that is the case, UDNA would land sometime in 2029 or later, after CDNA Next and its derivatives. UDNA5 would be the first in his scenario, since he was talking about "planning" 3 generations for backward and forward compatibility: RDNA5/6/7 and UDNA5/6/7.

The way Huynh phrased that "backward/forward compatibility" answer made it seem like they have been working on adding that to RDNA5/6/7, with UDNA coming after that. The other clue is that when he was asked about the "when" for UDNA, he gave this answer: "We haven’t disclosed that yet. It’s a strategy. (...) They (devs) actually wish we did it sooner, but I can't change the engine when a plane’s in the air. I have to find the right way to setpoint that so I don’t break things." CDNA Next showing up on the roadmap with a 2026 date makes it seem "disclosed", which could mean CDNA Next isn't UDNA. (Tom's Hardware article with Huynh's interview)

Really just depends on when they finalized this "strategy" and started planning the "backwards compatibility" into their architectures.
 