AMD Execution Thread [2023]

it's basically on par with H100
Not made for training.
Which probably means it actually isn't.
Kinda redundant copium hit.
Also, NVIDIA has H200 now, which is even faster.
Point is, it doesn't.
It's like a mid'24 product (NV says Q2 but that's very generous according to HBM3e ramp timelines from the Big 3).
when the H100 runs out of memory
So does MI300, the biggest delta is server-level perf after all.
MI300 is much more expensive and much more complicated to make than H100
Not really, it's like two new TOs.
More Si but they're tiny tiles.
 
Point is, it doesn't.
It's like a mid'24 product (NV says Q2 but that's very generous according to HBM3e ramp timelines from the Big 3).
Same for MI300A, still not ramped up.
So does MI300, the biggest delta is server-level perf after all.
Which is not the point? These are AI GPUs, who cares about HPC performance (FP64)?
Not really, it's like two new TOs.
More Si but they're tiny tiles.
Dylan Patel from SemiAnalysis (the most accurate source of information for HPC/AI on the web) says the cost is actually 2x that of H100.

 
Same for MI300A, still not ramped up.
A is the less important product.
X is actively ramping and you can already request a preview on Azure.
Which is not the point? These are AI GPUs, who cares about HPC performance (FP64)?
Are you serious?
Server-level perf (8 GPU 4/5U) is how your AI product lives or dies.
Dylan Patel from SemiAnalysis (the most accurate source of information for HPC/AI on the web)
No, he's our pet.
Wouldn't call him accurate on a good day; my man couldn't even estimate MI300A numerics or do its perf modeling for the better part of this summer.
 
I'm too clueless to make a call on whether this is good enough to steal NVIDIA's lunch, so I looked at stock prices to see what people with skin in the game seem to think. Both are down about 4.5% over the last 5 days; I'm guessing NVIDIA is the China ban stuff and AMD the same. I would have expected AMD to go up at least a little today, though, if this was going to take it to NVIDIA. A pretty poor way to judge hardware, but at the end of the day the end goal is money for both of them.
 
Server-level perf (8 GPU 4/5U) is how your AI product lives or dies.
Aha. Gotcha (thought you were talking about HPC). However, on the AI front, there isn't any advantage for MI300X over the standard H100 (let alone the H200) aside from the more memory capacity, which is why AMD focused only on these edge cases (memory-capacity-sensitive inference tests). AMD also avoided using TensorRT-LLM standard tests for some unknown reason; these are the de facto tests in the realm of AI.

See the link below for more analysis on the situation.
Wouldn't call him accurate on a good day; my man couldn't even estimate MI300A numerics or do its perf modeling for the better part of this summer.
Disagree here.
 
there isn't any advantage for MI300X over the standard H100
Well no, it has more math and more membw.
aside from the more memory capacity
Well yeah, that's kinda the point of building a more expensive 8 stack machine.
which is why AMD focused only on these edge cases
LLaMA2 70B is the least edge-case workload imaginable.
It's the standard inference benchmark, lol.
AMD also avoided using TensorRT-LLM
They used vLLM instead which is what the industry at large uses.
these are the de facto tests in the realm of AI.
No, AMD's server-level perf numbers were very much in line with what the industry expects, since this is what Meta asked of them, lol.
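For reference, a run along those lines is easy to reproduce with stock vLLM. Below is a minimal sketch assuming a LLaMA2 70B class checkpoint on an 8-GPU server; the model name, prompts, and sampling settings are illustrative placeholders, not the exact configuration either vendor published.

```python
# Minimal vLLM throughput sketch for a LLaMA2 70B class model on an 8-GPU server.
# Model path, prompts, and sampling settings are illustrative assumptions only.
import time
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-2-70b-hf",  # placeholder checkpoint
    tensor_parallel_size=8,             # shard the model across all 8 GPUs
)

prompts = ["Explain HBM in one paragraph."] * 64          # toy batch
params = SamplingParams(temperature=0.0, max_tokens=256)

start = time.time()
outputs = llm.generate(prompts, params)
elapsed = time.time() - start

generated = sum(len(o.outputs[0].token_ids) for o in outputs)
print(f"~{generated / elapsed:.1f} generated tokens/s aggregate")
```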
Disagree here.
Again, he was wrong and I was right; that's all there is to it.
Either way, he's more bullish on 300X than I am.

I prefer shilling MI400 (insert training config here) and its esoteric eDPUs.
 
According to AMD, it's basically on par with H100. Which probably means it actually isn't. Also, NVIDIA has H200 now, which is even faster.

[Image: AMD Instinct MI300X launch slide]


AMD showed that MI300X does win significantly against H100 in select cases, when the H100 runs out of memory, as H100 has 80GB, while the MI300X has more than double that: 192GB. However, H200 has 141GB, which means it fixes these cases.
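As a rough illustration of where the 80GB limit bites, here is a back-of-the-envelope footprint estimate for serving a 70B-parameter model in FP16. The layer/head counts are the published LLaMA2 70B figures; batch size and context length are arbitrary assumptions.

```python
# Back-of-the-envelope memory estimate for serving a 70B-parameter model in FP16.
# Architecture numbers are the published LLaMA2 70B figures; batch size and
# context length are arbitrary assumptions for illustration.
PARAMS = 70e9
BYTES_FP16 = 2
LAYERS, KV_HEADS, HEAD_DIM = 80, 8, 128        # LLaMA2 70B (grouped-query attention)

weights_gb = PARAMS * BYTES_FP16 / 1e9                         # ~140 GB of weights alone

kv_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES_FP16   # K and V, every layer
batch, context = 16, 4096
kv_cache_gb = kv_per_token * batch * context / 1e9

print(f"weights ≈ {weights_gb:.0f} GB, KV cache ≈ {kv_cache_gb:.0f} GB, "
      f"total ≈ {weights_gb + kv_cache_gb:.0f} GB")
# ≈ 140 GB + 21 GB: far past a single 80 GB H100, but within one 192 GB MI300X,
# so the H100 setup has to shard across more GPUs to serve the same model.
```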

MI300 is much more expensive and much more complicated to make than H100, and MI300X is more power-hungry too.
MI300 is the absolute bottom for AMD. Barely faster than H100. The successor will be announced in three months. The H200 refresh will be much faster in inference. And NVIDIA is working with Amazon on an inference rack combining 32 Superchips with 900GB/s, resulting in 20TB of memory.

They are losing supercomputer contracts because of ARM, and they are nowhere in AI. The same story as Intel.
 
They used vLLM instead which is what the industry at large uses.
80+% of the world's AI training runs on NVIDIA HW, so stop your usual BS. The standard is NVIDIA, like it or not; thus the standard is whatever library runs faster on NV. End of discussion.

Bottom line of this announcement is that AMD will do fine with MI300X. They lose in large transformer model training, where their lack of a competitive scale-out solution is obvious, but they will sell everything they can produce, albeit their numbers are still multiple times lower than NVIDIA's, as the green team has secured the vast majority of the supply chain through the end of 2024.
 
300X already using e version
No, they're standard 5.2Gbps or whatever parts.
HBM3e starts at 6.4Gbps or thereabouts.
AMD can get 22% or thereabouts more membw by slapping HBM3e in there, but that's mid-'24 at the earliest for any relevant volume.
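A quick sanity check on that uplift, assuming MI300X's published 8-stack, 8192-bit HBM interface and the per-pin rates quoted above:

```python
# Rough HBM bandwidth math for MI300X, assuming its published 8-stack,
# 8192-bit-wide HBM interface; per-pin data rates are the figures quoted above.
BUS_BITS = 8 * 1024                     # 8 HBM stacks x 1024 bits each

def bandwidth_tb_s(gbps_per_pin: float) -> float:
    """Aggregate bandwidth in TB/s for a given per-pin data rate."""
    return gbps_per_pin * BUS_BITS / 8 / 1000   # bits -> bytes, GB/s -> TB/s

hbm3  = bandwidth_tb_s(5.2)             # ≈ 5.3 TB/s, AMD's quoted MI300X figure
hbm3e = bandwidth_tb_s(6.4)             # ≈ 6.6 TB/s with entry-level HBM3e
print(f"HBM3 {hbm3:.2f} TB/s -> HBM3e {hbm3e:.2f} TB/s, uplift ≈ {hbm3e / hbm3 - 1:.0%}")
# ~23%, in the same ballpark as the "22% or thereabouts" above.
```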
 
How do you know? AMD said they're expecting +$2 billion in revenue for next year. That doesn't sound like a product in high demand. NVIDIA does that with GeForce alone in one quarter.
 
AMD said they're expecting +$2 billion in revenue for next year.
Keep in mind this includes their HPC contracts, as MI300 was an HPC product first and foremost, funded by the El Capitan supercomputer contract. AMD modified the design on the fly once the AI bubble blew up. So yes, AMD is not expecting too much money from AI contracts (relatively speaking, of course). However, MI400 will be AI-focused and will build on the foundation of MI300; it should also be released relatively quickly.
 
For a "HPC product first and foremost" the performance is really bad... MI300X has 2x the transistors of H200 but only ~20% more performance.
 
AMD also avoided using TensorRT-LLM standard tests for some unknown reason; these are the de facto tests in the realm of AI.

TensorFlow is basically obsolete; no one uses it anymore. Today, you are using a weird in-house thing or PyTorch.

80+% of the world's AI training runs on NVIDIA HW, so stop your usual BS. The standard is NVIDIA, like it or not; thus the standard is whatever library runs faster on NV. End of discussion.

Er. The standard is PyTorch, and for inference that usually means using vLLM. Most people run it on NVIDIA, but that's what's being used, and that's what makes sense to benchmark. If something else runs faster on NV, it doesn't matter: PyTorch won because, despite the massive investment in AI hardware, developer time is still more expensive to users than machine time, and PyTorch wins at that.
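That is also why the hardware underneath matters less than the framework: PyTorch's ROCm builds expose AMD GPUs through the same torch.cuda device API, so typical model code runs unchanged on either vendor. A minimal sketch (the tiny model is a toy placeholder, not a real workload):

```python
# Device-agnostic PyTorch sketch: the same code runs on NVIDIA (CUDA) and
# AMD (ROCm) builds, since the ROCm backend reuses the "cuda" device namespace.
# The tiny model below is a toy placeholder, not a real workload.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

model = nn.Sequential(nn.Linear(4096, 4096), nn.GELU(), nn.Linear(4096, 4096)).to(device)
x = torch.randn(32, 4096, device=device)

with torch.no_grad():
    y = model(x)

print(device, tuple(y.shape))
```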
 
AMD also avoided using TensorRT-LLM standard tests for some unknown reason; these are the de facto tests in the realm of AI.
Not to mention the new version of TensorRT-LLM released in late November provides a 5x increase in tokens/second compared to the October release. Other solutions like PyTorch are available, though they might not be as performant; most organizations will use the solution in line with their corporate AI strategy and goals.
 