Datacenter GPU market supply and demand (2024)

Obviously the competitive landscape has changed
It hasn't really; AMD is still behind.

By the time MI300X rolled out, H200 was out and had negated any disadvantage the H100 had.

MI325X is the answer to H200 (barely faster), but by the time MI325X is out, B100 and B200 will be out, and the AMD deficit will be significantly larger.

MI350X is supposedly the answer to B200, but by the time the 350X is out, B200 Ultra will be out, followed quickly by R100.

MI400 is supposed to be the answer to R100, but it will arrive later than R100; by how much, only time will tell.
 
Ok... and if you had left my entire statement in there, you would see that isn't what I was talking about.
Why do people feel the need to misconstrue a statement about AMD to talk about Nvidia?
 
8x MI300X has 1536 GB of HBM, which feels like plenty to run inference on most high-end LLMs; I'd trust MS's claim that it provides leading perf/$ for GPT4(-o?) inference.
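A quick back-of-the-envelope sketch of why 1536 GB goes a long way for holding weights; the parameter counts and precisions below are my own illustrative assumptions, not vendor figures:

```python
# How much of the 8x MI300X node's 1536 GB of HBM do model weights alone need?
# Parameter counts and precisions are illustrative assumptions.

HBM_TOTAL_GB = 8 * 192  # 8x MI300X at 192 GB HBM3 each = 1536 GB

def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the model weights, in GB."""
    return params_billion * 1e9 * bytes_per_param / 1e9

models = [("70B-class (e.g. Llama3-70B)", 70), ("400B-class", 400)]
precisions = [("FP16/BF16", 2), ("FP8/INT8", 1)]

for name, params_b in models:
    for prec, bytes_pp in precisions:
        gb = weight_memory_gb(params_b, bytes_pp)
        share = gb / HBM_TOTAL_GB
        print(f"{name:28s} {prec:10s} {gb:6.0f} GB  ({share:5.1%} of node HBM)")
```

Even a 400B-class model at FP16 uses only about half the node's HBM for weights, leaving the rest for KV cache and activations.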
ChatGPT and all these commercial LLMs have been trained on clusters of 10k+ A100/H100.
You can do "nothing" with 8x MI300X; hence the extreme importance of interconnect and radix.
 
ChatGPT and all these commercial LLMs have been trained on clusters of 10k+ A100/H100.
You can do "nothing" with 8x MI300X; hence the extreme importance of interconnect and radix.
A node is nothing more than a test run; nobody is putting this into production. Amazon and Nvidia are working on these NVL systems connecting 36 or 72 GB200 superchips together into one rack. With MI325X, companies need >8x more racks for equal performance. AI applications are cheaper or much better with Nvidia hardware. That's the reason AMD Osborned the MI300X and MI325X at Computex.
 
ChatGPT and all these commercial LLMs have been trained on clusters of 10k+ A100/H100.
You can do "nothing" with 8x MI300X; hence the extreme importance of interconnect and radix.
I specifically said inference; I obviously agree with you for training LLMs.

If you want to run something like Llama3-70B, which is close to the state of the art for open-source models until the 400B is released, I am skeptical there's a big benefit to going above 8 GPUs with 1536 GB of HBM. That should allow a large enough batch size to be efficient.
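For what it's worth, here's the rough batch-size arithmetic behind that. The layer/head counts are the published Llama 3 70B config; the FP16 KV cache and the 8K context length are assumptions I picked for illustration:

```python
# Rough KV-cache sizing for Llama3-70B on an 8x MI300X node (1536 GB HBM).

TOTAL_HBM_GB = 8 * 192           # 1536 GB across the node
WEIGHTS_GB   = 70e9 * 2 / 1e9    # ~140 GB of FP16 weights

N_LAYERS    = 80                 # Llama 3 70B config
N_KV_HEADS  = 8                  # grouped-query attention
HEAD_DIM    = 128
BYTES_FP16  = 2                  # assumed KV-cache precision
CONTEXT_LEN = 8192               # assumed tokens per request

# K and V caches, per token, summed over all layers
kv_bytes_per_token = 2 * N_KV_HEADS * HEAD_DIM * BYTES_FP16 * N_LAYERS
kv_gb_per_sequence = kv_bytes_per_token * CONTEXT_LEN / 1e9

free_for_kv_gb = TOTAL_HBM_GB - WEIGHTS_GB
max_concurrent = int(free_for_kv_gb // kv_gb_per_sequence)

print(f"KV cache per 8K-token sequence:  {kv_gb_per_sequence:.2f} GB")
print(f"HBM left after FP16 weights:     {free_for_kv_gb:.0f} GB")
print(f"Concurrent 8K sequences (rough): {max_concurrent}")
```

That works out to roughly 500 concurrent 8K-token sequences, which should be more than enough batch to keep the GPUs busy.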
 
Just curious: do products with low demand, such as MI300, carry a certain level of inventory, or are they manufactured on demand?
 
I specifically said inference; I obviously agree with you for training LLMs.

If you want to run something like Llama3-70B, which is close to the state of the art for open-source models until the 400B is released, I am skeptical there's a big benefit to going above 8 GPUs with 1536 GB of HBM. That should allow a large enough batch size to be efficient.
I partially agree, but everybody and his dog wants the new monstrous GB200 NVL72 rack for inference too, not only for training. It's the hottest piece of hardware to get among the hyperscalers and LLM companies. You may say that 13.5 TB of unified memory is overkill, but OpenAI and co. seem to know what to do with it...
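Putting rough numbers on that contrast; the per-GPU HBM figures and the 30% KV-cache reservation are my assumptions, not spec-sheet quotes:

```python
# Rough comparison of the two coherent-memory domains discussed in this thread.
# Assumptions: ~192 GB HBM per MI300X, ~186 GB per Blackwell GPU in GB200 NVL72.

domains_gb = {
    "8x MI300X node":   8 * 192,   # ~1.5 TB of HBM
    "GB200 NVL72 rack": 72 * 186,  # ~13.4 TB of HBM (marketed as 13.5 TB)
}

def largest_fp16_model_b(hbm_gb: int, kv_reserve_frac: float = 0.3) -> float:
    """Largest FP16 model (billions of params) whose weights fit after
    reserving an assumed fraction of HBM for KV cache and activations."""
    usable_gb = hbm_gb * (1 - kv_reserve_frac)
    return usable_gb / 2  # 2 bytes per FP16 parameter

for name, hbm in domains_gb.items():
    print(f"{name:18s} {hbm:6d} GB HBM -> roughly a "
          f"{largest_fp16_model_b(hbm):,.0f}B-parameter FP16 model, "
          f"with ~30% of HBM left for KV cache")
```

On those assumptions the NVL72 domain holds a multi-trillion-parameter model in FP16 with room to spare, which is the scale the frontier labs care about.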
 
Just curious: do products with low demand, such as MI300, carry a certain level of inventory, or are they manufactured on demand?
"Low demand" in this case just means that the wait time is a few months shorter than for "high demand". They manufacture 24/7 whatever they've allocated to be manufactured.
(Also, this "low demand" supposedly confirmed by Samsung never actually came from Samsung.)
 
Just curious: do products with low demand, such as MI300, carry a certain level of inventory, or are they manufactured on demand?
The production cycle for advanced products like these is quite long. Silicon fabrication + packaging + testing + shipping (though these are high-cost enough that air freight can be used) takes 2-3 months. Fab capacity has to be booked in advance, though you can pay for faster hot lots if required and capacity is available. The HBM also has to be ordered in advance given the current demand (as per recent reports, HBM capacity is practically sold out through 2025). AMD has a pretty good idea of the expected demand, as it engages with customers well in advance on production and allocation given the lead times involved. The most recent estimate I can remember is that they expect ~$4B in MI300 sales for 2024 (not sure if this is just MI300X or includes MI300A as well). There are also other components customers have to procure, i.e. server racks, CPUs, RAM, SSDs, etc., which have their own lead times, so it all has to be planned in sync.

So, short answer: planned in advance and manufactured as per demand. These things also cost ~$20K USD each, so you don't want a lot of expensive inventory lying around, though given the current demand situation there is practically no scope for inventory anyway. And from what I understand, the current limitation to production for both AMD and Nvidia is packaging capacity, not fab capacity.
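A minimal sketch of what those figures imply in unit terms; the ~$4B revenue expectation, ~$20K unit price, and 2-3 month cycle are from the post above, and everything derived from them is back-of-the-envelope only:

```python
# Rough unit-volume arithmetic implied by the numbers in this post.

REVENUE_2024_USD  = 4e9    # AMD's stated MI300 sales expectation for 2024
PRICE_PER_GPU_USD = 20e3   # rough per-unit price mentioned above
LEAD_TIME_MONTHS  = 2.5    # fab + packaging + test + shipping (2-3 months)

units_2024      = REVENUE_2024_USD / PRICE_PER_GPU_USD
units_per_month = units_2024 / 12
work_in_flight  = units_per_month * LEAD_TIME_MONTHS

print(f"Implied 2024 volume:           ~{units_2024:,.0f} GPUs")
print(f"Average monthly output:        ~{units_per_month:,.0f} GPUs")
print(f"Units in the pipeline at once: ~{work_in_flight:,.0f}")
```

With tens of thousands of units always somewhere in the fab/packaging/test pipeline, the wafer, HBM, and packaging capacity has to be committed months before any sale, which is another way of seeing why it's planned in advance rather than built to inventory.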
 