AMD RDNA5 Architecture Speculation

No, but much of ROCm's success with compute came from being designed to exploit BC at the ISA level, so that their big customers could specifically avoid the terrible experiences that plagued their graphics portfolio ...

What you describe is some stretchy level of “BC” at the GCN assembly level. But the ISA in reality is still mildly different, with breaking changes (if not instructions, then timing) across e.g. gfx908, gfx90a and gfx940. If that is the “BC” you are referring to, one could stretch it further and say that such assembly-level (semi-)portability exists across GCN, CDNA and RDNA to varying extents.

Otherwise, many are expected to use AMD’s libraries, HIP or mainstream libraries on top of these, which all present high(er)-level abstractions like CUDA-like interfaces and/or kernel languages. “BC at ISA level” does not sound like a relevant concern for the abstraction consumers here, only for the implementors (so mostly AMD and the ML infra teams of the “big customers”).
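To make that concrete, a rough sketch of the library-level path those consumers sit on (assuming hipBLAS here; error handling is trimmed and the sizes are made up). The interface mirrors cuBLAS, and nothing in it ever names a gfx ISA target:

```cpp
// Minimal sketch of the library-level abstraction most users sit on.
// hipBLAS deliberately mirrors the cuBLAS interface; note that nothing in this
// code names a gfx ISA target. Error handling is trimmed for brevity.
#include <hip/hip_runtime.h>
#include <hipblas/hipblas.h>   // older ROCm releases used <hipblas.h>
#include <vector>

int main() {
    const int n = 512;                       // square SGEMM: C = alpha*A*B + beta*C
    const float alpha = 1.0f, beta = 0.0f;
    std::vector<float> hA(n * n, 1.0f), hB(n * n, 1.0f), hC(n * n, 0.0f);

    float *dA, *dB, *dC;
    hipMalloc((void**)&dA, n * n * sizeof(float));
    hipMalloc((void**)&dB, n * n * sizeof(float));
    hipMalloc((void**)&dC, n * n * sizeof(float));
    hipMemcpy(dA, hA.data(), n * n * sizeof(float), hipMemcpyHostToDevice);
    hipMemcpy(dB, hB.data(), n * n * sizeof(float), hipMemcpyHostToDevice);

    hipblasHandle_t handle;
    hipblasCreate(&handle);
    // Same call shape as cublasSgemm; the library picks the kernel for the
    // device underneath, which is where any ISA-level concern actually lives.
    hipblasSgemm(handle, HIPBLAS_OP_N, HIPBLAS_OP_N,
                 n, n, n, &alpha, dA, n, dB, n, &beta, dC, n);
    hipblasDestroy(handle);

    hipMemcpy(hC.data(), dC, n * n * sizeof(float), hipMemcpyDeviceToHost);
    hipFree(dA); hipFree(dB); hipFree(dC);
    return 0;
}
```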

I see no avenue by which AMD would replace the “machine code” representation with SPIR-V. Their own libraries and “big customers” would certainly still need that path to distribute e.g. target-optimized GEMM kernels.

Even the SPIR-V stack itself needs a target to lower to, eh. So I can’t say I understand how “SPIR-V” ended up provoking these two strong reactions above.
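For context on that “machine code” path: today a target-optimized kernel typically ships as a code object compiled offline for one specific gfx ISA and gets loaded through the HIP module API, which mirrors the CUDA driver API. A hedged sketch below; the file name gemm_gfx942.hsaco and kernel name gemm_fp16 are made up for illustration:

```cpp
// Hedged sketch of the "machine code" distribution path: loading a precompiled,
// target-specific code object at runtime via the HIP module API (which mirrors
// the CUDA driver API). The file name and kernel name are hypothetical; a real
// library would bundle one such object per supported gfx ISA.
#include <hip/hip_runtime.h>
#include <cstdio>

int main() {
    hipModule_t   module = nullptr;
    hipFunction_t kernel = nullptr;

    // A code object compiled offline for one specific target, e.g. gfx942.
    // It only loads on a device whose ISA matches what it was built for.
    if (hipModuleLoad(&module, "gemm_gfx942.hsaco") != hipSuccess) {
        fprintf(stderr, "code object is missing or does not match this device's ISA\n");
        return 1;
    }
    if (hipModuleGetFunction(&kernel, module, "gemm_fp16") != hipSuccess) {
        fprintf(stderr, "kernel symbol not found in code object\n");
        return 1;
    }

    // Kernel arguments are passed as an array of pointers, as with cuLaunchKernel.
    int m = 4096, n = 4096, k = 4096;
    void* args[] = { &m, &n, &k };   // a real GEMM would also pass device pointers
    hipModuleLaunchKernel(kernel,
                          /*gridDim*/  256, 1, 1,
                          /*blockDim*/ 256, 1, 1,
                          /*sharedMemBytes*/ 0, /*stream*/ nullptr,
                          args, /*extra*/ nullptr);
    hipDeviceSynchronize();
    hipModuleUnload(module);
    return 0;
}
```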
 
Yep, and we haven't even talked about their stupid business decision to base most of their APIs and frameworks on CUDA, like the SemiAnalysis article pointed out. No words can express such pure madness and short-sightedness...
When ROCm made its debut, AMD initially offered multiple kernel programming interfaces, such as HIP (a source-level CUDA clone), HCC (C++ AMP with AMD extensions), and OpenCL C. After some time it was clear to anyone that OpenCL was ill-suited due to the lack of many QoL (quality of life) features found in other, more advanced programming languages. And while HCC had the most elegant design, with its device kernel language being the closest to modern host programming languages, HIP soon emerged as the leading choice for compute because it was the most sensible option for anyone looking to port CUDA kernels. AMD's own employees passed on everything else besides HIP because even they preferred working with it over the other options ...

If AMD made bad decisions, settling on a programming foundation similar to CUDA was arguably not one of them, since their automated portability tools like HIPIFY benefited massively from this design ...
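For anyone who hasn't watched a hipify pass run, a hand-written before/after sketch (not actual HIPIFY output) of why the port is mostly mechanical: the device code and launch syntax carry over unchanged, and the host API calls are a near 1:1 rename (CUDA originals shown in comments):

```cpp
// Hand-written illustration of the kind of rewrite hipify performs: the device
// code and launch syntax are untouched, and host API calls are a near 1:1 rename.
#include <hip/hip_runtime.h>   // CUDA original: #include <cuda_runtime.h>
#include <cstdio>
#include <vector>

__global__ void scale(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // identical built-ins in CUDA and HIP
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1024;
    std::vector<float> host(n, 3.0f);

    float* dev = nullptr;
    hipMalloc((void**)&dev, n * sizeof(float));      // CUDA: cudaMalloc(...)
    hipMemcpy(dev, host.data(), n * sizeof(float),
              hipMemcpyHostToDevice);                // CUDA: cudaMemcpyHostToDevice

    scale<<<(n + 255) / 256, 256>>>(dev, 2.0f, n);   // launch syntax is unchanged

    hipMemcpy(host.data(), dev, n * sizeof(float),
              hipMemcpyDeviceToHost);                // CUDA: cudaMemcpyDeviceToHost
    hipFree(dev);                                    // CUDA: cudaFree(...)

    printf("host[0] = %f\n", host[0]);               // expect 6.0
    return 0;
}
```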
What you describe is some stretchy level of “BC” at the GCN assembly level. But the ISA in reality is still mildly different, with breaking changes (if not instructions, then timing) across e.g. gfx908, gfx90a and gfx940. If that is the “BC” you are referring to, one could stretch it further and say that such assembly-level (semi-)portability exists across GCN, CDNA and RDNA to varying extents.
If a common set of instruction encodings for the x86 architecture is good enough for both AMD and Intel to iterate their future CPU designs off of, then I don't see why that level of hardware compatibility is not good enough for AMD's compute accelerators to do the same. AMD's and Intel's CPU implementations often don't have consistent timing behaviour between each other, or possibly even between their own generations, so I don't see why AMD would need to implement exact timings either ...

Sure, there may be software out there built for very old proprietary platforms that does rely on exact hardware timings, but ISA-level compatibility is still better than having nothing, since the solution then becomes a matter of patching the binaries rather than generating an entirely incompatible set of binaries from the source code ...
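As a concrete illustration of what ISA-level compatibility means for shipped binaries today: the runtime reports a specific gfx target per device, and libraries/applications key their precompiled code objects off it. A rough sketch, assuming the gcnArchName field that hipGetDeviceProperties exposes (the exact string, e.g. "gfx90a:sramecc+:xnack-", varies by device and ROCm version):

```cpp
// Sketch of why the gfx ISA target matters to anyone shipping binaries today:
// the runtime reports a specific target string, and precompiled code objects
// only load if they were built for a compatible target. Assumes the
// gcnArchName field of hipDeviceProp_t; error handling trimmed.
#include <hip/hip_runtime.h>
#include <cstdio>
#include <cstring>

int main() {
    int count = 0;
    hipGetDeviceCount(&count);

    for (int dev = 0; dev < count; ++dev) {
        hipDeviceProp_t props;
        hipGetDeviceProperties(&props, dev);

        // Typically something like "gfx90a:sramecc+:xnack-"; the part before the
        // first ':' is the ISA target a code object must have been built for.
        printf("device %d reports target: %s\n", dev, props.gcnArchName);

        // A shipping library would use this to pick which of its bundled code
        // objects (or which precompiled kernel set) to load for this device.
        if (strncmp(props.gcnArchName, "gfx94", 5) == 0) {
            printf("  -> would select the gfx94x-optimized kernels\n");
        } else {
            printf("  -> would fall back to a generic/other kernel set\n");
        }
    }
    return 0;
}
```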
 
If AMD made bad decisions, settling on a programming foundation similar to CUDA was arguably not one of them, since their automated portability tools like HIPIFY benefited massively from this design ...
Everybody understands why AMD built the majority of their tools on CUDA, but it's still a bad business strategy.
They could have started with HIPIFY and built their own stack at the same time. Come on, it's been 10 years already, plenty of time to have a mature ecosystem...
 
Everybody understands why AMD built the majority of their tools on CUDA, but it's still a bad business strategy.
They could have started with HIPIFY and built their own stack at the same time. Come on, it's been 10 years already, plenty of time to have a mature ecosystem...
Is it really as bad a business strategy as you think it is when Intel, who poached one of AMD's leading staff members who kick-started their Boltzmann initiative (the origin of ROCm, in the aftermath of the HSA project), are doing nearly the same thing with DPC++ (Intel's implementation of SYCL with their own set of proprietary extensions to 'mimic' CUDA) and their DPC++ compatibility tool as well?
 
Is it really as bad a business strategy as you think it is when Intel, who poached one of AMD's leading staff members who kick-started their Boltzmann initiative (the origin of ROCm, in the aftermath of the HSA project), are doing nearly the same thing with DPC++ (Intel's implementation of SYCL with their own set of proprietary extensions to 'mimic' CUDA) and their DPC++ compatibility tool as well?
Yes, both are terrible. In fact, Intel + AMD combined are less than 10% of Nvidia in the DC. Of course it's not only because of the CUDA moat, but the reality is that Intel/AMD have failed to develop their ecosystems over all these years. As a reminder, CUDA was launched in 2006, nearly 20 years ago...
 
Yes, both are terrible. In fact, Intel + AMD combined are less than 10% of Nvidia in the DC. Of course it's not only because of the CUDA moat, but the reality is that Intel/AMD have failed to develop their ecosystems over all these years. As a reminder, CUDA was launched in 2006, nearly 20 years ago...
Assuming that we're having this discussion in good faith, what alternative plans do you think AMD could've taken up after the failure of HSA (besides ROCm)? Or, if you want to be cheeky in your response, would you have preferred them to do absolutely nothing?
 
So AMD is basically telling their compute users to start over again after their OpenCL and HSA failures? Moving exclusively to SPIR-V in the future just means that existing codebases won't work on their future hardware designs. When you're buying a next-generation Instinct accelerator, you're also paying for the promise that any software that worked on their past/current hardware iterations (gfx9.x) WILL work on a future Instinct product (gfx9.y where y > x), regardless of driver/technical support, provided they don't mess up their hardware implementation ...

When AMD uproots their entire compute stack, are you sure you really want to trust their dodgy history of software support, especially when they move to a more maintenance-intensive platform?

Such a shame, right after they were building up trust for ROCm with a hardware BC model ...
Unclear how adding some flavour of IR is somehow a breaking change, but here we are. Perhaps “SPIR-V” is being massively overloaded here? Otherwise, your view/assessment is rather confusing.
 
Unclear how adding some flavour of IR is somehow a breaking change, but here we are. Perhaps “SPIR-V” is being massively overloaded here? Otherwise, your view/assessment is rather confusing.
Because there aren't yet any applications that have been ported to this new IR? Is the end goal for this new IR somehow to replace the current means of attaining software forward compatibility, rather than relying on built-in hardware BC? If so, why change something that has served them well enough until now?
 