AMD RDNA5 Architecture Speculation

pTmdfx · Mar 4, 2025

Lurkmass said:
No but much of ROCm's success with compute was designed to exploit BC at the ISA level so that their big customers can specifically avoid having the terrible experiences plagued from their graphics portfolio ...

What you describe is some stretchy level of “BC” at GCN assembly level. But the ISA in reality is still mildly different with breaking changes (if not instructions, then timing) across e.g. gfx908, gfx90a and gfx940. If that is the “BC” you are referring, one could stretch it further to say that such assembly level (semi-)portability exists across GCN, CDNA and RDNA to varying extent.

Otherwise, many are expected to use AMD’s libraries, HIP or mainstream libraries on top of these, which all present high(er) level abstractions like CUDA-like interfaces and/or kernel languages. “BC at ISA level” does not sound like a relevant concern for the abstraction consumers here, only the implementor (so mostly AMD and the ML infra teams of the “big customers”).

I see no avenue that AMD will replace the “machine code” representation with SPIR-V. Their own libraries and “big customers” would certainly still need that path to distribute e.g. target optimized GEMM kernels.

Even the SPIR-V stack itself needs a target to lower to, eh. So I can’t say I understand how “SPIR-V” ended up provoking these two strong reactions above.

Lurkmass · Mar 5, 2025

xpea said:
Yep and we don't even talk about their stupid business decision to base most of their APIs and frameworks on CUDA like the semianalysis article pointed out. No words can express such pure madness and short vision...

When ROCm began it's debut, AMD initially offered multiple kernel programming interfaces such as HIP (source level CUDA clone), HCC (C++ AMP w/ AMD extensions), and OpenCL C. After some time had settled it was clear to anyone that OpenCL was illsuited due to the lack of many QoL (quality of life) improvement features that other more advanced programming languages had and whilst HCC had the most elegant design with respect to it's device kernel language being the closest to modern host programming languages, HIP soon emerged as the leading choice for compute because it was the most sensible option for anyone looking to port CUDA kernels to it. AMD's own employees passed up on everything else besides HIP because even they preferred working with it over the other options ...

If AMD did make a bad decision, settling on their programming foundation being similar to CUDA was aguably not one of them since their automated portability tools like HIPIFY benefitted massively from this design ...

pTmdfx said:
What you describe is some stretchy level of “BC” at GCN assembly level. But the ISA in reality is still mildly different with breaking changes (if not instructions, then timing) across e.g. gfx908, gfx90a and gfx940. If that is the “BC” you are referring, one could stretch it further to say that such assembly level (semi-)portability exists across GCN, CDNA and RDNA to varying extent.

If a common set of instruction encodings for the x86 architecture is good enough for both AMD and Intel to iterate their future CPU designs off of then I don't see why that level of hardware compatibility is not good enough for AMD's compute accelerators to do the same and AMD/Intel's CPU hardware implementations often don't have consistent timing behaviour between each other or possibly even themselves so I don't see why AMD would need to implement accurate timings as well either ...

Sure there may be software out there built for very old proprietary platforms that do rely on exact hardware timings but ISA level compatibility is still better than having nothing as the solution would then become a matter of patching the binaries rather than generating an entirely incompatible set of binaries from the source code ...

xpea · Mar 5, 2025

Lurkmass said:
If AMD did make a bad decision, settling on their programming foundation being similar to CUDA was aguably not one of them since their automated portability tools like HIPIFY benefitted massively from this design ...

Everybody understand why AMD build the majority of their tools on CUDA, but it's still a bad business strategy.
They could have started with HIPIFY and build their own stack at the same time. Come on, it's been 10 years already, plenty of time to have a mature ecosystem...

Lurkmass · Mar 5, 2025

xpea said:
Everybody understand why AMD build the majority of their tools on CUDA, but it's still a bad business strategy.
They could have started with HIPIFY and build their own stack at the same time. Come on, it's been 10 years already, plenty of time to have a mature ecosystem...

Is it really a bad business strategy as you think it is when Intel poached one of AMD's leading staff member who kick started their boltzmann initiative (origin of ROCm in the aftermath of the HSA project) are nearly doing the same thing with DPC++ (Intel's implementation of SYCL with their own set of proprietary extensions to 'mimic' CUDA) and their DPC++ compatibility tool as well ?

xpea · Mar 5, 2025

Lurkmass said:
Is it really a bad business strategy as you think it is when Intel poached one of AMD's leading staff member who kick started their boltzmann initiative (origin of ROCm in the aftermath of the HSA project) are nearly doing the same thing with DPC++ (Intel's implementation of SYCL with their own set of proprietary extensions to 'mimic' CUDA) and their DPC++ compatibility tool as well ?

Yes both are terrible. In fact Intel+AMD combined are less than 10% of Nvidia in the DC. Of course it's not only because of CUDA moat but reality is that Intel/AMD failed to develop their ecosystem over all these years. As a reminder, CUDA was launched in 2006, nearly 20 years ago...

Lurkmass · Mar 5, 2025

xpea said:
Yes both are terrible. In fact Intel+AMD combined are less than 10% of Nvidia in the DC. Of course it's not only because of CUDA moat but reality is that Intel/AMD failed to develop their ecosystem over all these years. As a reminder, CUDA was launched in 2006, nearly 20 years ago...

Assuming that we're having this discussion in good faith, what other alternative plans do you think AMD could've taken up after the failure of HSA (besides ROCm) or would you prefer them to do absolutely nothing if you' want to be cheeky in your response ?

AlexV · Mar 6, 2025

Lurkmass said:
So AMD basically tells their compute users to start over again after their OpenCL and HSA failure ? Moving exclusively to SPIR-V in the future just means that any existing codebases won't work on their future hardware designs. When you're getting a next generation Instinct accelerator, you're also paying for the promise that any software that worked on their past/current hardware iterations (gfx9.x) WILL work on a future Instinct product (gfx9.y where y > x) regardless of driver/technical support provided that they don't mess up their hardware implementation ...

When AMD uproots their entire compute stack, are you sure you really want to trust their dodgy history of software support especially when they move to a more maintenance intensive platform ?

Such a shame right after when they were building up trust for ROCm with a hardware BC model ...

Unclear how one adding some flavour of IR is somehow a breaking change but here we are. Perhaps SPIR-V is being massively overloaded here? Otherwise, your view / assessment is rather confusing.

Lurkmass · Mar 6, 2025

AlexV said:
Unclear how one adding some flavour of IR is somehow a breaking change but here we are. Perhaps SPIR-V is being massively overloaded here? Otherwise, your view / assessment is rather confusing.

Because there aren't yet any applications that have been ported to this new IR ? Is the end goal with this new IR somehow supposed to replace the current means of attaining software forward compatibility rather than built-in hardware BC ? If so why change something that served them well enough until now ?

hkultala · Mar 29, 2025

Lurkmass said:
Because there aren't yet any applications that have been ported to this new IR ? Is the end goal with this new IR somehow supposed to replace the current means of attaining software forward compatibility rather than built-in hardware BC ? If so why change something that served them well enough until now ?

Do you understand what "IR" means?

You do not have to port applications to it. You compile applications to it.

Lurkmass · Mar 29, 2025

hkultala said:
You do not have to port applications to it. You compile applications to it.

This sounds like an oxymoron ...

Are you claiming otherwise that current compute applications released in mind as compatible GFX9.x binaries will somehow run 'unmodified' on AMD's other HW generations without a porting process ? (even at the source level, code can rely on HW implementation specific behaviour for correctness/performance) Recompilation/modification of the source code from one binary representation (GFX9.x) to another incompatible binary representation (SPIR-V), by definition technically constitutes as porting! (whether it'd be ISA or IR in any case)

trinibwoy · Mar 29, 2025

Recompilation is technically porting but I think in common usage people define ports as involving unique code paths/optimizations for the target platform.

DegustatoR · Thursday at 12:21 PM

OVERLAY TREES FOR RAY TRACING

From https://www.freepatentsonline.com/y2024/0087223.html

FRUSTUM-BOUNDING VOLUME INTERSECTION DETECTION USING HEMISPHERICAL PROJECTION

From https://www.freepatentsonline.com/y2024/0233242.html

TRAVERSAL RECURSION FOR ACCELERATION STRUCTURE TRAVERSAL

From https://www.freepatentsonline.com/y2024/0370965.html

SPHERE-BASED RAY-CAPSULE INTERSECTOR FOR CURVE RENDERING

From https://www.freepatentsonline.com/y2024/0331266.html

SPLIT BOUNDING VOLUMES FOR INSTANCES

From https://www.freepatentsonline.com/y2024/0412446.html

TRAVERSAL AND PROCEDURAL SHADER BOUNDS REFINEMENT

From https://www.freepatentsonline.com/y2024/0412445.html

NEURAL NETWORK-BASED RAY TRACING

From https://www.freepatentsonline.com/y2025/0005842.html

RAYTRACING STRUCTURE TRAVERSAL BASED ON WORK ITEMS

From https://www.freepatentsonline.com/y2025/0104328.html

LOSSY GEOMETRY COMPRESSION USING INTERPOLATED NORMALS FOR USE IN BVH BUILDING AND RENDERING

From https://www.freepatentsonline.com/y2025/0104285.html

They are adding BVH traversal in HW in this patent application this time.

RAYTRACING STRUCTURE TRAVERSAL BASED ON WORK ITEMS

https://www.freepatentsonline.com/y2025/0104328.html

From here: https://forums.anandtech.com/threads/rdna-5-udna-cdna-next-speculation.2624468/post-41425030

SPHERE-BASED RAY-CAPSULE INTERSECTOR FOR CURVE RENDERING looks like Blackwell's linear swept spheres.

raytracingfan · Thursday at 5:23 PM

The split bounding volumes patent sounds like it has similar goal as Mega Geometry's cluster-level acceleration structures, but I'm not sure if it actually works the same way.

Pinstripe · Thursday at 7:01 PM

As always, Nvidia remains the trendsetter and is a whole generation ahead, and AMD follows. So does Microsoft now, basically just collecting and formalizing Nvidia inventions into DirectX, rather than defining them.

Kaotik · Thursday at 7:08 PM

Pinstripe said:
As always, Nvidia remains the trendsetter and is a whole generation ahead, and AMD follows. So does Microsoft now, basically just collecting and formalizing Nvidia inventions into DirectX, rather than defining them.

You really haven't been following this world for a long time, have you? AMD and ATi before it has pioneered plenty of technologies. Microsoft isn't building DirectX on NVIDIAs whims either.

Pinstripe · Thursday at 7:17 PM

Kaotik said:
You really haven't been following this world for a long time, have you? AMD and ATi before it has pioneered plenty of technologies. Microsoft isn't building DirectX on NVIDIAs whims either.

They have been doing so the past ~10 years. Try to keep up.

DegustatoR · Thursday at 7:19 PM

Pinstripe said:
As always, Nvidia remains the trendsetter and is a whole generation ahead, and AMD follows. So does Microsoft now, basically just collecting and formalizing Nvidia inventions into DirectX, rather than defining them.

Contrary to a popular believe MS has always been collecting and formalizing IHVs inventions and suggestions when deciding what will be supported by DX, so there's nothing really new in that.
Granted that lately Nvidia has been the main source for these but AMD also had some influence (GPU Work Graphs for example or the whole idea behind DX12 in general).

Kaotik · Thursday at 7:58 PM

Pinstripe said:
They have been doing so the past ~10 years. Try to keep up.

Someone inform Oxford and Merriam-Webster that the definition of "always" has changed into "past ~10 years".
And as DegustatoR pointed out above, NVIDIA isn't the only one even in the very recent history who has contributed technologies MS decided to bring into DirectX.

AMD RDNA5 Architecture Speculation

pTmdfx

Lurkmass

xpea

Lurkmass

xpea

Lurkmass

AlexV

Heteroscedasticitate

Lurkmass

hkultala

Lurkmass

trinibwoy

Meh

DegustatoR

raytracingfan

Pinstripe

Kaotik

Drunk Member

Pinstripe

DegustatoR

Kaotik

Drunk Member

Similar threads