Public release of AMD HIP SDK for Windows (July 2023)

DmitryKo

Finally, a public release of HIP SDK for Windows - only took them 5 years!


This release is also referred to as ROCm 5.5.1


Supported ROCm runtime libraries include
but AI libraries (MIOpen, MIGraphX, Composable Kernel) and RCCL are not included this time.

SDK download

Install guide

Windows installer includes a matching video card driver, 23.Q3 AMD Software PRO Edition, but it is optional and regular Adrenalin drivers include HIP runtime support since at least 2021.
 
The branding is a bit confusing, because the Windows version of the HIP SDK actually includes both compiled ROCm runtimes - a hefty 2.9 GB worth of binary files - and C++ header files, which install themselves into Program Files\AMD\ROCm\5.5. On Linux, by contrast, HIP is just a CUDA emulation/interop layer in the larger ROCm SDK and runtime (though 'HIP-Clang' is also an informal name for their current C++/OpenCL/OpenMP LLVM-based compiler).

So it looks like 'AMD HIP SDK for Windows' will be branded as the equivalent of 'ROCm Software Platform' on Linux? For example, rocminfo has been renamed to hipinfo. I wish they simply used a universal 'HIP/ROCm' branding, now that earlier ROCm compilers / languages like HCC have been deprecated.


Supported GPU families include discrete RDNA2 and RDNA3 parts (Radeon RX 6xx0 / Pro W6x00 / Pro V620, Radeon RX 7x00 / Pro W7x00, and some yet-unspecified APUs), but not GCN5 or RDNA1 (Radeon RX Vega / VII / Vega Pro / Pro VII, Radeon RX 5x00 / Pro W5x00 / Pro V520), nor Instinct accelerators (GCN5-based MI25 / 50 and CDNA-based MI100 / 200 / 210 / 250 / 300), which in fact never had Windows driver support.

AMD did promise to enable more GPUs in subsequent releases of ROCm. They need to extensively test and validate each specific GPU model, because their branch of the LLVM compiler has no concept of a 'GPU family', like 'RDNA1' or 'GCN5', and no device-independent bytecode representation like SPIR-V, CUDA's PTX / LTO-IR, or AMD's own HSAIL, so it targets individual GPU dies only (like -arch sm_30, sm_35, sm_37 etc. in early CUDA):

https://www.llvm.org/docs/AMDGPUUsage.html#id11
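To make the per-die targeting concrete, here is a minimal sketch of how a build script might assemble a hipcc invocation with one --offload-arch flag per GPU target. The helper name hipcc_command and the file names are hypothetical, and the target list is only an example, not an official support matrix.

```python
from typing import List


def hipcc_command(source: str, output: str, targets: List[str]) -> List[str]:
    """Build a hipcc argument list with one --offload-arch flag per GPU target.

    Hypothetical helper: it only assembles the arguments and does not run
    the compiler.
    """
    if not targets:
        raise ValueError("at least one GPU target is required")
    cmd = ["hipcc"]
    for target in targets:
        # Each individual chip (e.g. gfx1030, gfx1100) needs its own flag,
        # and the compiler emits separate machine code for each of them.
        cmd.append(f"--offload-arch={target}")
    cmd += [source, "-o", output]
    return cmd
```

Calling hipcc_command("saxpy.cpp", "saxpy", ["gfx1030", "gfx1100"]) yields an argument list that could be passed to subprocess.run; every additional chip adds another flag and another copy of the device code to the resulting binary.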
 
See also:


A Nice Overview Of The ROCm Linux Compute Stack - Phoronix
AMD Radeon Open Compute Platform - XDC 2018




https://forum.beyond3d.com/posts/2204040/
etc.
 
AMD did promise to enable more GPUs with subsequent releases of ROCm. They need to extensively test and validate each specific GPU model, because their branch of the LLVM compiler does not feature a concept of a 'GPU family', like 'RDNA1' and 'GCN5', or a device-independent byte-code representation like SPIR-V, CUDA's PTX, or AMD's own HSAIL, so it targets individual GPU dies only (like -arch sm_30, sm_35, sm_37 etc. in early CUDA):
Not a big deal since you can do runtime compilation from the kernel source rather than an intermediate representation via the HIP RTC library ...
 
There is already a pull request for MIOpen for Windows.
 
you can do runtime compilation from the kernel source

HIP for Windows still doesn't officially support a lot of consumer cards, and in fact my list of supported GPUs above went well beyond the current 5.5.1 release notes and previously announced plans. So to recap: RDNA3 cards are supported, RDNA2 cards are partially supported (AMD promised official Linux support by Fall), and RDNA1 and Vega 20 are unsupported. They all should probably work fine, but why would developers commit to searching enthusiast sites and community forums to see if these cards work in practice?

IMHO, AMD should at least have supported Vega10 (GCN5) and Navi10 (RDNA1), and probably even extended HIP/ROCm support back to Fiji (GCN3) and Polaris20/30 (GCN4), instead of deprecating Vega20 (Radeon VII / Instinct MI50), the single GCN5 chip they officially supported, in the next major release of ROCm. With at least 8-10 TFLOPS of computing power and a fast memory bus, these cards should be as good as the RX 6600.


an intermediate representation via the HIP RTC library

It's only used for linking purposes as far as I understand, and you have to provide full source code. If you need to distribute binaries only, you have to compile and bundle machine code for all the different 'arch' variants (currently 9, more if they add consumer GPUs and APUs in a future release).
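A rough sketch of what bundling per-arch machine code implies at load time: the application has to pick the code object that matches the installed GPU, and it fails outright on any chip that was not built for. The function names and the dict-based bundle are hypothetical, and the assumption that the device reports an arch name with optional ':feature' suffixes (e.g. 'gfx90a:sramecc+:xnack-') is mine, not something stated in this thread.

```python
from typing import Dict


def base_arch(device_arch: str) -> str:
    """Strip feature suffixes such as ':sramecc+:xnack-' from an arch name."""
    return device_arch.split(":", 1)[0]


def select_code_object(bundle: Dict[str, bytes], device_arch: str) -> bytes:
    """Return the machine code built for this device, or fail loudly.

    `bundle` maps an arch name (e.g. 'gfx1030') to its compiled code object.
    There is no fallback: without an intermediate bytecode, a chip that was
    not targeted at build time simply cannot run the kernels.
    """
    arch = base_arch(device_arch)
    try:
        return bundle[arch]
    except KeyError:
        raise RuntimeError(
            f"no code object bundled for {arch}; built for: {sorted(bundle)}"
        ) from None
```

With a bundle built for gfx1030 and gfx1100 only, a lookup for 'gfx1010' raises RuntimeError, which is the situation a binary-only distribution runs into whenever a new chip appears after release.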



Whereas Nvidia has always used PTX bytecode, which the driver translates to actual GPU machine code (SASS), and recently introduced LTO-IR, a variant of LLVM-IR, as yet another intermediate format for distribution in CUDA 12.



I will have to look more closely at their current workflow though...
 
AMD is preparing new compiler targets that would support generic "family" in addition to current "architecture", i.e. specific chip, for the new code object format V6. The new targets include "gfx9-generic" (Vega), "gfx10.1-generic" (RDNA1), "gfx10.3-generic" (RDNA2), and "gfx11-generic" (RDNA3).


That's better than nearly two dozen machine code targets, but still far from the convenience of a universal intermediate bytecode format, like those used by CUDA.
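As an illustration of the saving, here is a small sketch that collapses a list of per-chip targets into family-level generic targets. The generic target spellings are copied from the post above and may differ in released LLVM versions. The chip-to-family table is a small illustrative subset that I am assuming, not a complete or official list.

```python
from typing import Dict, List

# Illustrative subset only (assumption): a few well-known chips per family.
# Generic target names are spelled as in the post above.
GENERIC_TARGET: Dict[str, str] = {
    "gfx900": "gfx9-generic",      # Vega 10
    "gfx906": "gfx9-generic",      # Vega 20
    "gfx1010": "gfx10.1-generic",  # Navi 10
    "gfx1012": "gfx10.1-generic",  # Navi 14
    "gfx1030": "gfx10.3-generic",  # Navi 21
    "gfx1031": "gfx10.3-generic",  # Navi 22
    "gfx1032": "gfx10.3-generic",  # Navi 23
    "gfx1100": "gfx11-generic",    # Navi 31
    "gfx1101": "gfx11-generic",    # Navi 32
    "gfx1102": "gfx11-generic",    # Navi 33
}


def collapse_targets(targets: List[str]) -> List[str]:
    """Replace per-chip targets with their generic family where one is known.

    Order is preserved and duplicates are dropped; chips missing from the
    table are kept as-is and still need their own machine code.
    """
    result: List[str] = []
    for target in targets:
        generic = GENERIC_TARGET.get(target, target)
        if generic not in result:
            result.append(generic)
    return result
```

For example, the six targets gfx1030, gfx1031, gfx1032, gfx1100, gfx1101 and gfx1102 collapse to two generic ones, while a chip outside the table passes through unchanged.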


The AMDGPU frontend still supports code object format V2 and HSAIL bytecode, which was originally developed for OpenCL and C++ AMP compilers, then abandoned in favor of ROCm implementations and native machine code generation - why not update it to support recent versions of LLVM-IR?
 
For their CDNA based products, they bypass the lack of an intermediate bytecode format with a largely backwards compatible ISA implementation which means their latest MI300 series GPUs can potentially run binaries built for as far back as Vega ...

GFX10.1 and GFX10.3 can reasonably be targeted together with a single binary, since their instruction encodings share a compatible subset, provided that you don't rely on heterogeneous memory management APIs like hipMallocManaged (unavailable on RDNA2) or lower precision dot product instructions (not present on all RDNA parts) ...

GFX11 (RDNA3) breaks compatibility once again but GFX12 (RDNA4) at least looks to be backwards compatible with it ...
 
After open-sourcing HIPRT 2.3, AMD intends to open-source more parts of the ROCm 6.x stack and publish internal documentation on hardware features:



This will probably follow the usual AMD model of making bulk commits from internal repositories from time to time, as opposed to fully integrated GitHub based development workflow and build system.

But at least it should be easier to actually check the source code and submit fixes and enhancements to drivers and libraries, where previously you'd just get SDK headers and pre-compiled object files.


Meanwhile, ROCm 6.1 is being prepared for release, but the HIP SDK for Windows has been stuck at ROCm version 5.7.1 for a full 7 months now, with no updated release in sight...
 