Public release of AMD HIP SDK for Windows (July 2023)

DmitryKo · Jul 30, 2023

Finally, a public release of HIP SDK for Windows - only took them 5 years!

Available now: new HIP SDK helps democratize GPU computing

Available today, the HIP SDK is a milestone in AMD's quest to democratize GPU computing. The creators of some of the world's most demanding GPU-accelerated applications already trust HIP, AMD's Heterogeneous-Compute Interface for Portability, when writing code that can be compiled for AMD and...

community.amd.com

This release is also referred to as ROCm 5.5.1

What is ROCm? — ROCm 5.5.1 Documentation Home

rocm.docs.amd.com

Supported ROCm runtime libraries include

Math Libraries (rocBLAS / hipBLAS / hipBLASLt, rocALUTION, rocWMMA, rocSOLVER / hipSOLVER, rocSPARSE / hipSPARSE, rocFFT / hipFFT, rocRAND / hipRAND), and
C++ Primitive Libraries (rocPRIM, rocThrust, hipCUB),

but AI libraries (MIOpen, MIGraphX, Composable Kernel) and RCCL are not included this time.

SDK download

https://www.amd.com/en/developer/rocm-hub/hip-sdk.html

Install guide

https://rocm.docs.amd.com/en/docs-5.5.1/deploy/windows/quick_start.html

The Windows installer includes a matching video card driver, 23.Q3 AMD Software PRO Edition, but it is optional: regular Adrenalin drivers have included HIP runtime support since at least 2021.

DmitryKo · Jul 30, 2023

The branding is a bit confusing, because the Windows version of the HIP SDK actually includes both compiled ROCm runtimes - a hefty 2.9 GB worth of binary files - and C++ header files, which install themselves into Program Files\AMD\ROCm\5.5. On Linux, by contrast, HIP is just a CUDA emulation/interop layer in the larger ROCm SDK and runtime (though 'HIP-Clang' is also an informal name for their current C++/OpenCL/OpenMP LLVM-based compiler).

So it looks like 'AMD HIP SDK for Windows' will be branded as an equivalent to 'ROCm Software Platform' on Linux? For example, rocminfo has been renamed to hipinfo. I wish they simply used a universal 'HIP/ROCm' branding, now that earlier ROCm compilers / languages like HCC have been deprecated.

Supported GPU families include discrete RDNA2 and RDNA3 parts (Radeon RX 6xx0 / Pro W6x00 / Pro V620, Radeon RX 7x00 / Pro W7x00, and some yet-unspecified APUs), but neither GCN5 and RDNA1 parts (Radeon RX Vega / VII / Vega Pro / Pro VII, Radeon RX 5x00 / Pro W5x00 / Pro V520) nor CDNA parts (Instinct MI25 / 50 / 100 / 200 / 210 / 250 / 300), which in fact never had Windows driver support.

https://rocm.docs.amd.com/en/docs-5.5.1/release/windows_support.html#windows-supported-gpus

AMD did promise to enable more GPUs with subsequent releases of ROCm. They need to extensively test and validate each specific GPU model, because their branch of the LLVM compiler has no concept of a 'GPU family' like 'RDNA1' or 'GCN5', and no device-independent bytecode representation like SPIR-V, CUDA's PTX / LTO-IR, or AMD's own HSAIL. So it targets individual GPU dies only (like -arch sm30 sm35 sm37 etc. in early CUDA):

https://www.llvm.org/docs/AMDGPUUsage.html#id11
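To illustrate what per-die targeting means for a build, here is a minimal Python sketch that assembles a hipcc command line with one --offload-arch flag per GPU target. The helper name hipcc_command, the file names, and the two example targets (gfx1030, gfx1100) are placeholders chosen for illustration, not a recommended or complete target list.

```python
def hipcc_command(source, output, archs):
    """Return a hipcc argument list that requests one code object per GPU target.

    Hypothetical helper for illustration; it only builds the argument list
    and does not invoke the compiler.
    """
    if not archs:
        raise ValueError("at least one GPU target is required")
    cmd = ["hipcc"]
    # dict.fromkeys drops duplicate targets while keeping the original order.
    for arch in dict.fromkeys(archs):
        cmd.append(f"--offload-arch={arch}")
    cmd += [source, "-o", output]
    return cmd


if __name__ == "__main__":
    # Example targets only: one RDNA2 die and one RDNA3 die.
    targets = ["gfx1030", "gfx1100"]
    print(" ".join(hipcc_command("saxpy.hip", "saxpy.exe", targets)))
```

Every additional die you want to cover adds another flag and another embedded code object, which is why the number of targets matters for binary-only distribution.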

DmitryKo · Jul 30, 2023

HIP SDK 5.5 for Windows · ROCm ROCm · Discussion #2347

AMD is pleased to announce the availability of the HIP SDK for Windows as part of the ROCm platform. The HIP SDK OS and GPU support page lists the versions of Windows and GPUs validated by AMD. HIP...

github.com

A Nice Overview Of The ROCm Linux Compute Stack - Phoronix
AMD Radeon Open Compute Platform - XDC 2018

What is AMD ROCm? – random blog

AMD ROCm: a wasted opportunity – random blog

The problem described is not that you have to statically link your HIP kernels. ... | Hacker News

news.ycombinator.com

Basic API Questions?

Render states and shaders are high-level abstractions, introduced 25-30 years ago in software frameworks (specifically OpenGL and Photorealistic Renderman) for 'workstation' computers that had several dozen MBytes of memory, a single CPU with several MFLOPs, and a simple 'framebuffer' graphics...

forum.beyond3d.com

OpenCL 3.0 [2020]

OpenCL 3.0 has been released, with the language specifications upgraded to 'C++ for OpenCL' , a C++17 and OpenCL C 2.0 compliant compiler based on the Clang/LLVM which replaces 'OpenCL C++' from v2.2. khronos.org/news/press/khronos-group-releases-opencl-3.0...

forum.beyond3d.com

https://forum.beyond3d.com/posts/2204040/
etc.

Lurkmass · Jul 30, 2023

DmitryKo said:
AMD did promise to enable more GPUs with subsequent releases of ROCm. They need to extensively test and validate each specific GPU model, because their branch of the LLVM compiler does not feature a concept of a 'GPU family', like 'RDNA1' and 'GCN5', or a device-independent byte-code representation like SPIR-V, CUDA's PTX, or AMD's own HSAIL, so it targets individual GPU dies only (like -arch sm30 sm35 sm37 etc. in early CUDA):

Not a big deal since you can do runtime compilation from the kernel source rather than an intermediate representation via the HIP RTC library ...

Granath · Jul 31, 2023

DmitryKo said:
Finally, a public release of HIP SDK for Windows - only took them 5 years!

Available now: new HIP SDK helps democratize GPU computing

Available today, the HIP SDK is a milestone in AMD's quest to democratize GPU computing. The creators of some of the world's most demanding GPU-accelerated applications already trust HIP, AMD's Heterogeneous-Compute Interface for Portability, when writing code that can be compiled for AMD and...

community.amd.com

This release is also referred to as ROCm 5.5.1

What is ROCm? — ROCm 5.5.1 Documentation Home

rocm.docs.amd.com

Supported ROCm runtime libraries include

Math Libraries (rocBLAS / hipBLAS / hipBLASLt, rocALUTION, rocWMMA, rocSOLVER / hipSOLVER, rocSPARSE / hipSPARSE, rocFFT / hipFFT, rocRAND / hipRAND), and

C++ Primitive Libraries (rocPRIM, rocThrust, hipCUB),

but AI libraries (MIOpen, MIGraphX, Composable Kernel) and RCCL are not included this time.

SDK download

https://www.amd.com/en/developer/rocm-hub/hip-sdk.html

Install guide

Windows quick-start installation guide — HIP SDK installation Windows

Windows quick-start installation guide

rocm.docs.amd.com

Windows installer includes a matching video card driver, 23.Q3 AMD Software PRO Edition, but it is optional and regular Adrenalin drivers include HIP runtime support since at least 2021.

There is a pull request for MIOpen for Windows already.

DmitryKo · Aug 2, 2023

Lurkmass said:
you can do runtime compilation from the kernel source

HIP for Windows still doesn't officially support a lot of consumer cards, and in fact my list of supported GPUs above went well beyond the current 5.5.1 release notes and previously announced plans. So to recap: RDNA3 cards are supported, RDNA2 cards are partially supported (AMD promised official Linux support by Fall), and RDNA1 and Vega 20 are unsupported, though they all should probably work fine. But why would developers commit to searching enthusiast sites and community forums to see if these cards work in practice?

IMHO, AMD should at least have supported Vega10 (GCN5) and Navi10 (RDNA1), and probably even extend HIP/ROCm support back to Fiji (GCN3) and Polaris20/30 (GCN4), instead of deprecating Vega20 (Radeon VII / Instinct MI50), the single GCN5 chip they officially supported, in the next major release of ROCm. With at least 8-10 TFLOPS of computing power and a fast memory bus, these cards should be as good as RX 6600.

an intermediate representation via the HIP RTC library

It's only used for linking purposes as far as I understand, and you have to provide full source code. If you need to distribute binaries only, you have to compile and bundle machine code for all the different 'arch' variants (currently 9, more if they add consumer GPUs and APUs in a future release).

The problem described is not that you have to statically link your HIP kernels. ... | Hacker News

news.ycombinator.com

Nvidia, by contrast, has always used PTX bytecode and SASS assembly, which are both translated to actual GPU machine code by the driver, and recently introduced LTO-IR, a variant of LLVM-IR, as yet another intermediate format for distribution in CUDA 12.

CUDA 12.0 Compiler Support for Runtime LTO Using nvJitLink Library | NVIDIA Technical Blog

CUDA Toolkit 12.0 introduces a new nvJitLink library for Just-in-Time Link Time Optimization (JIT LTO) support.

developer.nvidia.com

I will have to look more closely at their current workflow though...

DmitryKo · Feb 13, 2024

AMD is preparing new compiler targets that would support generic "family" in addition to current "architecture", i.e. specific chip, for the new code object format V6. The new targets include "gfx9-generic" (Vega), "gfx10.1-generic" (RDNA1), "gfx10.3-generic" (RDNA2), and "gfx11-generic" (RDNA3).

AMDGPU LLVM Adding GFX 9/10/11 "Generic Targets" To Build Once & Run On Multiple GPUs - Phoronix

www.phoronix.com

That's better than nearly two dozen machine code targets, but still far from the convenience of a universal intermediate bytecode format, like those used by CUDA.
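As a rough picture of how a "family" target would change binary distribution, here is a toy Python sketch of a loader that prefers a code object built for the exact chip and falls back to the family-wide generic one. This is not the HIP runtime's actual selection logic. The function name and the chip-to-family table are illustrative assumptions covering only a handful of chips, and the generic target names are spelled as in the post above, so the exact spelling accepted by the compiler should be checked against the LLVM AMDGPU documentation.

```python
# Illustrative subset only: a few well-known chips and the generic target
# (as named in the post) that would cover them.
GENERIC_TARGET = {
    "gfx900": "gfx9-generic",      # Vega 10
    "gfx906": "gfx9-generic",      # Vega 20
    "gfx1010": "gfx10.1-generic",  # Navi 10
    "gfx1030": "gfx10.3-generic",  # Navi 21
    "gfx1100": "gfx11-generic",    # Navi 31
}


def pick_code_object(device_arch, bundled_targets):
    """Return the name of the bundled target to load for device_arch.

    Hypothetical helper: an exact per-chip match wins, then the generic
    family target, otherwise the device cannot run the binary.
    """
    if device_arch in bundled_targets:
        return device_arch
    generic = GENERIC_TARGET.get(device_arch)
    if generic is not None and generic in bundled_targets:
        return generic
    raise LookupError(f"no code object bundled for {device_arch}")


if __name__ == "__main__":
    bundle = {"gfx1100", "gfx10.3-generic"}
    for arch in ("gfx1100", "gfx1030"):
        print(arch, "->", pick_code_object(arch, bundle))
```

With per-chip targets only, the fallback branch does not exist, so any chip missing from the bundle ends in the error path.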

The AMDGPU frontend still supports code object format V2 and HSAIL bytecode, which was originally developed for OpenCL and C++ AMP compilers, then abandoned in favor of ROCm implementations and native machine code generation - why not update it to support recent versions of LLVM-IR?

Lurkmass · Feb 24, 2024

For their CDNA based products, they bypass the lack of an intermediate bytecode format with a largely backwards compatible ISA implementation which means their latest MI300 series GPUs can potentially run binaries built for as far back as Vega ...

GFX10.1 and GFX10.3 can be reasonably targeted together simultaneously with a single binary provided that you don't abuse heterogeneous memory management APIs like hipMallocManaged (unavailable on RDNA2) or lower precision dot product instructions (not present on all RDNA parts) since they both feature a compatible subset with their instruction encodings ...

GFX11 (RDNA3) breaks compatibility once again but GFX12 (RDNA4) at least looks to be backwards compatible with it ...

DmitryKo · Apr 3, 2024

After open-sourcing HIPRT 2.3, AMD intends to open-source more parts of the ROCm 6.x stack and publish internal documentation on hardware features:

AMD Says They'll Be Open-Sourcing More Of Their GPU Software Stack & Hardware Docs - Phoronix

www.phoronix.com

AMD ROCm Going Open-Source: Will Include Software Stack & Hardware Documentation

AMD plans to open-source portions of its ROCm software stack and hardware documentation in a future update to refine its ecosystem.

wccftech.com

This will probably follow the usual AMD model of making bulk commits from internal repositories from time to time, as opposed to a fully integrated GitHub-based development workflow and build system.

But at least it should be easier to actually check the source code and submit fixes and enhancements to drivers and libraries, where previously you'd just get SDK headers and pre-compiled object files.

Meanwhile, ROCm 6.1 is being prepared for release, but the HIP SDK for Windows has been stuck at ROCm version 5.7.1 for a full 7 months now, with no updated release in sight...