AMD supporting NVIDIA extensions on the sly?

MfA · May 31, 2023

I don't currently have the hardware to see what's going on, but I noticed that the Shark library for ML, which got some publicity for a while for accelerating Stable Diffusion on AMD hardware, is supposed to support WMMA on RDNA3 according to AMD. That's strange, because it compiles to Vulkan/SPIR-V and there's supposedly no WMMA support there (only through various routes in ROCm).

The Shark library seems to think the RDNA3 driver supports the VK_NV_cooperative_matrix extension but the driver doesn't seem to advertise it ... what's going on here?

Lurkmass · May 31, 2023

The library just makes an attempt to detect the specific arch for AMD hardware rather than querying capabilities from the driver itself ...
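To make the distinction concrete, here is a minimal sketch of the two ways a library could decide to enable a cooperative-matrix code path. Everything in it is hypothetical: the function names, the name-matching rule, and the sample inputs are illustrative assumptions, not code from Shark or IREE and not real driver output.

```python
# Hypothetical sketch: neither function is taken from Shark or IREE.
COOP_MATRIX_EXT = "VK_NV_cooperative_matrix"


def enabled_by_driver_query(advertised_extensions):
    """Capability-based check: trust only what the driver reports."""
    return COOP_MATRIX_EXT in set(advertised_extensions)


def enabled_by_arch_guess(device_name):
    """Arch-based check: infer the feature from the device name alone."""
    # Assumption for illustration: treat any 7900-series name as RDNA3.
    return "7900" in device_name.lower()


# Placeholder inputs, not captured from a real system.
advertised = ["VK_KHR_swapchain", "VK_AMD_buffer_marker"]
device = "AMD Radeon RX 7900 XTX"

print(enabled_by_driver_query(advertised))  # extension is not in the list
print(enabled_by_arch_guess(device))        # name match enables the path anyway
```

With inputs like these the two checks disagree, which is the kind of mismatch being described: the code path is selected even though the driver never lists the extension.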

MfA · Jun 1, 2023

They are pulling in tuned code (MLIR?) from Google storage. I'll try to run it to see exactly what SPIR-V code it tries to throw at RDNA3 cards.

PS. though the iree tool chain seems to be included with the drivers, bleh.

Krteq · Jun 1, 2023

Support of some Vulkan extensions from other IHVs is pretty normal.

For example, on my Renoir-based work notebook, I have these Vulkan extensions listed as supported:
VK_INTEL_shader_integer_functions2
VK_NV_compute_shader_derivatives

AMD Radeon Graphics (RADV RENOIR) - Vulkan Hardware Database by Sascha Willems

vulkan.gpuinfo.org

...and on my RTX 3080 gaming rig
VK_AMD_buffer_marker

NVIDIA GeForce RTX 3080 - Vulkan Hardware Database by Sascha Willems

vulkan.gpuinfo.org

MfA · Jun 1, 2023

It's not in the release notes though. I only just noticed that the gpuinfo site lets you find the drivers which advertise an extension; this one is advertised on only a tiny subset of the 7900 XT(X) entries in the database. It doesn't depend on how new the drivers are either, weird.

PS. just to be clear, I'm not saying that supporting NVIDIA extensions is in itself on the sly; my point is that they have never publicized support for VK_NV_cooperative_matrix and that their latest drivers don't tend to expose support. If they can just execute the SPIR-V code but don't expose support for the Vulkan extension at all, that's rather on the sly.

Lurkmass · Jun 2, 2023

Krteq said:
Support of some Vulkan extensions from other IHVs is pretty normal.

It really isn't ...

The RADV driver isn't officially supported by AMD either. When you look at the driver they develop in an official capacity (AMDVLK), 'their' drivers don't feature any vendor extensions aside from their own. VK_AMD_buffer_marker isn't an official extension either, and it adds no meaningful new functionality since it's used solely for debugging ...

MfA said:
It's not in the release notes though. I only just noticed that the gpuinfo site lets you find the drivers which advertise an extension; this one is advertised on only a tiny subset of the 7900 XT(X) entries in the database. It doesn't depend on how new the drivers are either, weird.

PS. just to be clear, I'm not saying that supporting NVIDIA extensions is in itself on the sly; my point is that they have never publicized support for VK_NV_cooperative_matrix and that their latest drivers don't tend to expose support. If they can just execute the SPIR-V code but don't expose support for the Vulkan extension at all, that's rather on the sly.

No AMD driver (including community developed ones) advertises support for VK_NV_cooperative_matrix. It's just the way the library forces a specific subset of AMD hardware to execute a different codepath compared to other AMD hardware ...

I'm under the impression that you're overthinking this case. The library probably doesn't even make use of the extension itself!

MfA · Jun 2, 2023

IREE definitely uses it in the RDNA3 compilation, but it's possible it's just used as a shitty IR and somewhere down the line it's raised again and compiled to something different before it hits the final SPIR-V. It would be nice if someone who actually understands that codebase could explain, to settle my idle curiosity.

iree/compiler/src/iree/compiler/Codegen/SPIRV/test/config_amd_matmul_cooperative_ops.mlir at f84d8a8708c86c9c1bd91af06f426b8223baa6dc · openxla/iree

A retargetable MLIR-based machine learning compiler and runtime toolkit. - openxla/iree

github.com

iree/compiler/src/iree/compiler/Dialect/Vulkan/Utils/TargetTriple.cpp at f84d8a8708c86c9c1bd91af06f426b8223baa6dc · openxla/iree

A retargetable MLIR-based machine learning compiler and runtime toolkit. - openxla/iree

github.com

Lurkmass · Jun 2, 2023

MfA said:
IREE definitely uses it in the RDNA3 compilation, but it's possible it's just used as a shitty IR and somewhere down the line it's raised again and compiled to something different before it hits the final SPIR-V. It would be nice if someone who actually understands that codebase could explain, to settle my idle curiosity.

I can explain after further inspection ...

If you look into their actual API implementation, the library itself doesn't make use of any NVIDIA extension, but it does use an extension which allows the author to explicitly control the wave size of shaders. On RDNA 1/2, Wave32 is the native subgroup size and is often the preferred subgroup mode over Wave64 for executing shaders. On RDNA 3, with the introduction of VOPD, in theory you want shaders to be Wave64 rather than Wave32 for optimal performance in potentially most cases, since it's harder for the compiler to use dual-issue with Wave32 ...

I heard from a RADV contributor, based on his own testing, that WMMA and non-WMMA performance are mostly the same, which may suggest that the instruction is implemented in microcode ...

What IREE is very likely doing is generating SPIR-V to force AMD's compiler to produce Wave64 shaders to take advantage of VOPD ...
