AMD supporting NVIDIA extensions on the sly?

MfA

I don't currently have the hardware to check what's going on, but I noticed that the SHARK library for ML, which got some publicity a while back for accelerating Stable Diffusion on AMD hardware, is supposed to support WMMA on RDNA3 according to AMD. That is strange, because it compiles to Vulkan/SPIR-V and there is supposedly no WMMA support there (only through various routes in ROCm).

The Shark library seems to think the RDNA3 driver supports the VK_NV_cooperative_matrix extension but the driver doesn't seem to advertise it ... what's going on here?
 
The library just makes an attempt to detect the specific arch for AMD hardware rather than querying capabilities from the driver itself ...
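To make the difference concrete, here is a minimal sketch of the two approaches. The function names and the "RX 7900" name fragment are made up for illustration and are not taken from SHARK or IREE; the only real identifier is the extension string itself.

```python
from typing import Iterable

# Hypothetical helpers for illustration only; not SHARK/IREE code.

def coop_matrix_by_arch(device_name: str) -> bool:
    """Guess support from the device name, the way name-based arch detection would."""
    # Assumption: an RDNA3 card is recognised by a name fragment such as "RX 7900".
    return "AMD" in device_name and "RX 7900" in device_name

def coop_matrix_by_extension(extensions: Iterable[str]) -> bool:
    """Trust only what the driver reports in its device extension list."""
    return "VK_NV_cooperative_matrix" in set(extensions)

# A driver that does not advertise the extension still passes the name check.
name = "AMD Radeon RX 7900 XTX"
reported = ["VK_KHR_swapchain", "VK_EXT_subgroup_size_control"]
print(coop_matrix_by_arch(name), coop_matrix_by_extension(reported))
```

With name-based detection the two answers can disagree, which would explain a library taking a cooperative-matrix code path on a driver that never lists the extension.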
 
They are pulling in tuned code (MLIR?) from Google storage. I'll try to run it to see exactly what SPIR-V code it tries to throw at RDNA3 cards.

PS. Though the IREE toolchain seems to be included with the drivers, bleh.
 
It's not in the release notes though. I only just noticed that the gpuinfo site lets you find the drivers which advertise an extension. The extension is advertised on a tiny subset of the 7900 XT(X) entries in the database, and it doesn't depend on how new the drivers are either. Weird.

PS. Just to be clear, I'm not saying that supporting NVIDIA extensions is in itself on the sly. The point is that they have never publicized support for VK_NV_cooperative_matrix and that their latest drivers don't tend to expose it. If they can just execute the SPIR-V code but don't expose support for the Vulkan extension at all, that's rather on the sly.
 
Support of some Vulkan extensions from other IHVs is pretty normal.
It really isn't ...

The RADV driver isn't officially supported by AMD either. When you look at the driver they develop in an official capacity (AMDVLK), it doesn't feature any vendor extensions other than their own. VK_AMD_buffer_marker is not an official extension either, and it adds no meaningful new functionality since it's used solely for debugging ...
No AMD driver (including community developed ones) advertises support for VK_NV_cooperative_matrix. It's just the way the library forces a specific subset of AMD hardware to execute a different codepath compared to other AMD hardware ...

I'm under the impression that you're overthinking this case. The library probably doesn't even make use of the extension itself!
 
IREE definitely uses it in the RDNA3 compilation, but it's possible it's just used as a shitty IR, and somewhere down the line it's raised again and compiled to something different before it hits the final SPIR-V. It would be nice if someone who actually understands that codebase could explain, to settle my idle curiosity.


 
I can explain after further inspection ...

If you look into their actual API implementation, the library itself doesn't make use of any NVIDIA extension, but it does use an extension that lets the author explicitly control the wave size of shaders. On RDNA 1/2, Wave32 is the native subgroup size and is often preferred over Wave64 for executing shaders. On RDNA 3, with the introduction of VOPD, in theory you want shaders to be Wave64 rather than Wave32 for optimal performance in most cases, since it's harder for the compiler to use dual-issue with Wave32 ...

I heard from a RADV contributor that, based on his own testing, WMMA vs non-WMMA performance is mostly the same, which may suggest that the instruction is a microcode implementation ...

What IREE is very likely doing is generating SPIR-V to force AMD's compiler to produce Wave64 shaders to take advantage of VOPD ...
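As a rough sketch of that wave-size reasoning: the post doesn't name the extension, so I'm assuming it is VK_EXT_subgroup_size_control, where the driver reports minSubgroupSize/maxSubgroupSize and the application sets requiredSubgroupSize per shader stage. The function and the arch strings below are made up for illustration and are not IREE code.

```python
def pick_subgroup_size(arch: str, min_size: int, max_size: int) -> int:
    """Choose a subgroup (wave) size to request, clamped to what the driver allows.

    Hypothetical policy mirroring the reasoning above: Wave64 on RDNA 3,
    Wave32 on older RDNA parts. min_size/max_size stand in for the
    driver-reported minSubgroupSize/maxSubgroupSize.
    """
    preferred = 64 if arch == "rdna3" else 32
    # Clamp the preference into the driver-reported range.
    return max(min_size, min(preferred, max_size))

print(pick_subgroup_size("rdna3", 32, 64))
print(pick_subgroup_size("rdna2", 32, 64))
```

The value this returns is what would go into requiredSubgroupSize when the pipeline is created, if the assumption about the extension holds.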
 