Bondrewd
Veteran
Naaa, if AMD wanted a dedicated GEMM block they could've done MFMA without DPFP/SPFP GEMM support.

I guess in RDNA4 the WMMA ops are handled by an actual dedicated ALU, but one very tightly integrated with the vector units, i.e. sharing the same data path and issue port, so concurrent execution is not possible, just like on RDNA3. The capabilities are significantly enhanced though: extended type support and sparsity.