Machine Learning: WinML/DirectML, CoreML & all things ML

April 1, 2024
It comes as little surprise that Windows continues to be a popular choice for professional workstations, and in 2023, about 90% of Puget Systems customers purchased a Windows-based system. Today, we’ll discuss the benefits and drawbacks of Windows-based workstations compared to Linux-based systems, specifically with regard to Stable Diffusion workflows, and provide performance results from our testing across various Stable Diffusion front-end applications.
We will discuss some other considerations with regard to the choice of OS, but our focus will largely be on testing performance with both an NVIDIA and an AMD GPU across several popular SD image generation frontends, including three forks of the ever-popular Stable Diffusion WebUI by AUTOMATIC1111.
A new option for AMD GPUs is ZLUDA, a translation layer that allows unmodified CUDA applications to run on AMD GPUs. However, the future of ZLUDA is unclear, as the CUDA EULA forbids reverse-engineering CUDA elements for translation targeting non-NVIDIA platforms.

We decided to run some tests, and surprisingly, we found several instances where ZLUDA within Windows outperformed ROCm 5.7 in Linux, such as within the DirectML fork of SD-WebUI. Compared with the other options, ZLUDA does not appear to be meaningfully affected by whether Hardware-Accelerated GPU Scheduling (HAGS) is enabled.
They can't forbid "clean room design", can they?

The Google v. Oracle suit more or less established that reimplementing an API can count as fair use. So I think it's fair to make something completely compatible with CUDA, unless there are patents involved. One catch is that you'd need to build everything yourself, including the tooling (e.g. the compiler and other support tools); otherwise you'd fall under NVIDIA's licensing terms.