Machine Learning: WinML/DirectML, CoreML & all things ML

April 1, 2024
It comes as little surprise that Windows continues to be a popular choice for professional workstations, and in 2023, about 90% of Puget Systems customers purchased a Windows-based system. Today, we’ll discuss the benefits and drawbacks of Windows-based workstations compared to Linux-based systems, specifically with regard to Stable Diffusion workflows, and provide performance results from our testing across various Stable Diffusion front-end applications.
...
We will discuss some other considerations with regard to the choice of OS, but our focus will largely be on testing performance with both an NVIDIA and an AMD GPU across several popular SD image generation frontends, including three forks of the ever-popular Stable Diffusion WebUI by AUTOMATIC1111.
...
A new option for AMD GPUs is ZLUDA, which is a translation layer that allows unmodified CUDA applications on AMD GPUs. However, the future of ZLUDA is unclear as the CUDA EULA forbids reverse-engineering CUDA elements for translation targeting non-NVIDIA platforms.

We decided to run some tests, and surprisingly, we found several instances where ZLUDA within Windows outperformed ROCm 5.7 in Linux, such as within the DirectML fork of SD-WebUI. Compared to other options, ZLUDA does not appear to be meaningfully impacted by the presence of HAGS.
 
They cant forbid "Clean Room Design" can they ?

The Google vs Oracle suit kind of established that using an API counts as fair use. So I think it's fair to make something completely compatible with CUDA, unless if there are some patents involved. One thing is that you'll need to make everything, including the toolings (e.g. the compiler and other support tools), otherwise you'll be under the purview of NVIDIA's licensing terms.
 
nVidia announced a new partnership with Microsoft:
Windows Copilot Runtime to Add GPU Acceleration for Local PC SLMs
Microsoft and NVIDIA are collaborating to help developers bring new generative AI capabilities to their Windows native and web apps. This collaboration will provide application developers with easy application programming interface (API) access to GPU-accelerated small language models (SLMs) that enable retrieval-augmented generation (RAG) capabilities that run on-device as part of Windows Copilot Runtime.

SLMs provide tremendous possibilities for Windows developers, including content summarization, content generation and task automation. RAG capabilities augment SLMs by giving the AI models access to domain-specific information not well represented in ‌base models. RAG APIs enable developers to harness application-specific data sources and tune SLM behavior and capabilities to application needs.

These AI capabilities will be accelerated by NVIDIA RTX GPUs, as well as AI accelerators from other hardware vendors, providing end users with fast, responsive AI experiences across the breadth of the Windows ecosystem.

The API will be released in developer preview later this year.
 
Back
Top