AMD Boltzmann Initiative: C++ Compiler w/CUDA Interop For AMD GPUs

Discussion in 'Tools and Software' started by Ryan Smith, Nov 16, 2015.

  1. Ryan Smith

    Regular

    Joined:
    Mar 26, 2010
    Messages:
    629
    Likes Received:
    1,131
    Location:
    PCIe x16_1
    Normally I avoid posting any of my own articles here, but in this case I don't think anyone else got briefed in advance about this (since the embargo came and went), and this is very interesting news.

    http://www.anandtech.com/show/9792/...e-announced-c-and-cuda-compilers-for-amd-gpus


    Boltzmann is a new 64-bit Linux driver with HSA+ compatibility, with a C++ compiler (HCC) built on top of it that supports single-source programming for GPU kernels, and then a CUDA source-to-source translation layer (HIP) on top of that. Mark Papermaster says that they can automatically convert 90% of CUDA code right now.

     
    Lightman likes this.
  2. Nakai

    Newcomer

    Joined:
    Nov 30, 2006
    Messages:
    46
    Likes Received:
    10
    Many people think that AMD hijacked CUDA with this. In my opinion that's an exaggeration. This is not a sign that AMD is trying to overcome CUDA, although that is a nice side effect. It is more a sign that dedicated compute APIs like CUDA and OpenCL will come to an end...

    ...instead of focusing on a specific compute architecture, people will want to write HPC code in C++ that is usable on most systems, with little to no drawback when changing the underlying hardware architecture. Using C++11/14/17 and OpenMP 4.0, this should result in a new way of developing heterogeneous HPC software.
     
  3. Ext3h

    Regular Newcomer

    Joined:
    Sep 4, 2015
    Messages:
    411
    Likes Received:
    457
    And now the primary source: https://community.amd.com/community...a-defining-moment-for-heterogeneous-computing

    Reporters are still asking questions ATM, so more information may pop up over the course of the day.

    Early access has been confirmed for Q1 2016; it's still unclear whether it will be an open or limited program.

    Platform support isn't clarified yet either. So far only FirePro + 64-bit Linux can be considered a given.

    Just for comparison, to see what this means:

    OpenMP 4.0 and the C++17 extensions are supported by Nvidia's NVCC, Intel's compiler and AMD's HCC, as well as GCC (OpenMP 4.0 only so far) for pure CPU code.

    All four platforms (GCN, CUDA, Xeon Phi, x64) can now be served by a single code base. No more OpenCL, and especially no more CUDA.
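
    Just to illustrate what "a single code base" means in practice, here is a minimal sketch using the C++ parallel algorithms (the Parallelism TS that was folded into C++17). The SAXPY example and all names in it are my own illustration, not anything from the announcement; which device the loop actually runs on depends entirely on the compiler and its offload support.

    Code:
        // SAXPY (y = a*x + y) expressed as a standard algorithm, so the
        // toolchain is free to map it onto CPU threads or an accelerator.
        #include <algorithm>
        #include <execution>
        #include <vector>

        void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
            std::transform(std::execution::par, x.begin(), x.end(), y.begin(), y.begin(),
                           [a](float xi, float yi) { return a * xi + yi; });
        }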
     
  4. Ext3h

    Regular Newcomer

    Joined:
    Sep 4, 2015
    Messages:
    411
    Likes Received:
    457
    It is breaking the vendor lock-in imposed onto legacy applications.

    The HIP tool doesn't only apply to moving from Nvidia's to AMD's hardware; it also works for moving from Nvidia hardware to the Xeon Phi, and possibly to any other HSA-capable accelerator, with minimal changes to the code base. That's not just hijacking CUDA, it's eradicating it.

    This removes a major incentive to build new clusters with NV hardware, as legacy applications are no longer a hard constraint. That means the playing field between AMD, Intel and Nvidia in the HPC sector is levelled.
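
    To give a feel for what "hipified" code looks like, here is a sketch of a trivial vector add against the HIP runtime. The entry points (hipMalloc, hipMemcpy, hipLaunchKernelGGL) are taken from the public HIP headers as they later shipped, and the kernel itself is my own illustration rather than code from the announcement; the point is that the structure mirrors the CUDA original almost 1:1, so one source tree can target either vendor.

    Code:
        #include <hip/hip_runtime.h>
        #include <vector>

        // Same body as the CUDA version; only the runtime prefix changes
        // (cudaMalloc -> hipMalloc, <<<grid, block>>> -> hipLaunchKernelGGL).
        __global__ void vadd(const float* a, const float* b, float* c, int n) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n) c[i] = a[i] + b[i];
        }

        void vadd_host(const std::vector<float>& a, const std::vector<float>& b,
                       std::vector<float>& c) {
            const int n = static_cast<int>(a.size());
            const size_t bytes = n * sizeof(float);
            float *da, *db, *dc;
            hipMalloc(reinterpret_cast<void**>(&da), bytes);
            hipMalloc(reinterpret_cast<void**>(&db), bytes);
            hipMalloc(reinterpret_cast<void**>(&dc), bytes);
            hipMemcpy(da, a.data(), bytes, hipMemcpyHostToDevice);
            hipMemcpy(db, b.data(), bytes, hipMemcpyHostToDevice);
            const int block = 256, grid = (n + block - 1) / block;
            hipLaunchKernelGGL(vadd, dim3(grid), dim3(block), 0, 0, da, db, dc, n);
            hipMemcpy(c.data(), dc, bytes, hipMemcpyDeviceToHost);
            hipFree(da); hipFree(db); hipFree(dc);
        }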
     
    Lightman and Grall like this.
  5. Nakai

    Newcomer

    Joined:
    Nov 30, 2006
    Messages:
    46
    Likes Received:
    10
    I am very anxious about this. AMD could break open the GPU HPC market with it. A sober point of view is always appreciated, though, especially since it is Nvidia we are talking about. Their market position is very entrenched, meaning NV will definitely try to counter this with multiple strategies. Let's wait and see. This could bring the change the HPC market needs, but other things are still necessary to achieve greater market share: products, drivers and software support. I hope they don't fail in those areas.
     
  6. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,380
    You could see this as AMD removing a barrier to using its GPUs, or you could see it as a capitulation that legitimizes CUDA even more. It's probably the former, but I'm not convinced yet.

    I'm also not sure if it really matters for the biggest elephant in the room: neural networks. It's incredible how much of an Nvidia land grab that has been with all the actors that matter. Not that you can't find some open source work that uses OpenCL and AMD, but you'd have to be explicitly looking for it. If you go the other way, as most people would, you read tutorials, you install frameworks, etc., and all the blogs and researchers simply say "just install CUDA and the (Nvidia closed) cuDNN library". AMD never even comes up.

    I don't think today's announcement changes that. I do think that AMD needs to release a library that explicitly targets that market; otherwise they'll miss out on a huge opportunity.
     
  7. Ext3h

    Regular Newcomer

    Joined:
    Sep 4, 2015
    Messages:
    411
    Likes Received:
    457
    I just had to hijack your graphic.
    [attached image: hijacked.png]

    The path via the HIPify tools is all nice, and so is the HIP library, which gives some abstraction over the formerly vendor-specific APIs.

    But more interesting is the common feature set now supported by all four compilers, consisting of C++17, the parallel extensions (GCC support should be added once they are accepted) and OpenMP 4.0 for implicit offloading to the GPU.

    This is where Intel gained the most followers for the Xeon Phi series, because it's just far more intuitive to develop for a platform which acts homogeneous, compared to explicit kernel declarations, memory transfers etc. with OpenCL/CUDA or now HIP.

    Not that the latter has no justification; it still makes sense when you need more fine-tuned control over memory management, but it's not what most scientists demand.
    Take the "neural networks" topic as an example: implementing such a thing in C++ is a finger exercise for most researchers in that field. Porting it to CUDA or OpenCL isn't, since there's a lot of unintuitive setup and explicit memory management involved. Modern language features like lambda expressions, OOP or the STL aren't available in C-based languages like CUDA either.

    I can only recommend that everyone reading this who is fluent in C or C++ give OpenMP 4.0 a try. It's really easy to grasp the concept and to offload algorithms to the GPU with little to no semantic overhead (see the sketch after the links below).

    Simple introduction to OpenMP 4.0 for HSA capable accelerators
    OpenMP 4.0 manual
    C++17 parallel STL draft
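
    As a taste of that "little to no semantic overhead", here is a minimal OpenMP 4.0 offload sketch. The loop and variable names are my own illustration, assuming a compiler with working target offload; without one, the same code simply runs on the host.

    Code:
        // SAXPY offloaded with OpenMP 4.0: "target" moves execution to the
        // accelerator, "map" describes which data is copied to and from it,
        // and "teams distribute parallel for" spreads the iterations across it.
        void saxpy(float a, const float* x, float* y, int n) {
            #pragma omp target teams distribute parallel for \
                map(to: x[0:n]) map(tofrom: y[0:n])
            for (int i = 0; i < n; ++i) {
                y[i] = a * x[i] + y[i];
            }
        }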
     
  8. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    11,106
    Likes Received:
    1,190
    Location:
    London
    Easy to code = no performance worth having
     
    moozoo, pharma, Razor1 and 1 other person like this.
  9. Ext3h

    Regular Newcomer

    Joined:
    Sep 4, 2015
    Messages:
    411
    Likes Received:
    457
    OK, got two more pieces of information:
    • Early access will be open to every interested developer in Q1 2016, no invite required.
    • For now, ONLY the FirePro series on 64-bit Linux will be supported by the new HCC compiler.
     
  10. Ext3h

    Regular Newcomer

    Joined:
    Sep 4, 2015
    Messages:
    411
    Likes Received:
    457
  11. bridgman

    Newcomer Subscriber

    Joined:
    Dec 1, 2007
    Messages:
    60
    Likes Received:
    114
    Location:
    Toronto-ish
    We have published driver support for Kaveri & Carrizo so far. Support for dGPU will be rolling out in upcoming releases.
     
    ToTTenTranz, Lightman and Ext3h like this.
  12. ToTTenTranz

    Legend Veteran Subscriber

    Joined:
    Jul 7, 2008
    Messages:
    11,485
    Likes Received:
    6,227
    Yeah, I don't even know why we moved away from assembly after all these years :roll:
     
  13. Razor1

    Veteran

    Joined:
    Jul 24, 2004
    Messages:
    4,232
    Likes Received:
    749
    Location:
    NY, NY
    That is because compilers are much better now at optimizing well-written code ;). That's a big difference from "easy to code". It's not as simple as following programming guidelines and having the compiler just pop out automatic optimizations...

    Never expect code translators to just do it all; they never do, and never will.
     
    Ext3h likes this.