AMD Boltzmann Initiative: C++ Compiler w/CUDA Interop For AMD GPUs

Ryan Smith

Normally I avoid posting any of my own articles here, but in this case I don't think anyone else got briefed in advance about this (since the embargo came and went), and this is very interesting news.

http://www.anandtech.com/show/9792/...e-announced-c-and-cuda-compilers-for-amd-gpus

[Image: AMD @ SC15 presentation slide]


Boltzmann is a new Linux 64-bit driver with HSA+ compatibility, with a C++ compiler built on top of that (including single source programming for GPU kernels), and then a CUDA source-to-source translation layer built on top of that. Mark Papermaster says that they can automatically convert 90% of CUDA code right now.

[Image: AMD HIP porting-path slide]
 
Many people think that AMD hijacked CUDA with this. In my opinion that's an exaggeration. This is not a sign that AMD is trying to overtake CUDA, although that is a nice side effect. It is more a sign that dedicated compute APIs like CUDA and OpenCL are coming to an end...

...instead of focusing on a specific compute architecture, people will want to write HPC code in C++ that runs on as many systems as possible, with little to no penalty when the underlying hardware architecture changes. Using C++11/14/17 and OpenMP 4.0, this should result in a new way of developing heterogeneous HPC software.
 
And now the primary source: https://community.amd.com/community...a-defining-moment-for-heterogeneous-computing

Reporters are still asking questions ATM, so more information may pop up over the course of the day.

Early access has been confirmed for Q1 2016, it's still unclear whether it will be an open or limited program.

Platform support isn't clear yet either. So far, only FirePro + 64-bit Linux can be considered a given.

Just for comparison, to see what this means:

OpenMP 4.0 and the C++17 extensions are supported by Nvidia's NVCC, Intel's compiler, and AMD's HCC, as well as by GCC (OpenMP 4.0 only so far) for pure CPU code.

All four platforms (GCN, CUDA, Xeon Phi, x64) can now be served from a single code base. No more OpenCL, and especially no more CUDA.
 
As I said above, this isn't AMD trying to overtake CUDA, even if that's a nice side effect.
It is breaking the vendor lock-in imposed on legacy applications.

The HIP tool doesn't only apply to moving from Nvidia's hardware to AMD's; it also works for moving from Nvidia hardware to the Xeon Phi, and possibly to any other HSA-capable accelerator, with minimal changes to the code base. That's not just hijacking CUDA, it's eradicating it.

This removes a major incentive to build new clusters with NV hardware, as legacy applications are no longer a hard constraint. That means the playing field between AMD, Intel, and Nvidia in the HPC sector is leveled.
 
I am very excited about this. AMD could crack the GPU HPC market open with it. A sober point of view is always appreciated, though, especially since it is Nvidia we are talking about. Their market position is pervasive, and NV will definitely try to counter this with multiple strategies. Let's wait and see. This could bring the change the HPC market needs. Still, other things are necessary to gain market share: products, drivers, and software support. I hope they don't fail on those.
 
You could see this as AMD removing a barrier to using its GPUs; you could also see it as a capitulation that legitimizes CUDA even more. It's probably the former, but I'm not convinced yet.

I'm also not sure if it really matters for the biggest elephant in the room: neural networks. It's incredible how thoroughly this is an Nvidia land grab among all the actors that matter. Not that you can't find some open source work that uses OpenCL and AMD, but you'd have to be explicitly looking for it. If you go the other way, as most people do, you read tutorials, you install frameworks, etc., and all the blogs and researchers simply say "just install CUDA and the (closed-source Nvidia) cuDNN library". AMD never even comes up.

I don't think today's announcement changes that. I do think that AMD needs to release a library that explicitly targets that market otherwise they'll miss out on a huge opportunity.
 
I just had to hijack your graphic.
[Image: modified version of the HIP graphic]

The path via HIPify Tools is all nice, and so is the HIP library which gives some abstraction for formerly vendor specific APIs.

But more interesting is the common feature set now supported by all four compilers: C++17, the parallelism extensions (GCC support should follow once they are accepted into the standard), and OpenMP 4.0 for implicit offloading to the GPU.

This is where Intel gained the most followers for the Xeon Phi series, because it's just far more intuitive to develop for a platform that behaves homogeneously than to deal with explicit kernel declarations, memory transfers, etc. in OpenCL/CUDA, or now HIP.

Not that the latter has no justification; it still makes sense when you need fine-grained control over memory management, but it's not what most scientists want.
Take that "neural networks" topic as an example: implementing such a thing in C++ is a finger exercise for most researchers in that field. Porting it to CUDA or OpenCL isn't, since a lot of unintuitive setup and explicit memory management is involved. Modern language features like lambda expressions, OOP, or the STL and the like aren't available in C-style kernel languages like CUDA C either.

I can only recommend that everyone reading this who is fluent in C or C++ give OpenMP 4.0 a try. The concept is really easy to grasp, and you can offload algorithms to the GPU with little to no semantic overhead.

Simple introduction to OpenMP 4.0 for HSA capable accelerators
OpenMP 4.0 manual
C++17 parallel STL draft
 
OK, got two more pieces of information:
  • Early access will be open to every interested developer in Q1 2016, no invite required.
  • For now, ONLY FirePro series on 64bit Linux will be supported by the new HCC compiler.
 
That is because compilers are much better now at optimizing well-written code ;). That's a big difference from being easy to code for, though. It's not as simple as following programming guidelines and having the compiler pop out automatic optimizations...

Never expect code translators to just do it all, they never do, and never will.
 