OpenCL 3.0 [2020]

Discussion in 'Rendering Technology and APIs' started by DmitryKo, Apr 27, 2020.

  1. Lurkmass

    Regular Newcomer

    Joined:
    Mar 3, 2020
    Messages:
    322
    Likes Received:
    359
    After not submitting a conformance test in over 5 years, I don't think they can legally advertise OpenCL support on any of their new products. They used to support SPIR 1.2/OpenCL on gfx8 in their older drivers but not anymore, so OpenCL is probably as good as dead on AMD ... (The ROCm OpenCL driver is a disaster when we take a look at Blender Cycles/Darktable/GIMP/DaVinci Resolve/Autodesk Maya/SideFX Houdini, since it doesn't work with any of these apps.)

    I find it hard to believe that any of HP/Cray's customers would use OpenCL only to be able to run code exclusively on a single platform. SYCL is virtually useless without a SPIR-V kernel compiler as well, so projects like hipSYCL will eventually run into the same roadblocks that kept other OpenCL driver implementations from being production ready ...
     
  2. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    904
    Likes Received:
    1,081
    Location:
    55°38′33″ N, 37°28′37″ E
    SYCL 2020 is certainly not limited to OpenCL / SPIR(-V). There can be multiple back-ends that output LLVM bytecode (such as OpenCL 1.x SPIR, Vulkan/OpenCL 2.x SPIR-V, or CUDA PTX), native GPU binary code, or transcoded source code such as OpenCL C or HIP (which is how hipSYCL on ROCm works). This was one of the significant new features in this release.
    SPIR-V is not required for native binary code compilation either. The OpenCL/SPIR-V target is a design choice made by Codeplay and Intel DPC++ to support multiple CPU, GPU and FPGA architectures.


    Unannounced plans are, well, unannounced. I doubt even AMD's own HPC team knows at this point whether they are going to support current revisions of OpenCL, SPIR-V or SYCL; these decisions could only be made after they ship the complete ROCm software stack.

    As for the quality of AMD driver implementations, unfortunately, it's hardly news. AMD's regular Adrenalin OpenCL driver does not work with the Blender Cycles renderer on Polaris 11 (GCN4) cards either, while the same driver works fine on Vega (GCN5) and Navi10 (RDNA) cards...
     
    #42 DmitryKo, Apr 20, 2021
    Last edited: Apr 20, 2021
  3. Lurkmass

    Regular Newcomer

    Joined:
    Mar 3, 2020
    Messages:
    322
    Likes Received:
    359
    Blender has announced that they'll be dropping their OpenCL backend in their Cycles rendering engine ...
     
  4. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    904
    Likes Received:
    1,081
    Location:
    55°38′33″ N, 37°28′37″ E
    Yep, Cycles-X for Blender 3.0 will be based exclusively on OptiX and CUDA. They're also considering SYCL, HIP and Metal, but this won't come in the initial release.

    GROMACS, the GPU back-end for molecular dynamics, is also going to deprecate OpenCL and rebase their code on SYCL and CUDA (which has been added in 2020) - see their IWOCL 2021 session video (4:30) and slides (page 7).

    It's not going to be a problem for Intel - their DPC++ framework is essentially SYCL 2020 and it works down to UHD500 (gen9, Skylake). As for AMD, they still have a full year ahead to make up their minds about their priorities in API support; I guess they could at least update their OpenCL drivers with the latest LLVM/Clang infrastructure to support SPIR-V and 'C++ for OpenCL', so that third-party SYCL runtimes like hipSYCL could target existing AMD hardware (not just the Radeon Instinct series on Linux).


    BTW IWOCL 2021 didn't have major announcements this year; OpenCL WG session (video and slides) and SYCL WG session (video and slides) were basically a retelling of the 2020 presentations above, but there were some sessions on implementing C++20 support and/or C++ standard libraries (libcxx).
     
    #44 DmitryKo, May 1, 2021
    Last edited: May 11, 2021 at 9:31 PM
    BRiT likes this.
  5. Lurkmass

    Regular Newcomer

    Joined:
    Mar 3, 2020
    Messages:
    322
    Likes Received:
    359
    AMD were considering releasing an OpenCL extension for ray tracing. They were also experimenting with HIP for Blender's Cycles renderer, and Brecht disagrees with AMD's suggestion to do offline compilation for rendering kernels because he thinks it's too expensive. Metal is unlikely to be a feasible target for Cycles since its toy shading language doesn't properly support pointers or unstructured control flow. Metal doesn't have a single source programming model like CUDA or HIP either, so Metal shaders would need to be in a separate file ...

    As for GROMACS, I think they'll be disappointed at the end if they truly expect to "write once and run anywhere" (between AMD/Intel) with SYCL. They are already struggling with performance on more complex kernels, needing vendor specific code branches, and are seeing large regressions with hipSYCL for a very early implementation on their SYCL backend. I think the team should consider making an abstraction for other potential GPU backends like DPC++, HIP, and including CUDA as well if they want to get the most performance out of all vendors ...

    AMD have been committed to sticking with their HIP API with no signs of changing, and making an OpenCL SPIR-V compiler would be a monumental task with the end result being lower performance and more maintenance, which is why they don't favour that approach. This direction could change depending on whether other vendors (ideally Nvidia) are willing to implement an OpenCL SPIR-V compiler and OpenCL 2.x too. They also started shipping their HIP libraries and runtimes on Windows as well ...
     
    Krteq and BRiT like this.
  6. Alessio1989

    Regular Newcomer

    Joined:
    Jun 6, 2015
    Messages:
    610
    Likes Received:
    320
    AMD struggle to invest in GPGPU on Windows despite it being a big market (Autodesk and Adobe software are just a small slice of the cake); thanks to WSL something is finally moving... But investing in a butchered dead horse like OpenCL does not make any sense anymore.
     
  7. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    904
    Likes Received:
    1,081
    Location:
    55°38′33″ N, 37°28′37″ E
    I don't really buy it that intermediate representations like LLVM and SPIR-V would result in lower performance. The recommended object code format in CUDA is PTX (Parallel Thread Execution), an intermediate format which is a subset of LLVM 3.7 (used by NVPTX back-end in Clang/LLVM and NVVM back-end in the NVCC compiler), even though CUDA also supports architecture-specific machine code.

    https://docs.nvidia.com/cuda/nvvm-ir-spec/index.html
    https://docs.nvidia.com/cuda/cuda-binary-utilities/index.html#instruction-set-ref

    (LLVM IR is a high-level language based around an abstracted general purpose central processor; SPIR-V arguably is an even higher level abstraction suitable for GPU parallel processing. Both are a full equivalent of your C/C++, GLSL, HLSL etc. source code, though converted to a translator-friendly, machine-readable binary format.)


    Khronos Group maintains an open-source bidirectional SPIR-V to LLVM translator, an SPIR-V back-end for LLVM, SPIR-V Cross converter (to HLSL/GLSL/Metal), etc. LLVM maintains the Multi-Level IR Compiler Framework that supports SPIR-V to LLVM conversion, and the AMDGPU back-end. AMD just needs to devote more developer resources to these open source projects, rather than start their own implementations and abandon them soon.

    https://github.com/KhronosGroup/SPIRV-LLVM-Translator
    https://github.com/KhronosGroup/SPIRV-Cross
    https://github.com/KhronosGroup/SPIR/

    https://mlir.llvm.org/docs/SPIRVToLLVMDialectConversion/
    https://mlir.llvm.org/docs/Dialects/SPIR-V/

    https://github.com/KhronosGroup/LLVM-SPIRV-Backend
    https://www.phoronix.com/scan.php?page=news_item&px=Intel-2021-LLVM-SPIR-V-Backend

    https://llvm.org/docs/AMDGPUUsage.html
    https://rocmdocs.amd.com/en/latest/ROCm_Compiler_SDK/ROCm-Native-ISA.html#memory-model

    amdhip64.dll has been included with Radeon Adrenalin drivers since 2019, however ROCm/HIP development is still not supported on Windows, and the HIPCC compiler officially only supports Radeon Instinct MI series, i.e. GCN4 (Polaris 11), GCN5 (Vega) and CDNA.

    https://github.com/ROCm-Developer-Tools/HIP/commit/e2bf34cd5e6444ee04adae7c0c496bf52cff4f31
    https://github.com/illuhad/hipSYCL/issues/78#issuecomment-582810780
    https://github.com/ROCm-Developer-Tools/HIP/issues/84


    GROMACS has always based their GPGPU path on CUDA. Their IWOCL 2021 presentation says that support for OpenCL 1.x was implemented on top of their pre-existing CUDA abstractions, and so is support for SYCL 2020 and DPC++.

    AFAIK, AMD HIPCC compiler also supports OpenCL C 2.x, so it can be made to support C++ for OpenCL, and hipSYCL would be better off directly targeting the OpenCL dialect of C++, rather than jumping through hoops trying to abstract SYCL/OpenCL on top of HIP/CUDA...

    AMD just need to take the OptiX API and call it HippiX.

    https://developer.nvidia.com/blog/how-to-get-started-with-optix-7/
     
    #47 DmitryKo, May 9, 2021
    Last edited: May 12, 2021 at 12:35 AM
    BRiT likes this.
  8. Lurkmass

    Regular Newcomer

    Joined:
    Mar 3, 2020
    Messages:
    322
    Likes Received:
    359
    There's a big difference between SPIR-V and PTX. SPIR-V is designed by a committee where every participating member has to make a compromise and choose to expose features that all vendors can support, which will constrain compiler design. The PTX specification is solely controlled by Nvidia and is designed to be forward compatible with future Nvidia HW, but even better is that every new iteration of PTX doesn't have to retain backwards compatibility with their older HW either. Your example between SPIR-V and PTX only underlines the apparent tradeoff that compiler designers have to make. Pick your poison (SPIR-V/portability or PTX/performance/simplicity) ...

    Even if SPIR-V doesn't have a performance deficit, it doesn't mean that it'll have the same maintenance cost compared to a compiler that only emits native code. Making an offline compiler takes far less effort and is less error prone as well while making a compiler for an IR takes more resources and introduces more bugs in the process too ...

    SPIR-V Cross doesn't support the OpenCL dialect of SPIR-V at all, which has features like pointers for local and private memory or unstructured control flow along with several other things. Most vendors only support SPIR-V's shader capabilities, so they can't do any of the fun things that are exclusive to its kernel capabilities that we see in OpenCL. None of these projects are relevant to implementing a SPIR-V compiler with kernel capabilities. AMD aren't avoiding these projects because they want to; it's because making this SPIR-V compiler is too much work for virtually no return. LLVM is just an infrastructure for a collection of different backends. AMD still has to implement a SPIR-V compiler over there ...

    AMD is using LLVM for its HIP-Clang backend but they aren't going to make a SPIR-V backend. Same goes for Nvidia where they have a CUDA-Clang backend for LLVM but there's no SPIR-V backend for them either. Only Intel has a SPIR-V compiler in one of the LLVM backends. It's amazing how LLVM can be used for many different frontends (CUDA/DPC++/HIP) to target their unique backends (GCN/PTX/SPIR-V) as well, so LLVM alone isn't going to get us closer to portability ...
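    To make the shader/kernel split concrete: a SPIR-V module declares the capabilities it needs right after its five-word header, so a consumer can tell the two dialects apart before compiling anything. The following is only a minimal sketch of that check. The function names and the synthetic test module are made up for illustration, and the module it builds contains nothing but a header and OpCapability instructions, so it is not a complete, valid SPIR-V module.

```python
import struct

SPIRV_MAGIC = 0x07230203   # first word of every SPIR-V module
OP_CAPABILITY = 17         # opcode of OpCapability
# A few values from the SPIR-V Capability enum; anything else is shown as a number.
CAPABILITY_NAMES = {1: "Shader", 4: "Addresses", 5: "Linkage", 6: "Kernel"}


def declared_capabilities(blob):
    """Return the capability names declared at the start of a SPIR-V module."""
    if len(blob) < 20 or len(blob) % 4 != 0:
        raise ValueError("expected 32-bit words with a 5-word header")
    # The magic number also tells us the byte order of the word stream.
    if struct.unpack_from("<I", blob)[0] == SPIRV_MAGIC:
        order = "<"
    elif struct.unpack_from(">I", blob)[0] == SPIRV_MAGIC:
        order = ">"
    else:
        raise ValueError("bad SPIR-V magic number")
    words = struct.unpack(f"{order}{len(blob) // 4}I", blob)

    names = []
    i = 5  # skip magic, version, generator, bound, schema
    while i < len(words):
        # High 16 bits: instruction length in words. Low 16 bits: opcode.
        word_count, opcode = words[i] >> 16, words[i] & 0xFFFF
        if opcode != OP_CAPABILITY:
            break  # capabilities come first in the module layout
        if word_count != 2 or i + 1 >= len(words):
            raise ValueError("malformed OpCapability instruction")
        cap = words[i + 1]
        names.append(CAPABILITY_NAMES.get(cap, str(cap)))
        i += word_count
    return names


def make_test_module(capabilities):
    """Hypothetical helper: header plus OpCapability words only (not a full module)."""
    words = [SPIRV_MAGIC, 0x00010000, 0, 1, 0]
    for cap in capabilities:
        words += [(2 << 16) | OP_CAPABILITY, cap]
    return struct.pack(f"<{len(words)}I", *words)


print(declared_capabilities(make_test_module([4, 6])))  # OpenCL-style module
print(declared_capabilities(make_test_module([1])))     # graphics-style module
```

    A consumer that only implements the Shader capability would be expected to reject the first module outright, which is the situation described above.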

    HCC has been deprecated in favour of the HIP-Clang compiler and there is a project that advertises the usage of ROCm on Windows ...

    They should go beyond just their CUDA abstraction. SYCL doesn't have much of a future outside of Intel and even then DPC++ is their superior version of SYCL. Even the GROMACS team won't dare use SYCL (PTX backend) on Nvidia HW since they don't totally buy into its portability claims either, so I think they're bound to learn this the really hard way when experimenting with SYCL for AMD/Intel ...

    AMD HCC only supports offline compilation into GCN bytecode. What good are all these source languages if vendors can't agree to have one IR to rule all implementations? You do realize that both ARM and x86 support C++ as a source language but they are by no means consistent with each other, even with the same C++ code or compiler. The Clang compiler can detect undefined behaviour in the C++ code too for this express purpose ...
     
    BRiT likes this.
  9. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    904
    Likes Received:
    1,081
    Location:
    55°38′33″ N, 37°28′37″ E
    SPIR-V is nothing like a common set of machine instructions to be supported by all vendors. It's rather an intermediate representation for general operations / functions / variables in OpenCL C, 'C++ for OpenCL', and GLSL/HLSL. Each general operation is translated to specific machine code sequence according to the capabilities of specific vendor's hardware.

    HSAIL format was indeed similar to RISC assembly code.

    NVIDIA simply captured a certain version of the LLVM framework. LLVM IR couldn't work as a redistributable binary if they didn't 'freeze' up the specs, it's a moving target that would subtly change with each major release of LLVM.

    And just like SPIR 1.0/2.0, NVVM implements a restricted subset of LLVM, omitting features only required for certain processors and operating systems which would make no sense for GPGPU programming.

    They don't need to directly translate SPIR-V to machine code. SPIR-V can be mapped to LLVM, so AMD can either use the bidirectional SPIR-V / LLVM translator from Khronos or the LLVM MLIR framework (used in the TensorFlow runtime) to convert SPIR-V to LLVM, and then translate from LLVM to machine code.
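    The route described above can be written down as a small graph search. This is only a model of the argument, not a set of real command lines: the edge list paraphrases the tools named in this thread, and the function name is made up. Whether the LLVM IR coming out of the translator is something the AMDGPU back-end handles well in practice is exactly what is being disputed here, and the sketch does not settle it.

```python
from collections import deque

# (input format, output format, tool). Tool names paraphrase the projects
# mentioned in the thread; they are labels, not executable names.
TRANSLATORS = [
    ("OpenCL C", "LLVM IR", "Clang front-end"),
    ("SPIR-V", "LLVM IR", "Khronos SPIR-V/LLVM translator"),
    ("LLVM IR", "SPIR-V", "Khronos SPIR-V/LLVM translator"),
    ("LLVM IR", "GCN machine code", "LLVM AMDGPU back-end"),
    ("LLVM IR", "PTX", "LLVM NVPTX back-end"),
]


def find_route(src, dst, edges=TRANSLATORS):
    """Breadth-first search for the shortest chain of tools from src to dst."""
    queue = deque([(src, [])])
    seen = {src}
    while queue:
        fmt, route = queue.popleft()
        if fmt == dst:
            return route
        for a, b, tool in edges:
            if a == fmt and b not in seen:
                seen.add(b)
                queue.append((b, route + [tool]))
    return None  # no chain of tools connects the two formats


print(find_route("SPIR-V", "GCN machine code"))
print(find_route("PTX", "SPIR-V"))
```

    The first call finds the two-step path through LLVM IR; the second returns None because nothing in this model consumes PTX.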

    Of course it will. Intel is working on the Khronos open-source SPIR-V back-end and the DPC++ front-end, so Clang/LLVM will be able to produce SPIR-V redistributables (in addition to CUDA PTX and GCN machine code) from OpenCL C 2.0, 'C++ for OpenCL' and SYCL/DPC++ source code.
    https://clang.llvm.org/docs/SYCLSupport.html
    https://clang.llvm.org/docs/OpenCLSupport.html

    All these vendors are using standard Clang for GPGPU, which supports their preferred distribution formats fairly well. I'd rather see them implement all the different C++ abstractions (CUDA/HIP, 'C++ for OpenCL', and SYCL/DPC++) and make them work with MLIR SPIR-V dialect, and eventually standard C++23/26 language constructs (i.e. executors)...


    AFAIK it's not byte code, it's machine language (GCN assembly), since HIP/ROCm compiler is built on AMDGPU back-end.

    HIPCC is still the same Clang/LLVM based compiler, though with a different set of C++ template libraries.

    So far most DPC++ extensions ended up in a recent SYCL release, and Intel has open-sourced the DPC++ front-end.

    It is both a compiler infrastructure and an intermediate language specification shared between front- and back-end layers.

    There is no such dilemma, portability and performance are independent variables.

    So I said, 'translator to HLSL/GLSL/Metal'.
     
    #49 DmitryKo, May 12, 2021 at 12:06 AM
    Last edited: May 12, 2021 at 12:40 AM
    Alessio1989 and BRiT like this.
  10. Lurkmass

    Regular Newcomer

    Joined:
    Mar 3, 2020
    Messages:
    322
    Likes Received:
    359
    SPIR-V is a binary format that needs to be supported by all participating vendors regardless of their native HW instruction format ...

    You're making this more confusing than it has to be. LLVM IR is straight up not designed to be ingested by the drivers and is meant solely for internal usage by the Clang compiler. No drivers will flat out accept LLVM IR. Meanwhile GCN/PTX/SPIR-V kernels are intended for driver consumption ...

    Yes, they do and don't be silly since no drivers will accept either LLVM or MLIR. Their only supported endpoints are GCN/PTX/SPIR-V depending on the drivers. The magic doesn't happen in the compilers like you seem to think but the magic happens in their drivers ...
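    For reference, the formats being argued about can be told apart from their first bytes. The sketch below is a rough sniffer under a few assumptions: SPIR-V is recognised by its magic word in either byte order, raw LLVM bitcode by its 'BC' 0xC0 0xDE signature, ELF (the container the LLVM AMDGPU back-end emits for code objects) by its magic, and PTX by being plain text whose first non-comment line is a .version directive. The function name is made up, and a real driver does far more validation than this.

```python
def sniff_kernel_format(blob):
    """Guess the container format of a kernel blob from its leading bytes."""
    head = blob[:4]
    if head in (b"\x03\x02\x23\x07", b"\x07\x23\x02\x03"):
        return "SPIR-V"            # magic 0x07230203, little or big endian
    if head == b"BC\xc0\xde":
        return "LLVM bitcode"      # raw bitcode signature
    if head == b"\x7fELF":
        return "ELF object"        # e.g. an AMDGPU code object
    try:
        text = blob.decode("ascii")
    except UnicodeDecodeError:
        return "unknown"
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("//"):
            continue               # skip blank lines and line comments
        # The first real statement decides: PTX modules open with .version
        return "PTX text" if line.startswith(".version") else "unknown"
    return "unknown"


print(sniff_kernel_format(b"// generated\n.version 7.0\n.target sm_70\n"))
print(sniff_kernel_format(b"\x7fELF\x02\x01\x01"))
```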

    Congrats on Intel making their SPIR-V compiler public, which only works with their HW? AMD and Nvidia still have to make SPIR-V compilers for their own hardware if they care about interoperability or portability ...

    AMD and Nvidia still aren't going to make an MLIR or a SPIR-V kernel compiler. Source languages like the CUDA kernel language are tied tightly to a specific hardware vendor, so it's useless for other vendors to support them since programs written in them won't run properly on their HW and those programs rely on behaviour that is specific to HW other than their own. There's a very good reason why AMD developed their own source language (HIP) instead of adopting an existing one like CUDA or the others, since they cannot simultaneously meet all of AMD's goals such as exposing a portable subset of CUDA, exposing low level features for their HW, and being able to run properly on both AMD and Nvidia hardware. Here's the obvious rundown on each source language ...

    CUDA: existing kernels don't work properly on AMD HW
    C++ for OpenCL: too many design differences compared to CUDA
    SYCL: ditto from the above
    DPC++: likely won't work properly on AMD HW

    Just because they use the same compiler doesn't mean that they'll be portable. Remember what I said about undefined behaviours in C++?

    Means very little in practice. Tons of Vulkan extensions end up in the core Vulkan API but they still remain as optional capabilities. SYCL might as well not have a committee behind it at all because no other vendors officially support it! I guess whenever AMD or Nvidia get serious about supporting SYCL, they'll just revert the requirements imposed by Intel like they did recently for OpenCL ...

    It's mostly a compiler infrastructure and drivers will never ingest LLVM IR ...

    AMD and Nvidia don't share your opinion. Their backends in the Clang compiler only emit GCN or PTX code. If they truly believed you then AMD/Nvidia wouldn't support exposing low level features like inline GCN/PTX assembly in their source languages as seen in CUDA/HIP. Your response also doesn't address the other fundamental issue with vendor agnostic IRs, such as maintenance cost ...

    SPIR-V Cross doesn't do that either. The purpose behind SPIR-V Cross is to convert one shader format into another shader format. It is impossible to convert OpenCL/SPIR-V compute kernels into less flexible shader formats ...
     
  11. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    904
    Likes Received:
    1,081
    Location:
    55°38′33″ N, 37°28′37″ E
    Of course LLVM and MLIR are designed for external use just like SPIR-V. They all define both a text-based 'toy language' and a 'bytecode/bitcode' binary interchange format.

    PTX/NVVM and SPIR 2.x (the OpenCL version preceding SPIR-V) are directly based on LLVM IR - both of these specs simply define which subset of the respective LLVM spec they support, excluding features like DLL entry points which only make sense for CPUs.


    Again, PTX binaries do not contain any 'assembly' or machine code, it's just plain NVVM intermediate binary format (a subset of LLVM IR).

    CUDA also allows binary code objects (CuBin) which do include machine code (presented as SASS assembly language in the tools), just like OpenCL and HIP, but you can only run it on the specific architecture chosen at compile time - again, just like OpenCL and HIP.
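    The trade-off between the two kinds of code object can be sketched as a loader decision. This is a simplified model, not NVIDIA's actual loader: it assumes the compatibility rules as documented in the CUDA programming guide (native code runs on the same major compute capability with an equal or newer minor revision; PTX can be JIT-compiled for any equal or newer compute capability), ignores architecture-specific targets, and uses a made-up function name.

```python
def pick_code_object(device_cc, cubin_targets, ptx_targets):
    """Choose what to load for a device.

    device_cc     -- (major, minor) compute capability of the device
    cubin_targets -- (major, minor) targets shipped as native machine code
    ptx_targets   -- (major, minor) targets shipped as PTX
    """
    major, minor = device_cc
    # Native code: same major revision, built for this minor or an older one.
    native = [cc for cc in cubin_targets if cc[0] == major and cc[1] <= minor]
    if native:
        return ("cubin", max(native))
    # Otherwise fall back to JIT-compiling PTX built for an equal or older target.
    jit = [cc for cc in ptx_targets if cc <= device_cc]
    if jit:
        return ("ptx-jit", max(jit))
    return None  # nothing in the package can run on this device


print(pick_code_object((7, 5), [(7, 0)], [(6, 0)]))  # native code still fits
print(pick_code_object((8, 6), [(7, 0)], [(7, 0)]))  # newer major: only PTX helps
print(pick_code_object((6, 1), [(7, 0)], [(7, 0)]))  # older device: nothing runs
```

    The second case is the portability argument for shipping PTX: a package built before the device existed can still run, at the cost of a JIT step in the driver.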

    So far everyone is using PTX since it offers both portability and performance, owing to the quality of NVidia's proprietary optimising machine code compiler.


    HIP does not have any 'low level features' specific to AMD hardware, it can be directly compiled to CUDA PTX using the native NVCC compiler.

    HIP is just a subset of CUDA 8.x features implemented on top of AMD open-source OpenCL 2.x driver infrastructure for GCN+ GPUs. AMD just took what they could implement in a viable timeframe, rather than include each and every API and language feature.

    I understand your point that each vendor needs to fully implement the latest version of CUDA (and design their hardware to resemble latest NVidia GPUs as well).

    In reality we will still continue to have different hardware designs and APIs which are slowly converging to a common core, obviously much influenced by CUDA.



    It's just word juggling. There are components/libraries to translate source or intermediary code into machine instructions - call them compiler, driver, compiler driver, or anything you like.

    The translator doesn't use Vulkan SPIR-V extensions until instructed. The back-end would use OpenCL C/C++ model, which doesn't define SPIR-V extensions at all.

    Not a compiler's fault, programmers have to avoid undefined behaviour in the first place.

    Vendors only need to support OpenCL C source code, which the 'Big Three' do.

    Same cost as maintaining the Clang/LLVM compiler infrastructure and the AMDGPU back-end as used by ROCm.

    Yes, though they also tried to support OpenCL but that path was deprecated. Did I have to copy the entire project description?
     
    #51 DmitryKo, May 14, 2021 at 12:56 PM
    Last edited: May 15, 2021 at 6:52 PM
    BRiT likes this.