OpenCL 3.0 [2020]

Discussion in 'Rendering Technology and APIs' started by DmitryKo, Apr 27, 2020.

  1. Lurkmass

    Regular Newcomer

    Joined:
    Mar 3, 2020
    Messages:
    320
    Likes Received:
    357
    After not submitting a conformance test in over 5 years, I don't think they can legally advertise OpenCL support on any of their new products. They used to support SPIR 1.2/OpenCL on gfx8 in their older drivers but not anymore, so OpenCL is probably as good as dead on AMD ... (the ROCm OpenCL driver is a disaster when we take a look at Blender Cycles/Darktable/GIMP/DaVinci Resolve/Autodesk Maya/SideFX Houdini, since it doesn't work with any of these apps)

    I find it hard to believe that any of HP/Cray's customers would use OpenCL only to be able to run code exclusively on a single platform. SYCL is virtually useless without a SPIR-V kernel compiler as well, so projects like hipSYCL will eventually hit the same roadblocks that kept other OpenCL driver implementations from being production ready ...
     
  2. DmitryKo

    Regular

    Joined:
    Feb 26, 2002
    Messages:
    902
    Likes Received:
    1,076
    Location:
    55°38′33″ N, 37°28′37″ E
    SYCL 2020 is certainly not limited to OpenCL / SPIR(-V). There can be multiple back-ends that output an intermediate representation (such as OpenCL 1.x SPIR, Vulkan/OpenCL 2.x SPIR-V, or CUDA PTX), native GPU binary code, or transcoded source code such as OpenCL C or HIP (which is how hipSYCL on ROCm works). This was one of the significant new features in this release.
    SPIR-V is not required for native binary code compilation either. The OpenCL/SPIR-V target is a design choice made by Codeplay and Intel DPC++ to support multiple CPU, GPU and FPGA architectures.


    Unannounced plans are, well, unannounced. I doubt even AMD's own HPC team knows at this point whether they are going to support current revisions of OpenCL, SPIR-V or SYCL; these decisions could only be made after they ship the complete ROCm software stack.

    As for the quality of AMD driver implementations, unfortunately, it's hardly news. AMD's regular Adrenalin OpenCL driver does not work with the Blender Cycles renderer on Polaris 11 (GCN4) cards either, while the same driver works fine on Vega (GCN5) and Navi10 (RDNA) cards...
     
    #42 DmitryKo, Apr 20, 2021
    Last edited: Apr 20, 2021
  3. Lurkmass

    Blender has announced that they'll be dropping their OpenCL backend in their Cycles rendering engine ...
     
  4. DmitryKo

    Yep, Cycles-X for Blender 3.0 will be based exclusively on OptiX and CUDA. They're also considering SYCL, HIP and Metal, but these won't come in the initial release.

    GROMACS, the GPU backend for Folding@Home, is also going to deprecate OpenCL and rebase their code on SYCL and CUDA (which has been added in 2020) - see their IWOCL 2021 session video (4:30) and slides (page 7).

    It's not going to be a problem for Intel - their DPC++ framework is essentially SYCL 2020 and it works down to UHD500 (gen9, Skylake). As for AMD, they still have a full year ahead to make up their minds about their priorities in API support; I guess they could at least update their OpenCL drivers with the latest LLVM/Clang infrastructure to support SPIR-V and 'C++ for OpenCL', so that third-party SYCL runtimes like hipSYCL could target existing AMD hardware (not just the Radeon Instinct series on Linux).


    BTW IWOCL 2021 didn't have major announcements this year; OpenCL WG session (video and slides) and SYCL WG session (video and slides) were basically a retelling of the 2020 presentations above, but there were some sessions on implementing C++20 support and/or C++ standard libraries (libcxx).
     
    #44 DmitryKo, May 1, 2021
    Last edited: May 2, 2021
  5. Lurkmass

    AMD were considering releasing an OpenCL extension for ray tracing. They were also experimenting with HIP for Blender's Cycles renderer, and Brecht disagrees with AMD's suggestion to do offline compilation for rendering kernels because he thinks it's too expensive. Metal is unlikely to be a feasible target for Cycles since its toy shading language doesn't properly support pointers or unstructured control flow. Metal doesn't have a single-source programming model like CUDA or HIP either, so Metal shaders would need to be in a separate file ...

    As for GROMACS, I think they'll be disappointed in the end if they truly expect to "write once and run anywhere" (between AMD/Intel) with SYCL. They are already struggling with performance on more complex kernels, needing vendor-specific code branches, and are seeing large regressions with hipSYCL in a very early implementation of their SYCL backend. I think the team should consider making an abstraction for other potential GPU backends like DPC++ and HIP, as well as CUDA, if they want to get the most performance out of all vendors ...

    AMD have been committed to sticking with their HIP API with no signs of changing, and making an OpenCL SPIR-V compiler would be a monumental task with the end result being lower performance and more maintenance, which is why they don't favour that approach. This direction could change depending on whether other vendors (ideally Nvidia) are willing to implement an OpenCL SPIR-V compiler and OpenCL 2.x too. They also started shipping their HIP libraries and runtimes on Windows as well ...
     
  6. Alessio1989

    Regular Newcomer

    Joined:
    Jun 6, 2015
    Messages:
    609
    Likes Received:
    320
    AMD struggle to invest in GPGPU on Windows despite it being a big market (Autodesk and Adobe software are just a small slice of the cake); thanks to WSL, something is finally moving... But investing in a butchered dead horse like OpenCL does not make any sense anymore.
     
  7. DmitryKo

    I don't really buy that intermediate representations like LLVM and SPIR-V would result in lower performance. The recommended object code format in CUDA is PTX (Parallel Thread Execution), an intermediate format which is a subset of LLVM 3.7 (used by NVPTX back-end in Clang/LLVM and NVVM back-end in the NVCC compiler), even though CUDA also supports architecture-specific machine code.

    https://docs.nvidia.com/cuda/nvvm-ir-spec/index.html
    https://docs.nvidia.com/cuda/cuda-binary-utilities/index.html#instruction-set-ref

    (LLVM IR is a high-level language based around an abstracted general purpose central processor; SPIR-V arguably is an even higher level abstraction suitable for GPU parallel processing. Both are a full equivalent of your C/C++, GLSL, HLSL etc. source code, though converted to a translator-friendly, machine-readable binary format.)


    Khronos Group maintains an open-source SPIR-V to LLVM translator, a SPIR-V back-end for LLVM, the SPIRV-Cross converter (to HLSL/GLSL/Metal), etc. LLVM maintains the Multi-Level IR Compiler Framework that supports SPIR-V to LLVM conversion, and the AMDGPU back-end. AMD just needs to devote more developer resources to these open source projects, rather than starting their own implementations and abandoning them soon after.

    https://github.com/KhronosGroup/SPIRV-LLVM-Translator
    https://github.com/KhronosGroup/SPIRV-Cross
    https://github.com/KhronosGroup/SPIR/

    https://mlir.llvm.org/docs/SPIRVToLLVMDialectConversion/
    https://mlir.llvm.org/docs/Dialects/SPIR-V/

    https://github.com/KhronosGroup/LLVM-SPIRV-Backend
    https://www.phoronix.com/scan.php?page=news_item&px=Intel-2021-LLVM-SPIR-V-Backend

    https://llvm.org/docs/AMDGPUUsage.html
    https://rocmdocs.amd.com/en/latest/ROCm_Compiler_SDK/ROCm-Native-ISA.html#memory-model

    amdhip64.dll has been included with Radeon Adrenalin drivers since 2019; however, ROCm/HIP development is still not supported on Windows, and the HCC compiler officially only supports the Radeon Instinct MI series, i.e. GCN4 (Polaris 11), GCN5 (Vega) and CDNA.

    https://github.com/ROCm-Developer-Tools/HIP/commit/e2bf34cd5e6444ee04adae7c0c496bf52cff4f31
    https://github.com/illuhad/hipSYCL/issues/78#issuecomment-582810780
    https://github.com/ROCm-Developer-Tools/HIP/issues/84


    GROMACS has always based their GPGPU path on CUDA. Their IWOCL 2021 presentation says that support for OpenCL 1.x was implemented on top of their pre-existing CUDA abstractions, and so is support for SYCL 2020 and DPC++.

    AFAIK, the AMD HCC compiler also supports OpenCL C 2.x, so it can be made to support C++ for OpenCL, and hipSYCL would be better off directly targeting the OpenCL dialect of C++, rather than jumping through hoops trying to abstract SYCL/OpenCL on top of HIP/CUDA...

    AMD just need to take the OptiX API and call it HippiX.

    https://developer.nvidia.com/blog/how-to-get-started-with-optix-7/
     
    #47 DmitryKo, May 9, 2021 at 1:23 AM
    Last edited: May 9, 2021 at 1:29 AM
  8. Lurkmass

    There's a big difference between SPIR-V and PTX. SPIR-V is designed by a committee where every participating member has to compromise and expose only features that all vendors can support, which constrains compiler design. The PTX specification is controlled solely by Nvidia and is designed to be forward compatible with future Nvidia HW, but even better is that new iterations of PTX don't have to retain backwards compatibility with their older HW either. Your example of SPIR-V and PTX only underlines the apparent tradeoff that compiler designers have to make. Pick your poison (SPIR-V/portability or PTX/performance/simplicity) ...
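
    To make the forward-compatibility point concrete, here is a rough toy model in C++ of how a loader could choose between native machine code and PTX embedded in the same binary. It assumes the compatibility rules Nvidia documents for CUDA (native code runs on devices with the same major compute capability and an equal or higher minor; PTX can be JIT-compiled for any device of equal or higher capability). The names Image, LoadPath and pick_load_path are made up for illustration and are not part of any CUDA API.

```cpp
#include <cassert>
#include <vector>

// Hypothetical types for illustration only; not a real driver interface.
enum class LoadPath { Native, JitFromPtx, Unsupported };

struct Image {
    bool is_ptx;   // true: PTX (virtual ISA); false: native machine code
    int cc_major;  // compute capability the image was built for
    int cc_minor;
};

// Prefer a compatible native image; otherwise fall back to JIT-compiling PTX.
LoadPath pick_load_path(const std::vector<Image>& fat_binary,
                        int dev_major, int dev_minor) {
    assert(dev_major >= 0 && dev_minor >= 0);
    bool can_jit = false;
    for (const Image& img : fat_binary) {
        const bool same_major_ok =
            img.cc_major == dev_major && img.cc_minor <= dev_minor;
        if (!img.is_ptx) {
            // Native code: same major version, minor not newer than the device.
            if (same_major_ok) return LoadPath::Native;
        } else if (img.cc_major < dev_major || same_major_ok) {
            // PTX: usable on any device of equal or higher capability.
            can_jit = true;
        }
    }
    return can_jit ? LoadPath::JitFromPtx : LoadPath::Unsupported;
}
```

    Under this model, a binary carrying native code and PTX for 7.0 loads natively on a 7.5 device, goes through the JIT on an 8.6 device, and cannot run on a 6.1 device at all.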

    Even if SPIR-V doesn't have a performance deficit, it doesn't mean that it'll have the same maintenance cost as a compiler that only emits native code. Making an offline compiler takes far less effort and is less error prone as well, while making a compiler for an IR takes more resources and introduces more bugs in the process too ...

    SPIRV-Cross doesn't support the OpenCL dialect of SPIR-V at all, which has features like pointers for local and private memory or unstructured control flow along with several other things. Most vendors only support SPIR-V's shader capabilities, so they can't do any of the fun things that are exclusive to its kernel capabilities that we see in OpenCL. None of these projects are relevant to implementing a SPIR-V compiler with kernel capabilities, and AMD aren't avoiding these projects because they want to; it's because making this SPIR-V compiler is too much work for virtually no return. LLVM is just an infrastructure for a collection of different backends. AMD still has to implement a SPIR-V compiler over there ...

    AMD is using LLVM for its HIP-Clang backend but they aren't going to make a SPIR-V backend. Same goes for Nvidia, where they have a CUDA-Clang backend for LLVM but there's no SPIR-V backend for them either. Only Intel has a SPIR-V compiler in one of the LLVM backends. It's amazing how LLVM can be used for many different frontends (CUDA/DPC++/HIP) to target their unique backends (GCN/PTX/SPIR-V) as well, so LLVM alone isn't going to get us closer to portability ...

    HCC has been deprecated in favour of the HIP-Clang compiler and there is a project that advertises the usage of ROCm on Windows ...

    They should go beyond just their CUDA abstraction. SYCL doesn't have much of a future outside of Intel, and even then DPC++ is their superior version of SYCL. Even the GROMACS team won't dare use SYCL (PTX backend) on Nvidia HW since they don't totally buy into its portability claims either, so I think they're bound to learn this the really hard way when experimenting with SYCL for AMD/Intel ...

    AMD HCC only supports offline compilation into GCN bytecode. What good are all these source languages if vendors can't agree to have one IR to rule all implementations? You do realize that both ARM and x86 support C++ as a source language, but they are by no means consistent with each other even with the same C++ code or compiler. The Clang compiler can detect undefined behaviour in the C++ code too for this express purpose ...
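
    One small illustration of the same C++ source behaving differently across targets is the signedness of plain char, which is typically signed on x86 ABIs and unsigned on ARM Linux ABIs. Strictly speaking this is implementation-defined rather than undefined behaviour, and the helper names below are hypothetical, but it shows the kind of divergence meant here along with a portable way around it.

```cpp
#include <cassert>
#include <climits>
#include <limits>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

// Implementation-defined: whether plain char is signed on this target.
constexpr bool kCharIsSigned = std::numeric_limits<char>::is_signed;

// Non-portable: for bytes >= 0x80 the result depends on char's signedness.
int byte_value_naive(char c) { return c; }

// Portable: converting through unsigned char always yields 0..255.
int byte_value_portable(char c) {
    const int v = static_cast<unsigned char>(c);
    assert(v >= 0 && v <= UCHAR_MAX);
    return v;
}
```

    For the byte 0xE9, byte_value_portable gives 233 on every target, while byte_value_naive gives 233 where char is unsigned and -23 where it is signed.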
     