| ===================== |
| Clang Offload Bundler |
| ===================== |
| |
| .. contents:: |
| :local: |
| |
| .. _clang-offload-bundler: |
| |
| Introduction |
| ============ |
| |
| For heterogeneous single source programming languages, use one or more |
| ``--offload-arch=<target-id>`` Clang options to specify the target IDs of the |
| code to generate for the offload code regions. |
| |
| The tool chain may perform multiple compilations of a translation unit to |
| produce separate code objects for the host and potentially multiple offloaded |
| devices. The ``clang-offload-bundler`` tool may be used as part of the tool |
| chain to combine these multiple code objects into a single bundled code object. |
| |
| The tool chain may use a bundled code object as an intermediate step so that |
| each tool chain step consumes and produces a single file as in traditional |
| non-heterogeneous tool chains. The bundled code object contains the code objects |
| for the host and all the offload devices. |
| |
| A bundled code object may also be used to bundle just the offloaded code |
| objects, and embedded as data into the host code object. The host compilation |
| includes an ``init`` function that will use the runtime corresponding to the |
| offload kind (see :ref:`clang-offload-kind-table`) to load the offload code |
| objects appropriate to the devices present when the host program is executed. |
| |
| .. _clang-bundled-code-object-layout: |
| |
| Bundled Code Object Layout |
| ========================== |
| |
| The layout of a bundled code object is defined by the following table: |
| |
| .. table:: Bundled Code Object Layout |
| :name: bundled-code-object-layout-table |
| |
| =================================== ======= ================ =============================== |
| Field Type Size in Bytes Description |
| =================================== ======= ================ =============================== |
| Magic String string 24 ``__CLANG_OFFLOAD_BUNDLE__`` |
| Number Of Code Objects integer 8 Number od bundled code objects. |
| 1st Bundle Entry Code Object Offset integer 8 Byte offset from beginning of |
| bundled code object to 1st code |
| object. |
| 1st Bundle Entry Code Object Size integer 8 Byte size of 1st code object. |
| 1st Bundle Entry ID Length integer 8 Character length of bundle |
| entry ID of 1st code object. |
| 1st Bundle Entry ID string 1st Bundle Entry Bundle entry ID of 1st code |
| ID Length object. This is not NUL |
| terminated. See |
| :ref:`clang-bundle-entry-id`. |
| \... |
| Nth Bundle Entry Code Object Offset integer 8 |
| Nth Bundle Entry Code Object Size integer 8 |
| Nth Bundle Entry ID Length integer 8 |
| Nth Bundle Entry ID string 1st Bundle Entry |
| ID Length |
| 1st Bundle Entry Code Object bytes 1st Bundle Entry |
| Code Object Size |
| \... |
| Nth Bundle Entry Code Object bytes Nth Bundle Entry |
| Code Object Size |
| =================================== ======= ================ =============================== |
| |
| .. _clang-bundle-entry-id: |
| |
| Bundle Entry ID |
| =============== |
| |
| Each entry in a bundled code object (see |
| :ref:`clang-bundled-code-object-layout`) has a bundle entry ID that indicates |
| the kind of the entry's code object and the runtime that manages it. |
| |
| Bundle entry ID syntax is defined by the following BNF syntax: |
| |
| .. code:: |
| |
| <bundle-entry-id> ::== <offload-kind> "-" <target-triple> [ "-" <target-id> ] |
| |
| Where: |
| |
| **offload-kind** |
| The runtime responsible for managing the bundled entry code object. See |
| :ref:`clang-offload-kind-table`. |
| |
| .. table:: Bundled Code Object Offload Kind |
| :name: clang-offload-kind-table |
| |
| ============= ============================================================== |
| Offload Kind Description |
| ============= ============================================================== |
| host Host code object. ``clang-offload-bundler`` always includes |
| this entry as the first bundled code object entry. For an |
| embedded bundled code object this entry is not used by the |
| runtime and so is generally an empty code object. |
| |
| hip Offload code object for the HIP language. Used for all |
| HIP language offload code objects when the |
| ``clang-offload-bundler`` is used to bundle code objects as |
| intermediate steps of the tool chain. Also used for AMD GPU |
| code objects before ABI version V4 when the |
| ``clang-offload-bundler`` is used to create a *fat binary* |
| to be loaded by the HIP runtime. The fat binary can be |
| loaded directly from a file, or be embedded in the host code |
| object as a data section with the name ``.hip_fatbin``. |
| |
| hipv4 Offload code object for the HIP language. Used for AMD GPU |
| code objects with at least ABI version V4 when the |
| ``clang-offload-bundler`` is used to create a *fat binary* |
| to be loaded by the HIP runtime. The fat binary can be |
| loaded directly from a file, or be embedded in the host code |
| object as a data section with the name ``.hip_fatbin``. |
| |
| openmp Offload code object for the OpenMP language extension. |
| ============= ============================================================== |
| |
| **target-triple** |
| The target triple of the code object. |
| |
| **target-id** |
| The canonical target ID of the code object. Present only if the target |
| supports a target ID. See :ref:`clang-target-id`. |
| |
| Each entry of a bundled code object must have a different bundle entry ID. There |
| can be multiple entries for the same processor provided they differ in target |
| feature settings. If there is an entry with a target feature specified as *Any*, |
| then all entries must specify that target feature as *Any* for the same |
| processor. There may be additional target specific restrictions. |
| |
| .. _clang-target-id: |
| |
| Target ID |
| ========= |
| |
| A target ID is used to indicate the processor and optionally its configuration, |
| expressed by a set of target features, that affect ISA generation. It is target |
| specific if a target ID is supported, or if the target triple alone is |
| sufficient to specify the ISA generation. |
| |
| It is used with the ``-mcpu=<target-id>`` and ``--offload-arch=<target-id>`` |
| Clang compilation options to specify the kind of code to generate. |
| |
| It is also used as part of the bundle entry ID to identify the code object. See |
| :ref:`clang-bundle-entry-id`. |
| |
| Target ID syntax is defined by the following BNF syntax: |
| |
| .. code:: |
| |
| <target-id> ::== <processor> ( ":" <target-feature> ( "+" | "-" ) )* |
| |
| Where: |
| |
| **processor** |
| Is a the target specific processor or any alternative processor name. |
| |
| **target-feature** |
| Is a target feature name that is supported by the processor. Each target |
| feature must appear at most once in a target ID and can have one of three |
| values: |
| |
| *Any* |
| Specified by omitting the target feature from the target ID. |
| A code object compiled with a target ID specifying the default |
| value of a target feature can be loaded and executed on a processor |
| configured with the target feature on or off. |
| |
| *On* |
| Specified by ``+``, indicating the target feature is enabled. A code |
| object compiled with a target ID specifying a target feature on |
| can only be loaded on a processor configured with the target feature on. |
| |
| *Off* |
| specified by ``-``, indicating the target feature is disabled. A code |
| object compiled with a target ID specifying a target feature off |
| can only be loaded on a processor configured with the target feature off. |
| |
| There are two forms of target ID: |
| |
| *Non-Canonical Form* |
| The non-canonical form is used as the input to user commands to allow the user |
| greater convenience. It allows both the primary and alternative processor name |
| to be used and the target features may be specified in any order. |
| |
| *Canonical Form* |
| The canonical form is used for all generated output to allow greater |
| convenience for tools that consume the information. It is also used for |
| internal passing of information between tools. Only the primary and not |
| alternative processor name is used and the target features are specified in |
| alphabetic order. Command line tools convert non-canonical form to canonical |
| form. |
| |
| Target Specific information |
| =========================== |
| |
| Target specific information is available for the following: |
| |
| *AMD GPU* |
| AMD GPU supports target ID and target features. See `User Guide for AMDGPU Backend |
| <https://llvm.org/docs/AMDGPUUsage.html>`_ which defines the `processors |
| <https://llvm.org/docs/AMDGPUUsage.html#amdgpu-processors>`_ and `target |
| features <https://llvm.org/docs/AMDGPUUsage.html#amdgpu-target-features>`_ |
| supported. |
| |
| Most other targets do not support target IDs. |