| .. _libc_gpu_testing: |
| |
| |
| ========================= |
| Testing the GPU C library |
| ========================= |
| |
| .. note:: |
   Running GPU tests with high parallelism is likely to cause spurious failures,
   out-of-resource errors, or indefinite hangs. Limiting the number of threads
   used while testing with ``LIBC_GPU_TEST_JOBS=<N>`` is highly recommended.
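
   For example, assuming the option is passed at CMake configure time, a build
   could be limited to four concurrent test jobs (the value ``4`` is only a
   placeholder) by reconfiguring an existing build directory as follows:

   .. code-block:: sh

     $> cmake build/ -DLIBC_GPU_TEST_JOBS=4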
| |
| .. contents:: Table of Contents |
| :depth: 4 |
| :local: |
| |
| Testing infrastructure |
| ====================== |
| |
| The LLVM C library supports different kinds of :ref:`tests <build_and_test>` |
| depending on the build configuration. The GPU target is considered a full build |
| and therefore provides all of its own utilities to build and run the generated |
| tests. Currently the GPU supports two kinds of tests. |
| |
| #. **Hermetic tests** - These are unit tests built with a test suite similar to |
| Google's ``gtest`` infrastructure. These use the same infrastructure as unit |
| tests except that the entire environment is self-hosted. This allows us to |
| run them on the GPU using our custom utilities. These are used to test the |
| majority of functional implementations. |
| |
#. **Integration tests** - These are lightweight tests that simply call a
   ``main`` function and check whether it returns non-zero. These are primarily
   used to test interfaces that are sensitive to threading.
| |
| The GPU uses the same testing infrastructure as the other supported ``libc`` |
| targets. We do this by treating the GPU as a standard hosted environment capable |
| of launching a ``main`` function. Effectively, this means building our own |
| startup libraries and loader. |
| |
| Testing utilities |
| ================= |
| |
We provide two utilities to execute arbitrary programs on the GPU: the
``loader`` and the startup object.
| |
| Startup object |
| -------------- |
| |
This object mimics the standard ``crt1.o`` object used by existing C library
implementations. Its job is to perform the necessary setup prior to calling the
``main`` function. For the GPU, this means exporting kernels that will perform
the necessary operations. Here we use ``_begin`` and ``_end`` to handle calling
global constructors and destructors, while ``_start`` begins the standard
execution. The following code block shows the implementation for AMDGPU
architectures.
| |
.. code-block:: c++

  // Kernel that runs the global constructors and registers the global
  // destructors to be executed once the program exits.
  extern "C" [[gnu::visibility("protected"), clang::amdgpu_kernel]] void
  _begin(int argc, char **argv, char **env) {
    LIBC_NAMESPACE::atexit(&LIBC_NAMESPACE::call_fini_array_callbacks);
    LIBC_NAMESPACE::call_init_array_callbacks(argc, argv, env);
  }

  // Kernel that runs 'main', atomically combining the return values of every
  // launched thread into a single result.
  extern "C" [[gnu::visibility("protected"), clang::amdgpu_kernel]] void
  _start(int argc, char **argv, char **envp, int *ret) {
    __atomic_fetch_or(ret, main(argc, argv, envp), __ATOMIC_RELAXED);
  }

  // Kernel that calls 'exit', running the registered destructors and
  // terminating with the given return value.
  extern "C" [[gnu::visibility("protected"), clang::amdgpu_kernel]] void
  _end(int retval) {
    LIBC_NAMESPACE::exit(retval);
  }
| |
| Loader runtime |
| -------------- |
| |
The startup object provides a GPU executable with callable kernels for the
respective runtime. We can then define a minimal runtime that will launch these
kernels on the given device. Currently we provide ``amdhsa-loader`` and
``nvptx-loader``, targeting the AMD HSA runtime and the CUDA driver runtime,
respectively. By default, these will launch the program with a single thread on
the GPU.
| |
| .. code-block:: sh |
| |
  $> clang++ crt1.o test.cpp --target=amdgcn-amd-amdhsa -mcpu=native -flto
  $> amdhsa-loader --threads 1 --blocks 1 ./a.out
  Test Passed!
| |
The loader utility will forward any arguments passed after the executable image
to the program on the GPU, as well as any environment variables that are set.
The number of threads and blocks can be controlled with the ``--threads`` and
``--blocks`` options. These also accept ``x``, ``y``, and ``z`` variants for
multidimensional grids.
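
As a sketch, and assuming the multidimensional variants are spelled
``--threads-x``, ``--threads-y``, and so on, the following would launch a
two-dimensional block of threads while forwarding two arguments and an
environment variable to the GPU program:

.. code-block:: sh

  $> FOO=bar amdhsa-loader --threads-x 32 --threads-y 4 --blocks 16 ./a.out arg1 arg2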
| |
| Running tests |
| ============= |
| |
Tests will only be built and run if a GPU target architecture is set and the
corresponding loader utility was built. These can be overridden with the
``LIBC_GPU_TEST_ARCHITECTURE`` and ``LIBC_GPU_LOADER_EXECUTABLE`` :ref:`CMake
options <gpu_cmake_options>`.
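
As a sketch, a configuration that overrides both options might look like the
following (the ``gfx90a`` architecture and the loader path are placeholders):

.. code-block:: sh

  $> cmake ../llvm -G Ninja                                \
       -DLIBC_GPU_TEST_ARCHITECTURE=gfx90a                 \
       -DLIBC_GPU_LOADER_EXECUTABLE=/path/to/amdhsa-loader

Once built, the tests can be run like any other tests. The CMake targets to use
depend on how the library was built.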
| |
| #. **Cross build** - If the C library was built using ``LLVM_ENABLE_PROJECTS`` |
| or a runtimes cross build, then the standard targets will be present in the |
| base CMake build directory. |
| |
| #. All tests - You can run all supported tests with the command: |
| |
| .. code-block:: sh |
| |
| $> ninja check-libc |
| |
   #. Hermetic tests - You can run hermetic tests with the command:
| |
| .. code-block:: sh |
| |
| $> ninja libc-hermetic-tests |
| |
   #. Integration tests - You can run integration tests with the command:
| |
| .. code-block:: sh |
| |
| $> ninja libc-integration-tests |
| |
| #. **Runtimes build** - If the library was built using ``LLVM_ENABLE_RUNTIMES`` |
| then the actual ``libc`` build will be in a separate directory. |
| |
| #. All tests - You can run all supported tests with the command: |
| |
| .. code-block:: sh |
| |
| $> ninja check-libc-amdgcn-amd-amdhsa |
| $> ninja check-libc-nvptx64-nvidia-cuda |
| |
| #. Specific tests - You can use the same targets as above by entering the |
| runtimes build directory. |
| |
| .. code-block:: sh |
| |
| $> ninja -C runtimes/runtimes-amdgcn-amd-amdhsa-bins check-libc |
| $> ninja -C runtimes/runtimes-nvptx64-nvidia-cuda-bins check-libc |
| $> cd runtimes/runtimes-amdgcn-amd-amdhsa-bins && ninja check-libc |
| $> cd runtimes/runtimes-nvptx64-nvidia-cuda-bins && ninja check-libc |
| |
| Tests can also be built and run manually using the respective loader utility. |
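
For example, mirroring the AMDGPU invocation above, a test for NVPTX could be
compiled against the startup object and launched directly. This is only a
sketch: ``crt1.o`` stands in for the installed startup object and ``sm_89`` is
a placeholder architecture.

.. code-block:: sh

  $> clang++ crt1.o test.cpp --target=nvptx64-nvidia-cuda -march=sm_89 -flto
  $> nvptx-loader --threads 1 --blocks 1 ./a.out
  Test Passed!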