GPU Pipelines and Workloads

This section describes requirements and guidelines for deploying and testing a new Python project that uses GPUs on GitLab. The guidelines build upon the Python Coding Guidelines but are specific to the GPU environment: they describe how to specify a GPU runner for pipeline jobs, and how to deploy a workload on a GPU node in the cluster using a Kubernetes chart deployment.

Running pipeline jobs on a GPU node

A template for a pipeline job on a GPU node is provided at gitlab-ci/includes/gpu.gitlab-ci.yml. This template adds a new test stage to the pipeline, which runs the workload on the GPU node.

To use this template, add the following to your .gitlab-ci.yml file:

include:
    # GPU
  - project: 'ska-telescope/templates-repository'
    file: 'gitlab-ci/includes/gpu.gitlab-ci.yml'

You will probably also want to add the following to your .gitlab-ci.yml file, so that the standard Python jobs still run the tests that do not require a GPU:

include:
    # Python
  - project: 'ska-telescope/templates-repository'
    file: 'gitlab-ci/includes/python.gitlab-ci.yml'
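
If you use both templates, the two entries can be combined into a single include list:

include:
    # Python
  - project: 'ska-telescope/templates-repository'
    file: 'gitlab-ci/includes/python.gitlab-ci.yml'
    # GPU
  - project: 'ska-telescope/templates-repository'
    file: 'gitlab-ci/includes/gpu.gitlab-ci.yml'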

Alternatively, if you don’t want to use the provided GPU template, any job in your pipeline can be configured to use the GPU node by adding the following tag to it:

tags:
    - k8srunner-gpu-v100

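For illustration, a hand-written job using this tag might look like the sketch below; the job name, image, and script are assumptions, not part of any template:

test-gpu:
  stage: test
  image: nvidia/cuda:11.0-base   # illustrative; pick an image that has your test dependencies
  tags:
    - k8srunner-gpu-v100         # schedules the job on the GPU runner
  script:
    - pytest -m gputest          # run only the tests marked gputest
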
The unit tests themselves should be marked with the gputest marker:

import pytest

import dummy  # the example module under test


@pytest.mark.gputest
def test_cuda():
    """A dummy test for a CUDA function."""
    test = dummy.cuda_dummy_function()
    assert test == "cuda-function"
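
Because gputest is a custom marker, you may want to register it in your pytest configuration to avoid unknown-marker warnings. The marker also makes it easy to select or skip the GPU tests locally, for example:

pytest -m gputest          # run only the GPU tests
pytest -m "not gputest"    # run everything except the GPU tests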

Deploying a workload on a GPU node

The STENCIL project provides a template deployment chart that can be used to deploy a workload on a GPU node.

All that’s needed to deploy the existing chart is to issue the command:

make k8s-install-chart
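
Once the chart is installed, you can check that the container actually sees the device. Assuming a hypothetical pod name created by the chart, something like:

kubectl get pods                           # find the pod created by the chart
kubectl exec <gpu-pod-name> -- nvidia-smi  # should report the reserved GPU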

If you want to create your own chart that deploys a workload to a GPU node, you need to define the following in addition to the usual steps needed for a CPU workload:

In the values.yaml file:

# [...]
image:
    repository: nvidia/cuda # The image to use
    tag: "11.0-base" # The tag to use if needed. Otherwise, leave the tag empty (i.e. "")

# [...]
resources:
    limits:
        nvidia.com/gpu: 1 # The number of GPUs to use (an integer; each unit reserves a full physical device)
    requests:
        nvidia.com/gpu: 1 # For GPUs, Kubernetes requires requests and limits to be set to the same value

# [...]
# The GPU nodes have a taint that prevents purely CPU workloads from being scheduled on them. The following toleration allows your workload to be scheduled on those nodes despite the taint:
tolerations:
  - key: "nvidia.com/gpu"
    operator: "Equal"
    value: "true"
    effect: "NoExecute"
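
This toleration matches a taint of the form nvidia.com/gpu=true:NoExecute on the GPU nodes. If you have access to the cluster, you can inspect a node's taints with, for a hypothetical node name:

kubectl describe node <gpu-node-name> | grep -A 2 Taints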

NOTE: The GPU resources are scarce. Reserving 1 GPU uses a full physical device for your workload and can quickly exhaust the available GPU resources.

In the deployment.yaml file:

# [...]
spec:
    template:
        spec:
            runtimeClassName: "nvidia"

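For reference, a minimal sketch of how these pieces might fit together in the chart's templates/deployment.yaml, following the standard scaffolding generated by helm create (the container name and image wiring are illustrative, not taken from the STENCIL chart):

# [...]
spec:
  template:
    spec:
      runtimeClassName: "nvidia"
      tolerations:
        {{- toYaml .Values.tolerations | nindent 8 }}
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
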
Under normal circumstances, the container should be deleted once the workload has finished. If you need to remove the deployed chart manually, issue the following command:

make k8s-uninstall-chart

Summary

This basic template project is available on GitLab and demonstrates the following:

  1. Provides functions and unit tests that run on a GPU worker node runner by including the GPU GitLab CI/CD template.

  2. Defines an example chart that deploys a workload to a GPU node.