How to Work with CI/CD at SKAO

Step-by-step instructions for common CI/CD tasks.


Use a Specific Runner

Pipelines run on shared GitLab runners by default. Add tags to your jobs to use SKAO-specific runners.

Enable specific runners for your project:

  1. Go to Settings → CI/CD → Runners

  2. Find the runner you need under “Available specific runners”

  3. Click Enable for this project

Specify runner tags in your job:

my-job:
  tags:
    - ska-default
  script:
    - echo "Running on SKA runner"

The STFC cluster provides runners with tags stfc or docker-executor.

Note

SKAO templates automatically set runner tags using variables. Use these variables instead of hardcoding tags:

  • SKA_DEFAULT_RUNNER — Defaults to ska-default

  • SKA_K8S_RUNNER — Defaults to ska-k8s

  • SKA_GPU_RUNNER — Defaults to ska-gpu-a100


Configure CI Health Metrics

SKAO requires all projects to collect code health metrics: unit tests, linting, and coverage.

Automated collection (recommended):

Include the finaliser template and ensure your reports follow the required format:

include:
  - project: 'ska-telescope/templates-repository'
    file: 'gitlab-ci/includes/finaliser.gitlab-ci.yml'

Report requirements:

  1. Create files in the test or linting stages — do not include them in the repository

  2. Unit tests report: JUnit XML at ./build/reports/unit-tests.xml

  3. Linting report: JUnit XML at ./build/reports/linting.xml

  4. Coverage report: XML (Coverage.py format) at ./build/reports/code-coverage.xml

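Important: Copy reports in after_script, not script. A failing test command aborts the rest of script, whereas after_script still runs, so the reports are captured even if tests fail:

# Correct approach
my-test-job:
  script:
    - python3 -m pytest ...
  after_script:
    # after_script runs in a new shell that starts in the project directory
    - mkdir -p build/reports
    - cp unit-tests.xml report.json cucumber.json build/reports/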
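The report paths listed above can be checked at the end of a job, before the metrics are collected. The following is a minimal sketch and not part of the SKAO templates: the function name check_reports is hypothetical, and it only verifies that each file exists and is well-formed XML, not that it follows the JUnit or Coverage.py schema.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Paths taken from the report requirements above.
REQUIRED_REPORTS = (
    "build/reports/unit-tests.xml",
    "build/reports/linting.xml",
    "build/reports/code-coverage.xml",
)


def check_reports(root="."):
    """Return a list of problems; an empty list means every report was found and parsed."""
    problems = []
    for rel in REQUIRED_REPORTS:
        path = Path(root) / rel
        if not path.is_file():
            problems.append(f"missing: {rel}")
            continue
        try:
            ET.parse(str(path))  # raises ParseError if the file is not well-formed XML
        except ET.ParseError as exc:
            problems.append(f"not well-formed XML: {rel} ({exc})")
    return problems
```

Calling check_reports() from the project directory and failing the job when the returned list is non-empty would surface a missing report early.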

Create Manual Metrics

If you prefer to create your own ci-metrics.json file, follow this structure:

{
  "commit-sha": "cd07bea4bc8226b186dd02831424264ab0e4f822",
  "build-status": {
    "last": {
      "timestamp": 1568202193.0
    }
  },
  "coverage": {
    "percentage": 60.00
  },
  "tests": {
    "errors": 0,
    "failures": 3,
    "total": 170
  },
  "lint": {
    "errors": 4,
    "failures": 0,
    "total": 7
  }
}

Important

The ci-metrics.json file must not exist in the repository. Create it during the CI pipeline.
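One way to produce this structure during the pipeline is to derive it from the same reports the automated route uses. The sketch below is an illustration under stated assumptions, not the finaliser's actual implementation: the function names are hypothetical, the tests and lint counts are assumed to map directly onto the tests, errors, and failures attributes of JUnit <testsuite> elements, and the coverage percentage is assumed to come from the line-rate attribute of a Coverage.py XML report.

```python
import json
import time
import xml.etree.ElementTree as ET


def junit_counts(xml_text):
    """Sum errors, failures and tests over the top-level <testsuite> elements."""
    root = ET.fromstring(xml_text)
    # A JUnit report is either a single <testsuite> or a <testsuites> wrapper.
    suites = [root] if root.tag == "testsuite" else root.findall("testsuite")
    counts = {"errors": 0, "failures": 0, "total": 0}
    for suite in suites:
        counts["errors"] += int(suite.get("errors", 0))
        counts["failures"] += int(suite.get("failures", 0))
        counts["total"] += int(suite.get("tests", 0))
    return counts


def coverage_percentage(xml_text):
    """Convert the root line-rate attribute (0..1) to a percentage."""
    root = ET.fromstring(xml_text)
    return round(float(root.get("line-rate", 0)) * 100, 2)


def build_metrics(commit_sha, unit_xml, lint_xml, coverage_xml, timestamp=None):
    """Assemble a dict with the same keys as the ci-metrics.json example above."""
    return {
        "commit-sha": commit_sha,
        "build-status": {
            "last": {"timestamp": time.time() if timestamp is None else timestamp}
        },
        "coverage": {"percentage": coverage_percentage(coverage_xml)},
        "tests": junit_counts(unit_xml),
        "lint": junit_counts(lint_xml),
    }


def write_metrics(metrics, path="ci-metrics.json"):
    """Write the metrics dict as JSON; call this inside the pipeline only."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(metrics, handle, indent=2)
```

In a job, the commit SHA could be read from GitLab's predefined CI_COMMIT_SHA variable and the three XML strings from the files under build/reports/.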


Upload to Central Artefact Repository

Use the CAR_* environment variables to authenticate when uploading Python packages and OCI images to the Central Artefact Repository (CAR).

Python modules:

publish to nexus:
  stage: publish
  tags:
    - docker-executor
  variables:
    TWINE_USERNAME: $CAR_PYPI_USERNAME
    TWINE_PASSWORD: $CAR_PYPI_PASSWORD
  script:
    - scripts/validate-metadata.sh
    - pip install twine
    - twine upload --repository-url $CAR_PYPI_REPOSITORY_URL dist/*
  only:
    variables:
      - $CI_COMMIT_MESSAGE =~ /^.+$/
      - $CI_COMMIT_TAG =~ /^(([0-9]+)\.([0-9]+)\.([0-9]+)(?:-([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?)$/
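To see which tags the CI_COMMIT_TAG expression above accepts, the same pattern can be tried locally. This is a small sketch using Python's re module; GitLab evaluates the expression with its own regex engine, so treat the result as a sanity check rather than a guarantee. The helper name is_release_tag is hypothetical.

```python
import re

# Pattern copied verbatim from the CI_COMMIT_TAG expression above.
TAG_PATTERN = re.compile(
    r"^(([0-9]+)\.([0-9]+)\.([0-9]+)(?:-([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?)$"
)


def is_release_tag(tag):
    """True if the tag has the MAJOR.MINOR.PATCH[-prerelease] shape the pattern expects."""
    return TAG_PATTERN.fullmatch(tag) is not None


# Print the verdict for a few candidate tags.
for candidate in ("1.2.3", "0.4.0-rc.1", "v1.2.3", "1.2", "1.2.3+build5"):
    print(f"{candidate!r}: {is_release_tag(candidate)}")
```

Tags such as 1.2.3 and 0.4.0-rc.1 match, while a leading v, a missing patch number, or a +build suffix does not.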

OCI images:

script:
  - cd docker/tango/tango-cpp
  - echo "${CAR_OCI_REGISTRY_PASSWORD}" | docker login --username "${CAR_OCI_REGISTRY_USERNAME}" --password-stdin "${CAR_OCI_REGISTRY_HOST}"
  - make DOCKER_BUILD_ARGS="--no-cache" DOCKER_REGISTRY_USERNAME=$CAR_OCI_REGISTRY_USERNAME ...

Use GPU Runners

For jobs requiring GPU resources, use the appropriate runner tag:

gpu-job:
  tags:
    - ska-gpu-a100
  script:
    - python train_model.py

See the CI/CD Reference for the list of available GPU tags.


Build Multi-Platform OCI Images

To build ARM architecture images, use the BuildX runners:

For ARMv5:

include:
  - project: 'ska-telescope/templates-repository'
    file: 'gitlab-ci/includes/oci-image.gitlab-ci.yml'

variables:
  OCI_USE_PLATFORM_ARMV5: "true"

For ARMv8:

variables:
  OCI_USE_PLATFORM_ARMV8: "true"

Only enable one flag at a time. These jobs run on the ska-buildx runner pool.


Handle Rate Limiting

GitLab uses rate limiting to prevent abuse. An HTTP 429 (Too Many Requests) response means GitLab has rate limited your request.

Best practices:

  • Add retry logic to your scripts

  • Use caching to reduce repeated requests

  • Avoid unnecessary API calls in loops

curl --retry 30 --retry-delay 3 \
  --header "PRIVATE-TOKEN: <your_access_token>" \
  "${repository}/path/to/file.something/raw?ref=<branch_name>"
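For scripts written in Python rather than shell, the same retry idea can be sketched with the standard library. This is an illustrative helper, not an SKAO or GitLab API: the name fetch_with_retry and its parameters are hypothetical, the delay doubles after each 429 response, and a numeric Retry-After header is honoured when the server sends one.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, headers=None, attempts=5, base_delay=1.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """GET url, retrying on HTTP 429 with exponential backoff.

    opener and sleep are injectable so the function can be tested offline.
    """
    request = urllib.request.Request(url, headers=headers or {})
    for attempt in range(attempts):
        try:
            with opener(request) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Re-raise anything that is not a rate limit, or the final failure.
            if err.code != 429 or attempt == attempts - 1:
                raise
            retry_after = err.headers.get("Retry-After") if err.headers else None
            if retry_after and retry_after.isdigit():
                delay = float(retry_after)
            else:
                delay = base_delay * 2 ** attempt
            sleep(delay)
```

A call such as fetch_with_retry(url, headers={"PRIVATE-TOKEN": token}) mirrors the curl command above; errors other than 429 are raised immediately rather than retried.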

Troubleshoot a Failing Pipeline

Open the failed pipeline and select the failed job to see its error log. Common issues:

  • Linting errors — run the linter locally before pushing

  • Test failures — run tests locally with make test or similar

  • Missing dependencies — check the project’s README for setup instructions


See Also