How to Work with CI/CD at SKAO#
Step-by-step instructions for common CI/CD tasks.
Use a Specific Runner#
Pipelines run on shared GitLab runners by default. Add tags to your jobs to use SKAO-specific runners.
Enable specific runners for your project:
Go to Settings → CI/CD → Runners
Find the runner you need under “Available specific runners”
Click Enable for this project
Specify runner tags in your job:
my-job:
  tags:
    - ska-default
  script:
    - echo "Running on SKA runner"
The STFC cluster provides runners with tags stfc or docker-executor.
Note
SKAO templates automatically set runner tags using variables. Use these variables instead of hardcoding tags:
SKA_DEFAULT_RUNNER: defaults to ska-default
SKA_K8S_RUNNER: defaults to ska-k8s
SKA_GPU_RUNNER: defaults to ska-gpu-a100
Configure CI Health Metrics#
SKAO requires all projects to collect code health metrics: unit tests, linting, and coverage.
Automated collection (recommended):
Include the finaliser template and ensure your reports follow the required format:
include:
  - project: 'ska-telescope/templates-repository'
    file: 'gitlab-ci/includes/finaliser.gitlab-ci.yml'
Report requirements:
Create the report files in the test or linting stages; do not commit them to the repository
Unit tests report: JUnit XML at ./build/reports/unit-tests.xml
Linting report: JUnit XML at ./build/reports/linting.xml
Coverage report: XML (Coverage.py format) at ./build/reports/code-coverage.xml
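The report requirements above can be checked before the finaliser runs. The following is a minimal sketch, not part of the SKAO templates: `check_reports` is a hypothetical helper that only confirms each report exists at the listed path and parses as XML. It does not validate the JUnit or Coverage.py schema.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Paths taken from the report requirements listed above.
REQUIRED_REPORTS = {
    "unit tests": "build/reports/unit-tests.xml",
    "linting": "build/reports/linting.xml",
    "coverage": "build/reports/code-coverage.xml",
}


def check_reports(root="."):
    """Return a list of problems; an empty list means every report is present and parses."""
    problems = []
    for name, rel_path in REQUIRED_REPORTS.items():
        path = Path(root) / rel_path
        if not path.is_file():
            problems.append(f"{name}: missing {rel_path}")
            continue
        try:
            # Well-formedness check only; the schema is not validated here.
            ET.parse(str(path))
        except ET.ParseError as exc:
            problems.append(f"{name}: {rel_path} is not well-formed XML ({exc})")
    return problems
```

Calling `check_reports()` from the project root at the end of a test job and printing the returned list is one way to spot a missing or truncated report early.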
Important: Copy reports in after_script, not script, to capture them even if tests fail:
# Correct approach
my-test-job:
  script:
    - python3 -m pytest ...
  after_script:
    # after_script starts in the project directory, so use a path relative to it
    - mkdir -p build/reports
    - cp unit-tests.xml report.json cucumber.json build/reports/
Create Manual Metrics#
If you prefer to create your own ci-metrics.json file, follow this structure:
{
  "commit-sha": "cd07bea4bc8226b186dd02831424264ab0e4f822",
  "build-status": {
    "last": {
      "timestamp": 1568202193.0
    }
  },
  "coverage": {
    "percentage": 60.00
  },
  "tests": {
    "errors": 0,
    "failures": 3,
    "total": 170
  },
  "lint": {
    "errors": 4,
    "failures": 0,
    "total": 7
  }
}
Important
The ci-metrics.json file must not exist in the repository. Create it
during the CI pipeline.
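One way to produce this structure during the pipeline is to derive it from the reports the jobs already generate. The sketch below is illustrative only. The function names are hypothetical, and it assumes the JUnit reports carry `errors`, `failures` and `tests` attributes on non-nested `<testsuite>` elements, and that the Coverage.py report exposes `line-rate` on its root element.

```python
import json
import time
import xml.etree.ElementTree as ET


def junit_counts(xml_text):
    """Sum errors, failures and tests over every <testsuite> in a JUnit XML document."""
    root = ET.fromstring(xml_text)
    # root.iter() includes the root itself, so a bare <testsuite> root and a
    # <testsuites> wrapper are both handled. Nested suites would be double-counted.
    suites = list(root.iter("testsuite"))
    return {
        "errors": sum(int(s.get("errors", 0)) for s in suites),
        "failures": sum(int(s.get("failures", 0)) for s in suites),
        "total": sum(int(s.get("tests", 0)) for s in suites),
    }


def coverage_percentage(xml_text):
    """Convert the root line-rate (0..1) of a Coverage.py XML report to a percentage."""
    root = ET.fromstring(xml_text)
    return round(float(root.get("line-rate", 0)) * 100, 2)


def build_metrics(commit_sha, unit_xml, lint_xml, coverage_xml, timestamp=None):
    """Assemble a dict with the same keys as the ci-metrics.json example above."""
    return {
        "commit-sha": commit_sha,
        "build-status": {
            "last": {"timestamp": time.time() if timestamp is None else timestamp}
        },
        "coverage": {"percentage": coverage_percentage(coverage_xml)},
        "tests": junit_counts(unit_xml),
        "lint": junit_counts(lint_xml),
    }


def to_json(metrics):
    """Serialise the metrics dict for writing to ci-metrics.json."""
    return json.dumps(metrics, indent=2)
```

In a CI job, the three XML strings would be read from the files under `build/reports/`, and the commit SHA could come from GitLab's predefined `CI_COMMIT_SHA` variable.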
Upload to Central Artefact Repository#
Use environment variables to upload Python packages and Docker images to the Central Artefact Repository (CAR).
Python modules:
publish to nexus:
  stage: publish
  tags:
    - docker-executor
  variables:
    TWINE_USERNAME: $CAR_PYPI_USERNAME
    TWINE_PASSWORD: $CAR_PYPI_PASSWORD
  script:
    - scripts/validate-metadata.sh
    - pip install twine
    - twine upload --repository-url $CAR_PYPI_REPOSITORY_URL dist/*
  only:
    variables:
      - $CI_COMMIT_MESSAGE =~ /^.+$/
      - $CI_COMMIT_TAG =~ /^(([0-9]+)\.([0-9]+)\.([0-9]+)(?:-([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?)$/
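To see which tags the `$CI_COMMIT_TAG` pattern in the job above accepts, the same expression can be tried locally. This is a small sketch with a hypothetical `is_release_tag` helper. It assumes Python's `re` module treats this pattern the same way GitLab's regex engine does, which should hold here because the pattern uses only basic syntax.

```python
import re

# Same pattern as the $CI_COMMIT_TAG rule in the publish job above.
RELEASE_TAG = re.compile(
    r"^(([0-9]+)\.([0-9]+)\.([0-9]+)(?:-([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?)$"
)


def is_release_tag(tag):
    """Return True if tag looks like MAJOR.MINOR.PATCH with an optional -prerelease suffix."""
    # fullmatch avoids accepting a tag that only matches up to a trailing newline.
    return RELEASE_TAG.fullmatch(tag) is not None


for candidate in ("1.2.3", "0.4.1-rc.1", "v1.2.3", "1.2", "1.2.3+build"):
    print(candidate, is_release_tag(candidate))
```

Tags with a leading `v`, a missing patch number, or `+build` metadata do not match, so a job gated on this pattern would not treat them as release tags.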
OCI images:
script:
  - cd docker/tango/tango-cpp
  - echo ${CAR_OCI_REGISTRY_PASSWORD} | docker login --username ${CAR_OCI_REGISTRY_USERNAME} --password-stdin ${CAR_OCI_REGISTRY_HOST}
  - make DOCKER_BUILD_ARGS="--no-cache" DOCKER_REGISTRY_USERNAME=$CAR_OCI_REGISTRY_USERNAME ...
Use GPU Runners#
For jobs requiring GPU resources, use the appropriate runner tag:
gpu-job:
  tags:
    - ska-gpu-a100
  script:
    - python train_model.py
See the CI/CD Reference for the list of available GPU tags.
Build Multi-Platform OCI Images#
To build ARM architecture images, use the BuildX runners:
For ARMv5:
include:
  - project: 'ska-telescope/templates-repository'
    file: 'gitlab-ci/includes/oci-image.gitlab-ci.yml'
variables:
  OCI_USE_PLATFORM_ARMV5: "true"
For ARMv8:
variables:
  OCI_USE_PLATFORM_ARMV8: "true"
Only enable one flag at a time. These jobs run on the ska-buildx runner pool.
Handle Rate Limiting#
GitLab uses rate limiting to prevent abuse. A 429 status code indicates GitLab has rate limited your request.
Best practices:
Add retry logic to your scripts
Use caching to reduce repeated requests
Avoid unnecessary API calls in loops
curl --retry 30 --retry-delay 3 \
  --header "PRIVATE-TOKEN: <your_access_token>" \
  "${repository}/path/to/file.something/raw?ref=<branch_name>"
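The same retry idea as the curl command above can be applied in a script that calls the API directly. The following is a sketch under stated assumptions, not an SKAO-provided utility: `fetch_with_retry` is a hypothetical helper built on the standard library, it retries only on HTTP 429, and it uses a numeric `Retry-After` header when one is present, falling back to exponential backoff otherwise. The `opener` and `sleep` parameters exist only so the behaviour can be exercised without network access.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, headers=None, max_attempts=5, base_delay=1.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """GET url and return the body, retrying on HTTP 429 up to max_attempts times."""
    request = urllib.request.Request(url, headers=headers or {})
    for attempt in range(1, max_attempts + 1):
        try:
            with opener(request) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Only rate-limit responses are retried; give up on the last attempt.
            if err.code != 429 or attempt == max_attempts:
                raise
            retry_after = err.headers.get("Retry-After") if err.headers else None
            if retry_after and retry_after.isdigit():
                delay = float(retry_after)  # server-suggested wait in seconds
            else:
                delay = base_delay * 2 ** (attempt - 1)  # 1x, 2x, 4x, ...
            sleep(delay)
```

A call mirroring the curl example would pass `headers={"PRIVATE-TOKEN": token}`, with the token read from a CI/CD variable rather than written into the script.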
Troubleshoot a failing pipeline#
Click the failed pipeline to see the error log. Common issues:
Linting errors: run the linter locally before pushing
Test failures: run tests locally with make test or similar
Missing dependencies: check the project’s README for setup instructions