Understanding CI/CD at SKAO#

Learn why SKAO uses CI/CD and how the pipeline architecture works.

Why CI/CD?#

Continuous Integration and Continuous Deployment (CI/CD) ensures that SKAO software is:

Consistently built — Every project follows the same pipeline stages
Automatically tested — Tests run on every commit
Reliably deployed — Pipelines publish artefacts to the Central Artefact Repository
Traceable — Every build is linked to a specific git commit

CI/CD reduces manual errors, speeds up delivery, and maintains code quality across the distributed SKAO development teams.

How SKAO CI/CD Works#

SKAO uses GitLab CI/CD with custom runners hosted on SKAO infrastructure.

The pipeline flow:

Developer pushes code
      ↓
GitLab detects .gitlab-ci.yml
      ↓
Pipeline triggered
      ↓
Jobs run on SKA runners
      ↓
Build → Lint → Test → Scan → Publish → Pages
      ↓
Central Artefact Repository stores artefacts
      ↓
System updates metrics and badges
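
As a rough illustration, the stage sequence in the flow above could be declared in a project's .gitlab-ci.yml as shown here. The lowercase stage names are an assumption derived from the diagram; the included templates may define or expect different names.

```yaml
# Sketch only: stage names are assumed from the flow diagram above.
stages:
  - build
  - lint
  - test
  - scan
  - publish
  - pages
```

GitLab runs stages in the order listed, and all jobs in one stage must finish before the next stage starts.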

The Templates Repository#

To standardise pipelines across all projects, SKAO maintains a templates-repository.

Benefits:

Consistency — All projects use the same job definitions
Maintainability — Updates to templates propagate to all projects
Simplicity — Developers include templates instead of writing jobs from scratch
Flexibility — Make targets allow local customisation

How it works:

Each template calls standardised make targets. This means the same commands work locally and in the pipeline:

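# In the template (GitLab CI job definition): the job only calls the make target
script:
  - make python-test

# In the project's Makefile: customise the target through variables,
# so the same options apply locally and in the pipeline
PYTHON_VARS_AFTER_PYTEST = -m 'not post_deployment' --forked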

Note

You can provide variables directly in .gitlab-ci.yml, but this is not recommended because it makes local development and pipeline behaviour diverge.
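
A minimal sketch of including a template from a project's .gitlab-ci.yml. The project path and the template file name below are illustrative assumptions, not values confirmed by this page; check the templates repository for the real ones.

```yaml
# Sketch only: project path and file name are assumptions.
include:
  - project: 'ska-telescope/templates-repository'
    file: 'gitlab-ci/includes/python.gitlab-ci.yml'
```

With an include like this, the project inherits the template's jobs and keeps its customisation in the Makefile rather than in the pipeline file.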

Kubernetes-Based Runners#

SKAO CI/CD runners operate on Kubernetes clusters, providing:

Auto-scaling — Runners scale based on demand
Isolation — Each job runs in its own container
Shared cache — Speeds up job times across runners
Docker support — Jobs can run Docker commands for container builds, served by a dedicated per-node Docker daemon

Architecture:

Runners run on nodes labelled for CI/CD jobs
A dedicated Docker daemon runs on nodes (not Docker-in-Docker) for security
BuildX worker pools support multi-platform container builds

Note

Kubernetes runners do not support docker-compose.

GPU Pipelines#

For machine learning and compute-intensive workloads, SKAO provides GPU runners.

Available clusters:

techops — Main CI/CD cluster with limited GPU support (primarily for building GPU-enabled artefacts)
dp — Data Processing cluster with more GPUs for running actual workloads

Using GPUs:

Tag your job with a GPU runner tag (e.g., ska-gpu-a100)
Set resource limits in your container configuration to claim GPU instances

GPU runners follow the same Kubernetes architecture as standard runners.
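
The two steps above might look like the following sketch. Only the ska-gpu-a100 tag comes from this page; the job name, the nvidia-smi check, and the resource name are assumptions. In particular, nvidia.com/gpu is the resource exposed by the NVIDIA device plugin for Kubernetes, which is assumed to be installed on the cluster.

```yaml
# .gitlab-ci.yml: hypothetical job routed to a GPU runner by its tag.
gpu-check:
  stage: test
  tags:
    - ska-gpu-a100
  script:
    - nvidia-smi   # lists visible GPUs, assuming the job image ships NVIDIA tooling
---
# Kubernetes container spec fragment (for example in a Helm values file).
# Assumes the NVIDIA device plugin, which exposes GPUs as nvidia.com/gpu.
resources:
  limits:
    nvidia.com/gpu: 1
```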

Multi-Platform Builds with BuildX#

SKAO supports building container images for multiple architectures (AMD64, ARM) using Docker BuildX.

How it works:

Dedicated BuildX worker pools run on ska-buildx tagged runners
QEMU emulation enables cross-platform builds
The OCI build template provides oci-image-build-armv5 and oci-image-build-armv8 jobs

When to use:

Deploying to ARM-based edge devices
Supporting multiple processor architectures
Building universal container images
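
For orientation, a hand-written multi-platform build could look like the sketch below. It is not the OCI build template itself, which already provides the jobs listed above. The job name and the IMAGE_REF variable are hypothetical; the ska-buildx tag comes from this page.

```yaml
# Sketch only: prefer the jobs provided by the OCI build template.
multi-arch-image:
  stage: build
  tags:
    - ska-buildx
  script:
    # IMAGE_REF is a placeholder for the target registry/name:tag.
    - docker buildx build --platform linux/amd64,linux/arm64 --tag "$IMAGE_REF" --push .
```

The --platform flag takes a comma-separated list, and --push publishes the resulting multi-architecture manifest in one step.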

CI Health Metrics and Badges#

SKAO tracks code health across all projects through automated metrics collection.

Required metrics:

Unit tests — Number of tests, errors, failures
Linting — Static analysis results
Coverage — Percentage of code covered by tests
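
One way a job could expose its test results is sketched below. The report location under build/reports/ and the file name are assumptions, not confirmed by this page, and the standard templates may already handle this for you.

```yaml
# Hypothetical job: the report paths are assumptions.
python-test:
  stage: test
  script:
    - make python-test
  artifacts:
    when: always   # keep reports even when tests fail
    paths:
      - build/reports/
    reports:
      junit: build/reports/unit-tests.xml
```

The reports:junit entry lets GitLab show test counts, errors, and failures in the pipeline view; how SKAO's metrics collection picks the files up is not described here.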

Badges:

Each repository displays badges that show these metrics for its default branch, giving quick visibility into project health.

Why metrics matter:

Track quality trends over time
Identify projects needing attention
Ensure compliance with SKAO standards
Support release decisions

The Central Artefact Repository#

The Central Artefact Repository (CAR) stores all published artefacts and provides:

Native tool access — Use pip, docker, helm natively
Versioning — Semantic versioning for all artefacts
Metadata — Extensible metadata for lifecycle management
Security — Vulnerability scanning, access control, provenance
Integration — APIs for DevSecOps processes

See Central Artefact Repository for full details.

Rate Limiting#

GitLab implements rate limiting to protect against denial-of-service attacks and ensure fair resource usage.

What happens:

Excessive requests receive HTTP 429 responses
Scripts must wait before retrying

Best practices:

Implement retry logic with delays
Cache results where possible
Avoid tight loops making API calls
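
A minimal sketch of these practices in a pipeline job, using curl's built-in retry handling, which treats HTTP 429 as a transient error. The job name and the API_TOKEN variable are hypothetical; CI_API_V4_URL and CI_PROJECT_ID are GitLab predefined variables.

```yaml
# Sketch only: API_TOKEN is an assumed CI/CD variable holding an access token.
list-pipelines:
  stage: test
  script:
    # Retry up to 5 times, waiting 10 seconds between attempts,
    # and save the response so later steps can reuse it.
    - >
      curl --fail --silent --show-error
      --retry 5 --retry-delay 10
      --header "PRIVATE-TOKEN: $API_TOKEN"
      --output pipelines.json
      "$CI_API_V4_URL/projects/$CI_PROJECT_ID/pipelines?per_page=20"
```

Writing the response to pipelines.json means later commands can read the file instead of calling the API again.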
