Testing Strategy

The ICAL pipeline is responsible for improving the accuracy of measured visibilities by removing residual calibration errors (those remaining after instrumental calibration) and errors due to ionospheric effects. It does this by running several cycles of self-calibration to generate calibration solutions, which the CIMG pipeline then uses to generate images of the sky.

We have adopted the Rapthor pipeline, a direction-dependent self-calibration pipeline developed for LOFAR by a team at ASTRON, as the basis for the SKA-low ICAL pipeline. We contribute towards extending and improving Rapthor in collaboration with the ASTRON team. Rapthor must therefore meet the self-calibration requirements of both telescopes, whose upstream pre-processing, functional requirements and non-functional requirements all differ. These differences require some changes to the standard testing policy and strategy adopted at SKAO.

This testing strategy documents the approaches we are taking and the changes we have introduced in order to ensure that the ICAL pipeline is fit for SKAO purposes.

Much of the code in Rapthor will be extracted into external repositories (e.g. LSMTool) where it can more easily be maintained, tested and used by other pipelines. When making these changes, we apply the same testing strategy to these repositories as for Rapthor.

Unit testing

Unit tests are located in the Rapthor repository and run on CI/CD with every push. All tests must pass before any branch is merged to the ‘master’ branch. As per the SKAO testing strategy:

  • We follow a test-first approach whenever possible.

  • We aim for >= 75% overall test coverage on any new code added by SKAO developers and on all critical parts of the codebase. Any code left uncovered by tests must be justified, and any associated risks must be mitigated and documented.

  • We write one or more tests to confirm any bugs before making any code changes to fix them.
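As an illustration of the last point, here is a minimal sketch of a bug-confirming test in pytest style. The helper chunk_time_slots and the bug it describes are hypothetical and are not taken from Rapthor; the point is only that the test is written first, fails against the buggy code, and stays in the suite as a regression test once the fix is merged.

```python
def chunk_time_slots(n_slots, chunk_size):
    """Split n_slots time slots into (start, end) index ranges, end exclusive.

    Hypothetical helper used for illustration only; not part of Rapthor.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # min() keeps the final, shorter chunk instead of dropping it.
    return [
        (start, min(start + chunk_size, n_slots))
        for start in range(0, n_slots, chunk_size)
    ]


def test_last_partial_chunk_is_kept():
    # Written before the fix to reproduce the (hypothetical) bug report:
    # 10 slots in chunks of 4 used to lose the last two slots.
    assert chunk_time_slots(10, 4) == [(0, 4), (4, 8), (8, 10)]


def test_exact_multiple_has_no_extra_chunk():
    # Guards against over-correcting the fix with an empty trailing chunk.
    assert chunk_time_slots(8, 4) == [(0, 4), (4, 8)]
```

pytest collects both functions automatically because their names start with test_.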

Bugs are often identified by manual testing and test coverage is currently insufficient. We are therefore prioritising adding tests for existing code and automating as much of the manual testing as possible.

Code often needs to be refactored to make it easier to test. In this case we adopt the following process:

  1. Add sufficient tests to ensure the correct behaviour of the existing code

  2. Refactor to make the code easier to test more thoroughly

  3. Add additional unit tests

  4. Ensure all tests pass

  5. Refactor to improve code quality and maintainability

We use pytest for testing Python code.
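The refactoring process above could look roughly like the following sketch. All names here (summarise_solutions, mean_amplitude, the JSON layout) are assumptions made for illustration and do not correspond to actual Rapthor code. A characterisation test first pins the behaviour of a function that mixes file I/O with logic (step 1); the logic is then extracted into a pure function (step 2) that can be unit tested directly (step 3).

```python
import json
from pathlib import Path
from statistics import fmean


def mean_amplitude(amplitudes):
    """Pure logic extracted in step 2; testable without touching disk."""
    if not amplitudes:
        raise ValueError("no amplitudes given")
    return fmean(amplitudes)


def summarise_solutions(path):
    """Thin I/O wrapper; its behaviour is pinned by the step 1 test."""
    data = json.loads(Path(path).read_text())
    return mean_amplitude(data["amplitudes"])


def test_summarise_solutions_from_file(tmp_path):
    # Step 1: characterisation test written before refactoring.
    # tmp_path is pytest's built-in temporary directory fixture.
    solutions = tmp_path / "solutions.json"
    solutions.write_text(json.dumps({"amplitudes": [1.0, 2.0, 3.0]}))
    assert summarise_solutions(solutions) == 2.0


def test_mean_amplitude_single_value():
    # Step 3: additional unit test that needs no files at all.
    assert mean_amplitude([0.5]) == 0.5
```

Because the step 1 test keeps passing throughout, it gives some confidence that the refactoring in steps 2 and 5 has not changed behaviour.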

Component testing

Rapthor uses CWL to define workflows. These comprise one or more steps that run C++ components or Python scripts (from Rapthor itself or LoSoTo) on the CLI.

We have developed a mock for a CWL workflow execution so that we can test the expected inputs and outputs of workflows and the interface between them.
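The sketch below shows one way such a mock can work. It is not the mock used in Rapthor. It assumes a hypothetical wrapper, run_workflow, that invokes a CWL runner on the command line (cwltool is assumed here) and parses the output object the runner prints to stdout as JSON. The workflow name, job file and output name are also invented. Patching subprocess.run with unittest.mock lets the test check the inputs passed to the runner and the outputs handed to the next stage without executing any workflow step.

```python
import json
import subprocess
from unittest import mock


def run_workflow(cwl_file, job_file, runner="cwltool"):
    """Hypothetical wrapper: run a CWL workflow, return its output object."""
    result = subprocess.run(
        [runner, cwl_file, job_file],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def test_run_workflow_passes_inputs_and_parses_outputs():
    # Fake output object standing in for what a real run would print.
    fake_outputs = {"solutions": {"class": "File", "path": "solutions.h5"}}
    completed = subprocess.CompletedProcess(
        args=[], returncode=0, stdout=json.dumps(fake_outputs), stderr=""
    )
    with mock.patch("subprocess.run", return_value=completed) as fake_run:
        outputs = run_workflow("calibrate.cwl", "job.json")

    # Interface check: the runner received the workflow and job file in order.
    fake_run.assert_called_once()
    assert fake_run.call_args.args[0] == ["cwltool", "calibrate.cwl", "job.json"]
    # Output check: downstream code sees the declared output.
    assert outputs["solutions"]["path"] == "solutions.h5"
```

A test like this runs in milliseconds, which makes it suitable for the CI/CD runs triggered on every push.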

Since we are planning to move away from CWL towards a pure Python approach, we are only testing CWL workflows at a high level and/or where there is a high risk of introducing bugs.

Integration testing

The current testing environment for SDP pipelines is on an AWS HPC cluster. Jobs are submitted via SLURM scripts and the pipeline and all its dependencies are loaded via Spack modules.

The ska-sdp-ical repository was created to store SKAO-specific scripts, configuration files and documentation so that these can be updated and released independently of Rapthor itself.

Test data are simulated and stored in S3 buckets. To test the scientific quality of self-calibration outputs, both the input and output data need sufficient coverage and resolution in time and frequency. Given the current computational performance and scalability of the pipeline, this testing is slow (from several hours to several days). We are therefore taking a multi-layered approach to integration testing.

  1. SKAO-specific scripts and configuration files are verified for correctness upon every push and merge request to the ska-sdp-ical repository using Bats (Bash Automated Testing System).

  2. Quick tests that use tiny datasets stored in the Rapthor repository ensure that Rapthor runs to completion without failure; they do not evaluate scientific quality.

  3. Slower tests that ensure correct behaviour for different SKAO test cases (defined in collaboration with users and testers) are run nightly on CI/CD on the main branch of the ska-sdp-ical repository, using more representative test data stored outside the repository. Checks are included to ensure there are no regressions in scientific quality.

  4. Benchmarking runs use a representative dataset to track improvements or regressions in computational performance of ICAL releases that are deployed to AWS. These are run on AWS from the ska-sdp-spack repository and use scripts in the ska-sdp-ical repository and data stored on S3.
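The scientific-quality checks mentioned in item 3 could be expressed as a comparison of measured metrics against stored reference values. The sketch below assumes this approach; the metric names, the numbers and the 5% relative tolerance are hypothetical and are not taken from the ICAL test cases.

```python
def find_regressions(reference, measured, rel_tol=0.05):
    """Return the names of metrics that are missing from measured, or that
    deviate from their reference value by more than rel_tol (relative).
    """
    regressions = []
    for name, expected in reference.items():
        actual = measured.get(name)
        if actual is None or abs(actual - expected) > rel_tol * abs(expected):
            regressions.append(name)
    return regressions


def test_nightly_metrics_have_not_regressed():
    # Hypothetical reference values, e.g. recorded from an accepted run.
    reference = {"rms_noise_jy": 1.0e-4, "dynamic_range": 5000.0}
    # Hypothetical values measured from the current nightly run.
    measured = {"rms_noise_jy": 1.02e-4, "dynamic_range": 4950.0}
    assert find_regressions(reference, measured) == []
```

This simple check is symmetric, so it also flags large improvements. That can be useful as a prompt to update the reference values deliberately rather than silently.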

Note

Automated integration testing is a work in progress and most tests are currently run manually on AWS.