SKA SDP Wflow Low Selfcal

This repository defines a self-calibration pipeline for SKA LOW. It relies on DP3 and WSClean. The repository is under development and should not be used for science purposes. Current documentation can be viewed at:

Getting started

This project is based on the template https://gitlab.com/ska-telescope/templates/ska-python-skeleton

Project status

This repository was created in PI18 and is currently being developed by Team Schaap.

Usage

To run this pipeline you will need the following software installed:

  • DYSCO https://github.com/aroffringa/dysco

  • DP3 https://git.astron.nl/RD/DP3

  • WSClean https://wsclean.readthedocs.io/en/latest/installation.html
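Before installing the pipeline, it can help to confirm that the executables are reachable. The following is a minimal sketch, assuming DP3 and WSClean are on your PATH under the executable names DP3 and wsclean; the helper name find_missing_tools is hypothetical and not part of this repository. DYSCO is a storage manager library rather than an executable, so it is not covered by this check.

```python
import shutil


def find_missing_tools(tools=("DP3", "wsclean")):
    """Return the names of the tools that cannot be found on PATH.

    The default executable names are assumptions; adjust them to
    match your installation.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required executables were found.")
```

If the executables are not on PATH, you can still run the pipeline by passing their full paths on the command line.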

You can then proceed and install the pipeline module from this repository using ‘pip install -e .’ from the main directory.

To run the pipeline you can use the following command (adjusting the paths to DP3 and WSClean executables):

python3 src/ska_sdp_wflow_low_selfcal/pipeline/main.py \
  --dp3_path /home/csalvoni/scratch/schaap/dp3/build/DP3 \
  --wsclean_path /home/csalvoni/scratch/schaap/wsclean/build/wsclean \
  --input_ms /var/scratch/csalvoni/rapthor_working_dir/chiara/midbands.ms \
  --work_dir /var/scratch/csalvoni/rapthor_working_dir/python_running/working_dir \
  --logging_tag $SLURM_JOB_ID \
  --resume_from_operation calibrate_1 \
  --run_single_operation False \
  --run_distributed True

The parameter “--logging_tag” sets a tag that is included in the filenames of the log files. The main workflow log file is wflow-low-selfcal.<tag>.log. Other log files are DP3.<description>.<tag>.log, wsclean.<tag>.log and dask.<tag>.log. The pipeline generates a separate DP3 log file for each DP3 run; besides the tag, its file name contains a description of the pipeline part. By default, the logging tag is the process id of the workflow script.
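The naming scheme described above can be illustrated with a short sketch. It assumes that log file names follow a prefix.tag.log pattern and that the tag falls back to the process id when none is given. The helper name log_file_name is hypothetical and is not taken from the pipeline code.

```python
import os


def log_file_name(prefix, tag=None):
    """Build a log file name of the assumed form '<prefix>.<tag>.log'.

    When no tag is given, the current process id is used, mirroring
    the default behaviour described in the text.
    """
    tag = str(os.getpid()) if tag is None else str(tag)
    return f"{prefix}.{tag}.log"


if __name__ == "__main__":
    # With a SLURM job id as tag, for example 1234:
    print(log_file_name("wflow-low-selfcal", "1234"))
    # Without a tag, the process id is used instead:
    print(log_file_name("wsclean"))
```

Using the SLURM job id as the tag, as in the command above, makes it easy to match log files to a specific job.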

The parameter “--resume_from_operation” restarts the pipeline from a given operation. Valid values are calibrate_1, predict_1, image_1, calibrate_2, image_2, calibrate_3, predict_3 and image_3.

The parameter “--run_single_operation” runs only a single operation of the pipeline, namely the one given with “--resume_from_operation”.
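How these two parameters interact can be sketched as follows. This is an illustration only: it assumes the operations run in the order in which they are listed above, and the names OPERATIONS and select_operations are hypothetical rather than taken from the pipeline code.

```python
# Operation names as listed in this README; the execution order is assumed
# to match the listing order.
OPERATIONS = [
    "calibrate_1",
    "predict_1",
    "image_1",
    "calibrate_2",
    "image_2",
    "calibrate_3",
    "predict_3",
    "image_3",
]


def select_operations(resume_from="calibrate_1", run_single=False):
    """Return the operations that would run for the given settings."""
    if resume_from not in OPERATIONS:
        raise ValueError(f"Unknown operation: {resume_from!r}")
    start = OPERATIONS.index(resume_from)
    # A single operation runs alone; otherwise run through to the end.
    end = start + 1 if run_single else len(OPERATIONS)
    return OPERATIONS[start:end]


if __name__ == "__main__":
    print(select_operations("image_2"))
    print(select_operations("predict_1", run_single=True))
```

Under these assumptions, resuming from image_2 would run image_2 and everything after it, while adding the single-operation setting would run image_2 alone.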

The parameter “--run_distributed” enables MPI for calibration and imaging. These operations then run using multiple processes and/or nodes. MPI processes are started using ‘mpirun’, which should use the (SLURM) environment to determine how many processes to run on which nodes. By default, MPI is disabled.
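The boolean parameters above take the literal values True or False on the command line. The sketch below shows one way such values could be parsed with argparse; it is not necessarily how main.py does it. The helper names str_to_bool and build_parser are hypothetical, and the False default for “--run_single_operation” is an assumption (only the default for “--run_distributed” is stated above). A plain type=bool would not work here, because bool("False") evaluates to True in Python.

```python
import argparse


def str_to_bool(value):
    """Convert a command-line string such as 'True' or 'False' to a bool."""
    text = value.strip().lower()
    if text in ("true", "1", "yes"):
        return True
    if text in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected True or False, got {value!r}")


def build_parser():
    """Build a parser for the two boolean flags (illustrative subset only)."""
    parser = argparse.ArgumentParser(description="Boolean flag parsing sketch")
    # Default assumed for illustration; not stated in this README.
    parser.add_argument("--run_single_operation", type=str_to_bool, default=False)
    # MPI is disabled by default, as described above.
    parser.add_argument("--run_distributed", type=str_to_bool, default=False)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--run_distributed", "True"])
    print(args.run_distributed, args.run_single_operation)
```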

Contribute

If you want to contribute to this project, please consult the section below.

The system used for development needs to have Python 3 and pip installed.

Installation

You can install this module via pip, using ‘pip install ska-sdp-wflow-low-selfcal’. To run in distributed mode, you will also need to install ‘mpi4py’.

If you want to install it via git, follow the instructions below.

In order to clone and work with this repository, you need to have poetry installed. You can get it with:

curl -sSL https://install.python-poetry.org | python3 -

Clone the repository with its submodules

git clone --recursive git@gitlab.com:ska-telescope/sdp/science-pipeline-workflows/ska-sdp-wflow-low-selfcal.git
cd ska-sdp-wflow-low-selfcal
git submodule init
git submodule update

Enter the poetry virtual environment and build the project

poetry shell
poetry build && poetry install

Now you can use the make instructions of the submodule:

make python-build

You can also format the code with ‘make python-format’ and check the linting with ‘make python-lint’.