SKA SDP Spectral Line Imaging Pipeline

A spectral line imaging pipeline developed by Team DHRUVA for SKAO.

The repository is hosted on GitLab. The documentation is available at this page.

If you wish to contribute to this repository, please refer to the Developer Guide.

Using the pipeline

Installation with pip

The latest release is available in SKA's pip repository. You can install the package using the following command:

pip install --extra-index-url https://artefact.skao.int/repository/pypi-internal/simple ska-sdp-spectral-line-imaging

📝 For xradio to work on macOS, python-casacore must be pre-installed using pip install python-casacore.

Once installed, the spectral line imaging pipeline is available as a Python package and as the ska-sdp-spectral-line-imaging CLI command.

Run ska-sdp-spectral-line-imaging --help to get help on different subcommands.
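
To confirm the installation from Python, a minimal sketch along these lines could be used. It assumes the distribution name is ska-sdp-spectral-line-imaging (the name passed to pip above) and that the CLI executable is on PATH under the same name. The helper name installation_status is hypothetical and not part of the package.

```python
from importlib import metadata
import shutil


def installation_status(dist_name="ska-sdp-spectral-line-imaging"):
    """Return (version, cli_path); either value is None when it is not found."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    # Assumption: the CLI executable has the same name as the distribution.
    cli_path = shutil.which(dist_name)
    return version, cli_path


if __name__ == "__main__":
    version, cli_path = installation_status()
    print(f"installed version: {version}")
    print(f"CLI executable: {cli_path}")
```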

Containerized usage

The pipeline can also be deployed inside an OCI container.

  1. Run the following command to pull the OCI image.

    docker pull artefact.skao.int/ska-sdp-spectral-line-imaging:1.1.0
    

    The entrypoint of the above image is set to the executable ska-sdp-spectral-line-imaging.

  2. Run the image with volume mounts to enable read and write access to storage.

    docker run [-v local:container] <image-name> [run | install-config] ...
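
As a concrete illustration of step 2, the sketch below assembles a docker run command with a single volume mount. The image name and the --input, --config and --output options come from this README; the host directory, the /data mount point, the file names and the helper build_docker_command are hypothetical placeholders.

```python
import shlex

IMAGE = "artefact.skao.int/ska-sdp-spectral-line-imaging:1.1.0"


def build_docker_command(host_dir, container_dir, pipeline_args, image=IMAGE):
    """Build a `docker run` argument list with one volume mount."""
    return ["docker", "run", "-v", f"{host_dir}:{container_dir}", image, *pipeline_args]


command = build_docker_command(
    "/path/to/data",  # hypothetical host directory holding input, config and output
    "/data",          # hypothetical mount point inside the container
    [
        "run",
        "--input", "/data/input",
        "--config", "/data/config.yml",
        "--output", "/data/output",
    ],
)

# Print a shell-safe version of the command for copy and paste.
print(shlex.join(command))
```

Because the image entrypoint is the pipeline executable, everything after the image name is passed to ska-sdp-spectral-line-imaging as its arguments.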
    

You can also spin up a local containerized Dask cluster using the docker image and run the pipeline within it. Please refer to the "Running on a Local Dask Cluster" section of the documentation.

Install the config

Install the default config YAML of the pipeline to a specific directory using the install-config subcommand.

ska-sdp-spectral-line-imaging install-config --config-install-path path/to/dir

Parameters of the default configuration can be overridden with the --set option:

ska-sdp-spectral-line-imaging install-config --config-install-path path/to/dir \
                    --set parameters.imaging.gridding_params.cell_size 0.2 \
                    --set parameters.predict_stage.cell_size 0.2 \
                    --set parameters.read_model.pols '[XX,YY]'
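
Each --set argument pairs a dotted path with a value. Assuming the dotted path addresses nested keys of the config YAML (an inference from the examples above, not a description of the pipeline's internals), the effect can be sketched on a plain dictionary. The function set_by_path and the starting values are hypothetical.

```python
def set_by_path(config, dotted_path, value):
    """Set config[a][b][c] = value for dotted_path 'a.b.c', creating levels as needed."""
    *parents, leaf = dotted_path.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value
    return config


# Hypothetical starting values, for illustration only.
config = {"parameters": {"imaging": {"gridding_params": {"cell_size": 1.0}}}}

set_by_path(config, "parameters.imaging.gridding_params.cell_size", 0.2)
set_by_path(config, "parameters.read_model.pols", ["XX", "YY"])

print(config["parameters"]["imaging"]["gridding_params"]["cell_size"])
print(config["parameters"]["read_model"]["pols"])
```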

Run the pipeline

Run the spectral line pipeline using the run subcommand.

For example:

ska-sdp-spectral-line-imaging run \
--input /path/to/input \
--config /path/to/config \
--output /path/to/output/dir

For all the options, run ska-sdp-spectral-line-imaging run --help.
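
When the pipeline has to be launched from a Python script, one option is to call the CLI through subprocess. The sketch below uses only the run subcommand and the --input, --config and --output options shown above; the helper names build_run_command and run_pipeline are hypothetical.

```python
import subprocess


def build_run_command(input_path, config_path, output_dir):
    """Build the argument list for the `run` subcommand."""
    return [
        "ska-sdp-spectral-line-imaging",
        "run",
        "--input", str(input_path),
        "--config", str(config_path),
        "--output", str(output_dir),
    ]


def run_pipeline(input_path, config_path, output_dir):
    """Run the pipeline; raises CalledProcessError on a non-zero exit code."""
    command = build_run_command(input_path, config_path, output_dir)
    return subprocess.run(command, check=True)
```

For example, run_pipeline("/path/to/input", "/path/to/config", "/path/to/output/dir") would issue the same command as the shell example above.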

Providing MSv2 and MSv4

The pipeline accepts either MSv2 measurement sets or MSv4 processing sets as input and automatically handles both formats. When using MSv2 input, you can configure chunking parameters in the config YAML file. Note that chunking settings are ignored when MSv4 input is provided.

Autocompletions for bash and zsh

bash

export YAML_PATH=/path/to/pipeline/default
source ./scripts/bash-completions.bash

zsh

autoload -U +X bashcompinit && bashcompinit
source ./scripts/bash-completions.bash

Some pre-requisites of the pipeline

Regarding the model visibilities and model images

If your MSv2 data already contains a MODEL_DATA column, you don't need to run the read_model and predict_stage stages, which can be turned off in the config file. The continuum subtraction stage will operate on the VISIBILITY and existing VISIBILITY_MODEL variables.

If the MODEL_DATA column is not present, you can predict the model visibilities by passing model FITS images to the pipeline via the read_model stage. The predict_stage will then generate VISIBILITY_MODEL data by running the predict operation on the model image data.

About the model FITS images:

  1. If the FITS image is a spectral cube, it must have the same frequency coordinates (reference frequency, frequency delta and number of channels) as the measurement set.

  2. Currently read_model does not support reading data of multiple polarizations from a single FITS file. So for each polarization value in the processing set, there has to be a separate FITS image.
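
The two requirements above can be checked before running the pipeline. The sketch below is an illustration only and is not the pipeline's own validation: it assumes a linear spectral axis described by the standard FITS keywords CRVAL, CDELT and CRPIX, compares frequencies within a relative tolerance (the pipeline may require exact agreement), and leaves reading the FITS header and the measurement set to the caller. All function names, values and file names are hypothetical.

```python
import math


def fits_channel_frequencies(crval, cdelt, crpix, nchan):
    """Channel frequencies (Hz) of a linear FITS spectral axis.

    FITS pixel indices are 1-based: freq(p) = crval + (p - crpix) * cdelt.
    """
    return [crval + (p - crpix) * cdelt for p in range(1, nchan + 1)]


def frequency_axes_match(fits_freqs, ms_freqs, rel_tol=1e-9):
    """True when both axes have the same length and matching channel frequencies."""
    if len(fits_freqs) != len(ms_freqs):
        return False
    return all(
        math.isclose(f, m, rel_tol=rel_tol) for f, m in zip(fits_freqs, ms_freqs)
    )


def missing_polarizations(required_pols, fits_by_pol):
    """Polarizations of the processing set that have no model FITS image."""
    return [pol for pol in required_pols if pol not in fits_by_pol]


# Hypothetical values, for illustration only.
ms_freqs = [1.0e9 + i * 1.0e5 for i in range(4)]
cube_freqs = fits_channel_frequencies(crval=1.0e9, cdelt=1.0e5, crpix=1, nchan=4)

print(frequency_axes_match(cube_freqs, ms_freqs))
print(missing_polarizations(["XX", "YY"], {"XX": "model-XX.fits"}))
```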