Dockerfiles for RASCIL

RASCIL supports publishing various docker images. The related Dockerfiles can be found in the docker directory and its subdirectories. The images are based on a python wheel built from RASCIL.

Makefiles are also included, which support building, pushing, and tagging the images. The images are named as specified in the release file of the docker image directory, and tagged with the RASCIL version stored in rascil/version.py.

There are various directories for docker files:

  • rascil-base: A minimal RASCIL, without data

  • rascil-full: Base with data

  • rascil-notebook: Supports running Jupyter notebooks

  • rascil-imaging-qa: Runs the Continuum Imaging Quality Assessment tool

  • rascil-rcal: Supports running RCAL as a consumer of SDP visibility receive data. Note that this is not published as of rascil==1.1.0

Automatic publishing

The docker images are automatically built by the CI pipeline.

When the repository is tagged and a new version is released, a versioned docker image of each type is published to the Central Artifact Repository (CAR). To find out what versions you can download, look for the relevant RASCIL docker image in the CAR. Example:

artefact.skao.int/rascil-base:1.0.0
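
For example, to pull this released image with docker:

docker pull artefact.skao.int/rascil-base:1.0.0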

Upon every commit, an image tagged with the commit tag is published to the GitLab Registry. Note that these are development images and should be used with caution.

registry.gitlab.com/ska-telescope/external/rascil/rascil-imaging-qa:<commit-tag>

The list of available development images, including their commit tags, can be found at: https://gitlab.com/ska-telescope/external/rascil-main/container_registry/

Build, push, and tag a set of Dockerfiles

If you want to build an image yourself, follow these steps:

  • cd into one of the subdirectories

  • Build the image with make build
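
For example, to build the base image (starting from the docker directory of the repository):

cd rascil-base
make build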

Other useful make commands:

  • push pushes the images to the docker registry

  • push_latest pushes the :latest tag

  • push_version pushes a version tag without the git SHA

Note: the above make commands use environment variables to determine the image name and repository. For a full list and the defaults, please consult the Makefile in docker/make/.
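
A typical sequence after a successful build might look like this (a sketch; the target registry and image name are taken from the environment variables mentioned above):

make push          # push the image to the docker registry
make push_latest   # push the :latest tag
make push_version  # push a version tag without the git SHA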

Useful make commands that can be run from the docker directory:

  • build_all_latest builds all the images and tags them as latest

  • rm_all removes all the images

  • ls_all lists all the images
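
For example, run from the docker directory:

make build_all_latest
make ls_all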

Test the images

The docker/Makefile contains commands for testing all the images. These write results into the host /tmp area. For docker:

  • make test_base

  • make test_full

  • make test_notebook

  • make test_imaging_qa

  • make test_rcal

And for singularity:

  • make test_base_singularity

  • make test_full_singularity

  • make test_notebook_singularity

  • make test_imaging_qa_singularity

  • make test_rcal_singularity
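
For example, to run the base-image test with docker and then inspect the results (a sketch; run from the docker directory, results land in the host /tmp area as noted above):

make test_base
ls /tmp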

Generic RASCIL images

rascil-base and rascil-full

The base and full images are available at:

artefact.skao.int/rascil-base
artefact.skao.int/rascil-full

rascil-base does not contain the RASCIL test data and is therefore smaller in size. However, many of the tests and demonstrations need the test data, which is included in rascil-full.

To run RASCIL with your home directory available inside the image:

docker run -it --volume $HOME:$HOME artefact.skao.int/rascil-full:<version>

Now let’s run an example. Using the container is simpler if we do not try to write inside it, which is why we mapped in our $HOME directory. To run the /rascil/examples/scripts/imaging.py script, we first change to our home directory, whose path is the same inside and outside the container, and then give the absolute path of the script inside the container. This time we show the prompts from inside the container:

% docker run -p 8888:8888 -v $HOME:$HOME -it artefact.skao.int/rascil-full:1.0.0
rascil@d0c5fc9fc19d:/rascil$ cd /<your home directory>
rascil@d0c5fc9fc19d:/<your home directory>$ python3 /rascil/examples/scripts/imaging.py
...
rascil@d0c5fc9fc19d:/<your home directory>$ ls -l imaging*.fits
-rw-r--r-- 1 rascil rascil 2102400 Feb 11 14:04 imaging_dirty.fits
-rw-r--r-- 1 rascil rascil 2102400 Feb 11 14:04 imaging_psf.fits
-rw-r--r-- 1 rascil rascil 2102400 Feb 11 14:04 imaging_restored.fits

In this example, we change directory to an external location (here, your home directory) and then run the script using its absolute path inside the container.

RASCIL Notebooks

The docker image to use with RASCIL Jupyter Notebooks is:

artefact.skao.int/rascil-notebook

Run Jupyter Notebooks inside the container:

docker run -it -p 8888:8888 --volume $HOME:$HOME artefact.skao.int/rascil-notebook:1.0.0
cd /<your home directory>
jupyter notebook --no-browser --ip 0.0.0.0  /rascil/examples/notebooks/

The Jupyter server will start and print the possible URLs to use:

[I 14:08:39.041 NotebookApp] Serving notebooks from local directory: /rascil/examples/notebooks
[I 14:08:39.041 NotebookApp] The Jupyter Notebook is running at:
[I 14:08:39.042 NotebookApp] http://d0c5fc9fc19d:8888/?token=f050f82ed0f8224e559c2bdd29d4ed0d65a116346bcb5653
[I 14:08:39.042 NotebookApp]  or http://127.0.0.1:8888/?token=f050f82ed0f8224e559c2bdd29d4ed0d65a116346bcb5653
[I 14:08:39.042 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[W 14:08:39.045 NotebookApp] No web browser found: could not locate runnable browser.

The 127.0.0.1 address is the one we want. Enter it in your local browser and you should see the standard Jupyter directory page.

Images of RASCIL applications

Continuum Imaging Quality Assessment tool (a.k.a. imaging_qa)

imaging_qa finds compact sources in a continuum image and compares them to the sources used in the simulation, thus revealing the quality of the imaging.

DOCKER

Pull the image:

docker pull artefact.skao.int/rascil-imaging-qa:<version>

Run the image:

docker run -v ${PWD}:/myData -e DOCKER_PATH=${PWD} \
    -e CLI_ARGS='--ingest_fitsname_restored /myData/my_restored.fits \
    --ingest_fitsname_residual /myData/my_residual.fits' \
    --rm artefact.skao.int/rascil-imaging-qa:1.0.0

Run it from the directory containing the images you want to check. The output files will appear in the same directory. Update the CLI_ARGS string with the command line arguments of the imaging_qa code as needed. DOCKER_PATH is used to record the path of the output files on your local machine, rather than inside the docker container; it is used when generating the output file index files.
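
For example, to also enable source checking and plotting (using the --check_source and --plot_source options that appear in the args-file example below), CLI_ARGS might be extended as follows; this is an illustrative sketch:

docker run -v ${PWD}:/myData -e DOCKER_PATH=${PWD} \
    -e CLI_ARGS='--ingest_fitsname_restored /myData/my_restored.fits \
    --ingest_fitsname_residual /myData/my_residual.fits \
    --check_source True --plot_source True' \
    --rm artefact.skao.int/rascil-imaging-qa:1.0.0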

SINGULARITY

Pull the image:

singularity pull rascil-imaging-qa.img docker://artefact.skao.int/rascil-imaging-qa:1.0.0

Run the image:

singularity run \
    --env CLI_ARGS='--ingest_fitsname_restored test-imaging-pipeline-dask_continuum_imaging_restored.fits \
        --ingest_fitsname_residual test-imaging-pipeline-dask_continuum_imaging_residual.fits' \
    rascil-imaging-qa.img

Run it from the directory containing the images you want to check. The output files will appear in the same directory. If the singularity image you downloaded is in a different path, point to that path in the above command. Update the CLI_ARGS string with the command line arguments of the imaging_qa code as needed.

Providing input arguments from a file

You may create a file that contains the input arguments for the app. Here is an example of it, called args.txt:

--ingest_fitsname_restored=/myData/test-imaging-pipeline-dask_continuum_imaging_restored.fits
--ingest_fitsname_residual=/myData/test-imaging-pipeline-dask_continuum_imaging_residual.fits
--check_source=True
--plot_source=True

Make sure each line contains one argument, with an equals sign between the argument and its value, and that there are no trailing white spaces on the lines (and no empty lines). The paths to images and other input files have to be absolute paths within the container. Here, we use the DOCKER example of mounting our data into the /myData directory.

Then, calling docker run simplifies as:

docker run -v ${PWD}:/myData  -e DOCKER_PATH=${PWD} -e CLI_ARGS='@/myData/args.txt' \
--rm artefact.skao.int/rascil-imaging-qa:1.0.0

Here, we assume that your custom args.txt file is mounted together with the data into /myData. Provide the absolute path to that file when you run the above command.

You can use an args file to run the singularity version following the same principles, bearing in mind that singularity automatically mounts your filesystem into the container with paths matching those on your system.
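
For example, a singularity run using an args file might look like the following (a sketch based on the docker example above; since singularity mounts your filesystem, the paths inside args.txt should match your host paths):

singularity run --env CLI_ARGS='@/home/<your-name>/args.txt' rascil-imaging-qa.img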

RCAL visibility receive consumer

The image is a prototype of running the Realtime Calibration Pipeline (RCAL) as a consumer within the visibility receive script.

The rascil_rcal directory contains the necessary extra code and Dockerfile to build a docker image that can be used as a consumer for the visibility receive script. This processing script can be deployed in the SDP system. It receives data packets from the Correlator and Beam Former (CBF) or its emulator.

A prototype rcal-consumer has been added to the docker image. It formats the received data packets into objects that can be passed into a VisibilityBucket. A VisibilityBucket is filled until full, i.e. until it has received all frequency channel data for a single time sample. The resulting Visibility object is then passed to RCAL, which processes the data and produces the resulting gain solutions (and optional png images).

The docker image is available from the Central Artifact Repository (tagged with the release version number):

artefact.skao.int/rascil-rcal:<version>

and from the GitLab container registry (tagged with latest and updated upon merge to master):

registry.gitlab.com/ska-telescope/external/rascil/rascil-rcal:latest

Note: as of rascil==1.1.0, the rcal image is no longer released by default.
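
For example, to pull the latest development image from the GitLab registry:

docker pull registry.gitlab.com/ska-telescope/external/rascil/rascil-rcal:latest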

Running RASCIL as a cluster

The following methods of running RASCIL as a cluster provide a set of docker-based environments, which host a Dask scheduler, a customizable number of Dask workers, and a Jupyter lab notebook that connects directly to the scheduler.

Kubernetes

RASCIL can be run as a cluster in Kubernetes using helm and kubectl (you need to have these two installed). If you want to run it in a local developer environment (e.g. a laptop), we recommend using Minikube.

A custom values.yaml file is provided in /rascil/docker/kubernetes. It is meant to be used with a custom Dask Helm chart maintained by SKA developers, hosted in a GitLab repository. The documentation and details of the SKA Dask Helm chart can be found at https://developer.skao.int/projects/ska-sdp-helmdeploy-charts/en/latest/charts/dask.html.

You can modify the values.yaml file if needed: for example, you can change the number of worker replicas or the docker image to use (e.g. the version that should be run). If you don’t use a PersistentVolumeClaim, remove the mounts and volume sections from the jupyter and worker entries. (See also /rascil/docker/kubernetes/README.md)

Start Minikube and add the helm repository:

helm repo add ska-helm https://gitlab.com/ska-telescope/sdp/ska-sdp-helmdeploy-charts/-/raw/master/chart-repo
helm repo update

cd into the /rascil/docker/kubernetes directory and install the RASCIL cluster:

helm install test ska-helm/dask -f values.yaml

Instructions on how to connect to the Dask dashboard and the Jupyter lab notebook are printed to the screen; please follow them. You can follow the deployment process and access logs using kubectl or k9s.
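
For example, standard kubectl commands to check the deployment and read logs (pod names will vary):

kubectl get pods
kubectl logs <pod-name>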

To uninstall the chart and clean out all pods, run:

helm uninstall test

Note: this will remove changes you might have made in the Jupyter notebooks.

Singularity

Singularity can be used to load and run the docker images:

singularity pull RASCIL-full.img docker://artefact.skao.int/rascil-full:1.0.0
singularity exec RASCIL-full.img python3 /rascil/examples/scripts/imaging.py

As with docker, don’t run from inside the /rascil/ directory.
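
For example, run from a working directory outside the image (a sketch, assuming the image file was pulled into your home directory, which singularity mounts by default):

cd /<your home directory>
singularity exec /<your home directory>/RASCIL-full.img python3 /rascil/examples/scripts/imaging.py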

Inside a SLURM file singularity can be used by prefacing dask and python commands with “singularity exec”. For example:

ssh $host singularity exec /home/<your-name>/workspace/RASCIL-full.img dask-scheduler --port=8786 &
ssh $host singularity exec /home/<your-name>/workspace/RASCIL-full.img dask-worker --host ${host} --nprocs 4 --nthreads 1  \
--memory-limit 100GB $scheduler:8786 &
CMD="singularity exec /home/<your-name>/workspace/RASCIL-full.img python3 ./cluster_test_ritoy.py ${scheduler}:8786 | tee ritoy.log"
eval $CMD

Customisability

The docker images described here are ones we have found useful. However, if you have the RASCIL code tree installed then you can also make your own versions working from these Dockerfiles.

Important updates

Starting with version 0.3.0, RASCIL is installed as a package into the docker images and the repository is no longer cloned. Hence, when running within the docker container, every python script within the image (except the ones in the examples directory) has to be called as a module with the -m switch, e.g.:

python -m rascil.apps.rascil_advise <args>
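
For example, to list the available options of an app (assuming the standard --help flag of argparse-based CLIs):

python -m rascil.apps.rascil_advise --help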