Installation example using schaap spack
This page describes how to install the SKA Low pipeline on the CSD3 cluster and get it running. It is copied from https://confluence.skatelescope.org/display/SE/SKA+LOW+pipeline+on+CSD3
Step 1: Install schaap-spack
To install the SKA Low pipeline on CSD3, I used schaap-spack: https://git.astron.nl/RD/schaap-spack
mkdir schaap-spack
cd schaap-spack
git clone https://github.com/spack/spack.git
source ./spack/share/spack/setup-env.sh
git clone https://git.astron.nl/RD/schaap-spack.git
spack repo add ./schaap-spack
A system-wide spack is already available on CSD3. To make sure our custom installation is always the one in use, add the following line to your .bashrc:
source /home/hpcsalv1/schaap-spack/spack/share/spack/setup-env.sh
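Appending that line by hand more than once leaves duplicates in .bashrc. A minimal sketch of an idempotent way to add it, assuming a hypothetical helper named add_line_once (it is not part of spack or schaap-spack):

```shell
# Append a line to a file only if that exact line is not already present.
# add_line_once is a hypothetical helper name used for illustration.
add_line_once() {
    file="$1"
    line="$2"
    # Create the file if it does not exist yet, so grep has something to read.
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example call (path taken from this page; adjust it to your own home directory):
# add_line_once "$HOME/.bashrc" "source /home/hpcsalv1/schaap-spack/spack/share/spack/setup-env.sh"
```

Calling the helper a second time with the same arguments leaves the file unchanged.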
The Schaap software requires the gcc compiler, which is available in multiple versions on CSD3. To use the right one, load it as a module:
module load gcc/11
Afterwards we can use schaap-spack to install the necessary software:
spack install python@3.9
spack install dp3@latest
spack install wsclean
Note 1: we also install Python 3.9 because the pipeline needs it, and CSD3 only provides Python versions up to 3.8.
Note 2: it is important to use the latest version of DP3, since earlier versions caused unexpected crashes (cause still to be investigated).
Step 2: Install the pipeline
mkdir /home/hpcsalv1/ska_sdp_low_wflow
cd /home/hpcsalv1/ska_sdp_low_wflow
git clone --recursive git@gitlab.com:ska-telescope/sdp/science-pipeline-workflows/ska-sdp-wflow-low-selfcal.git
cd ska-sdp-wflow-low-selfcal
git submodule init
git submodule update
Create and activate a virtual environment
(in my case, the path to the Python 3.9 interpreter is ~/schaap-spack/spack/opt/spack/linux-centos7-cascadelake/gcc-11.2.0/python-3.9.18-kurwxlxz5e2timy2ja7jtzdkqmxaydqy/bin/python)
cd /home/hpcsalv1/ska_sdp_low_wflow
spack load python@3.9
virtualenv -p path_to_python_3.9 chiara_env
source chiara_env/bin/activate
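Because the pipeline needs Python 3.9 (see Note 1), it can be worth confirming that the activated environment really uses that series before installing anything into it. A minimal sketch, assuming a hypothetical helper named is_python_series that only inspects the text printed by python --version:

```shell
# Succeeds if a "Python X.Y.Z" version string belongs to the wanted X.Y series.
# is_python_series is a hypothetical helper name used for illustration.
is_python_series() {
    version_string="$1"   # e.g. the output of: python --version
    wanted="$2"           # e.g. 3.9
    case "$version_string" in
        "Python $wanted."*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use inside the activated environment:
# if is_python_series "$(python --version 2>&1)" 3.9; then
#     echo "Python 3.9 is active"
# else
#     echo "unexpected interpreter: $(command -v python)"
# fi
```

The trailing dot in the pattern keeps a version such as 3.90 from being mistaken for 3.9.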
Install poetry using pip
pip install poetry
Possible error: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pip-build-nmfhsumj/cryptography/setup.py' → solve with: pip3 install --upgrade pip
Open the poetry environment and install the repository
poetry shell
Empty the PYTHONPATH variable before installation
export PYTHONPATH=""
pip install -e .
pip install mpi4py
To check your installation, run the tests, for example:
pytest tests/test_support_functions.py
Step 3: Run on a single node
Now that everything is installed, each time you access CSD3 and want to run the pipeline you can follow these steps:
cd /home/hpcsalv1/ska_sdp_low_wflow/ska-sdp-wflow-low-selfcal
source ../chiara_env/bin/activate
poetry shell
module purge
spack load dp3@latest
spack load wsclean
Run, for example:
python src/ska_sdp_wflow_low_selfcal/pipeline/main.py \
--input_ms /home/hpcsalv1/rds/rds-sdhp-S7lLL7eOZIg/hpcsalv1/data/midbands_averaged.ms \
--work_dir /home/hpcsalv1/rds/rds-sdhp-S7lLL7eOZIg/hpcsalv1/workdir \
--imaging_taper_gaussian 0.004deg \
--imaging_size 5000 \
--imaging_scale 0.001658792 \
--calibration_nchannels 1 \
--run_distributed True \
--resume_from_operation calibrate_3
Possible errors I encountered:
No module named 'numpy.core._multiarray_umath' → solve with: export PYTHONPATH=""
No module named ska_sdp … → make sure you are using the Python interpreter installed in the virtual environment (check with which python and python --version).
To force it to use the right one, use the full path to python in the command:
/home/hpcsalv1/ska_sdp_low_wflow/chiara_env/bin/python src/ska_sdp_wflow_low_selfcal/pipeline/main.py \
--input_ms /home/hpcsalv1/rds/rds-sdhp-S7lLL7eOZIg/hpcsalv1/data/midbands_averaged.ms \
--work_dir /home/hpcsalv1/rds/rds-sdhp-S7lLL7eOZIg/hpcsalv1/workdir \
--imaging_taper_gaussian 0.004deg \
--imaging_size 5000 \
--imaging_scale 0.001658792 \
--calibration_nchannels 1 \
--run_distributed True \
--resume_from_operation calibrate_3
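The two invocations above share every argument, so a small wrapper can keep them in one place and also offer a dry run. This is only a sketch: the names run_pipeline, PIPELINE_PYTHON and DATA_ROOT are hypothetical, while the paths and arguments are copied from this page.

```shell
# Paths copied from this page; adjust them to your own account.
PIPELINE_PYTHON=/home/hpcsalv1/ska_sdp_low_wflow/chiara_env/bin/python
DATA_ROOT=/home/hpcsalv1/rds/rds-sdhp-S7lLL7eOZIg/hpcsalv1

# run_pipeline [PREFIX...]: run the pipeline with an optional command prefix.
# With the prefix "echo", the command line is printed instead of executed.
# run_pipeline is a hypothetical helper name used for illustration.
run_pipeline() {
    "$@" "$PIPELINE_PYTHON" src/ska_sdp_wflow_low_selfcal/pipeline/main.py \
        --input_ms "$DATA_ROOT/data/midbands_averaged.ms" \
        --work_dir "$DATA_ROOT/workdir" \
        --imaging_taper_gaussian 0.004deg \
        --imaging_size 5000 \
        --imaging_scale 0.001658792 \
        --calibration_nchannels 1 \
        --run_distributed True \
        --resume_from_operation calibrate_3
}

# Dry run: print the full command without starting the pipeline.
run_pipeline echo
```

Calling run_pipeline without a prefix, from the ska-sdp-wflow-low-selfcal directory, would start the actual run with the virtual environment's interpreter.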