How to run on the AWS DP HPC cluster using SLURM and the Prefect UI
This page describes how to run the Continuum Imaging Pipeline (CIMG) on one or more nodes on the AWS DP HPC cluster using SLURM with the Prefect server running on the headnode.
If you want to run the job without the Prefect UI, see the Instructions to run the pipeline using only SLURM.
There are three main steps to running the Continuum Imaging Pipeline (CIMG) on the AWS DP HPC cluster using SLURM with Prefect:
Set up a Prefect server on the headnode
Submit a SLURM job
Monitor the flow in the Prefect UI
Prerequisites
An account on the AWS DP HPC cluster
This repository cloned to a directory on the AWS DP HPC cluster
Steps
1. Set up a Prefect server on the headnode
If there is already a Prefect server running on the headnode, you can skip this step.
Log into the DP HPC headnode.
Start a tmux session (or attach to an existing one if you already have one set up):
tmux new -s prefect # or tmux attach -t prefect
Change to the project root directory.
Run the shell script that starts the prefect server:
./scripts/dev/prefect/aws-prefect-start.sh
If the port is already in use, you can set the environment variable PREFECT_PORT to an alternative port before running the command above:
export PREFECT_PORT=12345
./scripts/dev/prefect/aws-prefect-start.sh
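To find out whether a port is taken before starting the server, a quick probe from the headnode can help. This is a minimal sketch, not part of the repository: `port_in_use` is a hypothetical helper, and 46200 is only the port shown in the example output on this page, so the script's real default may differ. It relies on bash's `/dev/tcp` redirection, so it needs bash rather than plain sh.

```shell
#!/usr/bin/env bash
# Return success (0) if something accepts TCP connections on the given local port.
# Uses bash's /dev/tcp pseudo-device; the connection attempt runs in a subshell
# so the file descriptor is closed again as soon as the check finishes.
port_in_use() {
    local port="$1"
    (exec 3<>"/dev/tcp/127.0.0.1/${port}") 2>/dev/null
}

# Assumption: 46200 is the port from the example output, not a confirmed default.
candidate="${PREFECT_PORT:-46200}"
if port_in_use "$candidate"; then
    echo "Port ${candidate} is busy; export PREFECT_PORT to another value before starting the server."
else
    echo "Port ${candidate} looks free."
fi
```

If the port is reported busy, it may simply mean a Prefect server is already running, in which case this step can be skipped.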
You should see a Prefect startup message in the terminal as well as the instructions for setting up the SSH tunnel to access the Prefect UI on your local machine. Make note of these instructions to access the Prefect UI.
Example output:
To tunnel from your laptop (example):
aws-vault exec dp-hpc -- \
    ssh -N -4 -L 127.0.0.1:14200:127.0.0.1:46200 <your-username>@<headnode-address>
Then open: http://127.0.0.1:14200 in your browser to view the Prefect dashboard.
The Prefect server will continue running in the tmux session until you stop it, even if you log out of the headnode.
Detach from tmux (CTRL-B D) to leave the server running.
2. Submit a SLURM job to run the pipeline
Make sure the Prefect server is running before submitting the job. Note that this setup does not support running multiple jobs simultaneously.
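One way to confirm that the server is up is to query its health endpoint from the headnode. The sketch below assumes the server listens on 127.0.0.1 at PREFECT_PORT (falling back to 46200, the port from the example output, which may not be the script's real default) and that curl is installed. `prefect_health_url` is a hypothetical helper name; `/api/health` is the health route exposed by the Prefect server API.

```shell
#!/usr/bin/env bash
# Build the health-check URL for a Prefect server on this machine.
# Port precedence: first argument, then $PREFECT_PORT, then 46200 (assumed).
prefect_health_url() {
    echo "http://127.0.0.1:${1:-${PREFECT_PORT:-46200}}/api/health"
}

# Probe the server only if curl is available; -f makes HTTP errors count as failure.
if command -v curl >/dev/null 2>&1; then
    if curl -fsS --max-time 5 "$(prefect_health_url)" >/dev/null 2>&1; then
        echo "Prefect server is responding."
    else
        echo "Prefect server is not responding at $(prefect_health_url)."
    fi
fi
```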
Log into the headnode.
Set the repository path:
export REPO_DIR=~/path/to/repo/ska-sdp-cimg
If you are using a custom Prefect port, make sure to use the same port as the one you set when starting the Prefect server:
export PREFECT_PORT=12345
Edit the SLURM script scripts/dev/aws-run-cimg-spack-deployed.sbatch if needed (paths, job parameters, number of nodes, etc.).
Submit the job:
sbatch scripts/dev/aws-run-cimg-spack-deployed.sbatch
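To keep track of the job after submission, it can be convenient to capture the job ID. The following is a sketch under the assumption that it is run from the project root on the headnode; `submit_job` is a hypothetical wrapper, not something provided by the repository. It uses `sbatch --parsable`, which prints only the job ID (optionally followed by `;cluster`), and `squeue -j` to show the job's state.

```shell
#!/usr/bin/env bash
# Submit a batch script and print only the numeric job ID.
# `sbatch --parsable` prints "jobid" or "jobid;cluster", so drop anything after ';'.
submit_job() {
    local script="$1" out
    out="$(sbatch --parsable "$script")" || return 1
    echo "${out%%;*}"
}

# Only submit when SLURM is actually available on this machine.
if command -v sbatch >/dev/null 2>&1; then
    job_id="$(submit_job scripts/dev/aws-run-cimg-spack-deployed.sbatch)" || exit 1
    echo "Submitted job ${job_id}"
    squeue -j "$job_id"    # state column: PD (pending), R (running), ...
fi
```

Unless the sbatch script sets its own `--output` option, SLURM writes the job's standard output to `slurm-<jobid>.out` in the directory the job was submitted from.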
3. Monitor with Prefect UI
Use the SSH tunneling instructions printed by the Prefect server startup script. If you did not note them down, you can find them in the Prefect log file in the project root directory. Example filename: prefect-server-20260331-101739.log.
Run the tunnel command on your local machine:
aws-vault exec dp-hpc -- \
    ssh -N -4 -L 127.0.0.1:14200:127.0.0.1:46200 <your-username>@<headnode-address>
Open the address given in the tunnel instructions in your browser, for example http://127.0.0.1:14200 (it may be different if you set a custom port).
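If the server was started on a custom port, the tunnel command has to change to match. As an illustration only, the hypothetical helper `tunnel_cmd` below prints the command for a given user, host, local port, and remote port. It assumes the remote port equals the port the Prefect server listens on, and it defaults to the 14200 and 46200 values from the example output; the instructions printed by the startup script remain the authoritative source.

```shell
#!/usr/bin/env bash
# Print the SSH tunnel command for the given user, host, and port pair.
# Arguments: user, host, local port (default 14200), remote port (default 46200).
tunnel_cmd() {
    local user="$1" host="$2" local_port="${3:-14200}" remote_port="${4:-46200}"
    printf 'aws-vault exec dp-hpc -- ssh -N -4 -L 127.0.0.1:%s:127.0.0.1:%s %s@%s\n' \
        "$local_port" "$remote_port" "$user" "$host"
}

# Print the command with the placeholders used on this page.
tunnel_cmd "<your-username>" "<headnode-address>"
```

With these defaults the UI would be served at http://127.0.0.1:14200; with a different local port, replace 14200 in the URL accordingly.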
4. Finishing up
Outputs are in $PWD/runs.
Stop the Prefect server when done:
Reattach to the tmux session
Press CTRL-C
Exit the session
Close the SSH tunnel on your local machine (CTRL-C in the terminal running the ssh command)
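The clean-up can be sketched as a few commands on the headnode. This assumes each pipeline run creates its own entry under the runs directory (the layout is not specified on this page) and that the tmux session is named prefect as in step 1. `latest_run` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Print the most recently modified entry in a runs directory (default: $PWD/runs).
# Prints nothing if the directory is missing or empty.
latest_run() {
    local runs_dir="${1:-$PWD/runs}"
    ls -1t "$runs_dir" 2>/dev/null | head -n 1
}

echo "Latest run: $(latest_run)"

# To stop the server, reattach to the session from step 1, then press CTRL-C and exit:
#   tmux attach -t prefect
```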