How to run pipelines using SLURM on AWS

This guide provides general instructions for running pipelines on the AWS DP HPC cluster using SLURM. These steps are applicable to most pipelines.

Prerequisites

Log into the DP HPC headnode.
Change directory to the repository root folder:
cd ~/path/to/repo/<pipeline-repo>
Edit the SLURM script as needed.
- Locate the SLURM script for your pipeline (for example, scripts/prod/aws-run-<pipeline>.sbatch).
- Adjust paths, job parameters, number of nodes, and other settings as required.
Submit the SLURM job:
sbatch path/to/pipeline/script.sbatch
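To run on multiple nodes, override the script's #SBATCH directives on the command line when submitting the job; options passed to sbatch take precedence over those set in the script. For example: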
sbatch --nodes=3 --ntasks=3 --cpus-per-task=96 scripts/prod/aws-run-cimg.sbatch
Check job status:
squeue
sacct
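
The submit-and-check steps above can be sketched as a small set of shell helpers. This is a minimal sketch, not part of any pipeline repository: the function names (job_id_from_parsable, submit_job, wait_for_job) and the 30-second polling interval are assumptions made for illustration. It relies only on standard SLURM options: sbatch --parsable, squeue --noheader --jobs, and sacct --jobs --format.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative, not from the pipeline repos).

# `sbatch --parsable` prints "jobid" or "jobid;cluster"; keep only the job ID.
job_id_from_parsable() {
    printf '%s\n' "${1%%;*}"
}

# Submit a job with the given sbatch arguments and print its job ID.
submit_job() {
    local out
    out=$(sbatch --parsable "$@") || return 1
    job_id_from_parsable "$out"
}

# Poll squeue until the job no longer appears in the queue.
# Second argument: polling interval in seconds (assumed default: 30).
wait_for_job() {
    local job_id=$1 interval=${2:-30}
    while [ -n "$(squeue --noheader --jobs "$job_id" 2>/dev/null)" ]; do
        sleep "$interval"
    done
}

# Example usage (commented out so that sourcing this file submits nothing):
# job_id=$(submit_job --nodes=3 --ntasks=3 --cpus-per-task=96 scripts/prod/aws-run-cimg.sbatch)
# wait_for_job "$job_id"
# sacct --jobs "$job_id" --format=JobID,JobName,State,ExitCode,Elapsed
```

Capturing the job ID at submission time avoids having to pick the right job out of the full squeue listing later, and lets sacct report on exactly that job once it has left the queue.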

Once the SLURM job has finished, check your pipeline’s documentation for the expected output location and log files, as these vary between pipelines.

If you need help locating job output or checking whether the run completed successfully, see "How to monitor a pipeline job on AWS".

Confirm that the job completed successfully in SLURM, then verify the expected data products and logs using your pipeline’s documentation.
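
One possible way to confirm the SLURM-level result from the command line is to query the job's final state with sacct. The helper names below (job_state, job_succeeded) are hypothetical, and the sketch assumes that job accounting is enabled on the cluster. A COMPLETED state only means that SLURM saw the job exit cleanly; the data products and logs still need to be checked against your pipeline's documentation.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative, not from the pipeline repos).

# Print the job-level state from sacct, e.g. COMPLETED, FAILED or TIMEOUT.
# --allocations limits the output to the job itself (no individual job steps).
job_state() {
    sacct --jobs "$1" --allocations --noheader --parsable2 --format=State | head -n 1
}

# Return success only if SLURM reports the job as COMPLETED.
job_succeeded() {
    [ "$(job_state "$1")" = "COMPLETED" ]
}

# Example usage (replace <jobid> with the ID printed by sbatch):
# if job_succeeded <jobid>; then
#     echo "SLURM reports success; now verify the data products and logs."
# else
#     echo "Job state: $(job_state <jobid>)"
# fi
```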