How to run pipelines using SLURM on AWS
This guide provides general instructions for running pipelines on the AWS DP HPC cluster using SLURM. These steps are applicable to most pipelines.
Prerequisites
An account on the AWS DP HPC cluster
The pipeline repository cloned to a directory on the AWS DP HPC cluster
Steps
Log into the DP HPC headnode.
Change directory to the repository root folder:
cd ~/path/to/repo/<pipeline-repo>
Edit the SLURM script as needed.
Locate the SLURM script for your pipeline (for example, scripts/prod/aws-run-<pipeline>.sbatch). Adjust paths, job parameters, number of nodes, and other settings as required.
Submit the SLURM job:
sbatch path/to/pipeline/script.sbatch
To run on multiple nodes, override the SLURM directives when submitting the job. For example:
sbatch --nodes=3 --ntasks=3 --cpus-per-task=96 scripts/prod/aws-run-cimg.sbatch
Check job status:
squeue
sacct
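The submit and status steps above can be combined so that the status commands report only the job you just submitted. The following is a minimal sketch, not a required workflow. It assumes bash, and the SCRIPT path is the placeholder from the example above, not a real file. The helper name extract_job_id is hypothetical. It relies on the standard sbatch --parsable option, which prints the job ID (optionally followed by ";cluster") instead of the usual message.

```shell
#!/usr/bin/env bash
# Sketch: submit a job, keep its ID, and query only that job.
# SCRIPT is a placeholder path (assumption); replace it with your pipeline's script.
SCRIPT="scripts/prod/aws-run-<pipeline>.sbatch"

# sbatch --parsable prints "jobid" or "jobid;cluster"; keep only the job ID.
extract_job_id() {
    printf '%s\n' "${1%%;*}"
}

# Only attempt a submission where SLURM is available and the script exists.
if command -v sbatch >/dev/null 2>&1 && [ -f "$SCRIPT" ]; then
    job_id="$(extract_job_id "$(sbatch --parsable "$SCRIPT")")"
    echo "Submitted job ${job_id}"
    # squeue lists the job while it is pending or running.
    squeue -j "$job_id"
    # sacct reports the accounting record, including after the job has ended.
    sacct -j "$job_id" --format=JobID,JobName,State,ExitCode
fi
```

Keeping the job ID avoids scanning the full squeue or sacct listing for your job when the cluster is busy.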
Finishing up
Once the SLURM job has finished, check your pipeline’s documentation for the expected output location and log files, as these vary between pipelines.
If you need help locating job output or checking whether the run completed successfully, see How to monitor a pipeline job on AWS.
Verification
Confirm that the job completed successfully in SLURM, then verify the expected data products and logs using your pipeline’s documentation.
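One way to confirm completion in SLURM is to read the job's final state from sacct. The sketch below assumes bash and a known job ID. The function names (is_completed_state, job_state, check_job) are hypothetical helpers, not part of any pipeline. The sacct options used are the standard short forms: -X (allocation only), -n (no header), -P (parsable output), and -o (output fields).

```shell
#!/usr/bin/env bash
# Sketch: report whether a finished job ended in the COMPLETED state.

# Succeeds only if the given sacct state string is exactly COMPLETED.
is_completed_state() {
    [ "$1" = "COMPLETED" ]
}

# Print the overall state of a job (allocation line only, no header).
job_state() {
    sacct -j "$1" -X -n -P -o State | head -n 1
}

# Usage: check_job <job_id>
check_job() {
    state="$(job_state "$1")"
    if is_completed_state "$state"; then
        echo "Job $1 completed successfully"
    else
        echo "Job $1 ended in state: ${state:-unknown}" >&2
        return 1
    fi
}
```

For example, check_job 12345 would print a success message or return a non-zero status. A COMPLETED state only means the job script exited with code 0, so the data products and logs still need to be checked against your pipeline's documentation.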