How to run pipelines using SLURM on AWS
=======================================

This guide provides general instructions for running pipelines on the AWS DP HPC cluster using SLURM. These steps are applicable to most pipelines.

Related
-------

- :doc:`How to load Spack modules on AWS `
- :doc:`How to start an interactive compute node on AWS `
- :doc:`How to monitor a pipeline job on AWS `
- `Accessing the clusters (Confluence) `_
- `How to run ICAL on AWS `_
- `How to run CIMG on AWS `_

Prerequisites
-------------

- An account on the AWS DP HPC cluster
- The pipeline repository cloned to a directory on the AWS DP HPC cluster

Steps
-----

1. Log into the DP HPC headnode.

2. Change directory to the repository root folder:

   .. code-block:: bash

      cd ~/path/to/repo/

3. Edit the SLURM script as needed.

   - Locate the SLURM script for your pipeline (for example, ``scripts/prod/aws-run-.sbatch``).
   - Adjust paths, job parameters, number of nodes, and other settings as required.

4. Submit the SLURM job:

   .. code-block:: bash

      sbatch path/to/pipeline/script.sbatch

5. To run on multiple nodes, override the SLURM directives when submitting the job. For example:

   .. code-block:: bash

      sbatch --nodes=3 --ntasks=3 --cpus-per-task=96 scripts/prod/aws-run-cimg.sbatch

6. Check job status:

   .. code-block:: bash

      squeue
      sacct

Finishing up
------------

Once the SLURM job has finished, check your pipeline's documentation for the expected output location and log files, as these vary between pipelines.

If you need help locating job output or checking whether the run completed successfully, see :doc:`How to monitor a pipeline job on AWS `.

Verification
------------

Confirm that the job completed successfully in SLURM, then verify the expected data products and logs using your pipeline's documentation.
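
The SLURM side of this check can be scripted. The following is a minimal sketch, not part of any pipeline repository: the helper names ``submit_job``, ``job_state``, and ``is_completed`` are hypothetical, and it assumes that job accounting is enabled on the cluster so that ``sacct`` can report the final job state. It only confirms the state SLURM recorded; the data products and logs still need to be checked against your pipeline's documentation.

```shell
#!/usr/bin/env bash
# Illustrative helpers (hypothetical names) for submitting a job and
# checking the final state that SLURM recorded for it.

# Submit a batch script and print only the numeric job ID.
# `sbatch --parsable` prints "jobid" or "jobid;cluster", so strip
# everything from the first ';' onwards.
submit_job() {
    local out
    out=$(sbatch --parsable "$@") || return 1
    echo "${out%%;*}"
}

# Print the allocation-level state of a job.
#   -X  report the allocation only, not individual job steps
#   -n  omit the header line
#   -P  unpadded, '|'-delimited output
job_state() {
    sacct -j "$1" -X -n -P --format=State
}

# Succeed only if the given state string is exactly COMPLETED.
# States such as FAILED, TIMEOUT, or "CANCELLED by <uid>" do not match.
is_completed() {
    [ "$1" = "COMPLETED" ]
}

# Example usage (run on the headnode, from the repository root):
#   job_id=$(submit_job path/to/pipeline/script.sbatch)
#   # ...wait until squeue no longer lists the job, then:
#   if is_completed "$(job_state "$job_id")"; then
#       echo "Job $job_id completed"
#   else
#       echo "Job $job_id did not complete; check the logs" >&2
#   fi
```

Keeping the state comparison in its own function makes the success condition explicit and easy to adjust if a pipeline treats other states as acceptable.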