DaskSlurmCluster
================

Purpose
-------

``DaskSlurmCluster`` is the SLURM-specific cluster implementation used by batchlet.

* Creates a Dask cluster inside an existing SLURM allocation created with ``sbatch`` or ``salloc``.
* Uses ``srun`` to start scheduler and worker processes across the allocated nodes.
* Exposes a clean interface similar to :py:class:`distributed.LocalCluster` while respecting the allocated CPU and memory limits.

Working
-------

Once ``DaskSlurmCluster`` is initialised on the head node (the first node) of the allocated compute nodes, it starts a local Dask scheduler as a subprocess. It then uses the ``srun`` command to start local and remote workers on all the nodes, based on the user configuration.

.. image:: ../_static/daskslurmcluster.drawio.svg
    :align: center

``DaskSlurmCluster`` extends the standard :py:class:`distributed.SpecCluster` class. Like :py:class:`distributed.LocalCluster`, it can decide the number of workers, the number of threads per worker and the memory limit per worker based on the allocated SLURM resources. Users can override these values by passing arguments to the constructor.

Usage
-----

The following SLURM (Python) script shows a basic example of how ``DaskSlurmCluster`` can be used in a Python application.

.. code-block:: python

    #!/usr/bin/env python3
    #SBATCH -n 2
    #SBATCH --nodes 2
    #SBATCH --ntasks-per-node 1
    #SBATCH --cpus-per-task 12
    #SBATCH --mem 32GB

    from ska_sdp_batchlet.utils.dask_cluster.slurm import DaskSlurmCluster

    if __name__ == "__main__":
        # Start the scheduler and workers inside the current SLURM allocation
        with DaskSlurmCluster() as cluster:
            # Block until every requested worker has connected
            cluster.wait_for_all_workers()
            scheduler_address = cluster.scheduler_address
            client = cluster.get_client()
            # client code

API
---

.. autoclass:: ska_sdp_batchlet.utils.dask_cluster.slurm.cluster.DaskSlurmCluster
    :members:
    :no-index:
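
Example: resource-based defaults
--------------------------------

The Working section states that ``DaskSlurmCluster`` can choose the number of workers, threads per worker and memory per worker from the allocated SLURM resources. The sketch below illustrates one way such defaults could be derived from the standard SLURM environment variables. It is not the batchlet implementation: the function ``slurm_worker_defaults`` and the returned keys are hypothetical names introduced here, and the mapping of one worker per SLURM task is an assumption.

.. code-block:: python

    import math
    import os


    def slurm_worker_defaults(env=None):
        """Hypothetical helper: guess Dask worker settings from SLURM variables.

        Assumes one worker per SLURM task and that node memory is shared
        equally between the workers placed on the same node.
        """
        env = os.environ if env is None else env

        # Standard SLURM variables; each falls back to 1 when it is not set.
        n_workers = int(env.get("SLURM_NTASKS", 1))
        n_nodes = int(env.get("SLURM_JOB_NUM_NODES", 1))
        threads = int(env.get("SLURM_CPUS_PER_TASK", 1))

        # SLURM_MEM_PER_NODE is reported in megabytes when --mem is given.
        mem_per_node = env.get("SLURM_MEM_PER_NODE")
        workers_per_node = math.ceil(n_workers / n_nodes)
        memory_mb = (
            int(mem_per_node) // workers_per_node if mem_per_node else None
        )

        return {
            "n_workers": n_workers,
            "threads_per_worker": threads,
            "memory_per_worker_mb": memory_mb,
        }

For the allocation requested in the Usage script (2 tasks on 2 nodes, 12 CPUs per task, 32 GB per node), this sketch would suggest 2 workers with 12 threads each and the whole node memory available to each worker.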