get_dask_client

get_dask_client(timeout=30, n_workers=None, threads_per_worker=None, processes=True, create_cluster=False, memory_limit=None, local_dir='.', with_file=False, scheduler_file='./scheduler.json', dashboard_address=':8787')

Get a Dask.distributed Client to be used in rsexecute.

By default, rsexecute.set_client creates a set of workers on one node. Hence, to use a cluster, it is necessary to use get_dask_client.

The environment variable RASCIL_DASK_SCHEDULER is interpreted as pointing to the Dask distributed scheduler, and a client using that scheduler is returned. Otherwise a client for a LocalCluster is created.

The environment variable RASCIL_DASK_SCHEDULER_FILE is interpreted as pointing to the Dask scheduler file, and a client using that scheduler is returned. If RASCIL_DASK_SCHEDULER_FILE is set, the with_file option is set to True and scheduler_file is overridden with the value of RASCIL_DASK_SCHEDULER_FILE.
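The environment-variable rules above can be sketched in plain Python. This is only an illustration of the documented behaviour, not RASCIL's implementation: the helper name resolve_scheduler_settings and its environ argument are hypothetical, and the sketch deliberately reports both variables side by side because the text does not say which one takes precedence when both are set.

```python
import os


def resolve_scheduler_settings(with_file=False,
                               scheduler_file="./scheduler.json",
                               environ=None):
    """Hypothetical helper: apply the documented environment-variable rules.

    Not part of RASCIL. It only mirrors the description: a scheduler address
    may come from RASCIL_DASK_SCHEDULER, and RASCIL_DASK_SCHEDULER_FILE
    forces with_file=True and overrides scheduler_file.
    """
    env = os.environ if environ is None else environ

    # Address of an existing scheduler, or None (a LocalCluster would be used).
    scheduler = env.get("RASCIL_DASK_SCHEDULER")

    # A scheduler file named in the environment overrides the arguments.
    env_file = env.get("RASCIL_DASK_SCHEDULER_FILE")
    if env_file is not None:
        with_file = True
        scheduler_file = env_file

    return {
        "scheduler": scheduler,
        "with_file": with_file,
        "scheduler_file": scheduler_file,
    }


# Example with an explicit mapping instead of the real environment.
settings = resolve_scheduler_settings(
    environ={"RASCIL_DASK_SCHEDULER_FILE": "/tmp/sched.json"}
)
print(settings)
```

Passing a dictionary as environ keeps the example independent of the shell it runs in; omitting it reads the real process environment.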

Parameters:
  • timeout – Timeout for creation, in seconds (30)

  • n_workers – Number of workers (cores available)

  • threads_per_worker – Number of threads per worker (1)

  • processes – Use processes instead of threads (True)

  • create_cluster – Create a LocalCluster (False)

  • memory_limit – Memory limit per worker (bytes e.g. 8e9) (None)

  • scheduler_file – Scheduler file for Dask (‘./scheduler.json’)

  • dashboard_address – Address (port) used for the diagnostics dashboard (‘:8787’)

Returns:

Dask client