Utility Functions

This module contains the utility functions.

class sdpbenchmarks.utils.ParamSweeper(persistence_dir, params=None, name=None, randomise=True)[source]

This class is inspired by the execo library (http://execo.gforge.inria.fr/doc/latest-stable/), but it is a much-simplified version of the original. The original was developed for large-scale experiments and thread safety. Here, the only aim is to track and remember the state of each run when launching the experiments.

done(combination)[source]

Marks the given combination as done

get_done()[source]

Returns the iterable of finished runs

get_ignored()[source]

Returns the iterable of ignored runs

get_inprogress()[source]

Returns the iterable of runs in progress

get_next()[source]

Returns the next run to iterate on

get_remaining()[source]

Returns the iterable of remaining runs to iterate on

get_skipped()[source]

Returns the iterable of skipped runs

get_submitted()[source]

Returns the iterable of submitted jobs (when batch scheduler is used)

get_sweeps()[source]

Returns the iterable of what to iterate on

ignore(combination)[source]

Marks the given combination as ignored

set_sweeps(params=None)[source]

This method sets the sweeps to be performed

skip(combination)[source]

Marks the given combination as skipped

submit(combination)[source]

Marks the given combination as submitted
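
The state tracking described above can be illustrated with a minimal in-memory sketch. The class name TinySweeper is hypothetical, it takes a ready-made list of hashable combinations, and it leaves out persistence (the persistence_dir argument), randomisation and the ignored/submitted states. It is not the actual implementation of ParamSweeper.

```python
class TinySweeper:
    """Hypothetical in-memory stand-in for ParamSweeper (no persistence)."""

    def __init__(self, sweeps):
        # `sweeps` is a list of hashable combinations, e.g. tuples.
        self._sweeps = list(sweeps)
        self._inprogress = set()
        self._done = set()
        self._skipped = set()

    def get_remaining(self):
        """Return combinations that have not been handed out yet."""
        seen = self._inprogress | self._done | self._skipped
        return [c for c in self._sweeps if c not in seen]

    def get_next(self):
        """Hand out the next combination and mark it as in progress."""
        remaining = self.get_remaining()
        if not remaining:
            return None
        combination = remaining[0]
        self._inprogress.add(combination)
        return combination

    def done(self, combination):
        """Move a combination from in progress to done."""
        self._inprogress.discard(combination)
        self._done.add(combination)

    def skip(self, combination):
        """Move a combination from in progress to skipped."""
        self._inprogress.discard(combination)
        self._skipped.add(combination)

    def get_done(self):
        return set(self._done)

    def get_skipped(self):
        return set(self._skipped)
```

A driver loop would typically call get_next() until it returns None, and call done() or skip() for each combination depending on the outcome of the run.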

sdpbenchmarks.utils.create_scheduler_conf(conf, param, bench_name)[source]

Prepares a dict with the parameters needed to create a job submission

Parameters
  • conf (dict) – A dict containing configuration.

  • param (dict) – A dict containing all parameters for the run

  • bench_name (str) – Name of the benchmark

Returns

A dict with the parameters needed to submit a job file

Return type

dict

Raises

KeyNotFoundError – An error occurred while looking for a key in conf or param

sdpbenchmarks.utils.exec_cmd(cmd_str)[source]

This method executes the given command

Parameters

cmd_str (str) – Command to execute

Returns

The object returned by subprocess.run, which holds stdout, stderr and the return code

Raises

ExecuteCommandError – An error occurred during execution of command
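
As an illustration of the behaviour described above, the following is a minimal sketch of a command runner built on subprocess.run. The function name run_command and the local ExecuteCommandError class are stand-ins defined here for the example; the actual exec_cmd may split the command, capture output, or handle errors differently.

```python
import shlex
import subprocess


class ExecuteCommandError(Exception):
    """Local stand-in for the exception named in this section."""


def run_command(cmd_str):
    """Run `cmd_str` without a shell and return the CompletedProcess."""
    try:
        return subprocess.run(
            shlex.split(cmd_str),   # split into an argument list
            capture_output=True,    # collect stdout and stderr
            text=True,              # decode output to str
            check=True,             # non-zero exit raises CalledProcessError
        )
    except (subprocess.CalledProcessError, FileNotFoundError) as err:
        raise ExecuteCommandError(f"Command failed: {cmd_str}") from err
```

The returned object exposes stdout, stderr and returncode attributes, which matches the description of the return value above.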

sdpbenchmarks.utils.execute_command_on_host(cmd_str, out_file)[source]

This method executes the job on the host

Parameters
  • cmd_str (str) – Command to execute

  • out_file (str) – Name of the stdout/stderr file

Raises

ExecuteCommandError – An error occurred during execution of command

sdpbenchmarks.utils.execute_job_submission(cmd_str, run_prefix)[source]

This method submits a job to the SLURM job scheduler and returns the job ID

Parameters
  • cmd_str (str) – Command string to be submitted

  • run_prefix (str) – Prefix of the bench run

Returns

ID of the submitted job. An exception is raised if the submission fails.

Return type

int

Raises

JobSubmissionError – An error occurred during the job submission
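
For illustration, the job ID can be recovered from the text that sbatch prints on success ("Submitted batch job <id>"). The sketch below only shows that parsing step. The helper name parse_slurm_job_id is hypothetical, and it raises ValueError rather than JobSubmissionError to stay self-contained.

```python
import re


def parse_slurm_job_id(sbatch_stdout):
    """Extract the integer job ID from sbatch's success message."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_stdout)
    if match is None:
        raise ValueError(f"No job ID found in: {sbatch_stdout!r}")
    return int(match.group(1))
```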

sdpbenchmarks.utils.get_job_status(conf, job_id)[source]

Returns the status of the batch job

Parameters
  • conf (dict) – A dict containing configuration of batch scheduler

  • job_id (int) – ID of the job

Returns

Current status of the job

Return type

str

sdpbenchmarks.utils.get_project_root()[source]

Get root directory of the project

Returns

Full path of the root directory

Return type

str

sdpbenchmarks.utils.get_sockets_cores(conf)[source]

Returns the number of sockets and cores on the compute nodes. For interactive runs, the information can be extracted from the output of lscpu. When the script is used to submit jobs from login nodes, lscpu cannot be used, and sinfo for the given partition is used instead.

Parameters

conf (dict) – A dict containing configuration settings

Returns

Number of sockets on each node, number of physical cores on each node (num_sockets * num_cores per socket), number of threads inside each core

Return type

list

Raises

KeyNotFoundError – A key was not found in the g5k dict that contains lscpu info for different clusters
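
The interactive case can be sketched by parsing lscpu output for the "Socket(s)", "Core(s) per socket" and "Thread(s) per core" fields. The helper name parse_lscpu is hypothetical, and the sinfo path used from login nodes is not covered here.

```python
def parse_lscpu(lscpu_output):
    """Return [sockets, physical cores per node, threads per core]."""
    fields = {}
    for line in lscpu_output.splitlines():
        # Each lscpu line has the form "Key:   value".
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    sockets = int(fields["Socket(s)"])
    cores_per_socket = int(fields["Core(s) per socket"])
    threads_per_core = int(fields["Thread(s) per core"])
    return [sockets, sockets * cores_per_socket, threads_per_core]
```

The input would normally be the captured stdout of the lscpu command; a missing field surfaces as a KeyError in this sketch.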

sdpbenchmarks.utils.load_modules(module_list)[source]

This function purges the existing modules and loads the given modules

Parameters

module_list (str) – List of modules to load

sdpbenchmarks.utils.log_failed_cmd_stderr_file(output)[source]

This method dumps the output to a file when command execution fails

Parameters

output (str) – stdout and stderr from execution of command

Returns

Path of the file

Return type

str

sdpbenchmarks.utils.pull_image(uri, container_mode, path)[source]

This function pulls the image from the registry. It returns an error code if the image cannot be pulled.

Parameters
  • uri (str) – URI of the image

  • container_mode (str) – Docker or Singularity container

  • path (str) – Path where the image needs to be saved (only for Singularity). An existing image at this path is overwritten.

Returns

0 if OK, 1 if not OK

Return type

int

sdpbenchmarks.utils.reformat_long_string(ln_str, width=70)[source]

This method reformats a command string by breaking it into multiple lines.

Parameters
  • ln_str (str) – Long string to break down

  • width (int) – Width of each line (Default is 70)

Returns

Same string in multiple lines to ease readability

Return type

str
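
One way to break a long command string into lines is the standard textwrap module. The sketch below is an assumption about the general approach, not the library's code: the name wrap_command is hypothetical, and joining the lines with a trailing backslash (shell line continuation) is a choice made for this example.

```python
import textwrap


def wrap_command(ln_str, width=70):
    """Break a long command into lines of at most `width` characters."""
    lines = textwrap.wrap(
        ln_str,
        width=width,
        break_long_words=False,   # never split a single argument
        break_on_hyphens=False,   # keep options such as --num-nodes intact
    )
    # Join with a shell line continuation so the result is still one command.
    return " \\\n".join(lines)
```

An argument longer than width is kept whole on its own line, so a line can exceed width in that case.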

sdpbenchmarks.utils.standardise_output_data(bench_name, conf, param, metrics)[source]

This method saves all the data of the benchmark run in JSON format. The aim is to record all the information needed to reproduce the run.

Parameters
  • bench_name (str) – Name of the benchmark

  • conf (dict) – A dict file containing all configuration info

  • param (dict) – A parameter dict file

  • metrics (dict) – Dict file containing all the metric data

Raises

KeyNotFoundError – An error occurred while looking for a key in conf or param

sdpbenchmarks.utils.submit_job(conf, job_id)[source]

This method submits a job to the scheduler

Parameters
  • conf (dict) – A dict containing all parameters needed for job submission

  • job_id (int) – Job ID of the previous job. In case of dependent jobs, this is necessary

Returns

ID of the submitted job

Return type

int

sdpbenchmarks.utils.sweep(parameters)[source]

This method accepts a dict with possible values for each parameter and creates a parameter space to sweep

Parameters

parameters (dict) – A dict containing parameters and their values

Returns

All possible combinations of the parameter space

Return type

list
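
The parameter space described above is a Cartesian product of the value lists. A minimal sketch using itertools.product follows. The name cartesian_sweep is hypothetical, and representing each combination as a dict is an assumption made for this example.

```python
import itertools


def cartesian_sweep(parameters):
    """Return every combination of the given parameter values."""
    names = list(parameters)
    value_lists = [parameters[name] for name in names]
    return [
        dict(zip(names, values))
        for values in itertools.product(*value_lists)
    ]
```

For a dict with two parameters of two values each, this yields four combinations, with the last parameter varying fastest.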

sdpbenchmarks.utils.which(cmd, modules)[source]

This function loads the given modules and returns the path of the requested binary if it is found, or None otherwise

Parameters
  • cmd (str) – Name of the binary

  • modules (str) – Modules to load

Returns

Path of the binary or None if not found

Return type

str

sdpbenchmarks.utils.write_oar_job_file(conf_oar)[source]

This method writes an OAR job file to submit with oarsub

Parameters

conf_oar (dict) – A dict containing all OAR job parameters

Returns

Name of the file

Return type

str

Raises

JobScriptCreationError – An error occurred in creation of job script

sdpbenchmarks.utils.write_slurm_job_file(conf_slurm)[source]

This method writes a SLURM job file to submit with sbatch

Parameters

conf_slurm (dict) – A dict containing all SLURM job parameters

Returns

Name of the file

Return type

str

Raises

JobScriptCreationError – An error occurred in creation of job script
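
To make the idea of a SLURM job file concrete, the sketch below writes a small script with #SBATCH directives. The function name write_sbatch_script and the keys read from conf (job_name, nodes, time, output, command) are hypothetical; the actual layout of conf_slurm is not documented here. Error handling with JobScriptCreationError is left out.

```python
from pathlib import Path


def write_sbatch_script(conf, path):
    """Write a minimal sbatch script and return its file name."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={conf['job_name']}",
        f"#SBATCH --nodes={conf['nodes']}",
        f"#SBATCH --time={conf['time']}",
        f"#SBATCH --output={conf['output']}",
        "",
        conf["command"],  # the command the job should run
        "",               # ensures the file ends with a newline
    ]
    script = Path(path)
    script.write_text("\n".join(lines))
    return str(script)
```

The resulting file can then be passed to sbatch.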

sdpbenchmarks.utils.write_tgcc_job_file(conf_slurm)[source]

This method writes a SLURM job file for the TGCC Irene machine to submit with ccc_msub

Parameters

conf_slurm (dict) – A dict containing all SLURM job parameters

Returns

Name of the file

Return type

str

Raises

JobScriptCreationError – An error occurred in creation of job script