Utility Functions

This module contains the utility functions.

class sdpbenchmarks.utils.ParamSweeper(persistence_dir, params=None, name=None, randomise=True)

This class is inspired by the execo library (http://execo.gforge.inria.fr/doc/latest-stable/), but it is a much simplified version of the original. The original was developed for large-scale experiments and thread safety. Here, the only concern is the state of each run, which is tracked and remembered when launching the experiments.

done(combination): Marks the given combination as done

get_done(): Returns the iterable of finished runs

get_ignored(): Returns the iterable of ignored runs

get_inprogress(): Returns the iterable of runs in progress

get_next(): Returns the next run to iterate on

get_remaining(): Returns the iterable of remaining runs to iterate on

get_skipped(): Returns the iterable of skipped runs

get_submitted(): Returns the iterable of submitted jobs (when a batch scheduler is used)

get_sweeps(): Returns the iterable of what to iterate on

ignore(combination): Marks the given combination as ignored

set_sweeps(params=None): Sets the sweeps to be performed

skip(combination): Marks the given combination as skipped

submit(combination): Marks the given combination as submitted
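
The state tracking described above can be illustrated with a minimal sketch. This is not the sdpbenchmarks implementation: the class name MiniSweeper is hypothetical, state is kept in memory only (whatever persistence the persistence_dir argument provides is omitted), and only a few of the methods listed above are mimicked.

```python
class MiniSweeper:
    """Illustrative in-memory state tracker; not the sdpbenchmarks class."""

    def __init__(self, combinations):
        # Each combination must be hashable (for example a tuple of values).
        self._sweeps = list(combinations)
        self._done = set()
        self._skipped = set()
        self._inprogress = set()

    def get_remaining(self):
        # Remaining = everything not finished, skipped, or currently running.
        taken = self._done | self._skipped | self._inprogress
        return [c for c in self._sweeps if c not in taken]

    def get_next(self):
        # Hand out the next remaining combination and mark it as in progress.
        remaining = self.get_remaining()
        if not remaining:
            return None
        combination = remaining[0]
        self._inprogress.add(combination)
        return combination

    def done(self, combination):
        self._inprogress.discard(combination)
        self._done.add(combination)

    def skip(self, combination):
        self._inprogress.discard(combination)
        self._skipped.add(combination)

    def get_done(self):
        return [c for c in self._sweeps if c in self._done]


# Hypothetical combinations of (nodes, threads).
sweeper = MiniSweeper([(1, 4), (2, 4)])
first = sweeper.get_next()
sweeper.done(first)
second = sweeper.get_next()
sweeper.skip(second)
```

Keeping each state in its own set makes it cheap to ask which runs are left after an interrupted launch, which is the behaviour the class description emphasises.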

sdpbenchmarks.utils.create_scheduler_conf(conf, param, bench_name)

Prepares a dict with the parameters needed to create a job submission

Parameters

conf (dict) – A dict containing configuration.
param (dict) – A dict containing all parameters for the run
bench_name (str) – Name of the benchmark

Returns

A dict with the parameters needed to submit a job file

Return type

dict

Raises

KeyNotFoundError – An error occurred while looking for a key in conf or param

sdpbenchmarks.utils.exec_cmd(cmd_str)

This method executes the given command

Parameters: cmd_str (str) – Command to execute
Returns: The result object of subprocess.run, containing stdout, stderr and the return code
Raises: ExecuteCommandError – An error occurred during execution of command
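
As a rough illustration of the behaviour described above, the following sketch runs a command string with subprocess.run and raises on a non-zero return code. The function name exec_cmd_sketch and the local ExecuteCommandError class are stand-ins defined here so that the example runs on its own; splitting the string with shlex and treating any non-zero return code as a failure are assumptions, not the documented behaviour of exec_cmd.

```python
import shlex
import subprocess
import sys


class ExecuteCommandError(Exception):
    """Stand-in for the exception named above, so the sketch runs alone."""


def exec_cmd_sketch(cmd_str):
    # Split the command string into arguments (assumption: no shell features needed).
    args = shlex.split(cmd_str)
    # Run the command, capturing stdout and stderr as text.
    result = subprocess.run(args, capture_output=True, text=True)
    if result.returncode != 0:
        raise ExecuteCommandError(
            f"Command failed with code {result.returncode}: {result.stderr.strip()}"
        )
    return result


# Use the current Python interpreter as a portable command for the demo.
demo_cmd = shlex.join([sys.executable, "-c", "print('hello')"])
output = exec_cmd_sketch(demo_cmd)
```

The returned object exposes output.stdout, output.stderr and output.returncode, matching the three pieces of information listed under Returns.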

sdpbenchmarks.utils.execute_command_on_host(cmd_str, out_file)

This method executes the job on the host

Parameters

cmd_str (str) – Command to execute
out_file (str) – Name of the stdout/stderr file

Raises

ExecuteCommandError – An error occurred during execution of command

sdpbenchmarks.utils.execute_job_submission(cmd_str, run_prefix)

This method submits a job to the SLURM job scheduler and returns the job ID

Parameters

cmd_str (str) – Command string to be submitted
run_prefix (str) – Prefix of the bench run

Returns

ID of the submitted job; an exception is raised in case of failure

Return type

int

Raises

JobSubmissionError – An error occurred during the job submission
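
One plausible way to obtain the job ID is to parse the line that sbatch prints on success ("Submitted batch job <id>"). The sketch below shows only that parsing step. Whether execute_job_submission works this way is an assumption; parse_sbatch_job_id is a hypothetical helper, and JobSubmissionError is redefined locally as a stand-in.

```python
import re


class JobSubmissionError(Exception):
    """Stand-in for the exception named above, so the sketch runs alone."""


def parse_sbatch_job_id(sbatch_stdout):
    # sbatch normally prints "Submitted batch job <id>" on success.
    match = re.search(r"Submitted batch job (\d+)", sbatch_stdout)
    if match is None:
        raise JobSubmissionError(f"No job ID found in: {sbatch_stdout!r}")
    return int(match.group(1))


# Made-up sbatch output used only for illustration.
job_id = parse_sbatch_job_id("Submitted batch job 123456\n")
```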

sdpbenchmarks.utils.get_job_status(conf, job_id)

Returns the status of the batch job

Parameters

conf (dict) – A dict containing configuration of batch scheduler
job_id (int) – ID of the job

Returns

Current status of the job

Return type

str

sdpbenchmarks.utils.get_project_root()

Get root directory of the project

Returns: Full path of the root directory
Return type: str

sdpbenchmarks.utils.get_sockets_cores(conf)

Returns the number of sockets and cores on the compute nodes. For interactive runs, the information is taken from lscpu output. When the script is used to submit jobs from login nodes, lscpu cannot be used, so sinfo for the given partition is used instead.

Parameters: conf (dict) – A dict containing configuration settings
Returns: Number of sockets on each node, number of physical cores on each node (num_sockets * num_cores per socket), number of threads inside each core
Return type: list
Raises: KeyNotFoundError – An error occurred if a key is not found in the g5k dict that contains lscpu info for different clusters
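
For the interactive case, the three returned values can be derived from lscpu output roughly as follows. This is a sketch under stated assumptions: parse_lscpu is a hypothetical helper, the sample text is made up, and the actual function may extract the fields differently. Only the lscpu field names (Socket(s), Core(s) per socket, Thread(s) per core) are taken from the real tool.

```python
def parse_lscpu(lscpu_text):
    # Collect "Key: value" pairs from lscpu-style output.
    info = {}
    for line in lscpu_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    num_sockets = int(info["Socket(s)"])
    cores_per_socket = int(info["Core(s) per socket"])
    threads_per_core = int(info["Thread(s) per core"])
    # Physical cores per node = sockets * cores per socket.
    return [num_sockets, num_sockets * cores_per_socket, threads_per_core]


# Made-up lscpu excerpt for a two-socket node.
sample = """\
Architecture:        x86_64
CPU(s):              64
Thread(s) per core:  2
Core(s) per socket:  16
Socket(s):           2
"""
result = parse_lscpu(sample)
```

With this sample, the node has 2 sockets, 2 * 16 = 32 physical cores and 2 threads per core.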

sdpbenchmarks.utils.load_modules(module_list)

This function purges the existing modules and loads given modules

Parameters: module_list (str) – List of modules to load

sdpbenchmarks.utils.log_failed_cmd_stderr_file(output)

This method dumps the output to a file when command execution fails

Parameters: output (str) – stdout and stderr from execution of command
Returns: Path of the file
Return type: str

sdpbenchmarks.utils.pull_image(uri, container_mode, path)

This method pulls the image from the registry. It returns an error code if the image cannot be pulled

Parameters

uri (str) – URI of the image
container_mode (str) – Docker or Singularity container
path (str) – Path where the image is to be saved (Singularity only). An existing image at this path is overwritten.

Returns

0 if OK, 1 if not OK

Return type

int

sdpbenchmarks.utils.reformat_long_string(ln_str, width=70)

This method reformats a command string by breaking it into multiple lines.

Parameters

ln_str (str) – Long string to break down
width (int) – Width of each line (Default is 70)

Returns

The same string broken across multiple lines to ease readability

Return type

str
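
A minimal sketch of this kind of reformatting, using the standard textwrap module, is shown below. The function name reformat_long_string_sketch is hypothetical, and joining the pieces with a trailing backslash (so that a shell still reads them as one command) is an assumption about the output format, not something the documentation states.

```python
import textwrap


def reformat_long_string_sketch(ln_str, width=70):
    # Break at whitespace only; words longer than `width` are kept whole.
    pieces = textwrap.wrap(
        ln_str, width=width, break_long_words=False, break_on_hyphens=False
    )
    # Join with a trailing backslash as a shell line-continuation marker.
    return " \\\n".join(pieces)


# Hypothetical command string used only for illustration.
cmd = "mpirun -np 4 ./bench --size 1024 --repeat 10"
wrapped = reformat_long_string_sketch(cmd, width=20)
```

Disabling break_on_hyphens keeps options such as --size intact, which matters when the wrapped text is a command line rather than prose.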

sdpbenchmarks.utils.standardise_output_data(bench_name, conf, param, metrics)

This method saves all the data of the benchmark run in JSON format. The aim is to record all the information needed to reproduce the run.

Parameters

bench_name (str) – Name of the benchmark
conf (dict) – A dict containing all configuration info
param (dict) – A dict of parameters
metrics (dict) – A dict containing all the metric data

Raises

KeyNotFoundError – An error occurred while looking for a key in conf or param

sdpbenchmarks.utils.submit_job(conf, job_id)

This method submits a job to the scheduler

Parameters

conf (dict) – A dict containing all parameters needed for job submission
job_id (int) – Job ID of the previous job. This is required when jobs depend on one another

Returns

ID of the submitted job

Return type

int

sdpbenchmarks.utils.sweep(parameters)

This method accepts a dict with possible values for each parameter and creates a parameter space to sweep

Parameters: parameters (dict) – A dict containing parameters and their values
Returns: All possible combinations of the parameter space
Return type: list
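
Building all combinations of a parameter space is a Cartesian product, which can be sketched with itertools.product. The function name sweep_sketch and the parameter names in the example are hypothetical, and returning each combination as a dict is an assumption; the documentation only states that a list is returned.

```python
from itertools import product


def sweep_sketch(parameters):
    # Fix the key order so each value lines up with its parameter name.
    names = list(parameters)
    value_lists = [parameters[name] for name in names]
    # The Cartesian product yields one tuple of values per combination.
    return [dict(zip(names, values)) for values in product(*value_lists)]


# Hypothetical parameter space: 2 x 2 = 4 combinations.
combos = sweep_sketch({"nodes": [1, 2], "image_size": [1024, 2048]})
```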

sdpbenchmarks.utils.which(cmd, modules)

This function loads the given modules and returns the path of the requested binary if found, or None otherwise

Parameters

cmd (str) – Name of the binary
modules (str) – Modules to load

Returns

Path of the binary or None if not found

Return type

str

sdpbenchmarks.utils.write_oar_job_file(conf_oar)

This method writes an OAR job file to submit with oarsub

Parameters: conf_oar (dict) – A dict containing all OAR job parameters
Returns: Name of the file
Return type: str
Raises: JobScriptCreationError – An error occurred in creation of job script

sdpbenchmarks.utils.write_slurm_job_file(conf_slurm)

This method writes a SLURM job file to submit with sbatch

Parameters: conf_slurm (dict) – A dict containing all SLURM job parameters
Returns: Name of the file
Return type: str
Raises: JobScriptCreationError – An error occurred in creation of job script

sdpbenchmarks.utils.write_tgcc_job_file(conf_slurm)

This method writes a SLURM job file for the TGCC Irene machine to submit with ccc_msub

Parameters: conf_slurm (dict) – A dict containing all SLURM job parameters
Returns: Name of the file
Return type: str
Raises: JobScriptCreationError – An error occurred in creation of job script