Documentation of modules

Utility functions for SDP benchmark tests

class modules.utils.GenerateHplConfig(num_nodes, num_procs, nb, mem, alpha=0.8)[source]

Class to generate the HPL.dat file for the LINPACK benchmark

estimate_n()[source]

Estimate the problem size N from the available memory and the alpha fraction

estimate_pq()[source]

Get the best combination of P and Q for the given number of processes

get_config()[source]

Write the HPL.dat file to the outfile location
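The documentation above does not show the estimation logic itself, so here is a minimal, self-contained sketch of the kind of computation estimate_pq() and estimate_n() imply. This is not the actual implementation: the most-square P x Q decomposition and the alpha-fraction-of-memory sizing rule are assumptions based on common HPL tuning practice, and the mem unit (GiB per node) is hypothetical.

```python
import math

def estimate_pq(num_procs):
    """Pick the most square P x Q grid with P * Q == num_procs and P <= Q."""
    best = (1, num_procs)
    for p in range(1, math.isqrt(num_procs) + 1):
        if num_procs % p == 0:
            best = (p, num_procs // p)  # p grows toward sqrt, so later hits are more square
    return best

def estimate_n(num_nodes, mem_gb, nb, alpha=0.8):
    """Estimate HPL problem size N: use a fraction alpha of total memory for the
    N x N double-precision matrix, rounded down to a multiple of block size nb."""
    total_bytes = num_nodes * mem_gb * 1024**3
    n = int(math.sqrt(alpha * total_bytes / 8))  # 8 bytes per double
    return (n // nb) * nb
```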

modules.utils.sdp_benchmark_tests_root()[source]

Returns the root directory of SKA SDP Benchmark tests

Returns

Path of the root directory

Return type

str

modules.utils.parse_path_metadata(path)[source]

Return a dict of ReFrame info from a results path

Parameters

path – ReFrame stage/output path

Returns

ReFrame system/partition info

Return type

dict
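As an illustration of what parse_path_metadata() might return, here is a sketch that assumes the last four components of a ReFrame stage/output path are <system>/<partition>/<environ>/<testname>. Both that layout and the returned key names are assumptions, not the documented behaviour.

```python
import os

def parse_path_metadata(path):
    """Sketch: split a ReFrame stage/output path and map its trailing
    components to system/partition/environment/testname keys (layout assumed)."""
    parts = os.path.normpath(path).split(os.sep)
    system, partition, environ, testname = parts[-4:]
    return {"sysname": system, "partition": partition,
            "environ": environ, "testname": testname}
```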

modules.utils.find_perf_logs(root, benchmark)[source]

Get perflog file names for given test

Parameters
  • root – Root where perflogs exist

  • benchmark – Name of the benchmark

Returns

List of perflog file names

Return type

list

modules.utils.read_perflog(path)[source]

Return a pandas dataframe from a ReFrame performance log. NB: this currently depends on having a non-default handlers_perflog.filelog.format in ReFrame's configuration; see the code. The returned dataframe will have columns for:

  • all keys returned by parse_path_metadata()

  • all fields in a performance log record, noting that ‘completion_time’ is converted to a datetime.datetime and ‘tags’ is split on commas into a list of strs

  • ‘perf_var’ and ‘perf_value’, derived from ‘perf_info’ field

  • <key> for any tags of the format “<key>=<value>”, with values converted to int or float if possible

Parameters

path (str) – Path to log file

Returns

Dataframe of perflogs

Return type

pandas.DataFrame
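The "<key>=<value>" tag handling described above can be sketched as a small helper: split each tag on the first "=", then try int, then float, and otherwise keep the value as a string. The helper name and its stand-alone form are hypothetical; the real logic lives inside read_perflog().

```python
def parse_tag(tag):
    """Sketch of '<key>=<value>' tag conversion: int first, then float,
    else leave the value as a string; non '=' tags yield None."""
    key, sep, value = tag.partition("=")
    if not sep:
        return None
    for convert in (int, float):
        try:
            return key, convert(value)
        except ValueError:
            pass
    return key, value
```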

modules.utils.load_perf_logs(root='.', test=None, extras=None, last=False, aggregate_multi_runs=<function median>)[source]

Convenience wrapper around read_perflog().

Parameters
  • root (str) – Path to root of tree containing perf logs

  • test (str) – Shell-style glob pattern matched against last directory component to restrict loaded logs, or None to load all in tree

  • extras (list) – Additional dataframe headers to add

  • last (bool) – True to only return the most-recent record for each system/partition/environment/testname/perf_var combination.

  • aggregate_multi_runs (Callable) – How to aggregate the perf-values of multiple runs. If None, no aggregation is applied. Defaults to np.median

Returns

Single pandas.DataFrame concatenated from all loaded logs, or None if no logs exist

Return type

pandas.DataFrame
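The last=True behaviour can be illustrated pandas-free: keep only the most-recent record per system/partition/environment/testname/perf_var key. This is a sketch of the selection logic only; the record dicts, key names, and use of 'completion_time' for ordering are assumptions drawn from the column descriptions above.

```python
def keep_last(records, keys=("sysname", "partition", "environ", "testname", "perf_var")):
    """Sketch: retain the record with the latest 'completion_time'
    for each combination of the given keys."""
    latest = {}
    for rec in records:
        k = tuple(rec[key] for key in keys)
        if k not in latest or rec["completion_time"] > latest[k]["completion_time"]:
            latest[k] = rec
    return list(latest.values())
```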

modules.utils.tabulate_last_perf(test, root='../../perflogs', extras=None, **kwargs)[source]

Retrieve last perf_log entry for each system/partition/environment.

Parameters
  • test (str) – Shell-style glob pattern matched against last directory component to restrict loaded logs, or None to load all in tree

  • root (str) – Path to root of tree containing perf logs of interest - default assumes this is called from an apps/<application>/ directory

  • extras (list) – Additional dataframe headers to add

Returns

A dataframe with the columns:
  • case: name of the system, partition and environment
  • perf_var: performance variable
  • add_var: any additional variable passed as argument

Return type

pandas.DataFrame

modules.utils.tabulate_partitions(root)[source]

Tabulate the partitions defined in the ReFrame config file together with a high-level overview of each partition. Only partitions that are found in the perflog directory are tabulated.

Parameters

root (str) – Perflog root directory

Returns

A dataframe with all partition details

Return type

pandas.DataFrame

modules.utils.filter_systems_by_name(patterns)[source]

Filter systems based on patterns in the name. A system is chosen only if all patterns are found in its name.

Parameters

patterns (list) – List of patterns to be searched

Returns

List of partitions that match the patterns

Return type

list
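The all-patterns-must-match rule can be sketched as a simple filter. Passing the candidate names in as an argument is an assumption for illustration; the real function presumably reads them from the ReFrame configuration.

```python
def filter_systems_by_name(systems, patterns):
    """Sketch: keep a system only if every pattern occurs in its name."""
    return [name for name in systems if all(p in name for p in patterns)]
```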

modules.utils.filter_systems_by_env(envs)[source]

Filter systems based on valid environments defined for them.

Parameters

envs (list) – List of environments to be searched

Returns

List of partitions that match the envs

Return type

list

modules.utils.git_describe(dir)[source]

Return a string describing the state of the git repo that contains dir. See git describe --dirty --always for full details.

Parameters

dir (str) – Root path of git repo

Returns

Git describe output

Return type

str

modules.utils.generate_random_number(n)[source]

Generate a random integer with n digits

Parameters

n (int) – Length of the desired random number

Returns

Generated random number

Return type

int
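One natural way to implement an n-digit random integer is to draw uniformly between the smallest and largest n-digit numbers; a sketch, not necessarily the actual implementation:

```python
import random

def generate_random_number(n):
    """Sketch: a uniform random integer with exactly n digits."""
    return random.randint(10 ** (n - 1), 10 ** n - 1)
```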

modules.utils.get_scheduler_env_list(scheduler_name)[source]

Return the environment variables that store job details for different workload schedulers

Parameters

scheduler_name (str) – Name of the workload scheduler

Returns

Environment variables dict

Return type

dict
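The mapping might look like the sketch below, using real SLURM and PBS variable names (SLURM_JOB_ID, PBS_JOBID, etc.). Which schedulers are supported and which variables the real function returns are assumptions.

```python
def get_scheduler_env_list(scheduler_name):
    """Sketch: map a scheduler name to the environment variables that
    expose job details (variable selection assumed)."""
    env_lists = {
        "slurm": {"job_id": "SLURM_JOB_ID", "num_nodes": "SLURM_JOB_NUM_NODES",
                  "num_tasks": "SLURM_NTASKS", "partition": "SLURM_JOB_PARTITION"},
        "pbs": {"job_id": "PBS_JOBID", "num_nodes": "PBS_NUM_NODES",
                "queue": "PBS_QUEUE"},
    }
    return env_lists.get(scheduler_name.lower(), {})
```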

modules.utils.emit_conda_init_cmds()[source]

This function emits the command to initialize conda. It temporarily clears PYTHONPATH so that no pre-installed dependencies from external or external/perfmon are used. TODO: test whether this works even with perfmon.

modules.utils.emit_conda_env_cmds(env_name, py_ver='3.8')[source]

This function emits all the commands needed to create and/or activate a conda environment. It assumes conda is installed on the system.

Parameters
  • env_name (str) – Name of the conda env to create/activate

  • py_ver (str) – Version of python to be used in conda environment (Default: 3.8)

Returns

List of commands to create/activate conda env

Return type

list
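A sketch of the command list such a function could emit: create the environment only if it does not already exist, then activate it. The exact shell commands (and the grep-based existence check) are assumptions for illustration.

```python
def emit_conda_env_cmds(env_name, py_ver="3.8"):
    """Sketch: shell commands to create a conda env if missing, then activate it."""
    return [
        f"conda env list | grep -q '^{env_name} ' || "
        f"conda create -y -n {env_name} python={py_ver}",
        f"conda activate {env_name}",
    ]
```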

modules.utils.merge_spack_configs(input_file, output_file)[source]

This function merges all Spack config files by replacing the include keyword with the contents of the respective YAML file

Parameters
  • input_file – Path to input spack.yaml file

  • output_file – Path to output merged spack.yaml file

Returns

None