Documentation of modules

Utility functions for SDP benchmark tests

class modules.utils.GenerateHplConfig(num_nodes, num_procs, nb, mem, alpha=0.8)[source]

Class to generate the HPL.dat file for the LINPACK benchmark

estimate_n()[source]

Estimate the problem size N from the available memory and the alpha fraction

estimate_pq()[source]

Get the best combination of P and Q for the given number of processes

get_config()[source]

Write the HPL.dat file to the outfile location
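The documentation above does not show the estimation logic itself, so here is a minimal, self-contained sketch of the kind of computation estimate_pq() and estimate_n() imply. This is not the actual implementation: the most-square P x Q decomposition and the alpha-fraction-of-memory sizing rule are assumptions based on common HPL tuning practice, and the mem unit (GiB per node) is hypothetical.

```python
import math

def estimate_pq(num_procs):
    """Pick the most square P x Q grid with P * Q == num_procs and P <= Q."""
    best = (1, num_procs)
    for p in range(1, math.isqrt(num_procs) + 1):
        if num_procs % p == 0:
            best = (p, num_procs // p)  # p grows toward sqrt, so later hits are more square
    return best

def estimate_n(num_nodes, mem_gb, nb, alpha=0.8):
    """Estimate HPL problem size N: use a fraction alpha of total memory for the
    N x N double-precision matrix, rounded down to a multiple of block size nb."""
    total_bytes = num_nodes * mem_gb * 1024**3
    n = int(math.sqrt(alpha * total_bytes / 8))  # 8 bytes per double
    return (n // nb) * nb
```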

modules.utils.sdp_benchmark_tests_root()[source]

Returns the root directory of SKA SDP Benchmark tests

Returns

Path of the root directory

Return type

str

modules.utils.parse_path_metadata(path)[source]

Return a dict of ReFrame info from a results path

Parameters

path – ReFrame stage/output path

Returns

ReFrame system/partition info

Return type

dict
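As an illustration of what parse_path_metadata() might return, here is a sketch that assumes the last four components of a ReFrame stage/output path are <system>/<partition>/<environ>/<testname>. Both that layout and the returned key names are assumptions, not the documented behaviour.

```python
import os

def parse_path_metadata(path):
    """Sketch: split a ReFrame stage/output path and map its trailing
    components to system/partition/environment/testname keys (layout assumed)."""
    parts = os.path.normpath(path).split(os.sep)
    system, partition, environ, testname = parts[-4:]
    return {"sysname": system, "partition": partition,
            "environ": environ, "testname": testname}
```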

modules.utils.find_perf_logs(root, benchmark)[source]

Get perflog file names for given test

Parameters
  • root – Root where perflogs exist

  • benchmark – Name of the benchmark

Returns

List of perflog file names

Return type

list

modules.utils.read_perflog(path)[source]

Return a pandas dataframe from a ReFrame performance log. NB: this currently depends on having a non-default handlers_perflog.filelog.format in ReFrame's configuration; see the code. The returned dataframe will have columns for:

  • all keys returned by parse_path_metadata()

  • all fields in a performance log record, noting that ‘completion_time’ is converted to a datetime.datetime and ‘tags’ is split on commas into a list of strs

  • ‘perf_var’ and ‘perf_value’, derived from ‘perf_info’ field

  • <key> for any tags of the format “<key>=<value>”, with values converted to int or float if possible

Parameters

path (str) – Path to log file

Returns

Dataframe of perflogs

Return type

pandas.DataFrame
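The "<key>=<value>" tag handling described above can be sketched as a small helper: split each tag on the first "=", then try int, then float, and otherwise keep the value as a string. The helper name and its stand-alone form are hypothetical; the real logic lives inside read_perflog().

```python
def parse_tag(tag):
    """Sketch of '<key>=<value>' tag conversion: int first, then float,
    else leave the value as a string; non '=' tags yield None."""
    key, sep, value = tag.partition("=")
    if not sep:
        return None
    for convert in (int, float):
        try:
            return key, convert(value)
        except ValueError:
            pass
    return key, value
```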

modules.utils.load_perf_logs(root='.', test=None, extras=None, last=False, aggregate_multi_runs=<function median>)[source]

Convenience wrapper around read_perflog().

Parameters
  • root (str) – Path to root of tree containing perf logs

  • test (str) – Shell-style glob pattern matched against last directory component to restrict loaded logs, or None to load all in tree

  • extras (list) – Additional dataframe headers to add

  • last (bool) – True to only return the most-recent record for each system/partition/environment/testname/perf_var combination.

  • aggregate_multi_runs (Callable) – How to aggregate the perf-values of multiple runs. If None, no aggregation is applied. Defaults to np.median

Returns

Single pandas.DataFrame concatenated from all loaded logs, or None if no logs exist

Return type

pandas.DataFrame
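The last=True behaviour can be illustrated pandas-free: keep only the most-recent record per system/partition/environment/testname/perf_var key. This is a sketch of the selection logic only; the record dicts, key names, and use of 'completion_time' for ordering are assumptions drawn from the column descriptions above.

```python
def keep_last(records, keys=("sysname", "partition", "environ", "testname", "perf_var")):
    """Sketch: retain the record with the latest 'completion_time'
    for each combination of the given keys."""
    latest = {}
    for rec in records:
        k = tuple(rec[key] for key in keys)
        if k not in latest or rec["completion_time"] > latest[k]["completion_time"]:
            latest[k] = rec
    return list(latest.values())
```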

modules.utils.tabulate_last_perf(test, root='../../perflogs', extras=None, **kwargs)[source]

Retrieve last perf_log entry for each system/partition/environment.

Parameters
  • test (str) – Shell-style glob pattern matched against last directory component to restrict loaded logs, or None to load all in tree

  • root (str) – Path to root of tree containing perf logs of interest - default assumes this is called from an apps/<application>/ directory

  • extras (list) – Additional dataframe headers to add

Returns

A dataframe with the columns:
  • case: name of the system, partition and environment
  • perf_var: performance variable
  • add_var: any additional variable passed as argument

Return type

pandas.DataFrame

modules.utils.tabulate_partitions(root)[source]

Tabulate the partitions defined in the ReFrame config file together with a high-level overview of each partition. Only partitions that are found in the perflog directory are tabulated.

Parameters

root (str) – Perflog root directory

Returns

A dataframe with all partition details

Return type

pandas.DataFrame

modules.utils.filter_systems_by_name(patterns)[source]

Filter systems based on patterns in the name. A system is chosen only if all patterns are found in its name.

Parameters

patterns (list) – List of patterns to be searched

Returns

List of partitions that match the patterns

Return type

list
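The all-patterns-must-match rule can be sketched as a simple filter. Passing the candidate names in as an argument is an assumption for illustration; the real function presumably reads them from the ReFrame configuration.

```python
def filter_systems_by_name(systems, patterns):
    """Sketch: keep a system only if every pattern occurs in its name."""
    return [name for name in systems if all(p in name for p in patterns)]
```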

modules.utils.filter_systems_by_env(envs)[source]

Filter systems based on valid environments defined for them.

Parameters

envs (list) – List of environments to be searched

Returns

List of partitions that match the envs

Return type

list

modules.utils.git_describe(dir)[source]

Return a string describing the state of the git repo that contains dir. See git describe --dirty --always for full details.

Parameters

dir (str) – Root path of git repo

Returns

Git describe output

Return type

str

modules.utils.generate_random_number(n)[source]

Generate a random integer with n digits

Parameters

n (int) – Length of the desired random number

Returns

Generated random number

Return type

int
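One natural way to implement an n-digit random integer is to draw uniformly between the smallest and largest n-digit numbers; a sketch, not necessarily the actual implementation:

```python
import random

def generate_random_number(n):
    """Sketch: a uniform random integer with exactly n digits."""
    return random.randint(10 ** (n - 1), 10 ** n - 1)
```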

modules.utils.get_scheduler_env_list(scheduler_name)[source]

Return the environment variables that store job details for different workload schedulers

Parameters

scheduler_name (str) – Name of the workload scheduler

Returns

Environment variables dict

Return type

dict
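The mapping might look like the sketch below, using real SLURM and PBS variable names (SLURM_JOB_ID, PBS_JOBID, etc.). Which schedulers are supported and which variables the real function returns are assumptions.

```python
def get_scheduler_env_list(scheduler_name):
    """Sketch: map a scheduler name to the environment variables that
    expose job details (variable selection assumed)."""
    env_lists = {
        "slurm": {"job_id": "SLURM_JOB_ID", "num_nodes": "SLURM_JOB_NUM_NODES",
                  "num_tasks": "SLURM_NTASKS", "partition": "SLURM_JOB_PARTITION"},
        "pbs": {"job_id": "PBS_JOBID", "num_nodes": "PBS_NUM_NODES",
                "queue": "PBS_QUEUE"},
    }
    return env_lists.get(scheduler_name.lower(), {})
```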

modules.utils.emit_conda_init_cmds()[source]

This function emits the command to initialize conda. It temporarily clears PYTHONPATH so that no pre-installed dependencies from external or external/perfmon are used. TODO: test whether this works even with perfmon.

modules.utils.emit_conda_env_cmds(env_name, py_ver='3.8')[source]

This function emits all the commands needed to create and/or activate a conda environment. It assumes conda is installed on the system.

Parameters
  • env_name (str) – Name of the conda env to create/activate

  • py_ver (str) – Version of python to be used in conda environment (Default: 3.8)

Returns

List of commands to create/activate conda env

Return type

list
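A sketch of the command list such a function could emit: create the environment only if it does not already exist, then activate it. The exact shell commands (and the grep-based existence check) are assumptions for illustration.

```python
def emit_conda_env_cmds(env_name, py_ver="3.8"):
    """Sketch: shell commands to create a conda env if missing, then activate it."""
    return [
        f"conda env list | grep -q '^{env_name} ' || "
        f"conda create -y -n {env_name} python={py_ver}",
        f"conda activate {env_name}",
    ]
```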

modules.utils.merge_spack_configs(input_file, output_file)[source]

This function merges all Spack config files by replacing the include keyword with the contents of the respective YAML file

Parameters
  • input_file – Path to input spack.yaml file

  • output_file – Path to output merged spack.yaml file

Returns

None