optimise
Module to optimise the hardware configuration of a simulation.
- ska_sdp_resource_model.optimise.optimise.configure_hardware(budgets)[source]
Suggest hardware configuration parameters based on normalised costs, given a fixed budget and the costs of compute, capacity storage and performance storage.
- Parameters:
budgets (dict) – The budget for each component. Must include the following: compute, capacity, data_product and performance.
- Returns:
hardware_configuration (dict) – The hardware configuration, including compute_nodes, capacity_storage_pb and performance_storage_tb.
- ska_sdp_resource_model.optimise.optimise.get_budget_split(trial, budget)[source]
Get the budget split for each component.
- Parameters:
trial – The optuna trial object.
budget (int) – Total budget in EUR.
- Returns:
budget_allocation (dict) – A dictionary with the budget allocated for each component.
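As an illustration of how a budget split can be sampled, the sketch below draws one weight per component and normalises the weights so that the shares add up to the total budget. This is not the module's implementation: `split_budget` and `StubTrial` are hypothetical names, the weight range is an assumption, and the stub only mimics the `suggest_float(name, low, high)` call that a real optuna trial provides.

```python
import random

# Component names taken from the configure_hardware documentation above.
COMPONENTS = ("compute", "capacity", "data_product", "performance")


class StubTrial:
    """Hypothetical stand-in for an optuna trial; only mimics suggest_float."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self.params = {}

    def suggest_float(self, name, low, high):
        value = self._rng.uniform(low, high)
        self.params[name] = value
        return value


def split_budget(trial, budget):
    """Sample one weight per component and normalise the shares to the budget."""
    # The lower bound is kept above zero so the total weight is never zero.
    weights = {
        name: trial.suggest_float(f"{name}_weight", 0.01, 1.0)
        for name in COMPONENTS
    }
    total_weight = sum(weights.values())
    return {name: budget * w / total_weight for name, w in weights.items()}


allocation = split_budget(StubTrial(seed=1), 8_000_000)
```

Normalising sampled weights keeps every trial within the budget by construction, so the optimiser never has to reject an over-budget split.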
- ska_sdp_resource_model.optimise.optimise.get_hardware_cost(hardware_config)[source]
Compute the total cost of a hardware configuration in EUR.
- Parameters:
hardware_config (dict) – Hardware configuration. Must include the following: capacity_storage_pb, performance_storage_tb and compute_nodes.
- Returns:
cost (dict) – Dictionary with the cost of each component and the total cost.
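The cost calculation can be pictured as a per-component multiplication followed by a sum. The sketch below assumes made-up unit prices and an output layout with one entry per component plus a `total` key; neither the prices nor the exact keys of the returned dictionary are stated in this reference, and `hardware_cost` is a hypothetical name.

```python
# Hypothetical unit prices in EUR; the real values are not given in this reference.
UNIT_COST_EUR = {
    "compute_nodes": 10_000.0,         # per node
    "capacity_storage_pb": 100_000.0,  # per PB
    "performance_storage_tb": 500.0,   # per TB
}


def hardware_cost(hardware_config):
    """Return the cost of each component and the total, in EUR."""
    cost = {
        key: hardware_config[key] * unit_price
        for key, unit_price in UNIT_COST_EUR.items()
    }
    # The sum is evaluated before "total" is added to the dictionary.
    cost["total"] = sum(cost.values())
    return cost


example = hardware_cost(
    {
        "compute_nodes": 100,
        "capacity_storage_pb": 20,
        "performance_storage_tb": 400,
    }
)
```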
- ska_sdp_resource_model.optimise.optimise.group_pipelines(pipelines_config, n_groups, seed=42)[source]
Group pipelines into n_groups.
Uses K-means clustering to group pipelines into n_groups based on node_hours_mean, pct_parallelism_min and pct_parallelism_max.
- Parameters:
pipelines_config (dict) – The pipelines configuration.
n_groups (int) – The number of groups to create.
seed (int) – Random seed. Defaults to 42.
- Returns:
grouped_pipelines (dict) – Dictionary of pipelines with assigned group number.
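To show what clustering on these three features looks like, here is a small pure-Python version of Lloyd's K-means algorithm. It is a sketch only: the module may well rely on a library implementation and may scale the features first, the layout of `pipelines_config` (pipeline name mapped to a dictionary of features) is assumed, and `kmeans_groups` and the `group` key are hypothetical names.

```python
import random

FEATURES = ("node_hours_mean", "pct_parallelism_min", "pct_parallelism_max")


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_groups(pipelines_config, n_groups, seed=42, n_iterations=50):
    """Assign each pipeline a group number using plain Lloyd's K-means."""
    names = sorted(pipelines_config)
    points = [
        tuple(float(pipelines_config[name][f]) for f in FEATURES)
        for name in names
    ]
    rng = random.Random(seed)
    # Initialise the centroids from randomly chosen pipelines.
    centroids = rng.sample(points, n_groups)
    labels = None
    for _ in range(n_iterations):
        # Assignment step: every point goes to its nearest centroid.
        new_labels = [
            min(range(n_groups), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(n_groups):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return {
        name: {**pipelines_config[name], "group": label}
        for name, label in zip(names, labels)
    }


example = {
    "small_a": {"node_hours_mean": 1.0, "pct_parallelism_min": 10, "pct_parallelism_max": 20},
    "small_b": {"node_hours_mean": 1.5, "pct_parallelism_min": 12, "pct_parallelism_max": 22},
    "large_a": {"node_hours_mean": 100.0, "pct_parallelism_min": 80, "pct_parallelism_max": 90},
    "large_b": {"node_hours_mean": 110.0, "pct_parallelism_min": 85, "pct_parallelism_max": 95},
}
grouped = kmeans_groups(example, n_groups=2)
```

Fixing the seed makes the initial centroids, and therefore the grouping, reproducible between runs.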
- ska_sdp_resource_model.optimise.optimise.main()[source]
Optimise the hardware configuration.
Runs optuna on the hardware configuration to minimise the total time taken.
- Returns:
None
- ska_sdp_resource_model.optimise.optimise.objective(trial, budget, observations_list, pipelines_config, num_monte_carlo_iterations=0, num_workers=1, shuffle=False, n_groups=6)[source]
Objective function for optimisation.
- Parameters:
trial – The optuna trial object.
observations_list (list) – List of observations to simulate.
budget (float) – The budget for the hardware.
pipelines_config (dict) – Pipelines configuration data.
num_workers (int) – Number of workers to use. Defaults to 1.
num_monte_carlo_iterations (int) – Number of Monte Carlo iterations to run. Defaults to 0.
shuffle (bool) – Whether to shuffle the observations list before running the simulation. Defaults to False.
n_groups (int) – The number of groups for clustering pipelines. Defaults to 6.
- Returns:
float – Total simulation time in days.
- ska_sdp_resource_model.optimise.optimise.optimise_hardware(budget=8000000.0, n_trials=10, verbose=False, observing_schedule_path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/ska-telescope-ska-sdp-resource-model/checkouts/latest/src/ska_sdp_resource_model/data/schedules/observing_schedule.json'), pipelines_config_path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/ska-telescope-ska-sdp-resource-model/checkouts/latest/src/ska_sdp_resource_model/data/config/pipelines.json'), scheduling_block_types_config_path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/ska-telescope-ska-sdp-resource-model/checkouts/latest/src/ska_sdp_resource_model/data/config/scheduling_block_types.json'), storage='sqlite:///hardware-optimisation.db', study_name=None, num_monte_carlo_iterations=0, shuffle=False, n_groups=6, client=None)[source]
Optimise the hardware configuration.
If num_monte_carlo_iterations is greater than 0, the simulation will be run multiple times with different values for pct_parallelism for each pipeline and the average time taken will be used to optimise hardware.
- Parameters:
budget (float) – The budget for the hardware.
n_trials (int) – The number of trials to run.
verbose (bool) – Whether to print verbose output.
observing_schedule_path (Path) – Path to the observing schedule file.
pipelines_config_path (Path) – Path to the pipelines configuration file.
scheduling_block_types_config_path (Path) – Path to the scheduling block types file.
storage (str) – The storage URL of the database for the optuna study. Defaults to ‘sqlite:///hardware-optimisation.db’.
study_name (str) – The name of the optuna study. If None, defaults to ‘budget-{budget}’.
num_monte_carlo_iterations (int) – Number of Monte Carlo iterations to run. Defaults to 0.
shuffle (bool) – Whether to shuffle the observations list before running the simulation. Defaults to False.
n_groups (int) – The number of groups for clustering pipelines. Defaults to 6.
client (dask.Client) – Optional Dask client to use for parallelism. Defaults to None.
- Returns:
best_params (dict) – The best parameters found by the optuna study.
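The Monte Carlo behaviour described above can be sketched as follows: for each iteration a pct_parallelism value is drawn for every pipeline, a simulation is run, and the resulting times are averaged. Everything specific here is an assumption made for illustration: the uniform draw between pct_parallelism_min and pct_parallelism_max, the Amdahl-style `toy_simulation_days` stand-in for the real simulation, and the function names.

```python
import random
import statistics


def toy_simulation_days(pipelines):
    """Hypothetical stand-in for the real simulation, returning days."""
    total_hours = 0.0
    for p in pipelines.values():
        parallel = p["pct_parallelism"] / 100.0
        # Amdahl-style estimate: only the parallel fraction scales with nodes.
        total_hours += p["node_hours_mean"] * (
            (1.0 - parallel) + parallel / p["num_nodes"]
        )
    return total_hours / 24.0


def mean_simulation_days(pipelines_config, num_monte_carlo_iterations, seed=0):
    """Average the simulated time over several sampled pct_parallelism values."""
    rng = random.Random(seed)
    results = []
    for _ in range(num_monte_carlo_iterations):
        # Draw a fresh pct_parallelism for every pipeline (assumed uniform).
        sampled = {
            name: {
                **p,
                "pct_parallelism": rng.uniform(
                    p["pct_parallelism_min"], p["pct_parallelism_max"]
                ),
            }
            for name, p in pipelines_config.items()
        }
        results.append(toy_simulation_days(sampled))
    return statistics.mean(results)


pipelines = {
    "imaging": {
        "node_hours_mean": 48.0,
        "pct_parallelism_min": 50,
        "pct_parallelism_max": 90,
        "num_nodes": 4,
    }
}
average_days = mean_simulation_days(pipelines, num_monte_carlo_iterations=20)
```

Averaging over several draws makes the objective less sensitive to a single lucky or unlucky parallelism value, at the cost of running the simulation once per iteration.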
- ska_sdp_resource_model.optimise.optimise.parameterise_pipelines_num_nodes(trial, max_compute_nodes, pipelines_config, n_groups=6)[source]
Parameterise the pipelines configuration.
Reads in the pipelines config file and uses optuna to sample the number of compute nodes for each pipeline, replacing the “num_nodes” parameter.
- Parameters:
trial – The optuna trial object.
max_compute_nodes (int) – The maximum number of compute nodes.
pipelines_config (dict) – The pipelines configuration.
n_groups (int) – The number of groups for clustering pipelines. Defaults to 6.
- Returns:
pipelines_config (dict) – The parameterised pipelines configuration.
trial – The updated optuna trial object.
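One plausible reading of the n_groups parameter is that a single node count is sampled per pipeline group and shared by every pipeline in that group. The sketch below follows that assumption. `sample_num_nodes` and `StubTrial` are hypothetical, the `group` key is assumed to come from a prior grouping step, and the stub only mimics the `suggest_int(name, low, high)` call of a real optuna trial, which also treats both bounds as inclusive.

```python
import random


class StubTrial:
    """Hypothetical stand-in for an optuna trial; only mimics suggest_int."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self.params = {}

    def suggest_int(self, name, low, high):
        value = self._rng.randint(low, high)  # both bounds inclusive
        self.params[name] = value
        return value


def sample_num_nodes(trial, max_compute_nodes, grouped_pipelines):
    """Sample one num_nodes value per group and apply it to its pipelines."""
    group_nodes = {}
    updated = {}
    for name, pipeline in grouped_pipelines.items():
        group = pipeline["group"]
        if group not in group_nodes:
            # One search parameter per group keeps the search space small.
            group_nodes[group] = trial.suggest_int(
                f"num_nodes_group_{group}", 1, max_compute_nodes
            )
        updated[name] = {**pipeline, "num_nodes": group_nodes[group]}
    return updated


trial = StubTrial(seed=3)
grouped = {
    "imaging": {"group": 0, "num_nodes": 1},
    "calibration": {"group": 0, "num_nodes": 1},
    "averaging": {"group": 1, "num_nodes": 1},
}
parameterised = sample_num_nodes(trial, max_compute_nodes=64, grouped_pipelines=grouped)
```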
- ska_sdp_resource_model.optimise.optimise.update_scheduling_blocks(observations_list, pipelines_config)[source]
Update the number of nodes for each pipeline in the scheduling blocks.
This function iterates through the observation list and updates the num_nodes parameter for each pipeline in the pipeline_steps dictionary with the value from the pipelines_config dictionary.
- Parameters:
observations_list (list) – List of observations.
pipelines_config (dict) – Dictionary containing configuration for each pipeline.
- Returns:
None
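The in-place update described above can be sketched as a nested loop. The data layout is assumed: each observation is taken to be a dictionary with a `pipeline_steps` dictionary keyed by pipeline name, and `update_num_nodes` is a hypothetical name. Skipping pipelines that are missing from the configuration is also a choice made for this sketch.

```python
def update_num_nodes(observations_list, pipelines_config):
    """Copy num_nodes from pipelines_config into every matching pipeline step."""
    for observation in observations_list:
        for name, step in observation["pipeline_steps"].items():
            if name in pipelines_config:
                # The step dictionary is mutated in place; nothing is returned.
                step["num_nodes"] = pipelines_config[name]["num_nodes"]


observations = [
    {
        "pipeline_steps": {
            "imaging": {"num_nodes": 1},
            "calibration": {"num_nodes": 1},
        }
    }
]
config = {"imaging": {"num_nodes": 8}}
result = update_num_nodes(observations, config)
```

Because the function mutates the observations in place and returns None, callers who need the original values should copy the list (for example with `copy.deepcopy`) before calling it.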