Running a simulation
Once you have configured the inputs for the simulator (see Configuring the inputs for a simulation), you can run the simulation.
You can run the simulation interactively via a web interface, from the command line, or programmatically from a Python script or Jupyter notebook.
See also the Interpreting simulation outputs section for information on interpreting the logs and dataframes produced by the simulator.
This page covers running a basic simulation. See the Tutorials section for ways to customise the simulation, optimise the hardware configuration, run Monte Carlo simulations and run end-to-end experiments.
Using the web interface
To use the web interface, run the application locally using Poetry:
poetry run resource_model
The app will run on localhost at port 8050 by default: http://127.0.0.1:8050/. Use Ctrl + C in
the terminal to stop the server.
Plots
The web interface provides several interactive example plots on the front-end:
A tree map of scheduling block metrics read from data/config/scheduling_block_types.json.
A Gantt-style chart of scheduled block instances read from data/schedules/observing_schedule.json.
A summary of the total run time.
A line plot of the compute node usage over time.
A line plot of the capacity storage usage (raw visibilities and data products) over time.
A line plot of the performance storage usage (pre-processed visibilities) over time.
A Gantt-style chart showing the start and end of batch processing for each scheduling block instance that triggers it. The start is the time when performance storage has been allocated; the end is when all processing has completed and performance storage has been released.
A strip plot showing individual wait times for each scheduling block instance. This includes waiting for capacity storage (either for raw visibilities or data products), performance storage and compute nodes.
Using the CLI
To use the CLI, run the application locally using Poetry. You can find more detailed usage information by running:
poetry run resource_usage --help
This will output the following help message:
usage: __main__.py [-h] [--observing_schedule_path OBSERVING_SCHEDULE_PATH]
                   [--generate_observing_schedule_hrs GENERATE_OBSERVING_SCHEDULE_HRS]
                   [--hardware_path HARDWARE_PATH] [--hardware HARDWARE]
                   [--scheduling_block_types_path SCHEDULING_BLOCK_TYPES_PATH]
                   [--pipelines_path PIPELINES_PATH]
                   [--num_monte_carlo_iterations NUM_MONTE_CARLO_ITERATIONS]
                   [--num_workers NUM_WORKERS] [--shuffle_observations]
                   [--output_path OUTPUT_PATH] [--verbose] [--debug]

Run the resource usage simulation.

options:
  -h, --help            show this help message and exit
  --observing_schedule_path OBSERVING_SCHEDULE_PATH
                        Path to CSV file containing the observing schedule.
  --generate_observing_schedule_hrs GENERATE_OBSERVING_SCHEDULE_HRS
                        Generate a new observing schedule with the specified
                        number of hours of observations. Samples from the
                        scheduling block types config file and generate
                        datetimes and IDs. Replaces the observing schedule
                        file specified by --observing_schedule_path.
  --hardware_path HARDWARE_PATH
                        Path to JSON file containing hardware configuration.
  --hardware HARDWARE   Name of the hardware configuration to use.
  --scheduling_block_types_path SCHEDULING_BLOCK_TYPES_PATH
                        Path to JSON file containing scheduling block types
                        configuration.
  --pipelines_path PIPELINES_PATH
                        Path to JSON file containing pipelines configuration.
  --num_monte_carlo_iterations NUM_MONTE_CARLO_ITERATIONS
                        Number of Monte Carlo iterations to run.
  --num_workers NUM_WORKERS
                        Number of worker processes to use for running parallel
                        simulations. If `None` or `-1`, a suitable default
                        number of workers will be chosen. For a small number
                        of iterations (`n_iter < 500`), the simulations will
                        be run sequentially on a single worker. For larger
                        simulations, the default value is set to four times
                        the cpu count of the system at import time.
  --shuffle_observations
                        Shuffle the observations list before running the
                        simulation.
  --output_path OUTPUT_PATH
                        Path to output directory.
  --verbose             Print logging to console.
  --debug               Set logging level to DEBUG for detailed logging.
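As an illustration, several of the flags listed in the help message can be combined into a single invocation. The sketch below assembles such a command in Python and prints it. The flag names come from the help message and "50_50" is the configuration name used in the API example on this page; the helper build_command, the iteration count and the output directory are hypothetical values chosen for the example. Actually running the command (for instance with subprocess.run(cmd, check=True)) assumes Poetry and the package are installed in the current project.

```python
import shlex


def build_command(hardware, iterations, output_path, shuffle=False, verbose=False):
    """Assemble a `poetry run resource_usage` invocation (hypothetical helper)."""
    cmd = [
        "poetry", "run", "resource_usage",
        "--hardware", hardware,
        "--num_monte_carlo_iterations", str(iterations),
        "--output_path", output_path,
    ]
    # Boolean flags take no value; include them only when requested.
    if shuffle:
        cmd.append("--shuffle_observations")
    if verbose:
        cmd.append("--verbose")
    return cmd


# Example values are assumptions made for this sketch.
cmd = build_command("50_50", iterations=100, output_path="output", verbose=True)

# Print the command as a single shell-style string for inspection.
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems if a path contains spaces.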
Using the API
The following code snippet demonstrates how to run the simulator in a Python script. This will output two pandas DataFrames containing resource usage and event logs. You can then use these DataFrames to generate plots or perform further analysis.
from ska_sdp_resource_model.simulate.resource_usage import ResourceUsageSimulator
from ska_sdp_resource_model.simulate.process_inputs import process_inputs
# Define a path to your observing schedule JSON file
path_to_observing_schedule = "data/schedules/observing_schedule.json"
# Process the inputs
observations_list, hardware_config_data = process_inputs(
    observing_schedule_path=path_to_observing_schedule
)
# Create an instance of the simulator
simulator = ResourceUsageSimulator()
# Run the simulation
output = simulator.run_simulation(observations_list, hardware_config_data["50_50"])
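To compare several hardware configurations, one option is to call the simulator once per configuration and collect the outputs by name. The sketch below assumes that hardware_config_data behaves like a dictionary keyed by configuration name (as the "50_50" lookup suggests) and that a run_simulation-style callable takes the observations list and one configuration. The helper run_for_each_hardware and the stand-in fake_run are hypothetical and not part of the package; the stand-in exists only so the snippet runs on its own.

```python
def run_for_each_hardware(run_simulation, observations_list, hardware_config_data, names=None):
    """Run one simulation per hardware configuration and return {name: output}."""
    # Default to every configuration present in the mapping.
    selected = list(hardware_config_data) if names is None else list(names)
    return {
        name: run_simulation(observations_list, hardware_config_data[name])
        for name in selected
    }


# Stand-in for simulator.run_simulation so the sketch is self-contained.
def fake_run(observations_list, hardware_config):
    return {"n_observations": len(observations_list), "hardware": hardware_config}


# Placeholder inputs; the configuration name "example_other" is made up.
observations = ["sbi-1", "sbi-2"]
configs = {"50_50": {"label": "a"}, "example_other": {"label": "b"}}

results = run_for_each_hardware(fake_run, observations, configs)
print(sorted(results))
```

With the real package, simulator.run_simulation would take the place of fake_run, and observations and configs would come from process_inputs.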
Using a Jupyter notebook
We have included example Jupyter notebooks in the notebooks directory of the repository.
01-basic-simulation.ipynb demonstrates how to run the simulation for multiple hardware configurations and generate plots to compare the results.