Interpreting simulation outputs

The SDP resource model writes logs to standard output and produces two pandas DataFrames. The logs let you monitor the progress of the simulation; the DataFrames are for analysing its results.

Logs

Logs are written to standard output while the simulation is running and can be redirected to a file. Each log message is prefixed with the simulation time in seconds, the SBI_ID from the observing schedule file (see observing schedule) and, when a pipeline step is running, the pipeline step ID from the pipeline config (see pipeline config). Messages report simulation progress: the start and completion of each scheduling block instance, observation, batch processing run and pipeline step, and each request for and allocation of storage and compute nodes. You will see messages like the following:

0: T001_001: Starting scheduling block instance...
0: T001: Requesting 100 GB of capacity storage for raw visibilities...
0: T001: 100 GB capacity storage allocated for raw visibilities.
0: T001_001: Waiting for telescope...
0: T001_001: Starting observation...
3600: T001_001: Observation complete!
3600: T001_001: Starting batch processing...
3600: T001_001: Requesting 50 GB of performance storage for pre-processed visibilities...
3600: T001_001: 50 GB performance storage allocated for pre-processed visibilities.
3600: T001_001 - Step 1: Requesting 5 compute nodes...
3600: T001_001 - Step 1: 5 compute nodes allocated.
4320: T001_001 - Step 1: 5 compute nodes released.
4320: T001_001 - Step 1: Pipeline completed!
4320: T001_001: Batch processing complete!
4320: T001: Retaining 106 GB of data for 24.0 hours.
91440: T001: Deleted 106 GB of data from capacity storage.
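If you redirect the logs to a file, the prefix structure described above makes them straightforward to parse. The following is a minimal sketch, assuming every line follows the `<time>: <ID>[ - <step>]: <message>` layout seen in the sample messages; the regular expression and the `parse_log_line` helper are illustrative and not part of the SDP resource model.

```python
import re

# Assumed layout, inferred from the sample messages above:
#   <time>: <ID>[ - <step>]: <message>
LOG_LINE = re.compile(
    r"^(?P<time>\d+): (?P<id>[^:\s]+)(?: - (?P<step>[^:]+))?: (?P<message>.*)$"
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["time"] = int(fields["time"])  # simulation time in seconds
    return fields


sample = [
    "0: T001_001: Starting scheduling block instance...",
    "3600: T001_001 - Step 1: Requesting 5 compute nodes...",
    "4320: T001: Retaining 106 GB of data for 24.0 hours.",
]
parsed = [parse_log_line(line) for line in sample]
```

The `step` field is `None` for messages that are not tied to a pipeline step, so it can be used to separate pipeline messages from the rest.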

Dataframes

The SDP resource model produces two pandas DataFrames. They are used to generate the plots in the web interface and are available for further analysis through the API (see API documentation). When the simulation is run via the CLI, they are also written to CSV files.

event_log

This dataframe contains information about events that occurred during the simulation. Each row represents a single event. Columns are as follows:

  • batch_name (string): Scheduling block instance ID from the observing schedule.

  • step (string): The name of the event.

  • start (int): The simulation time (s) at which the event started.

  • end (int): The simulation time (s) at which the event ended.

Events include:

  • observing: The observation time for the telescope.

  • capacity_storage_wait_data_products: The time spent waiting for capacity storage for data products.

  • capacity_storage_wait_raw_visibilities: The time spent waiting for capacity storage for raw visibilities.

  • performance_storage_wait: The time spent waiting for performance storage.

  • {pipeline_name}_compute_wait: The time spent waiting for compute nodes.

  • {pipeline_name}_execution: The time spent executing a pipeline step.

  • {pipeline_name}_total: The time spent on a pipeline step, including wait times.

  • data_retention: The time data is retained in capacity storage after batch processing.

  • batch: The total time spent on batch processing. This is used for the Gantt-style chart in the dashboard.

The following shows example rows you might see in the event log dataframe:

batch_name | step                     | start | end
T001_001   | observing                | 0     | 3600
T001_001   | capacity_storage_wait    | 3600  | 4320
T001_001   | performance_storage_wait | 4320  | 91440
T001_001   | Step 1                   | 4320  | 91440
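Because each row carries a start and an end time, event durations follow directly from the columns described above. The sketch below rebuilds the example rows by hand so that it runs on its own; in practice the DataFrame would come from the simulation. The `duration_s` column and the variable names are assumptions introduced for illustration, not outputs of the model.

```python
import pandas as pd

# Example rows copied from the event log example above.
event_log = pd.DataFrame(
    {
        "batch_name": ["T001_001"] * 4,
        "step": [
            "observing",
            "capacity_storage_wait",
            "performance_storage_wait",
            "Step 1",
        ],
        "start": [0, 3600, 4320, 4320],
        "end": [3600, 4320, 91440, 91440],
    }
)

# Derived column (not produced by the model): event duration in seconds.
event_log["duration_s"] = event_log["end"] - event_log["start"]

# Total time per event name, summed over all batches.
total_by_step = event_log.groupby("step")["duration_s"].sum()

# Only the waiting events, identified here by the "_wait" suffix.
waits = event_log[event_log["step"].str.endswith("_wait")]
```

Grouping by `batch_name` instead of `step` would give the time accounted for per scheduling block instance.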

resource_usage

This dataframe contains information about the resources used during the simulation. Resource usage is logged before and after every change (i.e. a resource is requested or released).

Each row represents a time point in the simulation.

Columns are as follows:

  • time_s (int): The simulation time in seconds (s).

  • compute_nodes_in_use (int): The number of compute nodes in use.

  • capacity_storage_in_use_gb (float): The amount of capacity storage in use (GB).

  • performance_storage_in_use_gb (float): The amount of performance storage in use (GB).

  • time_h (float): The simulation time in hours (h).

  • time_d (float): The simulation time in days (d).

The following shows example rows you might see in the resource usage dataframe:

time_s | compute_nodes_in_use | capacity_storage_in_use_gb | performance_storage_in_use_gb | time_h | time_d
0      | 0                    | 0.0                        | 0.0                           | 0.0    | 0.0
3600   | 5                    | 0.1                        | 0.05                          | 1.0    | 0.04
4320   | 0                    | 0.1                        | 0.05                          | 1.2    | 0.05
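Since usage is recorded at every change, the rows can be read as a step function: each value holds until the next row's time. The sketch below uses that reading to compute peak and time-weighted mean compute node usage from the example rows, and shows how `time_h` and `time_d` relate to `time_s`. The step-function interpretation and all variable names are assumptions made for illustration.

```python
import pandas as pd

# Example rows copied from the resource usage example above (time columns derived below).
resource_usage = pd.DataFrame(
    {
        "time_s": [0, 3600, 4320],
        "compute_nodes_in_use": [0, 5, 0],
        "capacity_storage_in_use_gb": [0.0, 0.1, 0.1],
        "performance_storage_in_use_gb": [0.0, 0.05, 0.05],
    }
)

# The hour and day columns are unit conversions of time_s.
resource_usage["time_h"] = resource_usage["time_s"] / 3600
resource_usage["time_d"] = resource_usage["time_s"] / 86400

# Peak number of compute nodes in use at any recorded time point.
peak_nodes = resource_usage["compute_nodes_in_use"].max()

# Seconds for which each row's values hold (0 for the last row).
interval_s = resource_usage["time_s"].diff().shift(-1).fillna(0)

# Time-weighted mean: node-seconds divided by the total simulated time.
node_seconds = (resource_usage["compute_nodes_in_use"] * interval_s).sum()
mean_nodes = node_seconds / interval_s.sum()
```

Rows that share a timestamp, such as the before and after records of a single change, get a zero-length interval and so do not affect the weighted mean.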