How to run an experiment
========================

This tutorial walks you through running an experiment. An experiment optimises the hardware
configuration and then runs a Monte Carlo simulation on a given set of inputs to estimate the mean
run time, together with its 95% confidence interval.

Ensure you have followed the :doc:`installation instructions <../installation>` and successfully run
the simulation with default settings before continuing.

Read the following sections to ensure you are familiar with the simulation's inputs:

- :doc:`Hardware configuration file <../usage/inputs/configuration/hardware_configuration>`
- :doc:`Scheduling block types configuration file
  <../usage/inputs/configuration/scheduling_block_types_configuration>`
- :doc:`Pipeline configuration file <../usage/inputs/configuration/pipeline_configuration>`
- :doc:`Observing schedule file <../usage/inputs/configuration/observing_schedule>`

Also read the following tutorials to understand how to run the simulation and how to optimise the
hardware configuration:

- :doc:`Running the simulation <../usage/run_simulation>`
- :doc:`Running a hardware optimisation <../tutorials/optimise_hardware_configuration>`

Requirements for running an experiment
--------------------------------------

To run an experiment you will need the following inputs:

1. **An observing schedule you want to simulate (JSON).**

   See :doc:`Observing schedule file </usage/inputs/configuration/observing_schedule>`.

   You will need the path to this file to run the experiment.

2. **Configuration of scheduling block types that make up your observing schedule (JSON).**

   See :doc:`Scheduling block types configuration file
   <../usage/inputs/configuration/scheduling_block_types_configuration>`.

   Every scheduling block type that appears in the observing schedule file must be defined in this
   JSON file with all of its key-value pairs.

   You will need the path to this file to run the experiment.

3. **Configuration for each pipeline (JSON).**

   See :doc:`Pipeline configuration file <../usage/inputs/configuration/pipeline_configuration>` for
   details on the pipeline configuration.

   Every pipeline that appears in the scheduling block types configuration file must be defined in
   this JSON file with all of its key-value pairs.

   You will need the path to this file to run the experiment.

   .. note::

       The hardware optimisation will search for the best ``num_nodes`` parameters to use for each
       pipeline (see :doc:`Running a hardware optimisation
       <../tutorials/optimise_hardware_configuration>`).

   .. note::

       The ``pct_parallelism`` parameter will be sampled from a uniform distribution between
       ``pct_parallelism_min`` and ``pct_parallelism_max`` and the ``node_hours`` parameter will be
       sampled from a zero-truncated normal distribution with a mean of ``node_hours_mean`` and a
       standard deviation of ``node_hours_mean * node_hours_uncertainty``.

   .. note::

       To avoid sampling parameters that are not needed, the pipeline configuration file should only
       contain the pipelines that are used in the scheduling block types configuration file.

4. **Your hardware budget (euros).**

   The optimisation process will optimise the split of your budget between compute nodes, capacity
   storage and performance storage. If you don't supply your own budget, a default value of
   8 million euros will be used.

   .. note::

       The costs of compute nodes, capacity storage and performance storage are currently hardcoded.

5. **The number of optimisation trials you want to run.**

   This is a key parameter in the optimisation process. It determines how many times parameters are
   sampled and a simulation is run.

   Default is 10.

6. **The number of Monte Carlo iterations you want to run.**

   You must set this to a value greater than 0 (the default is 0, which skips the Monte Carlo runs).
   A Monte Carlo simulation is run for each optimisation trial, plus one additional run for the best
   configuration found.

7. **Whether or not to shuffle the order of observations in the observing schedule.**

   This is an optional part of the Monte Carlo simulation. If set, the order of observations is
   shuffled at the start of each Monte Carlo iteration.

   Default is False.

8. **The storage name for the database where the optimisation results will be stored.**

   Default is ``sqlite:///hardware-optimisation.db``.

   If this database already exists then the results will be appended to it. If it does not exist, a
   new database will be created.

   .. note::

       The database will be created in the current working directory.

   .. note::

       The historical values of the optimisation results will be used by the optimisation algorithm
       to guide the sampling of parameter values. If you don't want this behaviour, you can delete
       the database and a new one will be created.

9. **The output directory where the results will be saved.**

   The results will be saved in this directory as a JSON file named
   ``results-YYYYMMDD-hhmmss.json``.
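
The parameter sampling described in the pipeline configuration notes above can be sketched as
follows. This is a minimal illustration using only the Python standard library, not the
simulation's own code: the function names, the use of rejection sampling for the truncation and
the example values are all assumptions made for this sketch.

.. code-block:: python

    import random


    def sample_pct_parallelism(rng, pct_min, pct_max):
        """Draw pct_parallelism uniformly between the configured bounds."""
        return rng.uniform(pct_min, pct_max)


    def sample_node_hours(rng, mean, uncertainty):
        """Draw node_hours from a normal distribution truncated at zero.

        The standard deviation is mean * uncertainty. Truncation is done
        here by rejection sampling (an assumption; the simulation may use
        a different method).
        """
        if mean <= 0:
            raise ValueError("mean must be positive")
        std = mean * uncertainty
        while True:
            value = rng.gauss(mean, std)
            if value >= 0.0:
                # Negative draws are discarded and redrawn.
                return value


    # Hypothetical example values, not taken from a real configuration file.
    rng = random.Random(42)
    pct = sample_pct_parallelism(rng, 0.8, 0.95)
    hours = sample_node_hours(rng, 100.0, 0.2)
    print(f"pct_parallelism={pct:.3f}, node_hours={hours:.1f}")

With a small ``node_hours_uncertainty`` almost no draws are rejected, so the truncation only
matters when the standard deviation is large relative to the mean.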

Running the experiment
----------------------

Once you have the required inputs above, you can run the experiment using the CLI:

.. code-block:: bash

    poetry run run_experiment \
        --observing_schedule_path "$OBSERVING_SCHEDULE_PATH" \
        --scheduling_block_types_path "$SCHEDULING_BLOCK_TYPES_PATH" \
        --pipelines_path "$PIPELINES_PATH" \
        --storage "$STORAGE_PATH" \
        --output_dir "$OUTPUT_DIR" \
        --n_trials "$N_TRIALS" \
        --n_iter "$N_ITER" \
        --shuffle_observation_order

Get more details on each of the CLI arguments by running:

.. code-block:: bash

    poetry run run_experiment --help
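
If you prefer to launch the experiment from Python, for example to loop over several schedules,
the same call can be assembled as an argument list. The sketch below is only an illustration:
it uses just the flags shown in the command above, the helper names and file names are
hypothetical, and the iteration estimate assumes that each of the ``n_trials + 1`` Monte Carlo
simulations runs exactly ``n_iter`` iterations.

.. code-block:: python

    import shlex


    def build_experiment_command(
        observing_schedule_path,
        scheduling_block_types_path,
        pipelines_path,
        storage,
        output_dir,
        n_trials,
        n_iter,
        shuffle_observation_order=False,
    ):
        """Assemble the run_experiment call as an argument list."""
        command = [
            "poetry", "run", "run_experiment",
            "--observing_schedule_path", str(observing_schedule_path),
            "--scheduling_block_types_path", str(scheduling_block_types_path),
            "--pipelines_path", str(pipelines_path),
            "--storage", str(storage),
            "--output_dir", str(output_dir),
            "--n_trials", str(n_trials),
            "--n_iter", str(n_iter),
        ]
        if shuffle_observation_order:
            # Flag without a value: include it only when shuffling is wanted.
            command.append("--shuffle_observation_order")
        return command


    def monte_carlo_iterations(n_trials, n_iter):
        """Estimate the total number of Monte Carlo iterations.

        Assumes one Monte Carlo simulation per optimisation trial plus one
        for the best configuration, each running n_iter iterations.
        """
        return (n_trials + 1) * n_iter


    # Hypothetical file names; replace them with your own inputs.
    command = build_experiment_command(
        "observing_schedule.json",
        "scheduling_block_types.json",
        "pipelines.json",
        "sqlite:///hardware-optimisation.db",
        "results",
        n_trials=10,
        n_iter=100,
        shuffle_observation_order=True,
    )
    print(shlex.join(command))
    print(monte_carlo_iterations(10, 100))

Passing the list to ``subprocess.run(command, check=True)`` would then start the experiment.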

Interpreting the results
~~~~~~~~~~~~~~~~~~~~~~~~

The results file in the output directory will contain the following:

- Summary statistics from the final Monte Carlo simulation using the optimised hardware
  configuration:

  - ``runtime_mean_days``: The mean run time, in days.
  - ``runtime_min_days``: The minimum run time, in days.
  - ``runtime_max_days``: The maximum run time, in days.
  - ``runtime_95_ci_days``: The 95% confidence interval for the mean run time, in days.

- Configuration found by the optimisation step:

  - ``hardware_config``: The optimised hardware configuration found by the optimisation
    algorithm and used to generate the results.
  - ``pipelines_config``: The pipelines configuration with the optimised ``num_nodes`` for each
    pipeline. Note that the Monte Carlo simulation will have sampled different values for the
    ``pct_parallelism`` and ``node_hours`` parameters.

- Configuration of the experiment:

  - ``args``: The command-line arguments used to run the experiment.
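
As a starting point for post-processing, the sketch below finds the most recent results file in
an output directory and extracts the run time statistics. It is not part of the package: the
function names are hypothetical, and it assumes that the statistics are stored as top-level keys
of the JSON file, which you should check against your own results.

.. code-block:: python

    import json
    from pathlib import Path

    RUNTIME_KEYS = [
        "runtime_mean_days",
        "runtime_min_days",
        "runtime_max_days",
        "runtime_95_ci_days",
    ]


    def latest_results_file(output_dir):
        """Return the newest results file, or None if there is none.

        File names embed a YYYYMMDD-hhmmss timestamp, so sorting the
        names alphabetically also sorts them chronologically.
        """
        candidates = sorted(Path(output_dir).glob("results-*.json"))
        return candidates[-1] if candidates else None


    def summarise(results):
        """Pick out the run time statistics (missing keys become None)."""
        return {key: results.get(key) for key in RUNTIME_KEYS}


    def load_summary(output_dir):
        """Load the newest results file and return its run time statistics."""
        path = latest_results_file(output_dir)
        if path is None:
            raise FileNotFoundError(f"No results files found in {output_dir}")
        with path.open() as handle:
            return summarise(json.load(handle))

Calling ``load_summary`` with the directory you passed as ``--output_dir`` returns a dictionary
containing the four ``runtime_*`` values.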

Running an experiment using SLURM
---------------------------------

You can also run the experiment using SLURM, which is useful if you want to run it on a cluster.
An example SLURM script is included in the ``scripts`` directory of the repository. To use it,
clone the repository, set up a Poetry environment and modify the variables used in the script.

Running an experiment using Jupyter notebooks
---------------------------------------------

We have included example experiments that use Jupyter notebooks in the `notebooks directory
<https://gitlab.com/ska-telescope/sdp/ska-sdp-resource-model/-/tree/main/notebooks/experiments>`_ of
the repository. These notebooks are useful for inspecting the detailed outputs of each simulation
and the results of the optimisation.