Script Development

When developing a new script, the following steps need to be taken:

  • generate script files

  • update script code and write tests

  • update processing block parameter model and export it into JSON

  • move the JSON schema into the tmdata directory to the correct location

  • test your script

  • update the documentation and set up automated testing

  • set up the script for a release

These steps are described below. Development happens on a branch of the ska-sdp-script repository, and changes are merged via reviewed merge requests.

Script Generator

The ska-sdp-script repository provides a Python script that generates all the files necessary for a processing script to work. It uses templates for batch and real-time scripts, which you will need to modify to meet your needs. It automatically creates the directory for the new script in src and adds all of the supporting files as well. The files it currently generates within the src/ska-sdp-script-<my-script> directory (replace <my-script> with your script’s name) are:

  • <my-script>.py: the processing script

  • <my-script>_params.py: pydantic model for processing block parameters

  • .release

  • Dockerfile

  • Makefile

  • pyproject.toml

  • README.md

  • CHANGELOG.md

  • skart.toml

It also creates an empty directory in tmdata/ska-sdp/scripts for the processing block JSON schema (see below). Finally, it adds an .rst file to docs/src that points to the new script’s README, CHANGELOG and pydantic model. You will have to update docs/src/script.rst (or move the file and update docs/src/test-script.rst if you created a test script) to include the new .rst file, otherwise it will not appear in the documentation.
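To see how this is done, follow the pattern used for the existing scripts in docs/src/script.rst; typically this is a Sphinx toctree entry along these lines (the exact options used in the repository may differ):

.. toctree::
   :maxdepth: 1

   my-script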

Once generated, review and update the files as needed, paying particular attention to the script itself, its _params.py file, and its dependencies.

To run the script generator:

python scripts/templates/create_script.py <kind> <name>

where <kind> is either realtime or batch and <name> is the name of the new script.

For example:

python scripts/templates/create_script.py realtime my-realtime

Usage

usage: create_script.py [-h] kind name

Create source files for real-time or batch script.

positional arguments:
  kind        Kind of script (realtime or batch)
  name        Name of the script to be created, e.g. test-pipeline

options:
  -h, --help  show this help message and exit

The script generator does not create test files; however, we recommend adding unit tests as you develop your script.
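A minimal test for the generated parameter model might look like the sketch below; the module name my_script_params, the class MyScriptParams and its fields are hypothetical placeholders for whatever your generated model defines:

# tests/test_params.py - a sketch only; adapt names and fields
import pytest
from pydantic import ValidationError

from my_script_params import MyScriptParams  # hypothetical module and class


def test_defaults_are_applied():
    # only the required parameter is supplied; defaults fill in the rest
    params = MyScriptParams(n_channels=64)
    assert params.duration == 60.0


def test_invalid_parameters_are_rejected():
    # pydantic rejects values it cannot coerce to the declared type
    with pytest.raises(ValidationError):
        MyScriptParams(n_channels="not-a-number")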

You can also create the files manually, but make sure you create all of the necessary files. To understand what you need, look at one of the example scripts: Real-time script (Test Real-Time Script) and Batch script (Test Batch Script). These are meant to give you a general idea of the structure real-time and batch scripts should have, and to help you develop your own.

A list of the available Helm charts that can be used for the execution engines deployed by scripts, together with their documentation, can be found at SDP Helm Deployer Charts.

Processing block parameters and JSON schema

Processing scripts are configured via processing block parameters in the AssignResources configuration string (see example).
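For illustration, the parameters section of a processing block might look like this (the parameter names here are hypothetical placeholders):

"parameters": {
    "n_channels": 64,
    "duration": 60.0
}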

It is important that we document these parameters so that users know how to use the script. Ideally, parameters would take sensible defaults where possible and users would only need to provide essential information.

The parameters are documented using pydantic models, which are found in the <my-script>_params.py Python file. This file is also generated by the script generator, but you will need to update it to match the parameters your script uses, and to document each field clearly. The model is used internally by the processing script to validate its incoming parameters.
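As a sketch, a model for a hypothetical script with one required and one defaulted parameter could look like this (the class name and fields are placeholders, not what the generator produces):

# <my-script>_params.py - hypothetical example fields
from pydantic import BaseModel, Field


class MyScriptParams(BaseModel):
    """Processing block parameters for my-script."""

    n_channels: int = Field(..., description="Number of frequency channels")
    duration: float = Field(60.0, description="Observation duration in seconds")

Validating incoming parameters then amounts to instantiating the model, e.g. MyScriptParams(**parameters), which raises a pydantic ValidationError if the input does not match.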

Next, generate the JSON schema from the pydantic model by running

scripts/export_schemas.py <name> <version>

where <name> is the name of the script you are exporting, e.g. my-test-script, and <version> is the version of the JSON schema, an integer starting at 1 for the first version.

For a new script, export_schemas.py itself needs to be updated: include the new script's name in the ALLOWED_SCRIPT_NAMES global variable, as sketched below.
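The update is a one-line addition along these lines (the surrounding entries and the exact data structure in the file may differ):

# scripts/export_schemas.py
ALLOWED_SCRIPT_NAMES = [
    # ... existing script names ...
    "my-script",  # add the new script here
]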

You may omit the version number, in which case the next available version is used. Invoking export_schemas.py without any arguments updates the schemas for all available processing scripts sequentially (use this option with care, since most use cases require updating only a single schema).

The updated schema file will appear in the tmdata/ska-sdp/scripts/<my-script> directory; make sure you add it to git.

The JSON files are used by external users to understand what parameters a specific version of a processing script takes and by the SDP subarray for high-level validation of the AssignResources configuration string.
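For the hypothetical model sketched above, the exported file would be a standard JSON Schema document, roughly of this shape:

{
    "title": "MyScriptParams",
    "description": "Processing block parameters for my-script.",
    "type": "object",
    "properties": {
        "n_channels": {
            "type": "integer",
            "description": "Number of frequency channels"
        },
        "duration": {
            "type": "number",
            "description": "Observation duration in seconds",
            "default": 60.0
        }
    },
    "required": ["n_channels"]
}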

Test the script locally

Build the script image. If you are using minikube to deploy the SDP, run from the directory of your script (replace <my-script> with the script name):

eval $(minikube -p minikube docker-env)
docker build -t <my-script>-test .

Otherwise, just run the docker build command. Either way, this adds the image to your minikube or local Docker daemon, where it can be used for testing with a local deployment of the SDP.

Deploy the SDP locally and start a shell in the console pod.

Add the new script to the configuration DB. This will tell the SDP where to find the Docker image to run the script:

ska-sdp create script <kind>:<name>:<version> '{"image": "<docker-image:version>"}'

where the values are:

  • <kind>: batch or realtime, depending on the kind of script

  • <name>: name of your script

  • <version>: version of your script

  • <docker-image:version>: Docker image you just built from your script, including its version tag.

If you have multiple scripts to add, you can import the definitions with:

ska-sdp import scripts my-scripts.yaml

An example file for importing scripts can be found at: Example Script Definitions

To run the script, create a processing block, either via the Tango interface or directly in the configuration DB with ska-sdp create pb.
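For example (the exact argument syntax may vary between ska-sdp versions, so check ska-sdp create --help first):

ska-sdp create pb <kind>:<name>:<version>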

Finishing touches and release

It is important that your processing script is well documented. You can do this in the README.md file generated by the script generator. Make sure you provide a changelog as well in CHANGELOG.md. Your documentation should reflect the current status of the script. If there are features that are not yet developed but that you want to mention, make it clear that they are future development and not yet part of the script.

Make sure you add the reference to the <my-script>.rst file in docs/src/script.rst (or move the file and update docs/src/test-script.rst if you created a test script), otherwise it will not appear in the documentation. It is not necessary to update the table containing the list of available scripts, since that is done automatically during the documentation build by reading the data from the script definition file scripts.yaml in tmdata/ska-sdp/scripts.

If you developed unit tests, automate them in .gitlab-ci.yml (replace <my-script> with your script’s name):

<my-script>-test:
  extends: .test
  script:
    - cd src/ska-sdp-script-<my-script>
    - poetry install
    - pytest -vv tests

When you are ready to make a new release, update the version number in:

  • .release

  • pyproject.toml

  • CHANGELOG.md (make sure all items are added)

Add the script to the script definition file scripts.yaml in tmdata/ska-sdp/scripts. By default the SDP uses this file to populate the script definitions in the configuration DB when it starts up. The definition needs to contain the following information (make sure you replace the placeholders):

- kind: <realtime | batch>
  name: <script-name>
  version: <script-version>
  image: artefact.skao.int/ska-sdp-script-<script-name>:<script-version>
  sdp_version: "<version of SDP the script works with>"
  schema: <script-name>/<script-name>-params-<schema-version>.json

sdp_version: this is the version (or range of versions) of the SDP that the script was tested with and works with. Specify the version using comparison operators. For example:

  • ">=0.21.0": compatible with versions starting from 0.21.0 (including 0.21.0)

  • "<0.22.0": compatible with versions less than 0.22.0 (excluding 0.22.0)

  • "==0.23.0": only compatible with version 0.23.0

  • ">=0.19.0, <0.22.0": compatible with 0.19.0 and above but less than 0.22.0

schema refers to the processing block JSON schema you created in the previous steps and moved to tmdata.