Delivery workflow

This workflow provides a basic implementation of an SDP Delivery mechanism. It uploads data from SDP buffer reservations to Google Cloud Platform (GCP), using Dask as the execution engine.

Parameters

The workflow parameters are:

  • bucket: name of the GCP storage bucket to which the data will be uploaded

  • buffers: list of buffers to upload to the storage bucket, each containing:

    • name: name of the buffer reservation

    • destination: location in the bucket to upload it to

  • service_account: location of the GCP service account key (stored in a Kubernetes secret)

    • secret: name of the secret

    • file: filename of the service account key

  • n_workers: number of Dask workers to deploy

For example:

{
  "bucket": "delivery-test",
  "buffers": [
    {
      "name": "buff-pb-20200523-00000-test",
      "destination": "buff-pb-20200523-00000-test"
    }
  ],
  "service_account": {
    "secret": "delivery-gcp-service-account",
    "file": "service-account.json"
  },
  "n_workers": 1
}

Running the workflow using the ska-sdp CLI

After SDP has been deployed in Minikube and you have started the sdp-console, run:

ska-sdp create pb batch:delivery:0.1.0 '<parameters-json>'

Replace <parameters-json> with the JSON string above, adjusted to the appropriate values. Once the command has been executed, a processing block pod running the delivery workflow will be created in the sdp namespace. (See ska-sdp CLI usage.)
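For example, you can save the parameters in a file (here called params.json, a name of your choosing) and let the shell substitute the string:

ska-sdp create pb batch:delivery:0.1.0 "$(cat params.json)"

To check that the processing block pod has started, list the pods in the namespace:

kubectl get pods -n sdp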

Note that each workflow may come in multiple versions. Always use the latest one, unless you know that a specific version suits your needs. (The Changelog at the end of this page may help you decide.)

Creating a GCP storage bucket to receive the data

The steps to create a GCP storage bucket for the delivery workflow are as follows. GCP has ample documentation, so each step is linked to the relevant section (a sketch of the equivalent gcloud commands follows the list):

  1. Create a project.

  2. Create a storage bucket in the project.

  3. Create a service account and download a key:

     • The service account must have the role "Storage Object Creator".

     • Create and download a key in JSON format.
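If you prefer the command line to the Cloud Console, the steps above can be sketched with the gcloud and gsutil CLIs. This is only a sketch: the project ID my-sdp-project and the service account name sdp-delivery are placeholder names, and the bucket name is taken from the example parameters above.

# 1. Create a project
gcloud projects create my-sdp-project

# 2. Create a storage bucket in the project
gsutil mb -p my-sdp-project gs://delivery-test

# 3. Create a service account with the "Storage Object Creator" role,
#    then create and download a key in JSON format
gcloud iam service-accounts create sdp-delivery --project my-sdp-project
gcloud projects add-iam-policy-binding my-sdp-project \
    --member serviceAccount:sdp-delivery@my-sdp-project.iam.gserviceaccount.com \
    --role roles/storage.objectCreator
gcloud iam service-accounts keys create service-account.json \
    --iam-account sdp-delivery@my-sdp-project.iam.gserviceaccount.com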

Adding the GCP service account key as a Kubernetes secret

To make the service account key available to the delivery workflow, it needs to be uploaded to the cluster as a Kubernetes secret. The command to do this is:

kubectl create secret generic <secret-name> --from-file=<service-account-key> -n <sdp-namespace>

Using the values from the example parameters above and assuming the namespace for the SDP dynamic deployments is sdp (the default), the command would be:

kubectl create secret generic delivery-gcp-service-account --from-file=service-account.json -n sdp

To check that the secret has been created, you can use the command:

kubectl describe secret delivery-gcp-service-account -n sdp

and the output should look like:

Name:         delivery-gcp-service-account
Namespace:    sdp
Labels:       <none>
Annotations:  <none>

Type:  Opaque

Data
====
service-account.json:  2382 bytes
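
With the bucket and secret in place, and the workflow run as described above, you can check that the buffers have arrived in the bucket, for example with gsutil (using the bucket and destination from the example parameters):

gsutil ls gs://delivery-test/buff-pb-20200523-00000-test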

Changelog

0.1.1

  • Use python:3.9-slim as the base Docker image