Delivery workflow
This workflow provides a basic implementation of an SDP Delivery mechanism. It uploads data from SDP buffer reservations to Google Cloud Platform (GCP). It uses Dask as an execution engine.
Parameters
The workflow parameters are:
bucket
: name of the GCP storage bucket in which to upload the data
buffers
: list of buffers to upload to the storage bucket, each contains
    name
    : name of the buffer reservation
    destination
    : location to upload it in the bucket
service_account
: location of the GCP service account key (stored in a Kubernetes secret)
    secret
    : name of the secret
    file
    : filename of the service account key
n_workers
: number of Dask workers to deploy
For example:
{
"bucket": "delivery-test",
"buffers": [
{
"name": "buff-pb-20200523-00000-test",
"destination": "buff-pb-20200523-00000-test"
}
],
"service_account": {
"secret": "delivery-gcp-service-account",
"file": "service-account.json"
},
"n_workers": 1
}
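Before submitting, it can help to check that a parameter set has the shape described above. The following is a minimal sketch, not part of the workflow: `validate_delivery_params` is a hypothetical helper, and it assumes that all four parameters are required and that `buffers` must not be empty, which this page does not state.

```python
def validate_delivery_params(params):
    """Return a list of problems found in a delivery parameters dict.

    Hypothetical helper: assumes every parameter is required.
    """
    problems = []
    bucket = params.get("bucket")
    if not isinstance(bucket, str) or not bucket:
        problems.append("bucket must be a non-empty string")

    buffers = params.get("buffers")
    if not isinstance(buffers, list) or not buffers:
        problems.append("buffers must be a non-empty list")
    else:
        for i, buf in enumerate(buffers):
            for key in ("name", "destination"):
                if not isinstance(buf, dict) or not isinstance(buf.get(key), str):
                    problems.append(f"buffers[{i}].{key} must be a string")

    account = params.get("service_account")
    for key in ("secret", "file"):
        if not isinstance(account, dict) or not isinstance(account.get(key), str):
            problems.append(f"service_account.{key} must be a string")

    n_workers = params.get("n_workers")
    # bool is a subclass of int, so exclude it explicitly
    if not isinstance(n_workers, int) or isinstance(n_workers, bool) or n_workers < 1:
        problems.append("n_workers must be a positive integer")
    return problems


# The example parameters from this page
EXAMPLE_PARAMS = {
    "bucket": "delivery-test",
    "buffers": [
        {
            "name": "buff-pb-20200523-00000-test",
            "destination": "buff-pb-20200523-00000-test",
        }
    ],
    "service_account": {
        "secret": "delivery-gcp-service-account",
        "file": "service-account.json",
    },
    "n_workers": 1,
}
```

An empty list returned by `validate_delivery_params(EXAMPLE_PARAMS)` means no problems were found under these assumptions.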
Running the workflow using the ska-sdp CLI
After SDP has been deployed in Minikube and you have started the sdp-console, run:
ska-sdp create pb batch:delivery:0.1.0 '<parameters-json>'
Replace <parameters-json> with the above string and the appropriate values. Once executed, a processing block pod will be created in the sdp namespace, which will run the delivery workflow.
(See ska-sdp CLI usage.)
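Quoting a JSON string by hand on the command line is error-prone. As a rough sketch, the command above can be assembled in Python from a parameters dict. `build_create_pb_command` is a hypothetical helper, and the default workflow identifier is simply the one used in the command above. It only builds the command string and does not run it.

```python
import json
import shlex


def build_create_pb_command(params, workflow="batch:delivery:0.1.0"):
    """Return a shell-safe 'ska-sdp create pb' command line (hypothetical helper)."""
    # Compact separators keep the JSON argument on one line without extra spaces
    params_json = json.dumps(params, separators=(",", ":"))
    # shlex.join quotes each argument so the JSON survives the shell intact
    return shlex.join(["ska-sdp", "create", "pb", workflow, params_json])
```

The resulting string wraps the JSON in single quotes, matching the form of the command shown above.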
Note that each workflow may come with multiple versions. Always use the latest version, unless you know a specific version that suits your needs. (The Changelog at the end of this page may help you decide.)
Creating a GCP storage bucket to receive the data
The steps to create a GCP storage bucket for the delivery workflow are as follows. GCP has extensive documentation, so each step is linked to the relevant section:
* The service account must have the role "Storage Object Creator".
* Create and download a key in JSON format.
Adding the GCP service account key as a Kubernetes secret
To make the service account key available to the delivery workflow, it needs to be uploaded to the cluster as a Kubernetes secret. The command to do this is:
kubectl create secret generic <secret-name> --from-file=<service-account-key> -n <sdp-namespace>
Using the values from the example parameters above and assuming the namespace for the SDP dynamic deployments is sdp (the default), the command would be:
kubectl create secret generic delivery-gcp-service-account --from-file=service-account.json -n sdp
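With `--from-file=<path>`, kubectl uses the base name of the file as the key inside the secret, so a downloaded key with a different file name would not match the `file` parameter. The sketch below builds the command from the workflow parameters using kubectl's `--from-file=<key>=<path>` form to pin the key name. `build_create_secret_command` is a hypothetical helper, and the assumption that the workflow looks up the key under the `file` name is inferred from the parameter description, not confirmed here.

```python
import shlex


def build_create_secret_command(params, key_path, namespace="sdp"):
    """Return a 'kubectl create secret' command line (hypothetical helper)."""
    account = params["service_account"]
    # key=path form: the secret key is taken from the "file" parameter,
    # regardless of what the downloaded key file is called locally
    from_file = f"--from-file={account['file']}={key_path}"
    return shlex.join(
        ["kubectl", "create", "secret", "generic", account["secret"],
         from_file, "-n", namespace]
    )
```

When the local file is already named as in the `file` parameter, this is equivalent to the shorter command above.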
To check the secret has been created, you can use the command:
kubectl describe secret delivery-gcp-service-account -n sdp
and the output should look like:
Name: delivery-gcp-service-account
Namespace: sdp
Labels: <none>
Annotations: <none>
Type: Opaque
Data
====
service-account.json: 2382 bytes
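The entry under Data should match the `file` value in the workflow parameters. As an illustrative sketch, the keys can be extracted from the describe output and compared. `secret_data_keys` is a hypothetical helper that assumes the output layout shown above.

```python
def secret_data_keys(describe_output):
    """Return the key names listed in the Data section of
    'kubectl describe secret' output (hypothetical helper)."""
    keys = []
    in_data = False
    for line in describe_output.splitlines():
        stripped = line.strip()
        if stripped == "Data":
            in_data = True
            continue
        # Skip everything before Data, blank lines, and the ==== underline
        if not in_data or not stripped or set(stripped) == {"="}:
            continue
        key, sep, _size = stripped.rpartition(":")
        if sep:
            keys.append(key.strip())
    return keys
```

For the output above, checking that `"service-account.json"` is among the returned keys confirms that the secret holds the expected file.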
Changelog
0.1.1
use python:3.9-slim as the base docker image