How to integrate ska-tango-event-monitor with your CI pipelines

Currently, this utility is run manually in shell sessions on pods deployed in both local and ITF clusters.

Multiple Tango device servers can be monitored at once, and it is recommended to monitor them all with a single monitoring process so that they are sampled at the same time. This makes it easier to correlate the event system query data between devices.
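
As a rough sketch of what "a single monitoring process" means in practice, the snippet below assembles one command line covering several devices and prints it without running it. The binary path and flags are copied from the CI job further down this page; the device names and the build_monitor_cmd helper are hypothetical and only serve the illustration.

```shell
#!/bin/sh
# Hypothetical device names; replace with the devices deployed in your namespace.
DEVICE_NAMES="foo/bar/1 foo/bar/2"

# Print (dry run, nothing is executed) one monitor invocation for all devices.
# $1: space-separated list of device names
# $2: output file for the captured events
build_monitor_cmd() {
    devices=$1
    output=$2
    echo "/app/bin/ska-tango-event-monitor $devices --monitor-perf --append --output $output"
}

build_monitor_cmd "$DEVICE_NAMES" zmq-events.json.xz
```

Passing every device name to the same invocation, rather than starting one process per device, is what keeps the samples aligned in time.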

Each device must be running against cppTango/pytango 10.1.0 or later.
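
A minimal sketch of how this version requirement could be checked from a shell before starting the monitor. The version_ge helper and the FOUND_VERSION value are hypothetical, and the comparison assumes a sort that supports -V (GNU coreutils). How you obtain the version of a running device server depends on your deployment and is not covered here.

```shell
#!/bin/sh
# Minimum version stated in this guide.
MIN_VERSION="10.1.0"

# Succeed when version $1 is greater than or equal to version $2.
# Uses version sort: if $2 sorts first (or both are equal), $1 is new enough.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical value. For pytango it might come from something like
#   python3 -c 'import tango; print(tango.__version__)'
# run inside the pod that hosts the device server (an assumption, adjust as needed).
FOUND_VERSION="10.1.0"

if version_ge "$FOUND_VERSION" "$MIN_VERSION"; then
    echo "version $FOUND_VERSION is new enough"
else
    echo "version $FOUND_VERSION is older than $MIN_VERSION"
fi
```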

To streamline usage, we propose adding a dedicated job to the test stage. This job will:

  • Stream summarized event processing information during test execution.

  • Capture events transmitted over the ZMQ wire and save them to a file called zmq-events.json.

This file will contain detailed event data, including device (server) publications, client subscriptions to attributes, and their respective callback counts.

To further simplify management, a make target will be integrated into the ska-cicd-makefile. In the interim, the pipeline structure from this example pipeline can be replicated. For the integration environment, wait for the make target.

Steps to set up

  • Update or add a Dockerfile that installs ska-tango-event-monitor:

# ./build-with-custom-whl/Dockerfile
RUN pip install ska-tango-event-monitor --extra-index-url https://artefact.skao.int/repository/pypi-internal/simple
  • Point your docker build command to the new Dockerfile:

# ./Makefile
OCI_IMAGE_FILE_PATH = build-with-custom-whl/Dockerfile
  • Add jobs to stream and store the events:

# .gitlab-ci.yml
stream-processed-zmq-events:
  tags:
    - ${SKA_K8S_RUNNER}
  variables:
    KUBE_NAMESPACE: 'ci-$CI_PROJECT_NAME-$CI_COMMIT_SHORT_SHA'
    TARGET_POD_NAME: <the-pod-to-run-the-sampling-script>
    DEVICE_NAMES: "<foo/bar/1> <foo/bar/2> ..."
  allow_failure: true
  when: always
  stage: test
  script:
    - git clone https://gitlab.com/ska-telescope/sdi/ska-cicd-makefile.git
    - cd ska-cicd-makefile
    - KUBE_APP=<k8s-app-label-name> make k8s-wait
    - echo "Starting ZMQ event monitoring on $DEVICE_NAMES in namespace $KUBE_NAMESPACE"
    - kubectl exec -i $TARGET_POD_NAME -n $KUBE_NAMESPACE -- sudo touch zmq-events.json.xz
    - kubectl exec -i $TARGET_POD_NAME -n $KUBE_NAMESPACE -- sudo chown tango zmq-events.json.xz
    - kubectl exec -i $TARGET_POD_NAME -n $KUBE_NAMESPACE -- /app/bin/ska-tango-event-monitor $DEVICE_NAMES --monitor-perf --append --output zmq-events.json.xz

stop-streaming-and-store-zmq-events:
  tags:
    - ${SKA_K8S_RUNNER}
  variables:
    KUBE_NAMESPACE: 'ci-$CI_PROJECT_NAME-$CI_COMMIT_SHORT_SHA'
    TARGET_POD_NAME: <the-pod-to-run-the-sampling-script>
  stage: test
  when: always
  allow_failure: true
  script:
    - echo "Test run has completed, collecting events recorded"
    - mkdir -p build
    - kubectl exec -i $TARGET_POD_NAME -n $KUBE_NAMESPACE -- cat zmq-events.json.xz >> build/zmq-events.json.xz
    - kubectl exec -i $TARGET_POD_NAME -n $KUBE_NAMESPACE -- pkill -f "/app/bin/ska-tango-event-monitor"
  needs:
    - k8s-test-runner
  artifacts:
    name: "$CI_JOB_NAME-$CI_JOB_ID-recorded-events"
    paths:
      - build/
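
Once the artifact has been downloaded, the capture can be inspected locally. The sketch below is one possible way to preview it; it assumes the file is xz-compressed text, as the .xz suffix suggests, and that the xz command-line tool is installed. The preview_events helper and the local path are hypothetical, and no assumption is made about the structure of the event records themselves.

```shell
#!/bin/sh
# Print the first N lines (default 20) of an xz-compressed events file.
# $1: path to the compressed capture
# $2: number of lines to show (optional)
preview_events() {
    file=$1
    lines=${2:-20}
    # A capture copied while the monitor was still writing may end abruptly.
    # xz typically still emits what it could decode, so its errors are silenced.
    xz -dc "$file" 2>/dev/null | head -n "$lines"
}

# Hypothetical local path: where the downloaded artifact was unpacked.
EVENTS_FILE="build/zmq-events.json.xz"

if [ -f "$EVENTS_FILE" ]; then
    preview_events "$EVENTS_FILE" 5
else
    echo "no capture found at $EVENTS_FILE"
fi
```

Running `xz -t build/zmq-events.json.xz` first is a quick way to see whether the compressed stream is complete.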