ska-sw-integration-testing-badger

This repository contains the deployments and tests for software integration testing of the SKA Mid and SKA Low telescopes.

Repository Content Overview

This repository includes:

  • Kubernetes charts for deploying SKA Mid and SKA Low telescopes for software-only integration testing.

  • A standard-ish Makefile to manage deployments and tests, supporting the differences between SKA Mid and SKA Low.

  • A standard-ish GitLab CI/CD pipeline to manage deployments and tests.

    • Location: .gitlab-ci.yml as the entry point, .gitlab/ for specific, slightly customised jobs for the two telescopes.

  • A set of system-level tests, written using pytest and Gherkin syntax (via pytest-bdd), to validate the deployments.

  • Automatic publishing of test results to Jira Xray using the ska-ser-xray library.

  • Automatic publishing of test results to a MariaDB database for comprehensive test data collection, analysis, and reporting across multiple pipeline runs and repositories.

NOTE: When we say “standard-ish” we mean standard as per the current SKA DevOps practices at the time of setting up this repository in PI28, inspired by repositories such as:

Below is a more detailed overview of each of these components.

Charts Overview

Currently, there are two main charts: one for SKA Mid and one for SKA Low:

Both charts essentially consist of the following set of sub-charts:

  • Common Tango base charts and related utilities, such as:

    • ska-tango-base

    • ska-tango-util

    • ska-tango-tangogql

    • ska-ser-skuid

    • Taranta-related charts

    • Archiver-related charts

  • Common or similar SKA Mid and Low sub-system charts that include:

    • TMC

    • SDP

    • CSP.LMC

    • CBF

  • Mid and Low specific sub-system charts that include:

    • For Mid: Dish LMCs

    • For Low: MCCS

Generally speaking, for Mid we are currently working with just one subarray, while for Low we are working with two subarrays. For Mid, at present, four dishes are used. This may change in future as we expand the tests and the deployed SUT.

These charts and value files are partially inherited from the repositories:

To make them work in this deployment and to have initial green pipelines as a baseline, we had to make a few modifications and some temporary patches, such as disabling certain non-essential faulty components. Also, we cannot yet claim to have full ownership or understanding of all the configuration options and settings used in these charts, so further work is required:

  1. Determine exactly what we consider the SUT (System Under Test) for our software integration testing purposes.

  2. Identify which extra components and services are needed for testing purposes (even if they are not strictly part of the SUT).

  3. Understand, clean up, and rationalise the chart and value file configurations to reflect the above two points.

  4. Clearly document the choices made and the reasoning behind them.

Also, a protocol to control and manage changes to these charts and values needs to be established. We have not yet decided whether to version the charts in this repository or keep them in separate repositories, as both approaches have advantages and disadvantages. Separate repositories may help to isolate chart changes and, ideally, to validate different versions of the SUT against different versions of the tests. However, they also introduce the practical complexity of managing multiple repositories, ambiguity about where to place particular configurations, and difficulty in debugging when both the tests and the charts need to change together.

Makefile Structure for Mid and Low

The standard Makefile is the entry point for commands to deploy and test. It contains all the telescope-agnostic logic and structure, while the telescope-specific details are delegated to the Makefile-mid.mk and Makefile-low.mk files.

The standard-ish part is essentially what we import from:

  • .make/k8s.mk

  • .make/helm.mk

  • .make/python.mk

  • .make/raw.mk

  • .make/base.mk

  • .make/release.mk

OCI image building is needed only by repositories whose source code defines new Tango devices or subsystems to build images from. This repository has no such source code (it only contains already built charts and tests), so we do not include .make/oci.mk, nor do we have a Dockerfile or any image build logic. Instead, we build a Helm chart, which is deployed to Kubernetes without building new container images.

The most relevant customisations are the following:

  • Through a TELESCOPE variable, we select the target, which can be either SKA-low or SKA-mid (defaulting to SKA-low). According to this variable’s value, we include the relevant telescope-specific Makefile-*.mk file (which can point to either Makefile-low.mk or Makefile-mid.mk).

    You can set this variable in one of the following ways:

    • From the command line when calling make, e.g.,

      make ... TELESCOPE=SKA-mid
      
    • By exporting it in your shell environment before calling make, e.g.,

      export TELESCOPE=SKA-mid
      make ...
      
    • By setting it in the GitLab CI/CD pipeline jobs (see below for details)

    This will activate all the necessary customisations for the selected telescope (test markers, chart to deploy, namespace to use, etc.). In the pipeline jobs, we also override the namespace to use (to have a unique namespace per pipeline run or to have more control in persistent deployment namespaces; see below for details).

    NOTE: This variable only affects deployment and test commands. Other commands that are not telescope-specific (e.g., code formatting, linting, docs building, helm linting and building, etc.) are not affected by this variable, so you can run them as usual without setting it.

  • K8S_CHARTS lists both ska-low-sw-test and ska-mid-sw-test, but the K8S_CHART and HELM_CHART actually used are set according to the selected telescope in the relevant Makefile-*.mk file.

  • In general, KUBE_NAMESPACE and HELM_CHART are set by the specific Makefile-*.mk file.

  • Since SDP needs a separate namespace to execute the scripts, we dynamically build a KUBE_NAMESPACE_SDP namespace and create it through some hooks that are executed when make k8s-install is called (see k8s-pre-install-chart, etc.)

  • In the K8S_CHART_PARAMS we inject some value customisations, such as:

    • Cluster domain override

    • Tango host

    • Namespace for SDP

    • Kafka host for SDP (generated dynamically in Makefile-*.mk)

    • Taranta params

    • Archiver params (with a path and a DB name defined in Makefile-*.mk)

    • Extra params for Mid and Low respectively

    • (Some others, see the files for details)

  • Regarding (integration) test runs, there are a few relevant customisations worth mentioning:

    • We set a standard K8S_TEST_IMAGE_TO_TEST (since we do not build images here)

    • K8S_TEST_RUNNER depends on the CI_JOB_ID

      • Not sure if this is strictly needed

    • Before executing the tests, we export a requirements.txt file from the pyproject.toml (see test-requirements target)

      • If we do not do this, we do not see all the dependencies installed

    • We have to set a few variables that indicate that we are not running any pairwise tests (see *_SIMULATION_ENABLED variables)

      • This is inherited from the old test harness; we have to keep this until we refactor the tests to not require the previous test harness infrastructure

    • Through PYTHON_VARS_BEFORE_PYTEST we need to expose a few environment variables to the test execution environment, such as:

      • PYTHONPATH

      • KUBE_NAMESPACE and KUBE_NAMESPACE_SDP

      • TANGO_HOST

      • The *_SIMULATION_ENABLED variables mentioned above

    • Through PYTHON_VARS_AFTER_PYTEST we:

      • Execute only the tests tagged with MARK (where MARK is set to mid or to low in the telescope-specific Makefile-*.mk files, to distinguish between Mid and Low tests)

      • Exit at the first failure (-x flag)

      • (Temporarily) skip tests with two subarrays, since they still need to be updated to work with the newer SUT component versions

    • MARK is also used to set the JSON reports file path and the XRAY config files for Jira Xray publishing (see below for details).
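The telescope selection described above can be pictured with a small Python sketch. It is only an illustration of the logic, not the Makefile itself: the class and function names are hypothetical, and only the file names, chart names, marker values, the SKA-low default and the -x flag come from this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TelescopeSettings:
    """Per-telescope values that the telescope-specific Makefile provides."""

    makefile: str  # telescope-specific Makefile included by the main one
    chart: str     # Helm chart deployed for this telescope
    mark: str      # pytest marker that selects this telescope's tests


# Hypothetical lookup table mirroring the two supported TELESCOPE values.
SETTINGS = {
    "SKA-low": TelescopeSettings("Makefile-low.mk", "ska-low-sw-test", "low"),
    "SKA-mid": TelescopeSettings("Makefile-mid.mk", "ska-mid-sw-test", "mid"),
}


def resolve_settings(environ, default="SKA-low"):
    """Pick the settings for TELESCOPE, falling back to the default (SKA-low)."""
    telescope = environ.get("TELESCOPE", default)
    try:
        return SETTINGS[telescope]
    except KeyError:
        raise ValueError(f"Unsupported TELESCOPE value: {telescope!r}") from None


def pytest_selection_args(settings):
    """Arguments appended after pytest: run only MARK tests, stop at first failure."""
    return ["-m", settings.mark, "-x"]
```

For example, `resolve_settings({"TELESCOPE": "SKA-mid"})` yields the Mid chart and the `mid` marker, while an empty environment falls back to the Low settings.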

IMPORTANT NOTE: At present, neither deployment is intended to run locally with Minikube, since the resource requirements are too high (see below). Therefore, you will likely not be able to run make k8s-install or the tests (make k8s-test) locally. You may still be able to do the following locally:

  • Format the code (make python-format)

  • Lint the code (make python-lint)

  • Lint the charts (make helm-lint)

  • Build the Sphinx documentation (make docs-build html, as soon as we set up the ReadTheDocs standard documentation)

  • (We will add some other meaningful local targets in the future)

Most of the other useful processes currently run in the GitLab CI/CD pipelines.

Tests

The tests in this repository are system-level integration tests that interact with the telescope through TMC and verify the emitted events, mainly from TMC, but also from other high-level components, such as:

  • CSP.LMC (controller and subarrays)

  • SDP (controller and subarrays)

  • MCCS (for Low only, controller and subarrays)

  • Dish LMCs (for Mid only)

The verifications mainly involve state changes, on the Telescope State, on the Subarray Observation State, and on other relevant Tango attributes. Generally speaking, we do not go too deep into the subsystems’ details.

All the tests are located in the tests/ directory, and are written using pytest and Gherkin syntax via the pytest-bdd plugin.

Old tests

This set of tests is inherited from the repository from which this one was forked (ska-sw-integration-testing). We are gradually refactoring and improving them. See the next section.

At present, Mid and Low tests are distinct and:

  • They reside respectively in the tests/mid/ and tests/low/ directories,

  • They are selected through the MARK variable in the respective Makefile-*.mk files (mid for Mid and low for Low), which lets you run only the tests tagged with the relevant mark.

    NOTE: As a consequence, it is important that all new tests that are added are labelled with @pytest.mark.mid or @pytest.mark.low accordingly, otherwise they will not be executed!
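As a minimal illustration of the labelling rule above, the sketch below shows how the markers attach to test functions. The test names and bodies are hypothetical placeholders; only the `mid` and `low` marker names come from this repository.

```python
import pytest


@pytest.mark.mid
def test_mid_telescope_placeholder():
    """Hypothetical Mid test: selected when pytest runs with -m mid."""


@pytest.mark.low
def test_low_telescope_placeholder():
    """Hypothetical Low test: selected when pytest runs with -m low."""


def test_unmarked_placeholder():
    """No telescope marker: skipped by both -m mid and -m low selections."""


def marker_names(test_function):
    """Return the names of the pytest markers attached to a test function."""
    return [mark.name for mark in getattr(test_function, "pytestmark", [])]
```

The `marker_names` helper is only there to make the effect visible: the unmarked test carries no marker at all, which is why a `-m mid` or `-m low` run never selects it.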

Each of the two sets of folders at present contains:

  • A features/ sub-directory with the Gherkin feature files,

  • A data/ sub-directory with any test data needed (mainly JSON files that are used as command parameters),

  • A tests/ sub-directory with the step implementations (likely to be reorganised in the future),

  • A resources/ sub-directory with test harness pieces (mainly related to the old test harness, which will likely be removed in the future)

  • A conftest.py file with common fixtures and hooks (NOTE: Be careful, there may be other conftest.py files in sub-directories that may override or add to the main one)

  • A common pytest.ini file with common configuration for pytest and pytest-bdd.

NOTE: The tests/ and features/ directories still contain a leftover system_level_tests/ sub-directory structure, which comes from when we also had pairwise tests. This will be cleaned up in the future.

Even if the implementation of the tests is different for Mid and Low, the orchestration and high-level logic are quite similar:

  • all the tests assume the telescope to be in a certain fixed initial state (telescope state OFF, subarrays EMPTY)

  • in the given step, they execute a sequence of commands through TMC to prepare the system in the desired state for the next step (generally the telescope is turned on, and then the subarray operational flow is executed)

  • the when step executes a command (without waiting for completion)

  • the then steps both wait for the command to complete and verify the state changes using event-based assertions (generally implemented through ska-tango-testing TangoEventTracer or similar utilities)

  • at the end of each test execution, a tear down procedure is executed to bring the telescope back to the initial state (telescope state OFF, subarrays EMPTY)

  • the interaction between the tests and the SUT passes through a test harness, which essentially serves the following main purposes:

    • represent the SUT and its structure and provide access to its components in a structured way

    • encapsulate some orchestration logic (e.g., waiting for commands to complete, moving the telescope to a certain state, tear down procedure, etc.)

A few further notes about these tests:

  • Mid tests at present depend on ska-integration-test-harness, in particular on the so-called “Monolithic Harness” that is hardcoded around TMC. Having it in a separate repository is not ideal, and we plan to move towards having our own test harness directly in this repository in the future, as well as making a few updates to simplify some aspects, make it more flexible and support new test scenarios (e.g., multiple subarrays).

  • For Mid tests, there may also still be some leftover code and fixtures for the old test harness that supported pairwise tests. This will be cleaned up in the future.

  • Low tests at present have their own test harness code directly in this repository, in the tests/low/resources/ directory. In the future, we plan to replace it with the same one we are using for Mid, once we have adapted it to stay in this repository, be more flexible and support more test scenarios (e.g., multiple subarrays).

New tests

The old tests are slowly being rewritten with the purpose of improving at the same time:

  1. the robustness of the tests themselves (we want more robust tests that require fewer implicit preconditions on the SUT state and that are able to “set themselves up” instead of relying on some fixed initial state)

  2. the execution time and the performance of the tests (we want faster tests, with an optimised orchestration flow and better re-use of any state left over from previous test runs when possible and useful, instead of always starting from the same initial state)

  3. the readability and maintainability of the tests (we want more readable, less duplicated, more rationalised tests, that leverage steps reusability and modular generic components)

This re-engineering process is based on the approach of the Test Harness as a Platform, where the tests themselves rely on two layers of modular components:

  1. generic core components: those exposed by the Python library ska-integration-test-harness, which are generic and reusable across different SKA testing projects, and expected to remain relatively stable over time

  2. custom components built on top of the generic ones to serve specific testing needs of this project (hosted in src/ska_sw_integration_testing_badger/ith) that evolve dynamically as the tests and needs evolve

The tests themselves are now hosted in tests/low_new/ and tests/mid_new/ directories, and they are executed immediately after the old ones in the same test job. Gradually we will skip and then remove the old tests that are replaced by the new ones.

Some important principles about the new tests and the related test harness customisation layer are the following.

Minimal modelling of the system and encapsulation through a test harness

TODO: describe here, 1 small paragraph (1-3 phrases) for each point, better to not directly mention the code itself but rather the concepts and the approach:

  • telescope and subsystem classes to represent the SUT structure, regardless if it’s Mid or Low (point out to src/ska_sw_integration_testing_badger/ith/ and describe in 1 phrase what each module is for)

  • configuration mechanism to inject device names and variables from ENV

  • principle of not wrapping what does not need to be wrapped, to avoid unnecessary abstraction and complexity

Feature -> Scenario -> (Common Reusable and Generic Enough) Steps

TODO: describe here, 1 small paragraph (1-3 phrases) for each point, better to not directly mention the code itself but rather the concepts and the approach:

  • Feature and scenarios are Mid-Low specific, but steps are mostly generic and reusable

  • Leverage scenario outlines to avoid duplication of tests

  • Leverage steps reusability and modularity to avoid duplication of python code (point out to tests/conftest.py for common fixtures and tests)

  • Steps leverage the test harness for keeping the implementation elastic enough to support SUT variations (e.g., Mid vs Low)

Adaptive Given Steps Principle

TODO: describe here, 2 paragraphs (3-5 phrases), better to not directly mention the code itself but rather the concepts and the approach:

  • each given step should be elastic enough to both:

    • leverage any leftover state from previous test runs, to speed up the execution and avoid unnecessary state resets

    • be able to set up the necessary state from scratch if the expected state is not already present, to make the tests more robust and less dependent on implicit preconditions on the SUT state

  • leverage pytest.mark.order to define an ideal test execution order that optimises the above point, but without making it a strict requirement; old tests come first because they are more picky about the initial state, while new tests come later because they are more robust and adaptive

State Machine graph-based modelling to reset Subarray state

TODO: describe here, 1 small paragraph (1-3 phrases) for each point, better to not directly mention the code itself but rather the concepts and the approach:
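Until this section is written up, the idea in its heading can be pictured with a hedged sketch: model the subarray observation states as a graph and search for the shortest command sequence between two states. The transition table below is a deliberately simplified assumption (stable states only, no transitional or fault states) and the function name is hypothetical; it is not the harness implementation.

```python
from collections import deque

# Simplified, assumed observation-state graph: state -> {command: next state}.
TRANSITIONS = {
    "EMPTY": {"AssignResources": "IDLE"},
    "IDLE": {"ReleaseAllResources": "EMPTY", "Configure": "READY", "Abort": "ABORTED"},
    "READY": {"End": "IDLE", "Scan": "SCANNING", "Abort": "ABORTED"},
    "SCANNING": {"EndScan": "READY", "Abort": "ABORTED"},
    "ABORTED": {"Restart": "EMPTY"},
}


def shortest_command_path(current, target):
    """Breadth-first search for the shortest list of commands from current to target."""
    if current == target:
        return []
    queue = deque([(current, [])])
    visited = {current}
    while queue:
        state, commands = queue.popleft()
        for command, next_state in TRANSITIONS.get(state, {}).items():
            if next_state in visited:
                continue
            path = commands + [command]
            if next_state == target:
                return path
            visited.add(next_state)
            queue.append((next_state, path))
    raise ValueError(f"No command path from {current} to {target}")
```

With this toy graph, resetting a SCANNING subarray to EMPTY goes through Abort and Restart (two commands) rather than the three-command EndScan, End, ReleaseAllResources route.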

Scheduling Blocks to represent inputs

TODO: describe here, 1 small paragraph (1-3 phrases) for each point, better to not directly mention the code itself but rather the concepts and the approach:

  • a state machine alone is insufficient because a subarray may be operating with different sets of inputs

  • model the inputs through the concept of “Scheduling Block”, which concretely is a set of JSON files representing compatible inputs for AssignResources, Configure and Scan commands (explain directory structure)

  • scenarios mention Scheduling Blocks in the given steps, if we are operating with the currently active Scheduling Block we proceed with state machine-based orchestration, otherwise we reset to EMPTY in the fastest way possible and then we load the new Scheduling Block to set up the new inputs
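The decision described in the last point can be sketched as follows. This is a toy model under stated assumptions: `SchedulingBlock`, `ToySubarray` and `given_subarray_uses` are hypothetical names, the inputs are plain strings rather than real JSON files, and the "reset" is a simple assignment standing in for the real orchestration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class SchedulingBlock:
    """A named set of compatible command inputs (kept as JSON strings here)."""

    name: str
    assign_resources_input: str
    configure_input: str
    scan_input: str


@dataclass
class ToySubarray:
    """In-memory stand-in for a subarray, used only for illustration."""

    obs_state: str = "EMPTY"
    active_block: Optional[SchedulingBlock] = None
    actions: List[str] = field(default_factory=list)


def given_subarray_uses(subarray, requested):
    """Reuse the active Scheduling Block if it matches, otherwise reset and load."""
    if subarray.active_block == requested:
        # Same inputs: continue from the current state (state-machine orchestration).
        subarray.actions.append(f"reuse {requested.name}")
        return
    if subarray.obs_state != "EMPTY":
        # Different inputs: go back to EMPTY before loading the new block.
        subarray.actions.append("reset to EMPTY")
        subarray.obs_state = "EMPTY"
    subarray.active_block = requested
    subarray.actions.append(f"load {requested.name}")
```

The `actions` list exists only so that the sequence of decisions can be inspected in this sketch.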

TangoEventTracer-based event assertions and logging for better observability

TODO: describe here, 1 small paragraph (1-3 phrases) for each point, better to not directly mention the code itself but rather the concepts and the approach:

  • TangoEventTracer as a tool to collect and assert events emitted by the SUT in response to commands

  • subscribe in advance whenever the SUT is sufficiently ready

  • use active waits with timeouts to wait for expected events

  • log events from the start to track SUT state evolution and simplify debugging if test failures occur
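The points above can be illustrated with a toy, standard-library-only tracer. It is explicitly not the ska-tango-testing TangoEventTracer API: `ToyEventTracer`, its methods and the device name used in the usage note are hypothetical. It only shows the pattern of recording every event from the start and waiting for an expected one with a timeout.

```python
import threading


class ToyEventTracer:
    """Collects (device, attribute, value) events and lets callers wait for one."""

    def __init__(self):
        self._events = []
        self._condition = threading.Condition()

    def record(self, device, attribute, value):
        """Store an event and wake up anyone waiting for it."""
        with self._condition:
            self._events.append((device, attribute, value))
            self._condition.notify_all()

    def wait_for(self, device, attribute, value, timeout):
        """Block until the event has been recorded or the timeout expires."""
        expected = (device, attribute, value)
        with self._condition:
            return self._condition.wait_for(
                lambda: expected in self._events, timeout=timeout
            )

    @property
    def events(self):
        """Copy of every recorded event, in arrival order, for logging and debugging."""
        with self._condition:
            return list(self._events)
```

Because events are stored as soon as they arrive, an event recorded before `wait_for` is called (for instance on a hypothetical device such as "toy/subarray/1") is still found, which is why subscribing early matters.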

Create modular utilities as needed

TODO: describe here, 1 small paragraph (1-3 phrases) for each point, better to not directly mention the code itself but rather the concepts and the approach:

  • create utilities whenever needs arise

  • technical functions versus generic infrastructure versus test-specific utilities (it is important to keep these three aspects separate and well documented to avoid complexity escalation)

Pipelines for Mid and Low

Pipeline Jobs Overview

The GitLab pipeline entry point is the standard .gitlab-ci.yml file, which imports:

From the standard templates, we disable:

  • k8s-test (and stop-k8s-test), since we have our own telescope-specific test jobs,

  • deploy-dev-environment (and similar related job steps), since we have our own telescope-specific deployment jobs,

  • xray-publish, since we have our own telescope-specific Xray publishing jobs.

At present, the pipeline has the following jobs:

New pipeline job files .gitlab/xray-publish-low.yml and .gitlab/xray-publish-mid.yml define Xray publishing for each telescope. Similarly, .gitlab/publish-db.yml provides a common base job (.publish-test-results-to-db-base) that is extended by the publish-test-results-to-db-low and publish-test-results-to-db-mid jobs in their respective test files. This DRY approach uses GitLab’s extends keyword and YAML anchors to avoid duplicating the MariaDB credentials, dependency installation, and publishing logic.

Pipeline

As you can see:

  • We have distinct manual jobs to deploy Mid and Low telescopes (totally optional, tests run independently of those deployments)

  • We have distinct test jobs for Mid and Low telescopes, which instead run automatically on each pipeline run.

At present, the pipeline jobs are the main way to deploy and test the telescopes. Use of Coder will be formalised in future.

NOTE: If you want to disable either the Mid or Low jobs temporarily (e.g., because you are working on just one of the two telescopes’ tests and are temporarily not interested in the other telescope jobs), you can set either the ENABLE_LOW_JOBS or ENABLE_MID_JOBS variable to false in the .gitlab-ci.yml file. Important: Remember to set the variable to true again before merging to main!

Here we discuss our custom telescope-specific jobs and some relevant aspects useful to understand how to use them correctly.

Mid and Low k8s-test Jobs

The testing jobs (k8s-test-low and k8s-test-mid) inherit from the standard k8s-test job and customise it as follows:

  • Override the TELESCOPE variable and set it to SKA-low or SKA-mid to select the telescope (see Makefile section above for details)

  • Set a KUBE_NAMESPACE value composed of four parts (a prefix plus three identifiers):

    1. ci-test-low or ci-test-mid prefix to indicate the namespace is for running tests for the Low or Mid telescope respectively

    2. A short version of the project name (CI_PROJECT_NAME_SHORT: "sw-integration", see later for the reason a short version is needed)

    3. The commit short SHA (CI_COMMIT_SHORT_SHA), to easily distinguish between different code versions

    4. The pipeline ID (CI_PIPELINE_ID), to avoid conflicts if multiple pipelines are running for the same commit at the same time

    This fourth part is particularly important, because in the repo setup we often encountered issues related to conflicts between separate test runs (see SKB-1113). See below for further guidelines on how to use these jobs correctly, especially in the tricky case where a test run fails and you want to rerun it.

    Also, since Kubernetes namespaces have a maximum length of 63 characters, and since the project name is quite long, we needed to shorten it (hence point 2 above). The short version is defined in the CI_PROJECT_NAME_SHORT variable in the .gitlab-ci.yml file.

    The Kubernetes namespace issue becomes even trickier when we consider that SDP needs a separate namespace to execute its scripts. That namespace is dynamically created during the deployment and adds 4 more characters to the base namespace length (i.e., the -sdp suffix). If in the future we encounter further issues related to namespace length, we may need to shorten the project name further or drop point 3 (point 4 should be kept, as it is important to avoid conflicts).

    For simplicity, the job environment is named in the same way.

  • The job is active only if ENABLE_LOW_JOBS or ENABLE_MID_JOBS variables are set to true respectively (see below for details).
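The namespace composition and the length constraint discussed above can be checked with a short sketch. The hyphen-joined layout and the function name are assumptions made for illustration; the prefix, the 63-character limit and the -sdp suffix come from this section, and the sample SHA and pipeline ID are made up.

```python
MAX_NAMESPACE_LENGTH = 63  # Kubernetes namespace length limit
SDP_SUFFIX = "-sdp"        # suffix of the dynamically created SDP namespace


def build_test_namespace(telescope_mark, short_project_name, commit_short_sha, pipeline_id):
    """Join the four namespace parts and check that the SDP variant still fits."""
    namespace = (
        f"ci-test-{telescope_mark}-{short_project_name}"
        f"-{commit_short_sha}-{pipeline_id}"
    )
    longest = namespace + SDP_SUFFIX
    if len(longest) > MAX_NAMESPACE_LENGTH:
        raise ValueError(
            f"{longest!r} is {len(longest)} characters long, "
            f"over the {MAX_NAMESPACE_LENGTH}-character limit"
        )
    return namespace
```

For instance, `build_test_namespace("low", "sw-integration", "1a2b3c4d", "123456")` gives a 42-character namespace, which leaves room for the SDP suffix; a much longer project name would trip the check.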

The test job execution behaves as you expect from most SKA pipelines that run integration tests. It essentially calls the following sequence of commands:

make k8s-install  # deploy the chart in the selected (unique) namespace
make k8s-wait     # wait for the deployment to be ready
make k8s-test     # run integration tests

Since the namespace is unique per pipeline run, if you retry the same test job it will attempt to redeploy the telescope in the same namespace. If the deployment is still active from the previous attempt and the pods are running, since the deployed configuration is the same as the already deployed one, the pods will very likely not be re-deployed. Therefore, the tests will run again against the already running deployment, which may not be in a clean state if the previous test run failed halfway through and left some stateful components in an inconsistent state.

In addition to the k8s-test-* jobs, there are also two other jobs: stop-k8s-test-mid and stop-k8s-test-low, which are manual jobs and can be run to uninstall the Helm chart (make k8s-uninstall-chart) and clean up the namespace (make k8s-delete-namespace) used by the test runs. It may be advisable to run these jobs if a test run fails and leaves the deployment in a faulty state, before re-running the test job to have a fresh deployment, otherwise the re-run may just interact with the existing deployment (sometimes this may be the desired behaviour, sometimes not).

The stop-k8s-test-* jobs are automatically called after all test jobs in a pipeline pass. If stop-k8s-test-* is not called (i.e., the tests failed or were cancelled, and the stop jobs have not been run manually), the test deployment will remain active until GitLab’s auto_stop_in timeout elapses (currently 1 minute for test environments).

Test results and artifacts

As per the standard k8s-test job, test results are collected and stored in a build folder inside the job workspace, and then archived as job artefacts. The whole build/ folder is then accessible to subsequent jobs in the same pipeline (e.g., for Xray publishing, see below).

After each test run, a script moves all files from build/ (except low or mid) into build/low/ or build/mid/. This keeps results separate and avoids overwriting.
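The relocation step above could look roughly like the following Python sketch. The real pipeline script is not shown in this document, so the function name and the use of shutil are assumptions; only the behaviour (move everything in build/ except low and mid into the telescope folder) is taken from the text.

```python
import shutil
from pathlib import Path

TELESCOPE_DIRS = {"low", "mid"}  # per-telescope result folders that must stay in place


def collect_results(build_dir, mark):
    """Move everything in build_dir, except low/ and mid/, into build_dir/<mark>/."""
    build_dir = Path(build_dir)
    target = build_dir / mark
    target.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() materialises the listing first, so moving entries is safe.
    for entry in sorted(build_dir.iterdir()):
        if entry.name in TELESCOPE_DIRS:
            continue
        shutil.move(str(entry), str(target / entry.name))
        moved.append(entry.name)
    return moved
```

Calling `collect_results("build", "low")` after a Low test run would leave build/mid/ untouched and gather the fresh reports under build/low/.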

Mid and Low Manual Deployment Jobs

The (dev) deployment jobs and the related ones (deploy-low-environment, deploy-mid-environment, etc.) are also inherited from the standard deploy-dev-environment job and essentially perform similar customisations to the test jobs above to select the telescope and set unique namespaces, etc.

These jobs are useful if you want to have a persistent deployment of the telescope for debugging purposes, or to run some manual tests against it.

When you run deploy-*-environment jobs, the Helm chart for the selected telescope is deployed (make k8s-install-chart) in a Kubernetes namespace that has a fixed name (i.e., it does not depend on the commit hash, branch, or pipeline ID), and then the pipeline waits for the deployment to be ready (make k8s-wait).

The fact that we use a fixed namespace means that the deployment is persistent across multiple pipeline runs, and if you redeploy it from another pipeline, the previous deployment is simply updated to the new version (re-deploying the pods, regardless of whether the code version has changed or not). A different situation occurs if you retry the same deployment job from the same pipeline run: in that case, since the built chart is the same as the already deployed one, the pods are not re-deployed. Concretely, for a tester this means that:

  • A retry from the same pipeline will at most attempt to re-deploy any faulty pods that failed to start correctly in the previous attempt, but will not change the internal state of the already running pods (and so will not reset any stateful component);

  • A re-deployment from another pipeline will completely re-deploy all the pods, resetting the internal state of any stateful component.

After the deployment job completes, a test-*-environment job automatically runs, but this should not be confused with the integration test run (k8s-test-* jobs). Instead, the test-*-environment job will simply verify pod readiness (another make k8s-wait).

If you are in any pipeline and want to see if there is a deployment active (e.g., from a different pipeline), you can call the manual info-*-environment job, which will report the status of the deployment (pod status, etc. - make k8s-info). If there is no deployment active, this job will likely report that no resources are found in the namespace.

Finally, to clean up the deployment, you can call the manual stop-*-environment job, which will uninstall the Helm chart and delete the namespace (make k8s-uninstall-chart, make k8s-delete-namespace). If you do not do that, the deployment will remain active until GitLab’s auto_stop_in timeout elapses (currently 12 hours for manual deployment environments). Also, remember the stop job can be run from any pipeline, independently of which pipeline deployed the telescope.

Procedures to connect to those deployments through Coder have been trialled (see this document), but are not yet fully formalised.

Jira-Xray Test Result Publishing

The pipeline includes xray-publish-low and xray-publish-mid jobs that automatically publish test results to Jira Xray. These jobs are triggered only:

  1. when the test pipeline runs on the main branch

  2. or when a merge request is merged into main,

so they do not run on every pipeline execution. The FORCE_XRAY_PUBLISH variable allows publishing to Xray on any branch, not only on main or after merges. Set it to true for testing or forced publishing.

The publishing happens automatically after each respective k8s-test-* job completes, regardless of whether the tests passed or failed. However, if you re-run the k8s-test-* job manually, at present you will need to also re-run the xray-publish-* job manually, since there is no automatic trigger for that case. This is a GitLab limitation, as downstream jobs are not automatically re-triggered when a job is re-run manually. If you disable either the Mid or Low jobs temporarily (ENABLE_*_JOBS variables), the respective Xray publishing job will also be disabled.

The Xray jobs extend the base xray-publish job from the standard templates. The Makefile uses the MARK variable (i.e., low or mid) to set the JSON report and Xray config file paths, so the correct files are used for each telescope. Mid and Low tests use different config files, kept in the respective tests/mid/ and tests/low/ directories:

After a pipeline, you should see a new entry in the relevant Xray page. At present, the best thing you can do is to check the main test plan ticket’s last executions and find those which reference ska-sw-integration-testing-badger in the description. The main test plan tickets at present are:

At present, the test project and identifiers are shared with the release team repo, and for some tests even with TMC. In future, we may create a dedicated test project and test cases for this repository. Another planned improvement is to enrich the log output and the ticket descriptions with more detailed cross-links to the pipeline run, to the created test execution, etc. to improve traceability.

The Xray publishing is implemented using the ska-ser-xray library.

MariaDB Test Result Publishing

DB Jobs Overview

In addition to Xray publishing, the pipeline also publishes test results to a MariaDB database for comprehensive test data collection and analysis.

The pipeline includes publish-test-results-to-db-low and publish-test-results-to-db-mid jobs that automatically publish parsed test artefacts to the MariaDB database. Similar to Xray publishing, these jobs are triggered only:

  1. when the test pipeline runs on the main branch

  2. or when a merge request is merged into main

The FORCE_DB_PUBLISH variable allows publishing to the database on any branch, not only on main or after merges. Set it to true for testing or forced publishing. This is particularly useful during development and testing of the database publishing functionality itself.
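The trigger conditions can be summarised as a small predicate. This is a sketch of the rule as described in this document, not the actual GitLab rules: the function name is hypothetical, and it assumes the variables carry the strings "true" or "false" and that a merge into main results in a pipeline on the main branch.

```python
def should_publish(branch, force_publish="false", jobs_enabled="true"):
    """Return True when a publishing job would run for this pipeline.

    branch        -- branch the pipeline runs on
    force_publish -- value of FORCE_DB_PUBLISH (or FORCE_XRAY_PUBLISH)
    jobs_enabled  -- value of ENABLE_LOW_JOBS or ENABLE_MID_JOBS
    """
    if jobs_enabled != "true":
        # Disabling a telescope's jobs also disables its publishing job.
        return False
    return branch == "main" or force_publish == "true"
```

So a feature branch publishes only when the force variable is set to true, and nothing is published for a telescope whose jobs are disabled.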

Publishing happens automatically after each respective k8s-test-* job completes, regardless of whether the tests passed or failed. The jobs use the ska-test-analysis package to:

  1. Parse test artefacts from the build/low/ or build/mid/ directories

  2. Extract test execution details, results, and metadata

  3. Insert the structured data into the MariaDB database

For more details on the database schema, the data being stored, and the available scripts, please refer to the ska-test-analysis package documentation.
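To give a feel for the parse, extract, and insert steps, here is a rough sketch that uses only the Python standard library. It does not use or reproduce the ska-test-analysis API. It assumes a JUnit-style XML report (which pytest can produce with --junitxml), a hypothetical single-table schema, and an in-memory SQLite database standing in for MariaDB; the real artefacts and schema may differ.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample report; real artefacts live under build/low/ or build/mid/.
SAMPLE_REPORT = """<testsuite name="low" tests="2">
  <testcase classname="tests.low.test_a" name="test_ok" time="1.5"/>
  <testcase classname="tests.low.test_a" name="test_bad" time="0.2">
    <failure message="boom"/>
  </testcase>
</testsuite>"""


def extract_results(xml_text):
    """Steps 1-2: parse the report and extract one record per test case."""
    root = ET.fromstring(xml_text)
    records = []
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            outcome = "failed"
        elif case.find("skipped") is not None:
            outcome = "skipped"
        else:
            outcome = "passed"
        records.append(
            (case.get("classname"), case.get("name"), outcome, float(case.get("time", 0)))
        )
    return records


def insert_results(conn, records):
    """Step 3: store the structured records (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS test_results "
        "(classname TEXT, name TEXT, outcome TEXT, duration REAL)"
    )
    conn.executemany("INSERT INTO test_results VALUES (?, ?, ?, ?)", records)
    conn.commit()


# SQLite stands in for MariaDB so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
insert_results(conn, extract_results(SAMPLE_REPORT))
```

Storing one row per test case, whether it passed or failed, is what makes comparisons across pipeline runs possible later.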

The FACILITY variable is particularly important for categorising test runs. It distinguishes between:

  • test-dev: Development environment where software integration tests are developed and refined

  • test-official-run: Official test runs on stable tests (main branch or merged changes)
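As an illustration of how the two values relate to branches, the sketch below derives a FACILITY value from the branch name. This is an assumption made for the example: the function facility_for is hypothetical, and the pipeline may set FACILITY differently.

```python
def facility_for(branch, default_branch="main"):
    """Pick a FACILITY value for a pipeline run (hypothetical rule)."""
    # Runs on the default branch are treated as official; everything else,
    # including forced publishing from a feature branch, is development.
    return "test-official-run" if branch == default_branch else "test-dev"
```

Under this rule, a forced publish from a feature branch would be recorded as test-dev and so would not mix with official results.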

A verify-mariadb-connection job runs in the lint stage to verify database connectivity before the test jobs execute. This job checks that the pipeline can reach the MariaDB server and that the necessary dependencies are available. It is configured with allow_failure: true so that connectivity issues do not block the entire pipeline, and it runs under the same conditions as the DB publishing jobs.
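A connectivity check of this kind can be approximated with a plain TCP probe. The sketch below is not the job's actual implementation and covers only reachability, not the dependency check. The function name can_reach is hypothetical, and port 3306 is the MariaDB default, assumed here rather than taken from the pipeline configuration.

```python
import socket


def can_reach(host, port=3306, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Returning a boolean instead of raising fits the allow_failure: true spirit: the caller can log a warning and carry on.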

If you disable either the Mid or Low jobs temporarily (ENABLE_*_JOBS variables), the respective database publishing job will also be disabled.

The database publishing jobs are configured with allow_failure: true, meaning database publishing failures will not cause the entire pipeline to fail. This ensures that test execution and Xray publishing can continue even if database publishing encounters issues.

DB Credentials

The database connection details are configured via the following variables in .gitlab-ci.yml:

  • DB_HOST: MariaDB server hostname (currently 10.100.222.218 - set from .gitlab-ci.yml file)

  • DB_NAME: Database name (currently badger - also set from .gitlab-ci.yml file)

  • DB_USER: Database username (set from GitLab CI/CD protected variable)

  • DB_PASSWORD: Database password (set from GitLab CI/CD protected variable)

About DB_USER and DB_PASSWORD: For security reasons, these credentials are not hardcoded in the repository. Instead, they should be set as protected variables in the GitLab CI/CD settings for this repository. For now, they are accessible only in the main branch pipelines. If you need to access them in other branches for testing purposes, or change them, please check here.
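Because the protected variables are absent on most branches, a script that needs them should fail early with a clear message. The following is a minimal sketch under that assumption; load_db_settings is a hypothetical helper and not part of ska-test-analysis.

```python
import os

# The four variables described above.
REQUIRED = ("DB_HOST", "DB_NAME", "DB_USER", "DB_PASSWORD")


def load_db_settings(env=None):
    """Collect the DB settings from the environment, or raise if any is missing."""
    env = os.environ if env is None else env
    # Empty values are treated as missing, as unset protected variables may be.
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing database settings: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED}
```

On a branch without access to the protected variables, this would report DB_USER and DB_PASSWORD as missing instead of failing later with an opaque authentication error.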

At present, the database in use is deployed in the SKA cluster. More information is available in these STS tickets and vault links:

To connect to the database manually for debugging or verification purposes, you should be on the SKA VPN or in a Coder Environment.

Additional Notes

Resource Requirements and Local Execution

At present, neither the SKA Mid nor the SKA Low deployment for software integration testing is intended to run locally with Minikube, because:

  1. The resource requirements are too high;

  2. The Makefile and the deployment probably lack some of the customisations needed to run locally.

Regarding the resource requirements, as a rough indication:

  • SKA Mid deployment:

    • Consists of 95 pods,

    • Uses more than 12 CPUs,

    • Requires around 24.3 GiB of memory, and consumes around 10.6 GiB just being deployed (without running any tests).

  • SKA Low deployment:

    • Consists of 94 pods,

    • Uses an entire CPU just for being deployed,

    • Requires around 23.5 GiB of memory, and consumes around 10 GiB just being deployed (without running any tests).

These data were captured through Grafana after a fresh deployment of the respective telescope, without running any tests. Also, SDP-specific namespaces are not included in these counts.

Parallel Pipeline Test Runs, Flakiness, and Guidelines

At the moment, the pipeline test jobs are designed to allow multiple pipeline test runs in parallel, even for the same commit or branch. This means that:

  • Multiple testers can run tests simultaneously without interfering with each other, in the same or in different branches.

  • A tester can re-run a failed test job from the same pipeline without interfering with any other test runs.

  • A tester that has a failed test job can either:

    1. re-run the same test job from the same pipeline if they wish to retry the test against the same deployment;

    2. call the stop-k8s-test-* job to clean up the deployment and then re-run the test job from the same pipeline to have a fresh deployment;

    3. run a new pipeline (e.g., from the same branch) to obtain a fresh deployment and run the tests against that (even while the previous test job is still active).

    It is important to note that option 1 may not be ideal if the previous deployment is in a faulty state due to the failed test run, so options 2 or 3 are likely preferable in that case.

    Also, be careful not to overlap a test job run with a stop-k8s-test-* job, as that may tear down some pods while the test job is trying to interact with them.

During pipeline setup, we encountered some reliability issues. In part, some of these issues were due to test run overlaps and conflicts (see SKB-1113), but some were genuine flakiness in the deployments or tests.

Even though multiple parallel test runs are currently supported, it is still advisable to limit how many run at the same time, so as not to overload the systems and cause resource shortages and increased flakiness.

To save resources:

  • You may consider disabling either the Mid or Low jobs temporarily (e.g., if you are working on just one of the two telescopes’ tests and are temporarily not interested in the other telescope jobs), by setting either the ENABLE_LOW_JOBS or ENABLE_MID_JOBS variable to false in the .gitlab-ci.yml file.

  • The improvements listed in BAD-40 may also be worth considering.

Telescope-specific guides