Logging Solution
Logging is one of the components of a “Developer centered” tooling set, facilitating the analysis of infrastructure and application behaviour. Please refer to the centralised monitoring and logging documentation to understand what solutions are available and how they integrate.
Logging in SKA is handled with Elasticsearch, bundled with Kibana as a frontend. This frontend is better suited to creating visualisations than to actually searching logs.
Note
Examples refer and redirect to central logging URLs. If your datacentre is a production one (i.e., low-aa, mid-aa, etc.), please refer to the list of Logging Service URLs and use that URL instead
Set up Kibana/Elasticsearch access
If you simply want to browse Kibana, you can go to Kibana and Continue as Guest. This enables you to see logs and dashboards, but every other operation is restricted.
If you need to access logs locally or require other types of permissions, we advise you to create your own account. To do so, follow these steps:
Create an STS ticket
- Log in with your:
Username: “Username” in your JIRA Profile
Password: Password provided by ST in response to your STS ticket
- Create API Key
On the left pane, navigate to Stack Management -> API keys
Create API Key in Kibana
Optionally you can add an expiration date
You can customize the permissions of this API Key
Test API Key
$ curl -k -H "Authorization: ApiKey <your API Key>" https://logging.stfc.skao.int:9200/_cat/health
Keep your password and API Key safe
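The same health check can be scripted with Python's standard library. This is a sketch only: the API key is a placeholder, and certificate verification is skipped to mirror curl's -k flag:

```python
import ssl
import urllib.request

ES_URL = "https://logging.stfc.skao.int:9200"  # central logging endpoint
API_KEY = "<your API Key>"  # placeholder: the key created in Kibana

# Build the request using the ApiKey authorization scheme,
# mirroring the curl example above.
req = urllib.request.Request(
    f"{ES_URL}/_cat/health",
    headers={"Authorization": f"ApiKey {API_KEY}"},
)

# curl's -k skips certificate verification; the stdlib equivalent is an
# unverified SSL context (acceptable for a quick check only).
ctx = ssl._create_unverified_context()

# Uncomment to actually run the health check:
# with urllib.request.urlopen(req, context=ctx) as resp:
#     print(resp.read().decode())
```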
Filter logs in Kibana
Kibana is not the best tool for working with logs, as its strength lies in visualisations built on the data present in logs. Nonetheless, we search logs in Kibana in the same manner we search in Elasticsearch: using log metadata fields.
Discover logs
In Kibana we search logs in the Discover view, selectable on the left pane. There we can operate some top-level controls:
- Data view - Change which data view to use (comprising one or more index patterns)
- Date selection - Select start and end times to look for logs
- Filters - Filter logs based on fields, allowing for and/or operations with is/is one of/exists operators and their negative counterparts
- Search Bar - Kibana Query Language (KQL) expressions to filter logs. These are AND'ed with the filters
- Field List - List of fields existing in the selected logs. We can search field names and select them (using ‘+’) to be displayed in the Document view
- Documents - List of documents matching the search criteria. You can sort by a field of interest, further filter by a field's value (using ‘+’), or exclude similar results (using ‘-’)
To know more about it, please refer to the official documentation.

Kibana discover view
This is a useful page for understanding and quickly searching which logs and fields are available, usually through auto-complete. Another useful feature is expanding a document, where we can see all the available metadata and fields. As an example, let's look for a Device Server log in the staging-ska-tango-examples namespace in the STFC CICD (stfc-techops datacentre) cluster. Later on this page we will go over some of the built-in and custom fields you can filter with:

Kibana filter
Although you can do everything with KQL, we suggest doing most of your filtering with filters and only using KQL to search for string values that are not exact matches (i.e., lines containing a substring). As we use StatefulSets to run Device Servers, we can try to find some fields to help us filter logs. For each filter, we get a very handy list of the most frequently occurring values (within the selected timeframe):

Kibana field values
We can include (using ‘+’) or exclude (using ‘-’) other documents where this field has an equal value. Inspecting a document's JSON content, we can see the whole set of fields present. Note that not every document has all of these fields, but most documents related to Kubernetes logs do:
{
  "_index": ".ds-filebeat-8.17.4-2025.05.09-006885",
  "_id": "ayiMtZYB3UmOeKG2FrhB",
  "_version": 1,
  "_source": {
    "container": {
      "image": {
        "name": "artefact.skao.int/ska-tango-images-tango-db:11.0.2"
      },
      "runtime": "containerd",
      "id": "38ebd034e936c4df7368442be632c2455e2f3c5f92da0f290a8c0cd18394d206"
    },
    "input": {
      "type": "container"
    },
    "kubernetes": {
      "pod": {
        "uid": "cdefc1fe-7a2b-4bba-b232-2bd421797308",
        "ip": "10.10.192.42",
        "name": "databaseds-tangodb-tango-databaseds-0"
      },
      "statefulset": {
        "name": "databaseds-tangodb-tango-databaseds"
      },
      "namespace": "staging-ska-tango-examples",
      "namespace_uid": "9e932d52-5a61-45d7-9a03-579bdb874204",
      "namespace_labels": {
        "cicd_skao_int/project": "ska-tango-examples",
        "cicd_skao_int/author": "matteo1981",
        "cicd_skao_int/jobId": "9974477502",
        "cicd_skao_int/projectPath": "ska-telescope-ska-tango-examples",
        "kubernetes_io/metadata_name": "staging-ska-tango-examples",
        "cicd_skao_int/pipelineId": "1805215409",
        "cicd_skao_int/mrId": "",
        "cicd_skao_int/projectId": "9673989",
        "cicd_skao_int/job": "deploy-staging",
        "cicd_skao_int/environmentTier": "staging",
        "cicd_skao_int/branch": "master",
        "cicd_skao_int/authorId": "3003086",
        "cicd_skao_int/team": "system",
        "cicd_skao_int/commit": "7cabaa1f4d5a697e89e87980dfb359fe375105b9",
        "cicd_skao_int/pipelineSource": "web"
      },
      "labels": {
        "cicd_skao_int/project": "ska-tango-examples",
        "cicd_skao_int/author": "matteo1981",
        "cicd_skao_int/jobId": "9960905992",
        "apps_kubernetes_io/pod-index": "0",
        "controller-revision-hash": "databaseds-tangodb-tango-databaseds-5555d574fd",
        "app_kubernetes_io/managed-by": "DatabaseDSController",
        "cicd_skao_int/pipelineId": "1805215409",
        "cicd_skao_int/projectId": "9673989",
        "app_kubernetes_io/name": "databaseds-tangodb",
        "cicd_skao_int/job": "deploy-staging",
        "app_kubernetes_io/instance": "tango-databaseds",
        "cicd_skao_int/branch": "master",
        "cicd_skao_int/team": "system",
        "cicd_skao_int/authorId": "3003086",
        "cicd_skao_int/commit": "7cabaa1f4d5a697e89e87980dfb359fe375105b9",
        "statefulset_kubernetes_io/pod-name": "databaseds-tangodb-tango-databaseds-0"
      }
    },
    "stream": "stderr",
    "host": {
      "name": "stfc-techops-production-cicd-md-0-2nwqg-gd5g8"
    },
    "ska": {
      "datacentre": "stfc-techops",
      "environment": "production",
      "prometheus_datacentre": "stfc-ska-monitor",
      "application": "kubernetes",
      "service": "clusterapi"
    },
    "message": "2025-05-09 14:56:16 58199 [Warning] Aborted connection 58199 to db: 'unconnected' user: 'unauthenticated' host: '10.100.1.251' (This connection closed normally without authentication)"
  }
}
Note
Some data was omitted from the JSON document for brevity
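The nested document above maps directly onto the dotted field names used for filtering (kubernetes.pod.name, ska.datacentre, and so on). A small sketch of that mapping, using an abridged version of the sample document:

```python
def flatten(doc, prefix=""):
    """Recursively flatten a nested JSON document into the dotted
    field names Kibana and Elasticsearch expose for filtering."""
    fields = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            fields.update(flatten(value, prefix=f"{name}."))
        else:
            fields[name] = value
    return fields

# Abridged from the sample document above
source = {
    "kubernetes": {
        "namespace": "staging-ska-tango-examples",
        "pod": {"name": "databaseds-tangodb-tango-databaseds-0"},
        "statefulset": {"name": "databaseds-tangodb-tango-databaseds"},
    },
    "ska": {"datacentre": "stfc-techops", "environment": "production"},
}

fields = flatten(source)
# fields["kubernetes.statefulset.name"] -> "databaseds-tangodb-tango-databaseds"
```

These dotted names are exactly what goes into the Filters panel or a KQL expression.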
You can use any combination of these fields to filter your logs and pinpoint the timing you are looking for. The timeline in Kibana is particularly useful for that, as we can see an aggregated count of documents, which can help us narrow down a time window. For an effective log search, we need to know which fields we can use. To narrow that down, we suggest:
Kubernetes fields, of which we highlight:
- kubernetes.namespace -> Kubernetes namespace
- kubernetes.pod.name -> Kubernetes Pod name
- kubernetes.statefulset.name -> Kubernetes StatefulSet name, useful for Device Servers
- kubernetes.node.name -> Kubernetes node name
SKA infrastructure fields:
- ska.datacentre -> Datacentre the log came from
- ska.environment -> Environment in the datacentre the log came from
- ska.application -> The log source (one of syslog, journald, docker, podman or kubernetes)
SKA standard fields (not in widespread use, unfortunately):
- kubernetes.labels.domain -> Application domain
- kubernetes.labels.function -> Application function
- kubernetes.labels.system -> System the application is part of
- kubernetes.labels.subsystem -> Subsystem the application is part of
- kubernetes.labels.telescope -> Telescope the application is part of
SKA CICD fields (prefixed with kubernetes.labels or kubernetes.namespace_labels):
- cicd_skao_int/projectId -> Gitlab project id
- cicd_skao_int/project -> Gitlab project name
- cicd_skao_int/projectPath -> Sanitised Gitlab project path (“/” replaced with “-“)
- cicd_skao_int/authorId -> Author Gitlab id
- cicd_skao_int/author -> Author name
- cicd_skao_int/team -> Author’s SKA team (might not always be available, as it depends on the contents of the People’s database)
- cicd_skao_int/commit -> Commit
- cicd_skao_int/branch -> Branch
- cicd_skao_int/pipelineId -> Gitlab pipeline id
- cicd_skao_int/jobId -> Gitlab job id
- cicd_skao_int/job -> Gitlab job name
- cicd_skao_int/mrId -> Gitlab merge request id (if applicable)
- cicd_skao_int/environmentTier -> Gitlab environment tier
- cicd_skao_int/pipelineSource -> Gitlab pipeline (trigger) source
It becomes very easy to track down your logs using fields like kubernetes.namespace and kubernetes.labels.cicd_skao_int/jobId. As this is highly specific, we were able to include prebuilt URLs in the pipeline logs to make it easier for developers to find the relevant logs.
Filter logs on your machine
Accessing Elasticsearch directly, rather than through Kibana, is what we need to query and parse logs locally on our machines, which is often the preferred and most efficient way to work. We can craft our queries in Kibana using ES|QL, the newer query language supported in both Kibana and Elasticsearch. Optionally, we can also use KQL.
curl
All we need is a curl command against Elasticsearch's native query API. For this scenario, we give preference to ES|QL, as it is more readable and easier to structure.
As an example, let's query the ska-ser-namespace-manager REST API logs in the CICD cluster (stfc-techops-production). We don't know in advance how to filter for the REST API pod logs, so let's use Discover to find that out. Starting with what we know, we can open a document whose log looks like it comes from the REST API and inspect it:
{
  ...
  "kubernetes": {
    "container": {
      "name": "api"
    },
    ...
    "namespace": "ska-ser-namespace-manager",
    "namespace_uid": "d4639c15-52ef-4541-99f0-03b4c432bea7",
    "replicaset": {
      "name": "ska-ser-namespace-manager-api-7899b46c86"
    },
    "namespace_labels": {
      "kubernetes_io/metadata_name": "ska-ser-namespace-manager",
      "name": "ska-ser-namespace-manager"
    },
    "labels": {
      "app_kubernetes_io/managed-by": "Helm",
      "helm_sh/chart": "ska-ser-namespace-manager-0.1.4",
      "pod-template-hash": "7899b46c86",
      "app_kubernetes_io/version": "0.1.4",
      "app_kubernetes_io/part-of": "ska-ser-namespace-manager",
      "app_kubernetes_io/component": "api",
      "app_kubernetes_io/instance": "ska-ser-namespace-manager"
    }
  }
  ...
}
Clearly, we could use either kubernetes.container.name or kubernetes.labels.app_kubernetes_io/component. Converting this to ES|QL, we get:
Note
In ES|QL, fields containing forward slashes (“/”) need to be escaped with backticks, as `<field>`
$ API_KEY=<your api key>
$ curl -qk -X POST "https://logging.stfc.skao.int:9200/_query?format=json&pretty" -H "Authorization: ApiKey $API_KEY" -H 'Content-Type: application/json' \
-d'
{
"query": "FROM filebeat-8* | WHERE ska.datacentre == \"stfc-techops\" AND ska.environment == \"production\" AND kubernetes.namespace == \"ska-ser-namespace-manager\" AND `kubernetes.labels.app_kubernetes_io/component` == \"api\" | KEEP message | LIMIT 100"
}
' 2>/dev/null | jq -r ".values[][]"
Another common use case is viewing the logs from a namespace deployed by a Gitlab job, when we know its id and the target cluster:
$ API_KEY=<your api key>
$ curl -qk -X POST "https://logging.stfc.skao.int:9200/_query?format=json&pretty" -H "Authorization: ApiKey $API_KEY" -H 'Content-Type: application/json' \
-d'
{
"query": "FROM filebeat-8* | WHERE ska.datacentre == \"stfc-techops\" AND ska.environment == \"production\" AND `kubernetes.labels.cicd_skao_int/jobId` == \"10006002600\" | KEEP message | LIMIT 1000"
}
' 2>/dev/null | jq -r ".values[][]"
To view the same logs starting from the oldest, we add SORT @timestamp ASC:
$ API_KEY=<your api key>
$ curl -qk -X POST "https://logging.stfc.skao.int:9200/_query?format=json&pretty" -H "Authorization: ApiKey $API_KEY" -H 'Content-Type: application/json' \
-d'
{
"query": "FROM filebeat-8* | WHERE ska.datacentre == \"stfc-techops\" AND ska.environment == \"production\" AND `kubernetes.labels.cicd_skao_int/jobId` == \"10006002600\" | SORT @timestamp ASC | KEEP message | LIMIT 1000"
}
' 2>/dev/null | jq -r ".values[][]"
To scope it to a specific timeframe, we have multiple options, such as adding WHERE @timestamp > TO_DATETIME("2025-05-13T05:00:00Z") AND @timestamp <= NOW():
$ API_KEY=<your api key>
$ curl -qk -X POST "https://logging.stfc.skao.int:9200/_query?format=json&pretty" -H "Authorization: ApiKey $API_KEY" -H 'Content-Type: application/json' \
-d'
{
"query": "FROM filebeat-8* | WHERE ska.datacentre == \"stfc-techops\" AND ska.environment == \"production\" | WHERE @timestamp > TO_DATETIME(\"2025-05-13T05:00:00Z\") AND @timestamp <= NOW() | SORT @timestamp ASC | KEEP message | LIMIT 1000"
}
' 2>/dev/null | jq -r ".values[][]"
Which outputs:
Get:3 http://archive.ubuntu.com/ubuntu jammy-updates InRelease [128 kB]
Get:4 http://archive.ubuntu.com/ubuntu jammy-backports InRelease [127 kB]
May 13 05:00:00 stfc-techops-production-storage-mon-i02 bash[384977]: cluster 2025-05-13T05:00:00.000117+0000 mon.stfc-techops-production-storage-mon-i00 (mon.0) 421986 : cluster [INF] overall HEALTH_OK
May 13 05:00:00 stfc-techops-production-storage-mon-i02 bash[384977]: cluster 2025-05-13T05:00:00.135745+0000 mgr.stfc-techops-production-storage-mon-i01.wpgxxy (mgr.56847325) 51399 : cluster [DBG] pgmap v49339: 321 pgs: 321 active+clean; 794 GiB data, 2.3 TiB used, 2.4 TiB / 4.8 TiB avail; 14 KiB/s rd, 1.2 MiB/s wr, 125 op/s
2025-05-13T05:00:00.622Z [INFO] handler: Request received: Method=POST URL=/mutate?timeout=30s
{"script": "/scripts-23023437-10014411580/get_sources"}
I0513 05:00:00.639560 1 eventhandlers.go:186] "Add event for scheduled pod" pod="ska-ser-namespace-manager/check-namespace-6f0004f3-29118540-x29fh"
I0513 05:00:00.639115 1 eventhandlers.go:186] "Add event for scheduled pod" pod="ska-ser-namespace-manager/check-namespace-6f0004f3-29118540-x29fh"
10.100.1.65 - - [13/May/2025:05:00:00 +0000] "GET / HTTP/1.1" 200 540 "-" "kube-probe/1.32"
I0513 05:00:00.652638 1 eventhandlers.go:206] "Update event for scheduled pod" pod="ska-ser-namespace-manager/check-namespace-6f0004f3-29118540-x29fh"
I0513 05:00:00.652656 1 httplog.go:132] "HTTP" verb="GET" URI="/healthz" latency="97.192µs" userAgent="kube-probe/1.32" audit-ID="" srcIP="10.100.2.197:50468" resp=200
I0513 05:00:00.653696 1 eventhandlers.go:206] "Update event for scheduled pod" pod="ska-ser-namespace-manager/check-namespace-6f0004f3-29118540-x29fh"
[backend] | 2025-05-13T05:00:00.658Z info: [main] Fetched gitlab.com/ska-telescope/ska-low-sps-smm/ska-low-sps-smm-kernel in 0.87s
[backend] |
[backend] | 2025-05-13T05:00:00.658Z info: [main] Indexing gitlab.com/ska-telescope/ska-low-sps-smm/ska-low-sps-smm-kernel.
If we want just the test runner logs, we add WHERE kubernetes.container.name LIKE "test-runner*":
$ API_KEY=<your api key>
$ curl -qk -X POST "https://logging.stfc.skao.int:9200/_query?format=json&pretty" -H "Authorization: ApiKey $API_KEY" -H 'Content-Type: application/json' \
-d'
{
"query": "FROM filebeat-8* | WHERE ska.datacentre == \"stfc-techops\" AND ska.environment == \"production\" AND `kubernetes.labels.cicd_skao_int/jobId` == \"10006002600\" | WHERE kubernetes.container.name LIKE \"test-runner*\" | KEEP message"
}
' 2>/dev/null | jq -r ".values[][]"
Note that the most efficient way to build the relevant ES|QL queries is directly in Kibana, where you can simultaneously build the query and inspect the returned documents. Once you have made your queries, you can convert them to a curl command, or even create a Python script to automate your debugging workflow.

Kibana ESQL toggle
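As suggested above, the curl commands translate naturally into a small Python script. The sketch below uses only the standard library; the helper names (build_esql, run_esql) are illustrative, the API key is a placeholder, and certificate verification is skipped like curl's -k:

```python
import json
import ssl
import urllib.request

ES_URL = "https://logging.stfc.skao.int:9200"  # central logging endpoint
API_KEY = "<your api key>"  # placeholder


def build_esql(datacentre, environment, extra="", limit=1000):
    """Compose an ES|QL query like the ones in the curl examples above."""
    query = (
        "FROM filebeat-8* "
        f'| WHERE ska.datacentre == "{datacentre}" '
        f'AND ska.environment == "{environment}" '
    )
    if extra:
        query += f"| WHERE {extra} "
    query += f"| KEEP message | LIMIT {limit}"
    return query


def run_esql(query):
    """POST the query to the _query endpoint and return the message values."""
    req = urllib.request.Request(
        f"{ES_URL}/_query?format=json",
        data=json.dumps({"query": query}).encode(),
        headers={
            "Authorization": f"ApiKey {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    ctx = ssl._create_unverified_context()  # equivalent of curl -k
    with urllib.request.urlopen(req, context=ctx) as resp:
        body = json.load(resp)
    # ES|QL returns rows in "values"; with KEEP message each row has one column
    return [row[0] for row in body.get("values", [])]


# Example (job id from the examples above): logs deployed by a Gitlab job
# messages = run_esql(build_esql(
#     "stfc-techops", "production",
#     extra='`kubernetes.labels.cicd_skao_int/jobId` == "10006002600"'))
```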
elktail
elktail is a command-line interface for querying Elasticsearch that provides basic search and templated output, analogous to Kibana.
Installation
Binaries are provided for Linux, macOS, and Windows - see the installation instructions.
Configuration
elktail can be fully driven from the command line, but it is easier to use a configuration file. Because Elasticsearch uses TLS encryption, it is necessary to obtain an up-to-date sample configuration file from the Systems Team. Alternatively, run the snippet at https://gitlab.com/ska-telescope/ska-snippets/-/snippets/4842521 to generate a template for the central logging facility that supports the pipeline machinery and the ITF environments (contact Platform Support for access to each of the telescope-specific production Elasticsearch instances). Don’t forget to replace the APIKey value in the resulting sample-config.yaml with your own.
How To Query
elktail uses the same KQL query syntax used by Kibana. Full details on the options and usage are available at https://gitlab.com/piersharding/elktail#queries. Queries are essentially filters that enable targeted log extraction based on metadata in the JSON documents in Elasticsearch. The following example finds the last (-n 1) log entry that matches ska.application: syslog and ska.datacentre: mid-itf, and whose message body contains dnsmasq somewhere:
$ elktail -n 1 ska.application: syslog AND ska.datacentre: mid-itf AND message: dnsmasq
[2025-05-13T05:38:28.404Z] [l:] za-itf-gateway :: May 13 07:38:28 za-itf-gateway dnsmasq[3370]: config error is REFUSED (EDE: not ready)
Note
To help determine what metadata is available to use in a query, elktail provides the -r (raw dump) and -p (pretty print) switches to inspect the entire JSON document associated with a log message in Elasticsearch:
$ elktail -p -n 1 ska.application: syslog AND ska.datacentre: mid-itf AND message: dnsmasq
[
  {
    "@timestamp": "2025-05-13T05:47:54.201Z",
    "_Id": "-o8vyJYBJIvwjgX0cO4o",
    "agent": {
      "ephemeral_id": "2b3e2cb4-18a8-4f09-89ab-0f8487ddb992",
      "id": "76fbd991-4dcb-4f24-9176-bf20b9ab8feb",
      "name": "za-itf-gateway",
      "type": "filebeat",
      "version": "8.4.3"
    },
    "ecs": {
      "version": "8.0.0"
    },
    "host": {
      "name": "za-itf-gateway"
    },
    "input": {
      "type": "log"
    },
    "log": {
      "file": {
        "path": "/var/log/syslog"
      },
      "offset": 138122570
    },
    "message": "May 13 07:47:53 za-itf-gateway dnsmasq[3370]: query[A] hpiers.obspm.fr.svc.miditf.internal.skao.int from 10.20.0.21",
    "ska": {
      "application": "syslog",
      "datacentre": "mid-itf",
      "environment": "production",
      "service": "k8s"
    }
  }
]
Looking at this sample JSON document dump, we can see values that can, for instance, be used to select the host name associated with log messages, e.g. host.name: za-itf-gateway.
Looking for Container Logs
There are key metadata elements that associate logs with a cluster, namespace and container. The core fields are:
Field | Use | Example
---|---|---
input.type | Log source (syslog, containers) | log, container
ska.datacentre | Identifies the datacentre logs come from | stfc-techops, stfc-dp, mid-itf, low-itf
kubernetes.namespace | Namespace | staging-dish-lmc-ska100
kubernetes.statefulset.name | StatefulSet name | ds-dish-logger-100
kubernetes.container.name | Container name | deviceserver
ska_severity | SKA custom log field for log severity | INFO, ERROR
ska_tags_field | SKA custom dynamic log message tags | ska_tags_field.tango-device: ska100/spfrxpu/controller
Note
There are more SKA custom fields available - for more details, see https://confluence.skatelescope.org/display/SWSI/SKA+Log+Message+Format. More details on the built-in Kubernetes metadata are available at https://www.elastic.co/docs/reference/beats/filebeat/exported-fields-kubernetes-processor.
Putting this all together, we can create a query like this:
$ elktail -n 1 "ska.datacentre: mid-itf AND kubernetes.namespace: staging-dish-lmc-ska100 AND kubernetes.statefulset.name: ds-dish-logger-100 AND kubernetes.container.name: deviceserver AND ska_tags_field.tango-device: ska100/spfrxpu/controller"
[2025-05-13T06:59:28.867Z] [l:INFO] za-itf-cloud03 :: 1|2025-05-13T06:59:28.866Z|INFO|unknown_thread||unknown_file#0|tango-device:ska100/spfrxpu/controller|[/usr/local/src/ska-mid-spfrx-controller-ds/src/SkaMidSpfrxControllerDs.cpp:8205] SkaMidSpfrxControllerDs::monitor_ping(): Ping received from client
Adjust Template Output
Once records are filtered for output, the output format can be adjusted by providing a template that references fields:
$ elktail -n 1 -F "%ska_log_timestamp :: SEV: [%ska_severity] TAGS: %ska_tags :: MSG -> %ska_message" "ska.datacentre: mid-itf AND kubernetes.namespace: staging-dish-lmc-ska100 AND kubernetes.statefulset.name: ds-dish-logger-100 AND kubernetes.container.name: deviceserver AND ska_tags_field.tango-device: ska100/spfrxpu/controller"
2025-05-13T07:23:09.607Z :: SEV: [INFO] TAGS: tango-device:ska100/spfrxpu/controller :: MSG -> [/usr/local/src/ska-mid-spfrx-controller-ds/src/SkaMidSpfrxControllerDs.cpp:8205] SkaMidSpfrxControllerDs::monitor_ping(): Ping received from client
More details are available at https://gitlab.com/piersharding/elktail#format-output.
Monitoring & Logging Dashboards
To facilitate correlating application state with resource usage and the underlying infrastructure, we also provide the same logs in several Grafana dashboards. Please refer to how to use the monitoring solution for a deep dive into the capabilities of these dashboards and how they should be used.

Grafana pod logs
Note that the timeframe of the logs is automatically adjusted to the timeframe selected in the Grafana dashboard. For these logs, we get the same fields (which we can filter with in the Grafana dashboard itself) as we do in Kibana:

Grafana document fields
This provides a unique view into the application’s status and how it affects its own resource usage as well as the underlying infrastructure.
Logging Service URLs
Datacentre | Kibana | Elasticsearch
---|---|---
stfc-techops (cicd) | |
stfc-dp (cicd) | |
aws-* | |
mid-itf/low-itf | |
mid-aa | |
low-aa | |
For the environments in central logging, you can filter with:
Datacentre | ska.datacentre | ska.environment
---|---|---
stfc-techops (cicd) | stfc-techops | production
stfc-dp (cicd) | stfc-dp | production
aws-* | N/A | N/A
mid-itf | mid-itf | production
low-itf | low-itf | production
psi-mid | psi-mid | production
digital-signal-psi | digital-signal-psi | production