Monitoring & Logging How-to Guides#

Practical tasks for working with logs and metrics.

Note

Examples use central logging URLs. For production datacentres (low-aa, mid-aa), use the URLs listed in the Monitoring & Logging Reference.

Set up Kibana access#

Guest access (read-only):

Go to Kibana and click Continue as Guest.

Full access:

  1. Create an STS ticket

  2. Log in with your JIRA username and the password from the STS response

  3. Create an API key: Stack Management > API keys

Figure: Creating an API key in Kibana

Test your API key:

curl -k -H "Authorization: ApiKey <your-api-key>" \
  https://logging.stfc.skao.int:9200/_cat/health
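If the check fails, the HTTP status code usually hints at the cause. The sketch below is one way to surface it. `explain_status` and `check_api_key` are hypothetical helper names, and the status meanings are general Elasticsearch conventions rather than anything confirmed for this cluster.

```shell
# Hypothetical helper: map an HTTP status code to a likely cause.
explain_status() {
  case "$1" in
    200) echo "API key accepted" ;;
    401) echo "API key rejected: check that the key was copied in full" ;;
    403) echo "API key valid but missing the required privileges" ;;
    000) echo "no HTTP response: check network, VPN, or DNS" ;;
    *)   echo "unexpected HTTP status $1" ;;
  esac
}

# Hypothetical helper: call the health endpoint and report the outcome.
# curl prints 000 as the status when no HTTP response was received.
check_api_key() {
  status=$(curl -k -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: ApiKey $1" \
    https://logging.stfc.skao.int:9200/_cat/health)
  explain_status "$status"
}
```

After sourcing these functions, run `check_api_key '<your-api-key>'` to get a one-line diagnosis.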

Filter logs in Kibana#

Kibana’s Discover view provides log searching with these controls:

  1. Data view — Select index patterns

  2. Date selection — Set time range

  3. Filters — Filter by fields with AND/OR operators

  4. Search bar — KQL expressions (combined with filters using AND)

  5. Field list — Available fields to display

  6. Document view — Matching log entries

Figure: Kibana Discover view

To find Device Server logs in a specific namespace:

Figure: Filtering logs by namespace

Use filters for exact matches and KQL for substring searches. Each filter shows the most common values:

Figure: Field value suggestions

Query Elasticsearch locally#

For efficient log analysis, query Elasticsearch directly using ES|QL.

Find logs by namespace and component:

API_KEY='<your-api-key>'
curl -qk -X POST "https://logging.stfc.skao.int:9200/_query?format=json&pretty" \
  -H "Authorization: ApiKey $API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "FROM filebeat-8* | WHERE ska.datacentre == \"stfc-techops\" AND ska.environment == \"production\" AND kubernetes.namespace == \"ska-ser-namespace-manager\" AND `kubernetes.labels.app_kubernetes_io/component` == \"api\" | KEEP message | LIMIT 100"
  }' 2>/dev/null | jq -r ".values[][]"

Find logs by GitLab job ID:

curl -qk -X POST "https://logging.stfc.skao.int:9200/_query?format=json&pretty" \
  -H "Authorization: ApiKey $API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "FROM filebeat-8* | WHERE ska.datacentre == \"stfc-techops\" AND ska.environment == \"production\" AND `kubernetes.labels.cicd_skao_int/jobId` == \"10006002600\" | KEEP message | LIMIT 1000"
  }' 2>/dev/null | jq -r ".values[][]"

Sort by oldest first:

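Add | SORT @timestamp ASC as a pipeline stage before LIMIT, so that the sort applies to all matching rows rather than only the limited subset.

Filter by time range:

Add | WHERE @timestamp > TO_DATETIME("2025-05-13T05:00:00Z") AND @timestamp <= NOW() as a pipeline stage.

Filter by container name:

Add | WHERE kubernetes.container.name LIKE "test-runner*" as a pipeline stage.

When a stage goes inside the JSON body of a curl command, escape each double quote as \".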
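The three additions above can be combined in a single query. The sketch below assembles one from shell arguments. `build_esql` is a hypothetical helper name, and the choice of `KEEP message` and `LIMIT 100` simply mirrors the earlier examples.

```shell
# Hypothetical helper: compose an ES|QL query from the optional stages above.
# Arguments: namespace, start time (ISO 8601), container name pattern.
# printf reuses the '%s' format for every argument, so the pieces are
# concatenated into a single line with no separators.
build_esql() {
  printf '%s' \
    'FROM filebeat-8*' \
    " | WHERE kubernetes.namespace == \"$1\"" \
    " | WHERE @timestamp > TO_DATETIME(\"$2\") AND @timestamp <= NOW()" \
    " | WHERE kubernetes.container.name LIKE \"$3\"" \
    ' | SORT @timestamp ASC' \
    ' | KEEP message' \
    ' | LIMIT 100'
}
```

For example, `build_esql ska-ser-namespace-manager 2025-05-13T05:00:00Z 'test-runner*'` prints the query on one line. SORT is placed before LIMIT so that the oldest 100 matching entries are returned.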

Tip

Build queries in Kibana’s ES|QL mode first, then convert to curl commands.

Figure: Enable ES|QL in Kibana
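Converting a query copied from Kibana into a curl command mostly means escaping it for the JSON body. The sketch below shows one possible way to do that. `esql_payload` and `run_esql` are hypothetical helper names, and the escaping assumes a single-line query in which only backslashes and double quotes need attention.

```shell
# Hypothetical helper: wrap a single-line ES|QL query in the JSON body
# expected by the _query endpoint. Only backslashes and double quotes are
# escaped; multi-line queries or control characters are not handled.
esql_payload() {
  escaped=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"query": "%s"}\n' "$escaped"
}

# Hypothetical helper: send the query with the same curl options as the
# earlier examples. Expects API_KEY to be set in the environment.
run_esql() {
  esql_payload "$1" |
    curl -qk -X POST "https://logging.stfc.skao.int:9200/_query?format=json&pretty" \
      -H "Authorization: ApiKey $API_KEY" \
      -H 'Content-Type: application/json' \
      --data-binary @- 2>/dev/null | jq -r ".values[][]"
}
```

Paste the query from Kibana inside single quotes, for example `run_esql 'FROM filebeat-8* | WHERE kubernetes.namespace == "ska-ser-namespace-manager" | KEEP message | LIMIT 10'`, so that the shell leaves the double quotes intact.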

Use elktail for command-line queries#

elktail provides command-line log queries using KQL syntax.

Installation:

Download binaries from the releases page.

Configuration:

Generate a config file using this snippet, then replace the APIKey value.

Basic query:

elktail -n 1 "ska.application: syslog AND ska.datacentre: mid-itf AND message: dnsmasq"

Query container logs:

elktail -n 1 "ska.datacentre: mid-itf AND kubernetes.namespace: staging-dish-lmc-ska100 AND kubernetes.statefulset.name: ds-dish-logger-100 AND kubernetes.container.name: deviceserver"

Custom output format:

elktail -n 1 -F "%ska_log_timestamp :: SEV: [%ska_severity] MSG -> %ska_message" \
  "ska.datacentre: mid-itf AND kubernetes.namespace: staging-dish-lmc-ska100"

Inspect raw JSON:

elktail -p -n 1 "ska.application: syslog AND ska.datacentre: mid-itf"

Use Grafana dashboards#

Access Grafana:

  1. Go to https://monitoring.skao.int

  2. Click Sign in with Azure AD

  3. Sign in as <jira-username>@community.skao.int or <jira-username>@assoc.skao.int, using your JIRA password

Browse dashboards:

Figure: Browsing Grafana dashboards

Create and share dashboards:

Follow the Grafana dashboard guide to create custom dashboards.

Figure: Sharing a Grafana dashboard

View logs in Grafana#

Several Grafana dashboards include integrated log views:

Figure: Pod logs in Grafana

The log timeframe adjusts automatically to match the dashboard selection. You can filter using the same fields available in Kibana:

Figure: Log fields in Grafana

Check Prometheus alerts#

Alert Manager UI (requires VPN):

Figure: Prometheus Alert Manager

Slack alert channels:

STFC datacentre:

DP datacentre:

Monitor namespace health#

The SKA Namespace Manager monitors namespace health and sends Slack notifications when issues occur:

Figure: Namespace health alerts from Marvin

Alerts include affected resources, suggestions, and runbooks. They link to the GitLab job that deployed the namespace.

View namespace status in Headlamp:

Figure: Namespace status in Headlamp

Use make targets#

These make targets support monitoring and logging workflows:

# Show pods, images, and Helm dependencies
KUBE_NAMESPACE=<namespace> HELM_RELEASE=<release> make k8s-namespace-info

# Generate prebuilt monitoring URLs
make k8s-namespace-links

# Output logs for all pods (use VERBOSE_WAIT=true in pipelines)
VERBOSE_WAIT=true KUBE_NAMESPACE=<namespace> make k8s-podlogs

The pipeline machinery calls these targets automatically.
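When debugging outside a pipeline, it can be convenient to run the three targets in one go. The wrapper below is a minimal sketch. `namespace_report` is a hypothetical name, the `MAKE` override is an illustrative dry-run convenience, and passing `KUBE_NAMESPACE` to `k8s-namespace-links` is an assumption.

```shell
# Hypothetical wrapper: run the three targets above for one namespace.
# Arguments: namespace, Helm release.
# MAKE can be overridden (for example MAKE=echo) to preview the targets
# without running them. Each step runs only if the previous one succeeded.
namespace_report() {
  KUBE_NAMESPACE="$1" HELM_RELEASE="$2" ${MAKE:-make} k8s-namespace-info &&
    KUBE_NAMESPACE="$1" ${MAKE:-make} k8s-namespace-links &&
    VERBOSE_WAIT=true KUBE_NAMESPACE="$1" ${MAKE:-make} k8s-podlogs
}
```

Run it as `namespace_report <namespace> <release>` from a repository that provides these make targets.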