Debugging with Coder

Coder is an open-source platform for creating cloud-based development environments that work with the IDEs developers already use. A Coder workspace runs inside the Kubernetes cluster, so it gives secure access to cluster resources.

Note

Coder is a highly configurable platform, so if you need something that is not currently provided (e.g., an image or other compute options), please reach out to the System Team.

Datacentre             Coder
stfc-techops (cicd)    https://coder.k8s.stfc.skao.int/login
stfc-dp (cicd)         N/A
aws-*                  N/A
mid-itf/low-itf        N/A
mid-aa                 N/A
low-aa                 N/A

Note that production environments are not expected to have such a tool.

Getting started

First, we need to log in. Currently, we provide GitLab as an authentication method:

Coder home page

You can now navigate to your workspaces. If you don’t have one yet, which is likely on first login, you can create one.

Coder create workspace

You need to give it a name and some compute requirements (i.e., CPU, RAM and disk). Note that some of these requirements can be changed later, while others (such as disk space) cannot.

Coder workspace configuration

When it is ready, you need to connect to it using one of the offered connection options:

Coder connection options

Currently, you can access it with:

  • VS Code Desktop app

  • Browser-based JupyterLab

  • Browser-based VS Code

  • Browser-based terminal

  • SSH

Management

After your workspace is created, you can manage it by clicking on it in the workspaces page. You can stop, start or restart the environment at any time. From the settings drop-down, you can configure the workspace’s parameters and scheduling rules.

Parameters

As mentioned earlier, we can adjust the environment’s compute resources, except for the disk space:

Coder workspace parameters

As you can see, you can only use one of the predefined values for each compute property.

Schedule

To better manage the lifecycle of your workspace and save on resource usage, you can schedule it to autostart on chosen days of the week and control when it autostops:

Coder workspace scheduling

Access and Tools

Note

Any package installed outside of the home directory is currently NOT persisted.

Using any connection option, you can access the terminal and any tools installed there. If you need a package that is not available, you can install it with apt. You can also open multiple connections with different options at once:

Coder multiple connections

Coder multiple simultaneous connections

Coder is therefore the primary way for developers to access the development clusters for debugging, because it has built-in security and uses short-lived credentials.

By default, the following CLI tools are provided:

  • kubectl

  • helm

  • k9s

  • tango_admin

  • Jupyter’s toolset

  • poetry

  • pip

Other useful tools can be installed with apt:

$ sudo apt update
$ sudo apt install -y jq # work with json documents
$ sudo apt install -y netcat # brings 'nc' for debugging TCP connections
$ sudo apt install -y dnsutils # brings 'dig' and 'nslookup' for debugging DNS
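
Before installing anything, it can help to check which tools are already on the PATH. The following is a minimal sketch using only the Python standard library. The function name `missing_tools` and the `EXPECTED_TOOLS` tuple are illustrative, not part of Coder or the workspace image; the tuple simply combines the default tools listed above with the commands brought in by the apt packages.

```python
import shutil

# Default tools listed above, plus the commands from the optional apt packages.
# Adjust this tuple to match what your workspace image actually ships.
EXPECTED_TOOLS = (
    "kubectl", "helm", "k9s", "tango_admin", "poetry", "pip",
    "jq", "nc", "dig", "nslookup",
)


def missing_tools(tools=EXPECTED_TOOLS):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

Calling `missing_tools()` from a workspace terminal returns the names that still need installing; an empty list means every tool was found.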

From this terminal, you can connect to any Pod or Service in the Kubernetes cluster. As an example, we will use some TANGO Device Server pods, but this can be applied to any application:

Note

When accessing a resource in any namespace by its domain name, you must append the <namespace>.svc suffix (for example, tango-databaseds.staging-ska-tango-examples.svc).

$ kubectl get svc -n staging-ska-tango-examples

NAME                                  TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                         AGE
databaseds-tangodb-tango-databaseds   ClusterIP   10.105.219.155   <none>        3306/TCP                        30d
ds-asynctabata-asyncounters           ClusterIP   10.98.180.191    <none>        45450/TCP,45460/TCP,45470/TCP   30d
ds-asynctabata-tabata                 ClusterIP   10.110.76.23     <none>        45450/TCP,45460/TCP,45470/TCP   30d
ds-eventreceiver-01                   ClusterIP   10.99.48.188     <none>        45450/TCP,45460/TCP,45470/TCP   30d
ds-forattrtabata-test                 ClusterIP   10.103.62.54     <none>        45450/TCP,45460/TCP,45470/TCP   30d
ds-longrunning-lrcontroller           ClusterIP   10.108.70.136    <none>        45450/TCP,45460/TCP,45470/TCP   30d
ds-longrunning-stations               ClusterIP   10.106.243.134   <none>        45450/TCP,45460/TCP,45470/TCP   30d
ds-longrunning-tiles                  ClusterIP   10.98.155.103    <none>        45450/TCP,45460/TCP,45470/TCP   30d
ds-tabata-counters                    ClusterIP   10.108.225.106   <none>        45450/TCP,45460/TCP,45470/TCP   30d
ds-tabata-tabata                      ClusterIP   10.108.45.10     <none>        45450/TCP,45460/TCP,45470/TCP   30d
ds-theexample-test                    ClusterIP   10.110.94.135    <none>        45450/TCP,45460/TCP,45470/TCP   30d
ds-theexample-test2                   ClusterIP   10.97.129.245    <none>        45450/TCP,45460/TCP,45470/TCP   30d
ds-timer-counters                     ClusterIP   10.96.211.236    <none>        45450/TCP,45460/TCP,45470/TCP   30d
ds-timer-timer                        ClusterIP   10.102.128.246   <none>        45450/TCP,45460/TCP,45470/TCP   30d
tango-databaseds                      ClusterIP   10.104.197.88    <none>        10000/TCP                       30d

$ nc -vz databaseds-tangodb-tango-databaseds.staging-ska-tango-examples.svc 3306

Connection to databaseds-tangodb-tango-databaseds.staging-ska-tango-examples.svc (10.105.219.155) 3306 port [tcp/*] succeeded!

$ nc -vz tango-databaseds.staging-ska-tango-examples.svc 10000

Connection to tango-databaseds.staging-ska-tango-examples.svc (10.104.197.88) 10000 port [tcp/*] succeeded!
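
The same reachability check can be scripted, which is handy when several Services and ports need testing in one go. This is a minimal sketch using only the standard library; `service_host` and `tcp_port_open` are hypothetical helper names, and the 3-second default timeout is an arbitrary choice.

```python
import socket


def service_host(service, namespace):
    """Build the in-cluster name of a Service using the <namespace>.svc suffix."""
    return f"{service}.{namespace}.svc"


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

For example, `tcp_port_open(service_host("tango-databaseds", "staging-ska-tango-examples"), 10000)` performs the same check as the second `nc -vz` command above.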

$ nslookup tango-databaseds.staging-ska-tango-examples.svc

;; Got recursion not available from 10.96.0.10
;; Got recursion not available from 10.96.0.10
;; Got recursion not available from 10.96.0.10
Server:         10.96.0.10
Address:        10.96.0.10#53
Name:   tango-databaseds.staging-ska-tango-examples.svc.techops.internal.skao.int
Address: 10.104.197.88
;; Got recursion not available from 10.96.0.10

We can also use tango_admin to do some queries:

$ export TANGO_HOST=tango-databaseds.staging-ska-tango-examples.svc:10000
$ tango_admin --ping-database

0

$ tango_admin --server-list

asynctabata DataBaseds eventreceiver forattrtabata longrunning tabata TangoAccessControl TangoRestServer TangoTest theexample timer

$ tango_admin --server-instance-list asynctabata

asyncounters tabata
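
These queries can also be driven from a script. The following sketch assumes `tango_admin` is on the PATH and only uses the `--server-list` flag shown above; the helper names (`tango_env`, `parse_names`, `server_list`) are hypothetical, and the default port of 10000 is taken from the Service listing.

```python
import os
import subprocess


def tango_env(service, namespace, port=10000, base_env=None):
    """Return a copy of the environment with TANGO_HOST pointing at the Tango database."""
    env = dict(os.environ if base_env is None else base_env)
    env["TANGO_HOST"] = f"{service}.{namespace}.svc:{port}"
    return env


def parse_names(output):
    """Split the whitespace-separated names printed by tango_admin into a list."""
    return output.split()


def server_list(env):
    """Run `tango_admin --server-list` and return the server names."""
    result = subprocess.run(
        ["tango_admin", "--server-list"],
        env=env, capture_output=True, text=True, check=True,
    )
    return parse_names(result.stdout)
```

Passing the environment explicitly avoids exporting TANGO_HOST in the shell, so several databases can be queried from the same session.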

If you prefer an interactive session, you can install itango using pip:

$ pip install itango
$ export TANGO_HOST=tango-databaseds.staging-ska-tango-examples.svc:10000
$ itango3

Coder itango3 session

Cluster access

The permissions you have within a Coder workspace are limited but should give you access to everything you need. If you know the namespace you are targeting, you can do most list and view operations:

$ kubectl get pods -n staging-ska-tango-examples

NAME                                     READY   STATUS    RESTARTS       AGE
databaseds-ds-tango-databaseds-0         1/1     Running   6 (9d ago)     9d
databaseds-tangodb-tango-databaseds-0    1/1     Running   0              2d8h
ds-asynctabata-asyncounters-0            1/1     Running   0              46h
ds-asynctabata-tabata-0                  1/1     Running   0              46h
ds-eventreceiver-01-0                    1/1     Running   0              46h
ds-forattrtabata-test-0                  1/1     Running   0              46h
ds-longrunning-lrcontroller-0            1/1     Running   0              46h
ds-longrunning-stations-0                1/1     Running   0              46h
ds-longrunning-tiles-0                   1/1     Running   0              46h
ds-tabata-counters-0                     1/1     Running   38 (25m ago)   46h
ds-tabata-tabata-0                       1/1     Running   0              46h
ds-theexample-test-0                     1/1     Running   0              46h
ds-theexample-test2-0                    1/1     Running   0              46h
ds-timer-counters-0                      1/1     Running   0              46h
ds-timer-timer-0                         1/1     Running   0              46h
ska-tango-base-itango-console            1/1     Running   0              30d
theexample-admin-test-5c7dd96859-w97gs   1/1     Running   0              2d9h
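
In a long listing, pods that keep restarting (such as ds-tabata-counters-0 above) are easy to miss. The sketch below reads the JSON form of the same listing (`kubectl get pods -o json`) and reports the pods at or above a restart threshold. The helper names and the default threshold of 5 are illustrative choices, not an established convention.

```python
import json
import subprocess


def load_pods(namespace):
    """Fetch the pod list of `namespace` as parsed JSON (needs kubectl on the PATH)."""
    out = subprocess.check_output(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"]
    )
    return json.loads(out)


def restart_counts(pod_list):
    """Map pod name -> total container restarts."""
    counts = {}
    for pod in pod_list.get("items", []):
        statuses = pod.get("status", {}).get("containerStatuses", [])
        counts[pod["metadata"]["name"]] = sum(s.get("restartCount", 0) for s in statuses)
    return counts


def restarting_pods(pod_list, threshold=5):
    """Return (name, restarts) pairs at or above `threshold`, most restarts first."""
    noisy = [(n, c) for n, c in restart_counts(pod_list).items() if c >= threshold]
    return sorted(noisy, key=lambda item: item[1], reverse=True)
```

A typical call would be `restarting_pods(load_pods("staging-ska-tango-examples"))`.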

$ kubectl delete pod ds-eventreceiver-01-0 -n staging-ska-tango-examples

Error from server (Forbidden): pods "ds-eventreceiver-01-0" is forbidden: User "system:serviceaccount:coder:coder-dev" cannot delete resource "pods" in API group "" in the namespace "staging-ska-tango-examples"
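
Rather than discovering a missing permission through a Forbidden error, you can ask the API server up front with `kubectl auth can-i`. The sketch below wraps that command; `can_i_command` and `can_i` are hypothetical helper names, and kubectl is assumed to be on the PATH with the workspace's credentials.

```python
import subprocess


def can_i_command(verb, resource, namespace):
    """Build the `kubectl auth can-i` command line for one verb/resource pair."""
    return ["kubectl", "auth", "can-i", verb, resource, "-n", namespace]


def can_i(verb, resource, namespace):
    """Return True if the current credentials may perform `verb` on `resource`."""
    result = subprocess.run(
        can_i_command(verb, resource, namespace),
        capture_output=True, text=True, check=False,
    )
    # kubectl prints "yes" or "no" on stdout.
    return result.stdout.strip() == "yes"
```

With the permissions shown above, `can_i("delete", "pods", "staging-ska-tango-examples")` would be expected to return False, while `can_i("get", "pods", "staging-ska-tango-examples")` would return True.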

$ kubectl describe pod ds-eventreceiver-01-0 -n staging-ska-tango-examples

Name:             ds-eventreceiver-01-0
Namespace:        staging-ska-tango-examples
Priority:         0
Service Account:  default
Node:             stfc-techops-production-cicd-md-0-2nwqg-9j264/10.100.3.225
Start Time:       Mon, 12 May 2025 12:21:50 +0000
Labels:           app=ska-tango-examples
                  app.kubernetes.io/instance=eventreceiver-01
                  app.kubernetes.io/managed-by=DeviceServerController
                  app.kubernetes.io/name=deviceserver
                  apps.kubernetes.io/pod-index=0
                  cicd.skao.int/author=matteo1981
                  cicd.skao.int/authorId=3003086
                  cicd.skao.int/branch=master
                  cicd.skao.int/commit=210b47e9e118af23f12dc68008e3bb4c35d7cf83
                  cicd.skao.int/job=deploy-staging
                  cicd.skao.int/jobId=10006002652
                  cicd.skao.int/pipelineId=1812880305
                  cicd.skao.int/project=ska-tango-examples
                  cicd.skao.int/projectId=9673989
                  cicd.skao.int/team=system
                  component=eventreceiver-01
                  controller-revision-hash=ds-eventreceiver-01-778587966b
                  domain=ska-tango-examples
                  function=ska-tango-examples-eventreceiver
                  statefulset.kubernetes.io/pod-name=ds-eventreceiver-01-0
                  subsystem=ska-tango-examples
Annotations:      cni.projectcalico.org/containerID: 344e004650cf1ce5c61cc588296f748b791d88eca88d443545a123171a2c17db
                  cni.projectcalico.org/podIP: 10.10.246.132/32
                  cni.projectcalico.org/podIPs: 10.10.246.132/32
Status:           Running
IP:               10.10.246.132
IPs:
  IP:           10.10.246.132
Controlled By:  StatefulSet/ds-eventreceiver-01
Containers:
  deviceserver:
   Container ID:  containerd://92a03b82112d454ee1f288b0520c391906cbe06d33a03ebb02ea6d8a7ddd4dff
   Image:         registry.gitlab.com/ska-telescope/ska-tango-examples/ska-tango-examples:0.5.2-dev.c210b47e9
   Image ID:      registry.gitlab.com/ska-telescope/ska-tango-examples/ska-tango-examples@sha256:2885c0550fbdd15edf9537de3af328b5752b4d444e21f817fcb3178835eed791
   Ports:         45450/TCP, 45460/TCP, 45470/TCP
   Host Ports:    0/TCP, 0/TCP, 0/TCP
   Command:
      /start-deviceserver.sh
   State:          Running
      Started:      Mon, 12 May 2025 12:21:52 +0000
   Ready:          True
   Restart Count:  0
   Limits:
      cpu:     100m
      memory:  100Mi
   Requests:
      cpu:      50m
      memory:   50Mi
   Liveness:   tcp-socket :45450 delay=3s timeout=3s period=10s #success=1 #failure=6
   Readiness:  tcp-socket :45450 delay=1s timeout=3s period=3s #success=1 #failure=20
   Environment:
      TANGO_HOST:                tango-databaseds.staging-ska-tango-examples.svc.techops.internal.skao.int:10000
      TANGO_ZMQ_EVENT_PORT:      45470
      TANGO_ZMQ_HEARTBEAT_PORT:  45460
   Mounts:
      /eventreceiver.py from ds-configuration (rw,path="eventreceiver.py")
      /start-deviceserver.sh from ds-configuration (rw,path="start-deviceserver.sh")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-cqp86 (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True
  Initialized                 True
  Ready                       True
  ContainersReady             True
  PodScheduled                True
Volumes:
  ds-configuration:
   Type:      ConfigMap (a volume populated by a ConfigMap)
   Name:      ds-configs-eventreceiver-01
   Optional:  false
  kube-api-access-cqp86:
   Type:                    Projected (a volume that contains injected data from multiple sources)
   TokenExpirationSeconds:  3607
   ConfigMapName:           kube-root-ca.crt
   ConfigMapOptional:       <nil>
   DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                           node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:                      <none>

Unlike other solutions, Coder gives persistent access to all allowed namespaces in the cluster. In the future we expect to streamline the process of crafting fine-grained permissions so that users can have higher levels of access to their own namespaces, while protecting other people’s namespaces.

Remote interactive debugging

When using the code-server connection, you can use the browser-based VS Code features fully, including installing any extensions you might want. Most of the hotkeys also remain the same. This is especially useful if you have a local debugging workflow that you want to reuse against a deployment in a remote Kubernetes cluster.

As a simple example, we are going to demonstrate this using two separate Coder workspaces, but we could be targeting any Pod in the cluster.

First, we place the following script, saved as script.py, on both machines, which is akin to checking out the target application’s repository. On the “application” workspace, it is placed under /home/tango:

import time


def greet(name):
    print(f"Hello, {name}!")


def calculate_square(number):
    return number ** 2


def main():
    name = "Alice"
    # Loop forever so there is time to attach a debugger and set breakpoints.
    while True:
        greet(name)
        result = calculate_square(5)
        print(f"The square of 5 is {result}")
        time.sleep(5)


if __name__ == "__main__":
    main()

We can then start it under debugpy, which waits for a client to attach before running the script:

$ pip install debugpy
$ python -m debugpy --listen 0.0.0.0:5678 --wait-for-client script.py

(waiting for client connection)
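
When the application's start command cannot easily be changed (for example, when it is fixed by a container image), the listener can instead be started from inside the code using debugpy's `listen` and `wait_for_client` functions. The sketch below only does so when an environment variable is set; `DEBUGPY_PORT` and `maybe_wait_for_debugger` are hypothetical names chosen for this illustration.

```python
import os


def maybe_wait_for_debugger(environ=None):
    """Start a debugpy listener if DEBUGPY_PORT is set; return True when it was started.

    DEBUGPY_PORT is a hypothetical variable name chosen for this sketch.
    """
    environ = os.environ if environ is None else environ
    port = environ.get("DEBUGPY_PORT")
    if not port:
        return False

    # Imported lazily so the script still runs without debugpy when debugging is off.
    import debugpy

    debugpy.listen(("0.0.0.0", int(port)))  # same bind address as the CLI example
    debugpy.wait_for_client()  # blocks until a client attaches
    return True
```

Calling `maybe_wait_for_debugger()` at the top of `main()` keeps normal runs unaffected, while `DEBUGPY_PORT=5678 python script.py` behaves like the command above.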

Now, on the other workspace, where we run code-server, we can follow this guide to set up a launch file that creates the remote debugging session. For that, we need to find our application’s IP (in this case, that of the other Coder workspace):

$ kubectl get pods -n coder -o wide | grep posorio-test

coder-pedroosorio-posorio-test-75cbf45ddf-zktnp            1/1     Running   0               15m     10.10.236.81    stfc-techops-production-cicd-md-0-2nwqg-hx58h   <none>           <none>
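
The same lookup can be scripted so that the IP does not have to be read from the table by eye. This sketch parses `kubectl get pods -o json` and returns the IPs of pods whose name contains a given fragment. The helper names are hypothetical, and the `coder` namespace default is taken from the command above.

```python
import json
import subprocess


def pod_ips(pod_list, name_contains):
    """Map pod name -> pod IP for pods whose name contains `name_contains`."""
    ips = {}
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        ip = pod.get("status", {}).get("podIP")
        # Pods that are not scheduled yet have no IP and are skipped.
        if name_contains in name and ip:
            ips[name] = ip
    return ips


def workspace_pod_ips(name_contains, namespace="coder"):
    """Run kubectl and return the matching pod IPs (needs kubectl on the PATH)."""
    out = subprocess.check_output(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"]
    )
    return pod_ips(json.loads(out), name_contains)
```

For the workspace above, `workspace_pod_ips("posorio-test")` would return a one-entry dictionary with the pod name and its IP.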

We can create the launch file under .vscode/launch.json with:

{
   "version": "0.2.0",
   "configurations": [
      {
         "name": "Attach",
         "type": "debugpy",
         "request": "attach",
         "connect": {
            "host": "10.10.236.81",
            "port": 5678
         },
         "pathMappings": [
            {
               "localRoot": "${workspaceFolder}",
               "remoteRoot": "/home/tango"
            }
         ]
      }
   ]
}
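
Because a Pod's IP usually changes when the Pod is recreated, the host in the launch file may need updating between sessions. The following sketch generates an equivalent configuration from a host, port and remote root. The function names are hypothetical, and the defaults mirror the values used in this example.

```python
import json


def attach_config(host, port=5678, remote_root="/home/tango"):
    """Return a launch.json document with a single debugpy attach configuration."""
    return {
        "version": "0.2.0",
        "configurations": [
            {
                "name": "Attach",
                "type": "debugpy",
                "request": "attach",
                "connect": {"host": host, "port": port},
                "pathMappings": [
                    {"localRoot": "${workspaceFolder}", "remoteRoot": remote_root}
                ],
            }
        ],
    }


def render_launch_json(host, **kwargs):
    """Serialise the configuration as indented JSON text."""
    return json.dumps(attach_config(host, **kwargs), indent=3)
```

The text returned by `render_launch_json("10.10.236.81")` can be written to .vscode/launch.json in the workspace folder.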

Note that for breakpoints to work properly, we need to have all the required pathMappings in place. Once we attach using the “Attach” configuration, the program starts:

Coder VSCode attach debug

After a few runs, we can set a breakpoint and change the name we are greeting:

Coder Breakpoint context change

We can see that this affects the session on both workspaces (as we would expect):

Coder remote context change

Although simple, this example shows how to run powerful debugging sessions against actual applications deployed to a Kubernetes cluster.