Restarting Kubernetes pods using a CronJob

In an absolutely perfect world, we’d never have to restart our server software, because it would be flawless. There would be no bugs, no memory leaks, no state locks, and we’d all get along with each other!

Unfortunately, we live in a much crappier world where our software is imperfect, and we don’t have enough time or resources to fix it properly.

In the olden days, we’d just put a one-liner in root’s crontab, and be done with it:

0 0 * * *     service foobar restart

But thanks to the wonderful complexity of Kubernetes, it takes a bit more finesse. Advanced cluster schedulers make the hard things easy, but they also make the easy things hard!!

Back in my day, it was one line in the crontab!

For better or for worse, we’ve moved past the simple Debian Days.

Here’s how to achieve the same thing that one crontab entry used to do.

  1. Define cluster permissions for a Service Account
  2. Create a ConfigMap to inject the script into the cluster
  3. Use a CronJob resource to execute the daily restart task
  4. Sit back and watch your pods restart…

Assumptions

First of all, let’s get some assumptions clear:

  • The Deployment that must be restarted daily is called foobar
  • The CronJob & ServiceAccount that execute this restart are called foobar-restart
  • The foobar deployment is capable of performing a rolling-restart without any disruption, or disruption at midnight UTC is acceptable
  • Everything can safely be placed in the default namespace

Cluster permissions

For a CronJob to restart a Deployment, it needs a ServiceAccount and a Role that grant it sufficient permissions. By default, a Pod cannot alter the Kubernetes environment itself, so it must be specifically granted this access. The Role is associated with the ServiceAccount through a RoleBinding, and the ServiceAccount is the identity that the Pod assumes at runtime. This means the Pod can act a bit like somebody sitting at a PC with the kubectl tool, but with a much more restricted set of allowed commands.

So in summary, we must create:

  • A ServiceAccount - an identity for our CronJob
  • A Role - The components of our Cluster that the Pod may alter
  • A RoleBinding - An association of the ServiceAccount to the Role
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: foobar-restart
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: foobar-restart
  namespace: default
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    resourceNames: ["foobar"]
    verbs: ["get", "patch", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: foobar-restart
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: foobar-restart
subjects:
  - kind: ServiceAccount
    name: foobar-restart
    namespace: default
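
If the three resources above are kept in a single file, applying them is one command (the file name here is just an example):

kubectl apply -f foobar-restart-rbac.yaml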

Next, the restart script is wrapped in a ConfigMap.

Restart script

To execute the restart, a very simple script uses the kubectl tool to perform a rolling restart of the foobar deployment. This bash script has a pair of very important safety features on lines 2 and 3 that I would recommend everybody use when possible.
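
Concretely, those two lines (repeated here with comments) make the script abort on any failure and log a clear message when it does:

set -euo pipefail                     # exit on any error, on unset variables, and on failures within a pipeline
trap 'echo "Error restarting!"' ERR   # print a message if any command fails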

Instead of simply scp’ing the shell script into the /opt directory of a Debian box somewhere, in the Kubernetes world it’s inserted into a ConfigMap resource. A ConfigMap is essentially a generic wrapper for any text or binary data that must be passed through to a Pod at a given filesystem mount path. Instead of using it to inject a config file, here it is used to place a shell script in our Job Pod.

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: foobar-restart
data:
  restart.sh: |
    #!/bin/bash
    set -euo pipefail
    trap 'echo "Error restarting!"' ERR 
    kubectl rollout restart deploy/foobar
    kubectl rollout status  deploy/foobar    
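
After applying it, it’s worth verifying that the script landed in the cluster intact (again, the file name is just an example):

kubectl apply -f foobar-restart-configmap.yaml
kubectl get configmap foobar-restart -o yaml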

CronJob template

With the authorization & config pieces ready, we can deploy the CronJob for the restart.

A CronJob on its own simply creates Job resources at a given interval, and those Jobs are just Pods with different handling for their exit codes. When a typical Deployment Pod exits with return code 0 it is restarted immediately, but when a Job’s Pod does so, the Job is marked as Complete.

---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: foobar-restart
  namespace: default
spec:
  concurrencyPolicy: "Forbid"
  schedule: '0 0 * * *' # Midnight UTC
  jobTemplate:
    spec:
      backoffLimit: 5
      activeDeadlineSeconds: 300
      template:
        spec:
          nodeSelector:
            type: "private"
          containers:
          - name: kubectl
            image: alpine/k8s:1.33.0
            command: [ "/bin/bash", "/scripts/restart.sh" ]
            volumeMounts:
            - name: foobar-restart
              mountPath: /scripts
          restartPolicy: Never
          serviceAccountName: foobar-restart
          volumes:
          - name: foobar-restart
            configMap:
              defaultMode: 0644
              name: foobar-restart

If using a service mesh like Istio, make sure to include an additional annotation to exclude the sidecar from this Pod, or else the Job will remain “in progress” forever, since the sidecar process won’t ever exit.

spec:
  jobTemplate:
    spec: 
      template:
        metadata:
          annotations:
            sidecar.istio.io/inject: "false"
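
Rather than waiting for midnight, the CronJob can be triggered by hand to check that the permissions, script, and mounts all line up (the test job name below is arbitrary):

kubectl create job foobar-restart-test --from=cronjob/foobar-restart
kubectl logs -f job/foobar-restart-test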

During execution

When the CronJob triggers at midnight, it will output a log more or less like this:

deployment.apps/foobar restarted
Waiting for deployment spec update to be observed...
Waiting for deployment "foobar" rollout to finish: 0 out of 1 new replicas have been updated...
Waiting for deployment "foobar" rollout to finish: 0 out of 1 new replicas have been updated...
Waiting for deployment "foobar" rollout to finish: 0 of 1 updated replicas are available...
Waiting for deployment "foobar" rollout to finish: 0 of 1 updated replicas are available...
deployment "foobar" successfully rolled out

When complete, you’ll have a fresh new Pod in place of the old one. And with this running every night, you can simply ignore the leaky memory situation that caused all this in the first place! Hurrah!