Run database migrations using native Kubernetes
October 14th, 2022
At work, I recently had to migrate the database of a project from a managed NoSQL solution to PostgreSQL. Besides our data model changing entirely (thank you JSONB), the second biggest change was handling database schema migrations properly.
There are many valid solutions to that problem. One may choose to let their Continuous Deployment (CD) system run the migrations. It is also possible to let the application itself run them. In any case, the only two rules are:
Migrations need to run before the application starts
If a migration fails, the new version of the application must not be deployed
In our case, we took the path of least resistance: our CD system did not have credentials for our database cluster, but the Kubernetes cluster running our application did. So we went with native Kubernetes constructs to move fast, or so we thought...
Using initContainers
A big part of what Kubernetes does is schedule and run Docker containers. To do so, it wraps them within Pods, which are the smallest components that can be scheduled. A Pod is not a simple Docker container though: it is a wrapper around one or more containers. A common use case for running two containers inside a single Pod is to use one of them as a sidecar to your application, to forward logs for instance.
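For illustration only (this is not from our project), here is a minimal sketch of such a Pod: an application container and a hypothetical log-forwarding sidecar sharing a log directory through a volume.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar # hypothetical name
spec:
  volumes:
    - name: logs
      emptyDir: {}
  containers:
    - name: app
      image: my-app:1.0.0 # assumption: the app writes its logs to /var/log/app
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
    - name: log-forwarder # sidecar reading the same files and shipping them elsewhere
      image: my-log-forwarder:1.0.0
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
          readOnly: true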
While multiple containers may run concurrently within the same Pod, it is not possible to control their order of execution. This is where initContainers come to the rescue. By declaring initContainers, you can run containers whose tasks must complete successfully before the main containers start. The only change we had to make was to bundle our migration directory within our production Docker image.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Values.appName }}
spec:
  replicas: 3
  # selector and labels are required to tie the Deployment to the Pods it manages
  selector:
    matchLabels:
      app: {{ .Values.appName }}
  template:
    metadata:
      labels:
        app: {{ .Values.appName }}
    spec:
      initContainers:
        - name: db-migration
          image: {{ .Values.image }}
          command:
            - npx
            - prisma
            - migrate
            - deploy
          env:
            - name: "DATABASE_URL"
              value: {{ .Values.databaseUrl }}
          # ...
      containers:
        - name: app
          image: {{ .Values.image }}
          env:
            - name: "DATABASE_URL"
              value: {{ .Values.databaseUrl }}
          # ...
By doing so, each new Pod will first attempt to run migrations. The first one to apply them will succeed, while the others will either wait for the database migration lock or do nothing since the migrations were already applied.
This should be a solution for 99% of database migration use cases. Sadly, we ran into a specific limitation of Prisma, our ORM and migration tool: we were not able to execute concurrent migrations from different Pods. From there, we had to find a solution that executes the migration from a single place.
Using a dedicated Job with Helm Hooks
Because our project was described as a Helm Chart, we discovered from the Helm documentation that it is possible to trigger the execution of a Kubernetes Job before the deployment is actually made.
All we had to do was to extract the initContainer into a dedicated Kubernetes Job resource, and add the "helm.sh/hook": pre-install,pre-upgrade annotation to it.
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migration
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: db-migration
          image: {{ .Values.image }}
          command:
            - npx
            - prisma
            - migrate
            - deploy
          env:
            - name: DATABASE_URL
              value: {{ .Values.databaseUrl }}
          # ...
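A side note we never had to act on, since we abandoned this approach: a Job's spec is immutable once created, so running the hook twice with the same Job name can collide with a leftover Job. Helm has a dedicated annotation for this; a sketch of what could be added to the metadata above:
metadata:
  name: db-migration
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    # Ask Helm to delete any previous hook Job before creating the new one
    "helm.sh/hook-delete-policy": before-hook-creation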
And voilà. Every time our project was deployed with helm upgrade, that Job would be created and our Pods would get redeployed... But again, no.
Just because it quacks like a duck does not mean it is a duck. Our Kubernetes files were declared within a Helm Chart, but we were only using Helm for its templating capabilities: under the hood, our internal deployment CLI runs kubectl apply -f, not helm upgrade, so Helm hooks are never triggered.
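Concretely, deploying with Helm as a pure templating engine boils down to something like this (simplified; the release and chart names are made up):
# Render the chart into plain manifests, then apply them directly.
# The hook annotations mean nothing to kubectl, so no ordering is guaranteed.
helm template my-release ./chart --values values.yaml | kubectl apply -f -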
But limitations foster creativity, right?
Using a dedicated Job with initContainers
At this point, we knew we needed to use native Kubernetes constructs, and to run migrations from a single place.
Our final approach was the right one: use a combination of a single Job to run the migration, and initContainers in our project's Deployment to make our Pods' main containers wait for that Job to be done.
As of October 2022, if you ever need this, there is an MIT-licensed open-source project named k8s-wait-for maintained by GitHub user groundnuty. However, we could not use it easily: for security reasons, we could only use Docker images from our private Docker registry, and depending on a relatively new and untrusted Docker image was not an option for us.
Because we only needed to wait for a Kubernetes Job, implementing a non-generic version of it was doable without too much hassle. All we had to do was create a new ServiceAccount with the right permissions to introspect Jobs through the Kubernetes API, and request the right endpoint with an authenticated call.
Long story short, it looked like this.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: wait-for-migration
  namespace: {{ .Values.namespace }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: wait-for-migration
  namespace: {{ .Values.namespace }}
rules:
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: wait-for-migration
  namespace: {{ .Values.namespace }}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: wait-for-migration
subjects:
  - kind: ServiceAccount
    name: wait-for-migration
    namespace: {{ .Values.namespace }}
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Values.appName }}
spec:
  replicas: 3
  selector:
    matchLabels:
      app: {{ .Values.appName }}
  template:
    metadata:
      labels:
        app: {{ .Values.appName }}
    spec:
      # Use the ServiceAccount above so its token gets mounted into the Pod
      serviceAccountName: wait-for-migration
      initContainers:
        - name: db-migration
          image: {{ .Values.image }}
          command: ["/bin/bash"]
          args:
            - "-c"
            - |
              SERVICE_ACCOUNT="/var/run/secrets/kubernetes.io/serviceaccount"
              CACERT="${SERVICE_ACCOUNT}/ca.crt"
              TOKEN="$(cat ${SERVICE_ACCOUNT}/token)"
              URL="https://{{ .Values.cluster }}/apis/batch/v1/namespaces/{{ .Values.namespace }}/jobs/db-migration"
              # Poll the Job status for up to 60 seconds
              for i in $(seq 60); do
                sleep 1
                STATUS=$(curl --cacert ${CACERT} \
                  --header "Authorization: Bearer ${TOKEN}" \
                  --silent \
                  ${URL} | jq --raw-output '.status.conditions[0].type')
                [ "${STATUS}" = "Complete" ] && exit 0
              done
              exit 1
      containers:
        - name: app
          image: {{ .Values.image }}
          env:
            - name: "DATABASE_URL"
              value: {{ .Values.databaseUrl }}
          # ...
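To see what the initContainer is polling for, you can read the same status field with kubectl (my-namespace being a placeholder):
# The initContainer exits 0 once this reports "Complete"
kubectl get job db-migration --namespace my-namespace \
  --output jsonpath='{.status.conditions[0].type}'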
Conclusion
Even though we would have preferred our original Prisma approach to work, and even though It Took More Time Than Expected™️, we are quite happy with this compromise. Our database migration process relies only on native Kubernetes constructs, which makes it easier to understand and leaves our team in total control over what is happening. I also deeply believe that the journey is more important than the destination. Along the way, our team learned about initContainers, Helm, authenticated calls to the Kubernetes API, and RBAC authorization; and to test all of that, we learned how to run a local Kubernetes cluster using k3d.io. As with many things, learning means failing, and we did learn a lot from this migration!