Introduction
Deploying databases within a containerized environment, such as Kubernetes, requires careful consideration of persistence, consistency, and scalability. Unlike stateless deployments where containers can be destroyed and recreated without losing any critical data, databases store state that must persist beyond the lifecycle of individual pod instances. This is where StatefulSets come into play. In this tutorial, we’ll explore how to deploy and scale databases with StatefulSets in Kubernetes.
A StatefulSet is a Kubernetes workload API object that manages stateful applications. It gives each of its pods a unique, stable identifier (a hostname with a sequential ordinal) and stable, persistent storage. These features make StatefulSets a good fit for deploying distributed databases and other stateful applications within Kubernetes environments.
Prerequisites
- A basic understanding of Kubernetes concepts
- A Kubernetes cluster up and running
- kubectl, the Kubernetes command-line tool, configured to talk to your cluster
- Storage for PersistentVolumes in your cluster (a default StorageClass or pre-provisioned volumes), so that each pod's volume claim can be bound
Understanding StatefulSets
Before diving into deploying a database, it’s essential to understand the fundamentals of StatefulSets:
- Stable Storage: Each pod in a StatefulSet gets its own PersistentVolumeClaim, bound to a PersistentVolume, so its data persists across pod (re)scheduling.
- Stable Network Identities: Each pod gets a unique hostname derived from the name of the StatefulSet and the pod's ordinal index (for example, mongo-0).
- Ordered, Graceful Deployment and Scaling: Pods are started one at a time in ordinal order, and they are terminated in reverse order when scaling down.
- Ordered, Automated Rolling Updates: When the StatefulSet definition is updated, changes are rolled out incrementally to each pod.
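As a rough illustration of the stable network identities described above, the following shell sketch prints the DNS names under which the pods of a StatefulSet are typically reachable when a headless Service governs it. The helper name pod_dns_names is hypothetical, and the default namespace and the cluster.local cluster domain are assumptions; your cluster may use different values.

```shell
#!/bin/sh
# Print the expected per-pod DNS names for a StatefulSet.
# Pattern: <statefulset>-<ordinal>.<service>.<namespace>.svc.<cluster-domain>
# pod_dns_names is a hypothetical helper; the namespace and domain defaults are assumptions.
pod_dns_names() {
  sts="$1"; svc="$2"; replicas="$3"
  ns="${4:-default}"; domain="${5:-cluster.local}"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    echo "${sts}-${i}.${svc}.${ns}.svc.${domain}"
    i=$((i + 1))
  done
}

# Three replicas of a StatefulSet "mongo" governed by a Service "mongo".
pod_dns_names mongo mongo 3
```

Because the names depend only on the StatefulSet name, the Service name, and the ordinal, they stay the same when a pod is rescheduled to another node.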
Defining a StatefulSet
To illustrate deploying a StatefulSet, we’ll use the example of a MongoDB replica set. Let’s start by defining the StatefulSet in a YAML file. Create a file named mongo-statefulset.yaml and add the following content:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: "mongo"
  replicas: 3
  selector:
    matchLabels:
      role: mongo
  template:
    metadata:
      labels:
        role: mongo
    spec:
      containers:
        - name: mongo
          image: mongo:4.4
          command:
            - mongod
            - "--replSet"
            - rs0
            - "--bind_ip"
            - 0.0.0.0
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: mongo-persistent-storage
              mountPath: /data/db
  volumeClaimTemplates:
    - metadata:
        name: mongo-persistent-storage
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 1Gi
This YAML file describes a StatefulSet with three MongoDB replica set members. Each pod gets its own PersistentVolume, requested through the volumeClaimTemplates entry, to store data and maintain its state across restarts and rescheduling.
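The serviceName field refers to a headless Service that provides the pods' network identity, and the manifest above does not create one. A minimal sketch of such a Service follows, written from the shell so it can be saved next to the StatefulSet manifest. The file name mongo-service.yaml is an assumption; the Service name, selector, and port mirror the values in the StatefulSet definition.

```shell
#!/bin/sh
# Write a headless Service manifest (clusterIP: None) that matches the
# StatefulSet's serviceName, pod label, and container port.
# The file name mongo-service.yaml is an assumption of this sketch.
cat > mongo-service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: mongo
spec:
  clusterIP: None
  selector:
    role: mongo
  ports:
    - port: 27017
      targetPort: 27017
EOF

# Apply it alongside the StatefulSet, for example:
#   kubectl apply -f mongo-service.yaml
```

Setting clusterIP to None is what makes the Service headless: instead of a single virtual IP, DNS returns records for the individual pods.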
Creating the StatefulSet
To create the StatefulSet, apply the YAML file using kubectl:
kubectl apply -f mongo-statefulset.yaml
Once applied, Kubernetes creates the pods defined in the StatefulSet one at a time, in ordinal order. You can monitor the status with the following command:
kubectl get statefulsets
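For a closer look than the summary line, a small helper along these lines may be convenient. It is only a sketch: check_mongo_status is a hypothetical name, and the KUBECTL override variable is an assumption added so the commands can be previewed (for example with KUBECTL=echo) without a cluster.

```shell
#!/bin/sh
# KUBECTL can be overridden (e.g. KUBECTL=echo) to print the commands instead
# of running them; this override is an assumption of the sketch.
KUBECTL="${KUBECTL:-kubectl}"

# check_mongo_status is a hypothetical helper name.
check_mongo_status() {
  # Number of ready pods as reported in the StatefulSet's status.
  $KUBECTL get statefulset mongo -o jsonpath='{.status.readyReplicas}'
  echo
  # Per-pod view, selected by the label from the pod template.
  $KUBECTL get pods -l role=mongo
}

# Usage (requires access to the cluster):
#   check_mongo_status
```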
Accessing the Stateful Pods
The pods created by the StatefulSet will have a stable network identity in the form of <statefulset-name>-<ordinal-index>. For the given example, the pods would be mongo-0, mongo-1, and mongo-2. You can interact with a specific pod like so:
kubectl exec -it mongo-0 -- mongo
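The StatefulSet starts each mongod with --replSet rs0, but the members do not form a replica set until it has been initiated. As a hedged sketch, the following builds an rs.initiate() call from the stable pod names. It assumes a headless Service named mongo exists in the same namespace, so that names such as mongo-0.mongo resolve inside the cluster; rs_initiate_js is a hypothetical helper name.

```shell
#!/bin/sh
# Build the JavaScript for rs.initiate() from the StatefulSet's stable pod names.
# Assumes a headless Service "mongo" so that <pod>.mongo resolves in the cluster.
# rs_initiate_js is a hypothetical helper.
rs_initiate_js() {
  replicas="$1"
  members=""
  i=0
  while [ "$i" -lt "$replicas" ]; do
    if [ -n "$members" ]; then
      members="${members}, "
    fi
    members="${members}{_id: ${i}, host: \"mongo-${i}.mongo:27017\"}"
    i=$((i + 1))
  done
  echo "rs.initiate({_id: \"rs0\", members: [${members}]})"
}

# Print the call for the three pods of the example.
rs_initiate_js 3

# It could then be run once against the first pod, for example:
#   kubectl exec mongo-0 -- mongo --eval "$(rs_initiate_js 3)"
```

Using the pod hostnames rather than pod IPs in the member list matters here, because the IPs change when a pod is rescheduled while the hostnames do not.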
Scaling the StatefulSet
To scale your StatefulSet, simply update the replicas field in the YAML file, or use the kubectl scale command:
kubectl scale statefulsets mongo --replicas=5
Kubernetes will incrementally add the additional pods, maintaining the order and identity guarantees that define a StatefulSet.
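To make the ordering concrete, this sketch prints which pods would be created or removed, and in what order, when the replica count changes. It only models the ordering rule described in this tutorial and does not talk to a cluster; scale_plan is a hypothetical helper name.

```shell
#!/bin/sh
# Print the pods created (ascending ordinals) or removed (descending ordinals)
# when a StatefulSet is scaled from one replica count to another.
# scale_plan is a hypothetical helper; it does not contact a cluster.
scale_plan() {
  sts="$1"; from="$2"; to="$3"
  if [ "$to" -gt "$from" ]; then
    i="$from"
    while [ "$i" -lt "$to" ]; do
      echo "create ${sts}-${i}"
      i=$((i + 1))
    done
  else
    i=$((from - 1))
    while [ "$i" -ge "$to" ]; do
      echo "delete ${sts}-${i}"
      i=$((i - 1))
    done
  fi
}

scale_plan mongo 3 5   # scaling up: mongo-3, then mongo-4
scale_plan mongo 5 3   # scaling down: mongo-4, then mongo-3
```

Note that Kubernetes only creates the new pods. Adding them as members of the MongoDB replica set is a separate step, typically done with rs.add() from the primary.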
Updating the StatefulSet
To update the configuration of your database, modify the corresponding fields in your StatefulSet YAML file and apply the changes:
kubectl apply -f mongo-statefulset.yaml
Kubernetes performs a rolling update, updating pod instances one at a time, and respecting the ordering guarantees.
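A small wrapper such as the following can apply the change and then wait for the rollout to finish. It is a sketch under assumptions: rolling_update is a hypothetical name, and the KUBECTL override variable exists only so the commands can be previewed without a cluster.

```shell
#!/bin/sh
# KUBECTL can be overridden (e.g. KUBECTL=echo) to print the commands instead
# of running them; this override is an assumption of the sketch.
KUBECTL="${KUBECTL:-kubectl}"

# rolling_update is a hypothetical helper name.
rolling_update() {
  manifest="${1:-mongo-statefulset.yaml}"
  # Submit the modified manifest.
  $KUBECTL apply -f "$manifest"
  # Wait until the rollout of the StatefulSet has completed.
  $KUBECTL rollout status statefulset/mongo
}

# Usage (requires access to the cluster):
#   rolling_update mongo-statefulset.yaml
```

With the default RollingUpdate strategy, pods are replaced from the highest ordinal to the lowest, so in this example mongo-2 is updated first and mongo-0 last.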
Deleting the StatefulSet
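When it is time to delete your StatefulSet, be aware that by default the PersistentVolumeClaims created from the volumeClaimTemplates, and the PersistentVolumes bound to them, are not deleted along with it. This prevents accidental data loss. Whether the underlying PersistentVolumes are removed once the claims are gone depends on their reclaim policy. To delete the StatefulSet and then its claims, execute: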
kubectl delete statefulset mongo
kubectl delete pvc -l role=mongo
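If deleting by label feels too broad, the claims can also be addressed by name. The sketch below prints the claim names a StatefulSet derives from a volumeClaimTemplate, which follow the pattern <claim-template-name>-<statefulset-name>-<ordinal>; pvc_names is a hypothetical helper and does not contact the cluster.

```shell
#!/bin/sh
# Print the PersistentVolumeClaim names a StatefulSet creates from a
# volumeClaimTemplate: <claim-template-name>-<statefulset-name>-<ordinal>.
# pvc_names is a hypothetical helper; it does not contact a cluster.
pvc_names() {
  claim="$1"; sts="$2"; replicas="$3"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    echo "${claim}-${sts}-${i}"
    i=$((i + 1))
  done
}

# Claims for the three-replica example in this tutorial.
pvc_names mongo-persistent-storage mongo 3

# A single claim could then be removed explicitly, for example:
#   kubectl delete pvc mongo-persistent-storage-mongo-2
```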
In conclusion, StatefulSets are an exceptionally powerful tool within Kubernetes that enable the deployment and management of stateful applications such as databases. With the steps provided in this tutorial, you can deploy, scale, update, and manage databases confidently within a Kubernetes environment.