Deploying and Scaling Databases with StatefulSets in Kubernetes

Updated: February 1, 2024 By: Guest Contributor

Introduction

Deploying databases within a containerized environment, such as Kubernetes, requires careful consideration of persistence, consistency, and scalability. Unlike stateless deployments where containers can be destroyed and recreated without losing any critical data, databases store state that must persist beyond the lifecycle of individual pod instances. This is where StatefulSets come into play. In this tutorial, we’ll explore how to deploy and scale databases with StatefulSets in Kubernetes.

A StatefulSet is a Kubernetes workload API object that manages stateful applications. It gives each of its pods a unique, stable identifier (a hostname ending in a sequential ordinal) and stable, persistent storage. These features make StatefulSets a natural choice for deploying distributed databases and other stateful applications on Kubernetes.

Prerequisites

  • A basic understanding of Kubernetes concepts
  • A Kubernetes cluster up and running
  • Access to the command line interface (CLI) of Kubernetes (kubectl)
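  • Access to Persistent Volumes in your Kubernetes cluster, either through a default StorageClass with dynamic provisioning or through pre-provisioned volumes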

Understanding StatefulSets

Before diving into deploying a database, it’s essential to understand the fundamentals of StatefulSets:

  • Stable Storage: Each pod in a StatefulSet gets its own PersistentVolumeClaim, created from the volumeClaimTemplates section and bound to a PersistentVolume, so the data persists across pod (re)scheduling.
  • Stable Network Identities: Each pod gets a unique hostname built from the name of the StatefulSet and the ordinal index of the pod, and keeps it across rescheduling.
  • Ordered, Graceful Deployment and Scaling: Pods are created one at a time in ordinal order (0, 1, 2, ...), and they are terminated in reverse order when scaling down.
  • Ordered, Automated Rolling Updates: When the pod template of the StatefulSet is updated, the change is rolled out one pod at a time, from the highest ordinal to the lowest.
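
The naming scheme behind these stable identities can be sketched in a few lines of shell. The helper name pod_dns_names is hypothetical, and the sketch assumes a governing headless Service, a namespace passed in by the caller, and the common cluster domain cluster.local; your cluster may use a different domain.

```shell
# Print the stable DNS name of every pod in a StatefulSet.
# Pattern: <statefulset>-<ordinal>.<service>.<namespace>.svc.<cluster-domain>
# pod_dns_names is an illustrative helper, not part of kubectl.
# Assumption: the cluster domain is cluster.local.
pod_dns_names() {
  sts=$1; svc=$2; ns=$3; replicas=$4
  i=0
  while [ "$i" -lt "$replicas" ]; do
    echo "$sts-$i.$svc.$ns.svc.cluster.local"
    i=$((i + 1))
  done
}

# Three pods of a StatefulSet "mongo" behind a Service "mongo" in "default".
pod_dns_names mongo mongo default 3
```

Because the ordinal is part of the name, a pod that is rescheduled to another node comes back under the same DNS name.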

Defining a StatefulSet

To illustrate deploying a StatefulSet, we’ll use the example of a MongoDB replica set. Let’s start by defining the StatefulSet in a YAML file. Create a file named mongo-statefulset.yaml and add the following content:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: "mongo"  # name of the headless Service that governs the pods' DNS names
  replicas: 3
  selector:
    matchLabels:
      role: mongo
  template:
    metadata:
      labels:
        role: mongo
    spec:
      containers:
      - name: mongo
        image: mongo:4.4
        command:
        - mongod
        - "--replSet"
        - rs0
        - "--bind_ip"
        - 0.0.0.0
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: mongo-persistent-storage
          mountPath: /data/db
  volumeClaimTemplates:
  - metadata:
      name: mongo-persistent-storage
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi

This YAML file describes a StatefulSet with three pods, one for each member of the MongoDB replica set. Each pod gets its own PersistentVolumeClaim (1Gi, ReadWriteOnce), so its data and state survive restarts and rescheduling.
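
The serviceName field in the manifest refers to a headless Service that gives the pods their per-pod DNS records, and the Kubernetes documentation expects that Service to exist before the StatefulSet is created. The tutorial does not define it, so the following is a minimal sketch of one. The file name mongo-service.yaml is an assumption; the Service name and labels are taken from the StatefulSet above.

```shell
# Write a headless Service manifest (clusterIP: None) for the StatefulSet.
# The file name mongo-service.yaml is an assumption for this sketch.
cat > mongo-service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: mongo          # must match serviceName in the StatefulSet
  labels:
    role: mongo
spec:
  clusterIP: None      # headless: DNS returns per-pod records, no load balancing
  selector:
    role: mongo        # must match the pod template labels
  ports:
  - port: 27017
    targetPort: 27017
EOF
```

Applying this file with kubectl apply -f mongo-service.yaml before the StatefulSet should make each pod reachable inside the namespace under a name such as mongo-0.mongo.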

Creating the StatefulSet

To create the StatefulSet, apply the YAML file using kubectl:

kubectl apply -f mongo-statefulset.yaml

Once applied, Kubernetes creates the pods defined in the StatefulSet one at a time, waiting for each pod to be Running and Ready before starting the next. You can monitor the status with the following command:

kubectl get statefulsets
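
In a script, it can be handy to block until the rollout has finished rather than polling by hand. A small sketch, assuming a configured kubectl context; the helper name wait_for_statefulset and the five-minute timeout are arbitrary choices for illustration.

```shell
# Block until every pod in the StatefulSet is updated and Ready, or time out.
# wait_for_statefulset is an illustrative helper; the 5m timeout is arbitrary.
wait_for_statefulset() {
  kubectl rollout status "statefulset/$1" --timeout=5m
}

# Usage (needs a configured kubectl context):
#   wait_for_statefulset mongo
#   kubectl get pods -l role=mongo    # lists the pods carrying the role=mongo label
```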

Accessing the Stateful Pods

The pods created by the StatefulSet will have a stable network identity in the form of <statefulset-name>-<ordinal-index>. For the given example, the pods would be mongo-0, mongo-1, and mongo-2. You can interact with a specific pod like so:

kubectl exec -it mongo-0 -- mongo
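
The pods start mongod with --replSet rs0, but MongoDB does not form the replica set until it has been initiated once from one of the members. The sketch below builds an rs.initiate() call from the pod names. It assumes a headless Service named mongo exists in the same namespace (matching serviceName), so that short names such as mongo-0.mongo resolve; build_rs_config is a hypothetical helper.

```shell
# Build an rs.initiate() call that lists every pod of the StatefulSet.
# build_rs_config is an illustrative helper, not a MongoDB or kubectl command.
# Assumption: pods are reachable as <statefulset>-<ordinal>.<service>:27017.
build_rs_config() {
  name=$1; service=$2; replicas=$3
  members=""
  i=0
  while [ "$i" -lt "$replicas" ]; do
    [ -n "$members" ] && members="$members, "
    members="${members}{ _id: $i, host: \"$name-$i.$service:27017\" }"
    i=$((i + 1))
  done
  printf 'rs.initiate({ _id: "rs0", members: [ %s ] })\n' "$members"
}

# Print the call for the three-member set defined earlier.
build_rs_config mongo mongo 3

# Usage against the cluster (run once, after all pods are Ready):
#   kubectl exec mongo-0 -- mongo --eval "$(build_rs_config mongo mongo 3)"
```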

Scaling the StatefulSet

To scale your StatefulSet, simply update the replicas field in the YAML file, or use the kubectl scale command:

kubectl scale statefulsets mongo --replicas=5

Kubernetes adds the new pods one at a time (mongo-3, then mongo-4), each with its own PersistentVolumeClaim, maintaining the order and identity guarantees that define a StatefulSet.
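
Scaling the StatefulSet only creates the pods; MongoDB itself does not learn about the new members until they are added to the replica set configuration. As a rough sketch, the following prints the rs.add() calls for the new ordinals. The helper new_member_commands is hypothetical, and the host names assume a headless Service named mongo in the same namespace.

```shell
# Print one rs.add() call per newly created pod ordinal.
# new_member_commands is an illustrative helper, not a MongoDB or kubectl command.
# Arguments: StatefulSet name, Service name, old replica count, new replica count.
new_member_commands() {
  sts=$1; svc=$2; old=$3; new=$4
  i=$old
  while [ "$i" -lt "$new" ]; do
    echo "rs.add(\"$sts-$i.$svc:27017\")"
    i=$((i + 1))
  done
}

# Scaling from 3 to 5 replicas adds ordinals 3 and 4.
new_member_commands mongo mongo 3 5

# Each printed line must run on the current primary (assumed here to be mongo-0):
#   kubectl exec mongo-0 -- mongo --eval 'rs.add("mongo-3.mongo:27017")'
```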

Updating the StatefulSet

To update the configuration of your database, modify the corresponding fields in your StatefulSet YAML file and apply the changes:

kubectl apply -f mongo-statefulset.yaml

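Kubernetes performs a rolling update, replacing pods one at a time from the highest ordinal to the lowest and waiting for each updated pod to be Running and Ready before moving on. Note that most fields outside the pod template, including volumeClaimTemplates, cannot be changed on an existing StatefulSet.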
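
For a database, it is often safer to try a change on one pod before rolling it out everywhere. The RollingUpdate strategy supports this through its partition field. The sketch below builds the patch body; partition_patch is a hypothetical helper, and the partition value 2 is only an example.

```shell
# Build a patch that sets the rolling-update partition of a StatefulSet.
# Pods with an ordinal >= the partition are updated; lower ordinals keep the old spec.
# partition_patch is an illustrative helper, not part of kubectl.
partition_patch() {
  printf '{"spec":{"updateStrategy":{"type":"RollingUpdate","rollingUpdate":{"partition":%d}}}}\n' "$1"
}

# With three replicas, a partition of 2 limits the next update to mongo-2.
partition_patch 2

# Usage (needs a configured kubectl context):
#   kubectl patch statefulset mongo -p "$(partition_patch 2)"
```

Lowering the partition back to 0 afterwards lets the update proceed to the remaining pods.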

Deleting the StatefulSet

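When it is time to delete your StatefulSet, be aware that by default the PersistentVolumeClaims created from volumeClaimTemplates, and the PersistentVolumes bound to them, are not deleted. This prevents accidental data loss. To delete the StatefulSet and its PersistentVolumeClaims (what then happens to the underlying volumes depends on their reclaim policy), execute: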

kubectl delete statefulset mongo
kubectl delete pvc -l role=mongo
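
Before deleting claims by label, it may help to know exactly which names to expect. Claims created from a volume claim template are named after the template, the StatefulSet, and the pod ordinal. The helper pvc_names below is hypothetical and only prints those names.

```shell
# Print the PersistentVolumeClaim names a StatefulSet creates.
# Pattern: <volumeClaimTemplate name>-<statefulset name>-<ordinal>
# pvc_names is an illustrative helper, not part of kubectl.
pvc_names() {
  template=$1; sts=$2; replicas=$3
  i=0
  while [ "$i" -lt "$replicas" ]; do
    echo "$template-$sts-$i"
    i=$((i + 1))
  done
}

# Claims for the three-replica StatefulSet defined earlier.
pvc_names mongo-persistent-storage mongo 3

# Compare with what the cluster reports before deleting anything:
#   kubectl get pvc -l role=mongo
```

If the StatefulSet was ever scaled up to five replicas, claims for ordinals 3 and 4 may still exist, since scaling down does not remove them by default.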

Conclusion

StatefulSets are the Kubernetes building block for deploying and managing stateful applications such as databases. With the steps provided in this tutorial, you can deploy, scale, update, and delete a database running in a StatefulSet within a Kubernetes environment.