
Deploying and Scaling Databases with StatefulSets in Kubernetes

Last updated: February 01, 2024

Introduction

Deploying databases within a containerized environment, such as Kubernetes, requires careful consideration of persistence, consistency, and scalability. Unlike stateless deployments where containers can be destroyed and recreated without losing any critical data, databases store state that must persist beyond the lifecycle of individual pod instances. This is where StatefulSets come into play. In this tutorial, we’ll explore how to deploy and scale databases with StatefulSets in Kubernetes.

A StatefulSet is a Kubernetes workload API object that manages stateful applications. It gives each of its pods a unique, sequential, stable identifier (used as the hostname) and stable, persistent storage. These features make StatefulSets a good fit for deploying distributed databases and other stateful applications within Kubernetes environments.

Prerequisites

  • A basic understanding of Kubernetes concepts
  • A Kubernetes cluster up and running
  • kubectl, the Kubernetes command-line tool, configured to talk to your cluster
  • A way to provision PersistentVolumes in your cluster (for example, a default StorageClass with dynamic provisioning)

Understanding StatefulSets

Before diving into deploying a database, it’s essential to understand the fundamentals of StatefulSets:

  • Stable Storage: Each pod in a StatefulSet gets its own PersistentVolumeClaim, bound to a PersistentVolume, so its data persists across pod restarts and rescheduling.
  • Stable Network Identities: Each pod gets a unique hostname made of the StatefulSet name and the pod's ordinal index (for example, mongo-0). A headless Service gives each pod a matching stable DNS name.
  • Ordered, Graceful Deployment and Scaling: By default, pods are started one at a time in ordinal order, and they are deleted in reverse order when scaling down.
  • Ordered, Automated Rolling Updates: When the pod template of the StatefulSet is updated, the change is rolled out to one pod at a time, from the highest ordinal to the lowest.

Defining a StatefulSet

To illustrate deploying a StatefulSet, we’ll use the example of a MongoDB replica set. Let’s start by defining the StatefulSet in a YAML file. Create a file named mongo-statefulset.yaml and add the following content:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: "mongo"
  replicas: 3
  selector:
    matchLabels:
      role: mongo
  template:
    metadata:
      labels:
        role: mongo
    spec:
      containers:
      - name: mongo
        image: mongo:4.4
        command:
        - mongod
        - "--replSet"
        - rs0
        - "--bind_ip"
        - 0.0.0.0
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: mongo-persistent-storage
          mountPath: /data/db
  volumeClaimTemplates:
  - metadata:
      name: mongo-persistent-storage
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi

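This YAML file describes a StatefulSet with three pods, each running a mongod process started with --replSet rs0. Each pod gets its own 1Gi PersistentVolumeClaim, created from volumeClaimTemplates, so its data survives restarts and rescheduling. The serviceName field refers to a headless Service named mongo. The StatefulSet does not create this Service, so it must be created separately for the pods to get stable DNS names.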
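The headless Service that serviceName points to could look like the following. This is a minimal sketch, not part of the original manifest: the file name mongo-service.yaml is an assumption, and the selector and port are taken from the StatefulSet above.

```shell
# Write a headless Service manifest for the StatefulSet's pods.
# The file name mongo-service.yaml is an assumption made for this sketch.
cat > mongo-service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: mongo          # must match spec.serviceName in the StatefulSet
spec:
  clusterIP: None      # headless: DNS resolves to the individual pods
  selector:
    role: mongo        # same label as the StatefulSet's pod template
  ports:
  - port: 27017
    targetPort: 27017
EOF
```

Apply it with kubectl apply -f mongo-service.yaml, either before or together with the StatefulSet.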

Creating the StatefulSet

To create the StatefulSet, apply the YAML file using kubectl:

kubectl apply -f mongo-statefulset.yaml

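Once applied, Kubernetes creates the pods one at a time in ordinal order (mongo-0, then mongo-1, then mongo-2), waiting for each pod to be Running and Ready before starting the next. You can check the status with the following command: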

kubectl get statefulsets
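For a closer look than kubectl get statefulsets gives, a small helper along these lines can wait for the rollout and then list the pods and their claims. It is only a sketch: the function name statefulset_status and the 300-second timeout are assumptions, and it relies on the claims carrying the same label as the pods.

```shell
# Wait for a StatefulSet to finish rolling out, then list its pods and claims.
# $1 = StatefulSet name, $2 = label selector shared by its pods.
# The function name and the 300s timeout are choices made for this sketch.
statefulset_status() {
  kubectl rollout status "statefulset/$1" --timeout=300s &&
  kubectl get pods -l "$2" -o wide &&
  kubectl get pvc -l "$2"
}
```

For the example in this tutorial, the call would be statefulset_status mongo role=mongo.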

Accessing the Stateful Pods

Each pod created by the StatefulSet gets a stable name of the form <statefulset-name>-<ordinal-index>, which is also its hostname. In this example, the pods are mongo-0, mongo-1, and mongo-2. To open a mongo shell inside a specific pod, run:

kubectl exec -it mongo-0 -- mongo
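The mongod processes are started with --replSet rs0, but they do not form a replica set until it is initiated. The sketch below builds an rs.initiate() call that lists the pods by their stable DNS names. It assumes a headless Service named mongo exists, that everything runs in the default namespace, and that the cluster domain is cluster.local. The function name rs_initiate_js is hypothetical.

```shell
# Build an rs.initiate() call whose members are the StatefulSet's pods.
# $1 = replica set name, $2 = StatefulSet name, $3 = headless Service name,
# $4 = number of replicas, $5 = namespace (assumed "default" if omitted).
# Assumes the cluster domain is cluster.local and mongod listens on 27017.
rs_initiate_js() {
  rs_name=$1; sts=$2; svc=$3; replicas=$4
  ns=${5:-default}
  members=""
  i=0
  while [ "$i" -lt "$replicas" ]; do
    host="${sts}-${i}.${svc}.${ns}.svc.cluster.local:27017"
    # Append a comma separator only when a member is already present.
    members="${members}${members:+, }{ _id: ${i}, host: \"${host}\" }"
    i=$((i + 1))
  done
  echo "rs.initiate({ _id: \"${rs_name}\", members: [ ${members} ] })"
}

# Print the statement so it can be reviewed before it is run.
rs_initiate_js rs0 mongo mongo 3
```

To run it, pass the output to the shell in the first pod: kubectl exec mongo-0 -- mongo --quiet --eval "$(rs_initiate_js rs0 mongo mongo 3)". With MongoDB 6.0 and later images, the shell binary is mongosh rather than mongo.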

Scaling the StatefulSet

To scale your StatefulSet, simply update the replicas field in the YAML file, or use the kubectl scale command:

kubectl scale statefulsets mongo --replicas=5

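Kubernetes adds the new pods one at a time (mongo-3, then mongo-4), each with its own PersistentVolumeClaim, maintaining the order and identity guarantees that define a StatefulSet. Note that Kubernetes only starts the pods. It does not add them to the MongoDB replica set, so the new members still have to be registered with MongoDB itself.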
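Registering the new pods with MongoDB could be sketched as follows. The helper prints one rs.add() call per new pod. It assumes the same naming as before: a headless Service named mongo, the default namespace, and the cluster domain cluster.local. The function name rs_add_js is hypothetical.

```shell
# Print one rs.add() call per pod added when growing from $3 to $4 replicas.
# $1 = StatefulSet name, $2 = headless Service name,
# $5 = namespace (assumed "default" if omitted).
# Assumes the cluster domain is cluster.local and mongod listens on 27017.
rs_add_js() {
  sts=$1; svc=$2; from=$3; to=$4
  ns=${5:-default}
  i=$from
  while [ "$i" -lt "$to" ]; do
    echo "rs.add(\"${sts}-${i}.${svc}.${ns}.svc.cluster.local:27017\")"
    i=$((i + 1))
  done
}

# Scaling from 3 to 5 replicas adds ordinals 3 and 4.
rs_add_js mongo mongo 3 5
```

The statements must be run on the current primary, for example with kubectl exec mongo-0 -- mongo --quiet --eval "$(rs_add_js mongo mongo 3 5)" if mongo-0 is the primary.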

Updating the StatefulSet

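Only a few fields of an existing StatefulSet, such as replicas and the pod template, can be changed. The volumeClaimTemplates field in particular is immutable. To update the configuration of your database pods (for example, the image tag or the command arguments), modify the pod template in your StatefulSet YAML file and apply the changes: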

kubectl apply -f mongo-statefulset.yaml

With the default RollingUpdate strategy, Kubernetes replaces the pods one at a time, from the highest ordinal to the lowest, and waits for each updated pod to be Running and Ready before moving on to the next.
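A rolling update can also be staged by setting a partition on the RollingUpdate strategy: only pods whose ordinal is greater than or equal to the partition are updated. The sketch below builds the patch for that. The function name partition_patch is hypothetical, and the partition value 2 is only an example for the three-pod StatefulSet in this tutorial.

```shell
# Build a patch that sets the RollingUpdate partition of a StatefulSet.
# Pods with an ordinal >= partition are updated; lower ordinals keep the old spec.
# $1 = partition (a non-negative integer).
partition_patch() {
  printf '{"spec":{"updateStrategy":{"type":"RollingUpdate","rollingUpdate":{"partition":%d}}}}\n' "$1"
}

# With three replicas, a partition of 2 limits the update to mongo-2.
partition_patch 2
```

Apply the patch with kubectl patch statefulset mongo -p "$(partition_patch 2)" before changing the pod template. Once mongo-2 looks healthy, set the partition back to 0 in the same way to let the update reach the remaining pods.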

Deleting the StatefulSet

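When it is time to delete your StatefulSet, be aware that by default the PersistentVolumeClaims created from volumeClaimTemplates are not deleted with it, so the data in their PersistentVolumes is kept. This prevents accidental data loss. Once the claims are deleted, what happens to the underlying PersistentVolumes depends on their reclaim policy (Delete or Retain). To delete the StatefulSet and its PersistentVolumeClaims, execute: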

kubectl delete statefulset mongo
kubectl delete pvc -l role=mongo

In conclusion, StatefulSets are a powerful Kubernetes tool for deploying and managing stateful applications such as databases. With the steps provided in this tutorial, you can deploy, scale, update, and delete a database running in a Kubernetes environment.
