Understanding Persistent Storage with Volumes in Kubernetes

Updated: January 30, 2024 By: Guest Contributor

Introduction

Kubernetes has grown to become a powerful platform for deploying, managing, and scaling containerized applications. In any application ecosystem, managing data and ensuring it persists across pod lifecycle changes and restarts is essential. This is where Kubernetes volumes come in. They allow for storage to persist, and in this guide, we will explore how to use them for managing persistent storage.

Understanding Volumes in Kubernetes

A volume in Kubernetes is a way of mounting storage into the containers of a pod. A container’s own filesystem is ephemeral: anything written to it is lost when the container restarts. A volume outlives the individual containers in a pod, and persistent volume types also outlive the pod itself, ensuring data persistence. Kubernetes supports several types of volumes, and here we’ll work up from pod-scoped volumes to storage that persists beyond the lifecycle of individual pods.

EmptyDir and HostPath

We’ll start with the simplest: the EmptyDir and HostPath volumes.

EmptyDir: An EmptyDir volume (written emptyDir in manifests) is created when a pod is assigned to a node and exists for as long as that pod is running on the node. As the name suggests, it starts as an empty directory. Its contents survive container crashes and restarts, but if the pod is removed from the node for any reason, the data in the EmptyDir is deleted forever.

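HostPath: A HostPath volume (written hostPath in manifests) mounts a file or directory from the host node’s filesystem into your pod. This can be used for node-specific data or for pods to modify the state of the node itself. Because it exposes part of the node’s filesystem to the pod and the data stays on that one node, it is best reserved for cases that really need node-level access.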

Persistent Volumes and Persistent Volume Claims

Persistent Volumes (PVs) and Persistent Volume Claims (PVCs) are key to understanding persistent storage in Kubernetes. A PV is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes. A PVC, on the other hand, is a request for storage by a user. It specifies the size, the access mode, and optionally a StorageClass.

Let’s dive into some code examples to showcase how we use volumes in Kubernetes.

Example 1: Using an EmptyDir Volume

apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
    - name: example-container
      image: nginx
      volumeMounts:
        - mountPath: /cache
          name: cache-volume
  volumes:
    - name: cache-volume
      emptyDir: {}

In this example, we create a basic pod with an EmptyDir volume named cache-volume. Any files written to /cache in the container are stored on the EmptyDir volume for as long as the pod exists on its node.
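As a variation on the manifest above, emptyDir also accepts the optional fields medium and sizeLimit. The following is a minimal sketch, assuming you want a RAM-backed scratch directory with a cap; the pod name and the 64Mi limit are illustrative choices, not values from the original example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod-memory-cache   # hypothetical name for this sketch
spec:
  containers:
    - name: example-container
      image: nginx
      volumeMounts:
        - mountPath: /cache
          name: cache-volume
  volumes:
    - name: cache-volume
      emptyDir:
        medium: Memory    # back the directory with tmpfs (RAM) instead of node disk
        sizeLimit: 64Mi   # illustrative cap on how much the volume may hold
```

Files in a memory-backed emptyDir count toward the container's memory usage, so this variant is usually only worth it for small, short-lived data.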

Example 2: Using a HostPath Volume

apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
    - name: example-container
      image: nginx
      volumeMounts:
        - mountPath: /host-data
          name: host-data-volume
  volumes:
    - name: host-data-volume
      hostPath:
        path: /data
        type: Directory

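This example shows a pod with a HostPath volume. Data written to /host-data in the pod will be saved onto the /data directory on the worker node where the pod is running. Because the type is Directory, /data must already exist on that node, otherwise the pod will not start. The examples in this guide reuse the name example-pod, so delete the previous pod before applying the next one.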
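If the directory may not exist yet on the node, or the pod only needs to read from it, the hostPath definition can be adjusted. The sketch below is one possible variation, assuming the same /data path; the pod name is hypothetical, and the DirectoryOrCreate type and readOnly mount are illustrative choices rather than part of the original example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod-hostpath   # hypothetical name for this sketch
spec:
  containers:
    - name: example-container
      image: nginx
      volumeMounts:
        - mountPath: /host-data
          name: host-data-volume
          readOnly: true             # the container can read but not modify node files
  volumes:
    - name: host-data-volume
      hostPath:
        path: /data
        type: DirectoryOrCreate      # create an empty /data on the node if it is missing
```

Mounting read-only narrows what a compromised container could do to the node, which is a reasonable default when the pod has no need to write.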

Example 3: Using Persistent Volumes and Claims

Before creating a PVC, you need to have a PV available in your cluster, or a dynamic provisioner set up. Below are examples of both a PV and a corresponding PVC.

Creating a Persistent Volume

apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/data

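In this example, we defined a 1 GiB persistent volume backed by a directory on the host system through a HostPath. A HostPath-backed PV keeps its data on a single node, so it is suitable for single-node test clusters rather than production use.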

Creating a Persistent Volume Claim

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

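Here’s the PVC that is intended to bind to our previously created PV. It requests a PV with at least 1Gi of storage and the ReadWriteOnce access mode. Note that on a cluster with a default StorageClass, a claim that omits storageClassName is typically assigned that default class and may be satisfied by a dynamically provisioned volume instead of example-pv. Setting the same storageClassName on both the PV and the PVC avoids this.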
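One common way to make a claim bind to a specific pre-created PV, rather than being handed to a default StorageClass, is to give both objects the same storageClassName. The sketch below assumes the class name manual, which is just an arbitrary matching label here (no StorageClass object with that name needs to exist); the object names are hypothetical variants of the ones above.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv-manual        # hypothetical name for this sketch
spec:
  storageClassName: manual       # assumed label; must match the claim below
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc-manual       # hypothetical name for this sketch
spec:
  storageClassName: manual       # same value as on the PV, so only that class is considered
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

After applying both, the claim's status should change from Pending to Bound once Kubernetes matches it to the volume.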

Once you have a PVC, you can reference it in your pod definition and Kubernetes will handle the mounting for you:

apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
    - name: example-container
      image: nginx
      volumeMounts:
        - mountPath: /persistent-storage
          name: storage
  volumes:
    - name: storage
      persistentVolumeClaim:
        claimName: example-pvc

Example 4: Dynamic Volume Provisioning

Dynamic provisioning creates PVs on demand, at the moment a PVC asks for storage, so an administrator does not have to pre-create them. It is driven by the StorageClass resource.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example-storageclass
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
allowVolumeExpansion: true

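The above manifest defines a StorageClass that uses the AWS EBS provisioner to dynamically create gp2 volumes. With reclaimPolicy set to Retain, the underlying volume is kept after its PVC is deleted, and allowVolumeExpansion lets you grow a volume later by increasing the size requested in the PVC. Note that kubernetes.io/aws-ebs is the name of the legacy in-tree plugin; current clusters generally rely on the AWS EBS CSI driver, whose provisioner name is ebs.csi.aws.com.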
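For clusters that use the AWS EBS CSI driver rather than the in-tree plugin, an equivalent StorageClass might look like the sketch below. It assumes the driver is already installed in the cluster; the class name, the gp3 volume type, and the WaitForFirstConsumer binding mode are illustrative choices, not settings from the manifest above.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example-storageclass-csi           # hypothetical name for this sketch
provisioner: ebs.csi.aws.com               # AWS EBS CSI driver (assumed to be installed)
parameters:
  type: gp3                                # illustrative EBS volume type
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer    # delay provisioning until a pod using the PVC is scheduled
```

Delaying binding until a pod is scheduled lets the volume be created in the same availability zone as the node that will use it.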

A PVC using this StorageClass would look like:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc-dynamic
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: example-storageclass
  resources:
    requests:
      storage: 2Gi

When you apply this PVC, the dynamic provisioner will see the request and create a new volume that matches the criteria, then bind it to the PVC.

You can then use this PVC exactly like any other, by referencing it in a deployment or pod specification.
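As a rough sketch of the Deployment case, the manifest below mounts example-pvc-dynamic into a single-replica Deployment. The Deployment name, the app label, and the Recreate strategy are assumptions made for illustration; the claim name and mount path follow the earlier examples.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment       # hypothetical name for this sketch
spec:
  replicas: 1                    # a ReadWriteOnce volume can be mounted by one node at a time
  strategy:
    type: Recreate               # stop the old pod before starting the new one on updates
  selector:
    matchLabels:
      app: example-app           # hypothetical label
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: example-container
          image: nginx
          volumeMounts:
            - mountPath: /persistent-storage
              name: storage
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: example-pvc-dynamic
```

The Recreate strategy is chosen here so that an updated pod never has to wait for a volume that is still attached to the old pod on another node; workloads that need one volume per replica are usually better served by a StatefulSet.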

Conclusion

Understanding and utilizing persistent storage efficiently can ensure data safety and availability in your Kubernetes environments. With this tutorial, you should be comfortable setting up and managing volumes of varying complexity, from EmptyDir to more advanced dynamic provisioning with StorageClasses.

By applying these concepts in practice, you can enjoy the full benefits of cloud-native architectures and take your container orchestration to the next level.