Provision Persistent Volumes in Kubernetes: A Developer’s Guide

Updated: January 30, 2024 By: Guest Contributor

Introduction

Containerization has transformed the way we deploy applications, and Kubernetes is leading the charge as a platform for automating deployment, scaling, and operations of application containers across clusters of hosts. However, one of the challenges faced during application development is how to manage persistent storage for stateful applications. This guide will dive into provisioning persistent volumes in Kubernetes, which allows your applications to store data persistently, beyond the lifecycle of individual Pods.

The Basics of Persistent Storage in Kubernetes

In Kubernetes, a PersistentVolume (PV) is a piece of storage that has been provisioned by an administrator or dynamically provisioned using Storage Classes. It’s a resource in the cluster just like a node is a resource. PVs can have different attributes depending on the underlying storage system, such as being file-system-based or block-storage-based, and can also have different access modes.

A PersistentVolumeClaim (PVC) is a request for storage by a user. It’s similar to a Pod. Pods consume node resources and PVCs consume PV resources. Here’s how they work together:

  • A developer creates a PVC that specifies size and access modes.
  • Kubernetes finds a PV that matches the criteria and binds them together.
  • The Pod can use the claim as a volume and access the provisioned storage.

Creating a PersistentVolume

Your first step as a developer is to define a PersistentVolume (PV). The following is a simple example of a PV definition file.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: standard
  hostPath:
    path: /tmp/data

This YAML file defines a PV named example-pv with a storage capacity of 10Gi and an access mode of ReadWriteOnce, which means the volume can be mounted as read-write by a single node. In this case, we’re using hostPath, which stores the data in a directory on the node’s own filesystem. Because that data does not follow a Pod to another node, hostPath is suitable for single-node testing rather than production. Use the following command to create the PV:

kubectl create -f pv-definition.yaml
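
Since hostPath ties the data to one node, multi-node clusters usually back PVs with networked storage instead. The following is a minimal sketch of an NFS-backed PV. The PV name, the server address nfs.example.internal, and the export path /exports/data are placeholders chosen for illustration, not values from this guide.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-nfs-pv               # hypothetical name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany                  # NFS can be mounted read-write from many nodes
  persistentVolumeReclaimPolicy: Retain
  storageClassName: standard
  nfs:
    server: nfs.example.internal     # placeholder: address of your NFS server
    path: /exports/data              # placeholder: directory exported by that server
```

Only the volume source (nfs instead of hostPath) and the access mode differ from the hostPath example. Claims and Pods consume it the same way.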

Configuring a PersistentVolumeClaim

Once your PersistentVolume is available, you can create a PersistentVolumeClaim. Below is an example of a PVC definition file.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
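      storage: 5Gi
  storageClassName: standard

This YAML file creates a PVC named example-pvc, requesting a volume with at least 5Gi of storage, the same ReadWriteOnce access mode, and the same storageClassName as the PV. A claim only binds to a PV whose class matches. If storageClassName is omitted, the cluster’s default StorageClass (if one exists) is applied instead, and the claim may never bind to example-pv. Use the following command to create the PVC: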

kubectl create -f pvc-definition.yaml

Once created, Kubernetes attempts to find a suitable PersistentVolume and, if found, binds it to the PVC, ensuring that the volume is reserved for that claim and cannot be bound to another claim. Binding is one-to-one, so a 5Gi claim that binds to a 10Gi PV receives the whole volume.
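
If a claim should bind to one particular PV rather than to any matching volume, the PVC spec accepts a volumeName field. The sketch below pins a claim to the example-pv defined earlier. The claim name example-pvc-pinned is illustrative only.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc-pinned           # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard         # must still match the PV's class
  volumeName: example-pv             # bind to this specific PV
  resources:
    requests:
      storage: 5Gi
```

Kubernetes still checks that the named PV satisfies the requested size, access mode, and storage class before binding, so pinning does not bypass the usual matching rules.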

Provisioning Dynamic PersistentVolumes Using StorageClasses

Instead of manually creating PersistentVolumes, Kubernetes can dynamically provision PVs based on StorageClasses. A StorageClass provides a way for administrators to describe the “classes” of storage they offer. This enables dynamic volume provisioning.

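Here’s an example of a StorageClass that uses AWS’s EBS. It relies on the legacy in-tree kubernetes.io/aws-ebs provisioner. Recent Kubernetes releases have replaced it with the EBS CSI driver (ebs.csi.aws.com), so check which provisioner your cluster supports. To restrict volumes to particular availability zones, use the allowedTopologies field rather than the deprecated zones parameter.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: slow
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2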
reclaimPolicy: Retain
allowVolumeExpansion: true
mountOptions:
  - debug

Use the following command to create the StorageClass:

kubectl create -f storageclass-definition.yaml
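
To limit dynamically provisioned volumes to specific availability zones, a StorageClass can declare allowedTopologies. The following is a sketch under stated assumptions: the class name slow-zoned is hypothetical, and the topology key topology.kubernetes.io/zone is the standard zone label, though some CSI drivers expect a driver-specific key instead.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: slow-zoned                          # hypothetical name
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
allowVolumeExpansion: true
# Delay provisioning until a Pod using the claim is scheduled,
# so the volume is created in the zone where the Pod will run.
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
  - matchLabelExpressions:
      - key: topology.kubernetes.io/zone    # assumption: may differ per driver
        values:
          - us-west-2a
          - us-west-2b
          - us-west-2c
```

Pairing allowedTopologies with WaitForFirstConsumer helps avoid volumes being created in a zone where the consuming Pod cannot be scheduled.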

Using StorageClass in a PersistentVolumeClaim

To dynamically provision a PersistentVolume using a StorageClass, modify your PVC definition to include the storageClassName field as shown below:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: slow

This PVC triggers the dynamic provisioning of a PV through the “slow” StorageClass. Create it with the following command:

kubectl create -f dynamic-pvc-definition.yaml

The result is a PV backed by AWS EBS that matches the specifications requested in the PVC.
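
Because the “slow” StorageClass sets allowVolumeExpansion: true, a bound claim can later be grown by raising its storage request. The sketch below assumes the storage backend supports resizing. The 200Gi figure is arbitrary.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi                 # raised from 100Gi; shrinking is not supported
  storageClassName: slow
```

Updating the existing claim with this manifest (for example via kubectl apply -f dynamic-pvc-definition.yaml) requests the expansion. Depending on the volume type, the filesystem may only be resized once a Pod using the claim is restarted.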

Advanced Storage Considerations

As you get more comfortable with persistent volumes, there are several advanced aspects you might engage with. Here are some examples:

  • Setting up ReadWriteMany or ReadOnlyMany access modes to enable shared storage across multiple nodes.
  • Using VolumeSnapshots to create point-in-time snapshots of PersistentVolumes for backups or for seeding new environments.
  • Employing node affinity rules to constrain volumes to nodes with specific labels (below is a complete example for this).
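
The first two items can be sketched briefly. The manifests below are illustrative assumptions rather than a tested setup: the names shared-pvc, shared-storage, example-snapshot, and example-snapclass are hypothetical, ReadWriteMany only works with volume types that support it (such as NFS), and VolumeSnapshot requires a CSI driver with snapshot support plus the snapshot CRDs and controller installed in the cluster.

```yaml
# A claim requesting shared read-write access from multiple nodes.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-pvc                            # hypothetical name
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: shared-storage            # hypothetical class backed by RWX-capable storage
  resources:
    requests:
      storage: 10Gi
---
# A point-in-time snapshot of the volume bound to an existing claim.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: example-snapshot                      # hypothetical name
spec:
  volumeSnapshotClassName: example-snapclass  # hypothetical VolumeSnapshotClass
  source:
    persistentVolumeClaimName: dynamic-pvc    # claim whose volume is snapshotted
```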

Example: Employing Node Affinity Rules to Constrain Volumes

1. Label the Node: First, label the node where you want to constrain the volume. For example, if you want the volume to be on a node labeled as disktype=ssd, you would label the node like this:

kubectl label nodes <node-name> disktype=ssd

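2. Create a PersistentVolume with Node Affinity Rules: Define a PersistentVolume that includes node affinity constraints. In the PV definition, specify the required rules under nodeAffinity. This example reuses the names example-pv and example-pvc, so delete the earlier objects first if you created them.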

apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: fast
  local:
    path: /mnt/disks/ssd1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: disktype
          operator: In
          values:
          - ssd

This PV can only be used by Pods scheduled onto nodes that have the label disktype=ssd.
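
The PV above references storageClassName: fast. For local volumes, which are created manually rather than dynamically provisioned, that class is commonly defined with no provisioner and delayed binding. The following is a minimal sketch, assuming no class named fast exists in your cluster yet:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/no-provisioner   # local PVs are created by hand, not provisioned
# Wait until a Pod uses the claim before binding, so the scheduler can
# take the PV's node affinity into account when placing the Pod.
volumeBindingMode: WaitForFirstConsumer
```

With this binding mode, a claim for the fast class stays Pending until a Pod that references it is created.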

3. Create a PersistentVolumeClaim: Create a PVC to request the storage. The PVC binds to a PV that satisfies its size, access mode, and storage class requirements. The PV’s node affinity then determines where Pods using the claim can run.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast
  resources:
    requests:
      storage: 10Gi

The storageClassName should match the one defined in your PV.

4. Deploy a Pod Using the PVC: Finally, create a pod that uses the PVC. The pod’s volumes section references the PVC.

apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
  - name: mycontainer
    image: nginx
    volumeMounts:
    - mountPath: "/usr/share/nginx/html"
      name: myvolume
  volumes:
  - name: myvolume
    persistentVolumeClaim:
      claimName: example-pvc

This setup ensures that the volume (example-pv) is only available to nodes that satisfy the node affinity rules (in this case, nodes labeled with disktype=ssd). The pod mypod uses the PVC example-pvc, which is bound to example-pv, so the scheduler places it on a node carrying the disktype=ssd label.

Conclusion

Kubernetes Persistent Volumes offer a powerful way to provision storage for your containerized applications. Whether manually creating PersistentVolumes, or leveraging dynamic provisioning with StorageClasses, Kubernetes provides flexibility to match the storage needs of your workloads.