Kubernetes: How to Scale Applications Using ReplicaSets

Updated: January 30, 2024 By: Guest Contributor

Overview

Welcome to the world of Kubernetes, the powerful system for automating application deployment, scaling, and management. A key benefit of Kubernetes (K8s) is its ability to easily scale applications in response to demand. This tutorial walks you through the core concepts and provides practical code examples for scaling your applications using ReplicaSets.

What is a ReplicaSet?

A ReplicaSet is a Kubernetes object that ensures a specific number of pod replicas are running at any given time. Unlike the older ReplicationController, a ReplicaSet supports set-based selectors, giving you more flexibility in precisely how it identifies the Pods it is supposed to manage.
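
For example, instead of matching exact labels, a set-based selector can use matchExpressions. The snippet below is only an illustration of the syntax (the tier label and its values are hypothetical):

selector:
  matchExpressions:
  - key: app
    operator: In
    values:
    - nginx
  - key: tier
    operator: NotIn
    values:
    - canary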

Prerequisites

  • A working Kubernetes cluster
  • Basic familiarity with kubectl, the command line tool for Kubernetes
  • Basic understanding of YAML manifests for Kubernetes

Defining Your First ReplicaSet

To define a ReplicaSet, you start by creating a YAML manifest. Below is an example of a simple ReplicaSet that keeps three replicas of an nginx Pod running.

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-replicaset
  labels:
    app: nginx
spec:
  replicas: 3      # desired number of Pod replicas
  selector:
    matchLabels:   # must match the labels set in the Pod template below
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80

To create this ReplicaSet, save the YAML above to a file (for example, replicaset.yaml) and apply it with kubectl:

kubectl apply -f replicaset.yaml

Kubectl will output:

replicaset.apps/nginx-replicaset created
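
To confirm that the ReplicaSet has created its Pods, you can describe it; the Events section at the bottom of the output should list the three Pods being created:

kubectl describe rs nginx-replicaset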

Scaling Manually

To scale manually, edit the replicas field in the YAML file and re-apply it.
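
For example, to request five replicas, change that one field (only the changed portion of the file is shown):

spec:
  replicas: 5

Then re-apply the file: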

kubectl apply -f replicaset.yaml

Or, you can use the following kubectl command:

kubectl scale --replicas=5 replicaset nginx-replicaset

You should see:

replicaset.apps/nginx-replicaset scaled
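
Keep in mind that the imperative command only changes the live object; if you later re-apply the original manifest, the replica count reverts to whatever the file says. If you want the change to happen only when the current size matches what you expect, kubectl scale accepts a precondition:

kubectl scale --current-replicas=3 --replicas=5 replicaset nginx-replicaset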

Scaling Automatically

Kubernetes can also scale your ReplicaSet up or down automatically based on CPU usage through a HorizontalPodAutoscaler. This requires a metrics source such as metrics-server to be running in the cluster, and utilization targets are computed against the Pods' CPU requests, so the container spec should declare resources.requests.cpu. The following HorizontalPodAutoscaler keeps the nginx ReplicaSet between 3 and 10 replicas, targeting 50% average CPU utilization:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-autoscaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: ReplicaSet
    name: nginx-replicaset
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50

Apply it with:

kubectl apply -f autoscaler.yaml

Output:

horizontalpodautoscaler.autoscaling/nginx-autoscaler created
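
If your cluster serves the autoscaling/v2 API (stable since Kubernetes 1.23), the same autoscaler can be written with the metrics field instead of targetCPUUtilizationPercentage. This is an equivalent sketch of the manifest above:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-autoscaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: ReplicaSet
    name: nginx-replicaset
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50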

Verifying Scaling

To see your ReplicaSet and the number of replicas, use this command:

kubectl get rs nginx-replicaset

The following information indicates your ReplicaSet is up and running:

NAME               DESIRED   CURRENT   READY   AGE
nginx-replicaset   3         3         3       96s
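
If you created the HorizontalPodAutoscaler, you can check it as well; once metrics are available, the TARGETS column shows the current versus target CPU utilization:

kubectl get hpa nginx-autoscaler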

Inspecting the Effect of Scaling

It’s important to understand how scaling affects the pods in your ReplicaSet. When you scale out, additional pods are created. You can see the list of pods by running:

kubectl get pods -l app=nginx

After scaling in (reducing the number of replicas), Kubernetes removes the surplus Pods. Termination is not immediate: each Pod gets a graceful shutdown period, and the ReplicaSet controller decides which Pods to delete first, generally preferring Pods that are not yet running or ready.
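
To watch this happen, follow the Pod list while you change the replica count; Pods being removed show a Terminating status before they disappear:

kubectl get pods -l app=nginx --watch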

Advanced: Managing Scaling with Policies

In more sophisticated scenarios, you might want to define policies influencing how scaling happens. This could involve graceful shutdowns, cost-optimization strategies, or high-availability considerations. Using PodDisruptionBudgets and PriorityClasses can enhance these scaling strategies.

Here is an example of a basic PodDisruptionBudget:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: nginx-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: nginx

Apply your PodDisruptionBudget as follows:

kubectl apply -f pdb.yaml

Output:

poddisruptionbudget.policy/nginx-pdb created

This ensures that at least two nginx Pods remain available during voluntary disruptions, such as node drains or cluster upgrades.
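
A PriorityClass, mentioned above, influences which Pods the scheduler preempts first when the cluster runs short of resources. The sketch below uses an assumed name and value; Pods opt in by setting priorityClassName in their template spec:

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: nginx-priority   # assumed name for this example
value: 100000            # higher values mean higher scheduling priority
globalDefault: false
description: "Priority for the nginx ReplicaSet Pods"

To use it, add priorityClassName: nginx-priority to the Pod template in the ReplicaSet spec.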

Conclusion

Scaling applications in Kubernetes using ReplicaSets is an essential skill for keeping your application matched to demand. With ReplicaSets and the Horizontal Pod Autoscaler, you can adjust the number of Pod instances both manually and automatically.