Does Kubernetes increase network latency? (and how to benchmark it)

Updated: January 31, 2024 By: Guest Contributor

Introduction

The modern software realm has undergone a significant paradigm shift with the adoption of container orchestration systems, and Kubernetes has established itself as a de facto standard. However, as we unpack the layers of complexity added to provide automation, fault tolerance, scalability, and the like, we cannot help but ponder: at what cost do these benefits come? Particularly, a recurring question is whether Kubernetes increases network latency.

This tutorial explores that concern, examining how Kubernetes might affect network latency and how to monitor and optimize networking in a Kubernetes cluster. While the objective is to provide an empirical understanding, we'll also look at methods and best practices designed to mitigate any latency overhead introduced by Kubernetes.

Understanding Kubernetes Networking

Before diving into the aspect of latency, it’s crucial to understand how networking operates within Kubernetes. Kubernetes adopts a flat networking model, which enables all pods to communicate with each other without NAT. Network policies can control traffic flow at the IP address or port level, and several CNI (Container Network Interface) plugins are available to implement these policies.
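
As a concrete illustration, a minimal NetworkPolicy might look like the sketch below; the app: backend and app: frontend labels and TCP port 8080 are assumptions for this example, not values taken from any particular cluster:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
spec:
  # Applies to pods labelled app=backend (label is hypothetical)
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  # Only pods labelled app=frontend may connect, and only on TCP 8080
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080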

Default Kubernetes Networking Model

apiVersion: v1
kind: Pod
metadata:
  name: network-test-pod
spec:
  containers:
  - name: network-multi-tool
    image: praqma/network-multitool

Deploy the pod above to test intra-cluster communication. The network-multitool container ships a variety of networking tools (ping, curl, dig, and more) that you can use to check whether any latency is introduced inside the cluster.
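
Assuming the manifest above is saved as network-test-pod.yaml (the file name is just an assumption), a minimal way to deploy it and get a shell inside it is:

# Create the pod and wait until it is ready
kubectl apply -f network-test-pod.yaml
kubectl wait --for=condition=Ready pod/network-test-pod

# Open an interactive shell inside the running container
kubectl exec -it network-test-pod -- /bin/sh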

Measuring Network Latency in Kubernetes

To determine if Kubernetes adds network latency, you will need to conduct some benchmarks:

Use ping for Latency Tests

# Replace ip-of-target-pod with the IP of the pod you want to reach
kubectl exec network-test-pod -- ping -c 5 ip-of-target-pod

This command sends ICMP echo requests from network-test-pod to another pod within the Kubernetes cluster, letting you measure the round-trip time between them.
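
Since ping needs a concrete target, you can first look up the other pod's IP; the pod name other-pod below is hypothetical:

# Look up the IP of the pod you want to ping (the pod name is hypothetical)
TARGET_IP=$(kubectl get pod other-pod -o jsonpath='{.status.podIP}')

# Send five ICMP echo requests and record the round-trip times
kubectl exec network-test-pod -- ping -c 5 "$TARGET_IP"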

Using iperf3 for Throughput Benchmarking

iperf3 -s # Run on the server pod
iperf3 -c ip-of-server-pod # Run on the client pod

By running iperf3 in server mode in one pod and in client mode in another, you can measure the bandwidth available between the two; running a ping test at the same time shows how latency behaves under different load conditions.
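
In practice both commands are wrapped in kubectl exec. Here is a minimal sketch, assuming two pods named iperf-server and iperf-client (hypothetical names) that run an image with iperf3 installed:

# Terminal 1: start the iperf3 server in the server pod (pod names are hypothetical)
kubectl exec iperf-server -- iperf3 -s

# Terminal 2: find the server pod's IP and run a 30-second test from the client pod
SERVER_IP=$(kubectl get pod iperf-server -o jsonpath='{.status.podIP}')
kubectl exec iperf-client -- iperf3 -c "$SERVER_IP" -t 30

Running the ping test from the previous section while iperf3 saturates the link gives a rough picture of latency under load.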

Impact of Kubernetes on Network Latency

There are several layers where Kubernetes could introduce network latency, including:

  • The CNI plugin choice
  • Istio or other service mesh layers
  • kube-proxy and its iptables or IPVS proxy modes

For instance, Calico or Flannel can add a small amount of latency when they run in an overlay mode (IP-in-IP or VXLAN) because of the extra encapsulation they perform on each packet. Similarly, service meshes like Istio place sidecar proxies in the data path, another layer of complexity that can increase latency if not configured properly.

For example, you can tune the default Calico IPPool with calicoctl. By tweaking configuration parameters such as the block size and turning off outgoing NAT and encapsulation (where the underlying network allows it), you can optimize Calico for better performance and potentially reduce the latency it induces.
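
A hedged sketch of such a change follows; the pool name, CIDR, and block size are assumptions and must match your cluster's actual configuration, and disabling encapsulation only works when the underlying network can route pod IPs directly:

calicoctl apply -f - <<EOF
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  cidr: 192.168.0.0/16   # must match your cluster's pod CIDR (assumed value)
  blockSize: 26          # size of the per-node address blocks
  ipipMode: Never        # drop IP-in-IP encapsulation if the fabric routes pod IPs
  vxlanMode: Never       # likewise for VXLAN
  natOutgoing: false     # skip SNAT on traffic leaving the pool, where possible
  nodeSelector: all()
EOF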

Best Practices for Reducing Kubernetes-induced Latency

Adopt the following best practices to ensure minimal latency impact:

  • Choose the right CNI plugin. Some plugins prioritize raw forwarding performance, while others emphasize security and policy features; pick the trade-off that matches your requirements.
  • Configure kube-proxy in IPVS mode for latency-sensitive workloads; it can perform better than the iptables proxy mode, especially in clusters with many Services (see the configuration sketch after this list).
  • Balance service mesh benefits against their potential impact on network performance.
  • Use network policies wisely to reduce processing overhead on traffic that doesn’t require filtering.
  • Optimize the underlying network infrastructure and ensure you use high-speed networking equipment consistent with your performance requirements.
  • Ensure your workloads are as geographically close to your user base as possible to reduce inherent network travel time.
  • Opt for Horizontal Pod Autoscaler and Cluster Autoscaler for dynamically managing pod and node counts according to resource demands, thus reducing potential bottlenecks.
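
As mentioned in the kube-proxy bullet above, switching to IPVS is a configuration change. Below is a minimal sketch of the relevant KubeProxyConfiguration fields; on kubeadm-based clusters these typically live in the kube-proxy ConfigMap in kube-system, and kube-proxy must be restarted to pick them up (the exact mechanism varies by distribution):

apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# Use the kernel's IPVS load balancer instead of long iptables chains
mode: "ipvs"
ipvs:
  # Round-robin scheduling; other schedulers (e.g. least-connection) are available
  scheduler: "rr"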

Horizontal Pod Autoscaler Example:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-autoscaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 1
  maxReplicas: 100
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

This HPA definition automatically scales the number of myapp pods based on CPU utilization, ensuring optimal performance without overscaling.

Conclusion

While Kubernetes does introduce an additional layer of virtual networking, it’s designed to minimize performance overhead as much as possible. By understanding and implementing proper network practices, choosing the appropriate CNI, and leveraging the power of Kubernetes scaling features, it’s possible to mitigate most, if not all, potential latency introduced by the platform.

It’s important to remember that observability and continuous performance monitoring play a pivotal role in maintaining and optimizing network performance. The adoption of Kubernetes is not mainly about overcoming its complexities, but rather harnessing its full potential while navigating the intricate trade-offs between performance, scalability, and functionality. Through diligence and the right tooling, you can often enjoy the benefits of Kubernetes without a significant compromise on network latency.