NGINX traffic distribution methods: A practical guide

Updated: January 19, 2024 By: Guest Contributor

Introduction

In the landscape of modern web applications, effectively distributing incoming network traffic across a set of servers, commonly known as load balancing, is vital for ensuring optimal resource use, maximizing throughput, minimizing response times, and avoiding overload on any single system. NGINX is powerful open-source software widely known for web serving, reverse proxying, caching, load balancing, and, with appropriate configuration, added security. In this guide, we will cover various methods of distributing traffic with NGINX, from the basic round-robin approach to more advanced techniques such as least connections and IP hashing.

Understanding the Basics of NGINX Load Balancing

Before diving into specific methods of traffic distribution, it’s crucial to understand the basics of load balancing in NGINX. Fundamentally, NGINX operates as a reverse proxy, directing client requests to backend servers. This means you have a group of servers, often referred to as an ‘upstream’ in NGINX parlance, that are set up to handle requests forwarded to them by the NGINX server.

By default, NGINX employs a round-robin method for load balancing. This means it distributes incoming requests sequentially amongst the available servers. As such, the round-robin method is the simplest form of load balancing with NGINX and is inherently fair as each server is given an equal chance to handle requests.

http {
    upstream myapp1 {
        server srv1.example.com;
        server srv2.example.com;
        server srv3.example.com;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://myapp1;
        }
    }
}

This configuration block defines an upstream group named ‘myapp1’ consisting of three servers. All you need to do is include a ‘proxy_pass’ directive inside a ‘location’ block of your server configuration, pointing to the defined upstream group. With that, round-robin load balancing is in action.
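When proxying, the backend servers see requests as coming from NGINX rather than from the original client. If your application needs the original host name or client address, you can forward them explicitly with ‘proxy_set_header’; the header names below follow common convention and should be adapted to whatever your backend actually expects:

http {
    upstream myapp1 {
        server srv1.example.com;
        server srv2.example.com;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://myapp1;
            # Forward the original Host header and client address to the backend
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
}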

The Weighted Round-Robin Method

This method extends the basic round-robin approach by assigning weights to each server. This is useful when the servers in the upstream group have different capacities. By attributing a higher weight to the more capable servers, NGINX can allocate requests proportionally, hence the term ‘weighted round-robin’.

http {
    upstream myapp1 {
        server srv1.example.com weight=3;
        server srv2.example.com;
        server srv3.example.com weight=2;
    }

    # Rest of the configuration remains the same
}

In the modified configuration above, ‘srv1.example.com’ (weight 3) will receive three requests for every one sent to ‘srv2.example.com’ (default weight 1), and ‘srv3.example.com’ (weight 2) will receive two. In other words, out of every six requests, three go to srv1, one to srv2, and two to srv3. If all servers have the same capacity, the weights can be omitted and NGINX defaults to the usual round-robin behavior.

The Least Connections Method

The ‘least connections’ method favors servers with the fewest active connections. This is particularly effective when requests take a variable amount of time to process: it avoids piling new requests onto a server that is already handling many connections and instead directs them to less busy servers.

http {
    upstream myapp1 {
        least_conn;
        server srv1.example.com;
        server srv2.example.com;
        server srv3.example.com;
    }

    # Rest of the server configuration remains the same
}

The ‘least_conn’ directive is placed before the server list within the ‘upstream’ block, which changes the distribution method accordingly. As a result, the backend server with the fewest active connections is prioritized for new requests.
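The ‘least_conn’ method also takes server weights into account, so it can be combined with the weighted approach shown earlier. A minimal sketch, assuming srv1 has roughly twice the capacity of srv2:

http {
    upstream myapp1 {
        least_conn;
        # Weights still apply: srv1 is allowed proportionally more
        # active connections before srv2 is preferred
        server srv1.example.com weight=2;
        server srv2.example.com;
    }

    # Rest of the server configuration remains the same
}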

IP Hash for Session Persistence

Certain applications require that a client’s requests are sent to the same server; this can be essential for session consistency. The IP hash method helps achieve this by generating a unique hash for each client based on their IP address and using it to determine which server will handle their requests.

http {
    upstream myapp1 {
        ip_hash;
        server srv1.example.com;
        server srv2.example.com;
        server srv3.example.com down;
    }

    # Rest of the server configuration remains the same
}

In the above, NGINX maintains a client-to-server mapping by IP, ensuring persistence. Notice that ‘srv3.example.com’ is marked as ‘down’: NGINX won’t send any requests to it until the parameter is removed. With ‘ip_hash’, marking a server ‘down’ rather than deleting it preserves the current hashing of client addresses, which is useful during server maintenance or after a failure.

Advanced Hashing with Consistent Hash

If you need a more sophisticated key than the client IP, NGINX’s ‘hash’ directive lets you hash on an arbitrary value, such as a user’s session cookie. Adding the ‘consistent’ parameter enables consistent (ketama) hashing, which minimizes how many keys are remapped to different servers when a server is added or removed.

http {
    upstream myapp1 {
        hash $cookie_session consistent;
        server srv1.example.com;
        server srv2.example.com;
        server srv3.example.com;
    }

    # Rest of the configuration follows
}

Here, traffic distribution is keyed on the value of a cookie named ‘session’. A user with a stable cookie value is consistently assigned to the same server, which preserves session stickiness even if the client’s IP address changes.
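The hash key is not limited to cookies. For example, hashing on the request URI keeps all requests for a given resource on the same backend, which can improve hit rates when each server maintains its own local cache; a sketch of that variation:

http {
    upstream myapp1 {
        # Route requests for the same URI to the same server
        hash $request_uri consistent;
        server srv1.example.com;
        server srv2.example.com;
        server srv3.example.com;
    }

    # Rest of the configuration follows
}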

Using Health Checks and Adaptive Fallback

Enterprise-grade configurations may demand more than just distribution; they require resilience in the face of failures. With health checks, NGINX can route traffic away from unhealthy servers, and you can write configurations that adapt to backend server health. For instance, marking a server ‘backup’ keeps it on standby unless all primary servers are unavailable.

http {
    upstream myapp1 {
        server srv1.example.com;
        server srv2.example.com backup;
    }

    server {
        #... other directives ...

        location / {
            proxy_pass http://myapp1;
            proxy_intercept_errors on;
            error_page 502 503 504 /custom_50x.html;
        }
    }
}

The server labeled ‘backup’ only comes into play if ‘srv1.example.com’ becomes unavailable, which is a simple adaptive measure. Additionally, ‘proxy_intercept_errors’ together with ‘error_page’ lets NGINX serve a custom error page when the backend fails, instead of passing the backend’s error response through to the client.
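Open-source NGINX also performs passive health checks through the ‘max_fails’ and ‘fail_timeout’ parameters: if a server fails the given number of times within the window, it is considered unavailable for the remainder of that window. (Active health checks via a dedicated ‘health_check’ directive are a commercial NGINX Plus feature.) A sketch with illustrative values:

http {
    upstream myapp1 {
        # Consider a server unavailable for 30s after 3 failed attempts within 30s
        server srv1.example.com max_fails=3 fail_timeout=30s;
        server srv2.example.com max_fails=3 fail_timeout=30s;
        server srv3.example.com backup;
    }

    # Rest of the server configuration remains the same
}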

Conclusion

Through this guide, we have explored different methods for distributing incoming traffic with NGINX. Whether you require simple round-robin, weighted decisions, maintaining user sessions, or sophisticated health checks and backups, NGINX offers a flexible way to handle your server load. Implementing these methods correctly can ensure high availability and improved user experience.