NGINX: Blocking requests by User-Agent

Updated: January 19, 2024 By: Guest Contributor

Introduction

NGINX is a powerful open-source web server that can be used for a variety of tasks, including web serving, reverse proxying, caching, load balancing, media streaming, and more. In this tutorial, we will focus on how to block requests based on the User-Agent string. This can be helpful when dealing with bots, crawlers, or any other unwanted traffic that you might want to prevent from accessing your server.

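Understanding how to block requests by User-Agent is an important part of securing your web applications and managing traffic. Whether you are dealing with scrapers, spammy bots, or other unwanted traffic from known User-Agents, NGINX gives you a way to block them effectively. Keep in mind that the User-Agent header is set by the client and is easy to spoof, so this technique filters out careless or well-behaved bots rather than determined attackers.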

Prerequisites

  • A server running NGINX
  • Basic understanding of NGINX configuration
  • Access to edit NGINX configuration files

Getting Started: Matching User-Agents With NGINX

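NGINX exposes the ‘User-Agent’ header of each request as the $http_user_agent variable. A ‘map’ block, which must be placed in the ‘http’ context, can turn that header into a flag by testing it against a regular expression (the ‘~*’ prefix makes the match case-insensitive). An ‘if’ directive in a ‘server’ or ‘location’ block then rejects flagged requests, so you can apply the rule to the whole site or only to specific parts of it.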

# In the http context
map $http_user_agent $blocked_user_agent {
    default    0;
    ~*BadBot   1;
}

server {
    listen 80;
    server_name example.com;

    if ($blocked_user_agent) {
        return 403;
    }

    # Remaining configuration...
}

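Remember that the ‘if’ directive should be used with caution: inside a ‘location’ block it can behave unexpectedly when combined with directives other than ‘return’ or ‘rewrite’, which is why the examples here pair it only with ‘return’. For each request that hits your server, NGINX evaluates the map against the User-Agent and returns 403 (Forbidden) if it matches one of your predefined patterns.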
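For a single pattern, the same check can also be written without a map by testing $http_user_agent directly in the ‘if’ condition. The following is a minimal sketch, assuming a hypothetical /api/ path that should be protected while the rest of the site stays open; ‘BadBot’ is the same placeholder pattern used above.

```nginx
server {
    listen 80;
    server_name example.com;

    # Hypothetical path: only requests under /api/ are checked.
    location /api/ {
        # ~* performs a case-insensitive regular expression match.
        if ($http_user_agent ~* "BadBot") {
            return 403;
        }

        # Remaining configuration for this location...
    }
}
```

The map-based form is usually easier to maintain once there are several patterns, because the whole list lives in one place.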

Blocking Multiple User-Agents

To block multiple User-Agents, you should define each unwanted User-Agent in the map block. It’s more efficient to use a map block than multiple if conditions.

map $http_user_agent $blocked_user_agent {
    default 0;
    # The parentheses group the alternatives into one pattern
    ~*(BadBot1|BadBot2|BadBot3) 1;
}

server {
    # ...

    if ($blocked_user_agent) {
        return 403;
    }

    # ...
}
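As an alternative to one long alternation, each pattern can sit on its own line, which tends to be easier to review and extend. The sketch below uses placeholder bot names. The empty-string entry is an assumption about what you may want rather than a general recommendation: it also flags requests that send no User-Agent header at all.

```nginx
map $http_user_agent $blocked_user_agent {
    default      0;
    ""           1;   # empty or missing User-Agent (optional, see note above)
    ~*BadBot1    1;
    ~*BadBot2    1;
    ~*^BadBot3/  1;   # anchored with ^: matches only at the start of the string
}
```

Regular expressions in a map are checked in the order they appear, and the first match decides the value, so more specific patterns are best listed first.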

Using Include Files to Manage Blocked User-Agents

To keep the main configuration cleaner, you can store the blocked User-Agent patterns in a separate file and include it inside the map block. The included file contains only the entries that would otherwise appear between the braces.

map $http_user_agent $blocked_user_agent {
    include /etc/nginx/blocked-user-agents.conf;
}

# In /etc/nginx/blocked-user-agents.conf
default 0;
~*BadBot 1;

# Main configuration
server {
    # ...

    if ($blocked_user_agent) {
        return 403;
    }

    # ...
}

Advanced Blocking: Combining User-Agent and IP Address

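Sometimes you may want to block requests that either send a certain User-Agent or come from a set of IP addresses. This requires combining two conditions, and because the NGINX ‘if’ directive does not support logical operators, the example below uses a flag variable instead.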

geo $bad_user_ip {
    default 0;
    10.50.0.0/16 1;
    192.168.1.0/24 1;
}

map $http_user_agent $blocked_user_agent {
    default 0;
    ~*BadBot 1;
}

server {
    listen 80;

    # Initialize the flag so it is never read while undefined
    set $block '0';

    if ($blocked_user_agent) {
        set $block '1';
    }

    if ($bad_user_ip) {
        set $block '1';
    }

    if ($block = '1') {
        return 403;
    }

    # ...
}

The ‘geo’ directive, which like ‘map’ belongs in the ‘http’ context, sets $bad_user_ip based on the client IP address, while the ‘map’ block sets $blocked_user_agent based on the User-Agent header. If either check sets the flag variable $block to ‘1’, the request is rejected with a 403 status.
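The configuration above rejects a request when either flag is set. If the goal is to block only requests that match both the User-Agent pattern and the IP list, one possible sketch is to feed both flags into a second map. It assumes the ‘geo’ and ‘map’ blocks defined above are already present; the variable name $block_both is hypothetical.

```nginx
# Both flags are joined into one string, so "1:1" means that
# the User-Agent and the client IP both matched.
map "$blocked_user_agent:$bad_user_ip" $block_both {
    default 0;
    "1:1"   1;
}

server {
    listen 80;

    if ($block_both) {
        return 403;
    }

    # ...
}
```

This keeps the ‘server’ block down to a single ‘if’ and avoids the intermediate flag variable.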

Testing Your Configuration

Before applying changes, it’s important to test your NGINX configuration to avoid any mistakes.

nginx -t

This command will check the syntax of your configuration files. If there are no errors, you can then reload your NGINX server to apply the changes without any downtime:

service nginx reload

Conclusion

This tutorial has walked you through several techniques for blocking requests by ‘User-Agent’ string in NGINX, from a single pattern to a combination with IP-based rules. Applying these methods can help reduce unwanted traffic and give you greater control over who reaches your applications.