Python Requests module: Setting custom user agent

Updated: January 2, 2024 By: Guest Contributor

Overview

Working with web requests often necessitates simulating the behavior of different browsers. In Python, the requests module simplifies the process of setting custom user agents to tailor HTTP requests for your specific needs.

Introduction to User Agents

Whenever a web browser or automated script such as a web crawler sends a request to a web server, it includes a User-Agent header in the request. This header provides the server with information about the type of client making the request, allowing the server operators to customize the response for different device types, browsers, or to manage bots. User agents can also play a key role in scraping tasks and web automation, as they can help to mimic regular browser traffic to avoid getting blocked.

In this tutorial, we delve into the Python requests module to demonstrate how you can set a custom user agent for your HTTP requests. We’ll start with basic examples and move on to more advanced use-cases, including how to handle sites that use user agents for content delivery personalization or access control.

Getting Started with Python Requests

Before we look at setting custom user agents, let’s first install the requests module if it’s not already included in your Python environment:

pip install requests

Now, let’s send a simple request:

import requests
url = 'https://httpbin.org/get'
response = requests.get(url)
print(response.text)

The requests.get call sends a GET request to the specified URL, and the print statement writes the response body to the console. By default, requests identifies itself with a User-Agent of the form python-requests/<version>.
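If you want to see exactly which default headers will be sent before making a request, you can inspect them directly. Here is a minimal sketch using requests.utils.default_headers(), which returns the headers requests attaches when you don't override them:

import requests

# Inspect the headers requests sends by default, including its User-Agent
default_headers = requests.utils.default_headers()
print(default_headers['User-Agent'])  # e.g. python-requests/2.31.0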

Basic Custom User Agent Setup

To set a custom User Agent, you simply need to pass it as a header in your request:

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}
response = requests.get(url, headers=headers)
print(response.text)

In the code above, we’re mimicking a request from Chrome on a Windows 10 machine.
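Because https://httpbin.org/get echoes the request headers back as JSON, you can confirm that the custom value was actually sent. A quick check, reusing the url and headers variables from above:

response = requests.get(url, headers=headers)
# httpbin echoes our request headers, so we can verify the User-Agent we sent
print(response.json()['headers']['User-Agent'])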

Rotating User Agents

When scraping websites or automating interactions, rotating the user agent makes a bot's traffic look less uniform and can reduce the chance of being blocked. This is straightforward with the requests module:

import random

user_agents = [
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1 Safari/605.1.15',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:81.0) Gecko/20100101 Firefox/81.0',
    'Mozilla/5.0 (Linux; Android 10; SM-G981B) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.96 Mobile Safari/537.36'
]

headers = {
    'User-Agent': random.choice(user_agents)
}
response = requests.get(url, headers=headers)
print(response.text)

Here, random.choice picks one user agent from the list. To actually rotate, choose a new one for every request you send rather than reusing the same headers dictionary.
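A minimal sketch of per-request rotation, assuming the imports and the user_agents list from above and a hypothetical list of target URLs:

urls = ['https://httpbin.org/get', 'https://httpbin.org/headers']  # hypothetical targets

for target in urls:
    # Pick a fresh user agent for every request
    headers = {'User-Agent': random.choice(user_agents)}
    response = requests.get(target, headers=headers)
    print(response.status_code, headers['User-Agent'])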

Advanced Usage: Session Objects and Persistence

The requests module also provides a Session object, which persists parameters such as headers and cookies across requests and reuses the underlying connection. This is especially useful when you need to present a consistent user agent:

s = requests.Session()
s.headers.update({
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36'
})
response = s.get(url)
print(response.text)
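Every request made through this session now carries the same User-Agent without passing headers each time. A small sketch, again using httpbin.org to echo the headers back:

# Both requests reuse the session's headers (and its connection pool)
first = s.get('https://httpbin.org/get')
second = s.get('https://httpbin.org/headers')
print(first.json()['headers']['User-Agent'])
print(second.json()['headers']['User-Agent'])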

Incorporating User Agent Libraries

While you can set or rotate user agents manually, dedicated libraries offer more variety and draw from regularly updated lists of real user agent strings. The fake-useragent library is one popular option (install it with pip install fake-useragent):

from fake_useragent import UserAgent

ua = UserAgent()
headers = {
    'User-Agent': ua.random
}
response = requests.get(url, headers=headers)
print(response.text)
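Besides ua.random, fake-useragent also exposes browser-specific attributes such as ua.chrome and ua.firefox if you want to stay within one browser family. A short sketch, reusing the url from earlier:

from fake_useragent import UserAgent

ua = UserAgent()
# Ask for a user agent from a specific browser family instead of a random one
headers = {'User-Agent': ua.chrome}
response = requests.get(url, headers=headers)
print(response.json()['headers']['User-Agent'])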

Handling Custom User Agents with Web Scraping and Automation

Using custom user agents becomes particularly important with web scraping and automation. Care must be taken to comply with the terms of service (ToS) of the target website and to respect its robots.txt directives. The following snippet shows a basic request pattern: check the status code before processing, and fall back to error handling or a retry with a different user agent on failure:

response = requests.get(url, headers=headers)

if response.status_code == 200:
    # Success: process the response body
    print(response.text)
else:
    # Handle the error, or retry with a different user agent
    print(f'Request failed with status {response.status_code}')

Always be mindful of the ethical and legal considerations when using custom user agents.
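To honor the robots.txt directives mentioned above, Python's standard-library urllib.robotparser can tell you whether a given user agent is allowed to fetch a URL, and a short delay between requests keeps the load on the server reasonable. A minimal sketch, with a hypothetical bot identifier as the user agent:

import time
import urllib.robotparser
from urllib.parse import urlparse

import requests

url = 'https://httpbin.org/get'
headers = {'User-Agent': 'my-polite-bot/1.0'}  # hypothetical bot identifier

# Check robots.txt before fetching
parsed = urlparse(url)
robots = urllib.robotparser.RobotFileParser()
robots.set_url(f'{parsed.scheme}://{parsed.netloc}/robots.txt')
robots.read()

if robots.can_fetch(headers['User-Agent'], url):
    response = requests.get(url, headers=headers)
    print(response.status_code)
    time.sleep(1)  # be polite: pause between requests
else:
    print('Fetching this URL is disallowed by robots.txt')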

Conclusion

Setting a custom user agent with the Python requests module is a simple yet powerful technique. Whether you’re scraping data, automating interactions, or just accessing content in a more controlled manner, custom user agents are an invaluable tool. With the approaches discussed in this article, you can conduct more sophisticated and disguised web interactions with ease.