
Overview
aiohttp
is a modern library that provides asynchronous HTTP client and server functionality for Python. Streams are a way of handling data in chunks, without loading the whole file into memory at once. This can be useful for downloading large files or handling multiple requests concurrently.
In general, you can download files (especially large files of several hundred MB or more) with aiohttp
streams by following the steps listed below:
- Create an
aiohttp.ClientSession
object, which represents a connection pool for making HTTP requests. - Use the session.get() method to send a GET request to the file URL and get an
aiohttp.ClientResponse
object, which represents the response from the server. - Use the
response.content
attribute to access anaiohttp.StreamReader
object, which is a stream for reading the response body. - Use the
stream.read()
orstream.readany()
methods to read chunks of data from the stream and write them to a file object on your computer. - Close the response and the
session
objects when you are done (this can be done automatically by using theasync with
statement).
Words might be confusing and hard to understand. Let’s examine the following example for more clarity.
Complete Example
What we’re going to do is download 2 files at the same time. One file is CSV, and the other is PDF. Here’s the URL of the CSV file:
https://api.slingacademy.com/v1/sample-data/files/student-scores.csv
And this is the URL of the PDF file:
https://api.slingacademy.com/v1/sample-data/files/text-and-table.pdf
Before writing code, make sure you don’t forget to install aiohttp
:
pip install aiohttp
The complete code (with explanations):
# SlingAcademy.com
# This code uses Python 3.11.4
import asyncio
import aiohttp
# This function downloads a file from a URL and saves it to a local file
# The function is asynchronous and can handle large files because it uses aiohttp streams
async def download_file(url, filename):
async with aiohttp.ClientSession() as session:
print(f"Starting download file from {url}")
async with session.get(url) as response:
assert response.status == 200
with open(filename, "wb") as f:
while True:
chunk = await response.content.readany()
if not chunk:
break
f.write(chunk)
print(f"Downloaded {filename} from {url}")
# This function downloads two files at the same time
async def main():
await asyncio.gather(
# download a CSV file
download_file(
"https://api.slingacademy.com/v1/sample-data/files/student-scores.csv",
"test.csv",
),
# download a PDF file
download_file(
"https://api.slingacademy.com/v1/sample-data/files/text-and-table.pdf",
"test.pdf",
),
)
# Run the main function
asyncio.run(main())
After running the code, you’ll see this output:
Starting download file from https://api.slingacademy.com/v1/sample-data/files/student-scores.csv
Starting download file from https://api.slingacademy.com/v1/sample-data/files/text-and-table.pdf
Downloaded test.pdf from https://api.slingacademy.com/v1/sample-data/files/text-and-table.pdf
Downloaded test.csv from https://api.slingacademy.com/v1/sample-data/files/student-scores.csv
And the downloaded files will be saved in the same directory as your Python script, as shown in the screenshot below:

Their names are test.csv
and test.pdf
.