Python asyncio: How to download a large file and show progress (percentage)

Last updated: February 11, 2024

Getting Started

Before diving into the code, make sure you have Python 3.7 or higher installed. The async/await syntax used throughout was introduced in Python 3.5, but recent aiohttp releases may require a newer Python version, so check the requirements of the release you install. You’ll also need the aiohttp library, which you can install with pip:

pip install aiohttp

Setting Up Your Download Function

First, let’s import the required libraries and define a simple asynchronous function for downloading files. Here’s a skeletal structure:

import asyncio
import aiohttp

async def download_file(url, file_name):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            with open(file_name, 'wb') as fd:
                while True:
                    chunk = await resp.content.read(1024)
                    if not chunk:
                        break
                    fd.write(chunk)

This function opens an aiohttp session, requests the file, and writes the response body to disk in chunks of up to 1024 bytes. Because only one chunk is held in memory at a time, memory use stays low even for very large files.
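The read-and-write loop itself does not need a network connection, so it can be tried offline. The sketch below assumes asyncio.StreamReader from the standard library as a stand-in for resp.content (both expose an awaitable read(n) method); save_stream and demo are hypothetical names used only for illustration.

```python
import asyncio
import os
import tempfile

async def save_stream(reader, file_name, chunk_size=1024):
    """Copy an async byte stream to disk chunk by chunk; return the bytes written."""
    written = 0
    with open(file_name, "wb") as fd:
        while True:
            chunk = await reader.read(chunk_size)
            if not chunk:  # an empty bytes object signals end of stream
                break
            fd.write(chunk)
            written += len(chunk)
    return written

async def demo():
    # Stand-in for a response body: 5000 bytes followed by end-of-stream
    reader = asyncio.StreamReader()
    reader.feed_data(b"x" * 5000)
    reader.feed_eof()
    path = os.path.join(tempfile.mkdtemp(), "demo.bin")
    written = await save_stream(reader, path)
    return written, os.path.getsize(path)

print(asyncio.run(demo()))
```

Both numbers in the printed tuple should match the 5000 bytes fed into the reader.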

Implementing Progress Tracking

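To show progress, we need to know the total file size. Servers usually report it in the Content-Length response header, although the header can be absent (for example, when the server uses chunked transfer encoding). Read it once the response has arrived, before the write loop starts: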

progress = 0
file_size = int(resp.headers['Content-Length'])

Then, update the progress within the file writing loop:

progress += len(chunk)
percentage = (progress / file_size) * 100
print(f"Download progress: {percentage:.2f}%")

Each iteration adds the size of the chunk just written to the running total, converts that total to a percentage of the file size, and prints it. This gives users real-time feedback.
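The percentage only works when the server reports a size. As a minimal sketch, assuming a hypothetical helper named format_progress, the calculation can fall back to a plain byte count when Content-Length is missing. It also caps the value at 100%, since aiohttp decompresses compressed responses by default and the bytes read can then exceed the reported length.

```python
from typing import Optional

def format_progress(downloaded: int, total: Optional[int]) -> str:
    """Build a progress message; use a byte count when the total is unknown."""
    if total:  # None or 0 means the server gave no usable size
        percentage = min(downloaded / total * 100, 100.0)
        return f"Download progress: {percentage:.2f}%"
    return f"Downloaded {downloaded} bytes (total size unknown)"

print(format_progress(512, 2048))
print(format_progress(512, None))
```

Inside the download loop, the total would come from something like resp.headers.get('Content-Length'), converted to int when present.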

Putting It All Together

Let’s combine everything into a full download function with progress reporting:

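import asyncio
import aiohttp

async def download_file(url, file_name):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            if resp.status != 200:
                print(f"Failed to download the file (HTTP status {resp.status}).")
                return
            # Content-Length may be missing, so fall back to 0 (size unknown)
            file_size = int(resp.headers.get('Content-Length', 0))
            progress = 0
            with open(file_name, 'wb') as fd:
                while True:
                    # Read up to 1024 bytes; an empty result means the body is complete
                    chunk = await resp.content.read(1024)
                    if not chunk:
                        break
                    fd.write(chunk)
                    progress += len(chunk)
                    if file_size:
                        percentage = (progress / file_size) * 100
                        print(f"Download Progress: {percentage:.2f}%")
                    else:
                        print(f"Downloaded {progress} bytes")

if __name__ == "__main__":
    # Placeholder URL and file name: replace them with your own
    asyncio.run(download_file("https://example.com/large-file.zip", "large-file.zip"))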

This script will download a file, showing the download progress in the terminal. To run, save the code in a file (e.g., async_download.py) and execute it using:

python async_download.py
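With 1024-byte chunks, the function above prints at least one line per kilobyte, which floods the terminal for a large file. One possible refinement, sketched here under the assumption that the total size is known and positive, is to print only when the whole-number percentage changes and to overwrite the same terminal line. ProgressReporter is a hypothetical helper, not part of asyncio or aiohttp.

```python
import sys

class ProgressReporter:
    """Hypothetical helper: writes output only when the integer percentage changes."""

    def __init__(self, total, out=sys.stdout):
        self.total = total          # expected size in bytes, assumed > 0
        self.out = out
        self.downloaded = 0
        self.last_percent = -1      # guarantees the first update is shown

    def update(self, chunk_len):
        """Record a chunk; return True if a new percentage was written."""
        self.downloaded += chunk_len
        percent = self.downloaded * 100 // self.total
        if percent == self.last_percent:
            return False
        self.last_percent = percent
        # '\r' moves the cursor back so the next update overwrites this one
        self.out.write(f"\rDownload progress: {percent}%")
        self.out.flush()
        return True

# Simulate 1000 chunks of 1024 bytes each
reporter = ProgressReporter(total=1024 * 1000)
shown = sum(reporter.update(1024) for _ in range(1000))
print(f"\n{shown} progress updates for 1000 chunks")
```

In the download loop, reporter.update(len(chunk)) would take the place of the print call.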

Handling Larger Files and Rate Limiting

For exceptionally large files, or when you want to avoid saturating the server or your own connection, it’s wise to limit the download speed. A simple way to do this is to insert an asyncio.sleep(x) call within the loop, where x is the number of seconds to pause after each chunk. Keep in mind that with 1024-byte chunks, a 0.1-second pause caps throughput at roughly 10 KB per second, so in practice you would combine the pause with a larger chunk size.

progress += len(chunk)
await asyncio.sleep(0.1)  # Pause for 100 ms after each chunk to throttle the download

Throttling this way keeps the load on the server modest, at the cost of a longer download.
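A fixed pause ties the speed limit to the chunk size. As an alternative sketch, the pause can be derived from a target rate: compare how long the transfer should have taken at that rate with how long it actually took, and sleep for the difference. The names throttle_delay and consume are hypothetical, and the list of byte strings stands in for chunks read from a response.

```python
import asyncio
import time

def throttle_delay(bytes_so_far, elapsed_seconds, max_bytes_per_sec):
    """Seconds to sleep so the average rate stays at or below the limit."""
    expected = bytes_so_far / max_bytes_per_sec  # time the transfer should have taken
    return max(0.0, expected - elapsed_seconds)

async def consume(chunks, max_bytes_per_sec):
    """Process chunks while capping the average throughput."""
    start = time.monotonic()
    received = 0
    for chunk in chunks:
        received += len(chunk)
        delay = throttle_delay(received, time.monotonic() - start, max_bytes_per_sec)
        if delay > 0:
            await asyncio.sleep(delay)
    return received

# 3 chunks of 100 bytes at 10,000 bytes per second: roughly 0.03 s in total
print(asyncio.run(consume([b"x" * 100] * 3, 10_000)))
```

Because the delay depends on the running totals rather than on each chunk, a slow network simply produces a delay of zero and the loop never sleeps unnecessarily.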

Conclusion

Python’s asyncio and aiohttp offer an effective way to handle asynchronous file downloads: the event loop stays free for other tasks while a download waits on the network, and reading the response in chunks makes it straightforward to show progress. Together they are well suited to network-bound work. Remember that understanding how asynchronous operations work is key to taking full advantage of these features.

With the code from this tutorial, you can add file downloads to your Python applications with a progress indicator that keeps users informed throughout the process.
