Python: How to iterate over all files in a directory

Last updated: January 13, 2024

Overview

Working with the filesystem is a common task for Python developers. Whether you’re writing a script to automate a chore, building a server application, or processing a batch of files, you will often need to read through the contents of a directory. Python’s standard library provides several ways to iterate over all files in a directory.

In this tutorial, we’ll explore various methods to iterate over all files in a directory in Python. We’ll cover the use of the os module, the more modern pathlib module added in Python 3.4, and some third-party libraries that further simplify the task.

Prerequisites

  • Python 3.6 or later, preferably a recent release (the os.scandir() examples use a with statement, which requires Python 3.6)
  • Basic knowledge of Python syntax and the file system
  • A directory with some files to practice with

Using the os Module

Let’s begin with the os module from the standard library, which provides functions for directory and file operations. os.listdir() and os.walk() are the two most commonly used for iterating through files in a directory.

Using os.listdir()

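The os.listdir() function returns a list of the names of the entries in the directory given by path. The names are bare (not full paths) and include subdirectories as well as files, so join each name with the directory and check that it is a file:

import os

directory = '/path/to/directory'

for filename in os.listdir(directory):
    filepath = os.path.join(directory, filename)
    # Skip subdirectories; keep only regular files ending in .txt
    if filename.endswith('.txt') and os.path.isfile(filepath):
        # Do something with the file
        print(filepath)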
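
os.listdir() returns names in arbitrary order, so results can differ between runs or machines. The sketch below wraps the loop in a helper that sorts the names first. The function name list_files and its suffix parameter are illustrative choices, not part of the os module.

```python
import os


def list_files(directory, suffix=".txt"):
    """Return sorted full paths of regular files in `directory` ending in `suffix`."""
    paths = []
    # Sort the bare names so the output order is stable.
    for name in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, name)
        # Skip subdirectories whose names happen to end with the suffix.
        if name.endswith(suffix) and os.path.isfile(full_path):
            paths.append(full_path)
    return paths
```

Calling list_files('/path/to/directory') returns a list you can loop over, pass to another function, or count with len().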

Using os.walk()

os.walk() is a more powerful tool for directory traversal. It walks a whole directory tree, top-down by default or bottom-up on request, and for each directory yields a tuple of the directory path, the names of its subdirectories, and the names of its files:

import os

for subdir, dirs, files in os.walk('/path/to/directory'):
    for file in files:
        # Do something with the file
        filepath = os.path.join(subdir, file)
        print(filepath)
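
When walking top-down, you can stop os.walk() from descending into certain directories by editing the list of subdirectory names in place. The following is a minimal sketch of that technique; walk_files and the default skip_dirs names are hypothetical and only serve as an illustration.

```python
import os


def walk_files(root, skip_dirs=(".git", "__pycache__")):
    """Yield paths of files under `root`, skipping directories named in `skip_dirs`."""
    for current_dir, dirnames, filenames in os.walk(root, topdown=True):
        # Slice assignment edits the list in place, which tells os.walk()
        # not to descend into the removed directories. This only has an
        # effect when topdown=True (the default).
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            yield os.path.join(current_dir, name)
```

Because walk_files() is a generator, paths are produced lazily, which helps with large trees.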

Using the pathlib Module

The pathlib module, available since Python 3.4, offers an object-oriented interface to files and directories and is often preferred in modern code. Path.iterdir() yields a Path object for each entry in the directory:

from pathlib import Path

dir_path = Path('/path/to/directory')

for file_path in dir_path.iterdir():
    if file_path.is_file() and file_path.suffix == '.txt':
        print(file_path)

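Using Glob Patterns with pathlib

Path objects have a glob() method, so you can filter by pattern without importing the separate glob module:

from pathlib import Path

dir_path = Path('/path/to/directory')

for file_path in dir_path.glob('*.txt'):
    print(file_path)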
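
A pattern such as '*.txt' matches only the directory's immediate entries. To search subdirectories as well, pathlib provides Path.rglob(). The sketch below assumes you want regular files only; the helper name find_text_files is made up for this example.

```python
from pathlib import Path


def find_text_files(root, pattern="*.txt"):
    """Return sorted paths of regular files under `root` matching `pattern`, at any depth."""
    root = Path(root)
    # rglob(pattern) searches `root` and every directory below it.
    # is_file() filters out directories whose names match the pattern.
    return sorted(p for p in root.rglob(pattern) if p.is_file())
```

The result is a list of Path objects, so attributes such as .name, .suffix, and .parent are available on each item.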

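Using os.scandir() with a with Statement

Python 3.5 introduced os.scandir(), which returns an iterator of os.DirEntry objects instead of a list of names. It is usually more efficient for large directories, because each entry can carry file-type information obtained while the directory was being read. Since Python 3.6 the iterator can be used in a with statement, which closes it promptly when the block ends: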

import os

with os.scandir('/path/to/directory') as entries:
    for entry in entries:
        if entry.is_file() and entry.name.endswith('.txt'):
            print(entry.path)
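
Each os.DirEntry also has a stat() method, and its result is cached on the entry, so gathering file sizes usually needs fewer system calls than calling os.path.getsize() on every name. Here is a small sketch; file_sizes is a hypothetical helper, and ignoring symbolic links is an assumption made for this example.

```python
import os


def file_sizes(directory):
    """Return a dict mapping file name to size in bytes for regular files in `directory`."""
    sizes = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            # follow_symlinks=False inspects the entry itself, so symbolic
            # links are not reported as regular files.
            if entry.is_file(follow_symlinks=False):
                sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    return sizes
```

Summing the values with sum(file_sizes(path).values()) gives the total size of the files directly inside the directory.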

Error Handling

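When iterating through directories, you may run into a missing directory, permission errors, or broken symbolic links. Catch the exceptions you expect so that the script fails gracefully:

import os

try:
    with os.scandir('/path/to/directory') as entries:
        for entry in entries:
            # is_file() follows symlinks and returns False for a broken link
            if entry.is_file() and entry.name.endswith('.txt'):
                print(entry.path)
except FileNotFoundError as e:
    print(f'Directory not found: {e}')
except PermissionError as e:
    print(f'Permission denied: {e}')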
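
os.walk() behaves differently: by default it silently ignores errors raised while listing a directory. Its onerror parameter accepts a callback that receives the OSError instance. The sketch below collects those errors in a list; the helper name walk_with_errors is an assumption for this example.

```python
import os


def walk_with_errors(root):
    """Return (files, errors): file paths under `root` and the OSErrors hit while listing."""
    files = []
    errors = []
    # onerror is called with the OSError whenever a directory cannot be
    # listed; appending it lets the walk continue with the rest of the tree.
    for current_dir, _dirnames, filenames in os.walk(root, onerror=errors.append):
        for name in filenames:
            files.append(os.path.join(current_dir, name))
    return files, errors
```

After the call, each item in errors has a filename attribute naming the directory that could not be read, which is handy for logging.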

Advanced Directory Traversal With Third-Party Libraries

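Third-party packages such as scandir and glob2 also exist, although the standard library now covers most of what they offer. scandir is a backport of os.scandir() for Python versions older than 3.5, and glob2 provides recursive ** patterns, which the built-in glob module has supported since Python 3.5. They are mainly of interest if you maintain code for older interpreters.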
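
For recursive pattern matching without extra dependencies, the standard glob module accepts recursive=True. This is a minimal sketch using only the standard library; recursive_glob is a hypothetical helper name.

```python
import glob
import os


def recursive_glob(root, pattern="*.txt"):
    """Return sorted paths under `root` matching `pattern` at any depth."""
    # With recursive=True, "**" matches zero or more directory levels,
    # so files directly inside `root` are included too.
    return sorted(glob.glob(os.path.join(root, "**", pattern), recursive=True))
```

Note that glob skips names starting with a dot by default, so hidden files are not returned by this helper.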

Conclusion

Python provides several ways to iterate over the files in a directory, and the right choice depends on your requirements. For a single directory, os.listdir(), os.scandir(), or Path.iterdir() is usually enough; for a whole directory tree, use os.walk() or pathlib’s glob patterns. Remember to handle potential errors so that your scripts stay robust and reliable.

The code samples here are a starting point. Adapt the paths, patterns, and error handling to your own project.
