Python: Calculating total size of a folder and its contents

Last updated: January 13, 2024

Introduction

Calculating the total size of a folder and its contents is a common task in many computing scenarios. Whether you’re monitoring disk usage, performing housekeeping on your file system, or simply curious about the amount of data a directory contains, Python offers efficient and easy-to-understand methods to get the job done. In this tutorial, we will explore how to calculate the size of a directory using Python’s built-in libraries.

The os module

Python’s built-in os module does most of the work in our folder size calculation. It provides a portable way of using operating system-dependent functionality such as walking directory trees, querying file metadata, and managing files and directories.

To calculate the size of a directory, we need to:

  1. Iterate over each file in the directory and all of its subdirectories.
  2. Get the size of each file.
  3. Add up the sizes for a total sum.

Walking the directory tree

To begin, we must navigate the directory tree. This can be accomplished using os.walk(), which visits the folder and every subdirectory below it. For each directory it visits, it yields a tuple of three values:

  • dirpath – the path to the directory
  • dirnames – a list of subdirectory names in dirpath
  • filenames – a list of the names of the non-directory files in dirpath

Example:

import os

folder_path = '/path/to/your/folder'
for dirpath, dirnames, filenames in os.walk(folder_path):
    print(f'Found directory: {dirpath}')
    for file in filenames:
        print(file)

The above code will print out all directories and files within the specified folder path.
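To make the three values concrete, here is a small sketch that builds a throwaway directory tree and records what os.walk() yields for it. The file and folder names (readme.txt, docs, guide.txt) are made up for illustration only.

```python
import os
import tempfile

# Build a tiny temporary tree:  <root>/readme.txt  and  <root>/docs/guide.txt
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'docs'))
    open(os.path.join(root, 'readme.txt'), 'w').close()
    open(os.path.join(root, 'docs', 'guide.txt'), 'w').close()

    visited = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Store the path relative to the root so the result is easy to read
        rel = os.path.relpath(dirpath, root)
        visited.append((rel, sorted(dirnames), sorted(filenames)))

for entry in visited:
    print(entry)
# ('.', ['docs'], ['readme.txt'])
# ('docs', [], ['guide.txt'])
```

By default os.walk() works top-down, so the starting folder is reported first and each subdirectory follows.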

Calculating the size of files

The next step is to obtain the size of each file. We can do this with os.path.getsize(), which returns the size in bytes of a given file path.

Example:

import os

folder_path = '/path/to/your/folder'
total_size = 0

for dirpath, dirnames, filenames in os.walk(folder_path):
    for file in filenames:
        file_path = os.path.join(dirpath, file)
        total_size += os.path.getsize(file_path)

print(f'Total size of the folder {folder_path} is: {total_size} bytes')

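This code snippet adds up the size of every file under the given folder and prints the total in bytes. Note that this is the sum of the file sizes as reported by the file system, which can differ from the disk space the folder actually occupies.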
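If you prefer pathlib, the same total can probably be expressed more compactly. The sketch below is an alternative to the os.walk() loop, not a replacement for it; the function name folder_size is hypothetical, and the temporary files exist only to check the result.

```python
import tempfile
from pathlib import Path


def folder_size(folder):
    """Return the combined size in bytes of all files under folder."""
    root = Path(folder)
    # rglob('*') yields every entry below root, recursively; keep files only
    return sum(p.stat().st_size for p in root.rglob('*') if p.is_file())


# Small self-check on a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / 'a.txt').write_bytes(b'x' * 100)
    (Path(tmp) / 'sub').mkdir()
    (Path(tmp) / 'sub' / 'b.bin').write_bytes(b'y' * 250)
    demo_total = folder_size(tmp)

print(demo_total)  # 350
```

Wrapping the logic in a function makes it easier to reuse for several folders.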

Handling Exceptions

When working with files and folders, it’s possible to encounter errors such as FileNotFoundError or PermissionError. It’s good practice to handle these exceptions so that our script doesn’t crash during execution.

Example code with exception handling:

import os

folder_path = '/path/to/your/folder'
total_size = 0

for dirpath, dirnames, filenames in os.walk(folder_path):
    for file in filenames:
        file_path = os.path.join(dirpath, file)
        if not os.path.islink(file_path):  # Skip symbolic links
            try:
                total_size += os.path.getsize(file_path)
            except OSError as e:
                # Covers FileNotFoundError and PermissionError; skip this file and keep going
                print(f'Could not read {file_path}: {e}')

print(f'Total size of the folder {folder_path} is: {total_size} bytes')

The additional if statement skips symbolic links. os.path.getsize() follows a link and returns the size of the file it points to, so counting links could add the same file twice or include data stored outside the folder, and a broken link would raise an error.
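A quick way to check how getsize() treats a symbolic link is to compare it with os.lstat(), which reports on the link entry itself. This is a minimal sketch using made-up file names; it assumes a platform where the current user may create symbolic links (on Windows this can require extra privileges).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'data.bin')
    link = os.path.join(tmp, 'data_link')

    with open(target, 'wb') as f:
        f.write(b'\0' * 4096)  # a 4096-byte file
    os.symlink(target, link)

    target_size = os.path.getsize(target)
    size_via_link = os.path.getsize(link)   # follows the link to the target
    size_of_link = os.lstat(link).st_size   # size of the link entry itself

print(size_via_link == target_size)  # True
print(size_of_link)                  # small value; depends on the platform
```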

Formatting the Output

While the size in bytes is exact, it isn’t always the most user-friendly way to understand size. To make our output more readable, let’s format the size to a more conventional measurement like kilobytes (KB), megabytes (MB), or gigabytes (GB).

Example code with formatted output:

import os


def format_size(size_bytes, unit='MB'):
    """Convert a size in bytes to the selected unit (1 KB = 1024 bytes)."""
    units = {
        'KB': 1024,
        'MB': 1024**2,
        'GB': 1024**3,
        'TB': 1024**4
    }
    if unit not in units:
        raise ValueError('Unsupported unit. Available units: KB, MB, GB, TB')

    # size_bytes avoids shadowing Python's built-in bytes type
    formatted_size = size_bytes / units[unit]
    return f'{formatted_size:.2f} {unit}'

folder_path = '/path/to/your/folder'
total_size = 0

for dirpath, dirnames, filenames in os.walk(folder_path):
    for file in filenames:
        file_path = os.path.join(dirpath, file)
        if not os.path.islink(file_path):
            total_size += os.path.getsize(file_path)

formatted_total_size = format_size(total_size, 'MB')
print(f'Total size of the folder {folder_path} is: {formatted_total_size}')

We created a function, format_size, that will take in the number of bytes and output the formatted size. The final print statement now outputs the total size in a friendly format.
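A fixed unit can be awkward when folder sizes vary widely: a small folder shown in GB rounds to 0.00. One possible refinement is to choose the unit automatically. The helper below is a sketch under that assumption; the name human_readable is hypothetical, and it uses the same 1024-based units as above.

```python
def human_readable(size_bytes):
    """Format a byte count using the largest unit that keeps the value >= 1."""
    units = ['B', 'KB', 'MB', 'GB', 'TB']
    value = float(size_bytes)
    index = 0
    # Divide by 1024 until the value is below 1024 or we run out of units
    while value >= 1024 and index < len(units) - 1:
        value /= 1024
        index += 1
    return f'{value:.2f} {units[index]}'


print(human_readable(512))          # 512.00 B
print(human_readable(1536))         # 1.50 KB
print(human_readable(5 * 1024**3))  # 5.00 GB
```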

Conclusion

In this guide, we’ve learned how to use Python’s os module to walk through a directory tree and calculate the total size of its contents. We covered handling exceptions to ensure stable script execution and formatting the output to different size units for better readability. Whether you’re a system administrator, developer, or just someone interested in file system management, you now have a valuable tool to assess folder sizes easily with Python.

Depending on the file systems you work with, you may need to adapt the script, for example by deciding how to treat symbolic links or files you lack permission to read, so that it stays robust and efficient.
