How to benchmark your NumPy arrays and operations

Updated: January 22, 2024 By: Guest Contributor

Introduction

Benchmarking is a vital process in performance optimization, especially when working with numerical computing in Python using NumPy. By measuring the runtime of array operations, you can make informed decisions about where to focus your optimization efforts. In this article, we’ll explore various methods for benchmarking your NumPy arrays and operations, with multiple code examples ranging from the basics to more advanced strategies.

Understanding Benchmarking

Benchmarking in computational contexts is the act of running a series of tests on a program or function to measure its performance, typically in terms of execution time. Accurate benchmarking is critical for optimizing code, as it provides measurable proof of any performance improvements.

Before starting, ensure that you have NumPy installed in your environment:

pip install numpy

Basic Benchmarking with timeit

Python comes with a built-in module, timeit, that can be used to measure the execution time of small code snippets. Here’s how you can use it to benchmark simple NumPy operations:

import numpy as np
from timeit import timeit

# Benchmarking array creation
setup_array_creation = 'import numpy as np'
code_array_creation = 'np.zeros((1000,1000))'

# Time array creation
print(f'Array creation time: {timeit(code_array_creation, setup=setup_array_creation, number=1000)} seconds')

The timeit function takes the code to be timed, the setup code that runs once beforehand, and the number of times the code is executed. The output is the total time for all executions, so divide it by number to get the per-call time. This approach works well for small, self-contained snippets with minimal overhead.
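
If you want per-run figures rather than a single total, timeit.repeat runs several independent trials. Here is a minimal sketch; the trial and loop counts are arbitrary choices for illustration, not recommendations:

from timeit import repeat

# Run 5 independent trials of 100 executions each
times = repeat('np.zeros((1000, 1000))',
               setup='import numpy as np',
               repeat=5,
               number=100)

# Report the best per-call time; the minimum is least affected by system noise
print(f'Best per-call time: {min(times) / 100:.6f} seconds')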

Benchmarking with IPython Magic

If you’re using an IPython environment like Jupyter Notebooks, you can use magic commands for quick benchmarks:

import numpy as np

# Using the magic command to benchmark
%timeit np.linalg.inv(np.random.rand(100,100))

This command automatically measures the time taken to invert a 100×100 matrix filled with random numbers. The %timeit magic runs the statement across multiple loops and repeats to produce a more reliable measurement.
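
The magic also accepts options to control the measurement. A short sketch of two of them, assuming an IPython session with NumPy already imported:

# Control the number of loops (-n) and repeats (-r) explicitly
%timeit -n 10 -r 5 np.linalg.inv(np.random.rand(100, 100))

# Capture the timing result as an object with -o for later inspection
result = %timeit -o np.linalg.inv(np.random.rand(100, 100))
print(result.best)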

Benchmarking with perf_counter

For real-world programs, you may want to time larger blocks of code inline rather than isolated snippets, which timeit is not well suited for. In such cases, perf_counter from the time module is useful:

import numpy as np
from time import perf_counter

start_time = perf_counter()

# More complex NumPy operations
a = np.random.rand(2000,2000)
inverted_a = np.linalg.inv(a)

end_time = perf_counter()

print(f'Inversion took {end_time - start_time} seconds')

Here, you measure the time immediately before and after the operation. This method allows you to include more extensive code blocks that may not be suitable for timeit.
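
Because perf_counter is just a clock, it is easy to wrap in a small reusable helper. Below is a minimal sketch using a context manager; the timer name is our own choice, not part of NumPy or the standard library:

import numpy as np
from time import perf_counter
from contextlib import contextmanager

@contextmanager
def timer(label):
    # Measure the wall-clock time of the enclosed block
    start = perf_counter()
    yield
    print(f'{label}: {perf_counter() - start:.4f} seconds')

with timer('Matrix multiplication'):
    a = np.random.rand(2000, 2000)
    b = np.random.rand(2000, 2000)
    c = a @ b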

Advanced Benchmarking with profilers

When dealing with complex operations, you might use a profiler to analyze the performance of your code. Python profilers such as cProfile can generate detailed reports on the function call time and frequency.

import numpy as np
import cProfile

# Profiling NumPy operations
pr = cProfile.Profile()
pr.enable()

# NumPy operations to profile
np.dot(np.random.rand(1000, 1000), np.random.rand(1000, 1000))

pr.disable()
pr.print_stats(sort='time')

This will display a table showing how much time was spent in each function. It’s more thorough than simple timing, as it can also reveal where your code is spending the most time.
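
If the report is long, you can also dump the collected statistics to a file and inspect them with the standard pstats module; the file name below is arbitrary:

import pstats

# Save the profile data collected above, then reload and filter it
pr.dump_stats('numpy_ops.prof')

stats = pstats.Stats('numpy_ops.prof')
stats.sort_stats('cumulative').print_stats(10)  # show the 10 costliest calls by cumulative time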

Micro and Macro Benchmarking

In addition to measuring individual operations, it’s crucial to understand the impact on the overall program. Micro-benchmarking focuses on individual functions or small code snippets, whereas macro-benchmarking captures the performance of complete applications or systems. For NumPy operations that are part of larger systems, use tools like perf_counter or profilers during actual execution to gauge their impact in real-world scenarios, as sketched below.
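
As a simple macro-level sketch, you can wrap an entire (hypothetical) processing pipeline in a single perf_counter measurement rather than timing each NumPy call separately:

import numpy as np
from time import perf_counter

def pipeline():
    # Stand-in for a larger workload: generate, transform, and reduce data
    data = np.random.rand(500, 500)
    transformed = np.fft.fft2(data)
    return np.abs(transformed).sum()

# Macro benchmark: time the complete pipeline end to end
start = perf_counter()
result = pipeline()
print(f'Full pipeline: {perf_counter() - start:.4f} seconds')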

Conclusion

Consistently benchmarking your NumPy operations allows you to understand and improve performance bottlenecks. Whether utilizing built-in Python tools like timeit and perf_counter, the IPython magic commands, or full-fledged profilers like cProfile, each technique provides valuable insights into the speed and efficiency of your numerical computations. With these tools, you can ensure your NumPy-based applications run at their optimal speed.