Introduction
NumPy is a foundational package for numerical computing in Python. Among its many features, NumPy provides efficient ways to read and write array data to and from files, which is critical for data science, engineering, and analysis tasks. In this tutorial, you will learn how to perform file input and output (I/O) with NumPy.
Prerequisites
Before diving in, ensure you have the following prerequisites:
- A working Python installation
- NumPy installed (for example, with pip install numpy)
- A basic understanding of Python and NumPy arrays
NumPy File I/O Basics
NumPy primarily deals with arrays, and it includes built-in functions to save arrays to files and load them back. The two primary functions we’ll explore are np.save and np.load. Let’s start with a simple example:
import numpy as np
# Create a NumPy array
array = np.array([1, 2, 3, 4, 5])
# Save the array to a file
np.save('my_array', array)
# Now, let's load the saved array
loaded_array = np.load('my_array.npy')
print(loaded_array)
Output:
[1 2 3 4 5]
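This is the simplest way to save and load a NumPy array. Note that np.save appends the .npy extension when the file name does not already end in it, which is why the array is loaded from 'my_array.npy'.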
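The round trip above can also be checked programmatically. The following is a minimal sketch, not part of the original example: it writes into a temporary directory (an assumption made here so the script cleans up after itself), lists the file that np.save actually produced, and confirms that the data type and shape survive.

```python
import os
import tempfile

import numpy as np

# Use an explicit dtype so we can check that it is preserved on disk.
original = np.array([1, 2, 3, 4, 5], dtype=np.int32)

# The temporary directory is an illustrative choice, not a NumPy requirement.
with tempfile.TemporaryDirectory() as tmp_dir:
    base_path = os.path.join(tmp_dir, "my_array")  # no extension given
    np.save(base_path, original)
    saved_names = os.listdir(tmp_dir)  # shows the name written to disk
    restored = np.load(base_path + ".npy")

print(saved_names)
print(restored.dtype, restored.shape)
print(np.array_equal(original, restored))
```

Comparing with np.array_equal is a convenient way to confirm that nothing was lost when the array went to disk and back.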
Saving and Loading Multi-Dimensional Arrays
NumPy can handle multi-dimensional arrays with ease. Here’s how you can save and load a two-dimensional array:
import numpy as np
# Create a 2D NumPy array
array_2d = np.array([[1, 2, 3], [4, 5, 6]])
# Save to a file
np.save('my_2d_array', array_2d)
# Load the array from the saved file
loaded_2d_array = np.load('my_2d_array.npy')
print(loaded_2d_array)
Output:
[[1 2 3]
 [4 5 6]]
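The same two calls work unchanged for arrays with more than two dimensions. As a small sketch under the same assumptions (a temporary directory and a made-up file name, cube.npy), here is a three-dimensional array saved and loaded with its shape intact:

```python
import os
import tempfile

import numpy as np

# A 2 x 3 x 4 array holding the values 0.0 through 23.0.
cube = np.arange(24, dtype=np.float64).reshape(2, 3, 4)

# Temporary directory and file name are illustrative choices.
with tempfile.TemporaryDirectory() as tmp_dir:
    cube_path = os.path.join(tmp_dir, "cube.npy")
    np.save(cube_path, cube)
    restored_cube = np.load(cube_path)

print(restored_cube.shape)
print(restored_cube[1, 2, 3])  # last element of the array
```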
Saving Multiple Arrays in a Single File
NumPy also provides a way to save multiple arrays into a single file using the np.savez function, which saves the arrays in an uncompressed .npz format, or np.savez_compressed for compressed storage.
import numpy as np
# Create multiple arrays
array1 = np.array([10, 20, 30])
array2 = np.array([40, 50, 60])
# Save arrays in a single file
np.savez('multiple_arrays', arr1=array1, arr2=array2)
# Load the data
arrays_loaded = np.load('multiple_arrays.npz')
# Access individual arrays using the keys
print(arrays_loaded['arr1'])
print(arrays_loaded['arr2'])
Output:
[10 20 30]
[40 50 60]
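For the compressed variant, a minimal sketch might look like the following. The archive name and temporary directory are assumptions made for illustration. The object returned by np.load for an .npz file can be used as a context manager, and its files attribute lists the stored keys.

```python
import os
import tempfile

import numpy as np

array1 = np.array([10, 20, 30])
array2 = np.array([40, 50, 60])

# Temporary directory and archive name are illustrative choices.
with tempfile.TemporaryDirectory() as tmp_dir:
    archive_path = os.path.join(tmp_dir, "multiple_arrays_compressed.npz")
    np.savez_compressed(archive_path, arr1=array1, arr2=array2)

    # The context manager closes the underlying file when the block ends.
    with np.load(archive_path) as archive:
        stored_keys = sorted(archive.files)
        restored = {key: archive[key] for key in stored_keys}

print(stored_keys)
print(restored["arr1"])
print(restored["arr2"])
```

Copying the arrays into a dictionary inside the with block keeps them available after the archive has been closed.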
Saving and Loading Text Files
NumPy can read and write arrays to plain text files. You can use np.savetxt to write arrays to text files and np.loadtxt to read them back into an array.
import numpy as np
# Create an array
array = np.array([1.5, 2.5, 3.5])
# Save to a .txt file
np.savetxt('array.txt', array, delimiter=',')
# Load from the .txt file
loaded_array_txt = np.loadtxt('array.txt', delimiter=',')
print(loaded_array_txt)
Output:
[1.5 2.5 3.5]
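The delimiter matters most for two-dimensional data, where each row becomes one line of text. The sketch below is an illustration with made-up values and a hypothetical file name (table.csv). It uses the fmt and header parameters of np.savetxt. By default the header line is written with a leading "# ", and np.loadtxt skips such comment lines when reading.

```python
import os
import tempfile

import numpy as np

# Three rows, two columns of sample values (illustrative data).
table = np.array([[1.5, 2.0], [2.5, 4.0], [3.5, 6.0]])

# Temporary directory and file name are illustrative choices.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "table.csv")
    # fmt controls how each number is written; header adds a comment line.
    np.savetxt(csv_path, table, delimiter=",", fmt="%.2f", header="x,y")

    with open(csv_path) as handle:
        first_line = handle.readline().rstrip("\n")

    restored_table = np.loadtxt(csv_path, delimiter=",")

print(first_line)
print(restored_table.shape)
```

Keep in mind that text output is limited to the precision given in fmt, so values with more decimal places than the format allows will not round-trip exactly.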
Advanced: Saving and Loading Structured Arrays
NumPy can work with structured arrays, and you can save and load these array types as well. A structured array has fields, similar to columns in a table.
import numpy as np
# Define a structured data type
dtype = [('name', 'U10'), ('height', float), ('age', int)]
# Create a structured array
people_array = np.array([('Alice', 1.65, 30), ('Bob', 1.85, 25)], dtype=dtype)
# Save the structured array
np.save('people_array', people_array)
# Load the structured array
loaded_people_array = np.load('people_array.npy')
print(loaded_people_array)
Output:
[('Alice', 1.65, 30) ('Bob', 1.85, 25)]
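Once loaded, the fields are still addressable by name. As a short sketch built on the same data (the temporary directory is an assumption for illustration), the following reads back the array and works with individual fields:

```python
import os
import tempfile

import numpy as np

dtype = [("name", "U10"), ("height", float), ("age", int)]
people = np.array([("Alice", 1.65, 30), ("Bob", 1.85, 25)], dtype=dtype)

# Temporary directory is an illustrative choice.
with tempfile.TemporaryDirectory() as tmp_dir:
    people_path = os.path.join(tmp_dir, "people_array.npy")
    np.save(people_path, people)
    restored_people = np.load(people_path)

# Field names come back exactly as they were defined.
field_names = restored_people.dtype.names
# Each field behaves like an ordinary one-dimensional array.
mean_age = restored_people["age"].mean()
tallest = restored_people[np.argmax(restored_people["height"])]["name"]

print(field_names)
print(mean_age)
print(tallest)
```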
Conclusion
In this guide, we covered how to save and load arrays to files with NumPy, from simple to more structured data types. Working with files is a common operation and doing so efficiently is vital in data-heavy applications. By mastering NumPy’s I/O capabilities, you’re better equipped to deal with a variety of data persistence scenarios.