How to save a NumPy array to a CSV file

Updated: January 23, 2024 By: Guest Contributor

Introduction

The comma-separated values (CSV) file is a widely used, human-readable format for storing and exchanging tabular data, especially in data science and analytics. NumPy, an essential library for data handling in Python, provides the np.savetxt function for writing arrays to text files, including CSV. In this tutorial, we will go through the process of exporting a NumPy array to a CSV file, step by step, with code examples ranging from the most basic use cases to more advanced scenarios.

Before diving into the examples, ensure that you have Python and the NumPy library installed. To install NumPy if you haven’t done so, run the following command:

pip install numpy

Basic Example of Saving a NumPy Array to a CSV File

The simplest scenario involves saving a one-dimensional or two-dimensional NumPy array to a CSV file. Let’s start with a basic example:

import numpy as np

# Create a simple 2D array
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Save to a CSV file
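np.savetxt('data.csv', data, fmt='%d', delimiter=',')

This code snippet creates a file ‘data.csv’. The fmt='%d' argument writes the values as integers; without it, np.savetxt falls back to its default format '%.18e' and writes every value in scientific notation (1 becomes 1.000000000000000000e+00). The file contains:

1,2,3
4,5,6
7,8,9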
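To confirm what was written, one option is to read the file back with np.loadtxt. The following is a minimal sketch of that round trip; the temporary directory and the variable names are illustrative choices, not part of the original example.

```python
import os
import tempfile

import numpy as np

data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Work in a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.csv")
    np.savetxt(path, data, fmt="%d", delimiter=",")

    # Raw text of the file, to see exactly what was written.
    with open(path) as f:
        text = f.read()

    # Parse the file back into an array; dtype=int keeps integer values.
    loaded = np.loadtxt(path, delimiter=",", dtype=int)

print(text)
print(np.array_equal(loaded, data))
```

Passing dtype=int matters here because np.loadtxt returns floating-point values by default.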

Specify Column Headers

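If you want to include a header row in your CSV file, use the 'header' parameter of np.savetxt. By default, NumPy prefixes the header line with '# ' so that it reads as a comment; passing comments='' removes that prefix. The 'fmt' parameter is set to '%d' so that the values are written as integers: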

headers = 'Column1,Column2,Column3'
np.savetxt('data_with_headers.csv', data, fmt='%d', delimiter=',', header=headers, comments='')

This will add a header line to your CSV:

Column1,Column2,Column3
1,2,3
4,5,6
7,8,9
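A file with a header row needs slightly different handling when it is read back. The sketch below shows two options that NumPy offers: skipping the header with np.loadtxt(skiprows=1), or using it as field names with np.genfromtxt(names=True). The temporary path and variable names are assumptions made for illustration.

```python
import os
import tempfile

import numpy as np

data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
headers = "Column1,Column2,Column3"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data_with_headers.csv")
    np.savetxt(path, data, fmt="%d", delimiter=",", header=headers, comments="")

    # Option 1: ignore the header line and load a plain 2D array.
    values = np.loadtxt(path, delimiter=",", skiprows=1, dtype=int)

    # Option 2: use the header line as field names of a structured array.
    table = np.genfromtxt(path, delimiter=",", names=True, dtype=int)

print(values.shape)
print(table.dtype.names)
print(table["Column2"])
```

The second option returns a one-dimensional structured array whose columns are accessed by name rather than by index.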

Saving with Custom Formatting

The 'fmt' parameter controls how each value is written. It accepts either a single format that is applied to every column or a sequence with one format per column. Let’s assume you have an array that contains floating point numbers and you want to limit the number of decimal places:

data_floats = np.random.rand(3,3)
np.savetxt('data_floats.csv', data_floats, fmt='%.2f', delimiter=',')

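The ‘%.2f’ format specifier in the code above tells np.savetxt to represent floating-point numbers with two decimal places. Because the array is filled with random numbers, your values will differ, but the file will look similar to this:

0.68,0.79,0.65
0.14,0.50,0.44
0.59,0.54,0.32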
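When columns need different formats, 'fmt' can also be given as a list with one specifier per column. The sketch below assumes a small made-up array named measurements (an ID, a ratio, and a large-range value per row); the data and names are purely illustrative.

```python
import os
import tempfile

import numpy as np

# Hypothetical data: column 0 is an ID, column 1 a ratio, column 2 a wide-range value.
measurements = np.array([
    [1, 0.12345, 1234.5],
    [2, 0.6789, 0.00042],
])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "measurements.csv")

    # One format per column: integer, three decimals, scientific notation.
    np.savetxt(path, measurements, fmt=["%d", "%.3f", "%.2e"], delimiter=",")

    with open(path) as f:
        lines = f.read().splitlines()

for line in lines:
    print(line)
```

The list must contain exactly as many formats as the array has columns; otherwise np.savetxt raises an error.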

Advanced Example: Structured Arrays

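NumPy can create structured arrays whose fields have different data types. To save a structured array to a CSV file, pass np.savetxt a format string with one specifier per field. Because this format string already contains the commas, the 'delimiter' argument is ignored in this case: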

structured_data = np.array([(1, 'John', 9.5), (2, 'Alice', 8.7)], dtype=[('id', 'i'), ('name', 'U10'), ('score', 'f4')])
np.savetxt('structured_data.csv', structured_data, fmt='%i,%s,%.2f', delimiter=',', header='ID,Name,Score', comments='')

Your file ‘structured_data.csv’ will contain:

ID,Name,Score
1,John,9.50
2,Alice,8.70
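Mixed-type files like this one can be read back with np.genfromtxt, which can infer a type for each column. The following is a sketch under a few assumptions: dtype=None asks NumPy to guess the column types, names=True takes field names from the header row, and the temporary path is illustrative.

```python
import os
import tempfile

import numpy as np

structured_data = np.array(
    [(1, "John", 9.5), (2, "Alice", 8.7)],
    dtype=[("id", "i4"), ("name", "U10"), ("score", "f4")],
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "structured_data.csv")
    np.savetxt(path, structured_data, fmt="%i,%s,%.2f",
               header="ID,Name,Score", comments="")

    # dtype=None infers a type per column; encoding makes names load as str.
    records = np.genfromtxt(path, delimiter=",", names=True,
                            dtype=None, encoding="utf-8")

print(records.dtype.names)
print(records["Name"])
print(records["Score"])
```

Note that this simple approach does not quote text fields, so a name that itself contains a comma would break the column layout.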

Writing Multidimensional Arrays

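np.savetxt only accepts one-dimensional and two-dimensional arrays and raises a ValueError for anything else. If you have an array with more than two dimensions, you will need to reshape or slice the data into a two-dimensional format first: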

# 3-Dimensional array example
multi_dim_array = np.arange(27).reshape(3, 3, 3)
for i, plane in enumerate(multi_dim_array):
    # Each plane is one 2D slice; the name avoids shadowing the built-in slice
    np.savetxt(f'slice_{i}.csv', plane, fmt='%d', delimiter=',')

This will create three separate CSV files (‘slice_0.csv’, ‘slice_1.csv’, ‘slice_2.csv’), one for each of the 2D slices of the array. For example, ‘slice_0.csv’ contains:

0,1,2
3,4,5
6,7,8
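The reshape route keeps everything in a single file instead. The sketch below flattens each 2D slice into one row and restores the original shape after loading; it assumes the shape is known when reading, since a CSV file does not store it. The variable names are illustrative.

```python
import os
import tempfile

import numpy as np

multi_dim_array = np.arange(27).reshape(3, 3, 3)

# Flatten every 2D slice into one row: shape (3, 3, 3) becomes (3, 9).
flat_2d = multi_dim_array.reshape(multi_dim_array.shape[0], -1)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "multi_dim.csv")
    np.savetxt(path, flat_2d, fmt="%d", delimiter=",")

    # The CSV holds no shape information, so the original shape is reapplied here.
    restored = np.loadtxt(path, delimiter=",", dtype=int).reshape(multi_dim_array.shape)

print(flat_2d.shape)
print(np.array_equal(restored, multi_dim_array))
```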

Handling Very Large Arrays

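For extremely large NumPy arrays, consider using np.save, which writes NumPy's binary .npy format, instead of saving to a CSV: CSV files can become very large, and writing them can be slow. Note that the example array below holds 100 million float64 values, which is roughly 800 MB in memory: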

large_data = np.random.rand(10000, 10000)
np.save('large_data.npy', large_data)
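To get a feel for the difference, the two formats can be compared on a much smaller array. This is only a rough sketch: the array size, the file names, and the use of the default np.savetxt format are assumptions chosen to keep the example quick, and actual sizes depend on the format string used.

```python
import os
import tempfile

import numpy as np

# Small stand-in for a large array, so the comparison runs quickly.
sample = np.random.rand(200, 50)

with tempfile.TemporaryDirectory() as tmp_dir:
    npy_path = os.path.join(tmp_dir, "sample.npy")
    csv_path = os.path.join(tmp_dir, "sample.csv")

    np.save(npy_path, sample)                   # binary, 8 bytes per float64 value
    np.savetxt(csv_path, sample, delimiter=",")  # text, default '%.18e' format

    npy_size = os.path.getsize(npy_path)
    csv_size = os.path.getsize(csv_path)

    # The binary file restores the array bit for bit.
    reloaded = np.load(npy_path)

print(f"npy: {npy_size} bytes, csv: {csv_size} bytes")
print(np.array_equal(reloaded, sample))
```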

However, if a CSV is necessary for integration or reporting, ensure that your system has the required memory and storage to handle the operation.
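When a CSV is required and the data is produced or loaded in pieces, one possible approach is to write it in chunks to a single open file, since np.savetxt also accepts a file handle. The sketch below uses a small stand-in array and a hypothetical chunk_rows setting; both are illustrative values, not recommendations.

```python
import os
import tempfile

import numpy as np

# Small stand-in for a large array; a real one might be memory-mapped or generated in parts.
rng = np.random.default_rng(0)
large_data = rng.random((1000, 5))
chunk_rows = 256  # hypothetical chunk size

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "large_data.csv")

    # Append each block of rows to the same open file.
    with open(path, "w") as f:
        for start in range(0, large_data.shape[0], chunk_rows):
            block = large_data[start:start + chunk_rows]
            np.savetxt(f, block, fmt="%.6f", delimiter=",")

    loaded = np.loadtxt(path, delimiter=",")

print(loaded.shape)
print(np.allclose(loaded, large_data, atol=1e-6))
```

Limiting the precision with a format such as '%.6f' also shrinks the file, at the cost of rounding the stored values.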

Conclusion

Saving NumPy arrays to CSV files is a common task in data processing and can be accomplished with ease using NumPy’s built-in functions. From simple arrays to more sophisticated structured data, being able to export this information to a CSV format enables better data sharing and interoperability.