Fixing NumPy Error: Array is not JSON serializable

Updated: January 23, 2024 By: Guest Contributor

Introduction

When working with NumPy arrays and JSON serialization in Python, you may encounter a TypeError stating that the array ‘is not JSON serializable’ (in recent Python versions the message reads ‘Object of type ndarray is not JSON serializable’). This error occurs because JSON is a text-based format and Python’s json module doesn’t natively handle NumPy’s binary array objects. Fortunately, there are several workarounds for this issue.

Understanding the Error

Before diving into the solutions, let’s understand why this error occurs. JSON is a text-based data interchange format, and Python’s json module only knows how to serialize a fixed set of built-in types: dict, list, tuple, str, int, float, bool, and None. Since NumPy arrays are not among these built-in types, the json module does not know how to convert them to strings. As a result, passing a NumPy array to json.dumps() raises a TypeError.
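A minimal reproduction of the error is sketched below, assuming a recent Python 3 and NumPy. The exact wording of the message can differ between Python versions, so the snippet only captures and prints it.

```python
import json
import numpy as np

dataset = np.array([1, 2, 3])

error_message = None
try:
    # json.dumps() has no built-in rule for np.ndarray, so this raises TypeError
    json.dumps(dataset)
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```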

Solution #1: Use Python’s Built-in Lists

NumPy arrays can be converted to lists, which are serializable by JSON. This method is straightforward. The only drawback is that tolist() builds a full copy of the data as Python objects, which can noticeably increase memory usage for very large arrays.

  1. Convert the NumPy array to a Python list using tolist().
  2. Serialize the list using json.dumps().

Example:

import json
import numpy as np

# Create a NumPy array
dataset = np.array([1, 2, 3])

# Convert to a list and serialize
data_json = json.dumps(dataset.tolist())

# Output
print(data_json)

Notes: Converting to a list is simple and widely compatible, but may not be the most memory-efficient for large arrays.
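The same approach works for multi-dimensional arrays, since tolist() returns nested lists. The sketch below also shows a round trip back to NumPy. Keep in mind that JSON does not store the array's dtype, so np.array() infers it again from the decoded values; the variable names here are illustrative.

```python
import json
import numpy as np

# A 2-D array becomes a list of lists
matrix = np.array([[1.5, 2.5], [3.5, 4.5]])
matrix_json = json.dumps(matrix.tolist())
print(matrix_json)

# Decode the JSON text and rebuild an array from the nested lists
restored = np.array(json.loads(matrix_json))
print(np.array_equal(matrix, restored))
```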

Solution #2: Use a Custom JSON Encoder

Implement a custom JSON encoder that extends json.JSONEncoder to handle NumPy array serialization. This is more flexible and can be customized to handle various NumPy data types.

  1. Define a custom encoder that inherits from json.JSONEncoder.
  2. Override the default() method to serialize NumPy arrays.
  3. Pass this encoder to json.dumps().

Example:

import json
import numpy as np

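class NumPyEncoder(json.JSONEncoder):
    def default(self, obj):
        # Convert NumPy arrays to (nested) Python lists
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        # Defer to the base class, which raises TypeError for unsupported types
        return super().default(obj)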

dataset = np.array([1, 2, 3])
data_json = json.dumps(dataset, cls=NumPyEncoder)
print(data_json)

Notes: This method is more complex but allows for more control over serialization, and can be extended to handle different data types.
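As one possible extension, the encoder can also convert NumPy scalar types such as np.int64, np.float32, and np.bool_, which json.dumps() rejects in the same way as arrays. The class name ExtendedNumPyEncoder and the sample payload are illustrative assumptions, not part of any library.

```python
import json
import numpy as np

class ExtendedNumPyEncoder(json.JSONEncoder):
    """Illustrative encoder covering arrays and common NumPy scalar types."""

    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        if isinstance(obj, np.integer):
            return int(obj)
        if isinstance(obj, np.floating):
            return float(obj)
        if isinstance(obj, np.bool_):
            return bool(obj)
        # Anything else still raises TypeError via the base class
        return super().default(obj)

payload = {
    "count": np.int64(3),
    "ratio": np.float32(1.5),
    "flag": np.bool_(True),
    "values": np.array([1, 2]),
}
payload_json = json.dumps(payload, cls=ExtendedNumPyEncoder)
print(payload_json)
```

Because the encoder is applied wherever json.dumps() meets an unsupported object, NumPy values nested inside dictionaries or lists are converted without any manual preprocessing.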

Solution #3: Use Serialization Libraries

Use third-party libraries such as pandas or json_tricks, which can serialize NumPy data to JSON directly. With pandas, the array is first wrapped in a DataFrame or Series, which provides a to_json() method. These libraries can also deal with more complex data types and structures.

  1. Install the serialization library (e.g., pip install pandas).
  2. Use the library’s built-in functions for serialization.

Example:

import pandas as pd
import numpy as np

# Create a NumPy array wrapped in a pandas DataFrame
dataframe = pd.DataFrame(np.array([1, 2, 3]))

# Serialize using pandas' to_json() function
data_json = dataframe.to_json()

# Output
print(data_json)

Notes: These libraries provide powerful serialization options but add dependencies to your project.
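By default, DataFrame.to_json() includes row and column labels in its output. If only the values are needed, the orient parameter can be set to "values", as in the sketch below; it assumes pandas is installed, and the variable names are illustrative.

```python
import json
import numpy as np
import pandas as pd

grid = np.array([[1, 2], [3, 4]])

# orient="values" drops the index and column labels and keeps only the data
grid_json = pd.DataFrame(grid).to_json(orient="values")
print(grid_json)

# The result is ordinary JSON text and can be decoded with the standard library
decoded = json.loads(grid_json)
```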

Conclusion

When facing the ‘array is not JSON serializable’ error while serializing NumPy data, the solutions shown above can help resolve the issue. Converting arrays to lists is the simplest fix, a custom JSON encoder gives more control over how each type is converted, and a third-party serialization library handles more complex structures at the cost of an extra dependency.