NumPy: How to find unique elements in an array

Updated: January 23, 2024 By: Guest Contributor

Introduction

NumPy is a powerful library for numerical computing in Python. It provides tools for working with large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays. In this tutorial, we will explore how to find unique elements within a NumPy array using various techniques and functions provided by the library.

To follow along with this tutorial, you should have a basic understanding of Python programming and be familiar with NumPy. If you haven’t already, you can install NumPy using pip:

pip install numpy

Basic Usage of np.unique()

The most straightforward way to find unique elements in a NumPy array is to use the np.unique() function. Here is a basic example:

import numpy as np

# Create a numpy array
arr = np.array([1, 2, 2, 3, 3, 3, 4, 4, 4, 4])

# Find the unique elements of the array
unique_elements = np.unique(arr)

print(unique_elements)

Output:

[1 2 3 4]

The np.unique() function returns the sorted unique elements of the array.
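As a quick check of the sorting behavior described above, here is a minimal sketch with an unsorted input. The array values and variable names are made up for illustration.

```python
import numpy as np

# Hypothetical unsorted input with repeated values
unsorted_arr = np.array([3, 1, 2, 3, 1])

# The result comes back in ascending order, regardless of input order
unique_sorted = np.unique(unsorted_arr)

print(unique_sorted)  # [1 2 3]
```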

Working with Multidimensional Arrays

When dealing with multidimensional arrays, np.unique() flattens the array by default and returns the unique scalar values. To deduplicate whole rows or columns instead, set the axis parameter: axis=0 compares rows and axis=1 compares columns. For example, to find unique rows:

import numpy as np

# Create a 2D numpy array
arr2d = np.array([[1, 2],
                  [2, 3],
                  [1, 2]])

# Find the unique rows of the array
unique_rows = np.unique(arr2d, axis=0)

print(unique_rows)

Output:

[[1 2]
 [2 3]]
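For comparison, the following sketch shows the default flattening behavior next to unique columns (axis=1). The array cols_arr is a hypothetical example chosen so that its first and third columns are identical.

```python
import numpy as np

# Hypothetical 2D array in which the first and third columns are identical
cols_arr = np.array([[1, 2, 1],
                     [3, 4, 3]])

# Default behavior: the array is flattened before duplicates are removed
flat_unique = np.unique(cols_arr)

# axis=1 treats each column as a single element
unique_cols = np.unique(cols_arr, axis=1)

print(flat_unique)  # [1 2 3 4]
print(unique_cols)  # [[1 2]
                    #  [3 4]]
```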

Return Counts with np.unique()

Pass return_counts=True and np.unique() also returns how many times each unique element occurs, which is useful when you want the frequency distribution of the values:

import numpy as np

# Create a numpy array
arr = np.array([1, 2, 2, 3, 3, 3, 4, 4, 4, 4])

# Find the unique elements and their counts
unique_elements, counts = np.unique(arr, return_counts=True)

print(unique_elements)
print(counts)

Output:

[1 2 3 4]
[1 2 3 4]
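One possible way to use the counts is to build a value-to-frequency mapping and pick out the most frequent value. This is only a sketch; the names frequency and most_common are illustrative and not part of NumPy.

```python
import numpy as np

arr = np.array([1, 2, 2, 3, 3, 3, 4, 4, 4, 4])
values, counts = np.unique(arr, return_counts=True)

# Map each value to its count (tolist() converts to plain Python ints)
frequency = dict(zip(values.tolist(), counts.tolist()))

# The most frequent value sits at the position of the largest count
most_common = values[np.argmax(counts)]

print(frequency)    # {1: 1, 2: 2, 3: 3, 4: 4}
print(most_common)  # 4
```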

Advanced Usage of np.unique()

Sometimes you need to know where the unique elements sit in the original array. With return_index=True, np.unique() also returns, for each unique value, the index of its first occurrence in the original array:

import numpy as np

# Create a numpy array
arr = np.array([1, 2, 2, 3, 3, 3, 4, 4, 4, 4])

# Find the unique elements and their first indices
unique_elements, indices = np.unique(arr, return_index=True)

print(unique_elements)
print(indices)

Output:

[1 2 3 4]
[0 1 3 6]

These indices let you refer back to the original array: here, arr[indices] gives the same values as unique_elements.
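A common use of the first-occurrence indices is to keep unique values in their order of first appearance rather than in sorted order. The sketch below also uses return_inverse=True, another np.unique() option, to rebuild the original array from the unique values. The input array and variable names are illustrative.

```python
import numpy as np

# Hypothetical input whose first appearances are not in sorted order
order_arr = np.array([3, 1, 3, 2, 1])

unique_vals, first_idx, inverse = np.unique(
    order_arr, return_index=True, return_inverse=True
)

# Sorting the first-occurrence indices restores the order of first appearance
in_order = order_arr[np.sort(first_idx)]

# Indexing the unique values with the inverse rebuilds the original array
rebuilt = unique_vals[inverse]

print(in_order)  # [3 1 2]
print(rebuilt)   # [3 1 3 2 1]
```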

Finding Unique Elements in Structured Arrays

NumPy allows you to create structured arrays, whose elements are records made up of named fields with different data types. np.unique() works on these arrays as well:

import numpy as np

dtype = [('name', 'S10'), ('age', int)]

# Create a structured numpy array
arr_structured = np.array([('Bob', 30), ('Alice', 25), ('Bob', 30)], dtype=dtype)

# Find the unique elements
unique_structured = np.unique(arr_structured)

print(unique_structured)

Output:

[(b'Alice', 25) (b'Bob', 30)]

Here, np.unique() treats each record (tuple) as a single element, so two records count as duplicates only if all of their fields match.
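If you only care about one field, you can select that field first and deduplicate its values. The sketch below contrasts the two approaches; the records in people are hypothetical and chosen so that two different entries share a name.

```python
import numpy as np

# Hypothetical records: two different people share the name 'Bob'
people_dtype = [('name', 'S10'), ('age', int)]
people = np.array([('Bob', 30), ('Alice', 25), ('Bob', 41)], dtype=people_dtype)

# Whole records: all three differ in at least one field, so none is dropped
unique_records = np.unique(people)

# Single field: only the 'name' values are compared
unique_names = np.unique(people['name'])

print(len(unique_records))  # 3
print(unique_names)         # [b'Alice' b'Bob']
```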

Combine NumPy Arrays with Unique Values

If you want to combine arrays while ensuring that the resulting array only has unique elements, you can concatenate them first and then use np.unique():

import numpy as np

# Create two numpy arrays
arr1 = np.array([1, 2, 3])
arr2 = np.array([2, 3, 4])

# Concatenate and find the unique elements
combined_unique = np.unique(np.concatenate((arr1, arr2)))

print(combined_unique)

Output:

[1 2 3 4]
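As an alternative to concatenating first, NumPy also offers np.union1d(), which returns the sorted unique values of two arrays in one call. The sketch below additionally chains it over a third, hypothetical array (arr3) with functools.reduce.

```python
from functools import reduce

import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([2, 3, 4])
arr3 = np.array([4, 5])  # hypothetical third array for illustration

# Sorted union of two arrays
union_two = np.union1d(arr1, arr2)

# union1d takes exactly two arrays, so reduce chains it over more
union_many = reduce(np.union1d, (arr1, arr2, arr3))

print(union_two)   # [1 2 3 4]
print(union_many)  # [1 2 3 4 5]
```

For many arrays, concatenating once and calling np.unique() a single time, as shown earlier, avoids the repeated intermediate unions.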

Conclusion

In this tutorial, we’ve explored several ways to find unique elements in NumPy arrays with np.unique(): basic one-dimensional arrays, rows of multidimensional arrays via the axis parameter, counts and first-occurrence indices, structured arrays, and combinations of several arrays. Because np.unique() operates on whole arrays at once, these operations stay concise even on large data sets.