Understanding numpy.unique() function (5 examples)

Overview
Example 1: Basic Usage
Example 2: Return Indexes
Example 3: Return Inverse
Example 4: Axis Option
Example 5: Advanced Data Manipulation
Conclusion

Overview

The numpy.unique() function, part of the NumPy library for scientific computing in Python, returns the sorted unique elements of an array. Optional arguments let it also return the index of each value's first occurrence (return_index), the indices that reconstruct the original array (return_inverse), and the number of times each value appears (return_counts), while the axis argument lets it compare whole rows or columns. This tutorial walks through the function in 5 examples, ranging from basic to advanced usage.

Example 1: Basic Usage

In its simplest form, numpy.unique() is used to find the unique elements of an array. Let’s start with a straightforward example:

import numpy as np

# Creating an array
arr = np.array([1, 2, 2, 3, 3, 3, 4])

# Using numpy.unique()
unique_elements = np.unique(arr)

print("Unique elements:", unique_elements)

Output:

Unique elements: [1 2 3 4]
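
Two details of the basic call are worth seeing in action: the result comes back sorted, and a multi-dimensional input is flattened when no axis is given. The following is a minimal sketch; the array names (unsorted, grid) are illustrative and not part of the example above:

```python
import numpy as np

# Unsorted input: the unique values come back in ascending order
unsorted = np.array([4, 1, 3, 1, 4])
sorted_unique = np.unique(unsorted)
print(sorted_unique)  # [1 3 4]

# 2D input without an axis argument: the array is flattened first
grid = np.array([[2, 1], [1, 3]])
flat_unique = np.unique(grid)
print(flat_unique)  # [1 2 3]
```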

Example 2: Return Indexes

Next, let's see how to get the indexes of the unique elements. Passing return_index=True returns a second array that holds, for each unique value, the index of its first occurrence in the original array:

import numpy as np

# Example array
arr = np.array([3, 1, 2, 1, 2, 3, 3, 4])

# Finding unique elements and their first occurrence indexes
unique_elements, indexes = np.unique(arr, return_index=True)

print("Unique elements:", unique_elements)
print("First occurrence indexes:", indexes)

Output:

Unique elements: [1 2 3 4] 
First occurrence indexes: [1 2 0 7]
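
One common use of these indexes is to recover the unique values in their order of first appearance rather than in sorted order. This is a small sketch of that idea, reusing the same array; the name in_order is an assumption made for illustration:

```python
import numpy as np

arr = np.array([3, 1, 2, 1, 2, 3, 3, 4])

# First-occurrence index of each unique value
unique_elements, indexes = np.unique(arr, return_index=True)

# Sorting the indexes restores the order in which values first appeared
in_order = arr[np.sort(indexes)]

print(in_order)  # [3 1 2 4]
```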

Example 3: Return Inverse

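Another useful option is return_inverse=True. It returns, for every element of the original array, the index of that element's value in the unique array. Indexing the unique array with these inverse indices reconstructs the original array, which is particularly helpful in more advanced data manipulation: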

import numpy as np

# Example array
arr = np.array([1, 2, 2, 3, 3, 3, 4])

# Getting unique elements and the inverse
unique_elements, inverse = np.unique(arr, return_inverse=True)

# Using the inverse to reconstruct the original array
reconstructed_array = unique_elements[inverse]

print("Reconstructed array:", reconstructed_array)

Output:

Reconstructed array: [1 2 2 3 3 3 4]
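
The inverse indices can also serve as integer codes for categorical data. The sketch below assumes a small array of string labels; the names labels, categories and codes are chosen here for illustration:

```python
import numpy as np

# Hypothetical categorical labels
labels = np.array(["b", "a", "b", "c", "a"])

# categories holds the sorted distinct labels,
# codes holds the position of each label within categories
categories, codes = np.unique(labels, return_inverse=True)

print(categories)  # ['a' 'b' 'c']
print(codes)       # [1 0 1 2 0]

# Indexing categories with the codes gives back the original labels
decoded = categories[codes]
```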

Example 4: Axis Option

Working with multi-dimensional arrays introduces the challenge of finding unique rows or columns. The axis parameter of numpy.unique() makes this a breeze:

import numpy as np

# 2D array
arr = np.array([[1, 2], [2, 3], [1, 2], [2, 3]])

# Finding unique rows
unique_rows = np.unique(arr, axis=0)

print("Unique rows:", unique_rows)

Output:

Unique rows: [[1 2]
 [2 3]]
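
The same parameter works for columns, and it can be combined with return_counts to count how often each row occurs. A minimal sketch follows; the array named table is an assumption introduced for this illustration:

```python
import numpy as np

# Illustrative 2D array whose first two columns are identical
table = np.array([[1, 1, 2],
                  [3, 3, 4]])

# axis=1 compares whole columns
unique_cols = np.unique(table, axis=1)
print(unique_cols)
# [[1 2]
#  [3 4]]

# axis=0 together with return_counts counts duplicate rows
rows = np.array([[1, 2], [2, 3], [1, 2], [2, 3]])
unique_rows, row_counts = np.unique(rows, axis=0, return_counts=True)
print(row_counts)  # [2 2]
```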

Example 5: Advanced Data Manipulation

Now, let's take a more advanced look at numpy.unique(): using return_counts=True to count how often each value occurs, and then filtering the unique values based on those counts:

import numpy as np

# Large array with duplicates
arr = np.random.randint(0, 10, size=100)

# Finding unique elements and their counts
unique_elements, counts = np.unique(arr, return_counts=True)

# Filtering unique elements with more than 5 occurrences
filtered_elements = unique_elements[counts > 5]

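print("Elements with more than five occurrences:", filtered_elements)

Output (the array is random, so the result varies between runs; with 100 draws from 10 possible values, most or all values typically appear more than five times):

Elements with more than five occurrences: [0 1 2 3 4 5 6 7 8 9]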
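
Because the random array above changes on every run, a fixed array makes the counting logic easier to check by hand. The sketch below uses a small hand-written array (an assumption for illustration) to find the most frequent value and to filter by an exact count:

```python
import numpy as np

# Fixed (non-random) array so the result is reproducible
values = np.array([5, 3, 5, 1, 3, 5, 1, 5])

unique_values, counts = np.unique(values, return_counts=True)
print(unique_values)  # [1 3 5]
print(counts)         # [2 2 4]

# Value with the highest count
most_common = unique_values[np.argmax(counts)]
print(most_common)  # 5

# Values that occur exactly twice
twice = unique_values[counts == 2]
print(twice)  # [1 3]
```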

Conclusion

The numpy.unique() function does more than identify unique elements: it can also report first-occurrence indexes, inverse indices for reconstructing the original array, and occurrence counts, and it can compare whole rows or columns of multi-dimensional arrays. Knowing these options makes many data analysis and manipulation tasks simpler and more efficient.
