NumPy: How to find the row indexes of several values in a 2D array

Introduction
Getting Started
Basic Indexing with NumPy
Using np.argwhere for Multi-Conditional Indexing
Custom Functions for Complex Conditions
Conclusion

Introduction

NumPy is a fundamental package for scientific computing in Python, providing support for large, multi-dimensional arrays and matrices, along with a collection of high-level mathematical functions to operate on these arrays. When working with 2D arrays in NumPy, you may encounter scenarios where you need to find the indexes of rows that contain specific values. This tutorial will guide you through various methods of achieving this, complete with multiple code examples that range from basic to more advanced techniques.

Getting Started

Before we dive into finding row indexes in a NumPy array, ensure you have NumPy installed and imported in your programming environment:

import numpy as np

Basic Indexing with NumPy

Finding the index of a single value in a 1D array is straightforward: you can use the np.where() function. With a 2D array, np.where() returns separate row and column index arrays, so finding the rows that hold one or more values takes a little more care. Let’s start by creating a simple 2D NumPy array.

array_2d = np.array([[1,2,3],
                    [4,5,6],
                    [7,8,9],
                    [1,3,5]])
print(array_2d)

Finding a Single Value

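If you’re searching for rows that contain a specific single value, you can use np.where() in combination with a condition. For a 2D condition, np.where() returns a tuple of two arrays (row indices, column indices), so taking element [0] keeps only the row indices.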

row_indices = np.where(array_2d == 3)[0]
print('Rows with the value 3:', row_indices)

The output will list the row indices where the value 3 is present:

Rows with the value 3: [0 3]
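
To make the role of the [0] index concrete, here is a minimal sketch that unpacks both arrays returned by np.where() and shows what happens when a value repeats within a row. The second array, repeated, is a made-up example that does not appear elsewhere in this tutorial.

```python
import numpy as np

array_2d = np.array([[1, 2, 3],
                     [4, 5, 6],
                     [7, 8, 9],
                     [1, 3, 5]])

# np.where on a 2D condition returns one index array per axis
rows, cols = np.where(array_2d == 3)
print(rows)  # [0 3]
print(cols)  # [2 1]

# Hypothetical array in which the value 3 appears twice in row 0
repeated = np.array([[3, 3, 0],
                     [0, 1, 2]])

# Each match contributes its row index, so row 0 is listed twice
raw_rows = np.where(repeated == 3)[0]
print(raw_rows)             # [0 0]

# np.unique collapses the duplicates into one entry per row
print(np.unique(raw_rows))  # [0]
```

If you only care about which rows match, not how many times, wrapping the result in np.unique() is a reasonable habit.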

Finding Multiple Values

When you need the rows for several values, you can still use np.where(): loop over the values and look up the matching rows for each one in turn.

values_to_find = [3, 5]

for value in values_to_find:
    row_indices = np.where(array_2d == value)[0]
    print(f'Rows with the value {value}:', row_indices)

This will produce the output:

Rows with the value 3: [0 3]
Rows with the value 5: [1 3]
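
If you would rather keep the results than print them, one option is to collect them in a dictionary keyed by value. The following is a small sketch along those lines; the name rows_by_value is only illustrative.

```python
import numpy as np

array_2d = np.array([[1, 2, 3],
                     [4, 5, 6],
                     [7, 8, 9],
                     [1, 3, 5]])

values_to_find = [3, 5]

# Map each value to the sorted, de-duplicated indices of the rows containing it
rows_by_value = {
    value: np.unique(np.where(array_2d == value)[0]).tolist()
    for value in values_to_find
}
print(rows_by_value)  # {3: [0, 3], 5: [1, 3]}
```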

Advanced Indexing Techniques

Now let’s explore some more complex scenarios and how to handle them.

Multiple Values in the Same Row

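What if we want to find the rows where several specific values appear together? A possible solution is to build one boolean mask per value with np.any(), then combine the masks with np.all().

values_to_find = [1, 3]

# For each value, flag the rows that contain it at least once
contains_each = [np.any(array_2d == value, axis=1) for value in values_to_find]

# Keep only the rows that were flagged for every value
row_indices = np.where(np.all(contains_each, axis=0))[0]
print('Rows where 1 and 3 appear together:', row_indices)

Each entry of contains_each is True for the rows that hold the corresponding value, and np.all() along axis 0 keeps only the rows that hold all of them. Note that np.all(np.isin(array_2d, values_to_find), axis=1) answers a different question: it selects rows made up exclusively of the listed values, and no row of this array meets that test. The output is:

Rows where 1 and 3 appear together: [0 3]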
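
The same "row contains every listed value" test can also be written without a Python loop by using broadcasting. This is a sketch rather than the only way to do it; a row qualifies here when each listed value occurs in it at least once, whatever else the row holds. The intermediate names matches, row_has_value and rows_with_all are illustrative.

```python
import numpy as np

array_2d = np.array([[1, 2, 3],
                     [4, 5, 6],
                     [7, 8, 9],
                     [1, 3, 5]])

values_to_find = np.array([1, 3])

# Shape (rows, cols, values): compare every cell with every requested value
matches = array_2d[:, :, None] == values_to_find

# A row contains a value if any of its cells matches it: shape (rows, values)
row_has_value = matches.any(axis=1)

# A row qualifies only if that holds for every requested value
rows_with_all = np.flatnonzero(row_has_value.all(axis=1))
print(rows_with_all)  # [0 3]
```

The intermediate array has rows x cols x values elements, so for very large arrays or long value lists a per-value loop may use less memory.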

Using Logical Operators

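You can also combine conditions with the element-wise logical operators & (and) and | (or) inside np.where(). Wrap each comparison in parentheses, because these operators bind more tightly than ==. Since a row can match more than once, np.unique() is used to remove repeated row indices.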

row_indices = np.where((array_2d == 1) | (array_2d == 3))[0]
unique_indices = np.unique(row_indices)
print('Rows with either 1 or 3:', unique_indices)

In this example, the output will be the unique row indices where either the value 1 or 3 is present:

Rows with either 1 or 3: [0 3]
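
When the list of values grows, chaining | comparisons gets unwieldy. A compact alternative, sketched below, is np.isin() followed by any() along each row; np.flatnonzero() then returns the indices of the True entries.

```python
import numpy as np

array_2d = np.array([[1, 2, 3],
                     [4, 5, 6],
                     [7, 8, 9],
                     [1, 3, 5]])

# True wherever a cell equals 1 or 3
mask = np.isin(array_2d, [1, 3])

# One boolean per row: does the row contain at least one of the values?
rows_with_any = np.flatnonzero(mask.any(axis=1))
print(rows_with_any)  # [0 3]
```

Because any(axis=1) yields a single boolean per row, no np.unique() call is needed here.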

Using np.argwhere for Multi-Conditional Indexing

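The np.argwhere() function returns the indices of all elements that satisfy a condition as a single array with one row per match and one column per dimension. For a 2D input, column 0 of the result holds the row indices, which is why the example below slices with [:, 0].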

conditions = (array_2d == 1) | (array_2d == 3)
row_indices = np.argwhere(conditions)[:, 0]
unique_indices = np.unique(row_indices)
print('Row indices:', unique_indices)

This will provide you with the row indices where either 1 or 3 is present:

Row indices: [0 3]
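
To see what np.argwhere() produces before the [:, 0] slice, this short sketch prints the full result for the same condition and checks that its first column agrees with np.where().

```python
import numpy as np

array_2d = np.array([[1, 2, 3],
                     [4, 5, 6],
                     [7, 8, 9],
                     [1, 3, 5]])

conditions = (array_2d == 1) | (array_2d == 3)

# One (row, column) pair per matching element
pairs = np.argwhere(conditions)
print(pairs.shape)  # (4, 2)
print(pairs)
# [[0 0]
#  [0 2]
#  [3 0]
#  [3 1]]

# Column 0 carries the same row indices that np.where() returns first
print(np.array_equal(pairs[:, 0], np.where(conditions)[0]))  # True
```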

Custom Functions for Complex Conditions

If none of the built-in NumPy functions suit your needs, you can always write a custom function to look for rows satisfying complex conditions.

Example:

import numpy as np

# Create a sample 2D array (matrix)
data = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9],
                 [10, 11, 12]])

# Define a custom function to filter rows based on conditions
def custom_filter(matrix, condition_function):
    filtered_rows = []
    for row in matrix:
        if condition_function(row):
            filtered_rows.append(row)
    return np.array(filtered_rows)

# Define a complex condition function (example: rows with even sum)
def complex_condition(row):
    return np.sum(row) % 2 == 0

# Use the custom_filter function to filter rows based on the complex condition
filtered_data = custom_filter(data, complex_condition)

# Print the original data and the filtered data
print("Original Data:")
print(data)
print("\nFiltered Data (Rows with Even Sum):")
print(filtered_data)

In this code:

We start with a sample 2D array called data, representing a matrix with rows and columns.
We define a custom function called custom_filter that takes a matrix and a condition function as input. This function iterates through the rows of the matrix, appends rows that satisfy the condition to a list, and returns them as a new array. Note that it returns the matching rows themselves, not their indices.
We define a complex condition function called complex_condition as an example. In this case, it checks if the sum of elements in a row is even.
We use the custom_filter function to filter rows from the data matrix based on the complex_condition function.
Finally, we print both the original data and the filtered data to see the result.

You can customize the complex_condition function to define your own complex conditions for filtering rows in a NumPy array.
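
Since this tutorial is about row indexes, here is a variant that returns the indices of the matching rows instead of the rows themselves. It is a sketch under the same even-sum condition; the names matching_row_indices and has_even_sum are hypothetical and not part of NumPy.

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9],
                 [10, 11, 12]])

def matching_row_indices(matrix, condition_function):
    """Return the indices of rows for which condition_function(row) is truthy."""
    return np.array(
        [i for i, row in enumerate(matrix) if condition_function(row)],
        dtype=int,
    )

def has_even_sum(row):
    # Example condition: the elements of the row add up to an even number
    return np.sum(row) % 2 == 0

loop_result = matching_row_indices(data, has_even_sum)
print(loop_result)  # [0 2]

# When the condition can be expressed on whole arrays, a vectorized
# version avoids the Python-level loop
vectorized_result = np.flatnonzero(data.sum(axis=1) % 2 == 0)
print(vectorized_result)  # [0 2]
```

The loop-based helper accepts any Python callable, while the vectorized form is usually faster on large arrays but only works when the condition maps onto array operations.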

Conclusion

We’ve covered several ways to find the row indexes of given values in a 2D NumPy array: np.where() with a condition, boolean masks combined with logical operators, np.argwhere(), and custom functions for conditions that are awkward to express with built-ins. These techniques come up regularly in data manipulation and analysis in Python.
