Using the pandas.Series.isin() method (with examples)

Updated: February 22, 2024 By: Guest Contributor

Introduction

The pandas library in Python is a powerful tool for data manipulation and analysis, providing flexible data structures that make it easy to clean, analyze, and visualize your data. One of its useful methods is isin(), available on Series objects. It checks whether each element of the Series is contained in a given collection of values (such as a list, a set, or another Series) and returns a boolean Series of the same length, with True where the element is present and False otherwise. In this tutorial, we will explore how to use isin() through examples ranging from basic to advanced use cases.

Getting Started

Before delving into examples, ensure you have pandas installed in your environment:

pip install pandas

Let's start by importing pandas:

import pandas as pd

Basic Usage

To understand the basic usage of isin(), consider the following example:

import pandas as pd

data = pd.Series([2, 4, 6, 8, 10])
print(data.isin([2, 4]))

This will output:

0     True
1     True
2    False
3    False
4    False
dtype: bool

Here, we are checking whether each element in our data Series is present in the list [2, 4]. The isin() method returns a Series of booleans indicating the presence or absence of each element.
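The same membership test works for non-numeric data. The sketch below uses a made-up Series of fruit names (an assumption for illustration, not data from this tutorial) and also shows that a single value has to be wrapped in a list, because passing a bare string to isin() raises a TypeError.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
fruits = pd.Series(["apple", "banana", "cherry", "banana"])

# Membership test against a list of strings
mask = fruits.isin(["banana", "cherry"])
print(mask.tolist())  # [False, True, True, True]

# A single value still has to be wrapped in a list
single_mask = fruits.isin(["apple"])
print(single_mask.tolist())  # [True, False, False, False]

# Passing a bare string is rejected with a TypeError
try:
    fruits.isin("apple")
    raised = False
except TypeError:
    raised = True
print(raised)  # True
```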

Checking Against Another Series

You can also use isin() to compare against another Series. For example:

data = pd.Series([1, 2, 3, 4, 5])
check_series = pd.Series([3, 4, 5, 6, 7])
print(data.isin(check_series))

This code compares each element in data with those in check_series, resulting in:

0    False
1    False
2     True
3     True
4     True
dtype: bool
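Note that the comparison above is based purely on values: the index of the Series passed to isin() plays no role. The following minimal sketch, with index labels chosen arbitrarily for illustration, shows two Series with unrelated indexes still matching on every value.

```python
import pandas as pd

data = pd.Series([1, 2, 3], index=["a", "b", "c"])
# Same values as `data`, but in reverse order and with unrelated index labels
lookup = pd.Series([3, 2, 1], index=["x", "y", "z"])

result = data.isin(lookup)
print(result.tolist())        # [True, True, True]
print(result.index.tolist())  # ['a', 'b', 'c']
```

The result keeps the index of the Series that isin() was called on, so it can be used directly as a mask for that Series.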

Using with DataFrames

Although this tutorial focuses on the Series isin() method, it’s useful to know you can apply it to a DataFrame column to filter rows. Here’s an example:

import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [5, 6, 7, 8, 9]
})

filtered_df = df[df['A'].isin([1, 3, 5])]
print(filtered_df)

This results in:

   A  B
0  1  5
2  3  7
4  5  9

This example demonstrates how you can filter rows in a DataFrame based on whether the values in one of its columns appear in a given list.
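The opposite filter, keeping only the rows whose value is not in the list, can be written by inverting the mask with the ~ operator. A short sketch using the same DataFrame as above (the name excluded_df is just an illustrative choice):

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [5, 6, 7, 8, 9]
})

# ~ inverts the boolean mask, keeping rows whose A value is NOT in the list
excluded_df = df[~df['A'].isin([1, 3, 5])]
print(excluded_df)
```

This keeps the rows with index 1 and 3, where A is 2 and 4.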

Combining isin() with Other Methods

The isin() method becomes even more powerful when combined with other pandas methods. For instance, combining isin() with loc allows for selective data manipulation:

df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [5, 6, 7, 8, 9],
    'C': ['a', 'b', 'c', 'd', 'e']
})

df.loc[df['A'].isin([1, 3, 5]), 'C'] = 'Selected'
print(df)

This sets the C column to ‘Selected’ in the rows where column A’s value is in the list [1, 3, 5] and leaves the other rows unchanged, resulting in:

   A  B         C
0  1  5  Selected
1  2  6         b
2  3  7  Selected
3  4  8         d
4  5  9  Selected
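The boolean Series returned by isin() can also be summarized directly with aggregation methods. As a rough sketch reusing the column values from the example above, sum(), any(), and all() answer how many, whether any, and whether all values matched (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'C': ['a', 'b', 'c', 'd', 'e']
})

mask = df['A'].isin([1, 3, 5])

match_count = int(mask.sum())   # True counts as 1, so this is the number of matches
any_match = bool(mask.any())    # True if at least one value matched
all_match = bool(mask.all())    # True only if every value matched

print(match_count, any_match, all_match)  # 3 True False
```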

Advanced Use Cases

For more advanced use cases, you might want to use isin() in conjunction with other conditions to filter data. Here’s how you can do it:

df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [5, 6, 7, 8, 9],
    'C': [10, 11, 12, 13, 14]
})

condition = df['A'].isin([1, 2, 3]) & (df['C'] > 11)
filtered_df = df[condition]
print(filtered_df)

This filters for rows where column A is 1, 2, or 3 and column C is greater than 11. The parentheses around df['C'] > 11 are required because & binds more tightly than >. Only one row satisfies both conditions:

   A  B   C
2  3  7  12
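One way to keep combined filters readable, and to sidestep operator-precedence surprises, is to assign each condition to its own variable before combining them. The sketch below uses the same DataFrame; the names in_a and big_c are illustrative choices, and it shows both the & (and) and | (or) combinations.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [5, 6, 7, 8, 9],
    'C': [10, 11, 12, 13, 14]
})

in_a = df['A'].isin([1, 2, 3])   # membership mask on column A
big_c = df['C'] > 11             # comparison mask on column C

both = df[in_a & big_c]          # rows satisfying both conditions
either = df[in_a | big_c]        # rows satisfying at least one condition

print(both['A'].tolist())    # [3]
print(either['A'].tolist())  # [1, 2, 3, 4, 5]
```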

Conclusion

The isin() method is a versatile tool in pandas for efficiently checking the presence of values. By showcasing its basic uses and extending into more complex examples, this tutorial aims to provide a comprehensive understanding of how to effectively apply it in various scenarios. Remember, mastering the use of isin() can significantly streamline your data manipulation and analysis tasks.