A detailed guide to the pandas.Series.combine() method (with examples)

Updated: February 17, 2024 By: Guest Contributor

The ‘pandas.Series.combine()’ method provides flexibility and power in manipulating and combining two Series objects, potentially using non-matching indexes. In this tutorial, you’ll grasp the method’s utility through various examples, progressing from simple to more complex applications.

Introduction to pandas.Series.combine()

The ‘combine()’ method in pandas merges two Series by applying a function to each pair of elements that share the same index label. The method’s signature is as follows:

Series.combine(other, func, fill_value=None)

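Here, other is the Series (or scalar) to combine with, func is a function that takes two scalar values and returns a single value, and fill_value is the value passed to func when an index label is missing from one Series but present in the other. If fill_value is omitted, the missing side is passed as the NaN value appropriate for the Series’ dtype.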
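To make the role of fill_value concrete, here is a minimal sketch with two small Series whose labels only partly overlap. The data and the variable names (left, right, no_fill, with_fill) are made up for illustration.

```python
import pandas as pd

# Hypothetical data: label 'a' exists only in left, label 'c' only in right
left = pd.Series([2, 4], index=['a', 'b'])
right = pd.Series([1, 3], index=['b', 'c'])

# Without fill_value, the missing side of a pair is passed to func as NaN
no_fill = left.combine(right, lambda x, y: x + y)

# With fill_value=0, the missing side is passed to func as 0 instead
with_fill = left.combine(right, lambda x, y: x + y, fill_value=0)

print(no_fill)
print(with_fill)
```

In the first result, the labels 'a' and 'c' end up as NaN because one operand of the addition was NaN. In the second, they keep the value of the side that was present.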

Basic Usage

Let’s start with a simple example. Assume you have two Series that represent two different aspects of the same items:

import pandas as pd

s1 = pd.Series([2, 4, 6, 8])
s2 = pd.Series([1, 3, 5, 7])

# Combine using sum
combined = s1.combine(s2, lambda x, y: x + y)
print(combined)

The output will be:

0     3
1     7
2    11
3    15
dtype: int64

This example simply adds corresponding elements from the two series together.
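As a sanity check, the sketch below compares the combine() result with the ordinary + operator on the same data. The name vectorized is chosen here for illustration only.

```python
import pandas as pd

s1 = pd.Series([2, 4, 6, 8])
s2 = pd.Series([1, 3, 5, 7])

# combine() calls the function once for each pair of elements
combined = s1.combine(s2, lambda x, y: x + y)

# The + operator performs the same addition in a vectorized way
vectorized = s1 + s2

# Element-wise comparison of the two results
print((combined == vectorized).all())
```

For plain arithmetic like this, the operator form is the more usual choice. combine() becomes worthwhile when the pairing logic cannot be written with built-in operators.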

Handling Missing Values

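The ‘fill_value’ parameter only substitutes for index labels that are absent from one of the two Series. It does not replace NaN values that are already present in the data, so the function itself has to decide what to do with them. Let’s adjust our example to include missing values:

s1 = pd.Series([2, 4, None, 8])
s2 = pd.Series([1, None, 5, 7])

# Add the pair when both values are present; otherwise keep whichever value is not NaN.
# fill_value=0 has no effect here because both Series share the index 0 to 3.
combined = s1.combine(s2, lambda x, y: x + y if pd.notnull(x) and pd.notnull(y) else x if pd.notnull(x) else y, fill_value=0)
print(combined)

The function keeps the non-missing value whenever one side is NaN. Because None is stored as NaN, both Series (and the result) have dtype float64:

0     3.0
1     4.0
2     5.0
3    15.0
dtype: float64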
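To see how fill_value interacts with NaN values that are already in the data, here is a small sketch using the same two Series with a plain addition. The names plain and filled are illustrative, and calling fillna(0) first is just one possible way to treat missing data as zero.

```python
import pandas as pd

s1 = pd.Series([2, 4, None, 8])
s2 = pd.Series([1, None, 5, 7])

# Both Series share the index 0..3, so fill_value is never used here:
# NaN values already in the data reach the function unchanged
plain = s1.combine(s2, lambda x, y: x + y, fill_value=0)

# Replacing NaN with 0 beforehand lets a plain addition work
filled = s1.fillna(0).combine(s2.fillna(0), lambda x, y: x + y)

print(plain)
print(filled)
```

In plain, positions 1 and 2 stay NaN even though fill_value=0 was passed. In filled, every position holds a number.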

More Advanced Examples

Now, let’s explore more complex scenarios. For example, combining series based on conditional logic:

s1 = pd.Series([20, 21, 19, 18])
s2 = pd.Series([15, 22, 20, 16])

combined = s1.combine(s2, lambda x, y: x if x > y else y)
print(combined)

The result keeps the larger value from each pair:

0    20
1    22
2    20
3    18
dtype: int64
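The function passed to combine() does not have to be a lambda. As a minimal sketch, the same selection can be written with Python’s built-in max, which accepts two scalars. The name combined_max is chosen here for illustration.

```python
import pandas as pd

s1 = pd.Series([20, 21, 19, 18])
s2 = pd.Series([15, 22, 20, 16])

# Any callable that accepts two scalars works, including the built-in max
combined_max = s1.combine(s2, max)
print(combined_max)
```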

Moving on to a scenario where the indexes don’t match:

s1 = pd.Series([2, 4, 6], index=['a', 'b', 'c'])
s2 = pd.Series([1, 3, 5], index=['b', 'c', 'd'])

combined = s1.combine(s2, lambda x, y: x+y, fill_value=0)
print(combined)

The result is indexed by the union of both indexes. Label 'a' is missing from s2 and label 'd' is missing from s1, so fill_value=0 stands in for the absent side of those two pairs:

a    2
b    5
c    9
d    5
dtype: int64
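The sketch below inspects the index of the combined Series and, as a cross-check, compares the sums with Series.add() called with fill_value=0. The name same_sums is illustrative, and the comparison is only meant for this particular data.

```python
import pandas as pd

s1 = pd.Series([2, 4, 6], index=['a', 'b', 'c'])
s2 = pd.Series([1, 3, 5], index=['b', 'c', 'd'])

combined = s1.combine(s2, lambda x, y: x + y, fill_value=0)

# The result is indexed by the union of both sets of labels
print(combined.index.tolist())

# Series.add with fill_value=0 produces the same sums for this data
same_sums = s1.add(s2, fill_value=0)
print((combined == same_sums).all())
```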

Conclusion

The ‘pandas.Series.combine()’ method is a flexible tool for combining two Series element by element with a function of your choice. The function can take care of NaN values in the data, while ‘fill_value’ covers labels that appear in only one index. Mastering it strengthens the data preprocessing skills that effective data analysis depends on.