Pandas DataFrame.squeeze() method (5 examples)

Updated: February 20, 2024 By: Guest Contributor

Introduction

The Pandas library in Python is a powerhouse for data manipulation and analysis. Among its many features, the squeeze() method offers a convenient way to reduce the dimensionality of a DataFrame when one of its axes has length 1. This tutorial walks through the squeeze() method with five illustrative examples, ranging from basic to advanced applications.

When to Use DataFrame.squeeze() Method?

DataFrame.squeeze() converts a DataFrame with a single column or a single row into a Series, and a DataFrame with exactly one cell into a scalar. If the DataFrame has more than one column and more than one row, it is returned unchanged. The optional axis parameter restricts squeezing to one axis only. The method is particularly useful after operations that might return a single-column or single-row DataFrame when you need a more compact representation.
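A minimal sketch of how squeeze() behaves on three different shapes, including the 1 x 1 case, which reduces to a scalar rather than a Series. The variable names are illustrative and not part of the original tutorial.

```python
import pandas as pd

# Single column, several rows: squeezes to a Series
one_col = pd.DataFrame({"A": [1, 2, 3]})
squeezed_col = one_col.squeeze()

# Single cell (1 x 1): squeezes all the way down to a scalar
one_cell = pd.DataFrame({"A": [42]})
squeezed_cell = one_cell.squeeze()

# Two columns and two rows: no axis has length 1, so a DataFrame comes back
two_by_two = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
unchanged = two_by_two.squeeze()

print(type(squeezed_col).__name__)  # Series
print(squeezed_cell)                # 42
print(type(unchanged).__name__)     # DataFrame
```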

Let’s begin by understanding the basic usage of squeeze() before diving into specific scenarios where it proves invaluable.

Example 1: Basic Usage

First, let’s create a DataFrame with a single column and see how squeeze() operates:

import pandas as pd

# Create a DataFrame with a single column
df = pd.DataFrame({'A': [1, 2, 3, 4, 5]})

# Applying squeeze
df_squeezed = df.squeeze()
print(type(df_squeezed))
print(df_squeezed)

Output:

<class 'pandas.core.series.Series'>
0    1
1    2
2    3
3    4
4    5
Name: A, dtype: int64

This example demonstrates not only the reduction to a Series but also the preservation of the column name as the Series name.

Example 2: Working with a Single Row DataFrame

Now, let’s try squeeze() with a DataFrame that has a single row:

df = pd.DataFrame({'A': [1], 'B': [2], 'C': [3]})
# Applying squeeze
df_squeezed = df.squeeze()
print(df_squeezed)

Output:

A    1
B    2
C    3
Name: 0, dtype: int64

In this scenario, squeeze() converts the single-row DataFrame into a Series where the DataFrame column names become the Series index, closely mirroring the preceding example but in the context of rows instead of columns.
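squeeze() also accepts an axis argument that limits which axis may be dropped. The sketch below reuses the single-row shape from this example; row_df and the other names are illustrative assumptions.

```python
import pandas as pd

row_df = pd.DataFrame({"A": [1], "B": [2], "C": [3]})

# Squeeze only the row axis: it has length 1, so the result is a Series
as_series = row_df.squeeze(axis="index")

# Squeeze only the column axis: it has length 3, so nothing changes
still_frame = row_df.squeeze(axis="columns")

print(type(as_series).__name__)    # Series
print(type(still_frame).__name__)  # DataFrame
```

Passing axis makes the intent explicit and guards against an unexpected shape silently changing the result type.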

Example 3: Squeezing with More Specific Selections

Often, you’ll encounter situations where you need to extract a Series from a DataFrame based on specific conditions. For instance, querying a DataFrame might return a subset of data that is either a single column or row:

df = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                   'B': [6, 7, 8, 9, 10]})

# Selecting a single column and squeezing
single_column = df.loc[:, ['A']].squeeze()
print(single_column)

# Selecting a single row and squeezing
single_row = df.loc[[0], :].squeeze()
print(single_row)

Output:

0    1
1    2
2    3
3    4
4    5
Name: A, dtype: int64

A    1
B    6
Name: 0, dtype: int64

This example illustrates the versatility of squeeze() in handling both column and row selections, further emphasizing its utility in data preprocessing and analysis tasks.
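One thing to watch for with conditional selections: if the result happens to contain exactly one row and one column, a plain squeeze() returns a scalar, not a Series. The following sketch, with an illustrative filter chosen to match a single row, shows how axis="columns" keeps the result a Series.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4, 5],
                   "B": [6, 7, 8, 9, 10]})

# A filter that happens to match exactly one row, keeping one column
one_match = df.loc[df["A"] > 4, ["A"]]  # shape (1, 1)

as_scalar = one_match.squeeze()                # both axes are squeezed
as_series = one_match.squeeze(axis="columns")  # only the column axis is squeezed

print(as_scalar)                 # 5
print(type(as_series).__name__)  # Series
```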

Example 4: Squeezing MultiIndex DataFrames

The squeeze() method can also be adeptly used with MultiIndex (hierarchical index) DataFrame structures. When dealing with a MultiIndex DataFrame that results in a single column or row after indexing, squeeze() can simplify the structure to a Series. Here’s how:

# Build a DataFrame whose row index is a two-level MultiIndex
index = pd.MultiIndex.from_tuples([('x', 'a'), ('y', 'b')])
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]}, index=index)

# Select one row as a DataFrame (note the list), then squeeze it
df_squeezed = df.loc[[('x', 'a')]].squeeze()
print(df_squeezed)

Output:

A    1
B    3
Name: (x, a), dtype: int64

This scenario shows squeeze() working with MultiIndex DataFrames: the single selected row is reduced to a Series indexed by the column names, and the hierarchical row label is kept as the Series name, here the tuple (x, a).
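With a MultiIndex, the way a row is selected decides whether there is anything left to squeeze. As a small sketch (variable names are illustrative), selecting with a single tuple already yields a Series, while selecting with a list of tuples yields a one-row DataFrame that squeeze() then reduces.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples([("x", "a"), ("y", "b")])
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]}, index=index)

direct = df.loc[("x", "a")]        # single tuple label: already a Series
row_frame = df.loc[[("x", "a")]]   # list of labels: one-row DataFrame
row_series = row_frame.squeeze()   # Series named by the index tuple

print(type(direct).__name__)     # Series
print(type(row_frame).__name__)  # DataFrame
print(row_series.name)           # ('x', 'a')
```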

Example 5: Chain Operations with squeeze()

Last but not least, squeeze() can be an integral part of method chaining in Pandas, allowing for streamlined data manipulation workflows. Here’s an example:

df = pd.DataFrame({'A': range(1, 11),
                   'B': range(11, 21)})

# Chaining operations
df_filtered = df.query('A > 5').loc[:, ['A']].squeeze()
print(df_filtered)

Output:

5     6
6     7
7     8
8     9
9    10
Name: A, dtype: int64

This example highlights the succinctness achievable with squeeze() as part of a method chaining sequence, enhancing readability and conciseness in data manipulation scripts.
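Because the squeezed result is a Series, the chain can continue with Series methods. The sketch below extends the same chain with a sum; using axis="columns" here is a defensive choice made for this illustration, so that the result stays a Series even if the filter matches a single row.

```python
import pandas as pd

df = pd.DataFrame({"A": range(1, 11),
                   "B": range(11, 21)})

total = (
    df.query("A > 5")
      .loc[:, ["A"]]
      .squeeze(axis="columns")  # Series, regardless of how many rows remain
      .sum()
)
print(total)  # 40
```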

Conclusion

The squeeze() method is a small but useful tool in the Pandas library: it reduces a single-column or single-row DataFrame to a Series and a single-cell DataFrame to a scalar. As the examples show, it simplifies data structures and fits naturally into method chains, particularly when a selection or query is expected to return one column or one row.