Sling Academy

Pandas DataFrame stack() and unstack() methods (7 examples)

Last updated: February 22, 2024

Introduction

This tutorial delves into the utility of the stack() and unstack() methods available in pandas, a powerful library in Python designed for data manipulation and analysis. By converting between wide and long formats, these methods offer nuanced control over DataFrame structure. We will walk through seven increasingly complex examples to showcase their versatility.

Prerequisites: This article assumes a basic understanding of pandas and Python. Familiarity with DataFrames and Series objects will be beneficial.

The Purposes of stack() and unstack()

The pandas DataFrame provides two methods, stack() and unstack(), that simplify reshaping data. stack() moves column levels into the row index, pivoting a DataFrame from wide format to long format. Conversely, unstack() moves index levels into the columns, pivoting from long format back to wide.
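As a minimal sketch of the wide-to-long round trip, the snippet below uses a small made-up table (the names wide, long_form, and back, and the scores themselves, are illustrative assumptions rather than part of the examples that follow).

```python
import pandas as pd

# Hypothetical wide table: one row per student, one column per subject
wide = pd.DataFrame(
    {"art": [60, 85], "math": [90, 75]},
    index=["alice", "bob"],
)

# stack() moves the column labels into an inner index level (wide -> long)
long_form = wide.stack()

# unstack() moves the inner index level back into columns (long -> wide)
back = long_form.unstack()

print(wide.shape)       # 2 rows x 2 columns
print(long_form.shape)  # 4 values, one per (student, subject) pair
print(back)
```

The long form has one entry per (row label, column label) pair, which is why its length equals rows times columns when nothing is missing.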

Example 1: Basic Stacking

Let’s begin with a simple DataFrame.

import pandas as pd

# Sample DataFrame
df = pd.DataFrame({
    'A': ['one', 'two', 'three'],
    'B': ['four', 'five', 'six'],
    'C': [1, 2, 3]
})
print(df)

Stacking this DataFrame:

stacked_df = df.stack()
print(stacked_df)

Output:

0  A       one
   B      four
   C         1
1  A       two
   B      five
   C         2
2  A     three
   B       six
   C         3
dtype: object

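stack() has converted the DataFrame into a Series with a two-level index: the outer level holds the original row labels and the inner level holds the column names. The dtype is object because the values mix strings and integers.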
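To make the two-level index concrete, the following sketch rebuilds the same DataFrame and inspects the stacked result. The variable names stacked and row_1 are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['one', 'two', 'three'],
    'B': ['four', 'five', 'six'],
    'C': [1, 2, 3]
})

stacked = df.stack()

# The result is a Series whose index has two levels
print(type(stacked).__name__)
print(stacked.index.nlevels)

# A (row label, column name) tuple addresses a single value
print(stacked.loc[(0, 'A')])

# An outer label alone returns every column value for that row
row_1 = stacked.loc[1]
print(row_1)
```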

Example 2: Unstacking Basics

To unstack the previous example:

unstacked_df = stacked_df.unstack()
print(unstacked_df)

Output:

       A     B  C
0    one  four  1
1    two  five  2
2  three   six  3

The DataFrame returns to its original layout. Note that every column now has object dtype, because the stacked Series held a mix of strings and integers.
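One caveat of the round trip is that column dtypes may not survive it. The sketch below checks the dtypes after stack().unstack() and uses infer_objects() as one possible way to recover the integer column; whether that is needed depends on your data.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['one', 'two', 'three'],
    'B': ['four', 'five', 'six'],
    'C': [1, 2, 3]
})

round_trip = df.stack().unstack()

# The values match the original, but C came back as a generic object column
print(round_trip.dtypes)

# infer_objects() asks pandas to pick a more specific dtype where it can
restored = round_trip.infer_objects()
print(restored.dtypes)
```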

Example 3: Stacking Selective Columns

For more control, you can stack selective columns.

partially_stacked = df[['A', 'B']].stack()
print(partially_stacked)

Output:

0  A    one
   B   four
1  A    two
   B   five
2  A  three
   B    six
dtype: object

Selecting columns before calling stack() reshapes only A and B into long format; column C is left out of the result entirely.
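If the excluded column should be kept as an identifier rather than dropped, one option is to move it into the index before stacking. The sketch below does that; the names tidy, column, and value are illustrative choices, not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['one', 'two', 'three'],
    'B': ['four', 'five', 'six'],
    'C': [1, 2, 3]
})

tidy = (
    df.set_index('C')[['A', 'B']]      # keep C as the row identifier
      .stack()                         # A and B become an inner index level
      .rename_axis(['C', 'column'])    # name both index levels
      .reset_index(name='value')       # flatten back to ordinary columns
)
print(tidy)
```

The result has one row per (C, column) pair, which is the layout many plotting and grouping functions expect.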

Example 4: Unstacking with Levels

When dealing with multi-level indexes, specifying the level to unstack becomes crucial.

multi_level_df = df.stack()
unstacked_by_level = multi_level_df.unstack(level=0)
print(unstacked_by_level)

Output:

      0     1      2
A   one   two  three
B  four  five    six
C     1     2      3

Passing level=0 moves the outer index level (the original row labels) into the columns instead of the inner one, so the result is the transpose of the original DataFrame.
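Levels can also be referred to by name, which tends to read more clearly than positions. In this sketch the level names row and field are assumptions added with rename_axis(); the original index levels are unnamed.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['one', 'two', 'three'],
    'B': ['four', 'five', 'six'],
    'C': [1, 2, 3]
})

# Give the two index levels names so they can be addressed explicitly
stacked = df.stack().rename_axis(['row', 'field'])

by_row = stacked.unstack(level='row')      # same as level=0
by_field = stacked.unstack(level='field')  # same as the default, level=-1

print(by_row)
print(by_field)
```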

Example 5: Cross-Section with xs

Combining stack()/unstack() with xs() (cross-section) allows for precise data slicing.

cross_section = multi_level_df.xs('A', level=1)
print(cross_section)

Output:

0      one
1      two
2    three
dtype: object

Here xs('A', level=1) selects every entry whose inner index label is A and drops that level from the result, leaving one value per original row.
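The same call works on the outer level as well. The sketch below takes a cross-section at each level of the stacked Series; col_a and row_1 are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['one', 'two', 'three'],
    'B': ['four', 'five', 'six'],
    'C': [1, 2, 3]
})
stacked = df.stack()

# Inner level: every value that came from column A
col_a = stacked.xs('A', level=1)

# Outer level: every value that came from row 1
row_1 = stacked.xs(1, level=0)

print(col_a)
print(row_1)
```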

Example 6: Handling Missing Data

Stacking and unstacking also interact with missing data, so it helps to know how each method treats it.

Consider a DataFrame with missing values:

df_with_na = pd.DataFrame({
    'A': ['one', None, 'three'],
    'B': ['four', 'five', 'six'],
    'C': [1, 2, 3]
})
stacked_with_na = df_with_na.stack()

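With the original stack() implementation and its default dropna=True, missing values are dropped, so the entry for row 1, column A does not appear in the result. The newer implementation, available since pandas 2.1 as stack(future_stack=True) and intended to become the default in pandas 3.0, keeps missing values as NaN instead.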
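Because the default behaviour depends on the pandas version, a sketch that does not rely on it is to call dropna() explicitly after stacking. The names stacked, dropped, and wide_again are illustrative.

```python
import pandas as pd

df_with_na = pd.DataFrame({
    'A': ['one', None, 'three'],
    'B': ['four', 'five', 'six'],
    'C': [1, 2, 3]
})

stacked = df_with_na.stack()

# Depending on the pandas version this is 8 (missing value dropped)
# or 9 (missing value kept as NaN)
print(len(stacked))

# Dropping explicitly gives the same result on either implementation
dropped = stacked.dropna()
print((1, 'A') in dropped.index)

# unstack() restores the rectangular shape and fills the gap with NaN
wide_again = dropped.unstack()
print(wide_again)
```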

Example 7: Stacking and Unstacking with MultiIndex Columns

For DataFrames with multi-level (MultiIndex) columns, stack() moves one column level at a time into the row index, by default the innermost one.

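# Tuple keys become a two-level MultiIndex on the columns
multi_col_df = pd.DataFrame({
    ('A', 'cat'): ['one', 'two', 'three'],
    ('B', 'dog'): ['four', 'five', 'six'],
    ('C', 'mouse'): [1, 2, 3]
})

# Use the ('A', 'cat') column as the row index, then stack the
# innermost column level ('dog', 'mouse') into the row index
stacked_multi = multi_col_df.set_index([('A', 'cat')]).stack()
print(stacked_multi)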

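The result is still a DataFrame: the inner column level (dog, mouse) has moved into the row index, while the outer level (B, C) remains as columns. Combinations with no data in the original, such as C with dog, show up as NaN.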
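To choose which column level moves, pass level to stack(). The sketch below uses a hypothetical table of regional figures; the level names year and metric and all values are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical two-level columns: year on the outside, metric on the inside
cols = pd.MultiIndex.from_product(
    [['2023', '2024'], ['cost', 'sales']],
    names=['year', 'metric'],
)
wide = pd.DataFrame(
    [[1, 2, 3, 4], [5, 6, 7, 8]],
    index=['east', 'west'],
    columns=cols,
)

# Move the inner level into the rows: one column per year remains
by_metric = wide.stack(level='metric')

# Move the outer level into the rows: one column per metric remains
by_year = wide.stack(level='year')

print(by_metric)
print(by_year)
```

Looking values up with loc, for example by_metric.loc[('east', 'sales'), '2024'], avoids depending on the row order, which can differ between stack() implementations.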

Conclusion

This guide outlined the practical applications of stack() and unstack() methods, from basic to advanced uses. These examples illustrate the powerful flexibility pandas offers in data manipulation, enabling complex reshaping and structuring for analysis.
