Explore DataFrame.swaplevel() method in Pandas (5 examples)

Introduction
When to Use the swaplevel() Method?
Example 1: Basic Usage of swaplevel()
Example 2: Swapping Column Levels
Example 3: Sorting After Swapping for Better Readability
Example 4: Swapping Levels While Querying Data
Example 5: Incorporating swaplevel() in Data Aggregation
Conclusion

Introduction

Pandas DataFrames with a MultiIndex often need their index levels rearranged before the data is convenient to select, sort, or group. The swaplevel() method handles one such rearrangement: it exchanges the positions of two levels. This guide walks through swaplevel() in five examples, from a basic swap on the row index to its use alongside sorting, querying, and aggregation, so that you can see both how it works and when it is worth using.

When to Use the swaplevel() Method?

The swaplevel() method swaps two levels of a MultiIndex, on either the rows or the columns of a DataFrame. It accepts the parameters i and j, the two levels to swap (given as level names or integer positions), and axis, which selects the row index (0, the default) or the columns (1). If i and j are not specified, the two innermost levels are swapped.
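
As a minimal sketch of these parameters, the snippet below builds a small three-level index (the level names outer, middle, and inner and the values are made up for illustration) and swaps levels by default, by position, and by name.

```python
import pandas as pd

# Hypothetical three-level index, used only to illustrate the parameters
index = pd.MultiIndex.from_tuples(
    [("x", "a", 1), ("x", "b", 2), ("y", "a", 3)],
    names=["outer", "middle", "inner"],
)
df = pd.DataFrame({"value": [10, 20, 30]}, index=index)

# No arguments: the two innermost levels ("middle" and "inner") trade places
default_swap = df.swaplevel()
print(list(default_swap.index.names))  # ['outer', 'inner', 'middle']

# Levels can be given by integer position or by name
by_position = df.swaplevel(0, 2)
by_name = df.swaplevel("outer", "inner")
print(list(by_position.index.names))  # ['inner', 'middle', 'outer']
print(by_position.index.equals(by_name.index))  # True
```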

Example 1: Basic Usage of swaplevel()

import pandas as pd
import numpy as np

np.random.seed(2024)

# Creating a sample DataFrame
arrays = [['bar', 'bar', 'baz', 'baz'],
          ['one', 'two', 'one', 'two']]
index = pd.MultiIndex.from_arrays(arrays, names=('first', 'second'))
df = pd.DataFrame({'A': np.random.randn(4), 'B': np.random.randn(4)}, index=index)

# Using swaplevel()
df_swapped = df.swaplevel('first', 'second')
print(df_swapped)

Output:

                     A         B
second first                    
one    bar    1.668047  0.916052
two    bar    0.737348  1.160330
one    baz   -0.201538 -2.619962
two    baz   -0.150912 -1.325295

This shows the basic idea: swaplevel() exchanges the positions of the two index levels, while the order of the rows and the values in A and B stay exactly as they were. This is useful when a later step, such as selection, sorting, or grouping, expects a particular level on the outside.
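
The following sketch checks that claim on a frame with fixed values (hypothetical data chosen so the results are easy to read, not the random values used above). It also shows that swaplevel() returns a new DataFrame rather than modifying the original.

```python
import pandas as pd

# Same index layout as Example 1, with fixed illustrative values
index = pd.MultiIndex.from_arrays(
    [["bar", "bar", "baz", "baz"], ["one", "two", "one", "two"]],
    names=("first", "second"),
)
df = pd.DataFrame({"A": [1, 2, 3, 4]}, index=index)

swapped = df.swaplevel("first", "second")

# The original is untouched: swaplevel() returns a new DataFrame
print(list(df.index.names))       # ['first', 'second']
print(list(swapped.index.names))  # ['second', 'first']

# The same row, addressed with its labels in the new order
print(df.loc[("baz", "one"), "A"])       # 3
print(swapped.loc[("one", "baz"), "A"])  # 3

# Row order and values are the same as before the swap
print(swapped["A"].tolist())  # [1, 2, 3, 4]
```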

Example 2: Swapping Column Levels

import pandas as pd
import numpy as np

np.random.seed(2024)

# Creating a DataFrame with MultiIndex columns
columns = pd.MultiIndex.from_arrays(
    [["A", "A", "B", "B"], ["one", "two", "one", "two"]], names=["upper", "lower"]
)
df = pd.DataFrame(np.random.randn(3, 4), columns=columns)

# Swapping the column levels
df_swapped = df.swaplevel("upper", "lower", axis=1)
print(df_swapped)

Output:

lower       one       two       one       two
upper         A         A         B         B
0      1.668047  0.737348 -0.201538 -0.150912
1      0.916052  1.160330 -2.619962 -1.325295
2      0.459989  0.102052  1.053553  1.624043

This example demonstrates how to swap levels in a DataFrame’s columns rather than its rows. The use of the axis=1 argument specifies that the operation should be performed on columns.
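
One practical effect of a column swap, sketched below with fixed illustrative values, is that the labels of the new outer level can be used directly to select a group of columns.

```python
import pandas as pd

# Same column layout as above, with fixed illustrative values
columns = pd.MultiIndex.from_arrays(
    [["A", "A", "B", "B"], ["one", "two", "one", "two"]],
    names=["upper", "lower"],
)
df = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8]], columns=columns)

swapped = df.swaplevel("upper", "lower", axis=1)

# "lower" is now the outer column level, so its labels select sub-frames
one = swapped["one"]
print(list(one.columns))    # ['A', 'B']
print(one.values.tolist())  # [[1, 3], [5, 7]]
```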

Example 3: Sorting After Swapping for Better Readability

This example is an expansion of Example #1:

import pandas as pd
import numpy as np

np.random.seed(2024)

# Creating a sample DataFrame
arrays = [['bar', 'bar', 'baz', 'baz'],
          ['one', 'two', 'one', 'two']]
index = pd.MultiIndex.from_arrays(arrays, names=('first', 'second'))
df = pd.DataFrame({'A': np.random.randn(4), 'B': np.random.randn(4)}, index=index)

# Using swaplevel()
df_swapped = df.swaplevel('first', 'second')

# Sorting the swapped index so that equal 'second' labels sit together
df_swapped_sorted = df_swapped.sort_index()
print(df_swapped_sorted)

Output:

                     A         B
second first                    
one    bar    1.668047  0.916052
       baz   -0.201538 -2.619962
two    bar    0.737348  1.160330
       baz   -0.150912 -1.325295

swaplevel() changes the order of the levels but not the order of the rows, so after a swap the new outer level is often unsorted and rows with the same label are scattered. This example shows that calling sort_index() after the swap groups those rows together, which makes the result much easier to read.
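
A small sketch of the same idea with fixed illustrative values: the index attribute is_monotonic_increasing reports whether the index is in sorted order, which makes the effect of sort_index() easy to check.

```python
import pandas as pd

index = pd.MultiIndex.from_arrays(
    [["bar", "bar", "baz", "baz"], ["one", "two", "one", "two"]],
    names=("first", "second"),
)
df = pd.DataFrame({"A": [1, 2, 3, 4]}, index=index)

# After the swap, the outer level reads one, two, one, two: not sorted
swapped = df.swaplevel("first", "second")
print(swapped.index.is_monotonic_increasing)  # False

# sort_index() reorders the rows so that equal outer labels are adjacent
swapped_sorted = swapped.sort_index()
print(swapped_sorted.index.is_monotonic_increasing)  # True
print(swapped_sorted["A"].tolist())  # [1, 3, 2, 4]
```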

Example 4: Swapping Levels While Querying Data

import pandas as pd
import numpy as np

np.random.seed(2024)

# Creating a sample DataFrame
arrays = [["bar", "bar", "baz", "baz"], ["one", "two", "one", "two"]]
index = pd.MultiIndex.from_arrays(arrays, names=("first", "second"))
df = pd.DataFrame({"A": np.random.randn(4), "B": np.random.randn(4)}, index=index)

# Swapping the levels, then keeping only the rows where 'second' is "one"
df_query = df.swaplevel('first', 'second').query('second == "one"')
print(df_query)

Output:

                     A         B
second first                    
one    bar    1.668047  0.916052
       baz   -0.201538 -2.619962

This example chains swaplevel() and query() in a single expression. The swap runs first, and query() then filters on the second level by name, so the result comes back with second as the outer level. Chaining the two keeps the reshaping and the filtering in one readable step.
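
It may help to see what the swap contributes here. In the sketch below (fixed illustrative values), query() finds the second level by name whether or not the levels have been swapped, so the swap only changes how the result is laid out. The xs() call at the end is an alternative that selects the same rows and drops the matched level.

```python
import pandas as pd

index = pd.MultiIndex.from_arrays(
    [["bar", "bar", "baz", "baz"], ["one", "two", "one", "two"]],
    names=("first", "second"),
)
df = pd.DataFrame({"A": [1, 2, 3, 4]}, index=index)

# query() resolves "second" by level name, before or after the swap
rows_unswapped = df.query('second == "one"')
rows_swapped = df.swaplevel("first", "second").query('second == "one"')

print(rows_unswapped["A"].tolist())  # [1, 3]
print(rows_swapped["A"].tolist())    # [1, 3]

# Only the level order of the result differs
print(list(rows_unswapped.index.names))  # ['first', 'second']
print(list(rows_swapped.index.names))    # ['second', 'first']

# xs() selects the same rows and drops the "second" level from the result
by_xs = df.xs("one", level="second")
print(by_xs["A"].tolist())  # [1, 3]
```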

Example 5: Incorporating swaplevel() in Data Aggregation

import pandas as pd
import numpy as np

# Example DataFrame
arrays = [["A", "A", "B", "B"], ["C1", "C2", "C1", "C2"]]
columns = pd.MultiIndex.from_arrays(arrays, names=('Letter', 'Number'))
values = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])
df = pd.DataFrame(values.T, columns=columns)

# Swapping so that 'Number' is the outer column level, then summing the
# columns that share a 'Number' label (transpose, group on level 0, transpose back)
df_swapped_aggregated = (
    df.swaplevel('Letter', 'Number', axis=1).T.groupby(level=0).sum().T
)
print(df_swapped_aggregated)

Output:

Number  C1  C2
0       10  18
1       12  20
2       14  22
3       16  24

This final example uses swaplevel() as one step in an aggregation. After the swap, Number is the outer column level (level 0), so grouping the transposed frame on level 0 adds the A and B columns together for each Number. The transpose matters: calling groupby(level=0) directly on the swapped frame would group by the row index, where every row is its own group, and the values would come back unchanged.
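
As a hedged sketch of the same kind of column-wise aggregation, the snippet below groups by level name instead of by position, which works without swapping first. The transpose round trip is one possible approach, used here because groupby(axis=1) is deprecated in recent pandas releases. The data matches the example above.

```python
import numpy as np
import pandas as pd

columns = pd.MultiIndex.from_arrays(
    [["A", "A", "B", "B"], ["C1", "C2", "C1", "C2"]],
    names=("Letter", "Number"),
)
# Same values as above: the first column is 1..4, the second 5..8, and so on
df = pd.DataFrame(np.arange(1, 17).reshape(4, 4).T, columns=columns)

# Sum the columns that share a "Number" label
by_number = df.T.groupby(level="Number").sum().T
print(by_number.values.tolist())  # [[10, 18], [12, 20], [14, 22], [16, 24]]

# Sum the columns that share a "Letter" label
by_letter = df.T.groupby(level="Letter").sum().T
print(by_letter.values.tolist())  # [[6, 22], [8, 24], [10, 26], [12, 28]]

# Swapping first and grouping on level 0 gives the same per-Number totals
via_swap = df.swaplevel("Letter", "Number", axis=1).T.groupby(level=0).sum().T
print(via_swap.equals(by_number))  # True
```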

Conclusion

In this guide, we explored the swaplevel() method in Pandas through five examples: a basic swap of row index levels, a swap of column levels, sorting after a swap, and chaining swaplevel() with querying and aggregation. In each case the method only changes the order of the levels and leaves the data itself untouched, which makes it a safe building block for reshaping MultiIndex data before further analysis.
