Pandas

Syntax & Parameters
Example 1: Basic Melting
Example 2: Without Specifying id_vars
Example 3: Multiple id_vars and value_vars
Example 4: Melting with Hierarchical Column Headers
Example 5: Customizing Melted DataFrames
Conclusion

Syntax & Parameters

The DataFrame.melt() method in Pandas is a versatile function used to transform or reshape data in DataFrames. It ‘melts’ the DataFrame into a long format, where multiple columns are merged into one, allowing for a more flexible data structure that is easier to aggregate, manipulate, and read for certain types of analyses. This tutorial will guide you through five practical examples to demonstrate the power and flexibility of melt() in various data manipulation scenarios.

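Before we dive into the examples, it’s important to understand the basic syntax. The examples below call the top-level pd.melt() function; df.melt(...) is the equivalent method and takes the same arguments minus frame:

pd.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None, ignore_index=True)

Where:

frame is the DataFrame to melt.
id_vars are the column(s) to keep as identifiers. Their values are repeated once for each melted column.
value_vars are the columns to melt down into a single column. If omitted, every column not listed in id_vars is melted.
var_name sets the name of the variable column generated after melting (by default 'variable', or the name of the columns index if it has one).
value_name sets the name of the value column after melting (by default 'value').
col_level selects which level to melt when the columns are a MultiIndex.
ignore_index controls whether the original index is replaced by a fresh 0..n-1 index (True, the default) or kept (False). It is available since pandas 1.1.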
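
As a quick check of the syntax above, here is a minimal sketch showing that the function form and the method form give the same result, and which column names appear when var_name and value_name are left out. The frame named wide and its values are made up purely for illustration.

```python
import pandas as pd

# Hypothetical two-row frame, used only for illustration
wide = pd.DataFrame({'id': [1, 2], 'a': [10, 20], 'b': [30, 40]})

# Function form and method form of the same operation
via_function = pd.melt(wide, id_vars=['id'])
via_method = wide.melt(id_vars=['id'])

# Both calls should produce identical DataFrames
same_result = via_function.equals(via_method)

# With var_name and value_name omitted, the defaults are used
default_columns = list(via_function.columns)

print(same_result)
print(default_columns)
```

Which form to use is mostly a matter of style; the method form tends to read better inside a chain of DataFrame operations.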

Example 1: Basic Melting

Let’s start with a basic example that converts a wide DataFrame into a long format. Name is kept as the identifier column, the remaining column labels (Math and Science) are consolidated into a single variable column, renamed here to Subject, and their values go into a value column, renamed here to Score.

import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Carol'],
    'Math': [85, 88, 92],
    'Science': [90, 82, 88]
})

melted_df = pd.melt(df, id_vars=['Name'], var_name='Subject', value_name='Score')

print(melted_df)

Output:

   Name   Subject  Score
0  Alice  Math     85
1  Bob    Math     88
2  Carol  Math     92
3  Alice  Science  90
4  Bob    Science  82
5  Carol  Science  88
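
One reason the long format is easier to aggregate is that a single groupby can summarize every melted column at once. The following sketch reuses the data from this example; the variable names avg_by_subject and best_by_student are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Carol'],
    'Math': [85, 88, 92],
    'Science': [90, 82, 88]
})

melted_df = pd.melt(df, id_vars=['Name'], var_name='Subject', value_name='Score')

# Average score per subject, computed from the single Score column
avg_by_subject = melted_df.groupby('Subject')['Score'].mean()

# Highest score per student across all subjects
best_by_student = melted_df.groupby('Name')['Score'].max()

print(avg_by_subject)
print(best_by_student)
```

Adding another subject column to the wide DataFrame would not require any change to the aggregation code, since new subjects simply become extra rows after melting.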

Example 2: Without Specifying `id_vars`

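If you don’t specify id_vars, no column is treated as an identifier. With value_vars also left out, every column is melted, including City. Because the resulting value column mixes strings and numbers, it ends up with the object dtype: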

df = pd.DataFrame({
    'City': ['New York', 'Los Angeles', 'Chicago'],
    'Temperature': [59, 75, 48],
    'Humidity': [55, 65, 70]
})

melted_df = pd.melt(df)

print(melted_df)

Output:

    variable       value
0   City           New York
1   City           Los Angeles
2   City           Chicago
3   Temperature    59
4   Temperature    75
5   Temperature    48
6   Humidity       55
7   Humidity       65
8   Humidity       70
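
For comparison, here is a sketch of the same data melted with City kept as an identifier, which is often what is wanted in practice. The names Measurement and Reading are arbitrary choices for this illustration and do not come from the example above.

```python
import pandas as pd

df = pd.DataFrame({
    'City': ['New York', 'Los Angeles', 'Chicago'],
    'Temperature': [59, 75, 48],
    'Humidity': [55, 65, 70]
})

# Melting everything: strings and numbers share one column
mixed = pd.melt(df)

# Keeping City as an identifier: only the numeric columns are melted
by_city = pd.melt(df, id_vars=['City'], var_name='Measurement', value_name='Reading')

print(mixed['value'].dtype)
print(by_city['Reading'].dtype)
print(by_city)
```

Keeping the text column out of the melt leaves the value column numeric, so it can be summed or averaged directly.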

Example 3: Multiple `id_vars` and `value_vars`

You can pass several columns to both id_vars and value_vars. Here Name and Region are kept as identifiers, while the two yearly sales columns are melted into a Year column and a Sales column:

import pandas as pd

df = pd.DataFrame({
    'Name': ['Dave', 'Emma'],
    '2019_Sales': [250, 300],
    '2020_Sales': [265, 340],
    'Region': ['East', 'West']
})

melted_df = pd.melt(df, id_vars=['Name', 'Region'], value_vars=['2019_Sales', '2020_Sales'], var_name='Year', value_name='Sales')

print(melted_df)

Output:

   Name Region        Year  Sales
0  Dave   East  2019_Sales    250
1  Emma   West  2019_Sales    300
2  Dave   East  2020_Sales    265
3  Emma   West  2020_Sales    340
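
The Year column above still carries the original column labels, such as 2019_Sales. A common follow-up step, sketched here as one possible approach, is to strip the suffix and convert the result to integers so the column can be sorted or filtered as a number.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Dave', 'Emma'],
    '2019_Sales': [250, 300],
    '2020_Sales': [265, 340],
    'Region': ['East', 'West']
})

melted_df = pd.melt(df, id_vars=['Name', 'Region'],
                    value_vars=['2019_Sales', '2020_Sales'],
                    var_name='Year', value_name='Sales')

# Remove the literal '_Sales' suffix, then convert the labels to integers
melted_df['Year'] = melted_df['Year'].str.replace('_Sales', '', regex=False).astype(int)

# Total sales per year, now keyed by a numeric year
sales_by_year = melted_df.groupby('Year')['Sales'].sum()

print(melted_df)
print(sales_by_year)
```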

Example 4: Melting with Hierarchical Column Headers

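DataFrames with hierarchical (multi-level) column headers can also be melted to simplify their structure for analysis. The col_level parameter selects which level of the column MultiIndex supplies the labels for the variable column; here level 1 (City, Temperature, Humidity) is used: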

import pandas as pd

df = pd.DataFrame({
    ('Location', 'City'): ['London', 'Tokyo', 'New York'],
    ('Weather', 'Temperature'): [15, 26, 21],
    ('Weather', 'Humidity'): [80, 70, 65]
})

df.columns = pd.MultiIndex.from_tuples(df.columns)

melted_df = pd.melt(df, col_level=1)

print(melted_df)

Output:

    variable       value
0   City           London
1   City           Tokyo
2   City           New York
3   Temperature    15
4   Temperature    26
5   Temperature    21
6   Humidity       80
7   Humidity       70
8   Humidity       65
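
To see what col_level changes, the sketch below melts the same data twice: once without col_level, where each column level gets its own variable column, and once with col_level=1 plus City as an identifier. The variable names all_levels and by_city are only illustrative.

```python
import pandas as pd

# Tuple keys give the DataFrame a two-level column MultiIndex
df = pd.DataFrame({
    ('Location', 'City'): ['London', 'Tokyo', 'New York'],
    ('Weather', 'Temperature'): [15, 26, 21],
    ('Weather', 'Humidity'): [80, 70, 65]
})

# Without col_level, every level becomes its own variable column
all_levels = pd.melt(df)

# With col_level=1, labels from that level can be used directly in id_vars
by_city = pd.melt(df, id_vars=['City'], col_level=1)

print(list(all_levels.columns))
print(by_city)
```

The first form keeps the information from both header levels, while the second is more compact when the lower level alone identifies each column.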

Example 5: Customizing Melted DataFrames

The final example combines several parameters to control exactly what ends up in the result. Only the columns listed in value_vars are melted, so History is left out of the output entirely, and var_name and value_name give the new columns meaningful names:

import pandas as pd

df = pd.DataFrame({
    'Student': ['John', 'Jane'],
    'Math': [92, 85],
    'Science': [88, 90],
    'History': [94, 88]
})

melted_df = pd.melt(df, id_vars=['Student'], value_vars=['Math', 'Science'], var_name='Subject', value_name='Score')

print(melted_df)

Output:

   Student  Subject  Score
0  John     Math     92
1  Jane     Math     85
2  John     Science  88
3  Jane     Science  90
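
One more customization is keeping the original row labels. The sketch below passes ignore_index=False, which assumes pandas 1.1 or later, so each melted row keeps the index of the row it came from.

```python
import pandas as pd

df = pd.DataFrame({
    'Student': ['John', 'Jane'],
    'Math': [92, 85],
    'Science': [88, 90],
    'History': [94, 88]
})

# ignore_index=False keeps the original index, so labels repeat
kept_index = pd.melt(df, id_vars=['Student'], value_vars=['Math', 'Science'],
                     var_name='Subject', value_name='Score', ignore_index=False)

print(kept_index)
print(list(kept_index.index))
```

The repeated labels make it possible to trace each melted row back to its source row, at the cost of an index that is no longer unique.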

Conclusion

The melt() method provides a highly adaptable tool for reshaping DataFrames, making data easier to analyze and work with. Through these examples, you’ve seen how to go from simple wide-to-long transformations to more complex custom melts tailored for specific data structures and requirements.
