Pandas – Using DataFrame.melt() method (5 examples)

Last updated: February 22, 2024

Syntax & Parameters

The DataFrame.melt() method in Pandas reshapes a DataFrame from wide to long format. It ‘melts’ the selected columns into two: one holding the original column names and one holding their values, while any identifier columns are repeated alongside. The long format is often easier to aggregate, filter, and read for certain types of analyses. This tutorial walks through five practical examples that demonstrate melt() in various data manipulation scenarios.

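Before we dive into the examples, it’s important to understand the basic syntax. The examples use the top-level function pd.melt(); the method form df.melt(...) accepts the same arguments without frame:

pd.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None, ignore_index=True)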

Where:

  • frame is the DataFrame to melt.
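  • id_vars are the column(s) to keep as identifiers. The data in these columns is repeated for each melted value.
  • value_vars are the columns to melt down into a single column. If omitted, every column not listed in id_vars is melted.
  • var_name sets the name of the variable column generated after melting (default: 'variable', or the name of the columns index if it has one).
  • value_name sets the name of the value column after melting (default: 'value').
  • col_level selects which level to melt when the columns are a MultiIndex.
  • ignore_index controls whether the original index is discarded (True, the default) or retained (False).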
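As a quick check of the syntax above, the following sketch compares the function form with the method form. The small DataFrame and its column names (id, a, b) are made up purely for illustration.

```python
import pandas as pd

# Hypothetical data, used only to illustrate the call syntax.
df = pd.DataFrame({"id": [1, 2], "a": [10, 20], "b": [30, 40]})

# Function form: the DataFrame is passed as the first argument.
via_function = pd.melt(df, id_vars=["id"], value_vars=["a", "b"])

# Method form: the same arguments, called on the DataFrame itself.
via_method = df.melt(id_vars=["id"], value_vars=["a", "b"])

# Both calls should produce identical results.
print(via_function.equals(via_method))
print(via_method)
```

Because var_name and value_name are not given here, the generated columns fall back to their default names.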

Example 1: Basic Melting

Let’s start with a basic example to convert a wide DataFrame into a long format. This operation consolidates column labels into a single ‘variable’ column, and their respective values into a ‘value’ column.

import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Carol'],
    'Math': [85, 88, 92],
    'Science': [90, 82, 88]
})

melted_df = pd.melt(df, id_vars=['Name'], var_name='Subject', value_name='Score')

print(melted_df)

Output:

   Name   Subject  Score
0  Alice  Math     85
1  Bob    Math     88
2  Carol  Math     92
3  Alice  Science  90
4  Bob    Science  82
5  Carol  Science  88
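One reason to prefer the long format is that aggregation becomes a plain groupby. The sketch below rebuilds the DataFrame from this example and computes the mean score per subject; the variable name mean_by_subject is only illustrative.

```python
import pandas as pd

# Same data as in Example 1.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol"],
    "Math": [85, 88, 92],
    "Science": [90, 82, 88],
})

melted_df = pd.melt(df, id_vars=["Name"], var_name="Subject", value_name="Score")

# In long format, one groupby covers every subject at once,
# no matter how many subject columns the wide table had.
mean_by_subject = melted_df.groupby("Subject")["Score"].mean()

print(mean_by_subject)
```

The same groupby would keep working unchanged if more subject columns were added to the wide table.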

Example 2: Without Specifying id_vars

If you specify neither id_vars nor value_vars, no column is held constant: every column, including City, is melted into the variable/value pair, leading to a different reshaping:

df = pd.DataFrame({
    'City': ['New York', 'Los Angeles', 'Chicago'],
    'Temperature': [59, 75, 48],
    'Humidity': [55, 65, 70]
})

melted_df = pd.melt(df)

print(melted_df)

Output:

    variable       value
0   City           New York
1   City           Los Angeles
2   City           Chicago
3   Temperature    59
4   Temperature    75
5   Temperature    48
6   Humidity       55
7   Humidity       65
8   Humidity       70
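A side effect of melting everything is that text and numbers end up in the same value column. The following sketch, using the same data, contrasts that with keeping City as an identifier; the variable names all_melted and by_city are only illustrative.

```python
import pandas as pd

# Same data as in Example 2.
df = pd.DataFrame({
    "City": ["New York", "Los Angeles", "Chicago"],
    "Temperature": [59, 75, 48],
    "Humidity": [55, 65, 70],
})

# No id_vars: city names and numbers share one column,
# so the value column can no longer be a numeric dtype.
all_melted = pd.melt(df)
print(all_melted["value"].dtype)

# City as identifier: only the numeric columns are melted,
# so the value column stays numeric and each row keeps its city.
by_city = pd.melt(df, id_vars=["City"])
print(by_city["value"].dtype)
print(by_city)
```

In most analyses the second form is the more useful one, because each measurement stays attached to the city it belongs to.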

Example 3: Multiple id_vars and value_vars

This example involves consolidating more complex datasets. By specifying multiple id_vars and value_vars, you can reshape a dataset in more detail:

import pandas as pd

df = pd.DataFrame({
    'Name': ['Dave', 'Emma'],
    '2019_Sales': [250, 300],
    '2020_Sales': [265, 340],
    'Region': ['East', 'West']
})

melted_df = pd.melt(df, id_vars=['Name', 'Region'], value_vars=['2019_Sales', '2020_Sales'], var_name='Year', value_name='Sales')

print(melted_df)

Output:

   Name Region        Year  Sales
0  Dave   East  2019_Sales    250
1  Emma   West  2019_Sales    300
2  Dave   East  2020_Sales    265
3  Emma   West  2020_Sales    340
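After melting, the Year column still carries the old column labels such as 2019_Sales. A common follow-up step, sketched here as one possible approach, is to strip the suffix and convert the result to integers.

```python
import pandas as pd

# Same data as in Example 3.
df = pd.DataFrame({
    "Name": ["Dave", "Emma"],
    "2019_Sales": [250, 300],
    "2020_Sales": [265, 340],
    "Region": ["East", "West"],
})

melted_df = pd.melt(
    df,
    id_vars=["Name", "Region"],
    value_vars=["2019_Sales", "2020_Sales"],
    var_name="Year",
    value_name="Sales",
)

# Remove the literal "_Sales" suffix, then turn the remaining text into integers.
melted_df["Year"] = (
    melted_df["Year"].str.replace("_Sales", "", regex=False).astype(int)
)

print(melted_df)
```

With Year stored as a number, the rows can be sorted or filtered by year directly.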

Example 4: Melting with Hierarchical Column Headers

DataFrames with hierarchical (multi-level) column headers can also be melted to simplify their structure for analysis. The col_level parameter selects which level of the column index supplies the variable names; here, level 1 (the inner level) is used:

import pandas as pd

df = pd.DataFrame({
    ('Location', 'City'): ['London', 'Tokyo', 'New York'],
    ('Weather', 'Temperature'): [15, 26, 21],
    ('Weather', 'Humidity'): [80, 70, 65]
})

df.columns = pd.MultiIndex.from_tuples(df.columns)

melted_df = pd.melt(df, col_level=1)

print(melted_df)

Output:

    variable       value
0   City           London
1   City           Tokyo
2   City           New York
3   Temperature    15
4   Temperature    26
5   Temperature    21
6   Humidity       80
7   Humidity       70
8   Humidity       65
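For comparison, the sketch below melts the same data without col_level, and then with col_level=0. It builds the MultiIndex columns explicitly; the variable names both_levels and top_level are only illustrative.

```python
import pandas as pd

# Same data as in Example 4, with the two-level columns built explicitly.
columns = pd.MultiIndex.from_tuples([
    ("Location", "City"),
    ("Weather", "Temperature"),
    ("Weather", "Humidity"),
])
df = pd.DataFrame(
    [["London", 15, 80], ["Tokyo", 26, 70], ["New York", 21, 65]],
    columns=columns,
)

# Without col_level, both levels are kept, one variable column per level.
both_levels = pd.melt(df)
print(both_levels.columns.tolist())

# With col_level=0, only the outer level is used for the variable names.
top_level = pd.melt(df, col_level=0)
print(top_level)
```

Keeping both levels preserves the grouping information (Location vs. Weather) that col_level=1 discards.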

Example 5: Customizing Melted DataFrames

The final example combines several parameters to control exactly what ends up in the result. Only the columns listed in value_vars are melted, so History is left out of the output entirely, while var_name and value_name give the resulting columns meaningful names:

import pandas as pd

df = pd.DataFrame({
    'Student': ['John', 'Jane'],
    'Math': [92, 85],
    'Science': [88, 90],
    'History': [94, 88]
})

melted_df = pd.melt(df, id_vars=['Student'], value_vars=['Math', 'Science'], var_name='Subject', value_name='Score')

print(melted_df)

Output:

   Student  Subject  Score
0  John     Math     92
1  Jane     Math     85
2  John     Science  88
3  Jane     Science  90
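Another customization is the ignore_index parameter. The sketch below is a variation on this example, not part of the original: it moves Student into the index and passes ignore_index=False so that the index labels are retained in the result.

```python
import pandas as pd

# Same data as in Example 5, but with Student as the index (an assumption
# made for this illustration).
df = pd.DataFrame({
    "Student": ["John", "Jane"],
    "Math": [92, 85],
    "Science": [88, 90],
    "History": [94, 88],
}).set_index("Student")

# ignore_index=False keeps the original index labels instead of
# replacing them with a fresh 0..n-1 range.
melted_df = pd.melt(
    df,
    value_vars=["Math", "Science"],
    var_name="Subject",
    value_name="Score",
    ignore_index=False,
)

print(melted_df)
```

Each student label appears once per melted column, so the index of the result contains repeated values.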

Conclusion

The melt() method provides a highly adaptable tool for reshaping DataFrames, making data easier to analyze and work with. Through these examples, you’ve seen how to go from simple wide-to-long transformations to more complex custom melts tailored for specific data structures and requirements.
