
Pandas: Turn a DataFrame to a list of dictionaries

Last updated: February 19, 2024

Introduction

Pandas is an immensely popular Python library for data manipulation and analysis. One of its core data structures is the DataFrame, which efficiently stores and operates on tabular data. In certain cases, you may want to convert a DataFrame into a list of dictionaries, which can be more convenient for JSON serialization, or for passing data to systems that expect this format. This tutorial will guide you through this conversion process, covering basic to advanced scenarios.

Getting Started

Let’s start by importing pandas and creating a simple DataFrame for our examples:

import pandas as pd

# Create a DataFrame
data = {'Name': ['John', 'Anna', 'Peter', 'Linda'],
        'Age': [28, 34, 29, 32],
        'City': ['New York', 'Paris', 'Berlin', 'London']}
df = pd.DataFrame(data)
print(df)

Output:

    Name  Age      City
0   John   28  New York
1   Anna   34      Paris
2  Peter   29    Berlin
3  Linda   32    London

Basic Conversion

The simplest way to turn a DataFrame into a list of dictionaries is by using the .to_dict() method with the 'records' orientation, which converts each row to a dictionary, with column names as keys:

list_of_dicts = df.to_dict('records')
print(list_of_dicts)

Output:

[{'Name': 'John', 'Age': 28, 'City': 'New York'},
 {'Name': 'Anna', 'Age': 34, 'City': 'Paris'},
 {'Name': 'Peter', 'Age': 29, 'City': 'Berlin'},
 {'Name': 'Linda', 'Age': 32, 'City': 'London'}]
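
Since JSON serialization is one of the main reasons to want this format, here is a minimal sketch of passing the result to the standard-library json module. It assumes every value in the DataFrame is a plain string or number; dates and missing values would need extra handling. The variable names records and json_text are chosen only for this illustration.

```python
import json

import pandas as pd

# Same sample data as above, rebuilt so the snippet runs on its own
df = pd.DataFrame({'Name': ['John', 'Anna', 'Peter', 'Linda'],
                   'Age': [28, 34, 29, 32],
                   'City': ['New York', 'Paris', 'Berlin', 'London']})

# Each row becomes one dictionary keyed by column name
records = df.to_dict(orient='records')

# A list of plain dictionaries can be handed straight to json.dumps()
json_text = json.dumps(records)
print(json_text)
```

Passing orient='records' as a keyword argument is equivalent to the positional form used above and makes the intent a little more explicit.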

Customizing the Conversion

You may want to include only certain columns in your list of dictionaries. The .to_dict() method does not take a column list itself, so select the columns you need first and then call .to_dict() on the resulting DataFrame:

# Including specific columns
list_of_dicts_partial = df[['Name', 'City']].to_dict('records')
print(list_of_dicts_partial)

Output:

[{'Name': 'John', 'City': 'New York'},
 {'Name': 'Anna', 'City': 'Paris'},
 {'Name': 'Peter', 'City': 'Berlin'},
 {'Name': 'Linda', 'City': 'London'}]
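
The opposite case, leaving out a column rather than listing the ones to keep, can be sketched with DataFrame.drop(). This is one possible approach; the variable name list_of_dicts_without_age is made up for this illustration.

```python
import pandas as pd

# Same sample data as above, rebuilt so the snippet runs on its own
df = pd.DataFrame({'Name': ['John', 'Anna', 'Peter', 'Linda'],
                   'Age': [28, 34, 29, 32],
                   'City': ['New York', 'Paris', 'Berlin', 'London']})

# Excluding specific columns: drop() returns a new DataFrame, df is unchanged
list_of_dicts_without_age = df.drop(columns=['Age']).to_dict('records')
print(list_of_dicts_without_age)
```

This tends to be more convenient than column selection when the DataFrame has many columns and only one or two need to be left out.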

Dealing with Missing Data

When converting a DataFrame with missing values, it’s important to decide how those values should be handled. By default, .to_dict() keeps them: a missing number appears as nan (and its column is stored as float, so 25 becomes 25.0), while a None in a text column such as City stays None. One simple option is to drop every row that contains a missing value with dropna() before converting:

# DataFrame with missing values
data_with_missing = {'Name': ['Tom', 'Sara', 'Chris'],
                     'Age': [25, None, 28],
                     'City': ['Rome', 'Madrid', None]}
df_missing = pd.DataFrame(data_with_missing)

# Drop every row that contains at least one missing value, then convert
list_of_dicts_no_none = df_missing.dropna().to_dict('records')
print(list_of_dicts_no_none)

Output:

[{'Name': 'Tom', 'Age': 25.0, 'City': 'Rome'}]
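
Note that dropna() removes whole rows. If the goal is to keep every row and omit only the keys whose values are missing, one possible sketch is to filter each record after conversion with pd.notna(). The variable name records_without_missing_keys is made up for this illustration.

```python
import pandas as pd

# Same data as above, rebuilt so the snippet runs on its own
df_missing = pd.DataFrame({'Name': ['Tom', 'Sara', 'Chris'],
                           'Age': [25, None, 28],
                           'City': ['Rome', 'Madrid', None]})

# Keep all rows, but leave out any key whose value is NaN or None
records_without_missing_keys = [
    {key: value for key, value in record.items() if pd.notna(value)}
    for record in df_missing.to_dict('records')
]
print(records_without_missing_keys)
```

With this approach the record for Sara has no Age key and the record for Chris has no City key, so consumers of the list need to tolerate dictionaries with different sets of keys.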

Advanced Conversion Techniques

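For applications that need more control or extra processing during conversion, you can build the dictionaries yourself. For example, a list comprehension over DataFrame.iterrows() processes the data row by row. Note that Series.iteritems() was removed in pandas 2.0, so use Series.items() instead, and keep in mind that iterrows() is considerably slower than .to_dict('records') on large DataFrames:

# Advanced example with iterrows(); each row is a Series of column/value pairs
advanced_list_of_dicts = [{col: val for col, val in row.items()} for _, row in df.iterrows()]
print(advanced_list_of_dicts)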

Output:

[{'Name': 'John', 'Age': 28, 'City': 'New York'},
 {'Name': 'Anna', 'Age': 34, 'City': 'Paris'},
 {'Name': 'Peter', 'Age': 29, 'City': 'Berlin'},
 {'Name': 'Linda', 'Age': 32, 'City': 'London'}]
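
When the extra processing is simple, a comprehension over the result of .to_dict('records') is often enough and avoids iterrows() altogether. The sketch below upper-cases each name and adds a derived key; the key name IsOver30 and the variable processed_records are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Same sample data as above, rebuilt so the snippet runs on its own
df = pd.DataFrame({'Name': ['John', 'Anna', 'Peter', 'Linda'],
                   'Age': [28, 34, 29, 32],
                   'City': ['New York', 'Paris', 'Berlin', 'London']})

# Copy each record, overwrite 'Name', and add a derived boolean key
processed_records = [
    {**record,
     'Name': record['Name'].upper(),
     'IsOver30': record['Age'] > 30}  # hypothetical derived key
    for record in df.to_dict('records')
]
print(processed_records)
```

For heavier transformations it is usually better to compute new columns on the DataFrame first and convert afterwards, so the work stays vectorized.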

Conclusion

Converting a DataFrame to a list of dictionaries makes it easier to pass data between Pandas and other Python libraries or external systems. This tutorial has covered the conversion from the basic .to_dict('records') call through column selection, missing-value handling, and row-wise processing.
