Pandas: How to append new rows to a DataFrame (4 approaches)

Updated: February 20, 2024 By: Guest Contributor

Introduction

Pandas is one of the most widely used Python libraries for data analysis and manipulation, and appending new rows to an existing DataFrame is a task that data scientists and analysts run into constantly. This tutorial walks through several ways to append rows to a DataFrame, from the basic methods to less common techniques, and notes where each one fits.

Getting Started

Before diving into the various methods for appending rows, let’s set up a basic DataFrame to work with. Ensure you have Pandas installed in your environment. If not, you can install it using pip install pandas. Once installed, import Pandas and create a simple DataFrame:

import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna'],
    'Age': [28, 22],
    'City': ['New York', 'Los Angeles']
})

print(df)

Output:

   Name  Age         City
0  John   28     New York
1  Anna   22  Los Angeles

This initial DataFrame contains basic information about two individuals. Our goal is to add more entries to this DataFrame.

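Method 1: Using the _append() Method

The _append() method adds one or more rows to the end of a DataFrame. Note that it is a private method, as the leading underscore indicates: the public DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, and _append() is not part of the documented API, so it may change without notice. It is still convenient for quickly adding a few rows. Let’s add a single row to our DataFrame: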

new_row = {"Name": "Mike", "Age": 32, "City": "Chicago"}
df = df._append(new_row, ignore_index=True)
print(df)

Output:

   Name  Age         City
0  John   28     New York
1  Anna   22  Los Angeles
2  Mike   32      Chicago

The ignore_index=True parameter is required here. A row passed as a dict has no index label of its own, so _append() raises a TypeError unless ignore_index=True is set. With it, the result is renumbered from 0, and the new row receives index 2.
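
The same single-row append can also be written with the public pd.concat() function, which avoids relying on a private method. The following is a minimal sketch; the variable name new_row_df is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna'],
    'Age': [28, 22],
    'City': ['New York', 'Los Angeles']
})

new_row = {"Name": "Mike", "Age": 32, "City": "Chicago"}

# Wrap the dict in a list so pandas builds a one-row DataFrame
new_row_df = pd.DataFrame([new_row])

# ignore_index=True renumbers the combined rows starting from 0
df = pd.concat([df, new_row_df], ignore_index=True)
print(df)
```

Wrapping the dict in a list matters: a bare dict of scalar values is not enough for the DataFrame constructor to know how many rows to create.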

Method 2: Using concat() Function

While _append() is suitable for adding a few rows, pd.concat() is the public, documented way to append multiple rows or another DataFrame. Starting again from the original two-row DataFrame, here’s how you can use it:

new_rows = pd.DataFrame({
    'Name': ['Emily', 'Dan'],
    'Age': [25, 30],
    'City': ['Denver', 'Boston']
})

df = pd.concat([df, new_rows], ignore_index=True)
print(df)

Output:

    Name  Age         City
0   John   28     New York
1   Anna   22  Los Angeles
2  Emily   25       Denver
3    Dan   30       Boston

This method is particularly useful when dealing with larger datasets or when you need to combine two DataFrames.
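
To see why the index needs attention after concatenating, the sketch below compares the result with and without renumbering. The names kept and renumbered are illustrative only, and the City column is left out to keep the example short.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Anna'], 'Age': [28, 22]})
new_rows = pd.DataFrame({'Name': ['Emily', 'Dan'], 'Age': [25, 30]})

# Without ignore_index, each input keeps its own labels, so 0 and 1 repeat
kept = pd.concat([df, new_rows])

# With ignore_index=True, the result gets a fresh 0..n-1 index
renumbered = pd.concat([df, new_rows], ignore_index=True)

print(kept.index.tolist())        # [0, 1, 0, 1]
print(renumbered.index.tolist())  # [0, 1, 2, 3]
```

Duplicate labels are legal in pandas, but a lookup such as kept.loc[0] then returns two rows instead of one, which is rarely what you want.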

Method 3: Adding Multiple Rows from Various Sources (e.g., a list of dicts)

Rows can also come from other sources such as lists, dictionaries, or other DataFrames. The details vary with the source format, but the pattern is the same: convert the source into a DataFrame, then concatenate it. Starting again from the original two-row DataFrame, here’s an example of appending a list of dictionaries:

more_rows = [
    {"Name": "Lucas", "Age": 27, "City": "Seattle"},
    {"Name": "Emma", "Age": 33, "City": "Austin"},
]
df = pd.concat([df, pd.DataFrame(more_rows)], ignore_index=True)
print(df)

Output:

    Name  Age         City
0   John   28     New York
1   Anna   22  Los Angeles
2  Lucas   27      Seattle
3   Emma   33       Austin

Because the DataFrame constructor accepts many input formats, this convert-then-concatenate pattern covers most sources of new rows.
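
As another illustration of a different source format, the sketch below appends rows stored as plain tuples. Tuples carry no column names, so the example assumes they follow the same column order as the existing DataFrame; the names tuple_rows, from_tuples, and result are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna'],
    'Age': [28, 22],
    'City': ['New York', 'Los Angeles']
})

# Each tuple is one row, in the same column order as df (an assumption)
tuple_rows = [("Lucas", 27, "Seattle"), ("Emma", 33, "Austin")]

# Reuse the existing column names so the two DataFrames line up
from_tuples = pd.DataFrame(tuple_rows, columns=df.columns)

result = pd.concat([df, from_tuples], ignore_index=True)
print(result)
```

Passing columns=df.columns is what makes the rows align. Without it, the new DataFrame would get the default column names 0, 1, 2 and the concatenation would create extra columns filled with NaN.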

Method 4: Using reindex() and loc to Insert a Row at a Specific Position

This technique is included for reference only. It is more verbose than the other approaches, but it shows how to place a row at a specific position rather than at the end.

import pandas as pd

# Initial DataFrame
df = pd.DataFrame(
    {"Name": ["John", "Anna"], "Age": [28, 22], "City": ["New York", "Los Angeles"]}
)

# Let's assume we want to insert "Sophia" at index 1
new_index = df.index.tolist()  # Convert index to a list
new_index.insert(1, "new")  # Insert a placeholder for the new row's index

# Reindex the DataFrame to include the new index
df = df.reindex(new_index)

# Insert the new row data at the 'new' index
df.loc["new"] = ["Sophia", 29, "Miami"]

# Reset the index to maintain a continuous integer sequence
df = df.reset_index(drop=True)

print(df)

Output:

     Name   Age         City
0    John  28.0     New York
1  Sophia  29.0        Miami
2    Anna  22.0  Los Angeles

In this approach:

  • We first convert the DataFrame’s index to a list and insert a placeholder for the new row’s intended position.
  • We then reindex the DataFrame using this new index list, which creates a space for the new row.
  • We assign the new row’s data to the placeholder index.
  • Finally, we reset the index of the DataFrame to ensure it has a continuous integer sequence, removing the placeholder.

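Note that reindexing creates a temporary row of NaN values, which converts the Age column to floats. That is why the output shows 28.0 rather than 28.

This method is somewhat unconventional for inserting rows and can lead to confusion, especially with more complex data manipulations. Since DataFrame.append() was removed in pandas 2.0, it’s generally recommended to use pd.concat() for adding rows, as it is more intuitive and less prone to errors or unexpected behaviors.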
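
If the goal is to place a row at a given position, one alternative is to slice the DataFrame with iloc and concatenate the pieces around the new row. This is a sketch of a different technique, not a variant of the code above; the position variable is a hypothetical name for the target location.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["John", "Anna"], "Age": [28, 22], "City": ["New York", "Los Angeles"]}
)

position = 1  # hypothetical target position for the new row
new_row = pd.DataFrame([{"Name": "Sophia", "Age": 29, "City": "Miami"}])

# Rows before the position, then the new row, then the remaining rows
df = pd.concat(
    [df.iloc[:position], new_row, df.iloc[position:]],
    ignore_index=True,
)
print(df)
```

Because no placeholder row of NaN values is ever created, the Age column keeps its integer type in this version.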

Handling Data Types and Non-Uniform Data

When appending rows, it’s essential to consider the data types and structure of the incoming data. Pandas aligns incoming rows to the existing columns by name and picks a common data type for each column automatically. That convenience can hide problems: a column missing from the new rows is filled with NaN, which turns an integer column into floats, and mixing numbers with strings produces an object column. In such cases, you might need to explicitly define or convert data types to ensure consistency across the DataFrame.
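
The sketch below shows one such case and one possible fix. The incoming row has no Age value, and the example converts the column to the nullable Int64 type afterwards; whether that type suits your data is an assumption to check against your own pipeline. The names incomplete, combined, and dtype_after_concat are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Anna'], 'Age': [28, 22]})

# The incoming row has no 'Age' key, so pandas fills the gap with NaN
incomplete = pd.DataFrame([{"Name": "Mike"}])
combined = pd.concat([df, incomplete], ignore_index=True)

# NaN is a float, so the integer column has been converted to float64
dtype_after_concat = combined["Age"].dtype
print(dtype_after_concat)

# Nullable Int64 stores whole numbers and marks the gap as <NA>
combined["Age"] = combined["Age"].astype("Int64")
print(combined)
```

Checking dtypes right after an append is a cheap way to catch this kind of silent conversion before it affects later calculations.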

Performance Considerations

Appending rows is convenient, but it has performance implications, especially when working with large DataFrames. Every call to concat() or _append() builds a new DataFrame and copies the existing data, so appending inside a loop gets slower as the DataFrame grows. When many rows arrive one at a time, it is usually faster to collect them in a Python list and concatenate once at the end.
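
A minimal sketch of that collect-then-concatenate pattern follows. The incoming list stands in for whatever produces your records (a file, an API, a loop over results) and is purely hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Anna'], 'Age': [28, 22]})

# Hypothetical stream of incoming records
incoming = [("Mike", 32), ("Emily", 25), ("Dan", 30)]

# Collect plain dicts first; appending to a Python list is cheap
rows = []
for name, age in incoming:
    rows.append({"Name": name, "Age": age})

# Build one DataFrame and concatenate once, instead of once per row
df = pd.concat([df, pd.DataFrame(rows)], ignore_index=True)
print(len(df))  # 5
```

The existing data is copied only once here, no matter how many rows arrive, whereas calling concat() inside the loop would copy the growing DataFrame on every iteration.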

Conclusion

Appending rows to a DataFrame is a fundamental part of manipulating data sets in Pandas. Whether you’re adding a single row or combining multiple DataFrames, knowing the available methods, along with their benefits and limitations, is crucial for effective data analysis. With practice, appending rows becomes a routine part of your data manipulation toolbox.