Sling Academy
Pandas: How to iterate over rows in a DataFrame (6 examples)

Last updated: February 24, 2024

Introduction

In data analysis and manipulation with Python, Pandas is one of the most popular libraries due to its powerful and flexible data structures. A common task you may encounter is the need to iterate over rows in a DataFrame. This can be for data transformation, analysis, or even generating insights. In this tutorial, we’ll explore six methods to iterate over rows in a Pandas DataFrame, ranging from basic to advanced techniques.

Setting Up Your DataFrame

Before diving into the examples, let’s set up a simple DataFrame to use throughout this tutorial:

import pandas as pd
data = {
  'Name': ['John', 'Anna', 'Peter', 'Linda'],
  'Age': [28, 34, 29, 32],
  'City': ['New York', 'Paris', 'Berlin', 'London']
}
df = pd.DataFrame(data)
print(df)

This DataFrame contains names, ages, and cities of four individuals:

    Name  Age      City
0   John   28  New York
1   Anna   34     Paris
2  Peter   29    Berlin
3  Linda   32    London

Example 1: Iterating with iterrows()

One of the simplest ways to iterate over DataFrame rows is by using the iterrows() method. This yields the index and row data as a Series for each row.

for index, row in df.iterrows():
    print(index, row["Name"], row["Age"], row["City"])
    print('---') # Add a separator between rows

Output:

0 John 28 New York
---
1 Anna 34 Paris
---
2 Peter 29 Berlin
---
3 Linda 32 London
---

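iterrows() is convenient for quick inspections of small DataFrames. It is slow on large ones because it builds a new Series for every row. Since each row is a single Series, values may also be upcast to a common dtype (for example, integers become floats when the row also contains floats).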
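One detail worth checking is what happens when you assign to row inside the loop. The sketch below rebuilds the tutorial's DataFrame so it runs on its own; the column name Age_next_year is made up for illustration. For a mixed-dtype DataFrame like this one, each row is a copy, so edits to it never reach df:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, 34, 29, 32],
    'City': ['New York', 'Paris', 'Berlin', 'London'],
})

# Assigning to the row Series edits a copy, not df itself.
for index, row in df.iterrows():
    row['Age'] = row['Age'] + 1

ages_after_loop = df['Age'].tolist()  # still the original ages

# A safer pattern: collect per-row results, then assign the column in one step.
next_year = []
for index, row in df.iterrows():
    next_year.append(row['Age'] + 1)

df['Age_next_year'] = next_year  # hypothetical column name
print(df)
```

Building the list first and assigning it afterwards avoids modifying the DataFrame while it is being iterated.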

Example 2: Using itertuples()

The itertuples() method is a faster alternative to iterrows() and returns named tuples of the data.

for row in df.itertuples():
    print(row.Index, row.Name, row.Age, row.City)

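This approach is usually faster than iterrows() and keeps each column's dtype. The named tuples are read-only copies of the row values, so assigning to them raises an error and would not change the DataFrame anyway. To store results, write them to a new column or use a vectorized operation.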
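itertuples() also accepts index and name parameters. The following is a minimal sketch, assuming the same sample data; the tuple name Person is an arbitrary choice for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, 34, 29, 32],
    'City': ['New York', 'Paris', 'Berlin', 'London'],
})

# index=False drops the Index field; name sets the namedtuple's type name.
rows = list(df.itertuples(index=False, name='Person'))
first = rows[0]
print(first)

# Named tuples are immutable, so attribute assignment fails.
try:
    first.Age = 99
except AttributeError:
    print('namedtuple rows are read-only')

# Rows behave like ordinary Python objects, e.g. as input to max().
oldest = max(rows, key=lambda person: person.Age)
print(oldest.Name)
```

Because the rows are plain tuples, they work well with built-ins such as max(), sorted(), or list comprehensions.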

Example 3: Apply Functions

The apply() method is very powerful for applying a function along an axis of the DataFrame (rows in this case).

df.apply(lambda x: print(x['Name'], x['Age'], x['City']), axis=1)

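This approach is more Pandas-centric, but with axis=1 it still calls the Python function once per row, so it is not vectorized and is usually not much faster than an explicit loop. Because print() returns None, the call also returns a Series of None values, which you can ignore here.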

Output:

John 28 New York
Anna 34 Paris
Peter 29 Berlin
Linda 32 London
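apply() is more typically used to return a value per row than to print. A small sketch under that assumption, with a hypothetical helper describe() and a hypothetical Summary column:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, 34, 29, 32],
    'City': ['New York', 'Paris', 'Berlin', 'London'],
})

def describe(row):
    # row is a Series holding one DataFrame row (because axis=1).
    return f"{row['Name']} is {row['Age']} and lives in {row['City']}"

# The return values are gathered into a new Series aligned with df's index.
df['Summary'] = df.apply(describe, axis=1)
print(df['Summary'].tolist())
```

The function's return values become a Series with the same index as df, so the result can be assigned directly as a column.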

Example 4: Vectorized Operations

For purely computational tasks, direct vectorized operations on columns are preferred due to their high efficiency. Here’s an example:

df['Age_plus_one'] = df['Age'] + 1
print(df)

This operation adds 1 to each value in the ‘Age’ column without explicitly iterating over each row.
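Vectorized operations also cover comparisons and filtering, not only arithmetic. A brief sketch on the same sample data; the column name Over_30 is illustrative only:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, 34, 29, 32],
    'City': ['New York', 'Paris', 'Berlin', 'London'],
})

# A comparison on a column yields a boolean Series, one value per row.
df['Over_30'] = df['Age'] > 30

# The boolean Series can select rows without an explicit loop.
names_over_30 = df.loc[df['Over_30'], 'Name'].tolist()
print(names_over_30)
```

The comparison and the selection both run over whole columns at once, which is what makes this style faster than a Python-level loop.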

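Example 5: Using map() (formerly applymap()) for Element-wise Operations

While not strictly a row operation, DataFrame.map() applies a function to every element of a DataFrame. It was called applymap() before pandas 2.1, and that older name is now deprecated. If your task requires transforming each element individually, consider this:

upper = df[['Name', 'City']].map(str.upper)  # on pandas < 2.1, use .applymap(str.upper)
print(upper)

This returns a new DataFrame in which all strings from the ‘Name’ and ‘City’ columns are uppercase; df itself is unchanged.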
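For string columns specifically, the vectorized .str accessor is an alternative to calling a Python function on each element. A minimal sketch, assuming the same sample data:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, 34, 29, 32],
    'City': ['New York', 'Paris', 'Berlin', 'London'],
})

# assign() returns a new DataFrame; the original df is left untouched.
upper = df.assign(
    Name=df['Name'].str.upper(),
    City=df['City'].str.upper(),
)
print(upper)
```

This form handles one column at a time and also copes with missing values, which str.upper would reject if called on them directly.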

Example 6: The transform() Method

Another option is transform(), which applies a function to a Series or DataFrame and returns a result with the same shape and index as the input. That makes it convenient for adding derived columns.

df['Name_length'] = df['Name'].transform(lambda x: len(x))
print(df)

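This adds a column showing the length of each name. For plain string lengths, df['Name'].str.len() gives the same result. transform() is most useful together with groupby(), where it broadcasts a per-group result back onto the original rows.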
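The grouped use of transform() can be sketched as follows. The Team column and its values are invented for this illustration and are not part of the tutorial's data:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, 34, 29, 32],
    'Team': ['A', 'B', 'A', 'B'],  # hypothetical grouping column
})

# Compute the mean age per team, repeated on every row of that team.
df['Team_mean_age'] = df.groupby('Team')['Age'].transform('mean')

# Each person's age relative to their team's average.
df['Age_vs_team'] = df['Age'] - df['Team_mean_age']
print(df)
```

Unlike an aggregation, which would return one row per team, transform() keeps one value per original row, so the result lines up with df.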

Conclusion

Iterating over rows in a DataFrame is a common task in data analysis with Pandas. The method you choose depends on the specific requirements of your task, such as the need for speed, simplicity, or direct data modification. Understanding these six methods provides a robust toolkit for handling various data iteration and transformation tasks effectively.
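To see how speed differs on your own machine, a rough timing sketch like the one below may help. The DataFrame big, its Bonus column, and the three function names are made up for the comparison, and the timings will vary with hardware and pandas version:

```python
import timeit

import pandas as pd

# Hypothetical larger DataFrame, used only for timing.
big = pd.DataFrame({'Age': range(5_000), 'Bonus': range(5_000)})

def with_iterrows():
    return sum(row['Age'] + row['Bonus'] for _, row in big.iterrows())

def with_itertuples():
    return sum(row.Age + row.Bonus for row in big.itertuples(index=False))

def vectorized():
    return int((big['Age'] + big['Bonus']).sum())

# Each function computes the same total; only the elapsed time differs.
for fn in (with_iterrows, with_itertuples, vectorized):
    seconds = timeit.timeit(fn, number=3)
    print(f'{fn.__name__}: {seconds:.4f} s for 3 runs')
```

On most setups the vectorized version should finish well ahead of the two loops, with itertuples() ahead of iterrows(), though it is worth measuring with data shaped like your own.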
