Pandas – DataFrame.mode() method (5 examples)

Updated: February 24, 2024 By: Guest Contributor

Introduction

When working with data in Python, the Pandas library stands out as a powerful tool for data manipulation and analysis. One of the useful methods provided by this library is the DataFrame.mode() method, which is particularly helpful when you need to find the most frequent values across your data set. In this tutorial, we’ll explore the DataFrame.mode() method through five practical examples. We will start with basic usage and gradually move to more advanced examples, showing the versatility of this method.

What is DataFrame.mode() Used for?

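The mode() method returns the mode(s), that is, the most frequent value or values, along the selected axis. With the default axis=0 (or 'index'), it computes the mode of each column: the result keeps the original column labels, and its rows are numbered from 0, one row per mode. With axis=1 (or 'columns'), it computes the mode of each row: the result keeps the original row index, and its columns are numbered from 0. When there are multiple modes, mode() returns all of them in sorted order, and any column (or row) with fewer modes than the others is padded with NaN.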
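The shape of the result is easiest to see on a tiny DataFrame. The sketch below uses made-up column names and values (they are assumptions for illustration only): column 'a' has two tied modes while column 'b' has one, so the result has two rows and 'b' is padded with NaN.

```python
import pandas as pd

# Hypothetical data: 'a' has two tied modes (1 and 2), 'b' has a single mode (5)
df = pd.DataFrame({'a': [1, 1, 2, 2], 'b': [5, 5, 6, 7]})

# axis=0 is the default, so the mode is computed per column
result = df.mode()
print(result)

# The second row of 'b' is NaN because 'b' has only one mode
print(result['b'].isna().tolist())
```

Because of the NaN padding, a column of integers can come back as floats, as 'b' does here.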

Example 1: Finding the Mode of a Single Column

import pandas as pd

# Create a sample DataFrame
data = {'Name': ['John', 'Ana', 'Peter', 'John', 'John']}
df = pd.DataFrame(data)

# Find the mode of the 'Name' column
mode_result = df['Name'].mode()
print(mode_result)

Output:

0    John
Name: Name, dtype: object

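This basic example finds the most frequent name in the ‘Name’ column. Selecting a single column with df['Name'] gives a Series, so the call here is Series.mode(), the single-column counterpart of DataFrame.mode(). The mode is “John” because it appears three times, more often than any other name.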
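Since mode() always returns a Series, even when there is only one mode, you often need one more step to get a plain value. The following is a minimal sketch, assuming you are happy to take the first entry; it also uses value_counts() to show the frequencies behind the result.

```python
import pandas as pd

names = pd.Series(['John', 'Ana', 'Peter', 'John', 'John'], name='Name')

# mode() returns a Series; .iloc[0] extracts the first (here, the only) mode
top_name = names.mode().iloc[0]

# value_counts() lists how often each value occurs
counts = names.value_counts()

print(top_name)
print(counts)
```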

Example 2: Mode of Each Column in a DataFrame

import pandas as pd

# Create another sample DataFrame with multiple types of data
data = {'Name': ['John', 'Ana', 'Peter', 'John'],
        'Age': [24, 30, 22, 24],
        'City': ['New York', 'Los Angeles', 'New York', 'Miami']}
df = pd.DataFrame(data)

# Find the mode for each column
each_mode = df.mode()
print(each_mode)

Output:

   Name  Age      City
0  John   24  New York

This example highlights how to calculate the mode for each column. It is useful when you want to find common patterns across different fields in your data set.
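If only the numeric columns are of interest, DataFrame.mode() accepts a numeric_only parameter. The sketch below rebuilds a similar DataFrame and restricts the calculation to the numeric 'Age' column.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Ana', 'Peter', 'John'],
                   'Age': [24, 30, 22, 24],
                   'City': ['New York', 'Los Angeles', 'New York', 'Miami']})

# numeric_only=True skips the string columns 'Name' and 'City'
numeric_modes = df.mode(numeric_only=True)
print(numeric_modes)
```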

Example 3: Handling Multiple Modes

import pandas as pd

# Suppose we have a DataFrame with multiple potential modes
data = {'Size': ['Small', 'Medium', 'Large', 'Medium', 'Small']}
df = pd.DataFrame(data)

# Find the mode
modes = df['Size'].mode()
print(modes)

Output:

0    Medium
1     Small
Name: Size, dtype: object

In this example, both ‘Small’ and ‘Medium’ appear twice, making them both modes of the ‘Size’ column. The mode() function handles such ties by returning every tied value, in sorted order.
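When downstream code expects a single value, a tie has to be resolved somehow. Here is a minimal sketch on a small Series with two tied values; taking the first entry is just one possible tie-break (an assumption of this sketch, not something pandas prescribes).

```python
import pandas as pd

sizes = pd.Series(['Small', 'Medium', 'Large', 'Medium', 'Small'], name='Size')

modes = sizes.mode()

# All tied modes as a plain Python list (mode() returns them sorted)
all_modes = modes.tolist()

# More than one entry means the data is multimodal
is_multimodal = len(modes) > 1

# One possible tie-break: take the first mode in sorted order
first_mode = modes.iloc[0]

print(all_modes, is_multimodal, first_mode)
```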

Example 4: Mode Along a Different Axis

import pandas as pd

# Create a DataFrame with numerical values
data = {'Test1': [88, 92, 100, 92],
        'Test2': [92, 100, 88, 100],
        'Test3': [100, 88, 92, 92]}
df = pd.DataFrame(data)

# Find the mode along axis 1 (rows)
row_modes = df.mode(axis=1)
print(row_modes)

Output:

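      0     1      2
0  88.0  92.0  100.0
1  88.0  92.0  100.0
2  88.0  92.0  100.0
3  92.0   NaN    NaN

This example shifts the focus from columns to rows, calculating the mode for each row rather than each column. In the first three rows, the scores 88, 92 and 100 each appear once, so all three are tied modes and mode() returns them in sorted order. In the last row, 92 appears twice and is the only mode, so the remaining cells are padded with NaN, which is also why the values are displayed as floats. Row-wise modes are useful when you want to find values that repeat across different measurements or tests.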
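A common follow-up is to attach one row-wise mode back onto the original data. The sketch below uses hypothetical scores and a hypothetical 'RowMode' column name; it takes column 0 of the result, which holds the smallest mode of each row when several values tie.

```python
import pandas as pd

# Hypothetical scores: rows 0 and 1 have a repeated value, row 2 has none
scores = pd.DataFrame({'Test1': [88, 92, 70],
                       'Test2': [88, 75, 80],
                       'Test3': [95, 92, 90]})

row_modes = scores.mode(axis=1)

# Column 0 is always filled; later columns hold further tied modes or NaN
scores_with_mode = scores.assign(RowMode=row_modes[0])

print(row_modes)
print(scores_with_mode)
```

In row 2 every score appears once, so all three are tied modes and 'RowMode' simply receives the smallest of them. Whether that is a sensible tie-break depends on your data.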

Example 5: Excluding NA/NaN Values

import pandas as pd
import numpy as np

# Create a DataFrame with some missing values
data = {'Scores': [90, np.nan, 88, 90, 88, np.nan]}
df = pd.DataFrame(data)

# Find the mode excluding NA/NaN values
mode_no_na = df['Scores'].mode(dropna=True)
print(mode_no_na)

Output:

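0    88.0
1    90.0
Name: Scores, dtype: float64

With dropna=True, mode() ignores NA/NaN values when counting. This is the default, so the call is equivalent to df['Scores'].mode(). Here 88 and 90 each appear twice, so both are returned. With dropna=False, NaN is counted like any other value, and since it also appears twice in this column, it would be returned as a third mode. Keeping missing values out of the count is particularly useful in the data cleaning and preprocessing stages of data analysis.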
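To see what the dropna parameter changes, it helps to run both settings side by side. This is a small sketch on the same kind of data, with NaN occurring as often as the other tied values.

```python
import numpy as np
import pandas as pd

scores = pd.Series([90, np.nan, 88, 90, 88, np.nan], name='Scores')

# Default behavior (dropna=True): NaN is ignored when counting
default_modes = scores.mode()

# dropna=False: NaN is counted like any other value
with_nan = scores.mode(dropna=False)

print(default_modes)
print(with_nan)
```

With dropna=False the result includes NaN alongside 88.0 and 90.0, because all three occur twice.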

Conclusion

The DataFrame.mode() method in Pandas is versatile and powerful, enabling us to easily find the most frequent values in our data. Through these five examples, we have seen various applications, from basic usage to handling multiple modes and excluding missing values. Employing the mode() method can significantly simplify your data analysis process.