Sling Academy

Understanding DataFrame.mean() method in Pandas

Last updated: February 20, 2024


Data manipulation and analysis form the backbone of data science, and pandas is one of the most widely used Python libraries for these tasks. Its DataFrame.mean() method computes the arithmetic mean of a DataFrame along either axis, which makes it a staple of statistical analysis. This tutorial walks through DataFrame.mean() with a series of code examples.

First, ensure you have pandas installed:

pip install pandas

Syntax of DataFrame.mean()

The DataFrame.mean() method computes the mean of the values over the requested axis. If no axis is specified, it computes the column-wise mean. As of pandas 2.x, the signature is:

DataFrame.mean(axis=0, skipna=True, numeric_only=False, **kwargs)


  • axis: {0 or ‘index’, 1 or ‘columns’} – The axis to reduce. 0 returns one mean per column; 1 returns one mean per row.
  • skipna: Boolean, default True – Whether to exclude NA/null values.
  • numeric_only: Boolean, default False – Include only float, int, and boolean data.
  • level: Removed in pandas 2.0. In pandas 1.x it computed the mean along one level of a MultiIndex (hierarchical) axis; use DataFrame.groupby(level=...).mean() instead.
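To make the axis values concrete, here is a minimal sketch showing that the integer and string forms are interchangeable. The scores frame and its column names are hypothetical and exist only for illustration.

```python
import pandas as pd

# Hypothetical data used only to illustrate the axis parameter.
scores = pd.DataFrame({"math": [90.0, 80.0], "art": [70.0, 60.0]})

# axis=0 and axis="index" are equivalent: one mean per column.
by_column = scores.mean(axis=0)
same_by_column = scores.mean(axis="index")

# axis=1 and axis="columns" are equivalent: one mean per row.
by_row = scores.mean(axis=1)
same_by_row = scores.mean(axis="columns")

print(by_column)
print(by_row)
```

Reducing along axis 0 collapses the rows, so the result is indexed by column name; reducing along axis 1 collapses the columns, so the result is indexed by row label.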

Basic Usage

Starting with the basics, let’s create a simple DataFrame:

import pandas as pd

# Sample DataFrame
data = { 'A': [1, 2, 3], 'B': [4, 5, None], 'C': [7, 8, 9] }
df = pd.DataFrame(data)

This gives us:

   A    B  C
0  1  4.0  7
1  2  5.0  8
2  3  NaN  9

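Computing the column-wise mean (the NaN in column B is skipped by default, so B is the average of 4 and 5):

mean_values = df.mean()
print(mean_values)

Output:
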
A    2.0
B    4.5
C    8.0
dtype: float64
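The value 4.5 for column B depends on the default skipna=True. As a small sketch that rebuilds the same sample DataFrame, the comparison below shows what changes when missing values are not skipped:

```python
import pandas as pd

data = {'A': [1, 2, 3], 'B': [4, 5, None], 'C': [7, 8, 9]}
df = pd.DataFrame(data)

# Default behaviour: the NaN in B is ignored, so B is (4 + 5) / 2.
default_mean = df.mean()

# skipna=False: any column containing a NaN yields NaN.
strict_mean = df.mean(skipna=False)

print(default_mean["B"])
print(strict_mean["B"])
```

Columns A and C have no missing values, so their means are the same in both results.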

Advanced Usage

Moving to more complex examples, let’s look at the other parameters.

If your dataset contains non-numeric columns alongside numeric ones, pass numeric_only=True to compute the mean over the numeric columns only:

mean_values = df.mean(numeric_only=True)
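Since the sample DataFrame above holds only numbers, here is a minimal sketch with a mixed-type frame. The mixed frame, its column names, and its values are hypothetical stand-ins for a loaded dataset.

```python
import pandas as pd

# Hypothetical mixed-type data standing in for a loaded dataset.
mixed = pd.DataFrame({
    "city": ["Oslo", "Lima", "Pune"],   # non-numeric column
    "temp_c": [5.0, 19.0, 27.0],
    "humidity": [80, 70, 60],
})

# Only the numeric columns take part in the calculation.
numeric_means = mixed.mean(numeric_only=True)
print(numeric_means)
```

The city column is simply absent from the result. On pandas 2.x, leaving out numeric_only=True for a frame like this is expected to raise a TypeError, because strings cannot be averaged.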

Let’s calculate the row-wise mean, excluding NA values:

mean_rows = df.mean(axis=1, skipna=True)
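As a quick check, the sketch below rebuilds the sample DataFrame from the Basic Usage section and computes the row-wise means. The last row contains a NaN in column B, so its mean is taken over the two remaining values.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, None], 'C': [7, 8, 9]})

# One mean per row; the NaN in the last row is skipped.
mean_rows = df.mean(axis=1, skipna=True)
print(mean_rows.tolist())  # [4.0, 5.0, 6.0]

# With skipna=False the last row becomes NaN instead.
mean_rows_strict = df.mean(axis=1, skipna=False)
```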

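For datasets with hierarchical indices, group by an index level and take the mean of each group (the older df.mean(level=0) form was removed in pandas 2.0):

# Assuming df has a MultiIndex
mean_level = df.groupby(level=0).mean()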
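On pandas 2.0 and later, mean() no longer accepts level, and a per-level mean can be computed with groupby(level=...). Here is a minimal sketch; the sales frame and its two index levels (region, quarter) are hypothetical.

```python
import pandas as pd

# Hypothetical two-level index: region, then quarter.
index = pd.MultiIndex.from_tuples(
    [("east", "q1"), ("east", "q2"), ("west", "q1"), ("west", "q2")],
    names=["region", "quarter"],
)
sales = pd.DataFrame({"units": [10, 20, 30, 50]}, index=index)

# Mean per outer level (region) and per inner level (quarter).
mean_by_region = sales.groupby(level="region").mean()
mean_by_quarter = sales.groupby(level="quarter").mean()

print(mean_by_region)
print(mean_by_quarter)
```

Levels can be referenced by name, as here, or by position (level=0 for the outermost level).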

Working with Time Series Data

Pandas is also adept at handling time series data. If your dataset includes datetime indices, calculating the mean over specific time intervals becomes very simple.


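import numpy as np
import pandas as pd

# Generating sample time series data: 100 daily values indexed by date
dates = pd.date_range(start='1/1/2020', periods=100)
df_ts = pd.DataFrame({'value': np.random.random(size=100)}, index=dates)

# Monthly mean ('ME' = month end; use 'M' on pandas older than 2.2)
df_ts_monthly_mean = df_ts.resample('ME').mean()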
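Random data makes the result hard to verify by eye, so here is a deterministic sketch of the same idea. It assumes daily values 0 through 59 starting on 2020-01-01, and it uses the 'MS' (month start) alias, which labels each bin by the first day of the month and avoids the 'M' versus 'ME' difference between pandas versions.

```python
import numpy as np
import pandas as pd

# 60 daily values: 31 days of January 2020 plus 29 days of February 2020.
dates = pd.date_range(start="2020-01-01", periods=60, freq="D")
ts = pd.DataFrame({"value": np.arange(60, dtype=float)}, index=dates)

# Group the rows into calendar months and average each group.
monthly = ts.resample("MS").mean()
print(monthly)
```

January covers the values 0 to 30 and February covers 31 to 59, so the two monthly means are 15.0 and 45.0.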

Note: This example uses NumPy to generate random numbers. NumPy is installed alongside pandas as a dependency; if it is missing from your environment, install it with:

pip install numpy


Through this tutorial, we explored the DataFrame.mean() method in pandas, an essential tool for statistical analysis. Starting from basic examples and gradually moving to more complex scenarios, we discussed how to accurately compute means across different axes, data types, and structures. Armed with these insights, you’re now better equipped to handle a wide range of data analysis tasks.
