# Pandas: How to count non-NA/null values in a DataFrame (4 ways)

Updated: February 20, 2024 By: Guest Contributor

## Introduction

Pandas, a cornerstone library in Python for data manipulation and analysis, offers various approaches for handling missing data within a DataFrame. A common task when preprocessing data is counting non-NA/null values to assess data completeness or perform subsequent analysis. This tutorial will guide you through several methods to accomplish this, ranging from straightforward to more nuanced approaches.

## Preparing a Test DataFrame

Before we delve into counting non-NA/null values, let's create a simple DataFrame to work with:

```python
import pandas as pd
import numpy as np

# Sample DataFrame with some missing values
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emily'],
    'Age': [29, None, 25, np.nan, 42],
    'City': ['New York', 'Los Angeles', np.nan, 'Chicago', 'Austin'],
    'Profession': [None, 'Developer', 'Designer', 'Developer', np.nan]
})
print(df)
```

This DataFrame represents a snapshot of a dataset, including some missing values represented by `None` or `np.nan`.

## Method 1: count()

The simplest way to count non-NA/null values across each column is to use the `count()` method:

```python
# Counting non-null values in each column
df.count()
```

This method returns a Series with the count of non-null values per column. By default it counts down each column (`axis=0`); passing `axis=1` counts non-null values across each row instead. It does not, however, report data types alongside the counts.
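Note that `count()` accepts an `axis` argument, so per-row counts are available as well. A minimal sketch using the sample DataFrame from above:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emily'],
    'Age': [29, None, 25, np.nan, 42],
    'City': ['New York', 'Los Angeles', np.nan, 'Chicago', 'Austin'],
    'Profession': [None, 'Developer', 'Designer', 'Developer', np.nan]
})

# Column-wise counts (the default, axis=0)
print(df.count())

# Row-wise counts: number of non-null cells in each row
print(df.count(axis=1))
```

In this particular sample, every row happens to contain exactly three non-null cells, while the column counts differ (`Name` has 5, `City` has 4, `Age` and `Profession` have 3).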

## Method 2: notnull() with sum()

A more granular approach can be achieved by combining `notnull()` with `sum()`:

```python
# Counting non-null values using notnull() and sum()
print(df.notnull().sum())
```

Since `notnull()` returns a boolean DataFrame, `sum()` can aggregate it column-wise (the default) or row-wise with `axis=1`. This makes the approach particularly useful for more detailed analytical tasks.
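The boolean DataFrame can also be summed twice to get a grand total of non-null cells. A short sketch, reusing the sample DataFrame:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emily'],
    'Age': [29, None, 25, np.nan, 42],
    'City': ['New York', 'Los Angeles', np.nan, 'Chicago', 'Austin'],
    'Profession': [None, 'Developer', 'Designer', 'Developer', np.nan]
})

# Per-column counts (equivalent to df.count())
print(df.notnull().sum())

# Per-row counts
print(df.notnull().sum(axis=1))

# Grand total of non-null cells in the whole DataFrame
print(df.notnull().sum().sum())
```

Here the grand total is 15 (5 + 3 + 4 + 3 across the four columns).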

## Method 3: Using info()

The `info()` method is useful for getting a quick summary of the DataFrame, including the count of non-NA values:

```python
# Using info() to get summary details
df.info()
```

Although `info()` prints its summary to standard output rather than returning an object suitable for further manipulation, it is highly efficient for an overall glance at your data, showing non-null counts, dtypes, and memory usage in one call.
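If you do need the `info()` text programmatically, pandas can redirect it into a buffer via the `buf` parameter. A minimal sketch:

```python
import io
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emily'],
    'Age': [29, None, 25, np.nan, 42],
    'City': ['New York', 'Los Angeles', np.nan, 'Chicago', 'Austin'],
    'Profession': [None, 'Developer', 'Designer', 'Developer', np.nan]
})

# Capture the summary as a string instead of printing it
buffer = io.StringIO()
df.info(buf=buffer)
summary = buffer.getvalue()
print(summary)
```

The captured string still contains the per-column "non-null" counts, so you could parse it, but for anything beyond a quick look the `count()` or `notnull().sum()` approaches above are more convenient.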

## Advanced Methods

### Custom Functions

For customized needs, you can wrap these methods within your own functions to calculate specific metrics or for more complex filtering conditions:

```python
def count_non_null(df, col_name):
    """Count non-NA/null values in a specific column."""
    return df[col_name].notnull().sum()

# Example usage
count = count_non_null(df, 'Age')
print(f'Non-null Age values count: {count}')
```

This custom function example highlights how we can extend built-in methods to fit specific needs.

### Combining Multiple Conditions

For analyses that require counting non-null values under specific conditions, Pandas conditional selection can be paired with `notnull()` seamlessly:

```python
# Counting non-null 'Age' values for users from 'New York'
count = df[df['City'] == 'New York']['Age'].notnull().sum()
print(f'Non-null Age values for New Yorkers: {count}')
```

These advanced methods and customizations underscore the flexibility of pandas when working with missing data.

## Conclusion

Counting non-NA/null values in a Pandas DataFrame is a foundational skill for data analysis and can be achieved through multiple methods depending on the need. Whether it's a quick count with `count()`, a detailed analysis using `notnull()` and `sum()`, or a custom function for specific conditions, Pandas provides the tools necessary for effective data manipulation. Mastering these techniques ensures a solid groundwork for handling missing values in your data exploration and cleaning processes.
