Pandas

Introduction
Getting Started
Basic Usage
Axis Parameter
Skipping NaN Values
Using skipna
Aggregating Minimum Values
Advanced Usage: Custom Functions
Conclusion

Introduction

Pandas is a popular Python library for data analysis and manipulation. Whether you’re dealing with large datasets or just need to perform quick data transformations, Pandas provides a comprehensive set of tools to accomplish your tasks efficiently. The DataFrame.min() method is one of these useful tools, allowing users to easily compute the minimum value along a specific axis of the DataFrame. This tutorial provides an in-depth guide to using the DataFrame.min() method, complete with various examples ranging from basic to advanced use cases.

Getting Started

Before diving into the DataFrame.min() method, ensure you have Pandas installed in your Python environment:

pip install pandas

Once installed, you can import Pandas and create a simple DataFrame to get started:

import pandas as pd
df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': [5, 6, None, 8],
    'C': [9, 10, 11, 12]
})
print(df)

Output:

   A    B   C
0  1  5.0   9
1  2  6.0  10
2  3  NaN  11
3  4  8.0  12

Basic Usage

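The most straightforward use of DataFrame.min() is to call it with no arguments, which returns the minimum of each column as a Series (not a single value for the whole DataFrame). Whether non-numeric columns are included depends on the numeric_only parameter, whose default differs between pandas versions; pass numeric_only=True to restrict the result to numeric columns. The sample DataFrame is entirely numeric, so no extra arguments are needed here: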

print(df.min())

Output:

A    1.0
B    5.0
C    9.0
dtype: float64
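
As a minimal sketch of the two points above, the snippet below adds a text column to show numeric_only=True, then reduces the per-column result once more to get a single overall minimum. The `mixed` frame and its `label` column are illustrative assumptions, not part of the original example.

```python
import pandas as pd

# Hypothetical frame: two of the tutorial's columns plus an illustrative text column.
mixed = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': [5, 6, None, 8],
    'label': ['w', 'x', 'y', 'z'],  # assumed column, not in the original example
})

# Restrict the reduction to numeric columns explicitly.
numeric_min = mixed.min(numeric_only=True)
print(numeric_min)

# Reduce the per-column minimums again to get one scalar for the numeric data.
overall_min = numeric_min.min()
print(overall_min)
```

Passing numeric_only explicitly keeps the result the same regardless of which default the installed pandas version uses.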

Axis Parameter

The axis parameter controls the direction of the computation: axis=0 (the default) takes the minimum down each column, while axis=1 takes the minimum across each row:

print(df.min(axis=0))
print(df.min(axis=1))

Output:

A    1.0
B    5.0
C    9.0
dtype: float64

0    1.0
1    2.0
2    3.0
3    4.0
dtype: float64

As seen, specifying axis=0 (default) returns the minimum value in each column, while axis=1 returns the minimum value for each row.
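
The axis argument also accepts the string aliases 'index' and 'columns', which some readers find easier to remember than 0 and 1. The sketch below rebuilds the sample DataFrame and checks that both spellings give the same result.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': [5, 6, None, 8],
    'C': [9, 10, 11, 12]
})

# 'index' is an alias for axis=0: one minimum per column.
col_min = df.min(axis='index')

# 'columns' is an alias for axis=1: one minimum per row.
row_min = df.min(axis='columns')

print(col_min.equals(df.min(axis=0)))
print(row_min.equals(df.min(axis=1)))
```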

Skipping NaN Values

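The DataFrame.min() method skips NaN (Not a Number) values by default, so missing data does not affect the computed minimum. Row 2 of column 'B' is missing, yet the column minimum is still 5.0:

df.loc[2, 'B'] = float('nan')  # set the missing value with .loc rather than chained indexing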
print(df.min())

Output:

A    1.0
B    5.0
C    9.0
dtype: float64

Using skipna

Though skipping NaN values is the default behavior, this can be adjusted using the skipna parameter:

print(df.min(skipna=False))

Setting skipna=False stops the method from ignoring NaN values, so any column (or row, with axis=1) that contains a NaN returns NaN. In the sample DataFrame, column 'B' returns NaN while 'A' and 'C' are unaffected.
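
The following sketch applies skipna=False along both axes on the sample data, so the effect of the single missing value is visible in each direction. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': [5, 6, None, 8],
    'C': [9, 10, 11, 12]
})

# Column-wise: only column 'B' contains a NaN, so only 'B' becomes NaN.
strict_cols = df.min(skipna=False)
print(strict_cols)

# Row-wise: only row 2 contains a NaN, so only row 2 becomes NaN.
strict_rows = df.min(axis=1, skipna=False)
print(strict_rows)
```

This can be useful as a quick check: a NaN in the result signals that the corresponding column or row has missing data.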

Aggregating Minimum Values

You can also restrict the computation to a subset of columns and store the row-wise minimum in a new column:

df['Min_A_B'] = df[['A', 'B']].min(axis=1)
print(df)

Output:

   A    B   C  Min_A_B
0  1  5.0   9      1.0
1  2  6.0  10      2.0
2  3  NaN  11      3.0
3  4  8.0  12      4.0

Here, a new column is created to store the smaller of columns ‘A’ and ‘B’ for each row. In row 2, where ‘B’ is NaN, the missing value is skipped and the value from ‘A’ is used.
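
If you also need to know which column supplied each row's minimum, DataFrame.idxmin() is a possible companion to min(). This is a small sketch on the same sample data; the `pair` and `source` names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': [5, 6, None, 8],
    'C': [9, 10, 11, 12]
})

pair = df[['A', 'B']]

# Row-wise minimum over the two selected columns.
row_min = pair.min(axis=1)

# Label of the column holding that minimum in each row (NaN is skipped by default).
source = pair.idxmin(axis=1)

print(pd.DataFrame({'Min_A_B': row_min, 'Source': source}))
```

In this data, column 'A' is always the smaller of the two, so every row reports 'A' as the source.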

Advanced Usage: Custom Functions

For custom computations, you can combine DataFrame.apply() with min(). With axis=1, apply() passes each row to the function as a Series, so the lambda below calls Series.min() on the transformed row. For instance, you might want the minimum of each row after applying a specific transformation:

df['Adjusted Min'] = df[['A', 'C']].apply(lambda x: (x - 1).min(), axis=1)
print(df)

Output:

   A    B   C  Min_A_B  Adjusted Min
0  1  5.0   9      1.0             0
1  2  6.0  10      2.0             1
2  3  NaN  11      3.0             2
3  4  8.0  12      4.0             3

Here, 1 is subtracted from the values in columns ‘A’ and ‘C’ before the row-wise minimum is taken. Any other transformation can be substituted inside the lambda.
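
For a simple arithmetic transformation like this one, apply() is not strictly required. As a sketch, the same result can be obtained by transforming the whole selection first and then calling min(axis=1), which avoids running a Python function once per row.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': [5, 6, None, 8],
    'C': [9, 10, 11, 12]
})

# Row-by-row version, as in the example above.
via_apply = df[['A', 'C']].apply(lambda row: (row - 1).min(), axis=1)

# Vectorized version: subtract from the whole selection, then reduce each row.
vectorized = (df[['A', 'C']] - 1).min(axis=1)

print(via_apply.equals(vectorized))
```

Keep apply() for transformations that cannot be written as whole-column operations.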

Conclusion

The DataFrame.min() method in Pandas is a powerful tool for summarizing and analyzing datasets. By understanding its basic usage, exploring the effects of different parameters, and applying it in more advanced scenarios, you can harness the full potential of this function to derive meaningful insights from your data.
