Pandas – DataFrame.cummin() method (5 examples)

Last updated: February 20, 2024

Introduction

The cummin() method in Pandas returns the cumulative (running) minimum along an axis of a DataFrame or Series: each element of the result is the smallest value seen so far, up to and including that position, so the result has the same shape as the input. The method belongs to the Descriptive Statistics functions available in Pandas. In this tutorial, we’ll explore how to use cummin() in several scenarios, from basic calls to missing data and filtering.

Getting Started

First and foremost, ensure you have Pandas installed in your environment:

pip install pandas

Once installed, import Pandas along with NumPy, which the missing-data example later in this tutorial uses for np.nan:

import numpy as np
import pandas as pd

Basic Usage of cummin()

To understand the basic functionality, let’s create a simple DataFrame:

df = pd.DataFrame({
    'A': [2, 3, 1, 4, 2],
    'B': [5, 3, 4, 2, 1]
})
print(df)

Applying cummin():

result = df.cummin()
print(result)

This computes the cumulative minimum down each column: every entry is replaced by the smallest value found in that column from the first row through the current row.
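As a quick check, the sketch below rebuilds the same df, annotates the output one would expect from cummin(), and compares column A against a plain Python running minimum. The helper name running_min is hypothetical and exists only for illustration; it is not part of Pandas.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [2, 3, 1, 4, 2],
    'B': [5, 3, 4, 2, 1]
})

result = df.cummin()
print(result)
#    A  B
# 0  2  5
# 1  2  3
# 2  1  3
# 3  1  2
# 4  1  1


def running_min(values):
    """Hypothetical helper: smallest value seen so far at each position."""
    out, current = [], None
    for v in values:
        current = v if current is None else min(current, v)
        out.append(current)
    return out


# The hand-rolled running minimum matches what cummin() returns for column A.
matches = running_min(df['A'].tolist()) == result['A'].tolist()
print(matches)  # True
```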

Column-wise and Row-wise Computation

You can specify the axis along which the cumulative minimum should be computed using the axis parameter:

result_col = df.cummin(axis=0)  # Default: running minimum down the rows of each column
result_row = df.cummin(axis=1)  # Running minimum across the columns of each row
print("Column-wise\n", result_col)
print("Row-wise\n", result_row)

The two calls answer different questions. With axis=0, the running minimum moves down the rows within each column. With axis=1, it moves left to right across the columns within each row, so the leftmost column is always unchanged.
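To make the row-wise case concrete, here is a small sketch using the same sample data as above. With only two columns, the row-wise result leaves A untouched and replaces B with the smaller of A and B in each row.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [2, 3, 1, 4, 2],
    'B': [5, 3, 4, 2, 1]
})

result_row = df.cummin(axis=1)
print(result_row)
#    A  B
# 0  2  2
# 1  3  3
# 2  1  1
# 3  4  2
# 4  2  1

# The first column has nothing to its left, so it is returned as-is.
first_col_unchanged = result_row['A'].tolist() == df['A'].tolist()
print(first_col_unchanged)  # True
```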

Working with Missing Data

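Handling missing data is an intrinsic part of data analysis, and cummin() has a skipna parameter for it. By default (skipna=True), NaN values are skipped: a NaN stays NaN at its own position in the output, but it does not affect the running minimum of the rows that follow. With skipna=False, every value from the first NaN onward becomes NaN.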

df_nan = pd.DataFrame({
    'A': [np.nan, 3, 1, 4, 2],
    'B': [5, np.nan, 4, 2, 1]
})
print(df_nan.cummin())
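The sketch below contrasts the default behavior with skipna=False on the same df_nan. The variable names skipped and propagated are illustrative only; values are printed as lists to keep the expected output unambiguous.

```python
import numpy as np
import pandas as pd

df_nan = pd.DataFrame({
    'A': [np.nan, 3, 1, 4, 2],
    'B': [5, np.nan, 4, 2, 1]
})

# Default (skipna=True): a NaN stays in place but later rows are unaffected.
skipped = df_nan.cummin()
print(skipped['A'].tolist())     # [nan, 3.0, 1.0, 1.0, 1.0]
print(skipped['B'].tolist())     # [5.0, nan, 4.0, 2.0, 1.0]

# skipna=False: once a NaN is encountered, it propagates to every later row.
propagated = df_nan.cummin(skipna=False)
print(propagated['A'].tolist())  # [nan, nan, nan, nan, nan]
print(propagated['B'].tolist())  # [5.0, nan, nan, nan, nan]
```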

Comparing with Other Columns

Sometimes you need to compare cumulative minimums across different DataFrames. When the frames share the same index and column labels, you can call cummin() on each one and compare the results side by side. For demonstration, let’s create two DataFrames:

df1 = pd.DataFrame({
    'A': [2, 1, 3, 4],
    'B': [5, 2, 4, 2]
})

df2 = pd.DataFrame({
    'A': [3, 4, 2, 1],
    'B': [1, 2, 3, 4]
})

df1_cummin = df1.cummin()
df2_cummin = df2.cummin()
print("DataFrame 1 Cumulative Min:\n", df1_cummin)
print("DataFrame 2 Cumulative Min:\n", df2_cummin)

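Printing the two results side by side shows how the running minimum of each column evolves in each dataset. Here, column A of df1 reaches its lowest value (1) at the second row, while column A of df2 only reaches 1 at the last row.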
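One possible way to go beyond printing is to compare the two results element by element. The sketch below assumes both frames share the same index and columns, as df1 and df2 do; the names gap and df1_lower are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    'A': [2, 1, 3, 4],
    'B': [5, 2, 4, 2]
})
df2 = pd.DataFrame({
    'A': [3, 4, 2, 1],
    'B': [1, 2, 3, 4]
})

df1_cummin = df1.cummin()
df2_cummin = df2.cummin()

# Element-wise gap between the two running minimums (df1 minus df2).
gap = df1_cummin - df2_cummin
print(gap['A'].tolist())        # [-1, -2, -1, 0]
print(gap['B'].tolist())        # [4, 1, 1, 1]

# Boolean mask: True where df1's running minimum is strictly lower than df2's.
df1_lower = df1_cummin < df2_cummin
print(df1_lower['A'].tolist())  # [True, True, True, False]
print(df1_lower['B'].tolist())  # [False, False, False, False]
```

A negative gap means df1 has already seen a lower value than df2 at that row; a gap of zero means the two running minimums have met.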

Advanced: Mixing cummin() with Other Methods

For more advanced usages, combining cummin() with other DataFrame operations can yield powerful analysis tools. An example would be filtering rows based on cumulative minimum criteria:

result_filtered = df[df['A'].cummin() <= 2]
print(result_filtered)

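This keeps only the rows where the cumulative minimum of column ‘A’ is 2 or less. Because a cumulative minimum never increases, once one row satisfies the condition every later row does too. With the sample df from the basic example, the first value of ‘A’ is already 2, so all five rows are kept.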
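The following sketch, using the same sample df, shows the effect of the threshold and adds one related pattern: flagging rows that set (or tie) a new running low. The stricter threshold and the names running_low, below_two, and new_lows are illustrative choices, not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [2, 3, 1, 4, 2],
    'B': [5, 3, 4, 2, 1]
})

running_low = df['A'].cummin()
print(running_low.tolist())        # [2, 2, 1, 1, 1]

# The original filter: every running minimum is <= 2, so no row is dropped.
kept = df[running_low <= 2]
print(len(kept))                   # 5

# A stricter threshold keeps rows only from the point the minimum drops below 2.
below_two = df[running_low < 2]
print(below_two.index.tolist())    # [2, 3, 4]

# Rows whose value equals the running minimum set (or tie) a new low.
new_lows = df[df['A'] == running_low]
print(new_lows.index.tolist())     # [0, 2]
```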

Conclusion

The cummin() method in Pandas provides a concise way to compute running minimums along either axis of a DataFrame or Series. It skips NaN values by default, and its output combines naturally with comparisons and boolean indexing, which makes it useful for both basic data exploration and more involved analyses.
