# Pandas – Using DataFrame.cumsum() method (with examples)

## Introduction

The `DataFrame.cumsum()` method in Pandas computes cumulative sums over a DataFrame, either down each column or across each row. It is particularly useful when analyzing sequential data and time series, or when computing running totals in financial data or inventories. In this tutorial, we’ll look closely at the `cumsum()` method through a series of practical examples.

Before we begin, ensure that you have Pandas installed in your Python environment. You can install Pandas using pip:

``pip install pandas``

## Basic Use of `cumsum()` in Pandas

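Let’s start with a basic example to understand how to apply `cumsum()`. Called with no arguments, it sums down each column, so every cell in the result holds the total of all values at or above it in that column.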

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': [5, 6, 7, 8]
})

cs_df = df.cumsum()
print(cs_df)
```

Output:

```
    A   B
0   1   5
1   3  11
2   6  18
3  10  26
```
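As a quick sanity check on the output above, the last row of a cumulative sum should equal the plain column totals. The following is a minimal sketch of that check; the names `last_row`, `totals`, and `matches` are illustrative and not part of the Pandas API.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': [5, 6, 7, 8]
})

cs_df = df.cumsum()

# The final row of the cumulative sum holds each column's running total
# after every value has been added
last_row = cs_df.iloc[-1]

# df.sum() computes the same column totals directly
totals = df.sum()

# Compare the two label by label
matches = bool((last_row == totals).all())
print(matches)
```

If the two ever disagree, the usual cause is missing data, since `cumsum()` and `sum()` report NaN values differently.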

## Column-wise Cumulative Sum

Next, let’s focus on calculating column-wise cumulative sums. This is the default behavior of `cumsum()`, which we saw in our first example. However, you can explicitly specify this by passing `axis=0` as an argument.

``df.cumsum(axis=0)``
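The sketch below checks that the default call, `axis=0`, and the string alias `axis='index'` all produce the same result. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': [5, 6, 7, 8]
})

default_cs = df.cumsum()             # no axis given
by_number = df.cumsum(axis=0)        # explicit integer axis
by_name = df.cumsum(axis='index')    # string alias for axis=0

# All three calls sum down each column
same = default_cs.equals(by_number) and default_cs.equals(by_name)
print(same)
```

Spelling out the axis is optional, but it can make the direction of the sum clearer to readers of the code.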

## Row-wise Cumulative Sum

To calculate cumulative sums across rows, we change the axis parameter to `axis=1`.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9]
})

cs_row = df.cumsum(axis=1)
print(cs_row)
```

Output:

```
   A  B   C
0  1  5  12
1  2  7  15
2  3  9  18
```
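Row-wise sums are a natural fit when the columns represent consecutive periods. As an illustration, the sketch below builds year-to-date totals from quarterly figures. The `sales` data, the product names, and the `ytd` variable are all made up for this example.

```python
import pandas as pd

# Hypothetical quarterly sales, one row per product
sales = pd.DataFrame(
    {
        'Q1': [10, 20],
        'Q2': [15, 5],
        'Q3': [5, 10],
        'Q4': [20, 15]
    },
    index=['widget', 'gadget']
)

# Accumulate from left to right so each cell holds the year-to-date total
ytd = sales.cumsum(axis=1)
print(ytd)
```

The `Q4` column of `ytd` then matches `sales.sum(axis=1)`, the full-year total for each product.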

## Handling NA Values

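Pandas `cumsum()` has a `skipna` parameter that controls how NaN (Not a Number) values are treated. By default (`skipna=True`), NaN values are skipped: the running total carries on past them, although the result still shows NaN at each position where the input was missing. Setting `skipna=False` instead propagates NaN, so every value after the first NaN in a column also becomes NaN.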

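```python
import numpy as np
import pandas as pd

# np.nan marks the missing values
df_with_na = pd.DataFrame({
    'A': [1, np.nan, 3],
    'B': [np.nan, 5, 6],
    'C': [7, np.nan, 9]
})

# skipna=False propagates NaN down each column once one is encountered
cs_skip_na = df_with_na.cumsum(skipna=False)
print(cs_skip_na)
```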

Output:

```
     A   B    C
0  1.0 NaN  7.0
1  NaN NaN  NaN
2  NaN NaN  NaN
```
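To make the effect of `skipna` concrete, the sketch below compares three options on a single column: the default `skipna=True`, `skipna=False`, and filling NaN with zero before summing. The last option is a separate technique using `fillna(0)`, not a `cumsum()` setting, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

df_na = pd.DataFrame({'A': [1, np.nan, 3]})

# Default (skipna=True): the NaN position stays NaN, the total carries on
skipped = df_na.cumsum()

# skipna=False: everything from the first NaN onward becomes NaN
propagated = df_na.cumsum(skipna=False)

# Treat missing values as zero so the result contains no NaN at all
filled = df_na.fillna(0).cumsum()

print(skipped['A'].tolist())     # [1.0, nan, 4.0]
print(propagated['A'].tolist())  # [1.0, nan, nan]
print(filled['A'].tolist())      # [1.0, 1.0, 4.0]
```

Which option is appropriate depends on what a missing value means in your data: filling with zero assumes "nothing happened", while propagating NaN flags every total that depends on an unknown value.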

## Using `cumsum()` with GroupBy

For more advanced use cases, you can combine `cumsum()` with `GroupBy` operations to calculate cumulative sums within groups. This is particularly useful when analyzing subdivided data.

```python
import pandas as pd

df = pd.DataFrame({
    'Category': ['A', 'A', 'B', 'B'],
    'Value': [1, 2, 3, 4]
})

df_grouped = df.groupby('Category').cumsum()
print(df_grouped)
```

Output:

```
   Value
0      1
1      3
2      3
3      7
```
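Note that the grouped result keeps the original row index but drops the grouping column. A common follow-up, sketched below, is to store the running total as a new column next to the original data. The column name `RunningTotal` is an arbitrary choice, and the rows are deliberately interleaved to show that the groups do not need to be sorted.

```python
import pandas as pd

df = pd.DataFrame({
    'Category': ['A', 'B', 'A', 'B'],
    'Value': [1, 3, 2, 4]
})

# Select the column to accumulate, then take the cumulative sum per group;
# the result is aligned to the original index, so it can be assigned back
df['RunningTotal'] = df.groupby('Category')['Value'].cumsum()
print(df)
```

Each row now carries the running total of its own category up to and including that row.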

## Visualizing Cumulative Sums

The final example plots cumulative sums with Matplotlib. A line chart of running totals makes it easy to see how quickly each column grows.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': [5, 6, 7, 8]
})

cs_df = df.cumsum()

plt.plot(cs_df['A'], label='Column A')
plt.plot(cs_df['B'], label='Column B')
plt.legend()
plt.show()
```

## Conclusion

The Pandas `DataFrame.cumsum()` method computes running totals down columns or across rows, lets you control how missing values are treated through `skipna`, and combines with `groupby()` to produce per-group totals. Working through these examples should give you a solid basis for using `cumsum()` in your own data analysis.
