Understanding ndarray.cumsum() method in NumPy (5 examples)

Updated: February 27, 2024 By: Guest Contributor

Introduction

NumPy, the cornerstone library for numerical computing in Python, provides a vast array of functions to perform operations on arrays efficiently. Among these, the ndarray.cumsum() method is a powerful tool for computing the cumulative sum of array elements over a specified axis. This tutorial will guide you through understanding and using ndarray.cumsum() with practical examples, ranging from basic to advanced applications.

Syntax & Parameters

The ndarray.cumsum() method returns the cumulative sum of the elements along a given axis. In a cumulative sum, each output element is the sum of the corresponding input element and all input elements before it, so the last element equals the total of the whole sequence. This method is particularly useful in data analysis and scientific computing for running totals and for tracking how values accumulate.
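As a minimal sketch of that definition, the loop below builds a running total by hand and compares it with the result of cumsum(). The input values are arbitrary and chosen only for illustration.

```python
import numpy as np

values = [3, 1, 4, 1, 5]  # arbitrary illustrative input

# Manual running total: each output is the sum of all inputs up to that position
running = []
total = 0
for v in values:
    total += v
    running.append(total)

result = np.array(values).cumsum()

print(running)          # [3, 4, 8, 9, 14]
print(result.tolist())  # [3, 4, 8, 9, 14]
```

The two results agree, and the final element (14) is the sum of the whole input.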

Syntax:

numpy.ndarray.cumsum(axis=None, dtype=None, out=None)

Parameters:

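  • axis: (Optional) The single integer axis along which the cumulative sum is computed. By default (None), the sum is computed over the flattened array.
  • dtype: (Optional) Data type of the returned array and of the accumulator in which the elements are summed. If not specified, the data type of the array is used, unless the array has an integer type narrower than the default platform integer, in which case the default platform integer is used.
  • out: (Optional) Output array where the result is placed. It must have the same shape as the expected result; its type is cast if necessary.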
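The three parameters can be sketched together as follows. The array contents and the names buf and small are illustrative assumptions, not part of the NumPy API.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

# axis=None (the default): cumulative sum over the flattened array
flat = a.cumsum()
print(flat.tolist())   # [1, 3, 6, 10, 15, 21]

# out: write the result into a preallocated array of the same shape
buf = np.empty_like(a)
ret = a.cumsum(axis=1, out=buf)
print(buf.tolist())    # [[1, 3, 6], [4, 9, 15]]
print(ret is buf)      # True: a reference to `out` is returned

# dtype: a narrow integer input is accumulated in a wider integer by default
small = np.array([1, 2, 3], dtype=np.int8)
print(small.cumsum().dtype)                  # platform default integer, not int8
print(small.cumsum(dtype=np.float64).dtype)  # float64
```

Reusing an out buffer may help avoid repeated allocations when the same computation runs many times on arrays of the same shape.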

Example 1: Basic Usage

import numpy as np

# Create a NumPy array
array = np.array([1, 2, 3, 4])

# Compute the cumulative sum
cumulative_sum = array.cumsum()

# Output the result
print(cumulative_sum)

Output:

[ 1  3  6 10]

This example demonstrates the basic functionality of ndarray.cumsum(), computing the cumulative sum of a one-dimensional array.

Example 2: Cumulative Sum Along an Axis

import numpy as np

# Create a 2D array
array = np.array([[1, 2], [3, 4]])

# Compute the cumulative sum along axis 0
cumulative_sum_axis0 = array.cumsum(axis=0)

# Output the result
print(cumulative_sum_axis0)

Output:

[[1 2]
 [4 6]]

Specifying the axis parameter computes the cumulative sum along one dimension of a multi-dimensional array. Here, axis=0 accumulates down the rows, so each column gets its own running total: the second row holds 1 + 3 = 4 and 2 + 4 = 6. Passing axis=1 would instead accumulate across each row.
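For comparison, here is a short sketch of the same array with axis=1 and with the default axis=None. The variable names are illustrative.

```python
import numpy as np

array = np.array([[1, 2], [3, 4]])

# axis=1: accumulate across the columns, one running total per row
along_rows = array.cumsum(axis=1)
print(along_rows.tolist())  # [[1, 3], [3, 7]]

# axis=None (default): flatten first, so the result is one-dimensional
flattened = array.cumsum()
print(flattened.tolist())   # [1, 3, 6, 10]
```

Note that the default returns a 1D array even when the input is 2D, which is easy to overlook.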

Example 3: Cumulative Sum with dtype Specification

import numpy as np

# Create a NumPy array of integers
array = np.array([1, 2, 3, 1000])

# Compute the cumulative sum with dtype=float
cumulative_sum_dtype = array.cumsum(dtype=float)

# Output the result
print(cumulative_sum_dtype)

Output:

[   1.    3.    6. 1006.]

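The dtype parameter sets the type of both the accumulator and the returned array. Choosing a wider type, such as float or a 64-bit integer, is useful when the running total could exceed the range of the array's own integer type: NumPy integer arithmetic wraps around silently on overflow instead of raising an error.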
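The overflow risk can be sketched with a deliberately narrow type. Forcing the accumulator to int8 here is an artificial choice made only to trigger the wrap-around on a small example.

```python
import numpy as np

small = np.array([100, 100, 100], dtype=np.int8)  # int8 holds -128..127

# Accumulator forced to int8: 200 does not fit, so the total wraps around
wrapped = small.cumsum(dtype=np.int8)
print(wrapped.tolist())  # [100, -56, 44]

# A wider accumulator holds the true running total
safe = small.cumsum(dtype=np.int64)
print(safe.tolist())     # [100, 200, 300]
```

No exception is raised in the first case, so a wrong total can go unnoticed unless the accumulator type is chosen with the data range in mind.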

Example 4: Masked Arrays and Conditional Cumulative Sum

import numpy as np

# Create a masked array
array = np.ma.array([1, 2, -1, 3], mask=[0, 0, 1, 0])

# Compute the cumulative sum, excluding masked elements
cumulative_sum_masked = array.cumsum()

# Output the result
print(cumulative_sum_masked)

Output:

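[1 3 -- 6]

This example illustrates the use of a masked array, where certain elements are excluded from calculations. Because the array is a numpy.ma.MaskedArray, the call goes to MaskedArray.cumsum() rather than ndarray.cumsum(): masked values are treated as 0 during the computation, and the result is masked at the same positions. The masked -1 therefore contributes nothing to the running total, and its position prints as -- in the output. This is useful for handling incomplete or irregular datasets.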
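The following sketch inspects the masked result more closely and shows one way to get a plain array in which the running total is carried through the masked position. Filling with 0 before summing is an assumption about what the caller wants, not the only reasonable choice.

```python
import numpy as np

masked = np.ma.array([1, 2, -1, 3], mask=[0, 0, 1, 0])

result = masked.cumsum()

# The result keeps the mask of the input
print(result.mask.tolist())       # [False, False, True, False]

# Replace masked positions in the result with a placeholder value
print(result.filled(0).tolist())  # [1, 3, 0, 6]

# Fill the input with 0 first to get an ordinary ndarray whose
# running total is carried through the masked position
carried = masked.filled(0).cumsum()
print(carried.tolist())           # [1, 3, 3, 6]
```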

Example 5: Advanced Application – Time Series Analysis

import numpy as np
import pandas as pd

# Generate a time series data
dates = pd.date_range('20210101', periods=6)
data = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))

# Convert a column to a NumPy array
array = data['A'].values

# Cumulative sum for trend analysis
trend = array.cumsum()

# Output the trend
print(trend)

Output (varies between runs because the data is random):

[1.35115086 1.28622184 1.01832008 1.08246151 1.67871901 1.247058  ]

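Applying ndarray.cumsum() in time series analysis can reveal trends and patterns over time. A cumulative sum turns a series of per-period changes into a running level, which can make the overall direction easier to see than the individual values. The same pattern applies to financial or scientific time series; the random data here only stands in for real measurements.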
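A reproducible variant of this idea, using NumPy only, might look like the sketch below. The seed value and the names steps, trend, and recovered are illustrative assumptions. It also shows that np.diff() undoes the cumulative sum, recovering the per-period changes.

```python
import numpy as np

# Seeded generator so the numbers are the same on every run
rng = np.random.default_rng(seed=0)
steps = rng.standard_normal(6)  # per-period changes

# Running level built from the changes
trend = steps.cumsum()

# The last value of the running level equals the total of all changes
print(np.isclose(trend[-1], steps.sum()))  # True

# np.diff with prepend=0 recovers the original changes from the level
recovered = np.diff(trend, prepend=0.0)
print(np.allclose(recovered, steps))       # True
```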

Conclusion

The ndarray.cumsum() method in NumPy is a versatile tool for computing cumulative sums, either along a chosen axis or over the flattened array, with applications ranging from basic array operations to data analysis and time series work. Knowing how its axis, dtype, and out parameters behave helps you get correct running totals and avoid silent overflow.