How to Perform Advanced Time Series Forecasting with NumPy

Updated: January 23, 2024 By: Guest Contributor

Introduction

Time series forecasting is a critical component in numerous business applications including demand forecasting, stock market prediction, and resource allocation. In Python, NumPy is a foundational library for numerical computing, and while it’s not specifically designed for time series analysis – which is often handled by libraries like pandas and statsmodels – it can still play a supportive role in handling arrays and mathematical operations needed in the forecasting process.

This tutorial aims to dive into advanced techniques in time series forecasting with an emphasis on how NumPy can contribute to these tasks. We’ll explore strategies from preprocessing data to making predictions, assuming you have a foundational understanding of time series analysis and Python’s NumPy library.

Getting Started

First, ensure that you have NumPy installed:

pip install numpy

The examples also use pandas for handling date indices and, in the ARIMA section, statsmodels. You can install both using:

pip install pandas statsmodels

Basic Operations with NumPy

At the core of any time series forecasting task is the manipulation of numerical arrays. NumPy’s array object supports vectorized operations, which run in compiled code and avoid slow Python-level loops.

Creating Time Series Data

import numpy as np
import pandas as pd

# Creating a date range
date_range = pd.date_range(start='2021-01-01', end='2021-12-31', freq='D')
# Simulating time series data
np.random.seed(0)
ts_data = np.random.randn(len(date_range))

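This code snippet creates a pandas date range for 2021 and a corresponding NumPy array of random values to simulate daily time series data. Because the values are independent draws from a standard normal distribution, the series is white noise: it serves as placeholder input and contains no real trend or seasonality.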
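To give the later techniques something to detect, structure can be layered on top of the noise. The sketch below is only an illustration: the slope, the amplitude, the 7-day period, and the name `structured_ts` are arbitrary assumptions, not part of the tutorial's dataset.

```python
import numpy as np
import pandas as pd

date_range = pd.date_range(start='2021-01-01', end='2021-12-31', freq='D')
n = len(date_range)
t = np.arange(n)                          # day index 0, 1, 2, ...

rng = np.random.default_rng(0)            # seeded generator for reproducibility
trend = 0.01 * t                          # assumed slope: +0.01 per day
weekly = 2.0 * np.sin(2 * np.pi * t / 7)  # assumed 7-day cycle with amplitude 2
noise = rng.standard_normal(n)            # unit-variance noise

# Additive model: trend + seasonal component + noise
structured_ts = trend + weekly + noise
```

Any of the later snippets can be run on `structured_ts` instead of `ts_data` to see how they respond to a known trend and cycle.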

Moving Average

A common technique in time series analysis is calculating the moving average, which helps in identifying trends.

def moving_average(data, window_size):
  return np.convolve(data, np.ones(window_size), 'valid') / window_size

# Calculate the moving average
ma_ts_data = moving_average(ts_data, 7)

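This uses NumPy’s convolve function to average each run of window_size consecutive values. With mode 'valid', only fully overlapping windows are kept, so the result has len(data) - window_size + 1 elements (359 for a 7-day window over 365 days).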
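Since the averaged array is shorter than the input, it needs its own date index before it can be plotted or compared against the raw series. A minimal sketch, assuming a trailing average where each value is labelled with the last date in its window (the names `ma_dates` and `ma_series` are illustrative):

```python
import numpy as np
import pandas as pd

def moving_average(data, window_size):
    return np.convolve(data, np.ones(window_size), 'valid') / window_size

date_range = pd.date_range(start='2021-01-01', end='2021-12-31', freq='D')
np.random.seed(0)
ts_data = np.random.randn(len(date_range))

window = 7
ma_ts_data = moving_average(ts_data, window)

# 'valid' mode drops the first window - 1 points, so the first average
# covers 2021-01-01 through 2021-01-07 and is labelled with 2021-01-07.
ma_dates = date_range[window - 1:]
ma_series = pd.Series(ma_ts_data, index=ma_dates)

# Sanity check: the first value equals the plain mean of the first window
first_direct = ts_data[:window].mean()
matches = np.isclose(ma_ts_data[0], first_direct)
```

A centered average would instead use `date_range[window // 2 : -(window // 2)]` for odd window sizes; which labelling is appropriate depends on whether future values may be used at each point.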

More Advanced Forecasting Techniques

Next, we’ll look at some more sophisticated forecasting models and how NumPy can play a role in implementing them. Given NumPy’s focus on numerical operations rather than direct statistical modelling, it’s often used in conjunction with other libraries; it’s the underpinning of more complex operations that are abstracted away by higher-level time series libraries.

Autoregressive Integrated Moving Average (ARIMA)

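One of the most popular and powerful approaches for time series forecasting is ARIMA. It combines three parts: an autoregressive (AR) term that regresses the series on its own past values, differencing (I) to remove trend, and a moving average (MA) term that models past forecast errors. Despite the shared name, this MA term is not the smoothing moving average we computed above.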

from statsmodels.tsa.arima.model import ARIMA

# Prepare the time series data
series = pd.Series(ts_data, index=date_range)
# Fit the ARIMA model
model = ARIMA(series, order=(5, 1, 0))
model_fit = model.fit()
# Forecast
forecast = model_fit.forecast(steps=10)
forecast_array = np.array(forecast)

The forecast method returns a pandas Series; np.array (or forecast.to_numpy()) converts it into a plain NumPy array, which may be useful for downstream applications that expect NumPy arrays.
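The autoregressive and differencing parts of such a model can also be sketched directly in NumPy. The following is a simplified illustration, not the estimator statsmodels uses: it fits an AR(p) model by ordinary least squares on differenced data, with no MA term. The helper names `fit_ar` and `forecast_ar` and the simulated series are assumptions made for this example.

```python
import numpy as np

def fit_ar(series, p):
    """Estimate AR(p) intercept and lag coefficients by ordinary least squares."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    # Column k holds the series lagged by k + 1 steps
    lags = np.column_stack([series[p - k - 1:n - k - 1] for k in range(p)])
    X = np.column_stack([np.ones(n - p), lags])
    y = series[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # [intercept, phi_1, ..., phi_p]

def forecast_ar(series, coef, steps):
    """Iterate the fitted recursion forward, feeding forecasts back in."""
    p = len(coef) - 1
    history = list(np.asarray(series, dtype=float)[-p:])
    out = []
    for _ in range(steps):
        recent = history[::-1][:p]  # most recent value first, matching phi_1..phi_p
        nxt = coef[0] + np.dot(coef[1:], recent)
        out.append(nxt)
        history.append(nxt)
    return np.array(out)

# Simulated data: a series whose day-to-day changes follow an AR(1) process
# (the coefficient 0.7 is an assumed value for demonstration)
rng = np.random.default_rng(0)
increments = np.zeros(500)
for i in range(1, 500):
    increments[i] = 0.7 * increments[i - 1] + rng.standard_normal()
levels = np.cumsum(increments)

diffed = np.diff(levels)                  # d = 1: work on first differences
coef = fit_ar(diffed, p=1)                # coef[1] estimates the AR coefficient
diff_forecast = forecast_ar(diffed, coef, steps=10)

# Undo the differencing: accumulate forecast changes onto the last observed level
level_forecast = levels[-1] + np.cumsum(diff_forecast)
```

Least squares is chosen here because it needs only `np.linalg.lstsq`; for production forecasting, a dedicated library remains the safer choice since it also handles MA terms, diagnostics, and prediction intervals.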

Fourier Transform for Seasonal Decomposition

Fourier transforms are powerful tools in time series analysis, especially for identifying seasonality in the data. They convert the time series from the time domain into the frequency domain, where a recurring cycle shows up as a peak at its frequency.

from numpy.fft import fft

# Perform Fourier Transform
fft_result = fft(ts_data)
powers = np.abs(fft_result)**2
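# With daily samples and d=1/365, frequencies are in cycles per year
frequencies = np.fft.fftfreq(ts_data.size, d=1/365)

# Keep only the strongest frequency components. A fixed cutoff such as 1e5
# would zero out every component of this unit-variance series (typical power
# is around 365), so the threshold is taken from the data instead.
threshold = np.percentile(powers, 90)
mask = powers >= threshold
filtered_fft_result = fft_result * mask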

# Inverse Fourier Transform to get smoothed series
smoothed_series = np.fft.ifft(filtered_fft_result).real

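Here, NumPy’s Fast Fourier Transform (FFT) decomposes the series into frequency components; peaks in the power spectrum point to seasonal cycles. After the weaker components are zeroed out, the inverse FFT brings the series back to the time domain as a smoothed version. Because the simulated data is white noise, no genuine seasonal peak should be expected in this particular example.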
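To see the technique recover a cycle that is actually present, it helps to run it on a signal with known seasonality. The sketch below is illustrative only: the test signal (a 7-day sine wave of amplitude 2 plus noise) and all variable names are assumptions made for this example.

```python
import numpy as np

n = 365
t = np.arange(n)
rng = np.random.default_rng(0)
# Assumed test signal: a 7-day cycle of amplitude 2 plus unit-variance noise
signal = 2.0 * np.sin(2 * np.pi * t / 7) + rng.standard_normal(n)

# rfft returns only the non-negative frequencies of a real-valued series
spectrum = np.fft.rfft(signal - signal.mean())
freqs = np.fft.rfftfreq(n, d=1.0)         # d = 1 day, so units are cycles per day
power = np.abs(spectrum) ** 2

peak = np.argmax(power[1:]) + 1           # skip the zero-frequency bin
dominant_period = 1.0 / freqs[peak]       # in days; close to 7 for this signal

# Rebuild the seasonal component from the single strongest frequency bin
only_peak = np.where(np.arange(spectrum.size) == peak, spectrum, 0)
seasonal = np.fft.irfft(only_peak, n=n)
```

The recovered period is not exactly 7 because 365 days do not contain a whole number of weeks, so the cycle falls between two FFT bins and the nearest one is reported.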

Conclusion

In this tutorial, we have explored a mix of basic and advanced time series forecasting techniques, demonstrating the versatility of NumPy in the forecasting process. Combining NumPy’s power with the capabilities of higher-level libraries enables us to tackle complex forecasting challenges in a clear manner.