Evaluating Stationarity and Cointegration with statsmodels

In time series analysis, understanding the concepts of stationarity and cointegration is critical, especially when you work with financial or economic data. These properties affect how we model time series data, and whether we can make reliable forecasts or inferences from them.

Contents:
Understanding Stationarity
    Stationarity Testing using Python
Understanding Cointegration
    Cointegration Testing with statsmodels
Practical Applications

Understanding Stationarity

A time series is considered (weakly) stationary if its mean and variance are constant over time and its autocovariance depends only on the lag between observations, not on when they occur. Many time series models assume stationarity because parameters estimated from one stretch of the data then describe the whole series, which simplifies analysis and forecasting.
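To make the definition concrete, the following sketch simulates many independent paths of white noise and of a random walk, then compares the variance across paths at an early and a late time step. The number of paths, the two time indices, and the seed are arbitrary choices for illustration. For white noise the variance should stay near 1, while for the random walk it should grow roughly in proportion to the time index.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

n_paths, n_steps = 2000, 500  # illustrative sizes, not taken from the article
noise = rng.standard_normal((n_paths, n_steps))  # white noise: stationary
walks = noise.cumsum(axis=1)                     # random walks: non-stationary

early, late = 99, 399  # zero-based indices of the 100th and 400th time steps

# Variance measured across paths at a fixed time step
noise_var_early = noise[:, early].var()
noise_var_late = noise[:, late].var()
walk_var_early = walks[:, early].var()
walk_var_late = walks[:, late].var()

print(f"White noise variance at steps 100 and 400: "
      f"{noise_var_early:.2f}, {noise_var_late:.2f}")
print(f"Random walk variance at steps 100 and 400: "
      f"{walk_var_early:.2f}, {walk_var_late:.2f}")
```

A single observed series gives only one path, which is why formal tests such as the ADF test are used in practice rather than this kind of across-path comparison.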

Stationarity Testing using Python

The statsmodels library in Python provides tools to test for stationarity. The most commonly used test is the Augmented Dickey-Fuller (ADF) test. Let's see how this can be implemented:


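import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

# Fix the seed so the example is reproducible
np.random.seed(42)

# Generate a random walk (a non-stationary series) as the cumulative sum of white noise
ts = pd.Series(np.random.randn(1000).cumsum())

# Apply the Augmented Dickey-Fuller test (null hypothesis: the series has a unit root)
result = adfuller(ts)

# result[0] is the test statistic, result[1] the p-value, result[4] the critical values
print(f'ADF Statistic: {result[0]:f}')
print(f'p-value: {result[1]:f}')
print('Critical Values:')
for key, value in result[4].items():
    print(f'\t{key}: {value:.3f}')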

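The null hypothesis of the ADF test is that the series has a unit root, meaning it is non-stationary. If the p-value is below a pre-specified threshold (often 0.05), the null hypothesis is rejected, which is evidence that the series is stationary. A p-value above the threshold means the test fails to reject non-stationarity, which is the typical outcome for a random walk like the one generated above.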

Understanding Cointegration

Cointegration describes two or more non-stationary series (typically integrated of order one) for which some linear combination is stationary. In other words, each series may wander on its own, but they share a common long-run trend that the right combination cancels out. This is significant in econometrics and in pairs trading strategies in finance.
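The definition can be illustrated with a small simulation. In this sketch, two series are built on the same random-walk trend, so neither is stationary on its own, but the combination `y - 2 * x` removes the trend exactly. The coefficient 2 is known here only because the data is synthetic; with real data it would have to be estimated.

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed
n = 1000

trend = rng.standard_normal(n).cumsum()  # shared stochastic trend
e_x = rng.standard_normal(n)             # idiosyncratic noise in x
e_y = rng.standard_normal(n)             # idiosyncratic noise in y

x = trend + e_x
y = 2 * trend + e_y

# The trend cancels: y - 2x = e_y - 2 * e_x, which is stationary noise
spread = y - 2 * x

half = n // 2
print(f"x mean, first vs second half:      "
      f"{x[:half].mean():.2f}, {x[half:].mean():.2f}")
print(f"spread mean, first vs second half: "
      f"{spread[:half].mean():.2f}, {spread[half:].mean():.2f}")
print(f"spread standard deviation: {spread.std():.2f}")
```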

Cointegration Testing with statsmodels

The `coint` function in statsmodels implements the augmented Engle-Granger two-step cointegration test. Its null hypothesis is that the series are not cointegrated. Let's explore this with a practical example:


import numpy as np
from statsmodels.tsa.stattools import coint

# Generate synthetic data from the random walk `ts` defined in the previous example
ts1 = ts + np.random.normal(size=len(ts))      # random walk plus noise: non-stationary
ts2 = 2 * ts + np.random.normal(size=len(ts))  # shares the same trend, so ts2 - 2 * ts1 is stationary

# Perform the cointegration test (null hypothesis: no cointegration)
coint_t, p_value, critical_values = coint(ts1, ts2)

print('Cointegration test statistic:', coint_t)
print('p-value:', p_value)
print('Critical values (1%, 5%, 10%):', critical_values)

Here, if the p-value is below the chosen threshold, we reject the null hypothesis of no cointegration, which suggests that the two series are cointegrated.

Practical Applications

Testing for stationarity and cointegration helps validate the assumptions behind models such as vector autoregressions (VAR) and error correction models (ECM). It is especially useful in pairs trading, where traders look for asset prices that have historically maintained a long-run equilibrium relationship.

With tools like statsmodels and a solid grasp of these concepts, you can build more reliable time series analyses and forecasts. Pair these statistical tests with qualitative analysis before relying on them in financial models.

Series: Algorithmic trading with Python
