Integrating External Covariates for Improved Time-Series Forecasting in PyTorch

Time-series forecasting is a critical task in many domains, from finance to weather prediction. Traditional models often rely solely on past values of the target variable. However, integrating external covariates can improve forecast accuracy by giving the model additional context. In this article, we will explore how to incorporate external covariates into a time-series forecasting model using PyTorch. We'll walk through setting up a dataset, creating a model, and training it.

Why Use External Covariates?
Setting Up the Dataset
Building the Forecasting Model
Training the Model
Concluding Thoughts

Why Use External Covariates?

External covariates can be any external data source that may influence the time-series you are predicting. For example, in predicting sales, you might include advertising spend, holidays, or even weather data. By integrating such covariates, the model can learn more nuanced patterns and improve its predictive performance.

Setting Up the Dataset

First, let's create a dataset that includes both the time-series data and the external covariates. We will represent this data using PyTorch's torch.utils.data.Dataset.

import torch
import torch.utils.data as data
import pandas as pd

class TimeSeriesDataset(data.Dataset):
    def __init__(self, time_series, covariates, window_size):
        self.time_series = time_series
        self.covariates = covariates
        self.window_size = window_size

    def __len__(self):
        return len(self.time_series) - self.window_size

    def __getitem__(self, idx):
        time_series_window = self.time_series[idx:idx + self.window_size]
        covariates_window = self.covariates[idx:idx + self.window_size]
        return time_series_window, covariates_window

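Here, time_series is a sequence of the target variable, while covariates contains the additional data that may influence the target. Both are indexed by time step along the first dimension, so each item is a pair of aligned windows of length window_size.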
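To make the windowing concrete, the following sketch builds the dataset on random data and prints the resulting shapes. The class is repeated so the snippet runs on its own; the series length (100), the number of covariates (3), and the batch size (32) are arbitrary values chosen for illustration.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class TimeSeriesDataset(Dataset):
    """Same windowing logic as the class above, repeated for a standalone run."""

    def __init__(self, time_series, covariates, window_size):
        self.time_series = time_series
        self.covariates = covariates
        self.window_size = window_size

    def __len__(self):
        return len(self.time_series) - self.window_size

    def __getitem__(self, idx):
        end = idx + self.window_size
        return self.time_series[idx:end], self.covariates[idx:end]


# Illustrative random data: 100 time steps, 1 target feature, 3 covariates.
torch.manual_seed(0)
series = torch.rand(100, 1)
covs = torch.rand(100, 3)

dataset = TimeSeriesDataset(series, covs, window_size=10)
ts_window, cov_window = dataset[0]

print(len(dataset))      # 90
print(ts_window.shape)   # torch.Size([10, 1])
print(cov_window.shape)  # torch.Size([10, 3])

# The DataLoader stacks windows along a new leading batch dimension.
loader = DataLoader(dataset, batch_size=32)
ts_batch, cov_batch = next(iter(loader))
print(ts_batch.shape)    # torch.Size([32, 10, 1])
print(cov_batch.shape)   # torch.Size([32, 10, 3])
```

Because the target series is stored with a trailing feature dimension, each batch is already three-dimensional (batch, time, features), which is the layout an LSTM created with batch_first=True expects.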

Building the Forecasting Model

Now that we have our dataset, let's define a PyTorch model that can take both the time-series and the covariates as input:

import torch.nn as nn

class ForecastingModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.linear = nn.Linear(hidden_size, output_size)

    def forward(self, time_series, covariates):
        inputs = torch.cat((time_series, covariates), dim=2)
        lstm_output, _ = self.lstm(inputs)
        forecast = self.linear(lstm_output[:, -1, :])
        return forecast

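In this model, the time-series and covariates are concatenated along the feature dimension, so input_size must equal the number of target features plus the number of covariates. The LSTM layer learns temporal patterns from the combined input, and a linear layer maps the LSTM output at the final time step to the forecast.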
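A quick shape check can confirm that the pieces fit together before any training. The sketch below repeats the model so it runs on its own and passes random tensors through it; the batch size (4), window length (10), and covariate count (10) are illustrative assumptions, not requirements.

```python
import torch
import torch.nn as nn


class ForecastingModel(nn.Module):
    """Same architecture as above, repeated for a standalone run."""

    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.linear = nn.Linear(hidden_size, output_size)

    def forward(self, time_series, covariates):
        # Both inputs are (batch, time, features); join them on the feature axis.
        inputs = torch.cat((time_series, covariates), dim=2)
        lstm_output, _ = self.lstm(inputs)
        # Use only the last time step's hidden state for the forecast.
        return self.linear(lstm_output[:, -1, :])


# Illustrative sizes: 4 windows of 10 steps, 1 target feature, 10 covariates.
batch_size, window, n_covariates = 4, 10, 10
time_series = torch.rand(batch_size, window, 1)
covariates = torch.rand(batch_size, window, n_covariates)

# input_size = 1 target feature + 10 covariates = 11
model = ForecastingModel(input_size=1 + n_covariates, hidden_size=32, output_size=1)
forecast = model(time_series, covariates)

print(forecast.shape)  # torch.Size([4, 1])
```

If the two inputs do not share the same batch and time dimensions, torch.cat raises an error, so a check like this tends to surface shape mistakes early.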

Training the Model

With our model defined, it’s time to train it using a suitable loss function and optimizer. Here’s how you can do this in PyTorch:

import torch.optim as optim

# Hypothetical dataset
time_series_data = torch.rand(100, 1)
covariates_data = torch.rand(100, 10)

# Prepare DataLoader
window_size = 10
train_dataset = TimeSeriesDataset(time_series_data, covariates_data, window_size)
train_loader = data.DataLoader(train_dataset, batch_size=32, shuffle=True)

# Initialize model, loss, and optimizer
model = ForecastingModel(input_size=11, hidden_size=32, output_size=1)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

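# Training loop
for epoch in range(10):  # Example for 10 epochs
    for time_series, covariates in train_loader:
        # time_series: (batch, window_size, 1); covariates: (batch, window_size, 10)
        # Feed every step except the last and predict the last step,
        # so the target is never part of the model's input.
        inputs_ts = time_series[:, :-1, :]
        inputs_cov = covariates[:, :-1, :]
        target = time_series[:, -1, :]  # (batch, 1), same shape as the model output

        optimizer.zero_grad()
        output = model(inputs_ts, inputs_cov)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
    print(f"Epoch {epoch+1}, Loss: {loss.item():.4f}")  # loss of the last batch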

We use Mean Squared Error (MSE) as the loss function, a standard choice for regression tasks. The optimizer is Adam, which adapts the step size for each parameter individually.
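As a small illustration of what the loss computes, the sketch below compares nn.MSELoss with a manual mean of squared differences on hand-picked values. It also shows why the prediction and target should have identical shapes: subtracting a (batch,) tensor from a (batch, 1) tensor broadcasts to (batch, batch), which would silently change the quantity being averaged. The numbers are made up for the example.

```python
import torch
import torch.nn as nn

# Made-up predictions and targets, both shaped (batch, 1).
pred = torch.tensor([[0.5], [1.0], [2.0]])
target = torch.tensor([[1.0], [1.0], [0.0]])

criterion = nn.MSELoss()
loss = criterion(pred, target)

# MSE is the mean of the element-wise squared differences:
# (0.25 + 0.0 + 4.0) / 3
manual = ((pred - target) ** 2).mean()

print(loss.item())    # ≈ 1.4167
print(manual.item())  # same value

# Shape pitfall: a (3, 1) tensor minus a (3,) tensor broadcasts to (3, 3),
# pairing every prediction with every target instead of matching them up.
mismatched = pred - target.squeeze(1)
print(mismatched.shape)  # torch.Size([3, 3])
```

Keeping the target shaped like the model output, for example (batch, 1) for a single-step forecast, avoids this broadcasting.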

Concluding Thoughts

Incorporating external covariates into time-series forecasting with PyTorch involves three pieces: a dataset that keeps the covariates aligned with the target series, a model that accepts both inputs, and a training loop that pairs each input window with the right target. When the covariates carry real signal about the target, this approach can improve forecasts by letting the model learn relationships that the target’s history alone does not reveal. Experimenting with different types of covariates and neural network architectures can lead to further gains.

By following this guide, you’re now equipped with the basics of integrating external data sources into time-series forecasting models using PyTorch. Keep experimenting with extensions that combine dense layers with recurrent architectures for time-series prediction.
