
Running PyTorch Models on CPU or GPU with Device-Agnostic Code

Last updated: December 14, 2024

When developing machine learning models with PyTorch, it's crucial to ensure your code can run seamlessly on both CPU and GPU. Writing device-agnostic code lets the same script run wherever it is deployed, whether or not a GPU is present, without modification. This guide walks you through setting up your PyTorch models to be device-agnostic, with practical examples.

Why Device-Agnostic Code?

PyTorch, a popular deep learning library, offers a high level of flexibility in defining, training, and deploying models. Device-agnostic code is code that can run on either a CPU or a GPU without modification. This matters because:

  • Portability: You may want to test models on a development machine with only CPUs, while training models using the computational power of GPUs in a production environment.
  • Flexibility: Ensures your code is adaptable across different machines with varying resources.
  • Efficiency: GPUs significantly speed up both training and inference time for large models.

Checking for GPU Availability

The first step in writing device-agnostic PyTorch code is to check if a GPU is available on the machine. PyTorch provides a straightforward way to check for available GPUs:

import torch

# Use a CUDA-capable GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

Here, we use PyTorch's torch.cuda.is_available() function to check whether a CUDA-capable GPU is present. The device variable then resolves to 'cuda' (GPU) when one is available and to 'cpu' otherwise.
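If you want to confirm which device was picked, or inspect the available GPUs, PyTorch provides a few helpers. Here is a small sketch (the output naturally depends on your hardware):

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

if device.type == "cuda":
    # Number of visible CUDA devices and the name of the first one
    print(torch.cuda.device_count())
    print(torch.cuda.get_device_name(0))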

Sending Data and Models to Device

Once you determine the appropriate device, the next step is transferring your data and model to this device.

1. Loading Model to Device

model = MyModel()
model.to(device)

Calling model.to(device) moves all of the model's parameters and buffers to the selected device, whether that is the CPU or a GPU.
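MyModel above is just a placeholder for whatever nn.Module you are working with; any model can be moved the same way. As a minimal, purely illustrative sketch (the architecture and layer sizes are arbitrary):

import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        # A tiny fully connected network, only for demonstration
        self.net = nn.Sequential(
            nn.Linear(10, 32),
            nn.ReLU(),
            nn.Linear(32, 2),
        )

    def forward(self, x):
        return self.net(x)

model = MyModel().to(device)

Note that for an nn.Module, .to(device) moves the parameters in place and also returns the module, so the one-liner above is equivalent to the two separate lines shown earlier.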

2. Transferring Data to Device

Your training data and all tensors need to be on the same device where the model resides:

for inputs, labels in dataloader:
    inputs, labels = inputs.to(device), labels.to(device)
    # Continue with the training process...

The model and its inputs must always live on the same device; if they end up on different devices, PyTorch raises a runtime error instead of silently performing the computation.
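Two optional refinements are worth knowing. If your DataLoader was created with pin_memory=True, passing non_blocking=True to .to() lets the host-to-GPU copy overlap with computation, and tensors you create yourself can be allocated directly on the target device so no transfer is needed at all. A brief sketch:

# Asynchronous copy from pinned host memory (a harmless no-op on CPU)
inputs = inputs.to(device, non_blocking=True)
labels = labels.to(device, non_blocking=True)

# Allocate new tensors directly on the target device
hidden = torch.zeros(32, 128, device=device)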

Training Loop Example

With the model and data on the chosen device, the training loop itself is identical regardless of whether it runs on a CPU or a GPU.

for epoch in range(num_epochs):
    for inputs, labels in dataloader:
        inputs, labels = inputs.to(device), labels.to(device)

        # Forward pass
        outputs = model(inputs)
        loss = criterion(outputs, labels)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

The only device-specific detail in this loop is the .to(device) call; the device is determined once at runtime, and the same code runs on either CPU or GPU.
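Putting everything together, below is a self-contained version of the loop that you can run as-is. The synthetic data, model architecture, and hyperparameters are arbitrary choices made only so the example is complete:

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Synthetic classification data: 256 samples, 10 features, 2 classes
X = torch.randn(256, 10)
y = torch.randint(0, 2, (256,))
dataloader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2)).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

num_epochs = 5
for epoch in range(num_epochs):
    for inputs, labels in dataloader:
        inputs, labels = inputs.to(device), labels.to(device)

        # Forward pass
        outputs = model(inputs)
        loss = criterion(outputs, labels)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(f"Epoch {epoch + 1}: loss = {loss.item():.4f}")

Running this script on a machine without a GPU simply selects the CPU, with no code changes required.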

Conclusion

By following these guidelines for device-agnostic code, your PyTorch applications can switch between CPUs and GPUs without modification. This makes deploying deep learning models across varied environments much simpler and lets you take full advantage of whatever hardware accelerators are available.
