Tensors and variables are the essential building blocks of machine learning models in TensorFlow, and managing them correctly matters, especially in complex architectures. A common need is tracking the trainable variables in a custom module so that its weights can be updated during training. In TensorFlow, the tf.Module class encapsulates variables and handles this bookkeeping for you.
Understanding TensorFlow Module
Let's begin by exploring what a TensorFlow Module is. At its core, it's a simple Python object with benefits:
- It can create variables.
- It automatically tracks variables and submodules.
- Its variables can be saved and restored, for example with tf.train.Checkpoint.
A Module collects all variables created within it, including those of its submodules, making it straightforward to build and manage models, especially ones with hierarchical structures.
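As a quick sketch of that behavior (the Holder class below is made up for illustration), any tf.Variable assigned to an attribute is picked up automatically:
import tensorflow as tf

class Holder(tf.Module):
    def __init__(self):
        super().__init__()
        # Assigning a tf.Variable to an attribute is all it takes
        self.v = tf.Variable(1.0, name='v')

h = Holder()
print(h.variables)  # (<tf.Variable 'v:0' ... numpy=1.0>,)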
Tracking Trainable Variables
Trainable variables are crucial: they are the parameters optimized by learning algorithms during training. TensorFlow's tf.Module automatically manages and tracks these variables. Let's look at an example demonstrating this feature.
Example of a Custom Module
Consider a simple linear layer as a custom module. We aim to encapsulate its weights and bias terms:
import tensorflow as tf

class LinearLayer(tf.Module):
    def __init__(self, input_dim, output_dim, name=None):
        super().__init__(name=name)
        # Variables assigned as attributes are tracked automatically
        self.w = tf.Variable(tf.random.normal([input_dim, output_dim]), name='w')
        self.b = tf.Variable(tf.zeros([output_dim]), name='b')

    def __call__(self, x):
        return tf.matmul(x, self.w) + self.b
In this example:
- The linear layer module initializes weights w and biases b using tf.Variable.
- The module overloads the __call__ method to perform the computation.
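Before inspecting the variables, a quick usage sketch shows the layer's forward pass; the random input batch here is made up for illustration:
# Hypothetical input: a batch of 2 samples with 4 features each
x = tf.random.normal([2, 4])
layer = LinearLayer(input_dim=4, output_dim=3)
print(layer(x).shape)  # (2, 3)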
Accessing Trainable Variables
Once the module is implemented, accessing its trainable variables is quite straightforward. TensorFlow automatically lists all trainable variables added within a Module.
# Instantiate the module
layer = LinearLayer(input_dim=4, output_dim=3)
# Display all trainable variables in the module
for var in layer.trainable_variables:
    print(var.name, var.shape)
This code will output the following (tf.Module traverses attributes in sorted name order, so b is listed before w):
b:0 (3,)
w:0 (4, 3)
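Note that only variables created with trainable=True (the default) show up in this list. A variable marked trainable=False, like the hypothetical step counter below, still appears in variables but is excluded from trainable_variables:
class LayerWithCounter(tf.Module):
    def __init__(self):
        super().__init__()
        self.w = tf.Variable(tf.ones([2, 2]), name='w')
        # Non-trainable state: tracked, but not handed to an optimizer
        self.steps = tf.Variable(0, trainable=False, name='steps')

m = LayerWithCounter()
print(len(m.variables))            # 2
print(len(m.trainable_variables))  # 1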
Working with Submodules
Managing submodules is another task simplified by tf.Module. Suppose you have a neural network composed of multiple linear layers.
Example with Submodules
Below is an example of a network built from two stacked linear layers:
class MyModel(tf.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super().__init__()
        # Submodules assigned as attributes are tracked automatically,
        # along with every variable they own
        self.layer1 = LinearLayer(input_dim, hidden_dim)
        self.layer2 = LinearLayer(hidden_dim, output_dim)

    def __call__(self, x):
        x = self.layer1(x)
        x = tf.nn.relu(x)
        return self.layer2(x)
With this setup:
- The MyModel module contains both LinearLayer instances as submodules, which are automatically tracked by TensorFlow.
- Accessing model.trainable_variables yields all trainable variables from both linear layers, aggregating weights and biases from any depth of the module hierarchy.
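You can also inspect the nesting itself through the built-in submodules property; the instance below matches the one created in the next snippet:
model = MyModel(input_dim=6, hidden_dim=5, output_dim=2)
print(model.submodules)
# (<__main__.LinearLayer object ...>, <__main__.LinearLayer object ...>)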
Getting Trainable Variables from Submodules
The process of acquiring trainable variables from submodules remains consistent and convenient:
# Instantiate the model
model = MyModel(input_dim=6, hidden_dim=5, output_dim=2)
# Display all trainable variables across submodules
for var in model.trainable_variables:
    print(var.name, var.shape)
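This aggregated list is exactly what an optimizer needs. The sketch below runs a single training step, assuming made-up random data and a plain SGD optimizer, neither of which appears in the original example:
# Illustrative data matching input_dim=6 and output_dim=2
x = tf.random.normal([8, 6])
y = tf.random.normal([8, 2])

optimizer = tf.optimizers.SGD(learning_rate=0.01)

with tf.GradientTape() as tape:
    pred = model(x)
    loss = tf.reduce_mean(tf.square(pred - y))  # mean squared error

# Gradients line up one-to-one with model.trainable_variables
grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))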
Conclusion
The tf.Module class in TensorFlow provides an easy-to-use, organized way to manage variables within complex ML models. By leveraging its automatic variable tracking, you can focus on developing the models themselves without the overhead of manual variable management. It is particularly advantageous for neural networks that include multiple nested modules.
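As a closing sketch, because a Module tracks everything it owns, its state can be saved and restored in a couple of lines with tf.train.Checkpoint (the path below is an arbitrary example):
# Save and restore every variable the module tracks
ckpt = tf.train.Checkpoint(model=model)
save_path = ckpt.save('/tmp/my_model/ckpt')  # illustrative path
ckpt.restore(save_path)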