
TensorFlow SavedModel: Serving Models with TensorFlow Serving

Last updated: December 18, 2024

TensorFlow is a powerful open-source library for machine learning and deep learning tasks. One of the things that makes it especially versatile is that trained models can be deployed for production inference with TensorFlow Serving. In this article, we'll walk through how to save a trained model in TensorFlow's SavedModel format and then serve it with TensorFlow Serving.

Understanding the TensorFlow SavedModel Format

The SavedModel format is TensorFlow's standard format for serializing trained models. It makes a model independent of the code that created it, which is valuable because:

  • It saves everything required to share or deploy a model: the computation graph, the trained weights and other variables, and the signatures used for inference.
  • It is portable across different programming environments and tools, and can be reloaded without the original training code (see the sketch after this list).
  • It lets a single exported model be served to many clients across various platforms, for example with TensorFlow Serving.
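
For example, a SavedModel can be reloaded and inspected without any of the code that built it. Below is a minimal sketch, assuming a model has already been exported to the versioned path used later in this article:

import tensorflow as tf

# Load a SavedModel -- none of the original model-building code is required.
# The path is the versioned export directory created later in this article.
loaded = tf.saved_model.load("/tmp/saved_model/my_model/1")

# The default serving signature describes the inputs and outputs the model expects
infer = loaded.signatures["serving_default"]
print(infer.structured_input_signature)  # input names, shapes, and dtypes
print(infer.structured_outputs)          # output tensor specs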

Saving a Model Using SavedModel

To begin with, let's look at how to save a model in the SavedModel format. Assume you have a simple neural network model built using TensorFlow:

import numpy as np
import tensorflow as tf

# Build a simple Sequential classifier for 28x28 inputs (e.g. MNIST-style images)
def create_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),   # flatten the 2D input
        tf.keras.layers.Dense(10, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')  # 10 output classes
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

# Create an instance of the model
model = create_model()

# Train the model (dummy data shown here -- replace with your real dataset)
x_train = np.random.random((100, 28, 28)).astype('float32')
y_train = np.random.randint(0, 10, size=(100,))
model.fit(x_train, y_train, epochs=5)

# Save the entire model. TensorFlow Serving expects a numeric version
# subdirectory, so export to ".../my_model/1" rather than ".../my_model".
tf.saved_model.save(model, "/tmp/saved_model/my_model/1")

In this example, tf.saved_model.save() exports the model to "/tmp/saved_model/my_model/1". The trailing 1 is the model version: TensorFlow Serving scans the model's base directory for numbered subdirectories and serves the highest version it finds, which is why the export goes into a versioned folder rather than directly into "/tmp/saved_model/my_model". Make sure to replace the dummy x_train and y_train with your actual dataset.
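
If you want to double-check what the export produced before setting up serving, the sketch below (using the same path as above) confirms the directory contains a SavedModel and lists the exported files:

import os
import tensorflow as tf

export_dir = "/tmp/saved_model/my_model/1"

# True if the directory contains a saved_model.pb and its associated files
print(tf.saved_model.contains_saved_model(export_dir))

# A SavedModel export typically contains:
#   saved_model.pb   -- the serialized graph and signatures
#   variables/       -- the trained weights
#   assets/          -- optional extra files such as vocabularies
for root, _, files in os.walk(export_dir):
    for name in files:
        print(os.path.join(root, name))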

Setting Up TensorFlow Serving

TensorFlow Serving makes it easier to deploy models quickly. To set up TensorFlow Serving:

  1. First, make sure Docker is installed on your machine; the simplest way to run TensorFlow Serving is through its official Docker image.
  2. Once Docker is installed, pull the TensorFlow Serving Docker image:
docker pull tensorflow/serving

Serving the SavedModel

With your SavedModel exported and the TensorFlow Serving image pulled, you can now start serving your model:

docker run -p 8501:8501 --name=tf_model_serving \
   --mount type=bind,source=/tmp/saved_model/my_model,target=/models/my_model \
   -e MODEL_NAME=my_model -t tensorflow/serving

Here's the breakdown of the command:

  • -p 8501:8501: Publishes the container's port 8501 (TensorFlow Serving's REST API port) on the host, so the model can be reached over HTTP.
  • --mount type=bind,...: Bind-mounts the model's base directory on the host into /models/my_model inside the container, where TensorFlow Serving looks for numbered version subdirectories.
  • -e MODEL_NAME=my_model: Sets the MODEL_NAME environment variable, which tells TensorFlow Serving which model to load under /models and determines the model name used in request URLs.

Testing Your Model Endpoint

Once the container is running, open http://localhost:8501/v1/models/my_model in a browser or a tool like Postman. This is TensorFlow Serving's model status endpoint.
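
You can also hit the same status endpoint from Python. A minimal check, assuming the requests package is installed:

import requests

# Query TensorFlow Serving's model status endpoint
resp = requests.get("http://localhost:8501/v1/models/my_model")
print(resp.json())
# Expect something like:
# {"model_version_status": [{"version": "1", "state": "AVAILABLE", ...}]}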

If the model loaded correctly, the response reports its version state as AVAILABLE. To run inference, send a POST request to http://localhost:8501/v1/models/my_model:predict with a JSON body like this:

{
  "signature_name": "serving_default",
  "instances": [
    {"input_tensor": [...sample input data as list...]}
  ]
}

Ensure that signature_name and the input name and shape match your model's serving signature; you can check them with the signature inspection shown earlier.
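
For instance, here is a minimal Python sketch of such a request, again assuming the requests package is installed and using dummy input data:

import json
import numpy as np
import requests

# One dummy 28x28 instance; replace with real data. Because the example model
# has a single input, the unnamed "row" format below is accepted; otherwise use
# {"<input_name>": ...} with the name reported by the serving signature.
instance = np.random.random((28, 28)).tolist()
payload = {
    "signature_name": "serving_default",
    "instances": [instance],
}

resp = requests.post(
    "http://localhost:8501/v1/models/my_model:predict",
    data=json.dumps(payload),
)
resp.raise_for_status()
print(resp.json())  # e.g. {"predictions": [[...10 class scores...]]}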

Conclusion

TensorFlow Serving is an efficient framework for serving TensorFlow models in production. Combined with the SavedModel format, it lets you deploy a trained model across different platforms and environments without the original training code, with model versioning and a ready-made HTTP API handled for you.

