TensorFlow is a powerful open-source library for machine learning and deep learning. A big part of its versatility is how easily trained models can be deployed with TensorFlow Serving. In this article, we'll walk through how to save a trained model in TensorFlow's SavedModel format and then serve it with TensorFlow Serving.
Understanding the TensorFlow SavedModel Format
The SavedModel format is TensorFlow's standard format for serializing trained models. It makes a model independent of the code that created it (as the short sketch after this list illustrates) and is favored because:
- It saves everything required to share or deploy a model: the computation graph, the weights and other variables, the serving signatures, and optionally the optimizer state.
- It is portable across programming environments and tools, including TensorFlow Serving, TensorFlow Lite, and TensorFlow.js.
- A single exported model can be served to many clients across different platforms.
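For example, once a model has been exported (as we will do in the next section), it can be reloaded and queried without any of the code that built it. Here is a minimal sketch, assuming a SavedModel already exists at the path used later in this article:
import tensorflow as tf

# Reload the exported model; none of the original Keras code is required
loaded = tf.saved_model.load("/tmp/saved_model/my_model/1")

# The exported inference function is exposed as a named signature
infer = loaded.signatures["serving_default"]

# Signature functions take keyword arguments named after the model's inputs,
# so look the input name up rather than hard-coding it
_, input_specs = infer.structured_input_signature
input_name = next(iter(input_specs))

# Run a prediction on a random batch shaped like the training data
batch = tf.random.uniform((1, 28, 28))
print(infer(**{input_name: batch}))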
Saving a Model Using SavedModel
To begin with, let's look at how to save a model in the SavedModel format. Assume you have a simple neural network model built using TensorFlow:
import tensorflow as tf
import numpy as np

# Build a simple Sequential classifier
def create_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(10, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

# Create an instance of the model
model = create_model()

# Train the model (dummy data shown here; substitute your real dataset)
x_train = np.random.rand(100, 28, 28).astype("float32")
y_train = np.random.randint(0, 10, size=(100,))
model.fit(x_train, y_train, epochs=5)

# Export the model; note the numbered version subdirectory,
# which TensorFlow Serving expects
tf.saved_model.save(model, "/tmp/saved_model/my_model/1")
In this example, tf.saved_model.save() exports the model to the directory /tmp/saved_model/my_model/1. The trailing 1 is a version number: TensorFlow Serving scans the base directory (/tmp/saved_model/my_model) for numbered subdirectories and serves the highest version it finds, so saving directly into the base directory would leave it with nothing to load. The x_train and y_train arrays above are random placeholders; replace them with your actual dataset.
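Before moving on to Serving, it can be worth glancing at what the export actually wrote to disk. A quick sketch; the exact contents vary a little between TensorFlow versions:
import os

export_dir = "/tmp/saved_model/my_model/1"

# List what tf.saved_model.save wrote
for name in sorted(os.listdir(export_dir)):
    print(name)

# Typically you will see something like:
#   assets/          auxiliary files referenced by the graph (often empty)
#   fingerprint.pb   integrity metadata written by newer TensorFlow versions
#   saved_model.pb   the serialized computation graph and its signatures
#   variables/       checkpoint shards containing the trained weights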
Setting Up TensorFlow Serving
TensorFlow Serving makes it easy to deploy models quickly. To set it up:
- First, make sure Docker is installed on your machine; the easiest way to run TensorFlow Serving is via its official Docker image.
- Once Docker is installed, pull the TensorFlow Serving image:
docker pull tensorflow/serving
Serving the SavedModel
With your SavedModel exported and the TensorFlow Serving image pulled, you can now start serving your model:
docker run -p 8501:8501 --name=tf_model_serving \
--mount type=bind,source=/tmp/saved_model/my_model,target=/models/my_model \
-e MODEL_NAME=my_model -t tensorflow/serving
Here's a breakdown of the command:
- -p 8501:8501: Maps port 8501 (TensorFlow Serving's REST API port) inside the container to port 8501 on the host, so the model can be reached over HTTP.
- --mount type=bind,...: Bind-mounts the host directory that holds the SavedModel versions onto /models/my_model inside the container. Because the whole base directory is mounted, new version subdirectories dropped into it are picked up automatically, as sketched after this list.
- -e MODEL_NAME=my_model: Sets the MODEL_NAME environment variable, which tells TensorFlow Serving to load the model from /models/my_model.
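A convenient consequence of mounting the base directory rather than a single version is that TensorFlow Serving watches it for new numbered subdirectories and, by default, switches to the highest version it finds. A minimal sketch of rolling out an update, reusing the model object from the training snippet above:
import tensorflow as tf

# Assume `model` is the (possibly retrained) Keras model from earlier.
# Exporting it under a higher version number is enough for the running
# container to load it and retire version 1; no restart is required.
tf.saved_model.save(model, "/tmp/saved_model/my_model/2")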
Testing Your Model Endpoint
After setting this up, open your browser or a tool like Postman and access the URL http://localhost:8501/v1/models/my_model. TensorFlow Serving responds with the model's version status, confirming it is loaded and ready. To run inference, send a POST request to http://localhost:8501/v1/models/my_model:predict with a body like:
{
  "signature_name": "serving_default",
  "instances": [
    {"input_tensor": [...sample input data as list...]}
  ]
}
Make sure signature_name and the instance keys and shapes match your model's serving signature (the earlier loading sketch shows how to look up the input names).
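If you prefer to script this check, here is a minimal sketch using the requests library (any HTTP client will do). The instance key flatten_input is a placeholder; use the actual input name from your model's serving_default signature:
import json
import numpy as np
import requests

BASE_URL = "http://localhost:8501/v1/models/my_model"

# 1. Confirm the model is loaded and available
status = requests.get(BASE_URL)
print(status.json())

# 2. Send a single 28x28 instance to the :predict endpoint.
#    "flatten_input" is a placeholder key; it must match the input name
#    in your model's serving signature.
payload = {
    "signature_name": "serving_default",
    "instances": [{"flatten_input": np.random.rand(28, 28).tolist()}],
}
response = requests.post(BASE_URL + ":predict", data=json.dumps(payload))

# A successful call returns a JSON object with a "predictions" key,
# holding one output (here, 10 class probabilities) per instance.
print(response.json()["predictions"])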
Conclusion
TensorFlow Serving is an efficient way to run TensorFlow models in production. Combined with the SavedModel format, it lets you export a model once and deploy it consistently across different platforms and environments.