10. Scaling Models in Production Flashcards
How do you deploy a model trained using TensorFlow?
Export it as a SavedModel, which contains a complete TensorFlow program, including the weights and the computation graph, so you don’t need the original code to run it. You can then deploy the SavedModel with TensorFlow Lite, TensorFlow.js, TensorFlow Serving, or TensorFlow Hub.
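A minimal sketch of exporting a SavedModel from Python; the model and export path are illustrative, and TensorFlow Serving expects a numeric version subdirectory:

```python
import tensorflow as tf

# A small stand-in model; replace with your real training code.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),
])

# Export to the SavedModel format. TensorFlow Serving expects a numeric
# version subdirectory, e.g. /tmp/my_model/1 (path is illustrative).
tf.saved_model.save(model, "/tmp/my_model/1")
```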
What two types of API endpoints does TensorFlow Serving expose?
REST and gRPC
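As an illustration, a REST prediction request against a locally running TensorFlow Serving instance could look like this; the model name is an assumption, and 8501 is TensorFlow Serving’s default REST port:

```python
import requests

# TensorFlow Serving's REST predict endpoint follows the pattern
# /v1/models/<model_name>:predict; "my_model" is an assumed name.
url = "http://localhost:8501/v1/models/my_model:predict"
payload = {"instances": [[5.1, 3.5, 1.4, 0.2]]}

response = requests.post(url, json=payload)
print(response.json())  # e.g. {"predictions": [[...]]}
```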
What does TensorFlow Serving handle?
It handles model serving and version management.
What are the steps to set up TF Serving?
Install TensorFlow Serving with Docker
Train and save a model with TensorFlow
Serve the model using TensorFlow Serving
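A sketch of the “serve” step, launching the official TensorFlow Serving Docker image from Python; the paths, port, and model name are illustrative, and the same command can of course be run directly in a shell:

```python
import subprocess

# Run TensorFlow Serving in Docker, binding the exported SavedModel
# directory into the container and exposing the default REST port (8501).
subprocess.run([
    "docker", "run", "-p", "8501:8501",
    "--mount", "type=bind,source=/tmp/my_model,target=/models/my_model",
    "-e", "MODEL_NAME=my_model",
    "tensorflow/serving",
], check=True)
```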
How do you manage TensorFlow Serving?
You can use a managed, prebuilt TensorFlow serving container on Vertex AI instead of running TensorFlow Serving yourself.
What is a SignatureDef in TensorFlow Serving?
It defines the inputs a SavedModel expects and the outputs it provides.
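A minimal sketch of inspecting a SavedModel’s serving signature from Python; the export path is an assumption, and `saved_model_cli show --dir <path> --all` prints the same information from the command line:

```python
import tensorflow as tf

# Load the SavedModel and look up its default serving signature.
loaded = tf.saved_model.load("/tmp/my_model/1")
serving_fn = loaded.signatures["serving_default"]

# The SignatureDef describes the expected inputs and provided outputs.
print(serving_fn.structured_input_signature)
print(serving_fn.structured_outputs)
```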
How does TF Serving respond to a new version of a model?
It automatically unloads the old model and loads the newer version.
What are the two types of input features that are fetched in real time to invoke the model for prediction?
Static and dynamic reference features
What are the two use cases for setting up a real-time prediction endpoint?
Models trained in Vertex AI, either with AutoML or with custom training.
Models trained elsewhere and imported into GCP.
How can you improve prediction latency?
Pre-compute predictions in an offline batch scoring job and store them in a low-latency read data store like Memorystore or Datastore for online serving.
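A rough sketch of that pattern, assuming a Memorystore for Redis instance; the host, key format, and entity ids are illustrative:

```python
import json
import redis

# Connect to a Memorystore (Redis-compatible) instance; host is an assumption.
cache = redis.Redis(host="10.0.0.3", port=6379)

def store_batch_predictions(entity_ids, predictions):
    """Write predictions precomputed by an offline batch scoring job."""
    for entity_id, prediction in zip(entity_ids, predictions):
        cache.set(f"prediction:{entity_id}", json.dumps(prediction))

def serve_prediction(entity_id):
    """Online serving becomes a low-latency key lookup."""
    value = cache.get(f"prediction:{entity_id}")
    return json.loads(value) if value is not None else None
```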
Compare Static and Dynamic Reference Features
Static:
Values don’t change in real time
Available in a data warehouse
Example: estimating the price of a house based on its location
Stored in a NoSQL database
Dynamic:
Values are computed on the fly, e.g., a list of values aggregated over a particular time window
Example: predicting an engine failure in the next hour
Computed with a Dataflow streaming pipeline and stored in a low-latency database, e.g., Bigtable
What are the two categories of lookup keys for prediction requests?
Specific entity, e.g., customer id
Specific combination of input features, e.g., a hashed combination of the input feature values (see the sketch below).
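For the second category, the lookup key can be a deterministic hash of the input feature values; a minimal sketch with illustrative feature names:

```python
import hashlib
import json

def feature_lookup_key(features: dict) -> str:
    """Build a stable lookup key from a combination of input feature values."""
    # Sort keys so the same feature values always produce the same hash.
    canonical = json.dumps(features, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

key = feature_lookup_key({"origin": "SFO", "destination": "JFK", "hour": 9})
```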
What are the required model filenames when importing a model into a prebuilt container?
TensorFlow SavedModel: saved_model.pb
scikit-learn: model.joblib or model.pkl
XGBoost: model.bst, model.joblib or model.pkl
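For example, a scikit-learn model has to be exported under one of those exact filenames before importing it; a minimal sketch with toy training data:

```python
import joblib
from sklearn.linear_model import LogisticRegression

# Toy data just to produce a fitted model.
model = LogisticRegression()
model.fit([[0.0], [1.0]], [0, 1])

# The scikit-learn prebuilt container looks for this exact filename.
joblib.dump(model, "model.joblib")
```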
How do you import a custom container?
Create a container image
Push the image to Artifact Registry using Cloud Build.
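Once the image is in Artifact Registry, the model can be registered in Vertex AI with the Python SDK; a sketch assuming illustrative project, region, image URI, and route names:

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Register a model backed by a custom serving container image pushed to
# Artifact Registry (image URI, routes, and port are assumptions).
model = aiplatform.Model.upload(
    display_name="my-custom-model",
    serving_container_image_uri=(
        "us-central1-docker.pkg.dev/my-project/my-repo/my-serving-image:latest"
    ),
    serving_container_predict_route="/predict",
    serving_container_health_route="/health",
    serving_container_ports=[8080],
)
```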
How do you set up autoscaling for your endpoint in Vertex AI?
Specify autoscaling settings for your endpoint’s container when you deploy the model; Vertex AI automatically provisions the container resources and scales the endpoint.
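With the Vertex AI Python SDK, autoscaling is typically configured through minimum and maximum replica counts at deployment time; a sketch with illustrative project, model resource name, and machine type:

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Reference a previously uploaded model (resource name is illustrative).
model = aiplatform.Model("projects/my-project/locations/us-central1/models/123")

# Vertex AI scales the replica count between min and max based on load.
endpoint = model.deploy(
    machine_type="n1-standard-4",
    min_replica_count=1,
    max_replica_count=5,
)
print(endpoint.resource_name)
```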