10. Scaling Models in Production Flashcards
How do you deploy a model trained using TensorFlow?
Export it as a SavedModel, which contains a complete TensorFlow program, including computation and trained parameters, so you don’t need the original code to run it. You can deploy the model with TensorFlow Serving, TensorFlow Lite, TensorFlow.js, or TensorFlow Hub.
What are the two types of API endpoints allowed by TensorFlow Serving?
REST and gRPC
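As a hedged sketch, the REST endpoint accepts a POST to `/v1/models/MODEL_NAME:predict` with a JSON body in the "instances" (row) format; the model name and feature names below are made up for illustration:

```python
import json

# TF Serving's REST API expects a POST to
#   http://HOST:8501/v1/models/MODEL_NAME:predict
# with a JSON body listing the input instances. The feature names here
# are hypothetical placeholders.
body = json.dumps({
    "signature_name": "serving_default",
    "instances": [
        {"feature_a": 1.0, "feature_b": 2.0},
        {"feature_a": 3.0, "feature_b": 4.0},
    ],
})
print(body)
```

The gRPC endpoint exposes the same model through `PredictionService` with protobuf requests instead of JSON.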
What does TensorFlow Serving handle?
It handles model serving and model version management.
What are the steps to set up TF Serving?
Install TensorFlow Serving with Docker
Train and save a model with TensorFlow
Serve the model using TensorFlow Serving
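The steps above can be sketched with the official Docker image; the model path and name are placeholders:

```shell
# Pull the TensorFlow Serving image and serve a SavedModel from a local
# directory. Replace /path/to/my_model with the directory that contains
# your numbered SavedModel version folders (e.g., /path/to/my_model/1/).
docker pull tensorflow/serving
docker run -p 8501:8501 \
  --mount type=bind,source=/path/to/my_model,target=/models/my_model \
  -e MODEL_NAME=my_model -t tensorflow/serving
```

Port 8501 exposes the REST API; use 8500 for gRPC.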
How do you manage TensorFlow Serving?
Instead of managing it yourself, you can use a managed, prebuilt TensorFlow serving container on Vertex AI.
What is SignatureDef in TensorFlow Serving?
It defines how a saved model expects its inputs and provides its outputs.
How does TF Serving respond to a new version of a model?
It automatically unloads the old model and loads the newer version.
What are the two types of input features that are fetched in real time to invoke the model for prediction?
Static and dynamic reference features
What are the two use cases to set up a real-time prediction endpoint?
Models trained in Vertex AI using Vertex AI training, i.e., AutoML and custom models.
Models trained elsewhere, which you import into GCP.
How do you improve prediction latency?
Pre-compute predictions in an offline batch scoring job and store them in a low-latency read data store like Memorystore or Datastore for online serving.
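A minimal sketch of this precompute-and-lookup pattern, with a plain dict standing in for a low-latency store such as Memorystore or Datastore (the scoring function and IDs are hypothetical):

```python
def batch_score(customer_ids):
    # Offline job: run the model over all entities and return predictions.
    # A placeholder formula stands in for the real model here.
    return {cid: 0.1 * cid for cid in customer_ids}

# Populated by the offline batch job; in production this would be a
# low-latency store like Memorystore or Datastore.
prediction_store = batch_score([101, 102, 103])

def predict_online(customer_id):
    # Online serving: a single key lookup, no model invocation at all.
    return prediction_store.get(customer_id)

print(predict_online(102))
```

The online path never touches the model, so latency is bounded by a single key-value read.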
Compare Static and Dynamic Reference Features
Static:
Values don’t change in real time
Available in a data warehouse
Example: estimating the price of a house based on its location
Stored in a NoSQL database
Dynamic:
Values are computed on the fly
A list of aggregated values over a particular time window
Example: predicting an engine failure in the next hour
Computed in a Dataflow streaming pipeline and stored in a low-latency database, e.g., Bigtable
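A hedged sketch of computing a dynamic reference feature, an aggregate over a recent window. In production this would run in a Dataflow streaming pipeline with results written to Bigtable; here a deque keeps the last N readings in memory for illustration:

```python
from collections import deque

WINDOW = 3
readings = deque(maxlen=WINDOW)  # keeps only the most recent WINDOW values

def add_reading(value):
    readings.append(value)

def window_mean():
    # Aggregated value for the current window,
    # e.g., mean engine temperature over the last N readings.
    return sum(readings) / len(readings)

# Simulated sensor stream; the oldest reading falls out of the window.
for r in [95.0, 97.0, 99.0, 103.0]:
    add_reading(r)

print(window_mean())  # mean of the last 3 readings
```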
What are the two categories of lookup keys for prediction requests?
Specific entity, e.g., customer id
Specific combination of input features, e.g., a hashed combination of all possible input features.
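The second key style can be sketched as a deterministic hash of the feature values, so precomputed predictions are keyed by feature combination rather than entity ID (the feature names are hypothetical):

```python
import hashlib

def feature_key(features):
    # Sort keys so the same feature combination always produces the
    # same hash, regardless of insertion order.
    canonical = "|".join(f"{k}={features[k]}" for k in sorted(features))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

key1 = feature_key({"country": "US", "device": "mobile"})
key2 = feature_key({"device": "mobile", "country": "US"})
print(key1 == key2)  # order-independent: same combination, same key
```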
What are the required model filenames when importing into a prebuilt container?
TensorFlow SavedModel: saved_model.pb
scikit-learn: model.joblib or model.pkl
XGBoost: model.bst, model.joblib or model.pkl
How do you import a custom container?
Create a container image
Push the image to Artifact Registry using Cloud Build.
How to set up autoscaling for your endpoint in Vertex AI?
Specify the autoscaling configuration for your endpoint’s container; Vertex AI automatically provisions the resources and sets up autoscaling for the endpoint.
What do you need to specify when deploying a model in Vertex AI?
The deploy method will create an endpoint and deploy your model.
You need to provide:
model name, traffic-split, machine-type, accelerator type, accelerator count, starting replica count, max replica count
Hints: Never Stop Making Amazing Apple Raspberry Rolls.
Can you deploy models from Model Registry?
Yes, Model Registry is a centralized place to track versions of both AutoML and custom models.
What is the input format if you use a prebuilt or custom container to serve predictions?
JSON
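As an illustration, a Vertex AI online-prediction request body is a JSON object with an `instances` list and an optional `parameters` object; the feature values and the `confidence_threshold` parameter below are made up:

```python
import json

request = {
    "instances": [
        [1.2, 3.4, 5.6],  # one instance as a plain value list
        [2.1, 4.3, 6.5],
    ],
    # "parameters" is optional and model-specific; this one is hypothetical.
    "parameters": {"confidence_threshold": 0.5},
}
print(json.dumps(request))
```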
What is A/B testing used for?
A/B testing compares the performance of two versions of a model to see which one performs better with users.
What is the strategy to replace one model with another one?
Add a new model to the same endpoint and gradually increase the traffic split for the new model until 100%.
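A minimal sketch of such a gradual rollout: both versions sit behind one endpoint, and the new model’s share is raised in steps until it takes 100% of traffic. The step sizes and model names are illustrative:

```python
def rollout_schedule(steps=(10, 25, 50, 100)):
    # Each entry is a traffic split {deployed_model: percentage};
    # the percentages at every step must sum to 100.
    return [{"old_model": 100 - p, "new_model": p} for p in steps]

for split in rollout_schedule():
    assert sum(split.values()) == 100  # splits must always total 100%
    print(split)
```

Once the new model serves 100%, the old model can be undeployed from the endpoint.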
What is Vertex AI model evaluation used for?
It runs model evaluation jobs regardless of which Vertex AI service was used to train the model.
It will also store and visualize the evaluation results across multiple models in the Model Registry.
What do you get from online explanation requests?
You get both predictions and feature attributions.
Why do you need to undeploy models?
Deployed models incur charges even when idle, so undeploy models you no longer use.
What are the input data options for batch training in Vertex AI?
JSON, TFRecord, CSV files, File list, BigQuery
Hints: Lions Jump Carefully Through Bushes.