10. Scaling Models in Production Flashcards
How do you deploy a model trained using TensorFlow?
Export it as a SavedModel, which contains a complete TensorFlow program, including computation and trained parameters, so you don’t need the original code to run it. You can deploy the model with TensorFlow Serving, TensorFlow Lite, TensorFlow.js, or TensorFlow Hub.
What are the two types of API endpoints allowed by TensorFlow Serving?
REST and gRPC
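As a hedged sketch, the REST endpoint accepts a POST to `/v1/models/MODEL_NAME:predict` with a JSON body in the "instances" (row) format; the model name and feature names below are made up for illustration:

```python
import json

# TF Serving's REST API expects a POST to
#   http://HOST:8501/v1/models/MODEL_NAME:predict
# with a JSON body listing the input instances. The feature names here
# are hypothetical placeholders.
body = json.dumps({
    "signature_name": "serving_default",
    "instances": [
        {"feature_a": 1.0, "feature_b": 2.0},
        {"feature_a": 3.0, "feature_b": 4.0},
    ],
})
print(body)
```

The gRPC endpoint exposes the same model through `PredictionService` with protobuf requests instead of JSON.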
What does TensorFlow Serving handle?
It handles model serving and model version management.
What are the steps to set up TF Serving?
Install TensorFlow Serving with Docker
Train and save a model with TensorFlow
Serve the model using TensorFlow Serving
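The steps above can be sketched with the official Docker image; the model path and name are placeholders:

```shell
# Pull the TensorFlow Serving image and serve a SavedModel from a local
# directory. Replace /path/to/my_model with the directory that contains
# your numbered SavedModel version folders (e.g., /path/to/my_model/1/).
docker pull tensorflow/serving
docker run -p 8501:8501 \
  --mount type=bind,source=/path/to/my_model,target=/models/my_model \
  -e MODEL_NAME=my_model -t tensorflow/serving
```

Port 8501 exposes the REST API; use 8500 for gRPC.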
How do you manage TensorFlow Serving?
Instead of managing it yourself, you can use a managed, prebuilt TensorFlow serving container on Vertex AI.
What is SignatureDef in TensorFlow Serving?
It defines how a saved model expects its inputs and provides its outputs.
How does TF Serving respond to a new version of a model?
It automatically unloads the old model and loads the newer version.
What are the two types of input features that are fetched in real time to invoke the model for prediction?
Static and dynamic reference features
What are the two use cases to set up a real-time prediction endpoint?
Models trained in Vertex AI using Vertex AI training, i.e., AutoML and custom models.
Models trained elsewhere, which you import into GCP.
How do you improve prediction latency?
Pre-compute predictions in an offline batch scoring job and store them in a low-latency read data store like Memorystore or Datastore for online serving.
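A minimal sketch of this precompute-and-lookup pattern, with a plain dict standing in for a low-latency store such as Memorystore or Datastore (the scoring function and IDs are hypothetical):

```python
def batch_score(customer_ids):
    # Offline job: run the model over all entities and return predictions.
    # A placeholder formula stands in for the real model here.
    return {cid: 0.1 * cid for cid in customer_ids}

# Populated by the offline batch job; in production this would be a
# low-latency store like Memorystore or Datastore.
prediction_store = batch_score([101, 102, 103])

def predict_online(customer_id):
    # Online serving: a single key lookup, no model invocation at all.
    return prediction_store.get(customer_id)

print(predict_online(102))
```

The online path never touches the model, so latency is bounded by a single key-value read.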
Compare Static and Dynamic Reference Features
Static:
Values don’t change in real time
Available in a data warehouse
Example: estimating the price of a house based on its location
Stored in a NoSQL database
Dynamic:
Values are computed on the fly
A list of aggregated values over a particular time window
Example: predicting an engine failure in the next hour
Computed in a Dataflow streaming pipeline and stored in a low-latency database, e.g., Bigtable
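A hedged sketch of computing a dynamic reference feature, an aggregate over a recent window. In production this would run in a Dataflow streaming pipeline with results written to Bigtable; here a deque keeps the last N readings in memory for illustration:

```python
from collections import deque

WINDOW = 3
readings = deque(maxlen=WINDOW)  # keeps only the most recent WINDOW values

def add_reading(value):
    readings.append(value)

def window_mean():
    # Aggregated value for the current window,
    # e.g., mean engine temperature over the last N readings.
    return sum(readings) / len(readings)

# Simulated sensor stream; the oldest reading falls out of the window.
for r in [95.0, 97.0, 99.0, 103.0]:
    add_reading(r)

print(window_mean())  # mean of the last 3 readings
```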
What are the two categories of lookup keys for prediction requests?
Specific entity, e.g., customer id
Specific combination of input features, e.g., a hashed combination of all possible input features.
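The second key style can be sketched as a deterministic hash of the feature values, so precomputed predictions are keyed by feature combination rather than entity ID (the feature names are hypothetical):

```python
import hashlib

def feature_key(features):
    # Sort keys so the same feature combination always produces the
    # same hash, regardless of insertion order.
    canonical = "|".join(f"{k}={features[k]}" for k in sorted(features))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

key1 = feature_key({"country": "US", "device": "mobile"})
key2 = feature_key({"device": "mobile", "country": "US"})
print(key1 == key2)  # order-independent: same combination, same key
```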
What are the required model filenames when importing into a prebuilt container?
TensorFlow SavedModel: saved_model.pb
scikit-learn: model.joblib or model.pkl
XGBoost: model.bst, model.joblib or model.pkl
How do you import a custom container?
Create a container image
Push the image to Artifact Registry using Cloud Build.
How to set up autoscaling for your endpoint in Vertex AI?
Specify the autoscaling configuration for your endpoint’s container; Vertex AI automatically provisions the resources and sets up autoscaling for the endpoint.
What do you need to specify when deploying a model in Vertex AI?
The deploy method will create an endpoint and deploy your model.
You need to provide:
model name, traffic-split, machine-type, accelerator type, accelerator count, starting replica count, max replica count
Hints: Never Stop Making Amazing Apple Raspberry Rolls.
Can you deploy models from Model Registry?
Yes, Model Registry is a centralized place to track versions of both AutoML and custom models.
What is the input format if you use a prebuilt or custom container to serve predictions?
JSON
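As an illustration, a Vertex AI online-prediction request body is a JSON object with an `instances` list and an optional `parameters` object; the feature values and the `confidence_threshold` parameter below are made up:

```python
import json

request = {
    "instances": [
        [1.2, 3.4, 5.6],  # one instance as a plain value list
        [2.1, 4.3, 6.5],
    ],
    # "parameters" is optional and model-specific; this one is hypothetical.
    "parameters": {"confidence_threshold": 0.5},
}
print(json.dumps(request))
```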
What is A/B testing used for?
A/B testing compares the performance of two versions of a model to see which one performs better with users.
What is the strategy to replace one model with another one?
Add a new model to the same endpoint and gradually increase the traffic split for the new model until 100%.
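A minimal sketch of such a gradual rollout: both versions sit behind one endpoint, and the new model’s share is raised in steps until it takes 100% of traffic. The step sizes and model names are illustrative:

```python
def rollout_schedule(steps=(10, 25, 50, 100)):
    # Each entry is a traffic split {deployed_model: percentage};
    # the percentages at every step must sum to 100.
    return [{"old_model": 100 - p, "new_model": p} for p in steps]

for split in rollout_schedule():
    assert sum(split.values()) == 100  # splits must always total 100%
    print(split)
```

Once the new model serves 100%, the old model can be undeployed from the endpoint.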
What is Vertex AI model evaluation used for?
It runs model evaluation jobs regardless of which Vertex AI service was used to train the model.
It will also store and visualize the evaluation results across multiple models in the Model Registry.
What do you get from online explanation requests?
You get both predictions and feature attributions.
Why do you need to undeploy models?
Deployed models incur charges even when idle, so undeploy models you no longer use.
What are the input data options for batch training in Vertex AI?
JSON, TFRecord, CSV files, File list, BigQuery
Hints: Lions Jump Carefully Through Bushes.