Serving and scaling models Flashcards
What are the three major challenges in managing ML features, and how does Vertex AI Feature Store address them?
The three major challenges are:
Sharing and Reuse: Features are often duplicated across projects and teams.
Low-Latency Serving: Serving features in production with low latency is difficult.
Training-Serving Skew: Misalignment between training and serving feature values.
Vertex AI Feature Store addresses these by:
Providing a centralized repository for feature sharing and discovery.
Ensuring low-latency feature serving with optimized infrastructure.
Computing feature values once for both training and serving, mitigating skew.
What is the difference between batch and online serving in Vertex AI Feature Store? Provide use cases for each.
Batch Serving:
Fetches large volumes of data for offline tasks like model training or batch predictions.
Example: Preparing a dataset for retraining a demand forecasting model.
Online Serving:
Retrieves small batches of data with low latency for real-time predictions.
Example: Fetching user preferences for personalized recommendations.
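A minimal sketch of both serving modes, assuming the google-cloud-aiplatform Python SDK (legacy Feature Store API); the project, region, and resource names are placeholders:

```python
from google.cloud import aiplatform

# Placeholder project, region, and resource names throughout.
aiplatform.init(project="my-project", location="us-central1")

fs = aiplatform.Featurestore(featurestore_name="movie_predictions")
users = fs.get_entity_type(entity_type_id="users")

# Online serving: low-latency read of the latest values for a few entities.
online_df = users.read(
    entity_ids=["user_01", "user_02"],
    feature_ids=["age", "liked_genres"],
)

# Batch serving: bulk export of feature values to BigQuery for training
# or batch prediction.
fs.batch_serve_to_bq(
    bq_destination_output_uri="bq://my-project.my_dataset.training_data",
    serving_feature_ids={"users": ["age", "liked_genres"]},
    read_instances_uri="bq://my-project.my_dataset.read_instances",
)
```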
Define the key components of the Vertex AI Feature Store data model and their roles.
1) Feature Store: The top-level container for all features and their values.
2) Entity Type: A collection of semantically related features (e.g., “Users” or “Movies”).
3) Entities: Specific instances of entity types (e.g., “user_01” or “movie_02”).
4) Features: Attributes of entities (e.g., “age,” “average_rating”).
5) Feature Values: The values of each feature for a specific entity, timestamped to capture changes over time.
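A hedged sketch of creating this hierarchy with the google-cloud-aiplatform SDK; all IDs and the node count below are illustrative:

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Feature store: the top-level container (IDs are placeholders).
fs = aiplatform.Featurestore.create(
    featurestore_id="movie_predictions",
    online_store_fixed_node_count=1,
)

# Entity type: a collection of semantically related features.
movies = fs.create_entity_type(entity_type_id="movies")

# Features: attributes of the entity type, each with a declared value type.
movies.create_feature(feature_id="average_rating", value_type="DOUBLE")
movies.create_feature(feature_id="genres", value_type="STRING_ARRAY")
```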
How does the Feature Store help prevent data leakage during model training?
Feature Store uses point-in-time lookups to fetch feature values as they existed when the labels were generated. This ensures:
Temporal Consistency: Feature values align with the time of the labeled event (the moment a prediction would have been made).
No Future Data Leakage: Prevents using information unavailable during the original event.
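A minimal sketch of a point-in-time lookup via batch serving, assuming the google-cloud-aiplatform SDK; the entity IDs and dates are made up:

```python
import pandas as pd
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")
fs = aiplatform.Featurestore(featurestore_name="movie_predictions")

# One row per training label: the entity ID plus the time the label was
# observed. Feature Store returns each feature's value as of that
# timestamp, never a later one.
read_instances = pd.DataFrame(
    {
        "users": ["user_01", "user_02"],
        "timestamp": pd.to_datetime(["2023-01-15", "2023-02-01"]),
    }
)

training_df = fs.batch_serve_to_df(
    serving_feature_ids={"users": ["age", "liked_genres"]},
    read_instances_df=read_instances,
)
```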
What are the prerequisites for data ingestion into Vertex AI Feature Store?
Data must include:
Entity ID: Uniquely identifies entities (must be a STRING).
Timestamp: Indicates when each feature value was generated (a single timestamp may be supplied for the whole import if all values were generated at the same time).
Feature Columns: Values matching the feature schema.
Data must be in:
BigQuery tables.
Cloud Storage files in Avro or CSV format.
Column names must be defined (e.g., via the Avro schema, a CSV header row, or BigQuery column names).
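A sketch of batch ingestion from BigQuery under these prerequisites; the table, column, and feature names are placeholders:

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")
fs = aiplatform.Featurestore(featurestore_name="movie_predictions")
movies = fs.get_entity_type(entity_type_id="movies")

# Columns in the source table must match the feature IDs being imported.
movies.ingest_from_bq(
    feature_ids=["average_rating", "genres"],
    feature_time="update_time",        # timestamp column in the source table
    bq_source_uri="bq://my-project.my_dataset.movie_features",
    entity_id_field="movie_id",        # STRING column holding entity IDs
)
```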
Explain the concept of training-serving skew and how Feature Store mitigates it.
Training-Serving Skew: Mismatch between feature values used during training and those used during serving.
Mitigation:
Compute feature values once and reuse them.
Centralized storage ensures consistent feature definitions.
Monitor and alert for data drift.
Describe the role of an entity view in the Feature Store. How does feature retrieval differ between online and batch serving?
An entity view contains the feature values returned by a read request: a subset of features and their values for a given entity type.
Online Serving: Retrieves specific features for real-time predictions.
Batch Serving: Combines features across multiple entity types for offline tasks.
What is feature ingestion, and what methods does Feature Store offer?
Feature ingestion is importing computed feature values into the Feature Store.
Methods:
Batch Ingestion: Bulk import from sources like BigQuery or Cloud Storage.
Stream Ingestion: Real-time feature updates for online use.
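A minimal stream-ingestion sketch, assuming the SDK's write_feature_values method (a preview API at the time of these notes); the entity and feature values are illustrative:

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")
fs = aiplatform.Featurestore(featurestore_name="movie_predictions")
users = fs.get_entity_type(entity_type_id="users")

# Stream-write the latest values for a single entity; they become
# available to online serving immediately.
users.write_feature_values(
    instances={"user_01": {"age": 32, "liked_genres": ["drama", "comedy"]}}
)
```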
How does Feature Store enable feature monitoring, and what can be tracked?
Feature monitoring tracks:
Data Drift: Changes in feature distributions.
Ingestion Metrics: Volume, processing time, errors.
Serving Metrics: CPU utilization, latency.
Alerts can be set for anomalies to ensure data quality.
Why are timestamps important in Feature Store, and how are they used?
Timestamps associate feature values with their generation time:
Enable point-in-time lookups.
Track historical changes in features.
Facilitate time-series modeling.
What are the data retention policies in Feature Store?
Feature Store retains feature values based on their feature timestamps: values whose timestamps fall outside the retention limit are expired. The limit is measured against the timestamp column, not the ingestion time.
How can Feature Store improve collaboration in ML projects?
Centralizes features for reuse across teams.
Provides APIs for easy discovery and access.
Implements role-based permissions for governance.
How are feature values stored for batch and online serving?
Offline Store: Retains historical data for training and batch predictions.
Online Store: Holds the latest feature values for low-latency retrieval.
What steps are involved in creating a Feature Store?
Preprocess and clean data.
Define the feature store, entity types, and features.
Ingest feature values using batch or streaming methods.
Enable monitoring for quality control.
Describe the relationship between an entity and its features in Feature Store.
Entities are instances of entity types, and features describe specific attributes of these entities. For example:
Entity Type: “Movies.”
Entity: “movie_01.”
Features: “average_rating,” “genres.”
What is the minimum dataset size required for Feature Store ingestion?
Feature Store requires a minimum of 1,000 rows for batch ingestion to ensure data quality and usability.
How does Feature Store support feature discovery and sharing?
APIs: Search and retrieve features easily.
Centralized Repository: Ensures shared access.
Versioning: Tracks feature evolution for collaboration.
What are the value types supported by Feature Store, and why is this flexibility important?
Supported types: scalars (BOOL, INT64, DOUBLE, STRING, BYTES) and arrays of scalars (e.g., BOOL_ARRAY, STRING_ARRAY).
Importance: Handles diverse data formats across ML models and tasks.
How does Feature Store handle array data types?
Array values can only be ingested from Avro or BigQuery sources (CSV does not support arrays). Null values are not allowed inside arrays, but empty arrays are acceptable.
Provide an example of using Feature Store for a real-world prediction task.
Example: Predicting baby weight based on historical features like birth location and mother’s age:
Batch-ingest historical data into the Feature Store.
Serve the latest feature values via the online serving API to power real-time predictions in a mobile app.
Give a high-level outline of the steps needed to build an end-to-end pipeline that predicts user churn using XGBoost.
The lab focuses on building an end-to-end machine learning pipeline that involves:
1) Training an XGBoost classifier in BigQuery ML to predict user churn.
2) Evaluating and explaining the model using BigQuery ML Explainable AI.
3) Generating batch predictions directly in BigQuery.
4) Exporting the model to Vertex AI for online prediction.
5) Leveraging Vertex AI for scalable predictions and MLOps.
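A hedged sketch of step 1 using the google-cloud-bigquery client; the dataset, table, and label column names are placeholders standing in for the lab's Google Analytics 4 data:

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

# Train an XGBoost (boosted tree) classifier in place with BigQuery ML.
client.query(
    """
    CREATE OR REPLACE MODEL `my_dataset.churn_model`
    OPTIONS (
      model_type = 'BOOSTED_TREE_CLASSIFIER',
      input_label_cols = ['churned']
    ) AS
    SELECT * FROM `my_dataset.training_data`
    """
).result()
```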
How does BigQuery ML eliminate common ML workflow inefficiencies?
BigQuery ML:
1) Enables training and inference using standard SQL queries, eliminating the need to move data to separate environments.
2) Reduces the complexity of ML pipelines with fewer lines of code.
3) Integrates seamlessly with BigQuery’s scalable data storage and querying capabilities.
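A minimal sketch of evaluation and batch prediction as SQL run from Python; the model and table names are placeholders:

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

# Evaluation and batch prediction run as ordinary SQL queries: the data
# never leaves BigQuery.
eval_df = client.query(
    "SELECT * FROM ML.EVALUATE(MODEL `my_dataset.churn_model`)"
).to_dataframe()

pred_df = client.query(
    """
    SELECT * FROM ML.PREDICT(
      MODEL `my_dataset.churn_model`,
      (SELECT * FROM `my_dataset.new_users`)
    )
    """
).to_dataframe()
```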
What are the advantages of deploying BigQuery ML models to Vertex AI?
Scalable Online Predictions: Provides low-latency, real-time predictions.
Enhanced Monitoring: Utilizes Vertex AI’s MLOps tools for retraining and anomaly detection.
Integration with Applications: Enables direct integration with customer-facing UIs like dashboards.
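A hedged sketch of the export-and-deploy path, assuming EXPORT MODEL for the Cloud Storage export and a prebuilt XGBoost serving container (whose version must match the exported model format); all names and URIs are placeholders:

```python
from google.cloud import aiplatform, bigquery

bq = bigquery.Client(project="my-project")

# Export the trained BigQuery ML model to Cloud Storage...
bq.query(
    "EXPORT MODEL `my_dataset.churn_model`"
    " OPTIONS (URI = 'gs://my-bucket/churn_model')"
).result()

# ...then upload it to Vertex AI and deploy an endpoint for online
# prediction.
aiplatform.init(project="my-project", location="us-central1")
model = aiplatform.Model.upload(
    display_name="churn-model",
    artifact_uri="gs://my-bucket/churn_model",
    serving_container_image_uri=(
        "us-docker.pkg.dev/vertex-ai/prediction/xgboost-cpu.1-1:latest"
    ),
)
endpoint = model.deploy(machine_type="n1-standard-2")
```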
Why is the use of Google Analytics 4 data significant?
The Google Analytics 4 dataset used in the lab provides real-world user data from the mobile application Flood It! to predict user churn. It allows ML engineers to train realistic models for business-driven use cases.