MLE Flashcards

Question 1

Q

What are the two types of quantization in TensorFlow?

Answer

A

Post Training Quantization and Quantization Aware Training.

Question 2

Q

What is Post Training Quantization?

Answer

A

Post-training quantization is a conversion technique that can reduce model size while also improving CPU and hardware accelerator latency, with little degradation in model accuracy. You can quantize an already-trained float TensorFlow model when you convert it to TensorFlow Lite format using the TensorFlow Lite Converter. It reduces the model size through full integer quantization, reduced float precision (float16) or dynamic range quantization.
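As a rough illustration of what integer quantization does to a set of weights, the sketch below maps floats to 8-bit integers using a scale and zero point, then maps them back. It is plain Python, not the TensorFlow Lite Converter, and the names quantize and dequantize are hypothetical helpers chosen for this example.

```python
def quantize(values, num_bits=8):
    """Affine-quantize a list of floats to unsigned integers (illustrative only)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # The represented range must contain 0 so that zero maps to an integer.
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid a zero scale for all-zero input
    zero_point = round(qmin - lo / scale)
    q = [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Map quantized integers back to approximate floats."""
    return [(x - zero_point) * scale for x in q]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
# The round trip loses at most about one quantization step per weight.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight now fits in one byte instead of four, which is where the size reduction comes from; the price is the small round-trip error captured in max_error.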

Question 3

Q

What is Quantization Aware Training?

Answer

A

Training that shrinks a model by reducing the bit precision of the weights in a neural network while it trains, so the model learns to compensate for quantization error. It is usually more accurate than post-training quantization, but it is harder to use and requires retraining.

Question 4

Q

What are the four available options for model training on Vertex AI?

Answer

A

1.) AutoML
2.) Custom Training
3.) Model Garden
4.) Generative AI

Question 5

Q

What is AutoML? When is it recommended?

Answer

A

AutoML is a no-code solution for training models on tabular, image, text, or video data without preparing data splits. It is best for:
- Automatically tuning a model with some input data.
- Teams that have little to no coding experience.
- Teams that want to quickly get a model running.
- Teams that do not want control of hyperparameter tuning aside from early stopping.
- Teams that are solving a problem in the defined problem types offered.
- Models served on an edge device or on Google Cloud.
- Models with latency greater than 100ms.

Question 6

Q

What is BigQueryML? When is it recommended?

Answer

A

BigQueryML is Google’s set of built-in SQL commands for creating and running ML models directly in BQ. It is recommended for:
- Those comfortable in SQL.
- Those with data already in BQ.
- Those whose problems are covered by BQ’s model set.

Question 7

Q

What is custom training? When is it recommended?

Answer

A

Custom training gives complete freedom to optimize all aspects of an ML pipeline. It is recommended for:
- Problems outside the scope of BQML and AutoML.
- Solutions that are already written in code elsewhere, for example on premises.

Question 8

Q

What are the three custom training methods on Vertex AI?

Answer

A

1.) Custom Jobs
2.) Hyper-parameter tuning jobs
3.) Training pipelines

Question 9

Q

What are custom training custom jobs?

Answer

A

A basic way to run a custom machine learning model on Vertex AI. It needs a pre-built or custom container to run in.

Question 10

Q

What are custom training hyperparameter tuning jobs?

Answer

A

This runs multiple trials of custom jobs to tune hyperparameters. It requires: a metric to evaluate performance against, a maximum number of trials to perform, a maximum number of parallel trials, the maximum number of trials that can fail, the machine type and any accelerators (GPUs/TPUs) it uses, and the custom or pre-built container it runs in.
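The trial loop behind a tuning job can be sketched locally. The example below is a minimal stand-in, assuming random search (not Vertex AI's default algorithm) and sequential trials; run_tuning, toy_train and the parameter names are hypothetical and are not part of the Vertex AI SDK.

```python
import random


def run_tuning(train_fn, search_space, max_trial_count, max_failed_trial_count, seed=0):
    """Run up to max_trial_count trials and return (best_metric, best_params)."""
    rng = random.Random(seed)
    best, failures = None, 0
    for _ in range(max_trial_count):
        # Sample one value per hyperparameter from its (low, high) range.
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in search_space.items()}
        try:
            metric = train_fn(**params)
        except Exception:
            failures += 1
            if failures > max_failed_trial_count:
                break  # too many failed trials: stop the whole job
            continue
        if best is None or metric > best[0]:
            best = (metric, params)
    return best


def toy_train(learning_rate):
    """Stand-in for a real training job; the metric peaks at learning_rate = 0.1."""
    return 1.0 - abs(learning_rate - 0.1)


best_metric, best_params = run_tuning(
    toy_train, {"learning_rate": (0.001, 0.5)}, max_trial_count=20, max_failed_trial_count=3
)
```

The sketch leaves out parallel trials and machine configuration; it only shows how the metric, the trial budget and the failure budget interact.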

Question 11

Q

What is a custom training, training pipeline?

Answer

A

A training pipeline can run a custom job or hyperparameter tuning job and outputs your model to a Cloud Storage bucket.

Question 12

Q

What frameworks have pre-built containers for training?

Answer

A

TensorFlow, XGBoost, scikit-learn, PyTorch

Question 13

Q

What is model garden?

Answer

A

Model Garden in the Google Cloud console is an ML model library that helps you discover, test, customize, and deploy Google proprietary and select OSS models and assets. Many of these are pretrained and allow fine tuning or transfer learning to adapt them to a related problem.

Question 14

Q

What is fine tuning and transfer learning? When should one be used over the other?

Answer

A

Transfer learning is the process of retraining only the final layers of a pre-trained model while the earlier layers stay frozen. Fine tuning is an extension of transfer learning that also retrains the weights of the pre-trained layers. Fine tuning is recommended for larger datasets, because updating all the weights on a small dataset is more likely to overfit; transfer learning is the safer choice for smaller datasets.

Question 15

Q

What is the AutoML workflow?

Answer

A

1.) Prepare your training data.
2.) Create a dataset.
3.) Train a model.
4.) Evaluate and iterate on your model.
5.) Get predictions from your model.
6.) Interpret prediction results.

Question 16

Q

What are best practices for ML Environment Setup in custom training?

Answer

A

1.) Use Vertex AI Workbench notebooks for experimentation and development.
2.) Create notebook instances for each team member.
3.) Secure ML resources, just like datasets, with IAM permissions.
4.) Use Vertex AI SDK for Python.

Question 17

Q

What are best practices for ML Development in custom training?

Answer

A

1.) Store structured and semi-structured data in BigQuery.
2.) Store image, video, audio and unstructured data on Cloud Storage.
3.) Use Vertex AI Data Labeling for unstructured data.
4.) Use Vertex AI Feature Store with structured data.
5.) Avoid storing data in block storage.
6.) Use Vertex AI TensorBoard and Vertex AI Experiments for analyzing experiments.
7.) Train a model within a notebook instance for small datasets.
8.) Maximize your model’s predictive accuracy with hyperparameter tuning.
9.) Use feature attributions (importances) to gain insights into model predictions.

Question 18

Q

What are best practices for Data Processing in custom training?

Answer

A

1.) Use BigQuery to process structured and semi-structured data or if data is in BQ already.
2.) Use Dataflow to process data.
3.) Use Dataproc for serverless Spark data processing.

Question 19

Q

What are best practices for operationalized training in custom training?

Answer

A

1.) Run code in a managed service like Vertex AI training (container based solutions with task.py file) or Vertex AI pipelines.
2.) Operationalize job execution with training pipelines.
3.) Use training checkpoints to save the current state of your experiment.
4.) Prepare model artifacts for serving in Cloud Storage.
5.) Regularly compute new feature values and push them to feature store.

Question 20

Q

What is operationalized training?

Answer

A

Operationalized training refers to the process of making model training repeatable, tracking repetitions, and managing performance.

Question 21

Q

What is Dataproc?

Answer

A

A managed Apache Spark/Hadoop service that allows batch processing, querying, streaming and ML.

Question 22

Q

What is Dataflow?

Answer

A

Dataflow is a serverless service built on Apache Beam for setting up automated data processing pipelines. It can be used with TFX and Kubeflow Pipelines as they have integrated Dataflow runners. Since Vertex AI Pipelines supports both, it can also be used there.

Question 23

Q

What is Vertex AI TensorBoard?

Answer

A

A tool for measuring and visualizing aspects of a TF ML workflow.

Question 24

Q

What is a Vertex AI Managed Dataset?

Answer

A

Vertex AI offers a central repo for datasets which can be used for AutoML and custom models on Vertex AI. It accepts image, tabular, text and video data.

Question 25

Q

What file formats should be used for model artifacts from a Vertex AI pre-built container?

Answer

A

1.) TensorFlow: saved_model.pb
2.) Scikit-Learn: model.joblib or model.pkl
3.) XGBoost: model.bst
4.) PyTorch: model.pth

Question 26

Q

What are best practices for model deployment and serving in custom training?

Answer

A

1.) Specify the number and type of machines you need.
2.) Plan inputs to the model using batch or online serving techniques.
3.) Turn on auto-scaling by defining the minimum and maximum nodes with a bare minimum of 2 nodes.

Question 27

Q

What is batch prediction?

Answer

A

Prediction on batches of data brought in at a regular interval. Requests are asynchronous and go directly to the model resource rather than to an endpoint. Requires an input source and an output location, each either GCS or BigQuery. Can also be done by reading batch features with the Feature Store API, but this would be slower as features would need to be ingested.

Question 28

Q

What are ML workflow orchestration best practices?

Answer

A

1.) Use Vertex AI Pipelines for running DAGs created with Kubeflow Pipelines or TFX.
2.) Use Kubeflow pipelines to author your pipelines.

Question 29

Q

What are recommended Artifact Organization best practices?

Answer

A

1.) Organize ML artifacts.
2.) Use version control for pipeline and custom component code.

Question 30

Q

What artifacts should be stored in the source control repo?

Answer

A

Vertex AI Workbench notebooks
Pipeline source code
Preprocessing functions
Model source code
Model training packages
Serving functions

Question 31

Q

What is an Artifact?

Answer

A

An artifact is the output resulting from each step of an ML workflow.

Question 32

Q

What artifacts should be stored in Experiments and ML Metadata?

Answer

A

Experiments
Hyperparameters
Metaparameters
Metrics
Data Artifacts
Model Artifacts
Pipeline Metadata

Question 33

Q

What artifacts should be stored in Vertex AI Model Registry?

Answer

A

Trained models from AutoML, custom training or BigQueryML. They can be versioned.

Question 34

Q

What artifacts should be stored in the Artifact Registry?

Answer

A

Pipeline containers
Custom training environments
Custom prediction environments

Question 35

Q

What artifacts should be stored in Vertex AI Prediction?

Answer

A

Deployed models

Question 36

Q

What are best practices for model monitoring?

Answer

A

1.) Use drift and skew detection at an endpoint. It uses TFDV under the hood to determine data drift and skew.
2.) Fine tune alert thresholds.
3.) Use feature attributions as an early warning sign for data drift or skew through Vertex Explainable AI.

Question 37

Q

What is data skew?

Answer

A

The degree of distortion between your training data and production data.

Question 38

Q

What is data drift?

Answer

A

The process by which production data changes over time, shifting the underlying statistical distribution of the inputs and target.
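Both skew and drift come down to comparing two distributions of the same feature. A minimal sketch for a categorical feature, assuming the L-infinity distance (the largest gap in category frequency) as the measure, is shown below. The helper name, the sample data and the alert threshold are all hypothetical.

```python
from collections import Counter


def l_infinity_distance(baseline, current):
    """Largest absolute difference in category frequency between two samples."""
    base_freq, cur_freq = Counter(baseline), Counter(current)
    categories = set(base_freq) | set(cur_freq)
    return max(
        abs(base_freq[c] / len(baseline) - cur_freq[c] / len(current))
        for c in categories
    )


# Skew compares training data with serving data; drift compares serving data
# from two different time windows. The arithmetic is the same.
training = ["web"] * 70 + ["mobile"] * 30
serving = ["web"] * 40 + ["mobile"] * 60
distance = l_infinity_distance(training, serving)

ALERT_THRESHOLD = 0.2  # hypothetical value; tune per feature
drift_detected = distance > ALERT_THRESHOLD
```

Here the share of "web" traffic falls from 70% to 40%, so the distance is about 0.3 and the alert fires.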

Question 39

Q

What is an online prediction?

Answer

A

A synchronous request made to a model endpoint for serving predictions with low latency and/or streaming data.

Question 40

Q

What are the guidelines for experimentation?

Answer

A

1.) Have fixed thresholds for optimizing metrics and satisficing metrics like latency and model size.
2.) Implement an evaluation routine that is model indifferent.
3.) Ensure you have a baseline model to compare against.
4.) Track every experiment and incremental improvement.

Question 41

Q

What are the guidelines for data quality?

Answer

A

1.) Address class imbalance early.
2.) Automate data preprocessing.
3.) Prevent data leakage with a test-train split that isolates test data from the tuning process.
4.) Generate a data schema that includes feature statistics.
5.) Ensure training data is properly shuffled in batches.
6.) Use a validation set for model/hyperparameter tuning.
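One way to keep test data isolated from tuning, sketched below, is to assign each example to a split by hashing a stable ID, so the same example always lands in the same split across runs. The function name and the 80/10/10 ratios are assumptions for illustration.

```python
import hashlib


def assign_split(example_id, train=0.8, validation=0.1):
    """Deterministically assign an example to train/validation/test by hashing its ID."""
    digest = hashlib.sha256(str(example_id).encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 1000 / 1000.0  # stable value in [0, 1)
    if bucket < train:
        return "train"
    if bucket < train + validation:
        return "validation"
    return "test"


splits = {}
for example_id in range(1000):
    splits.setdefault(assign_split(example_id), []).append(example_id)
```

Because the assignment depends only on the ID, re-running preprocessing or adding new rows never moves an existing example from the test set into training.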

Question 42

Q

What are the guidelines for model quality?

Answer

A

1.) For DNNs, monitor for NaN values in the loss and for the percentage of zero-valued weights, as these can indicate errors or vanishing/exploding gradients.
2.) Use validation and test data to check for overfitting/underfitting.
3.) Analyze misclassified instances to check for mislabeling, outliers or pre-processing that is needed.
4.) Analyze feature importance and remove those that have little importance.

Question 43

Q

What are the guidelines for data validation?

Answer

A

1.) Verify features match the expected schema.
2.) Verify data is in expected ranges and distributions.
3.) Validate the maximum fraction of missing values.
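These three checks can be sketched in a few lines of plain Python. The schema layout, the validate_rows helper and the 10% missing-value limit below are assumptions for illustration, not a TFDV API.

```python
def validate_rows(rows, schema, max_missing_fraction=0.1):
    """Return a list of human-readable anomalies found in rows.

    schema maps feature name -> (expected_type, min_value, max_value).
    """
    anomalies = []
    for name, (expected_type, lo, hi) in schema.items():
        values = [row.get(name) for row in rows]
        missing = sum(v is None for v in values)
        if rows and missing / len(rows) > max_missing_fraction:
            anomalies.append(f"{name}: {missing}/{len(rows)} values missing")
        for v in values:
            if v is None:
                continue
            if not isinstance(v, expected_type):
                anomalies.append(f"{name}: unexpected type {type(v).__name__}")
            elif not lo <= v <= hi:
                anomalies.append(f"{name}: value {v} outside [{lo}, {hi}]")
    return anomalies


schema = {"age": (int, 0, 120), "income": (float, 0.0, 1e7)}
rows = [
    {"age": 34, "income": 52000.0},
    {"age": 150, "income": 48000.0},  # age out of range
    {"age": 29, "income": None},      # missing income
]
anomalies = validate_rows(rows, schema)
```

With this sample, the out-of-range age and the missing income are both reported, while a clean row produces no anomalies.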

Question 44

Q

What are the guidelines for model validation?

Answer

A

1.) Validate the model on unseen test data.
2.) Ensure the test data is representative of production data and that, for time series, the test data is more recent than the training data.

Question 45

Q

What are guidelines for model deployment?

Answer

A

1.) Verify the model can be called.
2.) Validate satisficing requirements.
3.) Unit test the model for edge cases and typical cases.
4.) Test in a staging environment where you can roll back to a previous version if needed.
5.) Use A/B or multi-armed bandit testing before fully rolling out a new model.

Question 46

Q

What are guidelines for model serving?

Answer

A

1.) Regularly profile request data for tracking data drift or skew and set alerts for skew/drift thresholds.
2.) Identify concept drift by checking how feature importance changes over time.
3.) Determine outliers with respect to the training data.
4.) Perform continuous evaluation where true labels are available.
5.) Monitor service efficiency.
6.) Monitor predictive performance.

Question 47

Q

What are the 3 continuous parts of MLOps?

Answer

A

1.) Integration (CI): Testing and validating code, components, data, schemas and models.
2.) Delivery (CD): Deployment of an end-to-end pipeline that automatically pushes to a prediction service.
3.) Training (CT): Models are automatically retrained and served as they improve.

Question 48

Q

What is MLOps maturity level 0?

Answer

A

Completely manual process that is script driven and experimental. It has no CI, CD or CT.

Question 49

Q

What is MLOps maturity level 1?

Answer

A

A step up from level 0 with CT integration. It needs automated data and model validation.

Question 50

Q

What is MLOps maturity level 2?

Answer

A

It adds CI/CD on top of CT for rapid, automated pipeline experimentation and integration. It requires source control, test/build services, deployment services, a model registry, a feature store, an ML metadata store and ML pipeline orchestration, all of which are automated.

Question 51

Q

What is Vertex AI Pipelines?

Answer

A

Vertex AI Pipelines is a managed resource for MLOps. It supports both the Kubeflow Pipelines and TensorFlow Extended frameworks, and it manages the compute cluster for you in a containerized environment.

Question 52

Q

When should TFX be used?

Answer

A

When the pipeline you are creating runs TensorFlow code.

Question 53

Q

When should Kubeflow Pipelines be used?

Answer

A

When not using TensorFlow code, or when on-premises/multicloud solutions are needed.

Question 54

Q

How can BigQueryML serve models?

Answer

A

BQML can natively serve batch predictions but by integrating into Vertex AI it can be deployed to an endpoint through Model Registry and perform online prediction. This deployment does not work for ARIMA+ or XGBoost models.

Question 55

Q

What is Vertex AI Vizier?

Answer

A

A black-box optimization service that helps tune hyperparameters. It is used when the objective/loss function is unknown or too costly to evaluate. By default it uses Bayesian Optimization but can also use Grid Search, Random Search or an unspecified mode that chooses a best solution.

Question 56

Q

What is Neural Architecture Search?

Answer

A

The process by which AutoML searches for and finds the best model architecture for a given problem and tunes its parameters.

Question 57

Q

What is Dataprep?

Answer

A

An intelligent cloud data service to visually explore, clean, and prepare data for analysis and machine learning. It auto-detects 17 different data types and can transform structured or unstructured data in CSV, JSON or relational tables up to petabytes.

Question 58

Q

What are the 3 connections for Dataprep?

Answer

A

1.) Direct Upload/Download
2.) GCS
3.) BigQuery

Question 59

Q

What Tabular Data Problems can AutoML solve?

Answer

A

Classification/Regression, Forecasting

Question 60

Q

What Text Problems can AutoML solve?

Answer

A

Classification, entity extraction, sentiment analysis

Question 61

Q

What Video Problems can AutoML solve?

Answer

A

Action Recognition, classification, object tracking.

Question 62

Q

What Image Problems can AutoML solve?

Answer

A

Classification, Object Detection

Question 63

Q

What is the CLUSTER_SPEC/TF_CONFIG?

Answer

A

Environment variables that specify the cluster used for running a distributed training job in Vertex AI/TensorFlow. The cluster needs a primary replica/chief which manages the cluster, workers which perform the training, and optionally parameter servers (if using ParameterServerStrategy) or evaluators. CLUSTER_SPEC is set for the full cluster and TF_CONFIG is set on each replica of a training job for multiple replica jobs.
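TF_CONFIG is a JSON string with a cluster section listing the replicas and a task section identifying the current one. The sketch below sets a made-up value and reads it back with the standard library only. The host names, ports and the describe_task helper are hypothetical; on a real distributed training job the variable is already set for you.

```python
import json
import os

# Hypothetical cluster with one chief and two workers; addresses are made up.
example_tf_config = {
    "cluster": {
        "chief": ["host0:2222"],
        "worker": ["host1:2222", "host2:2222"],
    },
    "task": {"type": "worker", "index": 1},
}
os.environ["TF_CONFIG"] = json.dumps(example_tf_config)


def describe_task(environ=os.environ):
    """Read TF_CONFIG and report this replica's role, index and address."""
    config = json.loads(environ.get("TF_CONFIG", "{}"))
    task = config.get("task", {})
    cluster = config.get("cluster", {})
    task_type, index = task.get("type"), task.get("index", 0)
    address = cluster.get(task_type, [None])[index] if task_type else None
    return task_type, index, address


role, index, address = describe_task()
```

A training script can branch on the role, for example letting only the chief write checkpoints.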

Question 64

Q

What is data parallelism?

Answer

A

Data is split across devices and each shard is used to train a replica of the same model. The overall model is updated synchronously (all-reduce) or asynchronously (parameter server).
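A single synchronous data-parallel step can be sketched without any framework: every replica computes gradients on its own shard, the gradients are averaged (the result an all-reduce produces), and all replicas apply the same update. The function names and numbers below are illustrative assumptions.

```python
def all_reduce_mean(per_replica_gradients):
    """Average gradients element-wise across replicas."""
    num_replicas = len(per_replica_gradients)
    return [sum(column) / num_replicas for column in zip(*per_replica_gradients)]


def sgd_step(weights, gradients, learning_rate=0.1):
    """Apply one plain SGD update."""
    return [w - learning_rate * g for w, g in zip(weights, gradients)]


weights = [1.0, 2.0]
# Each inner list holds the gradients one replica computed on its data shard.
replica_gradients = [[0.2, 0.4], [0.4, 0.0]]
averaged = all_reduce_mean(replica_gradients)
new_weights = sgd_step(weights, averaged)
```

Because every replica applies the same averaged gradient, all copies of the model stay identical after the step. In the asynchronous parameter-server variant, each worker would instead push its own gradient as soon as it is ready.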

Answer 65

A

For large models, weights are split across multiple devices and each device trains part of a model.

Answer 66

A

A simple synchronous training strategy where multiple GPUs can be used on one machine. It creates one replica per GPU and trains. The gradients are all-reduced together at each update step.

Answer 67

A

A synchronous strategy that scales MirroredStrategy horizontally by replicating jobs across multiple workers/machines. It requires the TF_CONFIG variable to work.

Answer 68

A

A distributed training strategy that uses TPUs to implement the MirroredStrategy.

Answer 69

A

An asynchronous model training strategy that uses multiple machines. The parameter server is a central coordinator that saves checkpoints, distributes data and updates weights from workers as they arrive. It requires the TF_CONFIG variable and TFConfigClusterResolver to define cluster organization.

Answer 70

A

A Tensor Processing Unit that can perform all-reduce based synchronous training. It can cause “data bottlenecks” if data size is not properly considered, because TPUs are extremely fast. Therefore this requires a balance of file number vs file size to avoid network overhead. Can only read from GCS.

Answer 71

A

A synchronous training strategy in which variables are not mirrored and all operations are replicated across local GPUs.

Answer 72

A

Model is dominated by matrix computations.
Model has no custom TF/PyTorch/JAX operations in the main training loop.
Model trains for weeks or more.
Model is large and has large effective batch sizes.

Answer 73

A

Linear algebra programs that branch frequently or contain many element-wise operations.
Workloads access memory in a sparse manner.
Workload requires high-precision arithmetic.
Neural network has custom operations in the main training loop.

Answer 74

A

Precision measures fraction of relevant positives over retrieved positives (TP/(TP + FP)) while recall measures the fraction of relevant positives over the number of expected positives (TP/(TP + FN)). F1 seeks to balance with the harmonic mean of both.
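The formulas above can be checked on a small example. The helper below is a plain-Python sketch for binary labels; the function name and the sample labels are made up for illustration.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

With TP = 2, FP = 1 and FN = 2, precision is 2/3, recall is 1/2 and F1 is 4/7, which sits closer to the lower of the two values, as a harmonic mean does.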

Answer 75

A

A Vertex AI offering that allows you to quickly test and customize language, vision and speech models.

Answer 76

A

1.) AI for healthcare: Generates healthcare analytics
2.) Discovery AI for retail

Answer 77

A

1.) Contact Center AI / Dialogflow
2.) Document AI

Answer 78

A

It is a managed service that streamlines ML feature management. It acts as a serving layer on top of BQ data, providing the latest features at low latency. It registers multiple BQ tables or views and serves the freshest data based on timestamp.

Answer 79

A

TFX is an extension of TensorFlow for building ML pipelines for a production environment. It is supported by Vertex AI Pipelines to allow cloud-native pipeline operation.

Answer 80

A

1.) ExampleGen - Ingests and optionally splits data
2.) StatisticsGen - Calculates statistics on a dataset.
3.) SchemaGen - Examines statistics and creates a schema.
4.) ExampleValidator - Uses schema and statistics to find anomalies + missing values.
5.) Transform - Performs feature engineering
6.) Trainer - Trains the model
7.) Tuner - Tunes Hyperparameters
8.) Evaluator - Performs deep analysis of the training results
9.) InfraValidator - Checks if the model is servable
10.) Pusher - Deploys the model to serving infrastructure
11.) BulkInferrer - Performs batch predictions with a trained model.

Answer 81

A

Airflow, Kubeflow or Vertex AI Pipelines.

Answer 82

A

Have a setup.py in your root directory containing the requirements of the program. Have a trainer/ folder containing task.py, which is the entry point that invokes the model.py file. Have an __init__.py file in every subdirectory to make it a package.

Answer 83

A

Use the TRANSFORM clause. This can be used for general imputation, numerical normalization, scaling and bucketing, categorical encoding/crossing, text tokenization/vectorization and image manipulation.

Answer 84

A

BQ automatically performs imputation, numeric standardization (most models), one-hot encoding, multi-hot encoding (arrays), Timestamp transformation and struct expansion.

Answer 85

A

Using ML.PREDICT, ML.FORECAST, ML.RECOMMEND, ML.DETECT_ANOMALIES

Answer 86

A

It can identify entities and their types in text, analyze sentiment, annotate text (all features), classify text and moderate text, all using a pre-trained set of models.

Answer 87

A

It can perform synchronous, asynchronous or real time transcription of specified language speech.

Answer 88

A

Takes written text and converts it to speech with a pre-set list of voices.

Answer 89

A

Allows dynamically translated text. Cloud Translation uses a Google pre-trained or a custom machine learning model to translate text with 100+ language pairs.

Answer 90

A

Cloud Vision allows developers to easily integrate vision detection features within applications, including image labeling, face and landmark detection, optical character recognition (OCR), and tagging of explicit content.

Answer 91

A

Stored video analysis, streaming video analysis, object detection and tracking, logo recognition, face detection, person detection, video annotation.

Answer 92

A

Digitizing documents for e-readers, optical character recognition, image recognition, entity extraction, NLP, document classification, key-value pair recognition, translation, normalization.

Answer 93

A

General, specialized, custom.

Answer 94

A

When feature selection is needed along with regularization to prevent overfitting.

Answer 95

A

When features are collinear/co-dependent and so removing with L1 could be detrimental.

Answer 96

A

To prevent overfitting.

Answer 97

A

An efficient API for TF import of various data types.

Answer 98

A

1.) TextLineDataset - import lines from text files.
2.) TFRecordDataset - import TF record data
3.) FixedLengthRecordDataset - import records from binary files.

Answer 99

A

End to end service for building customer recommendation systems.

Answer 100

A

A system for storing, synthesizing, de-identifying and analyzing healthcare data. Supports DICOM (Digital Imaging and Communications in Medicine), HL7v2 (event messaging service) and FHIR (Fast Healthcare Interoperability Resources).

Answer 101

A

A system built on Dialogflow for building managed AI chat bots.

Answer 102

A

Tabular Workflow for End-to-End AutoML is a complete AutoML pipeline for classification and regression tasks. It is similar to the AutoML API, but allows you to choose what to control and what to automate. Instead of having controls for the whole pipeline, you have controls for every step in the pipeline. These pipeline controls include:

Data splitting
Feature engineering
Architecture search
Model training
Model ensembling
Model distillation

Answer 103

A

The cold start problem occurs when the recommender system lacks sufficient information to make reliable predictions or suggestions for a user or an item.

Answer 104

A

1.) Time series data
2.) Data groupings
3.) Burst data

Answer 105

A

A mismatch in the data that was trained vs used for prediction. Often due to processing issues, poor assumptions or sampling issues.

Answer 106

A

Change in the statistical properties of data over time.

Answer 107

A

Manage your datasets in a central location.
Easily create labels and multiple annotation sets.
Create tasks for human labeling using integrated data labeling.
Track lineage to models for governance and iterative development.
Compare model performance by training AutoML and custom models using the same datasets.
Generate data statistics and visualizations.
Automatically split data into training, test, and validation sets.

Answer 108

A

Store and maintain your offline feature data in BigQuery, taking advantage of the data management capabilities of BigQuery.

Share and reuse features by adding them to the feature registry.

Serve features for online predictions at low latencies using Bigtable online serving or at ultra-low latencies using Optimized online serving.

Store embeddings in your feature data and perform vector similarity searches.

Track feature metadata in Dataplex.

Answer 109

A

Like BigQuery ML ARIMA_PLUS, Prophet attempts to decompose each time series into trends, seasons, and holidays, producing a forecast using the aggregation of these models’ predictions. An important difference, however, is that BQML ARIMA+ uses ARIMA to model the trend component, while Prophet attempts to fit a curve using a piecewise logistic or linear model.

Answer 110

A

At most 100 times more videos for the most common label than for the least common one.
100 or more training video frames per label are recommended.
For video frame resolution much larger than 1024 pixels by 1024 pixels, some image quality may be lost during the frame normalization process used by Vertex AI.

Answer 111

A

Colab: Project priorities are collaboration, experimentation, and avoiding spending time setting up infrastructure.

Workbench: Priorities are control and customization and pipeline development.

Answer 112

A

TabNet uses sequential attention to choose which features to reason from at each decision step. This promotes interpretability and more efficient learning because the learning capacity is used for the most salient features. It trains classification and regression models.

Answer 113

A

Wide & Deep jointly trains wide linear models and deep neural networks. It combines the benefits of memorization and generalization.

Answer 114

A

A kernel hosts a Jupyter notebook session.

Answer 115

A

By setting up a cluster and running a notebook on it. Requires Dataproc Worker (roles/dataproc.worker) on your project and Dataproc Editor (roles/dataproc.editor) on the cluster for the dataproc.clusters.use permission.

Answer 116

A

Using a custom container.

Answer 117

A

With a single hypothesis to 95% significance using a load balancer.

Answer 118

A

Running a production version and a mirror of it that has all requests replayed against it. Helps to check performance without affecting customers. It does however cost more and needs to be handled carefully to avoid bugs like overcharging, etc.

Answer 119

A

Gradual rollout of a feature as performance is evaluated in situ. It is however slow and needs substantial observability/monitoring.

Answer 120

A

By semantically embedding data you can map data to semantically similar groups, descriptions, queries or images.
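Semantic similarity between embeddings is commonly measured with cosine similarity. The sketch below ranks two items against a query; the three-dimensional vectors are made-up stand-ins for real model embeddings, and the names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings; a real system would get these from an embedding model.
query = [1.0, 0.0, 1.0]
items = {
    "cat": [0.9, 0.1, 0.8],
    "car": [0.0, 1.0, 0.1],
}
scores = {name: cosine_similarity(query, vector) for name, vector in items.items()}
best_match = max(scores, key=scores.get)
```

The item whose vector points in nearly the same direction as the query scores close to 1 and is returned as the best match.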

Answer 121

A

Vertex ML Metadata lets you:

Analyze runs of a production ML system to understand changes in the quality of predictions.

Analyze ML experiments to compare the effectiveness of different sets of hyperparameters.

Track the lineage of ML artifacts, for example datasets and models, to understand just what contributed to the creation of an artifact or how that artifact was used to create descendant artifacts.

Rerun an ML workflow with the same artifacts and parameters.

Track the downstream usage of ML artifacts for governance purposes.

Answer 122

A

You request a batchPredictionsJob directly from the model resource without needing to deploy the model to an endpoint.

Answer 123

A

Before sending a request, you must first deploy the model resource to an endpoint. This associates compute resources with the model so that it can serve online predictions with low latency.

Answer 124

A

Avoid sigmoid for internal layers and use ReLU or advanced ReLU nonlinearities.
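The reason can be seen with a little arithmetic: the sigmoid derivative never exceeds 0.25, so the product of derivatives across many layers shrinks quickly, while the ReLU derivative is exactly 1 for positive inputs. The sketch below assumes a 10-layer chain evaluated at the most favorable sigmoid input and ignores the weight terms; it is an illustration, not a training loop.

```python
import math


def sigmoid_grad(x):
    """Derivative of the sigmoid; its maximum value is 0.25 at x = 0."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


depth = 10
# Best case for sigmoid: every layer sits at x = 0, where the derivative peaks.
sigmoid_chain = math.prod(sigmoid_grad(0.0) for _ in range(depth))
relu_chain = math.prod(relu_grad(1.0) for _ in range(depth))
```

Even in its best case the sigmoid chain is below one millionth after ten layers, whereas the ReLU chain passes the gradient through unchanged.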

Answer 125

A

Use gradient clipping. (clipnorm/clipvalue)
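Clipping by norm rescales the whole gradient vector when its length exceeds a limit, while clipping by value caps each component separately. The plain-Python sketch below shows both; the function names are hypothetical and only mirror the idea behind the clipnorm and clipvalue settings.

```python
import math


def clip_by_norm(gradient, clip_norm):
    """Rescale the gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in gradient))
    if norm == 0.0 or norm <= clip_norm:
        return list(gradient)
    return [g * clip_norm / norm for g in gradient]


def clip_by_value(gradient, clip_value):
    """Cap every component to the range [-clip_value, clip_value]."""
    return [max(-clip_value, min(clip_value, g)) for g in gradient]


exploding = [3.0, 4.0]                        # L2 norm is 5
norm_clipped = clip_by_norm(exploding, 1.0)   # direction kept, norm becomes 1
value_clipped = clip_by_value(exploding, 1.0)
```

Clipping by norm preserves the gradient's direction, whereas clipping by value can change it, since both components here end up equal.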

Answer 126

A

When your team manages only a few models, is still experimenting, or modifies the models infrequently.

Answer 127

A

1.) Design features with disclosures built in.
2.) Consider giving a few answers and let the user decide.
3.) Model potential adverse feedback and have an iterative rollout plan.
4.) Engage with a diverse set of users and use feedback to guide further development.

Answer 128

A

1.) Use metrics from user feedback as well as product performance (click-through, customer use) sliced across different groups.
2.) Ensure metrics are appropriate for the context.

Answer 129

A

1.) Check data for mistakes, accuracy, bias and representation.
2.) Check for training-serving skew.
3.) Remove redundant features.

Answer 130

A

1.) Correlation does not equal causation.
2.) Models are a reflection of the training data.
3.) Communicate limitations where possible.

Answer 131

A

1.) Rigorously test each component in isolation.
2.) Conduct integration testing.
3.) Detect input drift.
4.) Use a gold standard test set to ensure models work consistently.
5.) Build quality checks such that unintended failures trigger an immediate response.

Answer 132

A

Training-Serving Skew: Attribution differs in production than in training.

Drift: Attribution changes over time.

This is done on an online prediction endpoint in Vertex AI for Tabular data.

Answer 133

A

Periodic evaluation of your model on new, incoming data to determine if model performance is degrading.

Answer 134

A

1.) Set up the GCP environment. Determine if using TFX or Kubeflow.
2.) Design the pipeline around the model you plan to execute.
3.) If using KFP compile to a .yaml file.
4.) Run the pipeline (KFP: job.submit(); TFX: tfx.orchestration.{runner}.run()).

Answer 135

A

Vertex AI Pipelines is a Google Cloud managed service that allows you to orchestrate and automate ML pipelines where each component of the pipeline can run containerised on Google Cloud or other cloud platforms.

Answer 136

A

A user interface for managing and tracking experiments, jobs, and runs.
An engine for scheduling multistep ML workflows.
A Python SDK for defining and manipulating pipelines and components.
Integration with Vertex ML Metadata to save information about executions, models, datasets, and other artifacts.

Answer 137

A

Cloud Scheduler is publishing messages on a schedule and therefore triggering the pipeline.
Cloud Composer is publishing messages as part of a larger workflow, for example a data ingestion workflow that triggers the training pipeline after new data are ingested in BigQuery.
Cloud Logging publishes a message based on logs that meet some filtering criteria. You can set up the filters to detect the arrival of new data or even skew and drift alerts generated by the Vertex AI Model Monitoring service.

Note: This can also be done with the scheduler api.

Answer 138

A

Cloud Build can be automatically or manually triggered to clone your repo, run unit and integration tests, build ML images (containers), compile the pipeline, upload to Artifact Registry and run.

Answer 139

A

Analyze runs of a production ML system to understand changes in the quality of predictions.
Analyze ML experiments to compare the effectiveness of different sets of hyperparameters.
Track the lineage of ML artifacts, for example datasets and models, to understand just what contributed to the creation of an artifact or how that artifact was used to create descendant artifacts.
Rerun an ML workflow with the same artifacts and parameters.
Track the downstream usage of ML artifacts for governance purposes.

Answer 140

A

Monitoring experiments.
