ML Training 2 Flashcards by owly dabs

What is Precision?

Of all the test examples that were assigned a label, the fraction that were actually supposed to have that label.

TP/(TP+FP)

What is Recall?

Of all the test examples that should have been assigned the label, the fraction that actually were assigned it.

TP/(TP+FN)
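
The precision and recall formulas can be checked with a few lines of Python. This is a minimal sketch: the counts are made up for illustration, and returning 0.0 when a denominator is zero is an arbitrary convention, not something the cards specify.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that are truly positive: TP / (TP + FP)."""
    # Assumption: return 0.0 when nothing was predicted positive.
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that were found: TP / (TP + FN)."""
    # Assumption: return 0.0 when there are no actual positives.
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts for one label on a test set.
tp, fp, fn = 8, 2, 4
print(precision(tp, fp))  # 0.8
print(recall(tp, fn))     # 8/12, roughly 0.667
```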

What are other ways to evaluate an AutoML model?

Precision, recall, the confusion matrix (correct predictions lie on the diagonal), and the precision-recall curve, which helps you choose a score threshold (thresholds can be set for each label individually).
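
To see how the score threshold trades precision against recall, here is a small sketch in plain Python. The labels, scores, and candidate thresholds are hypothetical, and the zero-denominator handling is an assumed convention; AutoML computes these curves for you.

```python
def confusion_counts(labels, scores, threshold):
    """Return (tp, fp, fn, tn), predicting positive when score >= threshold."""
    tp = fp = fn = tn = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def pr_points(labels, scores, thresholds):
    """Precision and recall at each candidate threshold."""
    points = []
    for t in thresholds:
        tp, fp, fn, _ = confusion_counts(labels, scores, t)
        p = tp / (tp + fp) if (tp + fp) else 0.0  # assumed convention
        r = tp / (tp + fn) if (tp + fn) else 0.0  # assumed convention
        points.append((t, p, r))
    return points


# Hypothetical labels and model scores for a single label.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
for t, p, r in pr_points(labels, scores, [0.2, 0.5, 0.75]):
    print(f"threshold={t:.2f} precision={p:.2f} recall={r:.2f}")
```

Raising the threshold in this toy data moves precision from 0.60 to 1.00 while recall drops from 1.00 to 0.67, which is the trade-off the curve visualizes.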

What is the difference between Colab Enterprise and Vertex AI Workbench?

Colab Enterprise: A collaborative, managed notebook environment with the security and compliance capabilities of Google Cloud. Choose this if your project’s priorities are to collaborate with others and to avoid spending time managing infrastructure.

Vertex AI Workbench: A Jupyter notebook-based environment provided through virtual machine (VM) instances with features that support the entire data science workflow. Choose this if your project’s priorities are control and customizability.

What platforms or features does Vertex AI Workbench support?

Importing conda environments, accessing data in Cloud Storage or BigQuery, automated notebook runs, idle shutdown, custom containers, third-party credentials, instance monitoring, and full control over the infrastructure (the VM instance).

How do you overcome imbalanced datasets?

Downsample the majority class, then upweight the downsampled examples by the downsampling factor to reduce prediction bias. Treat the rebalancing ratio as a hyperparameter and experiment with it. The batch size should be several times greater than the imbalance ratio (>=5).
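
A minimal sketch of downsampling and upweighting in plain Python. The function name, the `(features, label)` data layout, and the factor of 20 are illustrative assumptions; in practice the weights would be passed to the training loss as example weights.

```python
import random


def downsample_and_upweight(examples, majority_label, factor, seed=0):
    """Keep 1/factor of the majority class and weight each kept example by factor.

    `examples` is a list of (features, label) pairs.
    Returns a shuffled list of (features, label, weight) triples.
    """
    rng = random.Random(seed)
    majority = [e for e in examples if e[1] == majority_label]
    minority = [e for e in examples if e[1] != majority_label]
    kept = rng.sample(majority, len(majority) // factor)
    rebalanced = [(x, y, 1.0) for x, y in minority]
    rebalanced += [(x, y, float(factor)) for x, y in kept]
    rng.shuffle(rebalanced)
    return rebalanced


# Hypothetical dataset: 1000 negatives, 10 positives (imbalance ratio 100:1).
examples = [((i,), 0) for i in range(1000)] + [((i,), 1) for i in range(10)]
rebalanced = downsample_and_upweight(examples, majority_label=0, factor=20)

n_major = sum(1 for _, y, _ in rebalanced if y == 0)
weight_major = sum(w for _, y, w in rebalanced if y == 0)
print(n_major, weight_major)  # 50 kept negatives carrying total weight 1000.0
```

The model now sees a 5:1 ratio per batch, while the weights keep the total contribution of the majority class equal to its original size.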

What is prediction bias?

A value indicating how far apart the average of predictions is from the average of labels in the dataset.
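
As a quick worked example, prediction bias can be computed as the mean prediction minus the mean label. The sign convention and the sample values below are assumptions for illustration.

```python
def prediction_bias(predictions, labels):
    """Mean prediction minus mean label; 0 means the two averages match."""
    return sum(predictions) / len(predictions) - sum(labels) / len(labels)


# Hypothetical predicted probabilities and true labels.
predictions = [0.1, 0.2, 0.1, 0.4]
labels = [0, 0, 0, 1]
print(round(prediction_bias(predictions, labels), 3))  # -0.05
```

Here the model predicts 20% positives on average while 25% of labels are positive, so it slightly under-predicts.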

What is selection bias?

Errors in conclusions drawn from sampled data due to a selection process that generates systematic differences between samples observed in the data and those not observed.

Includes coverage bias, sampling bias, and non-response (participation) bias.

What is coverage bias?

The population represented in the dataset doesn’t match the population that the machine learning model is making predictions about.

What is sampling bias?

Data is not collected randomly from the target group.

What is non-response/participation bias?

Users from certain groups opt out of surveys at different rates than users from other groups.

What is a collaborative filtering model?

Collaborative filtering is a recommendation technique that filters and predicts items a user might like based on the reactions and preferences of similar users.

The fundamental premise is that people who agreed in their evaluation of certain items are likely to agree again in the future.
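
The premise can be illustrated with a toy user-based example: predict a rating as the similarity-weighted average of other users' ratings. This is only a sketch with made-up users and ratings, using plain cosine similarity over co-rated items; production systems use more robust methods.

```python
import math

# Hypothetical explicit ratings (1 to 5) per user.
ratings = {
    "ana":  {"A": 5, "B": 4, "C": 1},
    "ben":  {"A": 5, "B": 5, "C": 1, "D": 5},
    "cara": {"A": 1, "B": 1, "C": 5, "D": 1},
}


def cosine(u, v):
    """Cosine similarity over the items both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)


def predict(user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        sim = cosine(ratings[user], their)
        num += sim * their[item]
        den += sim
    return num / den if den else None


print(round(predict("ana", "D"), 2))
```

Because ana's past ratings agree with ben's far more than with cara's, the prediction for item D is pulled toward ben's rating of 5.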

What are the three main approaches to building recommendation systems on Google Cloud?

The three approaches are matrix factorization in BigQuery ML (BQML), Recommendations AI, and the Two-Tower built-in algorithm.

What is required to train a matrix factorization model on BigQuery?

A table with three input columns: user(s), item(s), and a feedback variable (implicit or explicit, such as ratings).
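
To make the three-column input concrete, here is a conceptual sketch of matrix factorization over (user, item, rating) triples using plain stochastic gradient descent. It is not BigQuery ML's algorithm or API; the function names, hyperparameters, and data are all assumptions chosen for illustration.

```python
import random


def factorize(triples, n_factors=2, lr=0.05, reg=0.01, epochs=200, seed=0):
    """Learn user and item factor vectors whose dot product approximates the rating."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in triples})
    items = sorted({i for _, i, _ in triples})
    P = {u: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for u in users}
    Q = {i: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for i in items}
    for _ in range(epochs):
        for u, i, r in triples:
            err = r - sum(a * b for a, b in zip(P[u], Q[i]))
            for k in range(n_factors):
                pu, qi = P[u][k], Q[i][k]
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return P, Q


def mse(triples, P, Q):
    """Mean squared error of the reconstructed ratings."""
    errors = [(r - sum(a * b for a, b in zip(P[u], Q[i]))) ** 2 for u, i, r in triples]
    return sum(errors) / len(errors)


# Hypothetical feedback table: user, item, explicit rating.
triples = [
    ("u1", "i1", 5), ("u1", "i2", 3),
    ("u2", "i1", 4), ("u2", "i3", 1),
    ("u3", "i2", 2), ("u3", "i3", 5),
]
P, Q = factorize(triples)
print(mse(triples, P, Q))
```

A missing cell such as (u1, i3) can then be scored with the dot product of `P["u1"]` and `Q["i3"]`, which is how the factorization produces recommendations for unseen user-item pairs.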

What are the main benefits of Matrix Factorization on BigQuery?

The benefits include the minimal ML expertise required (it uses SQL), simple data input requirements, and the ability to discover new user interests through collaborative filtering.

What are the limitations of Matrix Factorization?

The limitations include an inability to handle large feature sets (it cannot handle more than two dimensions: users and items), difficulty incorporating new items into the matrix (it cannot be continuously updated), and the need for sufficient feedback data because the input matrix is sparse.

How does one test recommendation systems?

Set up A/B experiments.

What is Recommendations AI?

It’s a fully managed service that deploys scalable recommendation systems using state-of-the-art deep learning techniques, including two-tower encoders.

It uses deep learning to address the limitations of matrix factorization.

How often are Recommendations AI models updated?

Models are automatically retrained daily and tuned quarterly to capture changes in customer behavior, product assortment, pricing, and promotions.

How does Recommendations AI achieve low serving latency?

It utilizes a scalable approximate nearest neighbors (ANN) service for efficient item retrieval during inference, resulting in low latency.

What feature ensures data consistency in Recommendations AI?

It employs a scalable feature store that maintains consistency between online and offline tasks, preventing data leakage and training-serving skew issues.

What makes the deployment process reliable in Recommendations AI?

It uses a robust CI/CD routine that validates models before deployment and ensures zero-downtime transitions to production.

What is the main purpose of Two-Tower encoders?

They surface the most relevant items for users by encoding both candidate and query data into the same embedding space.
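
The idea of a shared embedding space can be sketched as follows: each tower maps its own feature set to a vector of the same size, and relevance is the dot product. The single linear layers and hand-picked weights here are stand-ins (assumptions); real towers are neural networks trained jointly on matched query/candidate pairs.

```python
def tower(features, weights):
    """Stand-in for a trained tower: one linear layer from features to an embedding."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def dot(a, b):
    """Dot-product relevance score between two embeddings."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical hand-picked weights, not learned values.
QUERY_WEIGHTS = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]  # 3 query features -> 2-d embedding
ITEM_WEIGHTS = [[1.0, 0.0], [0.0, 1.0]]             # 2 item features  -> 2-d embedding

query_embedding = tower([0.0, 1.0, 1.0], QUERY_WEIGHTS)
candidates = {"item_a": [1.0, 0.0], "item_b": [0.0, 1.0], "item_c": [0.5, 0.5]}

scores = {name: dot(query_embedding, tower(feats, ITEM_WEIGHTS))
          for name, feats in candidates.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)  # ['item_b', 'item_c', 'item_a']
```

Note that the two towers take different numbers of input features yet produce embeddings of the same dimension, which is what makes the dot product meaningful.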

What are the key benefits of the Two-Tower approach?

Benefits include greater control over model training, ability to handle various feature types (text, images), and better handling of cold-start cases.

What are the requirements for implementing Two-Tower encoders?

You need training data combining query/user data with candidate/item data, including matched pairs, and an input schema describing the combined training data.

What other services can be used together with Two-Tower encoders?

Vertex AI Matching Engine and ScaNN provide a high-scale, low-latency approximate nearest neighbor (ANN) service, making it easier to identify similar embeddings.

A hyperparameter tuning service such as Vizier can help you identify the optimal hyperparameters.

Hardware accelerators such as GPUs or TPUs can also be used.

Who should consider Two-Tower encoders?

It's best for users seeking greater control and flexibility, who have strong technical ML expertise and can work in a managed notebook environment. Note that you cannot do real-time personalization with this.

Who should use Matrix Factorization?

It's best for users with simplified datasets looking to quickly develop a baseline recommendation system.

Who is Recommendations AI best suited for?

It's ideal for teams lacking technical experience with production recommendation systems or those wanting to allocate resources to other priorities.

With new data, do you need a new continuous training (CT) pipeline?

No, the previously deployed CT pipeline is executed. No new pipelines or components are deployed; only a new prediction service or newly trained model is served at the end of the pipeline.

What's the function of a CI/CD pipeline in ML training?

To deploy new implementations of ML pipelines and components quickly. Given a new implementation, a successful CI/CD pipeline deploys a new ML CT pipeline.

When should you maximise AUC Precision-Recall?

When working with highly imbalanced datasets (e.g., fraud detection, rare disease diagnosis); when the positive class is your primary interest; when false positives are costly or important to minimize; and when you have many negative examples and few positive ones.

When should you maximise AUC ROC?

When working with balanced or nearly balanced datasets; when both classes are equally important; when false positives and false negatives have similar costs; and when you need a general measure of model discrimination.
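
The contrast between the two metrics shows up clearly on imbalanced data. The sketch below computes ROC AUC (as the probability that a random positive outranks a random negative) and average precision (one common summary of the precision-recall curve) in plain Python. The dataset is synthetic: 2 positives among 100 examples, placed at ranks 5 and 10 by score.

```python
def roc_auc(labels, scores):
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision measured at the rank of each positive example."""
    ranked = sorted(zip(scores, labels), reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Synthetic data: example k has the k-th highest score; positives sit at ranks 5 and 10.
labels = [1 if k in (5, 10) else 0 for k in range(1, 101)]
scores = [(101 - k) / 100 for k in range(1, 101)]

print(round(roc_auc(labels, scores), 3))            # 0.939
print(round(average_precision(labels, scores), 3))  # 0.2
```

ROC AUC looks strong because the many easy negatives dominate it, while average precision exposes that most of the top-ranked predictions are false positives.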

(34 cards)