1. Framing ML Problems Flashcards

Question 1

Q

What are the key factors for translating business use cases into ML problems?

Answer

A

First, identify the impact, success criteria, and available data for the use case. Then match these with a machine learning approach (an algorithm and a metric).

Question 2

Q

What is the equation for recall?

Answer

A

Recall = True Positives (TP) / (True Positives (TP) + False Negatives (FN))
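
A minimal sketch of the recall formula in Python, assuming binary labels where 1 marks the positive class. The function name and the sample labels are illustrative, not part of the original card.

```python
def recall(y_true, y_pred):
    """Recall = TP / (TP + FN) for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Assumed convention: return 0.0 when there are no actual positives.
    return tp / (tp + fn) if (tp + fn) else 0.0


y_true = [1, 1, 1, 1, 0, 0]  # hypothetical ground-truth labels
y_pred = [1, 1, 1, 0, 1, 0]  # hypothetical model predictions
print(recall(y_true, y_pred))  # 3 of the 4 actual positives are found: 0.75
```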

Question 3

Q

What is the equation for precision?

Answer

A

Precision = True Positives (TP) / (True Positives (TP) + False Positives (FP))
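
A minimal sketch of the precision formula in Python, assuming binary labels where 1 marks the positive class. The function name and the sample labels are illustrative.

```python
def precision(y_true, y_pred):
    """Precision = TP / (TP + FP) for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Assumed convention: return 0.0 when nothing is predicted positive.
    return tp / (tp + fp) if (tp + fp) else 0.0


y_true = [1, 1, 0, 0, 0, 1]  # hypothetical ground-truth labels
y_pred = [1, 1, 1, 1, 0, 0]  # hypothetical model predictions
print(precision(y_true, y_pred))  # 2 of the 4 predicted positives are correct: 0.5
```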

Question 4

Q

What are the two types of machine learning?

Answer

A

Supervised and unsupervised.
A hybrid of the two is called semi-supervised.

Question 5

Q

What are the common ML problem types?

Answer

A

Tabular:
1. Supervised: Regression, Classification
2. Unsupervised: Clustering (e.g., K-means), dimensionality reduction (e.g., PCA)
Time series:
1. Supervised: Forecasting
Image:
1. Supervised: Image classification, Image segmentation, Object detection
Video:
1. Supervised: Video classification, Video object tracking, Video action recognition
Text:
1. Supervised: Sentiment analysis, Entity extraction, Translation
2. Unsupervised: Topic modelling
Mixed:
1. Supervised/Unsupervised: Collaborative filtering / recommendations

Question 6

Q

What is semi-supervised learning?

Answer

A

Learning from a dataset in which some examples are labeled and others are not.

Question 7

Q

What are precision, recall, and F1 used for?

Answer

A

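Precision: Use when the priority is reducing false positives.
Recall: Use when the priority is reducing false negatives.
F1: Use when false positives and false negatives both need to be kept low (F1 is the harmonic mean of precision and recall).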
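
The trade-off can be sketched by scoring the same predictions at two thresholds. This is an illustrative example only: the helper name, scores, and thresholds are assumptions, and F1 is computed as the harmonic mean of precision and recall.

```python
def precision_recall_f1(y_true, scores, threshold):
    """Return (precision, recall, F1) after thresholding scores into 0/1 predictions."""
    tp = fp = fn = 0
    for t, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and t == 1:
            tp += 1
        elif pred == 1 and t == 0:
            fp += 1
        elif pred == 0 and t == 1:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


y_true = [0, 0, 1, 0, 1, 1]              # hypothetical labels
scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]  # hypothetical model scores

# Low threshold: no false negatives, but one false positive.
print(precision_recall_f1(y_true, scores, 0.35))
# High threshold: no false positives, but one false negative.
print(precision_recall_f1(y_true, scores, 0.65))
```

Lowering the threshold favors recall; raising it favors precision. F1 summarizes both in a single number.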

Question 8

Q

What is AUC ROC?

Answer

A

The Area Under the Receiver Operating Characteristic Curve (AUC-ROC) summarizes a classification model's performance across all classification thresholds. It measures how well the model distinguishes between the positive and negative classes, and it is most informative on balanced datasets.
1: Perfect separation of positive and negative classes
0.5: Random guess
It is threshold-invariant (it does not depend on any single threshold) and scale-invariant (it depends only on how predictions are ranked, not on their absolute values), which also makes it robust to outlying scores.
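
One way to see the scale invariance is the ranking interpretation of AUC-ROC: it equals the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. The sketch below uses that pairwise definition (simple, but quadratic in the number of examples); the function name and data are illustrative assumptions.

```python
def auc_roc(y_true, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 0, 1, 1]              # hypothetical labels
scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]  # hypothetical model scores

print(auc_roc(y_true, scores))                     # 8 of 9 pairs ranked correctly
print(auc_roc(y_true, [s / 10 for s in scores]))   # same value: rescaling keeps the ranking
```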

Question 9

Q

What is AUC PR?

Answer

A

The Area Under the Precision-Recall Curve (AUC-PR) is a performance metric for binary classification that is better suited to imbalanced datasets.
1: Perfect separation of positive and negative classes
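
A rough sketch using average precision, one common step-wise estimate of the area under the precision-recall curve (other estimators, such as trapezoidal interpolation, give slightly different values). The function name and data are illustrative, and the sketch assumes all scores are distinct.

```python
def average_precision(y_true, scores):
    """Average precision: mean of the precision values at each true positive,
    walking down the examples from highest to lowest score.

    Assumes distinct scores; ties are not handled specially.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("average precision needs at least one positive example")
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    tp = 0
    ap = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            tp += 1
            ap += tp / rank  # precision among the top `rank` examples
    return ap / total_pos


# Hypothetical imbalanced data: 2 positives out of 10 examples.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
scores = [0.05, 0.1, 0.2, 0.25, 0.3, 0.35, 0.4, 0.6, 0.8, 0.9]

print(average_precision(y_true, scores))  # (1/1 + 2/3) / 2
```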

Question 10

Q

What are the metrics for regression?

Answer

A

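MAE: Average absolute difference between the actual and predicted values.
RMSE: Square root of the average squared difference; penalizes large errors more heavily than MAE.
RMSLE: RMSE computed on log-transformed values; penalizes under-predictions more heavily than over-predictions.
MAPE: Average absolute percentage difference between the actual and predicted values.
R^2: Square of the correlation coefficient between the labels and predicted values. A higher value indicates a better fit.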
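
A minimal sketch of these metrics using only the standard library. The function names and sample values are illustrative assumptions; RMSLE is written with log(1 + x), a common convention that requires non-negative values; MAPE assumes no actual value is zero; and R^2 follows the definition on this card (squared Pearson correlation).

```python
import math


def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def rmsle(actual, predicted):
    # Assumes non-negative values; log1p(x) = log(1 + x).
    return math.sqrt(
        sum((math.log1p(p) - math.log1p(a)) ** 2 for a, p in zip(actual, predicted))
        / len(actual)
    )


def mape(actual, predicted):
    # Assumes no actual value is zero; result is a percentage.
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    # Squared Pearson correlation between labels and predictions.
    mean_a = sum(actual) / len(actual)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_a * var_p)


actual = [100, 200, 300, 400]     # hypothetical labels
predicted = [110, 190, 300, 500]  # hypothetical predictions

print("MAE  ", mae(actual, predicted))    # the single large miss contributes linearly
print("RMSE ", rmse(actual, predicted))   # noticeably larger than MAE because of that miss
print("MAPE ", mape(actual, predicted))
print("R^2  ", r_squared(actual, predicted))

# RMSLE asymmetry: missing by 50 below costs more than missing by 50 above.
print("RMSLE under:", rmsle([100], [50]))
print("RMSLE over: ", rmsle([100], [150]))
```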

Question 11

Q

What do you need to consider when it comes to responsible AI practices?

Answer

A

General best practices: Include diverse perspectives.
Fairness: Has academic, legal, and cultural dimensions. Use statistical methods and test ML models for bias.
Interpretability: Model explanations quantify the contribution of each input feature to a prediction.
Privacy: Minimize leakage of sensitive data.
Security: Protection spans data collection, training, and deployment.

(11 cards)