ML-02 - Logistic regression & classification Flashcards
ML-02 - Logistic regression & classification
Describe what a classification problem is.
A classification problem is one where the output (𝑦) is discrete or categorical, e.g., an email classified as spam or not spam.
ML-02 - Logistic regression & classification
What are the two types of classification problems?
- Binary, 2 classes (Dog or not dog; cat or dog)
- Multiclass (Dog, cat, giraffe or zebra)
ML-02 - Logistic regression & classification
Why do we use logistic regression over linear regression for classification problems?
Logistic regression outputs a value between 0 and 1 that can be read as a class probability and thresholded into a discrete label, whereas linear regression outputs unbounded continuous values.
ML-02 - Logistic regression & classification
What’s the most common logistic function?
The sigmoid function (See image).
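Since the image is not reproduced here, the following is a minimal sketch of the standard sigmoid, σ(z) = 1 / (1 + e^(-z)). The function name `sigmoid` is an illustrative choice, not something taken from the course material.

```python
import math

def sigmoid(z: float) -> float:
    """Standard logistic (sigmoid) function: maps any real z into the range 0..1."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

print(sigmoid(0.0))  # 0.5
```

Large positive inputs approach 1 and large negative inputs approach 0, which is what gives the curve its S shape.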
ML-02 - Logistic regression & classification
What’s the name of the pictured function? (See image)
The sigmoid function.
ML-02 - Logistic regression & classification
How do we apply logistic regression?
(See image)
ML-02 - Logistic regression & classification
How do we interpret the outputs of a logistic regression model?
Estimated probability of the positive case being true.
E.g. P(y = 1 | x ; w)
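As a rough illustration of this interpretation, the sketch below computes P(y = 1 | x ; w) as the sigmoid of the weighted sum and turns it into a class label. The weights, the 0.5 threshold, the bias convention (x[0] = 1) and the function names are assumptions made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, x):
    """Estimated P(y = 1 | x; w); x[0] is assumed to be the constant bias feature 1."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return sigmoid(z)

def predict(w, x, threshold=0.5):
    """Turn the probability into a discrete label (threshold is an assumption)."""
    return 1 if predict_proba(w, x) >= threshold else 0

w = [-4.0, 1.0]  # hypothetical weights: bias -4, slope 1
print(predict_proba(w, [1.0, 4.0]))  # z = 0, so the probability is 0.5
print(predict(w, [1.0, 6.0]))        # z = 2, probability above 0.5, label 1
```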
ML-02 - Logistic regression & classification
What is another name for the raw inputs to a logistic function?
Logits.
ML-02 - Logistic regression & classification
What is a logit?
The raw score z = w^T x that is fed into the logistic function is often called a logit (the log-odds); the function's output is a probability.
ML-02 - Logistic regression & classification
What is a linear decision boundary?
A straight line (a hyperplane in higher dimensions) that separates one class of data from another class.
(See image)
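A small sketch of a linear boundary with two features, assuming hypothetical weights w = [-3, 1, 1], so that the boundary is the line x1 + x2 = 3. Points with a non-negative score are labelled 1 by convention here.

```python
def linear_score(w, x1, x2):
    """Weighted sum w0 + w1*x1 + w2*x2; the decision boundary is where this equals 0."""
    return w[0] + w[1] * x1 + w[2] * x2

def classify(w, x1, x2):
    """Label 1 on or above the line, 0 below it."""
    return 1 if linear_score(w, x1, x2) >= 0 else 0

w = [-3.0, 1.0, 1.0]      # hypothetical weights: boundary is x1 + x2 = 3
print(classify(w, 2, 2))  # score 1, label 1
print(classify(w, 1, 1))  # score -1, label 0
```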
ML-02 - Logistic regression & classification
What is a non-linear decision boundary?
A curved boundary, such as a circle or another polynomial curve, that separates different classes of data.
(See image)
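One common way to get such a boundary is to feed non-linear features into the same linear score. The sketch below assumes squared features and hypothetical weights w = [-1, 1, 1], which gives the unit circle x1² + x2² = 1 as the boundary.

```python
def circle_score(w, x1, x2):
    """Score on squared features: w0 + w1*x1^2 + w2*x2^2."""
    return w[0] + w[1] * x1 ** 2 + w[2] * x2 ** 2

def classify_nonlinear(w, x1, x2):
    """Label 1 on or outside the curve, 0 inside it."""
    return 1 if circle_score(w, x1, x2) >= 0 else 0

w = [-1.0, 1.0, 1.0]                # hypothetical weights: boundary is the unit circle
print(classify_nonlinear(w, 0, 0))  # inside the circle, label 0
print(classify_nonlinear(w, 2, 0))  # outside the circle, label 1
```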
ML-02 - Logistic regression & classification
What is an advantage of using non-linear decision boundaries?
They can separate classes that a straight line cannot, so they can represent more complex decision boundaries.
ML-02 - Logistic regression & classification
How do you write a decision boundary in matrix form?
(See image)
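The image is not available here. Under the usual convention (a bias feature x_0 = 1, and class 1 predicted when the estimated probability is at least 0.5), the boundary is commonly written as below; the symbol names are assumptions and may differ from the course notation.

```latex
\mathbf{w}^{\top}\mathbf{x} = 0,
\qquad
\mathbf{w} = \begin{bmatrix} w_0 \\ w_1 \\ \vdots \\ w_n \end{bmatrix},
\quad
\mathbf{x} = \begin{bmatrix} 1 \\ x_1 \\ \vdots \\ x_n \end{bmatrix}
```

This follows because the sigmoid of z is at least 0.5 exactly when z is at least 0.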
ML-02 - Logistic regression & classification
What is the loss function used for logistic regression?
Cross-entropy loss
ML-02 - Logistic regression & classification
What’s another name for cross-entropy loss?
Log loss
ML-02 - Logistic regression & classification
What’s another name for log loss?
Cross-entropy loss
ML-02 - Logistic regression & classification
What’s the formula for cross-entropy loss?
(See image)
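The image is not available, so the following is only the commonly used general form for m examples and K classes, which may differ in notation from the course slide. Here y_k^{(i)} is 1 if example i belongs to class k (0 otherwise) and \hat{y}_k^{(i)} is the predicted probability of class k; these symbols are assumptions.

```latex
L(\mathbf{w}) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} y_k^{(i)} \, \log \hat{y}_k^{(i)}
```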
ML-02 - Logistic regression & classification
What loss formula is depicted? (See image)
Cross-entropy loss
ML-02 - Logistic regression & classification
What loss function would you use for a binary class problem, when using logistic regression?
Binary cross-entropy loss
ML-02 - Logistic regression & classification
What is the formula for binary cross-entropy loss?
(See image)
ML-02 - Logistic regression & classification
What loss function is this? (See image)
Binary cross-entropy loss
ML-02 - Logistic regression & classification
What’s the full formula for binary cross-entropy loss?
(See image)
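Since the image is missing, here is a hedged sketch of the usual averaged form, J(w) = -(1/m) Σ [y log(ŷ) + (1 - y) log(1 - ŷ)], where ŷ is the predicted probability of the positive class. The function name and the `eps` clipping value are assumptions added to keep the logarithm finite.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy (log loss) over a list of labels and probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip so log(0) never occurs
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)

print(binary_cross_entropy([1, 0], [0.9, 0.1]))  # confident and correct: small loss
print(binary_cross_entropy([1, 0], [0.1, 0.9]))  # confident and wrong: large loss
```

Only one of the two terms is active for each example: the first when y = 1, the second when y = 0.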
ML-02 - Logistic regression & classification
What are 3 more advanced optimization algorithms mentioned in this chapter?
- CG
- BFGS
- L-BFGS
ML-02 - Logistic regression & classification
What’s the optimization algorithm CG short for?
Conjugate gradient
ML-02 - Logistic regression & classification
What’s the optimization algorithm BFGS short for?
Broyden, Fletcher, Goldfarb, and Shanno
ML-02 - Logistic regression & classification
What are the advantages of CG, BFGS and L-BFGS?
- No need to manually pick a learning rate (𝛼).
- They often converge faster than gradient descent.
ML-02 - Logistic regression & classification
What are the disadvantages of CG, BFGS and L-BFGS?
They’re more complex than gradient descent.
ML-02 - Logistic regression & classification
How does algorithm performance scale with data size?
Different algorithms tend to perform similarly when the dataset is large.
It’s not who has the best algorithm that wins, it’s who has the most/best data.
ML-02 - Logistic regression & classification
How do you deal with computational cost as dataset size grows? (2)
- Use variants of gradient descent.
- Map/reduce + parallelism
ML-02 - Logistic regression & classification
How do you parallel process in machine learning?
Split the training dataset into pieces, compute on them in parallel, and then combine the results.
(See image)
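The split, compute, combine pattern can be sketched for the logistic regression gradient as follows. This is an illustration under assumptions: the function names and toy data are made up, and threads are used only to keep the example self-contained (in standard CPython they will not speed up this pure-Python arithmetic; real gains come from separate processes, several machines, or a GPU).

```python
import math
from concurrent.futures import ThreadPoolExecutor

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def partial_gradient(w, chunk):
    """Sum of per-example gradients (h(x) - y) * x over one piece of the data."""
    grad = [0.0] * len(w)
    for x, y in chunk:
        err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
        for j, xj in enumerate(x):
            grad[j] += err * xj
    return grad

def parallel_gradient(w, data, n_chunks=4):
    """Split the data, compute partial sums concurrently, then combine and average."""
    chunks = [data[i::n_chunks] for i in range(n_chunks)]            # split ("map" input)
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partials = list(pool.map(lambda c: partial_gradient(w, c), chunks))
    m = len(data)
    return [sum(p[j] for p in partials) / m for j in range(len(w))]  # combine ("reduce")

# Toy data: (features with bias 1.0 first, label)
data = [([1.0, 0.5], 0), ([1.0, 1.5], 0), ([1.0, 2.5], 1), ([1.0, 3.5], 1), ([1.0, 4.5], 1)]
w = [0.0, 0.0]
print(parallel_gradient(w, data))
```

The combine step works because the full gradient is a plain sum over examples, so partial sums from each piece can simply be added.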
ML-02 - Logistic regression & classification
What are the names of the two big GPU platforms/programming models?
- CUDA
- OpenCL
ML-02 - Logistic regression & classification
Who’s behind CUDA?
Nvidia
ML-02 - Logistic regression & classification
Who’s behind OpenCL?
Apple (original developer) and the Khronos Group (maintains the standard)
ML-02 - Logistic regression & classification
What’s Nvidia’s GPU computing platform/programming model called?
CUDA
ML-02 - Logistic regression & classification
What’s Apple + Khronos Group’s GPU computing platform/programming model called?
OpenCL
ML-02 - Logistic regression & classification
How do you handle multiclass problems?
Use “1 vs. rest”.
ML-02 - Logistic regression & classification
What is “1 vs. rest”?
- Treat multiclass classification as multiple binary class problems.
- Train 1 classifier for each problem.
(See image)
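The two steps above can be sketched as follows: one binary logistic regression is trained per class, and a new point is assigned to the class whose classifier reports the highest probability. The toy data, the plain gradient descent trainer, the hyperparameters, and all function names are assumptions made for illustration.

```python
import math

def sigmoid(z):
    # Sign-split form avoids overflow in math.exp for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_binary(X, y, lr=0.1, epochs=3000):
    """Batch gradient descent on the binary cross-entropy loss."""
    w = [0.0] * len(X[0])
    m = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(dot(w, xi)) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
        w = [wj - lr * gj / m for wj, gj in zip(w, grad)]
    return w

def train_one_vs_rest(X, labels):
    """Train one 'this class vs. everything else' classifier per class."""
    return {c: train_binary(X, [1 if l == c else 0 for l in labels])
            for c in sorted(set(labels))}

def predict(models, x):
    """Pick the class whose classifier gives the highest probability."""
    return max(models, key=lambda c: sigmoid(dot(models[c], x)))

# Toy data: [bias, x1, x2] with three well-separated clusters.
X = [[1, 0, 0], [1, 0, 1], [1, 1, 0],
     [1, 5, 0], [1, 5, 1], [1, 6, 0],
     [1, 0, 5], [1, 1, 5], [1, 0, 6]]
labels = ["dog"] * 3 + ["cat"] * 3 + ["zebra"] * 3

models = train_one_vs_rest(X, labels)
print(predict(models, [1, 0.5, 0.5]))
print(predict(models, [1, 5.5, 0.5]))
```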
ML-02 - Logistic regression & classification
What do you call the depicted type of classifier?
(See image)
1 vs. rest (for multiclass classification)