Module 3 Flashcards
1
Q
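What is a confusion matrix?
A
A table that measures the performance of a machine learning classifier by counting its correct and incorrect predictions for each class (true positives, true negatives, false positives, false negatives)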
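A minimal sketch of building a confusion matrix for a binary classifier. The labels below are made-up illustration data, and the layout (rows = actual class, columns = predicted class) is an assumed convention:

```python
def confusion_matrix(y_true, y_pred):
    """Count outcomes for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    # Rows are actual classes (0, 1); columns are predicted classes (0, 1).
    return [[tn, fp], [fn, tp]]


# Hypothetical labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

matrix = confusion_matrix(y_true, y_pred)
accuracy = (matrix[0][0] + matrix[1][1]) / len(y_true)
print(matrix)    # [[3, 1], [1, 3]]
print(accuracy)  # 0.75
```

Accuracy is only one number that can be read off the matrix; the four counts also show which kind of mistake the model makes.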
2
Q
Logistic regression?
A
Outputs a probability p that a data point belongs to class 1
If p > 0.5, the data point is labeled 1
If p < 0.5, the data point is labeled 0
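The thresholding rule can be sketched as follows. This assumes the probability comes from the sigmoid of a linear score z; the function names are hypothetical, and sending a probability of exactly 0.5 to class 0 is an arbitrary choice made here, since the card does not cover that case:

```python
import math


def sigmoid(z):
    """Map a linear score z to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_label(z, threshold=0.5):
    """Return 1 if the predicted probability exceeds the threshold, else 0."""
    p = sigmoid(z)
    # Assumption: p exactly equal to the threshold falls into class 0.
    return 1 if p > threshold else 0


print(predict_label(2.0))   # positive score, p > 0.5, labeled 1
print(predict_label(-2.0))  # negative score, p < 0.5, labeled 0
```

Raising or lowering `threshold` trades false positives against false negatives without refitting the model.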
3
Q
Linear Regression?
A
Choosing parameters (the coefficients fit to the data)
4
Q
Ridge/Lasso Regression?
A
Choosing alpha
5
Q
K-Nearest Neighbors?
A
Choosing n_neighbors
6
Q
Parameters like alpha and k?
A
Hyperparameters
7
Q
Choosing the correct hyperparameter?
A
Try several different hyperparameter values
Fit a model for each value separately
Use cross-validation to compare them, so the choice does not depend on a single train/test split
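A rough sketch of that procedure for k-nearest neighbors, written in plain Python so every step is visible. The one-feature dataset is made up for illustration, and the helper names (`knn_predict`, `cross_val_accuracy`) are hypothetical, not from any library:

```python
def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points closest to x (1-D feature)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: abs(pair[0] - x))[:k]
    votes_for_one = sum(label for _, label in nearest)
    return 1 if votes_for_one * 2 > k else 0


def cross_val_accuracy(X, y, k, n_folds=5):
    """Mean accuracy over n_folds held-out folds for a given k."""
    fold_scores = []
    for fold in range(n_folds):
        # Every n_folds-th point (starting at `fold`) is held out for testing.
        test_idx = set(range(fold, len(X), n_folds))
        train_X = [X[i] for i in range(len(X)) if i not in test_idx]
        train_y = [y[i] for i in range(len(X)) if i not in test_idx]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx
        )
        fold_scores.append(correct / len(test_idx))
    return sum(fold_scores) / n_folds


# Hypothetical noisy data: larger x values are mostly class 1.
X = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0,
     5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0]
y = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1,
     0, 1, 1, 1, 1, 0, 1, 1, 1, 1]

candidates = [1, 3, 5]  # hyperparameter values to try
scores = {k: cross_val_accuracy(X, y, k) for k in candidates}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

Each candidate is scored only on points its model never saw during fitting, which is what makes the comparison between values fair.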
8
Q
Why scale data?
A
Many models use some form of distance between data points to make predictions
Features on larger scales can unduly influence the model
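A small illustration of the problem, using Euclidean distance on two features with very different scales. The people and their (age, income) values are hypothetical:

```python
import math


def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical points: (age in years, income in dollars).
a = (25, 50_000)
b = (60, 50_500)  # 35 years older than a, income differs by 500
c = (26, 52_000)  # 1 year older than a, income differs by 2,000

d_ab = euclidean(a, b)
d_ac = euclidean(a, c)
print(d_ab < d_ac)  # True: b looks "closer" to a despite the large age gap
```

Because income is measured in thousands, it dominates the distance and the age difference is almost ignored.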
9
Q
Ways to normalize the data?
A
Standardization - subtract the mean and divide by the standard deviation, so each feature is centered at zero with unit variance
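A minimal sketch of standardization on one feature. It assumes the population standard deviation is used (a sample standard deviation is another common choice), and the income values are made up:

```python
import statistics


def standardize(values):
    """Center values at zero and rescale them by their standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # assumption: population standard deviation
    return [(v - mean) / std for v in values]


# Hypothetical incomes in dollars.
incomes = [50_000, 50_500, 52_000, 48_000, 49_500]
scaled = standardize(incomes)
print(scaled)
```

After scaling, the feature has mean 0 and standard deviation 1, so it contributes to distances on the same footing as other standardized features.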