Module 3 Flashcards

1
Q

What is a confusion matrix?

A

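A performance measurement for machine learning classification: a table that compares predicted labels with actual labels (true positives, true negatives, false positives, false negatives)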
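
As a minimal sketch of the idea, the four cells of a binary confusion matrix can be counted by hand. The function name `confusion_counts` and the sample labels are made up for illustration and are not part of the course material.

```python
def confusion_counts(y_true, y_pred):
    """Count the four cells of a binary confusion matrix (labels are 0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(1 for t, p in pairs if t == 1 and p == 1),  # true positives
        "tn": sum(1 for t, p in pairs if t == 0 and p == 0),  # true negatives
        "fp": sum(1 for t, p in pairs if t == 0 and p == 1),  # false positives
        "fn": sum(1 for t, p in pairs if t == 1 and p == 0),  # false negatives
    }


# Hypothetical actual and predicted labels for eight samples.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
# Accuracy is the share of predictions that land on the matrix diagonal.
accuracy = (counts["tp"] + counts["tn"]) / len(y_true)
print(counts, accuracy)
```

Metrics such as precision and recall can be derived from the same four counts.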

2
Q

Logistic regression?

A

Outputs probabilities
If p > 0.5, the data point is labeled 1
If p < 0.5, the data point is labeled 0
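
The thresholding rule can be sketched as follows. The raw scores are invented for illustration, and sending p equal to exactly 0.5 to class 0 is an assumption, since the card does not cover that case.

```python
import math


def sigmoid(z):
    """Map a raw linear score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_label(p, threshold=0.5):
    """Label 1 if p is above the threshold, else 0 (a tie goes to 0 by assumption)."""
    return 1 if p > threshold else 0


# Hypothetical raw scores, as a fitted model might produce them.
scores = [-2.0, 0.3, 1.5]
probs = [sigmoid(z) for z in scores]
labels = [predict_label(p) for p in probs]
print(probs, labels)
```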

3
Q

Linear Regression?

A

Choosing the parameters (coefficients and intercept) that best fit the data

4
Q

Ridge/Lasso Regression?

A

Choosing alpha, the regularization strength

5
Q

K-Nearest Neighbors?

A

Choosing n_neighbors, the number of neighbors (k) that vote on each prediction

6
Q

Parameters like alpha and k?

A

Hyperparameters: parameters that must be set before fitting because they cannot be learned from the data

7
Q

Choosing the correct hyperparameter?

A

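Try a range of different hyperparameter values
Fit a model for each value separately
Compare how well each performs and keep the best one
It is essential to use cross-validation for the comparison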
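
A rough sketch of this procedure, using a tiny hand-written k-nearest-neighbors classifier on one feature. The data, the candidate values of k, and the helper names `knn_predict` and `cv_accuracy` are all hypothetical; in practice a library routine would normally do this work.

```python
def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (one feature)."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def cv_accuracy(data, k, n_folds):
    """Mean accuracy over n_folds folds, each held out once for scoring."""
    scores = []
    for fold in range(n_folds):
        held_out = data[fold::n_folds]
        train = [pair for i, pair in enumerate(data) if i % n_folds != fold]
        hits = sum(knn_predict(train, x, k) == y for x, y in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / n_folds


# Hypothetical (feature, label) pairs with one noisy point at x = 2.5.
data = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0), (2.5, 1), (3.0, 0),
        (3.5, 1), (4.0, 1), (4.5, 1), (5.0, 1), (5.5, 1), (6.0, 1)]

candidates = [1, 3, 5]  # hyperparameter values to try
results = {k: cv_accuracy(data, k, n_folds=3) for k in candidates}
best_k = max(results, key=results.get)  # highest mean cross-validated accuracy
print(results, best_k)
```

Each candidate is scored only on data it was not fitted on, which is the reason cross-validation is used for the comparison.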

8
Q

Why scale data?

A

Many models use some form of distance between data points to make predictions

Features on larger scales can unduly influence the model
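
To illustrate, here is a small sketch with two hypothetical features, age in years and income in dollars. The numbers are invented; the point is only that the feature with the larger scale dominates the Euclidean distance.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical points: (age in years, income in dollars).
p = (25, 50_000)
q = (60, 50_500)  # very different age, similar income
r = (26, 52_000)  # almost the same age, somewhat different income

d_pq = euclidean(p, q)
d_pr = euclidean(p, r)
# q comes out closer to p than r does, because the income gap swamps the age gap.
print(d_pq, d_pr)
```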

9
Q

Ways to normalize the data?

A

Standardization - subtract the mean and divide by the standard deviation, so each feature is centered on zero with variance one
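
A minimal sketch of standardization using only the standard library. It divides by the standard deviation (the square root of the variance); the choice of the population standard deviation and the sample values are assumptions made for illustration.

```python
import statistics


def standardize(values):
    """Subtract the mean, then divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Hypothetical feature on a large scale.
incomes = [30_000, 45_000, 60_000, 75_000, 90_000]
scaled = standardize(incomes)
# The scaled values have mean 0 and standard deviation 1.
print(scaled)
```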
