Model Evaluation Flashcards

1
Q

What is the requirement for a goodness metric for backpropagation to work?

A

has to be differentiable

2
Q

How do you define the goodness of your model?

A

Ability to generalize to unseen data

3
Q

What can affect generalizability?

A
  • algorithm
  • hyperparameter values
  • training data
  • random initialization (can affect accuracy)
4
Q

What are 4 methods of quantifying generalization goodness?

A

1) accuracy, precision, recall
2) mean absolute error
3) RMSE
4) area under ROC curve

5
Q

Come up with a metric for Predicting lung cancer from chest x-rays

A

Recall, precision, and the false-negative rate (FN/P), since a missed cancer (false negative) is the costliest error

6
Q

Come up with a metric for predicting high school GPA

A

MAE

7
Q

Come up with a metric for evaluating search engine results

A

recall

8
Q

Come up with a metric for predicting the location of an object in 3D space

A

Euclidean or cosine distance
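Both distances can be sketched in plain Python (standard library only; function names are illustrative):

```python
import math

def euclidean(a, b):
    # Straight-line distance between two points in space
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    # 1 - cosine similarity; measures the angle between vectors,
    # ignoring their magnitudes
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (na * nb)

print(euclidean((0, 0, 0), (3, 4, 0)))   # 5.0
print(cosine_distance((1, 0), (0, 1)))   # 1.0 (orthogonal vectors)
```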

9
Q

Come up with a metric for predicting if a Twitter user is a liberal or conservative

A

AUC

10
Q

What is sensitivity?

A

True positive rate; also known as recall

11
Q

What is specificity?

A

True negative rate

12
Q

What is precision?

A

A measure of exactness: the percentage of tuples labeled as positive that actually are positive

13
Q

What is the F-measure?

A

Harmonic mean of precision and recall, giving equal weight to each
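The classification metrics from the last few cards can be computed directly from confusion-matrix counts. A minimal Python sketch (function name is illustrative):

```python
def classification_metrics(tp, fp, tn, fn):
    # Sensitivity / recall: TP / (TP + FN)
    recall = tp / (tp + fn)
    # Specificity: TN / (TN + FP)
    specificity = tn / (tn + fp)
    # Precision: TP / (TP + FP)
    precision = tp / (tp + fp)
    # F-measure: harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall)
    return recall, specificity, precision, f1

print(classification_metrics(tp=8, fp=2, tn=85, fn=5))
```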

14
Q

What metrics can you use for regression?

A

MAE, RMSE
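Both regression metrics in one short Python sketch (standard library only; names are illustrative):

```python
import math

def mae(y_true, y_pred):
    # Mean absolute error: average magnitude of the residuals
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    # Root mean squared error: squaring weights large errors more heavily
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

y_true = [3.0, 3.5, 2.0, 4.0]
y_pred = [2.8, 3.9, 2.1, 1.0]   # last prediction is an outlier
print(mae(y_true, y_pred), rmse(y_true, y_pred))
```

Note that RMSE comes out larger than MAE on this data because the single outlier (error 3.0) dominates once squared.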

15
Q

Why do we take the square of metrics?

A

Taking the square results in steeper gradients when searching and gives higher penalties to large errors; RMSE accentuates the impact of outliers. Without squaring, the search might take too long to converge.

16
Q

What is a condition of the AUC

A

A binary-classification metric: it applies only when the label is binary, and the predictions must be probabilistic (scores, not hard labels).

17
Q

What is the AUC?

A

The probability that a randomly chosen positive example receives a higher predicted score than a randomly chosen negative example

18
Q

What does an AUC of 0.5 mean?

A

AUC = 0.50 means no better than random chance
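The pairwise definition of AUC from the previous cards can be computed directly, without drawing the ROC curve. A minimal Python sketch (O(n²), fine for small data; names are illustrative):

```python
def auc(scores, labels):
    # Probability that a random positive outranks a random negative;
    # ties count as half a win.
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc([0.9, 0.8, 0.4, 0.3], [1, 1, 0, 0]))  # 1.0: perfect separation
print(auc([0.5, 0.5], [1, 0]))                  # 0.5: no better than chance
```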

19
Q

What is K-fold CV?

A

Tuples are randomly partitioned into K mutually exclusive subsets (folds)

20
Q

What is a standard value for K?

A

5 or 10

21
Q

What is the algorithm for K-fold CV?

A

1) Choose a K and randomly assign each row to a single fold (between 1 and K)
- A row's fold assignment indicates the phase in which that row will serve as part of the test set
2) Conduct K phases of training/testing: in phase i, fold i serves as the test set and the remaining folds serve as the training set
3) Calculate the error metric once on the whole dataset (the concatenation of predictions from each fold)
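The three steps above can be sketched in plain Python. The "model" here is a deliberate stand-in that predicts the training-set mean label; the function name and data shape are illustrative:

```python
import random

def kfold_mean_error(data, k=5, seed=0):
    """K-fold CV sketch. data is a list of (features, label) rows;
    the stand-in model predicts the mean label of the training folds."""
    rng = random.Random(seed)
    # 1) Randomly assign each row to a fold 0..k-1 (shuffled round-robin
    #    so every fold is non-empty)
    idx = list(range(len(data)))
    rng.shuffle(idx)
    folds = [0] * len(data)
    for pos, j in enumerate(idx):
        folds[j] = pos % k
    # 2) K phases: fold i is the test set, the other folds are training
    preds = [0.0] * len(data)
    for i in range(k):
        train_labels = [y for (_, y), f in zip(data, folds) if f != i]
        mean = sum(train_labels) / len(train_labels)
        for j, f in enumerate(folds):
            if f == i:
                preds[j] = mean
    # 3) One error metric (MAE) over the concatenated out-of-fold predictions
    return sum(abs(y - p) for (_, y), p in zip(data, preds)) / len(data)

data = [((x,), float(x % 3)) for x in range(30)]
print(kfold_mean_error(data))
```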

22
Q

Do you calculate an error metric for each fold in K-fold CV?

A

No, it is still only calculated once, over the concatenated predictions from all folds

23
Q

What is CV used for?

A

Model selection

24
Q

When would CV NOT give you better results than handpicking your train and test data?

A

With temporal data, where random folds would let the model train on the future and test on the past

25
Q

What is ROC curve?

A

Shows the tradeoff between TPR and FPR as the classification threshold varies. It lets us visualize, for each threshold, the rate at which the model correctly identifies positive cases against the rate at which it mistakenly identifies negative cases as positive