Classification Flashcards

1
Q

How is LDA similar to Logistic regression?

A

Both LDA and Logistic regression produce a linear decision boundary.

2
Q

True Positive

A

Actual = 1 | Prediction = 1

3
Q

False Negative

A

Actual = 1 | Prediction = 0

4
Q

When can accuracy be misleading?

A

When the data set is imbalanced and correctly predicting the minority class is critical.
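
As a rough illustration, the sketch below uses made-up data (990 negatives, 10 positives) and a hypothetical "model" that always predicts the majority class. It is only meant to show how a high accuracy can hide a useless minority-class prediction.

```python
# Hypothetical data: 990 healthy patients (0) and 10 sick patients (1).
actual = [0] * 990 + [1] * 10

# A "model" that always predicts the majority class.
predicted = [0] * len(actual)

correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)

# Recall on the minority class: share of actual positives that were found.
true_pos = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
actual_pos = sum(a == 1 for a in actual)
minority_recall = true_pos / actual_pos

print(f"accuracy = {accuracy:.2f}")                # accuracy = 0.99
print(f"minority recall = {minority_recall:.2f}")  # minority recall = 0.00
```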

5
Q

What is Macro Precision (or Recall)?

A

The unweighted average of the per-class Precision (or Recall) values, so every class counts equally regardless of its size.
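
A minimal sketch of macro precision on a made-up three-class example. The function names and the rule "precision is 0 when a class is never predicted" are assumptions chosen for this illustration.

```python
def per_class_precision(actual, predicted, label):
    """Precision for one class, treating `label` as the positive class."""
    tp = sum(a == label and p == label for a, p in zip(actual, predicted))
    predicted_as_label = sum(p == label for p in predicted)
    # Assumed convention: precision is 0 when the class is never predicted.
    return tp / predicted_as_label if predicted_as_label else 0.0


def macro_precision(actual, predicted):
    """Plain (unweighted) mean of the per-class precision values."""
    labels = sorted(set(actual) | set(predicted))
    scores = [per_class_precision(actual, predicted, c) for c in labels]
    return sum(scores) / len(scores)


# Hypothetical labels for eight observations.
actual    = ["a", "a", "a", "a", "b", "b", "c", "c"]
predicted = ["a", "a", "a", "b", "b", "b", "c", "a"]

# Per-class precision is 3/4 for "a", 2/3 for "b" and 1/1 for "c".
print(round(macro_precision(actual, predicted), 4))
```

Macro recall follows the same pattern, with the number of actual members of each class in the denominator.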

6
Q

What is Confusion Matrix?

A

A confusion matrix is a table that counts, for each combination of actual class and predicted class, how many observations fall into it.
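
For the two-class case, the counts can be tallied as in the sketch below. The data and the helper name `confusion_counts` are hypothetical, and the layout (rows = actual, columns = predicted) is one common convention rather than the only one.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count each (actual, predicted) pair for 0/1 labels."""
    pairs = Counter(zip(actual, predicted))
    return {
        "TP": pairs[(1, 1)],  # Actual = 1 | Prediction = 1
        "FN": pairs[(1, 0)],  # Actual = 1 | Prediction = 0
        "FP": pairs[(0, 1)],  # Actual = 0 | Prediction = 1
        "TN": pairs[(0, 0)],  # Actual = 0 | Prediction = 0
    }


actual    = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
counts = confusion_counts(actual, predicted)

# Rows are actual classes, columns are predicted classes.
print("            pred=1  pred=0")
print(f"actual=1    {counts['TP']:>6}  {counts['FN']:>6}")
print(f"actual=0    {counts['FP']:>6}  {counts['TN']:>6}")
```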

7
Q

Why do we only calculate Precision and Recall for the positive class?

A

Because the positive class is the outcome we are most concerned with, for example:

  • Spam detection: spam is the positive class.
  • Anomaly detection: the anomaly is the positive class.
  • Disease detection: the disease is the positive class.

8
Q

What is Weighted Precision (or Recall)?

A

The average of the per-class Precision (or Recall) values, with each class weighted by its number of actual observations (its support).
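
A small sketch of the weighting, assuming the per-class recall values and class sizes are already known. All numbers and names here are made up for illustration.

```python
def weighted_average(scores, supports):
    """Average per-class scores, weighting each by its class support."""
    total = sum(supports.values())
    return sum(scores[c] * supports[c] for c in scores) / total


# Hypothetical per-class recall values and class sizes (support).
recall  = {"a": 0.90, "b": 0.50, "c": 0.20}
support = {"a": 80,   "b": 15,   "c": 5}

weighted_recall = weighted_average(recall, support)
unweighted_recall = sum(recall.values()) / len(recall)

print(f"weighted   = {weighted_recall:.3f}")    # weighted   = 0.805
print(f"unweighted = {unweighted_recall:.3f}")  # unweighted = 0.533
```

The large class "a" dominates the weighted figure, while the unweighted mean is pulled down by the small, poorly predicted classes.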

9
Q

How much accuracy is good?

A

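It depends on the problem.
- In medical data, even 99% accuracy can be bad: if only 1% of patients have the disease, a model that labels everyone healthy reaches 99% accuracy and misses every case.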

10
Q

What is accuracy?

A

Accuracy = Correct Predictions / Total Predictions = (TP + TN)/(TP + TN + FP + FN)

11
Q

Problem with accuracy

A

Accuracy says nothing about the kind of error a model makes.

Actual = 0 | Prediction = 1
Actual = 1 | Prediction = 0

These two cases are different errors (a false positive and a false negative), but accuracy counts them the same.
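
To make this concrete, the sketch below compares two hypothetical sets of predictions that reach the same accuracy while making opposite kinds of mistakes. The data and the helper `error_breakdown` are invented for illustration.

```python
def error_breakdown(actual, predicted):
    """Return (accuracy, false positives, false negatives) for 0/1 labels."""
    pairs = list(zip(actual, predicted))
    accuracy = sum(a == p for a, p in pairs) / len(pairs)
    false_pos = sum(a == 0 and p == 1 for a, p in pairs)
    false_neg = sum(a == 1 and p == 0 for a, p in pairs)
    return accuracy, false_pos, false_neg


actual  = [1, 1, 1, 1, 0, 0, 0, 0]
model_a = [1, 1, 1, 1, 1, 1, 0, 0]  # errs only with false positives
model_b = [1, 1, 0, 0, 0, 0, 0, 0]  # errs only with false negatives

print(error_breakdown(actual, model_a))  # (0.75, 2, 0)
print(error_breakdown(actual, model_b))  # (0.75, 0, 2)
```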

12
Q

Type I error

A

False Positive

13
Q

When to use Logistic regression or LDA?

A

LDA assumes that the observations in each class are drawn from a Gaussian distribution with a covariance matrix shared by all classes.

  • When the data approximately follow this Gaussian assumption, LDA can perform better than Logistic regression.
  • If the Gaussian assumptions are not met, Logistic regression can outperform LDA.
14
Q

Type II error

A

False Negative

15
Q

Precision

A

How accurate we are when we predict positive, i.e., of all the observations that we predicted as positive, how many actually are positive.

Precision = TP/(TP+FP)

16
Q

How is LDA different from Logistic Regression?

A
  • In the case of LDA, the coefficients c0 and c1 of the linear log odds are computed from the estimated class means and the variance of a normal distribution.
  • In the case of Logistic regression, the coefficients β0 and β1 are estimated using maximum likelihood.
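
The LDA side of this can be sketched for one predictor and two classes. Under the Gaussian, shared-variance assumption the log odds are linear, log(p1(x)/p2(x)) = c0 + c1·x, with c1 = (μ1 − μ2)/σ² and c0 = log(π1/π2) − (μ1² − μ2²)/(2σ²). The function name, the toy data, and the use of class proportions as priors are assumptions made for this illustration.

```python
import math
from statistics import mean


def lda_log_odds_coefficients(x_class1, x_class2):
    """Return (c0, c1) such that log(p1(x) / p2(x)) = c0 + c1 * x."""
    n1, n2 = len(x_class1), len(x_class2)
    mu1, mu2 = mean(x_class1), mean(x_class2)
    # Pooled variance shared by both classes (n - K in the denominator, K = 2).
    ss = sum((x - mu1) ** 2 for x in x_class1)
    ss += sum((x - mu2) ** 2 for x in x_class2)
    var = ss / (n1 + n2 - 2)
    # Priors estimated from the class proportions.
    pi1, pi2 = n1 / (n1 + n2), n2 / (n1 + n2)
    c1 = (mu1 - mu2) / var
    c0 = math.log(pi1 / pi2) - (mu1 ** 2 - mu2 ** 2) / (2 * var)
    return c0, c1


# Hypothetical one-dimensional training data for the two classes.
class1 = [4.0, 5.0, 6.0]
class2 = [1.0, 2.0, 3.0]
c0, c1 = lda_log_odds_coefficients(class1, class2)

# The decision boundary is where the log odds are zero.
print(f"c0 = {c0}, c1 = {c1}, boundary at x = {-c0 / c1}")
```

Logistic regression would instead search for β0 and β1 that maximize the likelihood of the observed labels, without assuming a distribution for x.
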
17
Q

Why do we use the harmonic mean rather than the arithmetic mean in the F1 Score?

A

This is because the harmonic mean penalizes the smaller value, be it Precision or Recall.

The harmonic mean is pulled towards the smaller value, so F1 is high only when both Precision and Recall are high.
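
A quick numeric check with made-up values (Precision = 1.0, Recall = 0.1) shows the difference between the two means:

```python
def arithmetic_mean(p, r):
    return (p + r) / 2


def harmonic_mean(p, r):
    # Defined as 0 when both values are 0 to avoid dividing by zero.
    return 2 * p * r / (p + r) if (p + r) else 0.0


precision, recall = 1.0, 0.1
print(f"arithmetic = {arithmetic_mean(precision, recall):.3f}")  # arithmetic = 0.550
print(f"harmonic   = {harmonic_mean(precision, recall):.3f}")    # harmonic   = 0.182
```

The arithmetic mean still looks moderate, while the harmonic mean stays close to the poor recall.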

18
Q

True Negative

A

Actual = 0 | Prediction = 0

19
Q

False Positive

A

Actual = 0 | Prediction = 1

20
Q

F1 - Score

A

Balances Precision and Recall; useful when both matter.

It is the harmonic mean of Precision and Recall.

F1 = 2 × Precision × Recall/(Precision + Recall)
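
The formulas for Precision, Recall and F1 can be combined into one small helper that starts from the confusion-matrix counts. The function name, the zero-division convention, and the example counts are assumptions for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute Precision, Recall and F1 from confusion-matrix counts."""
    # Assumed convention: a metric is 0 when its denominator is 0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts: 30 true positives, 10 false positives, 20 false negatives.
p, r, f1 = precision_recall_f1(tp=30, fp=10, fn=20)
print(f"precision = {p:.2f}, recall = {r:.2f}, F1 = {f1:.3f}")
```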

21
Q

When to use QDA for classification?

A
  • QDA assumes a quadratic decision boundary.
  • QDA serves as a compromise between the non-parametric KNN method and the linear LDA and Logistic regression approaches.
  • It is less flexible than KNN, but it can perform better when the number of training observations is limited, because it does make some assumptions about the form of the decision boundary.
22
Q

Recall

A

Of all the observations that are actually positive, how many did we predict as positive.

Recall = TP/(TP+FN)

23
Q

Why can’t we use linear regression for classification?

A

This is because there is no natural way to convert a qualitative response variable with more than two levels into a quantitative response that is ready for Linear regression.

However, for a classification problem with only two classes, we could fit a linear regression after 0/1 coding of the response variable.
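
The two-class case can be sketched with ordinary least squares on a 0/1-coded response, classifying as 1 when the fitted value exceeds 0.5. The data, the helper `fit_line`, and the 0.5 threshold are assumptions for this illustration. The example also shows a known drawback: some fitted values fall outside the interval [0, 1], so they are hard to read as probabilities.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


x = [1, 2, 3, 4, 5, 6]
y = [0, 0, 0, 1, 1, 1]  # 0/1 coding of a two-class response

intercept, slope = fit_line(x, y)
fitted = [intercept + slope * xi for xi in x]
labels = [1 if f > 0.5 else 0 for f in fitted]

print([round(f, 3) for f in fitted])  # first value is below 0, last is above 1
print(labels)                         # [0, 0, 0, 1, 1, 1]
```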