Classification Flashcards
How is LDA similar to Logistic regression?
Both LDA and Logistic regression produce a linear decision boundary.
True Positive
Actual = 1 | Prediction = 1
False Negative
Actual = 1 | Prediction = 0
When can accuracy be misleading?
When the data set is imbalanced and correctly predicting the minority class is critical.
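A minimal sketch of this situation, assuming made-up labels (990 negatives, 10 positives) and a hypothetical model that always predicts the majority class:

```python
# Hypothetical imbalanced data set: 990 negatives (0) and 10 positives (1).
actual = [0] * 990 + [1] * 10

# A trivial "model" that always predicts the majority class.
predicted = [0] * len(actual)

correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)

# Recall on the minority (positive) class: TP / (TP + FN).
tp = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
fn = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))
minority_recall = tp / (tp + fn)

print(accuracy)         # 0.99
print(minority_recall)  # 0.0
```

The accuracy looks excellent even though not a single minority-class observation is found.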
What is Macro Precision (or Recall)?
The unweighted average of the per-class Precision (or Recall) values.
What is Confusion Matrix?
A confusion matrix is a table that cross-tabulates actual values against predicted values, counting how many observations fall into each (actual, predicted) combination.
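As a rough illustration, the four cells of a binary confusion matrix can be counted directly from two label lists. The data below are invented, and the row/column layout (rows = actual, columns = predicted, class 0 first) is one common convention, assumed here:

```python
from collections import Counter

# Made-up labels for illustration only.
actual    = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

# Count how often each (actual, predicted) pair occurs.
counts = Counter(zip(actual, predicted))

tp = counts[(1, 1)]  # Actual = 1 | Prediction = 1
fn = counts[(1, 0)]  # Actual = 1 | Prediction = 0
fp = counts[(0, 1)]  # Actual = 0 | Prediction = 1
tn = counts[(0, 0)]  # Actual = 0 | Prediction = 0

# Rows are actual classes, columns are predicted classes (order: 0, 1).
confusion_matrix = [[tn, fp],
                    [fn, tp]]

print(confusion_matrix)  # [[3, 1], [1, 3]]
```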
Why do we only calculate Precision and Recall for positive class?
This is because the positive class is our area of concern.
- Spam Detection, Spam - Positive
- Anomaly Detection, Anomaly - Positive
- Disease Detection, Disease - Positive
What is Weighted Precision (or Recall)?
Weighted average of the per-class Precision (or Recall) values, with each class weighted by its number of actual observations (its support).
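A small sketch contrasting the macro and weighted averages on an invented three-class example. The helper name `per_class_precision` is hypothetical, and weighting by support (the number of actual observations in each class) is assumed as the weighting scheme:

```python
def per_class_precision(actual, predicted, label):
    """Precision for one class, treating `label` as the positive class."""
    tp = sum(a == label and p == label for a, p in zip(actual, predicted))
    predicted_as_label = sum(p == label for p in predicted)
    # Assumption: report 0.0 when the class is never predicted.
    return tp / predicted_as_label if predicted_as_label else 0.0

# Made-up labels for illustration only.
actual    = ["a", "a", "a", "a", "b", "b", "c", "c"]
predicted = ["a", "a", "a", "b", "b", "c", "c", "c"]
labels = sorted(set(actual))

precisions = {c: per_class_precision(actual, predicted, c) for c in labels}
support = {c: actual.count(c) for c in labels}

# Macro: every class counts equally.
macro = sum(precisions.values()) / len(labels)

# Weighted: each class counts in proportion to its support.
weighted = sum(precisions[c] * support[c] for c in labels) / len(actual)

print(round(macro, 4))     # 0.7222
print(round(weighted, 4))  # 0.7917
```

The weighted figure is higher here because the largest class ("a") also has the best precision.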
How much accuracy is good?
It depends on the problem.
- In the case of medical data, even an accuracy of 99% can be bad.
What is accuracy?
Accuracy = Correct Predictions / Total Predictions
Problem with accuracy
Accuracy does not tell us about the nature of the incorrect predictions.
Actual = 0 | Prediction = 1
Actual = 1 | Prediction = 0
These two cases are different kinds of error, but accuracy treats them the same.
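To make this concrete, here is a sketch with two hypothetical models (`model_a`, `model_b`) that make different kinds of mistakes on the same invented labels yet score the same accuracy:

```python
def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

def count_errors(actual, predicted):
    """Return (false_positives, false_negatives)."""
    fp = sum(a == 0 and p == 1 for a, p in zip(actual, predicted))
    fn = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))
    return fp, fn

# Made-up labels for illustration only.
actual  = [1, 1, 0, 0]
model_a = [1, 1, 0, 1]  # one false positive
model_b = [1, 0, 0, 0]  # one false negative

print(accuracy(actual, model_a), count_errors(actual, model_a))  # 0.75 (1, 0)
print(accuracy(actual, model_b), count_errors(actual, model_b))  # 0.75 (0, 1)
```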
Type I error
False Positive
When to use Logistic regression or LDA?
LDA assumes that the observations are drawn from a Gaussian distribution.
- When the data follow a Gaussian distribution, LDA tends to perform better than Logistic regression.
- If the Gaussian assumptions are not met, Logistic regression can outperform LDA.
Type II error
False Negative
Precision
How accurate we are at predicting positives, i.e., of all the observations that we predicted positive, how many actually are positive.
Precision = TP/(TP+FP)
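The formula can be sketched as a small function. The name `precision_score_binary` and the sample labels are illustrative, and returning 0.0 when nothing is predicted positive is an assumed convention:

```python
def precision_score_binary(actual, predicted, positive=1):
    """Precision = TP / (TP + FP) for the chosen positive class."""
    tp = sum(a == positive and p == positive for a, p in zip(actual, predicted))
    fp = sum(a != positive and p == positive for a, p in zip(actual, predicted))
    # Assumption: return 0.0 if no observation was predicted positive.
    return tp / (tp + fp) if (tp + fp) else 0.0

# Made-up labels: 3 positive predictions, 2 of them correct.
actual    = [1, 0, 1, 0, 1]
predicted = [1, 1, 1, 0, 0]

print(round(precision_score_binary(actual, predicted), 4))  # 0.6667
```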
How is LDA different from Logistic Regression?
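Both model the log odds as a linear function of x (c0 + c1x for LDA, β0 + β1x for Logistic regression); they differ in how the coefficients are obtained.
- In the case of LDA, c0 and c1 are computed using the estimated mean and variance from a normal distribution.
- In the case of Logistic regression, β0 and β1 are estimated using maximum likelihood.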
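A minimal sketch of the LDA side for one predictor and two classes, assuming the usual LDA setup of class-specific means, a pooled variance shared by both classes, and priors estimated from class frequencies. The function name `lda_log_odds_coefficients` and the data are hypothetical:

```python
import math

def lda_log_odds_coefficients(x, y):
    """Return (c0, c1) such that the log odds of class 1 is c0 + c1 * x."""
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    n, n0, n1 = len(x), len(x0), len(x1)

    # Estimated class means.
    mu0 = sum(x0) / n0
    mu1 = sum(x1) / n1

    # Pooled variance shared by both classes (LDA's equal-variance assumption).
    ss = sum((xi - mu0) ** 2 for xi in x0) + sum((xi - mu1) ** 2 for xi in x1)
    var = ss / (n - 2)

    # Prior probabilities estimated from class frequencies.
    pi0, pi1 = n0 / n, n1 / n

    c1 = (mu1 - mu0) / var
    c0 = -(mu1 ** 2 - mu0 ** 2) / (2 * var) + math.log(pi1 / pi0)
    return c0, c1

# Made-up data for illustration only.
x = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
y = [0, 0, 0, 1, 1, 1]

c0, c1 = lda_log_odds_coefficients(x, y)
print(c0, c1)    # -16.0 4.0
print(-c0 / c1)  # 4.0, the point where the log odds are zero
```

With equal priors, the boundary lands midway between the two class means. Logistic regression would instead search for the coefficients that maximize the likelihood of the observed labels, which has no closed form and is not shown here.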
Why do we use the harmonic mean rather than the arithmetic mean in the F1 Score?
This is because the harmonic mean penalizes the smaller value, be it Precision or Recall.
The harmonic mean is pulled towards the smaller value.
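A quick numeric check with invented values (Precision = 0.9, Recall = 0.1) shows how differently the two means behave when one value is small:

```python
# Illustrative values only: a model with high precision but poor recall.
precision, recall = 0.9, 0.1

arithmetic_mean = (precision + recall) / 2
harmonic_mean = 2 * precision * recall / (precision + recall)

print(round(arithmetic_mean, 2))  # 0.5
print(round(harmonic_mean, 2))    # 0.18
```

The arithmetic mean hides the poor recall, while the harmonic mean stays close to the smaller value.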
True Negative
Actual = 0 | Prediction = 0
False Positive
Actual = 0 | Prediction = 1
F1 - Score
Balances Precision and Recall; useful when both matter.
It is the harmonic mean of Precision and Recall.
F1 = 2 * Precision * Recall / (Precision + Recall)
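As a sketch, F1 can be computed straight from the TP, FP and FN counts. The function name `f1_from_counts` and the counts are illustrative, and the code assumes at least one true positive so that no denominator is zero:

```python
def f1_from_counts(tp, fp, fn):
    """F1 as the harmonic mean of precision and recall (assumes tp > 0)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up counts: precision = 3/4, recall = 3/5.
tp, fp, fn = 3, 1, 2
f1 = f1_from_counts(tp, fp, fn)

print(round(f1, 4))  # 0.6667
```

The same value follows from the equivalent form 2TP / (2TP + FP + FN), here 6 / 9.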
When to use QDA for classification?
- QDA assumes a quadratic decision boundary.
- QDA serves as a compromise between the non-parametric KNN method and the linear LDA and Logistic regression approaches.
- Though not as flexible as KNN, it can perform better than KNN when the number of training observations is limited, because it does make some assumptions about the form of the decision boundary.
Recall
Of all the observations that are actually positive, how many we predicted correctly.
Recall = TP/(TP+FN)
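The formula can be sketched as a small function. The name `recall_score_binary` and the sample labels are illustrative, and returning 0.0 when there are no actual positives is an assumed convention:

```python
def recall_score_binary(actual, predicted, positive=1):
    """Recall = TP / (TP + FN) for the chosen positive class."""
    tp = sum(a == positive and p == positive for a, p in zip(actual, predicted))
    fn = sum(a == positive and p != positive for a, p in zip(actual, predicted))
    # Assumption: return 0.0 if there are no actual positives.
    return tp / (tp + fn) if (tp + fn) else 0.0

# Made-up labels: 4 actual positives, 3 of them found.
actual    = [1, 1, 1, 1, 0, 0]
predicted = [1, 1, 1, 0, 0, 1]

print(recall_score_binary(actual, predicted))  # 0.75
```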
Why can’t we use linear regression for classification?
This is because there is no natural way to convert a qualitative response variable with more than two levels into a quantitative response that is ready for Linear regression.
However, for a classification problem with only two classes, we could fit a linear regression after 0/1 coding of the response variable.
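A rough sketch of the two-class case: fit a one-predictor least-squares line to a 0/1-coded response and classify by thresholding the fitted value at 0.5. The helper `fit_simple_ols`, the threshold, and the data are assumptions made for illustration:

```python
def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up data with a 0/1-coded response.
x = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
y = [0, 0, 0, 1, 1, 1]

b0, b1 = fit_simple_ols(x, y)
fitted = [b0 + b1 * xi for xi in x]

# Assumed rule: predict class 1 when the fitted value exceeds 0.5.
predicted = [1 if f > 0.5 else 0 for f in fitted]

print(predicted)                  # [0, 0, 0, 1, 1, 1]
print(min(fitted) < 0, max(fitted) > 1)  # True True
```

The classification is correct on this toy data, but some fitted values fall outside the range 0 to 1, so they cannot be read directly as probabilities.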