Chapter 1: Introduction Flashcards
What is a supervised learning model?
Building a statistical model for predicting an output based on one or more inputs
What is an unsupervised learning model?
There are inputs but no supervising outputs for the data
What is the formula for accuracy in terms of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN)?
Accuracy = (TP + TN) / (TP + TN + FP + FN)
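As a quick check of the accuracy formula, here is a minimal Python sketch. The function name and the confusion-matrix counts are made up for illustration.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    return (tp + tn) / total


# Hypothetical counts: 40 TP, 45 TN, 5 FP, 10 FN (100 predictions in total).
print(accuracy(tp=40, tn=45, fp=5, fn=10))  # 0.85
```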
What is the formula for precision?
Precision = TP/(TP+FP)
What does precision of a model reflect?
How accurate the positive predictions are: the proportion of predicted positives that are actually positive
What is the formula for recall (sensitivity)?
Recall = TP/(TP+FN)
What does recall (sensitivity) indicate?
Measures the proportion of actual positive cases that are correctly identified by the model. High sensitivity indicates the model is good at identifying positive cases
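To see how precision and recall differ, the sketch below computes both from the same counts. The function names and the counts are illustrative assumptions, and the code assumes the denominators are nonzero.

```python
def precision(tp: int, fp: int) -> float:
    """Share of positive predictions that are truly positive."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Share of actual positive cases that the model identified."""
    return tp / (tp + fn)


# Hypothetical counts: 40 TP, 5 FP, 10 FN.
tp, fp, fn = 40, 5, 10
print(round(precision(tp, fp), 3))  # 0.889
print(recall(tp, fn))               # 0.8
```

Precision is penalized by false positives, while recall is penalized by false negatives, so the two can move in opposite directions.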
What is the formula for specificity?
Specificity = TN/(TN+FP)
What does specificity indicate?
Measures the proportion of actual negative cases that are correctly identified by the model. High specificity indicates the model is good at identifying negative cases
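Specificity can be sketched the same way. The function name and counts below are illustrative assumptions, and the code assumes there is at least one actual negative case.

```python
def specificity(tn: int, fp: int) -> float:
    """Share of actual negative cases that the model identified."""
    return tn / (tn + fp)


# Hypothetical counts: 45 TN, 5 FP (50 actual negatives).
print(specificity(tn=45, fp=5))  # 0.9
```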
What is the formula for the F1 score?
F1 = 2TP/(2TP + FP + FN)
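The F1 score is also the harmonic mean of precision and recall, F1 = 2PR/(P + R), which simplifies to the count-based formula above. The sketch below checks that the two forms agree on made-up counts; the function names are illustrative.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 computed directly from confusion-matrix counts."""
    return 2 * tp / (2 * tp + fp + fn)


def f1_from_precision_recall(p: float, r: float) -> float:
    """F1 as the harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r)


# Hypothetical counts: 40 TP, 5 FP, 10 FN.
tp, fp, fn = 40, 5, 10
p = tp / (tp + fp)  # precision
r = tp / (tp + fn)  # recall
print(round(f1_from_counts(tp, fp, fn), 4))       # 0.8421
print(round(f1_from_precision_recall(p, r), 4))   # 0.8421
```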
What is the formula for mean squared error?
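MSE = (1/n) * Σ_{t=1}^{n} e_t^2
Where e_t is the error (the difference between the actual and predicted values) of observation t, and n is the number of observations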
What is the formula for mean absolute error?
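MAE = (1/n) * Σ_{t=1}^{n} |e_t|
Where e_t is the error (the difference between the actual and predicted values) of observation t, and n is the number of observations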
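A minimal sketch of both error metrics, assuming the error of each observation is the actual value minus the predicted value. The function names and the sample data are made up for illustration.

```python
def mean_squared_error(actual: list[float], predicted: list[float]) -> float:
    """Average of the squared errors."""
    errors = [a - p for a, p in zip(actual, predicted)]
    return sum(e ** 2 for e in errors) / len(errors)


def mean_absolute_error(actual: list[float], predicted: list[float]) -> float:
    """Average of the absolute errors."""
    errors = [a - p for a, p in zip(actual, predicted)]
    return sum(abs(e) for e in errors) / len(errors)


# Hypothetical data: the errors are 1, 0, -2, -1.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]
print(mean_squared_error(actual, predicted))   # 1.5
print(mean_absolute_error(actual, predicted))  # 1.0
```

Because MSE squares each error, the single error of size 2 contributes 4 of the 6 total squared units here, which is why MSE is more sensitive to large errors than MAE.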
What is model bias?
The difference between the model's expected (average) prediction and the correct value we are trying to predict for the given data points
What is variance?
The variability of the model's prediction for given data points when the model is trained on different training sets
What is the bias/variance tradeoff?
The simpler the model, the higher the bias. The more complex the model, the higher the variance
What is a symptom of underfitting a model?
High training error, high bias, training error close to test error
What is a symptom of overfitting a model?
Very low training error, high variance, training error significantly lower than test error
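The underfitting and overfitting symptoms can be turned into a rough rule of thumb that compares training and test error. The sketch below is only an illustration: the function name and both thresholds (`high_error`, `large_gap`) are hypothetical and would need to be chosen for the problem and error metric at hand.

```python
def diagnose_fit(
    train_error: float,
    test_error: float,
    high_error: float = 0.2,  # hypothetical cutoff for "high" training error
    large_gap: float = 0.1,   # hypothetical cutoff for a "significant" gap
) -> str:
    """Heuristic diagnosis from training and test error."""
    gap = test_error - train_error
    if train_error > high_error and gap <= large_gap:
        # High training error that is close to the test error.
        return "possible underfitting (high bias)"
    if train_error <= high_error and gap > large_gap:
        # Low training error that is well below the test error.
        return "possible overfitting (high variance)"
    return "inconclusive"


print(diagnose_fit(0.30, 0.32))  # possible underfitting (high bias)
print(diagnose_fit(0.02, 0.25))  # possible overfitting (high variance)
print(diagnose_fit(0.05, 0.07))  # inconclusive
```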