Model Accuracy - pt1 Flashcards
Model Accuracy
To quantify model accuracy, we analyze the disparity between the predicted and actual values (Y^ and Y). The smaller the disparity, the better the accuracy.
Metric:
–may measure accuracy or model error
–can compute on training and test set
–the training metric is not suitable for gauging model accuracy/error; use the test metric
->Test metric allows us to select the model with the least total error.
–Usually, as flexibility increases, training accuracy increases/error decreases
->Greater flexibility makes it easier to narrow the difference between each pair of y_i and y^_i.
–Usually, as a function of flexibility, test accuracy is n-shaped & test error is U-shaped
->accuracy is worse when f^ is either too inflexible or too flexible.
->Generally, a moderately flexible model produces the lowest test error
–At the same flexibility level, training error is lower than test error
->f^ is optimized on the training data, so its predictive performance on test data is expected to be weaker.
–A very flexible f^ likely suffers from overfitting; an excellent training metric with a much worse test metric signals an overfit
For two models, one unrestricted and one restricted (think GLM vs regularized GLM):
–if the test performance of the more flexible model (the GLM) is worse than that of the less flexible model (the regularized GLM), the more flexible model is overfit to the training data (see the sketch below).
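A minimal sketch of this comparison in Python (assuming scikit-learn; the synthetic data and the specific model choices are illustrative, not from the source):

    # Sketch: compare training vs. test RMSE for a flexible and a restricted model.
    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LinearRegression, Ridge
    from sklearn.metrics import mean_squared_error

    X, y = make_regression(n_samples=200, n_features=20, noise=10.0, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

    for name, model in [("GLM", LinearRegression()),
                        ("regularized GLM", Ridge(alpha=1.0))]:
        model.fit(X_train, y_train)
        rmse_train = np.sqrt(mean_squared_error(y_train, model.predict(X_train)))
        rmse_test = np.sqrt(mean_squared_error(y_test, model.predict(X_test)))
        # An excellent training RMSE paired with a much worse test RMSE signals overfitting.
        print(name, rmse_train, rmse_test)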
Model Accuracy - Bias
Measures the average closeness between f^ and f. There is a large bias when the mean of f^ is very different from f; there is no bias when they are equal.
Bias can be described as the expected error that arises from f^ not being flexible or complex enough to capture the signal in the data.
Once a sufficient level of flexibility is achieved, adding more flexibility won't lower bias as much.
With high bias (underfitting), the model will perform poorly on both the training set and the test set.
Model Accuracy - Variance
Measures the variability in the shape of f^ when different training data is used.
Variance can be described as the expected error that arises from f^ being too flexible or complex, overfitting to the training data.
Generally, methods with higher flexibility also have higher variance
With high variance (overfitting), the model will perform better on the training set than on a test set.
Model Accuracy - Bias-Variance Trade-off
We typically decompose the expected loss into (squared) bias, variance, and unavoidable error. When building models we try to minimize this expected loss, and doing so often requires balancing bias against variance: models with low bias tend to have higher variance, and vice versa.
RMSE can be expressed as sqrt( Var(f^) + [Bias(f^)]^2 + Var(e) )
–the last term, Var(e), is the irreducible error that will not change as f^ changes (see the simulation sketch at the end of this card)
We desire an f^ with both low variance and low bias
The bias-variance tradeoff is the tradeoff between building a more complex model that detects more patterns in the data and building a less complex model that generalizes better to unseen data. The most accurate method is unlikely to have both the lowest variance and the lowest bias, even though both are desirable; the trade-off between them must be considered.
When evaluating a single model, a test set helps detect high variance: a gap between training and test performance reveals it. When comparing models of different complexity, selecting the model with the best test-set performance helps us choose the design with the least total error.
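A simulation sketch of this decomposition (the sine signal, noise level, and polynomial fits are illustrative assumptions, not from the source): refit f^ on many training sets and estimate Var(f^) and [Bias(f^)]^2 at a single test point.

    # Sketch: estimate bias^2 and variance of f^ at one point x0 by refitting
    # polynomial models (low to high flexibility) on many simulated training sets.
    import numpy as np

    rng = np.random.default_rng(0)

    def f(x):
        return np.sin(2 * x)         # true signal f (illustrative)

    x0, sigma = 0.5, 0.3             # test point; Var(e) = sigma**2

    for degree in (1, 3, 9):         # increasing flexibility
        preds = []
        for _ in range(500):         # many independent training sets
            x = rng.uniform(0, 1, 30)
            y = f(x) + rng.normal(0, sigma, 30)
            coefs = np.polyfit(x, y, degree)    # fit f^ to this training set
            preds.append(np.polyval(coefs, x0))
        preds = np.array(preds)
        bias2 = (preds.mean() - f(x0)) ** 2     # [Bias(f^(x0))]^2
        var = preds.var()                       # Var(f^(x0))
        # expected test MSE at x0 is approximately bias2 + var + sigma**2
        print(degree, bias2, var, bias2 + var + sigma**2)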
Model Accuracy - Regression
For regression, f is the mean of the target, i.e., E[Y]. Let Y^ denote the prediction from f^.
Root Mean Squared Error (RMSE) is a common choice to measure accuracy for regression problems: RMSE = sqrt( Σ(y_i - y^_i)^2 / n )
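For example, with made-up actual and predicted values:

    # Sketch: RMSE = sqrt( Σ(y_i - y^_i)^2 / n )
    import numpy as np

    y = np.array([3.0, 5.0, 2.5, 7.0])      # actual y_i (illustrative)
    y_hat = np.array([2.8, 5.4, 2.9, 6.1])  # predicted y^_i (illustrative)
    rmse = np.sqrt(np.mean((y - y_hat) ** 2))
    print(rmse)  # ~0.54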
Model Accuracy - Classification problems
Binary target: ‘success’ or ‘failure’; ‘1’ or ‘0’. By convention, 1 (success) is the positive class and 0 (failure) is the negative class.
4 groups - can use confusion matrix
Confusion matrix (rows = prediction, columns = actual):

               Actual = 0           Actual = 1
    Pred = 0   TN (y=0 & y^=0)      FN (y=1 & y^=0)
    Pred = 1   FP (y=0 & y^=1)      TP (y=1 & y^=1)

TN + FN + FP + TP = Total = N
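A sketch of these four counts in Python (the 0/1 vectors are made up; real ones would come from a fitted classifier at a chosen cutoff):

    # Sketch: the four confusion-matrix counts from 0/1 actuals and predictions.
    import numpy as np

    y = np.array([1, 0, 1, 1, 0, 0, 1, 0])       # actual (illustrative)
    y_hat = np.array([1, 0, 0, 1, 0, 1, 1, 0])   # predicted at a chosen cutoff
    TP = np.sum((y == 1) & (y_hat == 1))
    TN = np.sum((y == 0) & (y_hat == 0))
    FP = np.sum((y == 0) & (y_hat == 1))
    FN = np.sum((y == 1) & (y_hat == 0))
    print(TP, TN, FP, FN)   # TP + TN + FP + FN == N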
Model Accuracy - Classification - Metrics
Classification error rate = # wrong preds / total = (FP + FN) / N
Accuracy = # correct preds / total = (TP + TN) / N
–These two can misjudge prediction quality when the mix of positive and negative observations in the dataset is imbalanced
–These are measured at a single cutoff value
Sensitivity (TPR) = correct positive preds / # actual positives = TP / (TP + FN)
Specificity (TNR) = correct negative preds / # actual negatives = TN / (TN + FP)
Sensitivity and specificity can be combined into an accuracy measure called AUC, used for classification problems -> Want high AUC
False Positive Rate (FPR) = incorrect positive preds / # actual negatives = FP / (TN + FP)
Precision = correct positive preds / # predicted positives = TP / (TP + FP)
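Continuing the confusion-matrix sketch above, these metrics follow directly from the four counts (the values shown match that sketch):

    # Sketch: metrics from confusion-matrix counts.
    TP, TN, FP, FN = 3, 3, 1, 1           # counts from the sketch above
    N = TP + TN + FP + FN
    error_rate = (FP + FN) / N            # classification error rate
    accuracy = (TP + TN) / N
    sensitivity = TP / (TP + FN)          # true positive rate
    specificity = TN / (TN + FP)          # true negative rate
    fpr = FP / (TN + FP)                  # false positive rate = 1 - specificity
    precision = TP / (TP + FP)
    print(error_rate, accuracy, sensitivity, specificity, fpr, precision)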
AUC
Changing the cutoff results in different numbers of predicted positives and negatives. It also results in changing sensitivity and specificity.
The receiver operating characteristic (ROC) curve plots sensitivity (TPR) against 1 - specificity (FPR) across all the cutoff values.
AUC is the area under the ROC curve. A higher AUC implies there are more cutoffs that produce higher sensitivity and/or specificity, which is preferable.
–Want higher AUC when comparing models.
–AUC measures performance across the full range of cutoff thresholds while accuracy measures performance only at the selected threshold.
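A sketch of the ROC/AUC computation (assuming scikit-learn; the labels and model scores are illustrative placeholders):

    # Sketch: ROC curve and AUC from predicted probabilities.
    import numpy as np
    from sklearn.metrics import roc_curve, roc_auc_score

    y = np.array([1, 0, 1, 1, 0, 0, 1, 0])                  # actual labels
    p = np.array([0.9, 0.2, 0.4, 0.8, 0.3, 0.6, 0.7, 0.1])  # model scores
    fpr, tpr, cutoffs = roc_curve(y, p)   # sensitivity (tpr) vs 1 - specificity (fpr)
    print(roc_auc_score(y, p))            # area under the ROC curve (0.9375 here)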