Wronged Questions: Statistical Learning Flashcards
Adding more predictors will never (increase/decrease) R^2
Decrease
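A quick numeric sketch of why: on the training data, least squares with an extra predictor can always do at least as well as without it (it can set the new coefficient to zero), so training R^2 never decreases. The data below are simulated purely for illustration, and the second predictor is deliberately pure noise.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)           # pure noise, unrelated to y
y = 2.0 * x1 + rng.normal(size=n)

def train_r2(X, y):
    # ordinary least squares with an intercept, R^2 on the training set
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

r2_one = train_r2(x1[:, None], y)
r2_two = train_r2(np.column_stack([x1, x2]), y)
# even a noise predictor cannot lower training R^2: r2_two >= r2_one
```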
T/F: The expected test MSE formula applies to both quantitative and qualitative responses
False. It only applies to quantitative responses
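For reference, the formula being referred to is presumably the usual bias-variance decomposition of the expected test MSE at a point $x_0$, which requires a quantitative response:

```latex
E\left[\left(y_0 - \hat{f}(x_0)\right)^2\right]
  = \mathrm{Var}\left(\hat{f}(x_0)\right)
  + \left[\mathrm{Bias}\left(\hat{f}(x_0)\right)\right]^2
  + \mathrm{Var}(\varepsilon)
```

The last term, $\mathrm{Var}(\varepsilon)$, is the irreducible error.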
T/F: In the classification setting, the bias-variance trade-off does not apply since y_i is qualitative.
False. The bias-variance trade-off does indeed apply in the classification setting, albeit with some modifications due to y_i being categorical rather than quantitative.
T/F: The training error rate is defined as the proportion of correct classifications made when applying our estimate
to the training observations.
False. The training error rate is defined as the proportion of incorrect classifications made when applying our estimate
to the training observations.
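The error-rate definition above is simple enough to state in code; this is a minimal sketch, with made-up labels chosen so that exactly one of four classifications is wrong.

```python
def error_rate(y_true, y_pred):
    # proportion of incorrect classifications: (1/n) * sum of I(y_i != yhat_i)
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = ["a", "b", "a", "a"]
y_pred = ["a", "b", "b", "a"]
print(error_rate(y_true, y_pred))  # one of four misclassified -> 0.25
```

Computed on the training observations this is the training error rate; computed on held-out observations it is the test error rate.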
T/F: A classifier’s effectiveness is determined by the magnitude of its training error rate rather than its test error rate.
False. A classifier’s effectiveness is determined by the magnitude of its test error rate rather than its training error rate.
T/F: The Bayes classifier is known to produce the highest possible test error rate, known as the Bayes error rate.
False. The Bayes classifier is known to produce the lowest possible test error rate, known as the Bayes error rate.
T/F: The Bayes error rate serves a role similar to that of the irreducible error in the classification setting.
True. The Bayes error rate is analogous to the irreducible error in classification, representing the lowest error rate that can be achieved by any classifier and is due to the noise in the data itself.
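In a toy setting where the conditional probabilities Pr(Y = 1 | X = x) are known exactly, both the Bayes classifier and the Bayes error rate can be computed directly. The probabilities below are hypothetical, chosen only to make the arithmetic easy.

```python
# Pr(Y = 1 | X = x) for each of four possible x values (hypothetical)
p_class1 = {0: 0.9, 1: 0.7, 2: 0.4, 3: 0.1}
# marginal distribution of X (hypothetical, uniform)
p_x = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}

def bayes_classify(x):
    # Bayes classifier: assign to class 1 iff Pr(Y = 1 | X = x) > 0.5
    return 1 if p_class1[x] > 0.5 else 0

# Bayes error rate: expected probability of the less likely class,
# the lowest test error any classifier can achieve here
bayes_error = sum(p_x[x] * min(p_class1[x], 1 - p_class1[x]) for x in p_x)
print(bayes_error)  # 0.25 * (0.1 + 0.3 + 0.4 + 0.1) = 0.225
```

No classifier trained on data drawn from this distribution can beat a 0.225 test error rate, which is exactly the role the irreducible error plays in regression.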
Less flexible, more interpretable
Lasso, subset selection
Moderately flexible and interpretable
Least squares, regression trees, classification trees
More flexible, less interpretable
Bagging, boosting
T/F: The K-Nearest Neighbors algorithm assumes a functional form for f
that includes free parameters.
False. KNN is non-parametric; it makes no assumption about the functional form of f.
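A bare-bones 1-D KNN classifier makes the non-parametric point concrete: the prediction comes straight from the stored training observations, with no assumed functional form (and no fitted coefficients) for f. Note that k is a tuning parameter, not a parameter of f. The training points below are made up for illustration.

```python
from collections import Counter

def knn_predict(x_new, train, k=3):
    # train is a list of (x, label) pairs; pick the k nearest by distance
    neighbors = sorted(train, key=lambda pt: abs(pt[0] - x_new))[:k]
    # majority vote among the k nearest neighbors
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

train = [(0.0, "A"), (0.5, "A"), (1.0, "A"),
         (5.0, "B"), (5.5, "B"), (6.0, "B")]
print(knn_predict(0.2, train))  # nearest three are all "A" -> "A"
print(knn_predict(5.7, train))  # nearest three are all "B" -> "B"
```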
T/F: Random forest makes no assumption about f's functional form.
True. Random forest is non-parametric.
T/F: Compared to non-parametric methods, parametric methods are more versatile in fitting various forms of f.
False. Non-parametric methods are more versatile because they make no assumptions about f’s form.
T/F: Compared to non-parametric methods, parametric methods typically need a larger number of observations to accurately estimate f.
False. Parametric methods typically need smaller sample sizes because they reduce the problem of estimating f to estimating a fixed set of parameters.
T/F: Compared to non-parametric methods, parametric methods are generally more difficult to interpret.
False. Parametric methods are generally easier to interpret because the assumed form of f gives them built-in structure.