Chapter 6 - Model Selection Flashcards
Define model selection
In linear regression, training error (RSS) always decreases as p increases, and when n < p there is no unique least squares solution, so we need a way to select a smaller set of predictors
Best Subset Selection
overview: compare all models with k predictors (there are p choose k of these) and choose the one with the smallest RSS. Do this for every possible k, then select the optimal k (# of predictors) by minimizing CV error
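The procedure above can be sketched in plain NumPy. This is a hypothetical illustration (the function name `best_subset` and the no-intercept least squares fit are my choices, not from the source): for each size k it scans every subset, keeps the one with lowest RSS, and returns the per-size winners, leaving the choice of k to CV or a criterion.

```python
from itertools import combinations
import numpy as np

def best_subset(X, y):
    """Return {k: (best_subset_of_column_indices, RSS)} for k = 1..p."""
    n, p = X.shape
    best = {}
    for k in range(1, p + 1):
        best_rss, best_vars = np.inf, None
        for subset in combinations(range(p), k):
            Xs = X[:, subset]
            # least squares fit restricted to this subset of predictors
            beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
            rss = np.sum((y - Xs @ beta) ** 2)
            if rss < best_rss:
                best_rss, best_vars = rss, subset
        best[k] = (best_vars, best_rss)
    return best
```

Note the cost: the inner loop runs 2^p - 1 fits in total, which is exactly why the section below looks for cheaper alternatives.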
Alternatives to minimizing CV error (3)
minimize: 1) Akaike Information Criterion (AIC) 2) Bayesian Information Criterion (BIC) — both penalize models with extra predictors; maximize: 3) Adjusted R^2 = 1 - (RSS/(n-k-1))/(TSS/(n-1))
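The adjusted R^2 formula above translates directly to code. A minimal sketch (the function name `adjusted_r2` is my own; k counts predictors excluding the intercept, matching the n-k-1 denominator):

```python
import numpy as np

def adjusted_r2(y, y_hat, k):
    """Adjusted R^2 = 1 - (RSS/(n-k-1)) / (TSS/(n-1)),
    where k is the number of predictors (intercept excluded)."""
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)
    tss = np.sum((y - np.mean(y)) ** 2)
    return 1 - (rss / (n - k - 1)) / (tss / (n - 1))
```

Unlike plain R^2, this can decrease when a useless predictor is added, because the RSS drop is outweighed by the smaller n-k-1 denominator.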
How do the alternatives to computing CV error compare? (3)
1) much less computationally expensive than CV 2) motivated by asymptotic arguments and reliant on model assumptions (e.g., normality of errors) 3) have equivalent versions for other models (e.g., logistic regression)
2 problems with best subset selection, and how to mitigate them
1) very computationally expensive
2) for a fixed k there are very many candidate models, so the chance of overfitting increases (i.e., the selected model has high variance — it changes a lot between training sets)
Solution: restrict our search space for the best model (reduces model variance at the expense of higher bias)
Forward Stepwise Selection
1) start with the null model (no predictors) 2) at each step, augment the current predictor set with the single additional predictor that is best (minimizes RSS / highest R^2) 3) among the resulting models of each size, select a single best model using CV error, Cp, AIC, BIC, or adjusted R^2
note: results not the same as best subset!
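The three steps above can be sketched as a greedy loop. A hypothetical NumPy implementation (names and the no-intercept fit are my assumptions): it fits only p + (p-1) + … + 1 models rather than 2^p, which is the computational win over best subset.

```python
import numpy as np

def forward_stepwise(X, y):
    """Greedy path: at each step, add the predictor that most reduces RSS.
    Returns [(selected_columns, RSS), ...], one entry per model size."""
    n, p = X.shape
    selected, remaining, path = [], list(range(p)), []
    for _ in range(p):
        best_rss, best_j = np.inf, None
        for j in remaining:
            cols = selected + [j]
            beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
            rss = np.sum((y - X[:, cols] @ beta) ** 2)
            if rss < best_rss:
                best_rss, best_j = rss, j
        selected.append(best_j)
        remaining.remove(best_j)
        path.append((tuple(selected), best_rss))
    # a size along this path would then be chosen by CV, Cp, AIC, BIC,
    # or adjusted R^2 (step 3 of the card above)
    return path
```

Because each step only extends the previous set, the path is nested, which is why its results can differ from best subset selection.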
Backward Stepwise Selection
1) start with full model {x1, …., xP}
2) at each step remove the one predictor whose removal causes the smallest increase in RSS (equivalently, remove the predictor that is least significant in a t-test)
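The backward pass mirrors the forward one. A minimal sketch under the same assumptions (NumPy, no intercept, names mine); note it needs the full model to be fittable, which is the p < n restriction discussed below:

```python
import numpy as np

def backward_stepwise(X, y):
    """Greedy path: at each step, drop the predictor whose removal
    increases RSS the least. Requires n > p for the initial full fit."""
    n, p = X.shape
    selected, path = list(range(p)), []
    while len(selected) > 1:
        best_rss, drop_j = np.inf, None
        for j in selected:
            cols = [c for c in selected if c != j]
            beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
            rss = np.sum((y - X[:, cols] @ beta) ** 2)
            if rss < best_rss:  # smallest RSS after removal = least harmful drop
                best_rss, drop_j = rss, j
        selected.remove(drop_j)
        path.append((tuple(selected), best_rss))
    return path
```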
Comparison of Forward and Backward Stepwise Selection
backward selection cannot be applied when p > n (the full model cannot be fit by least squares), and the two procedures do not have to give the same result
Alternative Selection Methods
mixed - do forward selection, but at every step, remove variables that are no longer necessary
forward stagewise selection - modify/transform predictors after each step (the span of X matters). decreases the variance of the procedure but increases bias