Chapter 6 - Model Selection Flashcards

1
Q

Define model selection

A

In linear regression, training error (RSS) always decreases as p increases, so it cannot be used to choose among models, and when n < p there is no unique least squares solution. We therefore need a way to select a smaller set of predictors.
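
A minimal sketch of the first point, on synthetic data (all names and values here are illustrative, not from the chapter): training RSS keeps falling even as pure-noise predictors enter the model.

import numpy as np

# synthetic example: only the first predictor truly matters
rng = np.random.default_rng(0)
n, p = 50, 10
X = rng.normal(size=(n, p))
y = X[:, 0] + rng.normal(size=n)

for k in range(1, p + 1):
    Xk = X[:, :k]
    beta, *_ = np.linalg.lstsq(Xk, y, rcond=None)
    rss = ((y - Xk @ beta) ** 2).sum()
    print(f"k={k:2d}  training RSS={rss:.2f}")  # non-increasing in k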

2
Q

Best Subset Selection

A

overview: compare all models with exactly k predictors (there are p choose k of these) and keep the one with the smallest RSS; do this for every possible k; then select the optimal k (# of predictors) by minimizing CV error. A sketch follows below.
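
A minimal sketch, assuming numpy arrays X (n x p) and y; the names best_subset and rss are hypothetical, and the final choice of k (by CV error or one of the criteria on the next card) is omitted.

import numpy as np
from itertools import combinations

def rss(X, y, cols):
    # training RSS of the least squares fit on the chosen columns
    Xs = X[:, list(cols)]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    return ((y - Xs @ beta) ** 2).sum()

def best_subset(X, y):
    p = X.shape[1]
    best = {}  # k -> (RSS, indices of the best k-predictor model)
    for k in range(1, p + 1):
        # examine all (p choose k) candidate models of size k
        winner = min(combinations(range(p), k), key=lambda c: rss(X, y, c))
        best[k] = (rss(X, y, winner), winner)
    return best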

3
Q

Alternatives to minimizing CV error (3)

A

minimize: 1) Akaike Information Criterion (AIC) 2) Bayesian Information Criterion (BIC); both penalize models with extra predictors. maximize: 3) Adjusted R^2 = 1 - (RSS/(n-k-1)) / (TSS/(n-1)), where k is the number of predictors.
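
A hedged sketch of these criteria for a least squares fit with k predictors, in the forms ISLR uses (constant factors differ across textbooks); sigma2 is an estimate of the error variance, e.g. from the full model, and the function name is hypothetical.

import numpy as np

def selection_criteria(rss, tss, n, k, sigma2):
    aic    = (rss + 2 * k * sigma2) / (n * sigma2)           # minimize
    bic    = (rss + np.log(n) * k * sigma2) / (n * sigma2)   # minimize; heavier penalty than AIC once n >= 8
    adj_r2 = 1 - (rss / (n - k - 1)) / (tss / (n - 1))       # maximize
    return aic, bic, adj_r2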

4
Q

How do the alternatives to computing CV error compare? (3)

A

1) they are much less computationally expensive than CV 2) they are motivated by asymptotic arguments and rely on model assumptions (e.g., normality of errors) 3) equivalent concepts exist for other models (e.g., logistic regression)

5
Q

2 problems with best subset selection, and how to mitigate them

A

1) it is very computationally expensive (2^p models must be fit in total)
2) for a fixed k there are very many candidate models, which increases our chance of overfitting; i.e., the selected model has high variance (it changes a lot between training sets)

Solution: restrict the search space for the best model (this reduces model variance at the expense of higher bias)

6
Q

Forward Stepwise Selection

A

1) start with the null model (no predictors)
2) at each step, augment the current set of predictors with the one additional predictor that gives the best fit (lowest RSS / highest R^2); repeat until all p predictors are in the model
3) from the resulting sequence of models, select a single best model using CV error, Cp, AIC, BIC, or adjusted R^2
note: the results are not necessarily the same as best subset! (a sketch follows below)
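
A minimal forward stepwise sketch under the same assumptions as the best subset code above (numpy arrays X and y, a hypothetical rss helper); the final model size would again be chosen by CV or a criterion, which is omitted.

import numpy as np

def rss(X, y, cols):
    Xs = X[:, list(cols)]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    return ((y - Xs @ beta) ** 2).sum()

def forward_stepwise(X, y):
    p = X.shape[1]
    selected, remaining, path = [], set(range(p)), []
    while remaining:
        # greedy step: add the predictor giving the largest drop in RSS
        best = min(remaining, key=lambda j: rss(X, y, selected + [j]))
        selected.append(best)
        remaining.remove(best)
        path.append((list(selected), rss(X, y, selected)))
    return path  # one candidate model per size k = 1..p; only O(p^2) fits, vs 2^p for best subset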

7
Q

Backward Stepwise Selection

A

1) start with the full model {x1, ..., xp}
2) at each step, remove the one predictor whose removal causes the smallest increase in RSS (equivalently, the predictor that is least significant in a t-test); repeat down to the null model, then select a single best model as in forward selection (a sketch follows below)
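
The backward counterpart of the forward sketch above (same assumptions; note it needs n > p so that the full model can be fit):

import numpy as np

def rss(X, y, cols):
    Xs = X[:, list(cols)]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    return ((y - Xs @ beta) ** 2).sum()

def backward_stepwise(X, y):
    selected = list(range(X.shape[1]))  # start from the full model
    path = [(list(selected), rss(X, y, selected))]
    while len(selected) > 1:
        # drop the predictor whose removal increases RSS the least
        drop = min(selected, key=lambda j: rss(X, y, [c for c in selected if c != j]))
        selected.remove(drop)
        path.append((list(selected), rss(X, y, selected)))
    return path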

8
Q

Comparison of Forward and Backward Stepwise Selection

A

you cannot apply backward selection when p > n (the full model cannot then be fit by least squares), while forward selection still works; also, the two methods do not have to give the same result

9
Q

Alternative Selection Methods

A

mixed selection - do forward selection, but at every step also remove any variables that are no longer necessary

forward stagewise selection - modify/transform the predictors after each step (the span of X matters); this decreases the variance of the procedure but increases its bias
