Chapter 6 - Dimensionality Reduction Flashcards

1
Q

Dimensionality Reduction - idea, types

A

define a small set of M predictors (M < p) that summarize the information in all p predictors. Types: principal components regression (PCR), partial least squares (PLS)
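
One way to write the idea out (a sketch in ISLR-style notation; the symbols Z_m, phi_jm, theta_m match the cards below): form M < p linear combinations of the original predictors, then fit by least squares.

```latex
% Dimension reduction in two steps: build M < p new predictors Z_m as
% linear combinations of X_1, ..., X_p, then regress y on the Z_m.
Z_m = \sum_{j=1}^{p} \phi_{jm} X_j, \quad m = 1, \dots, M,
\qquad
y_i = \theta_0 + \sum_{m=1}^{M} \theta_m z_{im} + \epsilon_i .
```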

2
Q

Principal Components Regression - algorithm and 3 facts

A

1) reduce the original predictors to the first M principal component score vectors Z_1, ..., Z_M
2) fit the linear model y = theta_0 + sum_m[theta_m * Z_m] and perform least squares regression to obtain the theta_m

beta_j then has the form sum_m[theta_m * phi_jm]

this constraint on beta_j introduces bias but can reduce the variance

The coefficients shrink as we decrease M (due to similarities between PCR and Ridge)
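
A minimal sketch of those two steps with scikit-learn (standardize, keep the first M principal-component score vectors, then least squares); the simulated data and M = 3 are illustrative assumptions, not part of the card.

```python
# PCR sketch: PCA scores Z_1..Z_M followed by ordinary least squares.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))                 # n = 100, p = 10 (made up)
y = X[:, 0] - 2 * X[:, 1] + rng.normal(size=100)

M = 3                                          # number of components kept
pcr = make_pipeline(StandardScaler(), PCA(n_components=M), LinearRegression())
pcr.fit(X, y)                                  # regress y on the M score vectors
```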

3
Q

Relationship between PCR and Ridge

A

-write out the fitted forms of least squares, ridge, and PCR (see the sketch below)-
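
A sketch of one standard way to write these out, assuming standardized predictors and the SVD X = U D V^T with singular values d_j (the principal-component basis; this is an assumed framing, not the card's exact wording):

```latex
% Fitted values of least squares, ridge, and PCR in the principal-component basis.
\begin{aligned}
\hat{y}^{\,\mathrm{ls}}    &= \sum_{j=1}^{p} u_j\, u_j^{\top} y
  && \text{(no shrinkage)}\\
\hat{y}^{\,\mathrm{ridge}} &= \sum_{j=1}^{p} u_j\, \frac{d_j^{2}}{d_j^{2}+\lambda}\, u_j^{\top} y
  && \text{(every component shrunk smoothly)}\\
\hat{y}^{\,\mathrm{pcr}}   &= \sum_{j=1}^{M} u_j\, u_j^{\top} y
  && \text{(first $M$ components kept, the rest dropped)}
\end{aligned}
```

So ridge shrinks the j-th principal-component direction by d_j^2 / (d_j^2 + lambda), while PCR is the discrete version: weight 1 on the first M components and 0 on the rest, which is why the two methods behave similarly.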

4
Q

3 Simulated Examples - all predictors, 2 predictors, 5 predictors

A

all predictors related to the response - PCR doesn’t do so well
only 2 predictors related - does moderately OK
5 - bias appears only when fewer than 5 components are used (M < 5); PCR and ridge do better than the lasso

5
Q

Partial Least Squares Regression - algorithm

A

in contrast with PCR, we use Y when creating the Z_m

1) Z_1 = sum_j[phi_j1 * X_j] where phi_j1 is the coefficient from the simple regression of Y onto X_j
2) X_j(2) is the residual of regressing X_j onto Z_1
3) Z_2 = sum_j[phi_j2 * X_j(2)] where phi_j2 is the coefficient from the simple regression of Y onto X_j(2)
4) X_j(3) is the residual of regressing X_j(2) onto Z_2
….

stop at Z_M where M < p, then choose the value of M through CV
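
A minimal NumPy sketch of this recipe (assuming X is standardized and y is centered; this follows the card's steps, not any particular library's implementation):

```python
import numpy as np

def pls_scores(X, y, M):
    """Build the first M PLS score vectors Z_1..Z_M as described above."""
    X_r = X.copy()                                  # X_j^(m): current residualized predictors
    Z = np.empty((X.shape[0], M))
    for m in range(M):
        # phi_jm = coefficient from the simple regression of y onto X_j^(m)
        phi = (X_r.T @ y) / np.sum(X_r ** 2, axis=0)
        z = X_r @ phi                               # Z_(m+1) = sum_j phi_jm * X_j^(m)
        Z[:, m] = z
        # X_j^(m+1) = residual of regressing X_j^(m) onto Z_(m+1)
        X_r = X_r - np.outer(z, (z @ X_r) / (z @ z))
    return Z
```

Then regress y onto the returned score vectors by ordinary least squares, choosing M by cross-validation.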

6
Q

Partial Least Squares Regression - theory (3)

A

1) at each step we find the linear combination of the current predictors most correlated with the response
2) after each step, we residualize the predictors so that they are orthogonal to the score vectors already constructed
3) compared to PCR it has less bias but more variance

• how would you do CV for this method? (similar to the right vs. wrong way to do CV; see the sketch below)
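
Because the Z_m are built using Y, the directions must be re-derived inside each fold (the "right way" to do CV). A hedged scikit-learn sketch (the data and the range of M values are made up); GridSearchCV refits PLSRegression on the training part of every fold, so the supervised transformation stays inside the loop:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 10))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=80)

# Each candidate M is scored by 5-fold CV, with the PLS directions
# recomputed from scratch on every training fold.
search = GridSearchCV(PLSRegression(), {"n_components": list(range(1, 6))}, cv=5)
search.fit(X, y)
print(search.best_params_)       # the chosen number of directions M
```
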
7
Q

2 Problems with High Dimension Data

A

p >> n is now very common, so we know least squares won’t work here; we can use regularization methods instead.

when p is at least as large as n, we can find a fit that goes through every training point, so training-error-based methods are misleading and it becomes difficult to estimate the noise variance; measures of model fit such as Cp, AIC, and BIC fail.
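
A tiny illustration of the first point (made-up data, not from the card): with n = p, least squares interpolates pure noise, so training error is near zero even though the predictors carry no signal.

```python
import numpy as np

rng = np.random.default_rng(1)
n = p = 20
X = rng.normal(size=(n, p))          # predictors: pure noise
y = rng.normal(size=n)               # response unrelated to X

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.sum((y - X @ beta) ** 2))   # ~0: the fit passes through every point
```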

8
Q

2 Takeaways from working with high dimensional data

A

1) adding predictors that are uncorrelated with the response can hurt the performance of regression (test error)
2) when p > n there is multicollinearity; many different subsets of predictors will produce good results, so don’t overstate the importance of any one subset of predictors
