Week 12 Flashcards

Question

Answer 1

A

dimension reduction – i.e., we transform the original X’s and work with M transformed variables, where M < p.

unsupervised learning method that is used to summarize a large set of correlated variables.

Answer 2

A

linear weights

Answer 3

A

inputs for supervised learning methods or for data visualization.

extract variables corresponding to the directions along which the data vary the most, by choosing the loadings φ_jm.

Answer 4

A

extract variables corresponding to the directions along which the data vary the most.

Answer 5

A

at most min(p, n-1) distinct, mutually uncorrelated principal components
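A quick numerical check of the min(p, n-1) count, as a sketch only: the helper name `count_components`, the simulated data, and the tolerance are assumptions made for illustration. Centering the columns removes one degree of freedom, so with n = 5 observations and p = 10 variables only 4 components have nonzero variance.

```python
import numpy as np

def count_components(X, tol=1e-10):
    """Count principal components with non-negligible variance for X (n x p)."""
    Xc = X - X.mean(axis=0)                    # centering costs one degree of freedom
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values, largest first
    return int(np.sum(s > tol * s[0]))         # components with nonzero variance

rng = np.random.default_rng(4)                 # simulated data, illustration only
n, p = 5, 10
X = rng.normal(size=(n, p))
n_components = count_components(X)
```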

Answer 6

A

linear combination of all the x’s, with linear weights called loadings
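The idea of a component as a weighted sum of all the x’s can be sketched with NumPy. This is a minimal illustration, not a reference implementation: the function name `pca_svd` and the simulated data are assumptions, and the SVD of the centered data is one common way (among others) to obtain the loadings.

```python
import numpy as np

def pca_svd(X):
    """Return (scores Z, loadings Phi, component variances) for X (n x p)."""
    Xc = X - X.mean(axis=0)                   # center each column
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    Phi = Vt.T                                # column m is the loading vector phi_m
    Z = Xc @ Phi                              # z_im = sum_j phi_jm * x_ij
    var = s ** 2 / (X.shape[0] - 1)           # sample variance of each component
    return Z, Phi, var

rng = np.random.default_rng(0)                # simulated correlated data
mix = np.array([[2.0, 1.0, 0.0],
                [0.0, 1.0, 0.5],
                [0.0, 0.0, 0.2]])
X = rng.normal(size=(50, 3)) @ mix
Z, Phi, var = pca_svd(X)
```

Each column of `Z` uses every column of `X`; the loading vectors have unit length, and the sign of each one is arbitrary.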

Answer 7

A

principal component
loadings vector
sign

Answer 8

A

share of the total variance of X explained. This share equals each component’s variance divided by the sum of variances of all PC’s.
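The share can be computed directly from the eigenvalues of the sample covariance matrix, which are the component variances. A minimal sketch, where the function name `proportion_variance_explained` and the simulated data are assumptions for illustration:

```python
import numpy as np

def proportion_variance_explained(X):
    """Share of total variance explained by each PC of X (n x p), largest first."""
    cov = np.cov(X, rowvar=False)              # p x p sample covariance matrix
    eigvals = np.linalg.eigvalsh(cov)[::-1]    # component variances, descending
    eigvals = np.clip(eigvals, 0.0, None)      # guard against tiny negative round-off
    return eigvals / eigvals.sum()             # each variance over the sum of all

rng = np.random.default_rng(1)                 # simulated data, illustration only
X = rng.normal(size=(200, 4)) * np.array([5.0, 2.0, 1.0, 0.5])
pve = proportion_variance_explained(X)
cumulative = np.cumsum(pve)                    # running total of variance explained
```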

Answer 9

A

will obtain all the p components

Answer 10

A

PERPENDICULAR

Answer 11

A

Scaling up one variable by a constant would blow up its variance and change its loading.

We have to standardize the data unless all variables are measured in the same units.
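A small experiment makes the scaling issue concrete. This is a sketch under assumed, simulated data; the helper `first_loading` and the factor of 1000 are illustrative choices. Rescaling one variable lets it dominate the first loading vector, while standardizing first restores balanced weights.

```python
import numpy as np

def first_loading(X):
    """First principal component loading vector of X (n x p)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[0]

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 2))                  # two variables of comparable variance
X_rescaled = X * np.array([1000.0, 1.0])       # first variable in much smaller units

# Without standardizing, the rescaled variable dominates the first component.
raw = first_loading(X_rescaled)

# Standardizing (mean 0, standard deviation 1) removes the dependence on units.
X_std = (X_rescaled - X_rescaled.mean(axis=0)) / X_rescaled.std(axis=0, ddof=1)
std = first_loading(X_std)
```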

Answer 12

A

highly correlated data (correlation coefficient r above 0.5)

when the variables are continuous (categorical variables cannot be used; otherwise CATPCA is needed)

Answer 13

A

value of z variable

Answer 14

A

left and bottom axes: component scores (z_m)
top and right axes: loadings

Answer 15

A

proportion of total variance of X explained by each subsequent component

Look for an “elbow” in the plot, where the contribution to variance drops sharply and then flattens. Keep retaining components until the elbow appears in the plot.

Answer 16

A

Highly correlated variables provide redundant information about one another. It should be possible to extract a set of independent factors that explain the bulk of the variation over time.

Answer 17

A

level: loadings are similar across all maturities

slope: loadings slope downward across maturities, all the way past 0 (one sign change)

curvature: loadings flip sign twice (U-shaped graph); induces a butterfly-like movement in the yield curve

Answer 18

A

Constraints on β’s may incur bias. However, dimensionality reduction has the potential to greatly reduce variance.

Answer 19

A

we can run the principal components regression.

Answer 20

A

Run PCA on the predictor matrix X.

Construct the sample principal components Z.

Keep only the first M < p principal components and discard the rest.

Regress y on z1, …, zM .
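The steps above can be sketched in a few lines of NumPy. This is an illustrative sketch, not a definitive implementation: the names `pcr_fit` and `pcr_predict` and the simulated data are assumptions, and the predictors are only centered here, which presumes they share the same units (otherwise standardize first).

```python
import numpy as np

def pcr_fit(X, y, M):
    """Principal components regression keeping the first M components."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean                                    # center the predictors
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)  # PCA on X
    Phi = Vt[:M].T                                     # p x M loadings of the first M PCs
    Z = Xc @ Phi                                       # n x M component scores z1, ..., zM
    theta, *_ = np.linalg.lstsq(Z, y - y_mean, rcond=None)  # regress y on the scores
    beta = Phi @ theta                                 # implied coefficients on centered X
    return x_mean, y_mean, beta

def pcr_predict(model, X_new):
    x_mean, y_mean, beta = model
    return y_mean + (X_new - x_mean) @ beta

rng = np.random.default_rng(5)                         # simulated data, illustration only
X = rng.normal(size=(100, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.3, size=100)
model_2 = pcr_fit(X, y, M=2)                           # reduced model
model_full = pcr_fit(X, y, M=4)                        # M = p reproduces least squares
```

With M = p nothing is discarded, so the fitted values coincide with ordinary least squares; the bias-variance trade-off only appears for M < p.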

Answer 21

A

For unsupervised PCA, there is no clear rule: some recommend plotting the proportion of variance explained on a scree plot and retaining PCs up to the point where the plot levels off (the “elbow”).

For PCR, we can simply use K-fold cross-validation.
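Choosing M for PCR by K-fold cross-validation might look like the sketch below. The function names, K = 5, the random seeds, and the simulated data are all assumptions for illustration; the PCA is refit inside each training fold so the held-out fold never influences the loadings.

```python
import numpy as np

def pcr_predict(X_train, y_train, X_test, M):
    """Fit PCR with M components on the training data, then predict for X_test."""
    x_mean, y_mean = X_train.mean(axis=0), y_train.mean()
    Xc = X_train - x_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Phi = Vt[:M].T                                     # loadings from training data only
    theta, *_ = np.linalg.lstsq(Xc @ Phi, y_train - y_mean, rcond=None)
    return y_mean + (X_test - x_mean) @ Phi @ theta

def cv_mse(X, y, M, K=5, seed=0):
    """Average held-out mean squared error over K folds."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, K)
    errors = []
    for k in range(K):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(K) if j != k])
        pred = pcr_predict(X[train], y[train], X[test], M)
        errors.append(np.mean((y[test] - pred) ** 2))
    return float(np.mean(errors))

rng = np.random.default_rng(3)                         # simulated data, illustration only
n, p = 120, 6
X = rng.normal(size=(n, p)) * np.array([4.0, 3.0, 2.0, 1.0, 0.5, 0.25])
y = X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

scores = {M: cv_mse(X, y, M) for M in range(1, p + 1)}  # CV error for each M
best_M = min(scores, key=scores.get)                    # M with the lowest CV error
```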

Answer 22

A

directions of highest variability in X are those most associated with y

Answer 23

A

useful signal may “hide” in the low variance component that would be discarded.

Answer 24

A

No.

All are used to form the PCs.

(25 cards)