Week 12 Flashcards
What is principal component analysis? What kind of data is it used on?
dimension reduction – i.e., we transform the original X’s and work with M transformed variables, where M < P.
unsupervised learning method that is used to summarize a large set of correlated variables.
What are loadings?
the linear weights (φ_jm) applied to the original x's to form each principal component
What can PCA be used for?
inputs for supervised learning methods or for data visualization.
extract variables corresponding to the directions along which the data vary the most, by choosing the loadings φ_jm
What is the primary purpose of PCA?
extract variables corresponding to the directions along which the data vary the most.
How many components can we decompose X into?
min(p, n-1) uncorrelated (mutually orthogonal) principal components
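A quick check of the min(p, n-1) count, as a sketch only: it assumes NumPy and a made-up random matrix with more variables than observations (n = 5, p = 10). Centering uses up one degree of freedom, so only n-1 singular values are non-zero.

```python
import numpy as np

rng = np.random.default_rng(1)        # toy data, assumed for illustration
n, p = 5, 10                          # fewer observations than variables
X = rng.normal(size=(n, p))
Xc = X - X.mean(axis=0)               # center each column

# Singular values of the centered data; a (numerically) zero value
# means that direction carries no variance.
s = np.linalg.svd(Xc, compute_uv=False)
n_components = int(np.sum(s > 1e-10 * s[0]))
print(n_components, min(p, n - 1))
```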
How are the principal components obtained?
each is a linear combination of all the x's, with linear weights called loadings
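A minimal sketch of "linear combination with loadings", assuming NumPy and toy random data. It takes the loadings from the SVD of the centered data, which is one common way to compute PCA (the notes do not say which method the course uses).

```python
import numpy as np

rng = np.random.default_rng(0)        # toy data, assumed for illustration
n, p = 50, 3
mix = np.array([[1.0, 0.8, 0.5],
                [0.0, 0.6, 0.4],
                [0.0, 0.0, 0.3]])
X = rng.normal(size=(n, p)) @ mix     # correlated columns
Xc = X - X.mean(axis=0)               # center each column

# SVD of the centered data: the rows of Vt are the loading vectors.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
loadings = Vt.T                       # column m holds phi_1m, ..., phi_pm
scores = Xc @ loadings                # z_im = sum_j phi_jm * x_ij

# First component score of the first observation, written out by hand.
z_11 = sum(loadings[j, 0] * Xc[0, j] for j in range(p))
print(z_11, scores[0, 0])
```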
The _____________ and __________ are unique up to ______
principal component
loadings vector
sign
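Sketch of "unique up to sign", assuming NumPy and toy random data: flipping the sign of a loading vector flips the component, but the variance explained and the fitted piece of X stay the same.

```python
import numpy as np

rng = np.random.default_rng(2)        # toy data, assumed for illustration
X = rng.normal(size=(30, 4))
Xc = X - X.mean(axis=0)

_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
phi1 = Vt[0]                          # first loading vector
z1 = Xc @ phi1                        # first principal component

phi1_flipped = -phi1                  # equally valid loading vector
z1_flipped = Xc @ phi1_flipped        # component with the opposite sign

same_variance = bool(np.isclose(z1.var(ddof=1), z1_flipped.var(ddof=1)))
# The rank-one piece of X described by this component is unchanged.
same_fit = bool(np.allclose(np.outer(z1, phi1),
                            np.outer(z1_flipped, phi1_flipped)))
print(same_variance, same_fit)
```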
What are the principal components ordered by?
share of the total variance of X explained. This share equals each component’s variance divided by the sum of variances of all PC’s.
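The share can be computed as below. This is a sketch assuming NumPy and toy data whose columns are given different spreads; variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)        # toy data, assumed for illustration
X = rng.normal(size=(100, 4)) * np.array([3.0, 2.0, 1.0, 0.5])
Xc = X - X.mean(axis=0)

s = np.linalg.svd(Xc, compute_uv=False)
pc_var = s**2 / (len(X) - 1)          # variance of each principal component
pve = pc_var / pc_var.sum()           # share of total variance explained
print(np.round(pve, 3))
```

The components come out already ordered from largest to smallest share, and the shares sum to 1.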
If you have fewer x's than observations, …
will obtain all the p components
How to tell whether 2 principal components are uncorrelated?
their loading vectors are perpendicular (orthogonal)
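A small numerical check, assuming NumPy and toy data with one correlated pair of columns: the dot product of two loading vectors is zero, and the corresponding component scores have zero sample correlation.

```python
import numpy as np

rng = np.random.default_rng(4)        # toy data, assumed for illustration
X = rng.normal(size=(200, 3))
X[:, 1] += 0.8 * X[:, 0]              # make two columns correlated
Xc = X - X.mean(axis=0)

_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
phi1, phi2 = Vt[0], Vt[1]             # first two loading vectors

dot = float(phi1 @ phi2)              # perpendicular loadings give 0
z1, z2 = Xc @ phi1, Xc @ phi2         # first two components
corr = float(np.corrcoef(z1, z2)[0, 1])
print(dot, corr)
```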
Are principal components scale variant? Why?
Scaling up one variable by a constant would blow up its variance and change its loading
have to standardize the data unless all variables are in the same units
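Sketch of the scale problem, assuming NumPy and two made-up correlated variables. Multiplying x1 by 100 (e.g., a change of units) makes the first component load almost entirely on x1; standardizing first removes the effect. The helper names are hypothetical.

```python
import numpy as np

def first_loading(X):
    """First loading vector of X, with the sign fixed for easy comparison."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    v = Vt[0]
    return v if v[np.argmax(np.abs(v))] > 0 else -v

def standardize(X):
    """Center each column and divide by its sample standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

rng = np.random.default_rng(5)        # toy data, assumed for illustration
x1 = rng.normal(size=500)
x2 = 0.7 * x1 + rng.normal(scale=0.5, size=500)
X = np.column_stack([x1, x2])
X_rescaled = X * np.array([100.0, 1.0])   # x1 measured in different units

raw = first_loading(X)                    # both variables get real weight
rescaled = first_loading(X_rescaled)      # x1 dominates after rescaling
std_a = first_loading(standardize(X))
std_b = first_loading(standardize(X_rescaled))
print(raw, rescaled, std_a, std_b)
```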
When does PCA work best? What kind of data must it be used on?
highly correlated data (correlation coefficient r above 0.5)
continuous variables (categorical variables are not allowed; for those, use CATPCA instead)
What is a component score?
the value of the transformed variable z_m for a given observation
What are the axes in a PCA biplot?
left and bottom: component scores (z_m)
top and right: loadings
What is a scree plot? How to find the number of components?
a plot of the proportion of total variance of X explained by each subsequent component
look for an “elbow” in the plot, where the contribution to variance drops sharply and then flattens. Keep retaining components until the elbow appears in the plot
Why is it best to use PCA on highly correlated variables?
Highly correlated variables provide redundant information on one another. Should be possible to extract a set of independent factors that explain the bulk of variation over time.
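To see the redundancy point numerically, here is a sketch assuming NumPy and two made-up data sets: five variables driven by one common factor versus five unrelated variables. The function name pve is illustrative.

```python
import numpy as np

def pve(X):
    """Proportion of variance explained by each PC of the standardized data."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    s = np.linalg.svd(Xs, compute_uv=False)
    return s**2 / np.sum(s**2)

rng = np.random.default_rng(6)        # toy data, assumed for illustration
n, p = 500, 5
common = rng.normal(size=(n, 1))      # one shared factor
X_corr = common + 0.3 * rng.normal(size=(n, p))   # highly correlated columns
X_indep = rng.normal(size=(n, p))                 # unrelated columns

pve_corr = pve(X_corr)
pve_indep = pve(X_indep)
print(np.round(pve_corr, 3))
print(np.round(pve_indep, 3))
```

With the correlated set a single component carries most of the variance; with the unrelated set every component carries roughly an equal share, so nothing can be dropped cheaply.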
What are the level, slope, and curvature components?
level: loadings are similar for all of the variables (all load with the same sign)
slope: loadings fall steadily across the variables, downward sloping all the way past 0, so the sign flips once
curvature: loadings flip sign twice, giving a U-shaped graph; induces butterfly-like movements in the yield curve
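A stylized illustration of the three sign patterns. Everything here is an assumption made for the sketch, not course data: 8 hypothetical maturities and a made-up correlation matrix in which correlation decays as 0.9^|i-j| with the distance between maturities. The first three loading vectors of that matrix show 0, 1 and 2 sign changes.

```python
import numpy as np

maturities = np.arange(1, 9)          # hypothetical maturities 1..8
rho = 0.9                             # assumed correlation of neighbours
corr = rho ** np.abs(maturities[:, None] - maturities[None, :])

# Eigen-decomposition of the correlation matrix (eigh returns ascending order).
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]     # largest eigenvalue first
loadings = eigvecs[:, order[:3]]      # level, slope, curvature candidates

def sign_changes(v):
    """Number of times consecutive entries of v switch sign."""
    return int(np.sum(np.sign(v[:-1]) != np.sign(v[1:])))

changes = [sign_changes(loadings[:, m]) for m in range(3)]
print(changes)
```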
What is the problem with constraining the betas? How does PCA balance this flaw?
Constraints on β’s may incur bias. However, dimensionality reduction has the potential to greatly reduce variance.
What do we do once we know PCA?
we can run the principal components regression.
What are the steps to running PCR?
Run PCA on the predictor matrix X.
Construct the sample principal components Z.
Keep only the first M < P principal components.
Regress y on z1, …, zM .
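The four steps above can be sketched as follows, assuming NumPy, centered predictors, and toy data. The function names pcr_fit and pcr_predict are hypothetical, not from any library.

```python
import numpy as np

def pcr_fit(X, y, M):
    """Principal components regression using the first M components."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)        # PCA on X
    Phi = Vt[:M].T                                           # first M loading vectors
    Z = Xc @ Phi                                             # sample PCs z1..zM
    theta, *_ = np.linalg.lstsq(Z, y - y_mean, rcond=None)   # regress y on Z
    beta = Phi @ theta                # implied coefficients on the original x's
    return x_mean, y_mean, beta

def pcr_predict(model, X_new):
    x_mean, y_mean, beta = model
    return y_mean + (X_new - x_mean) @ beta

rng = np.random.default_rng(7)        # toy data, assumed for illustration
n, p = 100, 4
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.3, size=n)

pred_m2 = pcr_predict(pcr_fit(X, y, M=2), X)      # reduced model
pred_full = pcr_predict(pcr_fit(X, y, M=p), X)    # nothing discarded

# Ordinary least squares with an intercept, for comparison.
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
pred_ols = A @ coef
```

With M = p no component is discarded, so the PCR predictions coincide with ordinary least squares; the reduction only bites when M < p.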
How to choose M for unsupervised PCA and PCR?
For unsupervised PCA, there is no clear rule: some recommend plotting the proportion of variance explained on a scree plot and retaining PCs until the contribution drops off and the plot flattens (at the “elbow”).
For PCR, we can simply use K-fold cross-validation.
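A sketch of choosing M for PCR by K-fold cross-validation, assuming NumPy, K = 5, and toy data in which two latent factors drive both X and y. The function names are hypothetical. PCA is re-run inside each training fold so the held-out fold never influences the loadings.

```python
import numpy as np

def pcr_predict(X_train, y_train, X_test, M):
    """Fit PCR with M components on the training fold, predict the test fold."""
    x_mean, y_mean = X_train.mean(axis=0), y_train.mean()
    Xc = X_train - x_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Phi = Vt[:M].T
    theta, *_ = np.linalg.lstsq(Xc @ Phi, y_train - y_mean, rcond=None)
    return y_mean + (X_test - x_mean) @ Phi @ theta

def cv_mse(X, y, M, K=5, seed=0):
    """K-fold cross-validated mean squared error of PCR with M components."""
    folds = np.array_split(np.random.default_rng(seed).permutation(len(y)), K)
    errors = []
    for k, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != k])
        pred = pcr_predict(X[train], y[train], X[test], M)
        errors.append(np.mean((y[test] - pred) ** 2))
    return float(np.mean(errors))

# Toy data, assumed for illustration: two latent factors drive X and y.
rng = np.random.default_rng(8)
n, p = 200, 6
latent = rng.normal(size=(n, 2)) * np.array([3.0, 2.0])
Q, _ = np.linalg.qr(rng.normal(size=(p, 2)))      # orthonormal columns
X = latent @ Q.T + 0.3 * rng.normal(size=(n, p))
y = latent[:, 0] + 2.0 * latent[:, 1] + rng.normal(scale=0.5, size=n)

cv_errors = [cv_mse(X, y, M) for M in range(1, p + 1)]
best_M = int(np.argmin(cv_errors)) + 1            # M with the lowest CV error
print(np.round(cv_errors, 3), best_M)
```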
What is the underlying assumption in PCR?
directions of highest variability in X are those most associated with y
When does the assumption fail?
useful signal may “hide” in the low variance component that would be discarded.
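A made-up example of the failure case, assuming NumPy: X has one high-variance direction unrelated to y and one low-variance direction that drives y. Keeping only the first component throws the signal away.

```python
import numpy as np

rng = np.random.default_rng(9)        # toy data, assumed for illustration
n = 500
high = rng.normal(scale=5.0, size=n)  # high-variance direction, unrelated to y
low = rng.normal(scale=0.5, size=n)   # low-variance direction, drives y
X = np.column_stack([high + low, high - low])
y = 4.0 * low + rng.normal(scale=0.2, size=n)

Xc = X - X.mean(axis=0)
yc = y - y.mean()
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt.T                         # both principal components

def r_squared(Zm):
    """In-sample R^2 from regressing centered y on the given components."""
    theta, *_ = np.linalg.lstsq(Zm, yc, rcond=None)
    resid = yc - Zm @ theta
    return float(1 - (resid @ resid) / (yc @ yc))

r2_first = r_squared(Z[:, :1])        # PCR with M = 1
r2_both = r_squared(Z)                # PCR with M = 2
print(r2_first, r2_both)
```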
Does PCR use variable selection?
No
all of the original variables are used to form each principal component
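To illustrate, a sketch assuming NumPy and toy data where y truly depends on x1 only. The coefficients PCR implies on the original variables (loadings times the regression weights) are all non-zero, so no variable is dropped.

```python
import numpy as np

rng = np.random.default_rng(10)       # toy data, assumed for illustration
n, p, M = 200, 5, 2
common = rng.normal(size=(n, 1))
X = common + 0.5 * rng.normal(size=(n, p))        # correlated predictors
y = X[:, 0] + rng.normal(scale=0.3, size=n)       # y depends on x1 only

Xc = X - X.mean(axis=0)
yc = y - y.mean()
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Phi = Vt[:M].T                        # first M loading vectors
theta, *_ = np.linalg.lstsq(Xc @ Phi, yc, rcond=None)

beta = Phi @ theta                    # implied coefficient on each original x
n_nonzero = int(np.sum(np.abs(beta) > 1e-8))
print(np.round(beta, 3), n_nonzero)
```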