W11 - PCA - MULTIVARIATE Flashcards
What does PCA stand for and what is it used for?
Principal Component Analysis
It describes the variation in multivariate data
What are the benefits of using PCA?
Allows data clouds with p > 2 variables to be described in fewer dimensions (reduces complexity)
Finds biologically interesting interactions between variables
Defines new variables free from multicollinearity
In a PCA, what happens to the axes of the graph?
They become vectors: the axes are rotated to lie along the eigenvectors (principal components) of the data
What is an eigenvector?
The first eigenvector is the longest axis through the data cloud of a multivariate dataset
It is the direction that accounts for the most variance in the dataset
Are eigenvectors correlated with one another?
No. They are orthogonal (at right angles), so the new variables they define are uncorrelated
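A minimal sketch of the answer above, assuming two hypothetical traits measured on six individuals (the numbers are made up for illustration). It uses only the Python standard library and the closed-form solution for a 2 x 2 covariance matrix; with more than two variables a linear algebra library would normally be used instead.

```python
import math

# Hypothetical measurements of two correlated traits (illustration only).
x = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
n = len(x)

# Centre each variable on its mean.
xc = [v - sum(x) / n for v in x]
yc = [v - sum(y) / n for v in y]


def cov(a, b):
    """Sample covariance of two centred lists (n - 1 denominator)."""
    return sum(ai * bi for ai, bi in zip(a, b)) / (len(a) - 1)


def corr(a, b):
    """Pearson correlation of two centred lists."""
    return cov(a, b) / math.sqrt(cov(a, a) * cov(b, b))


# Eigenvectors of the 2 x 2 covariance matrix, from the angle of its major axis.
angle = 0.5 * math.atan2(2 * cov(xc, yc), cov(xc, xc) - cov(yc, yc))
pc1 = (math.cos(angle), math.sin(angle))
pc2 = (-math.sin(angle), math.cos(angle))

# Scores: each individual's position along the two eigenvectors.
scores1 = [pc1[0] * a + pc1[1] * b for a, b in zip(xc, yc)]
scores2 = [pc2[0] * a + pc2[1] * b for a, b in zip(xc, yc)]

dot = pc1[0] * pc2[0] + pc1[1] * pc2[1]
print("dot product of eigenvectors:", dot)
print("correlation of original variables:", corr(xc, yc))
print("correlation of PC scores:", corr(scores1, scores2))
```

The original variables are strongly correlated, while the dot product of the eigenvectors and the correlation between the two sets of scores are both zero up to floating-point rounding.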
How do we graph eigenvectors?
The data are unstructured, so both variables are treated as responses (one plotted on the Y axis, the other on the X axis)
There is no biological reason to assume that one variable causes change in the other
Describe eigenvalue
Eigenvalue = length
Variance in the direction described by the corresponding eigenvector
Describe eigenvector
Eigenvector = direction
Coefficients (loadings) of the measured variables that describe the direction
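The two cards above can be checked numerically. The sketch below assumes hypothetical data for two traits (made-up numbers) and uses the closed-form eigenanalysis of a 2 x 2 covariance matrix, so it needs only the standard library. It shows that the variance of the scores along the first eigenvector equals the first eigenvalue.

```python
import math

# Hypothetical measurements of two traits on six individuals (illustration only).
x = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]


def centre(values):
    """Subtract the mean so the data cloud is centred on the origin."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def cov(a, b):
    """Sample covariance of two centred lists (n - 1 denominator)."""
    return sum(ai * bi for ai, bi in zip(a, b)) / (len(a) - 1)


xc, yc = centre(x), centre(y)
sxx, syy, sxy = cov(xc, xc), cov(yc, yc), cov(xc, yc)

# Closed-form eigenanalysis of the covariance matrix [[sxx, sxy], [sxy, syy]].
mid = (sxx + syy) / 2
spread = math.hypot((sxx - syy) / 2, sxy)
eigenvalues = (mid + spread, mid - spread)     # lengths (variances)
angle = 0.5 * math.atan2(2 * sxy, sxx - syy)   # orientation of the major axis
pc1 = (math.cos(angle), math.sin(angle))       # direction (loadings on x and y)

# Scores: position of each individual along the first eigenvector.
scores1 = [pc1[0] * a + pc1[1] * b for a, b in zip(xc, yc)]
share = eigenvalues[0] / sum(eigenvalues)      # proportion of total variance on PC1

print("eigenvalues:", eigenvalues)
print("PC1 loadings:", pc1)
print("variance of PC1 scores:", cov(scores1, scores1))
print("proportion of variance on PC1:", share)
```

The eigenvalues also sum to the total variance of the original variables (sxx + syy), which is why the share of each eigenvalue can be read as the proportion of variation captured by that component.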
What are the assumptions of eigenanalysis?
Best suited to linear relationships between variables
Does not assume multivariate normality
Is influenced by outliers
Sensitive to sampling error
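To illustrate the point about outliers, here is a rough sketch that compares the direction of the first eigenvector with and without one extreme point. The data and the outlier are hypothetical, and the helper names first_axis_angle and axis_difference are invented for this example.

```python
import math


def first_axis_angle(x, y):
    """Angle (degrees) of the first eigenvector of the 2 x 2 sample covariance matrix."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy))


def axis_difference(a, b):
    """Smallest angle between two undirected axes, in degrees (0 to 90)."""
    d = abs(a - b) % 180
    return min(d, 180 - d)


# Hypothetical data (illustration only): six individuals, then the same six plus one outlier.
x = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]

clean = first_axis_angle(x, y)
with_outlier = first_axis_angle(x + [2.0], y + [20.0])
shift = axis_difference(clean, with_outlier)

print(f"PC1 angle without outlier: {clean:.1f} degrees")
print(f"PC1 angle with outlier:    {with_outlier:.1f} degrees")
print(f"rotation of PC1 caused by one point: {shift:.1f} degrees")
```

With so few observations, a single extreme point is enough to swing the first axis towards itself, which also hints at why small samples give unstable eigenvectors.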