Lecture 5: Principal Component Analysis Flashcards
What does a PCA attempt to do?
It attempts to explain the variance in the original variables using as few new ‘components’ as possible.
How does a PCA attempt to explain variables in as few new ‘components’ as possible? What are these components
It transforms the variables by rotating the axes; the rotated axes are the new components.
• Axes are chosen to maximise explained variance.
• The total variance of the new components is the same as the total variance of the original variables.
Explain the process of axis rotation
First we rotate the axes on which the data points are plotted, drawing a line through the p-dimensional space (the p variables) that accounts for the most variance. You could describe this as the line that describes the data best, i.e. the line that lies closest to the data points. For the second component we find the line that is perpendicular to the first and accounts for the most remaining variance (if the data are 2D this is simply the perpendicular line). If there are more than two dimensions we repeat this step for each remaining dimension, each time finding a line through the data that is perpendicular to all previous components and explains as much of the remaining variance as possible.
In other words:
1. Maximises amount of variance accounted for by the 1st (principal) component.
2. For each subsequent component: maximises variance accounted for as long as:
• component is orthogonal (perpendicular, uncorrelated) to all previous components
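A minimal sketch of this in R, assuming the built-in iris data (any numeric data frame would do):
# prcomp rotates the axes so the first component captures the most variance,
# and each later component captures the most remaining variance while staying
# orthogonal to all previous ones.
X   <- iris[, 1:4]                  # four numeric variables (p = 4)
pca <- prcomp(X, center = TRUE)     # centre the variables, then rotate
summary(pca)                        # proportion of variance per component, in decreasing order
round(crossprod(pca$rotation), 3)   # identity matrix: the component directions are orthogonal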
How is the maximum number of components decided?
It equals the number of dimensions (variables), p.
How are the different principal components related to each other?
They’re not; they are orthogonal (uncorrelated), and each principal component loading vector is unique (up to a sign flip).
How does total variance differ following a PCA?
It stays the same
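A quick check in R (again assuming the iris data) that the total variance is unchanged:
X   <- iris[, 1:4]
pca <- prcomp(X, center = TRUE)
sum(apply(X, 2, var))   # total variance of the original variables
sum(pca$sdev^2)         # total variance of the components (the same number)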
What is meant by the eigenvectors in terms of PCA?
The orientation of the components (lines through the data)
Finish this from the first lecture:
All p x p matrices A have _______ for which __ = __
All p x p matrices A have associated scalars λj and vectors xj for which Axj = λjxj
In regards to PCA, what do the variables represent in the following equation?
Axj = λjxj
A represents an R (correlation) or S (covariance) matrix. λj represents the eigenvalue of component j: how much variance that component explains. xj represents the eigenvector: how much each variable in the dataset loads onto the component (e.g. a weight of 0.40 on petal width and 0.90 on petal length).
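As an illustration, a sketch of this decomposition in R, assuming we use the covariance matrix S of the iris measurements:
S <- cov(iris[, 1:4])        # A = the covariance matrix S (use cor() for the correlation matrix R instead)
e <- eigen(S)
e$values                     # eigenvalues: variance explained by each component
e$values / sum(e$values)     # proportion of the total variance per component
e$vectors                    # eigenvectors: loadings of each variable on each component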
How can we project data onto the new component axes?
We can project the data onto the new component axes using a weighted sum of the original variables, with the weights being the eigenvectors.
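A sketch of this projection in R (iris data assumed); the result matches prcomp's scores up to possible sign flips:
Xc     <- scale(iris[, 1:4], center = TRUE, scale = FALSE)   # centre the variables first
e      <- eigen(cov(iris[, 1:4]))
scores <- Xc %*% e$vectors     # weighted sums: one column of scores per component
head(scores)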
Describe two important constraints when determining these component vectors
- We choose the eigenvector that maximises the variance accounted for.
- The squared elements of an eigenvector add up to one (the vector has unit length).
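The second constraint is easy to verify in R (covariance matrix of the iris data assumed):
e <- eigen(cov(iris[, 1:4]))
colSums(e$vectors^2)    # each column (eigenvector) has squared elements summing to 1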
Should you centre your variables in a PCA? What happens if you do or don’t?
You should always centre your variables, and if their variances differ greatly you can also scale them by their standard deviations. If you don’t centre your variables, you’ll have to centre the rotated data after the PCA so that the origin of the new components sits at the data’s mean (the new zero point).
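In R's prcomp these choices correspond to the center and scale. arguments (a sketch, iris data assumed):
pca_centred <- prcomp(iris[, 1:4], center = TRUE)                  # centre only
pca_scaled  <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)   # also scale by the SD
# scale. = TRUE is the usual choice when the variables' variances differ greatly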
The eigenvectors are also known as the _____ and tell us how _____
The eigenvectors are also known as the rotation matrix and tell us how much to rotate the original axes.
What does this matrix look like and how is it used?
The matrix of eigenvectors may look like this in R:
( 0.38772  -0.92178 )
( 0.92178   0.38772 )
The general 2D rotation matrix template looks like:
( cosθ  -sinθ )
( sinθ   cosθ )
Since 0.388 represents the cosine of the rotation angle, we take the arccosine to solve for the angle theta: arccos(0.388) is about 67 degrees, so we rotate the axes counterclockwise by about 67 degrees.
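The same calculation in R, using the values above:
rot <- matrix(c(0.38772, 0.92178,
               -0.92178, 0.38772), nrow = 2)   # columns are the eigenvectors
acos(rot[1, 1]) * 180 / pi                     # arccos(0.38772) = about 67 degrees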
What is the next step after we have rotated the axes?
Plug the eigenvector values into the components and we have a formula for each of the new axes z1 and z2. Each component consists of the values in the corresponding column of the rotation matrix.
Component 1: z1 = 0.388y1 + 0.922y2
Component 2: z2 = -0.922y1 + 0.388y2
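A small sketch in R of applying these weights; the two-column matrix y here is hypothetical stand-in data, not from the lecture:
y   <- matrix(rnorm(20), ncol = 2)                        # stand-in for centred variables y1, y2
rot <- matrix(c(0.388, 0.922, -0.922, 0.388), nrow = 2)   # columns = component weights
z   <- y %*% rot     # z[, 1] = 0.388*y1 + 0.922*y2 ;  z[, 2] = -0.922*y1 + 0.388*y2
head(z)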