Week 4 Flashcards
What does PCA stand for?
Principal Component Analysis (PCA)
What essentially is Principal Component Analysis (PCA)?
Combining variables into weighted sums
What essentially is Factor Analysis
A technique for showing relationships among variables by relationships to hypothetical underlying factors aka LATENT VARIABLES
Principal Component Analysis (PCA) uses correlation as the underlying association among variables, but Factor Analysis does not, True/False?
FALSE
They both use correlations as the underlying thing
How would you characterise the difference between Principal Component Analysis (PCA) and Factor Analysis?
PCA looks for ‘optimal linear transformations’ - i.e. the weighted sums (linear combinations) of the variables that capture as much of the variance as possible.
Factor Analysis makes a theoretical assumption that there is an underlying latent variable that explains the observed variables
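Not part of the course (which uses SPSS), but here's a quick numpy sketch of the PCA side of this: the components really are just weighted sums of the standardised variables, with weights coming from the eigenvectors of the correlation matrix. All data and names here are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))        # 200 cases, 3 observed variables
X[:, 1] = X[:, 1] + X[:, 0]          # make two of them correlate
R = np.corrcoef(X, rowvar=False)     # PCA here works on the correlation matrix

eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]           # biggest variance first
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Component scores: each column is a weighted sum of the standardised variables
Z = (X - X.mean(axis=0)) / X.std(axis=0)
scores = Z @ eigenvectors
```

Note the eigenvalues sum to the number of variables (the trace of R) - which is why "eigenvalue > 1" rules even exist.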
What determines where you draw the line for the FIRST PRINCIPAL COMPONENT?
The line that minimises the perpendicular distances to each of the data points - equivalently, the line along which the data vary the most. Its direction is given by the first EIGENVECTOR of the correlation matrix.
What even is the SECOND PRINCIPAL COMPONENT?
Another weighted sum of the variables - one that is uncorrelated with the first component and captures as much as possible of the variance the first component left unexplained.
What is the name given to the VARIANCE of the FIRST PRINCIPAL COMPONENT?
The first EIGENVALUE
What do the SQUARED weights of the first principal component always add up to?
1 - the weights form a unit-length eigenvector, so the sum of their squares is 1
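A one-card demo of that constraint (a numpy sketch, not course material - the toy correlation matrix is made up): eigenvector routines return unit-length weight vectors, so the squared weights sum to 1.

```python
import numpy as np

# A toy correlation matrix, just to extract a first-component weight vector
R = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.4],
              [0.3, 0.4, 1.0]])
vals, vecs = np.linalg.eigh(R)
w = vecs[:, np.argmax(vals)]            # weights of the FIRST principal component
sum_of_squares = float(np.sum(w**2))    # the quantity that always equals 1
```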
What is one of the rules for determining the SECOND PRINCIPAL COMPONENT?
It must have zero correlation with the FIRST principal component
What’s the limit to number of principal components you can have?
It is based on the number of variables you have.
Number of variables = max principal components
Is principal component analysis a MATHEMATICAL or a STATISTICAL technique? And why?
It’s mathematical
It’s based on matrix algebra
It doesn’t model error terms
According to Kaiser’s rule, when should you cut off the number of factors/components?
When the Eigenvalue drops below 1
Aka Kaiser-Guttman
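The rule itself is trivial to apply (probably why it became the default). A one-liner sketch with made-up eigenvalues:

```python
import numpy as np

# Hypothetical eigenvalues from some extraction
eigs = np.array([3.1, 1.4, 1.05, 0.80, 0.40, 0.25])
k_kaiser = int(np.sum(eigs > 1))   # Kaiser-Guttman: retain eigenvalues > 1
```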
Do we like the Kaiser-Guttman rule?
NO!
It tends to choose about a third of the number of variables, regardless of what the data look like.
Schmitt says it is “the most inaccurate” of all methods
What are the four methods for determining how many components/factors to use?
- Kaiser-Guttman (don’t use)
- Scree plot
- Parallel Test (the one that uses random data)
- MAP test (the Minimum Average Partial correlation test; Velicer, 1976)
What do you use the Parallel Test for?
Deciding how many factors/components to use.
When doing the Parallel Test, do you use the mean of the random data or the 95th percentile?
The 95th percentile - it’s a stricter cutoff than the mean, so you keep fewer spurious factors.
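SPSS doesn't do this one natively, so here's a sketch of the Parallel Test in numpy (function name and the simulated one-factor data are mine): keep a component only if its eigenvalue beats the 95th percentile of eigenvalues from same-sized random, uncorrelated data.

```python
import numpy as np

def parallel_analysis(X, n_iter=200, percentile=95, seed=0):
    """Count components whose eigenvalue beats the chosen percentile of
    eigenvalues obtained from random data of the same shape."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    real = np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]
    rand = np.empty((n_iter, p))
    for i in range(n_iter):
        noise = rng.normal(size=(n, p))
        rand[i] = np.sort(np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False)))[::-1]
    threshold = np.percentile(rand, percentile, axis=0)
    return int(np.sum(real > threshold))

# Simulated data with ONE common factor behind six items
rng = np.random.default_rng(1)
f = rng.normal(size=(300, 1))
X = f @ np.ones((1, 6)) + 0.5 * rng.normal(size=(300, 6))
k = parallel_analysis(X)   # should land on one retained component
```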
When examining your component loadings - I think it’s in a FACTOR LOADING MATRIX - it is common practice to remove any loadings that are less than 0.5, True/False?
FALSE
But close. You can remove any with a loading of less than 0.3.
This is called SUPPRESSING THE LOADINGS
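Suppression is purely cosmetic - the solution is untouched, small loadings are just blanked out of the displayed table. A tiny sketch with a made-up loading matrix:

```python
import numpy as np

loadings = np.array([[0.82,  0.10],
                     [0.75, -0.05],
                     [0.12,  0.68],
                     [0.28,  0.71]])
# Blank out (suppress) loadings below .3 in absolute value, for display only
display = np.where(np.abs(loadings) >= 0.3, loadings, np.nan)
```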
In factor analysis, after extracting your factors/components, you are going to want to do some ROTATION. What are the two types of rotation you can do
Orthogonal and Oblique
What assumption underpins ORTHOGONAL rotation?
The factors are UNcorrelated
What assumption underpins OBLIQUE rotation?
The factors ARE correlated
In psychology, are you more likely to do ORTHOGONAL or OBLIQUE rotation?
OBLIQUE
In other words, the one that allows the factors to correlate
When you do oblique rotation, you will get two matrices of loadings. What are they called?
The factor PATTERN matrix, and
the factor STRUCTURE matrix
What does the factor PATTERN matrix contain?
The regression coefficients (weights) for predicting each observed variable from the factors
What does the factor STRUCTURE matrix contain?
The correlations between each observed variable and each factor
Of the two matrices of leadings that we get upon completing a rotation, which one do we generally report?
The factor PATTERN matrix
If you do an OBLIQUE rotation, the PATTERN and the STRUCTURE matrix are the same thing, T/F?
FALSE
They are only the same thing if you do an ORTHOGONAL rotation
With FACTOR ANALYSIS, what is our aim in relation to PARTIAL CORRELATIONS between observed variables?
We want the PARTIAL CORRELATIONS to be as close to zero as possible
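A way to see this in code (a numpy sketch, not from the course - partial correlations here come from the inverse correlation matrix): if one common factor drives all the items, the raw correlations are big but the partial correlations shrink right down, because the shared variance explains the associations.

```python
import numpy as np

def partial_correlations(R):
    """Partial correlation of each pair, controlling for all the OTHER
    variables, via the inverse correlation (precision) matrix."""
    P = np.linalg.inv(R)
    d = np.sqrt(np.diag(P))
    Q = -P / np.outer(d, d)
    np.fill_diagonal(Q, 1.0)
    return Q

# Six items that all correlate .8, as if driven by one strong common factor
R = np.full((6, 6), 0.8)
np.fill_diagonal(R, 1.0)
Q = partial_correlations(R)
# Raw correlations are .8; the partials are far smaller
```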
In factor analysis, what is the name given to the VARIANCE that is die to the COMMON FACTORS?
The COMMUNALITY
In factor analysis, what is the opposite of COMMUNALITY?
The UNIQUE VARIANCE
… because communality refers to the variance in the observed factors that is due to the common factors.
Exploratory Factor Analysis works best with ordinal data, T/F?
FALSE
It assumes continuous data (ie Interval or ratio)
You can proceed, but you’ve just gotta note the limitations
Missing data is a big deal for Exploratory Factor Analysis, T/F
TRUE
Can’t have missing data. Either impute the data or delete the case.
Is Exploratory Factor Analysis sensitive to sample size?
Big time
If you have LOW CORRELATIONS between variables to begin with, is this a good or bad sign for Exploratory Factor Analysis?
Bad sign
Factor analysis is based on the assumption that there is some common thing going on, so low correlations means you might be on a hiding to nothing
How can you test whether your variables have a high enough correlation to run Exploratory Factor Analysis?
Yep, it’s called BARTLETT’S TEST (of sphericity), but we don’t like it
When doing Exploratory Factor Analysis, does linearity matter?
Totes
If you have low PARTIAL CORRELATIONS between variables to begin with, is this a good or bad sign for Exploratory Factor Analysis?
Good sign
Because low partial correlations mean the observed correlations are explained by what the variables share - i.e. common factors, which is exactly what EFA is looking for
When doing Exploratory Factor Analysis, do OUTLIERS matter?
Yup
Cos Pearson’s r isn’t robust to outliers
If you have CRAZY HIGH correlations (multicollinearity) between variables to begin with, is this a good or bad sign for Exploratory Factor Analysis?
Bad sign
You don’t want them too high or too low
What is Kaiser’s Measure of Sampling Adequacy about?
It’s an overall statistic that tells you how suitable your dataset is for factor analysis
It’s computed from the ANTI-IMAGE correlation matrix (which is based on partial correlations)
What is the minimum acceptable value for Kaiser’s Measure of Sampling Adequacy - that is, the score below which your dataset is unacceptable?
below .5
When you’re looking at an Anti-Image matrix following Kaiser’s Measure of Sampling Adequacy, what do we want the values ON the DIAGONAL to approach?
Approach 1
When you’re looking at an Anti-Image matrix following Kaiser’s Measure of Sampling Adequacy, what do we want the values OFF the DIAGONAL to approach?
Approach zero
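Tying the last few cards together, here's a sketch of the overall KMO/MSA statistic in numpy (function name mine; the standard form compares squared raw correlations to squared partial correlations off the diagonal):

```python
import numpy as np

def kmo_overall(R):
    """Overall Kaiser MSA/KMO from a correlation matrix: big when raw
    correlations dominate the partial (anti-image) correlations."""
    P = np.linalg.inv(R)
    d = np.sqrt(np.diag(P))
    Q = -P / np.outer(d, d)                 # partial correlations
    off = ~np.eye(R.shape[0], dtype=bool)
    r2 = float((R[off] ** 2).sum())
    q2 = float((Q[off] ** 2).sum())
    return r2 / (r2 + q2)

# Highly intercorrelated items -> very factorable -> KMO near 1
R = np.full((6, 6), 0.8)
np.fill_diagonal(R, 1.0)
msa = kmo_overall(R)
```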
What are the TWO major methods of finding factors in SPSS (the ones that Schmitt (2011) recommends)?
- Maximum likelihood (ML)
- Principal axis factoring (PA)
Under what circumstances would you use PRINCIPAL AXIS FACTORING (PA) when finding factors?
If your data is not normally distributed
Under what circumstances would you use MAXIMUM LIKELIHOOD (ML) when finding factors?
When you want to generalise beyond your sample to the population
Note: requires normal distribution
What does it mean when you see .999 in a factor loading table?
It’s a Heywood case
It signals an improper solution - an estimation error, not a real loading
What is the term given to when you weight each factor loading the same (hint: equivalently)?
Tau-equivalent
What are the three ways SPSS provides for estimating factor scores? And which is the one Geoff recommends, and why?
- a regression method
- the BARTLETT method (actually Maximum Likelihood or Weighted Least Squares)
- ANDERSON-RUBIN
Geoff recommends BARTLETT, because for oblique solutions ANDERSON-RUBIN is misleading: it forces the factor scores to be uncorrelated.
What are the SIX differences that Geoff highlights between PCA and FA?
- Principal Components = linear combinations of observed variables. Factors = theoretical entities.
- In FA, error is explicitly modelled. Not so for PCA.
- In FA, if factors are removed/added, the other factor loadings change. Not so for PCA
- FA is a theoretical modelling method (so can test model fit). Not so for PCA.
- FA ‘fragments’ variability into common and unique parts. Not so PCA
- PCA has a canonical algorithm that always works. FA has many, which need to be matched to the data
Do FA and PCA have the same general form?
Yes, and it goes like this:
Observed Variable = Loading x F + error
Where F is either a factor or a component
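That general form is easy to simulate (a numpy sketch, with a made-up loading of .7): if you scale the error so the observed variable stays standardised, the loading comes back as the correlation between the variable and F.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
F = rng.normal(size=n)                 # the latent factor (standardised)
loading = 0.7
# Observed variable = loading x F + error, with the error scaled so that
# the variable has variance ~1
x = loading * F + np.sqrt(1 - loading**2) * rng.normal(size=n)
est = float(np.corrcoef(x, F)[0, 1])   # recovers roughly the loading
```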
Do FA and PCA typically deliver similar results?
Yes, particularly if they’re applied to a large number of variables and a large sample size
What is the tip provided by a past tutor in the subject about when to use PCA and when to use FA
Use FA if you assume, or wish to test, a theoretical model of latent factors causing the observed variables
Use PCA if you simply want to reduce your correlated observed variables into a smaller set of important uncorrelated composite variables
With Varimax rotation, why would you not get both pattern and structure matrices?
Because Varimax is a form of ORTHOGONAL rotation, and orthogonal rotation keeps the factors perfectly uncorrelated, the pattern and structure matrices end up identical. Hence no need to produce both.
The Promax rotation technique is an OBLIQUE technique, which means factors are allowed to be correlated, thus it produces pattern AND structure matrices. But why?
The pattern matrix shows factor loadings
The structure matrix shows correlation between the variable and the factor.
The factor loadings in the pattern matrix PARTIAL OUT any common variation among factors
If you square and sum each of the factor loadings for a single VARIABLE, what do you get?
The COMMUNALITY for that variable
If you square and sum each of the factor loadings for a single FACTOR, what do you get?
The EIGENVALUE for that Factor
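These two cards are just row sums vs column sums of the same squared loading matrix. A sketch with a hypothetical (orthogonal, unrotated) solution:

```python
import numpy as np

# Hypothetical loading matrix: 4 variables (rows) x 2 factors (columns)
L = np.array([[0.8, 0.1],
              [0.7, 0.2],
              [0.2, 0.6],
              [0.1, 0.7]])
communalities  = (L**2).sum(axis=1)   # square + sum along a ROW (per variable)
ssl_per_factor = (L**2).sum(axis=0)   # square + sum down a COLUMN (per factor)
```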
If your data is not normally distributed, should you be cautious about using EFA?
Nope, you just need to make sure you use the PRINCIPAL AXIS FACTORING extraction method
If you have as many items measured as you have participants, should you be cautious about using EFA
YES FFS
Some people say you need at least 20 participants per item
If your MSA is below 0.5 should you worry about using an EFA? Why tho
Yes
Because a low MSA score means the correlation between your variables is quite low.
This means there is little common variance for you to extract, so you probably won’t get any factors that explain multiple items well
If some of your communalities are equal to or greater than 1.0, should you be worried about doing an EFA?
Totes
A communality score of 1 means there’s basically no unique variance
If this happens, the software will cap the estimate at a very high value (like .999) as a kind of error signal - this is where HEYWOOD CASES come from.
What is an Eigenvalue anyway? No, really, what actually is it
It is the variance explained by a principal component
If it is about components, why are we thinking about it in relation to EFA? Because eigenvalues belong to the correlation matrix, which both PCA and EFA analyse - so EFA output still reports them to help you decide how many factors to extract.
Imagine you’re doing an EFA.
Using SPSS outputs, how do you tell which item is explained WORST by the extracted factors?
First, you look at the COMMUNALITIES table
Then you find the one with the LOWEST score
REASON: Communalities reflect the sum of squared factor loadings and thus, a low communality means low factor loadings
In EFA, one of the benefits of rotating the factors is to increase the total variance explained, T/F?
FALSE
Rotation will never affect the total explained variance, it will only affect how each factor contributes towards this total.
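You can see this directly in code (numpy sketch, made-up loadings): multiplying the loading matrix by any orthogonal rotation matrix reshuffles variance between the factors, but the total (and each variable's communality) stays exactly the same.

```python
import numpy as np

L = np.array([[0.8, 0.1],
              [0.7, 0.2],
              [0.2, 0.6],
              [0.1, 0.7]])
theta = np.radians(30)                  # an arbitrary orthogonal rotation
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
L_rot = L @ T

total_before = float((L**2).sum())
total_after  = float((L_rot**2).sum())
per_factor_before = (L**2).sum(axis=0)       # these DO change under rotation
per_factor_after  = (L_rot**2).sum(axis=0)
```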