exploratory factor analysis ii Flashcards
Scree plot useful for…
Deciding how many factors to keep
Two possible criteria for scree plot
Cut off at the point where the eigenvalues level off and start to fall approx. linearly (‘inflection point’)
If the variables had previously been z-standardised, cut off where the eigenvalue λ < 1 (Kaiser criterion, K1 rule)
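A minimal sketch of the K1 rule, assuming NumPy is available. The function name `kaiser_count` and the simulated data (two latent variables driving three items each) are illustrative assumptions, not part of the original cards. The eigenvalues returned here are also the values one would draw in a scree plot.

```python
import numpy as np

def kaiser_count(data):
    """Return the eigenvalues (descending) of the correlation matrix
    and the number of factors kept under the K1 rule (eigenvalue > 1)."""
    # The correlation matrix equals the covariance matrix of z-standardised variables
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # eigvalsh returns ascending order
    n_keep = int(np.sum(eigenvalues > 1.0))
    return eigenvalues, n_keep

# Hypothetical data: items 0-2 follow the first latent variable, items 3-5 the second
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
data = latent[:, [0, 0, 0, 1, 1, 1]] + rng.normal(scale=0.5, size=(500, 6))

eigenvalues, n_keep = kaiser_count(data)
print(eigenvalues.round(2), n_keep)
```

With z-standardised variables each variable contributes a variance of 1, so the eigenvalues sum to the number of variables and an eigenvalue below 1 explains less than a single original variable.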
Interpretations of factors based on factor loadings
PCA does not care how easy the components are to interpret.
It only cares about
- Orthogonality
- and extraction of maximum variance
To make the interpretation easier, one can further rotate the coordinate system
This is specific to Factor Analysis (no longer PCA)
Orthogonal rotation
The factor axes are rotated together and stay at 90° to each other, so the factors remain uncorrelated
Oblique rotation
The angle between the factor axes does not remain 90°; it is changed to fit the factors
Fits both axes
Gives a distinct, simple structure
However, the angle changes, so the factors become correlated
Rotation techniques
Factor analysis goes beyond PCA in that it involves further rotation of the principal components with the objective to make the factor loadings more distinct (simple structure)
Varimax rotation
- Most commonly used technique:
· Orthogonal rotation, meaning the factors remain uncorrelated
· Maximises the variance of the (squared) factor loadings for each factor (meaning there is large heterogeneity of these…)
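A sketch of varimax in NumPy, assuming the widely used SVD-based iteration. This is one common way to implement it, not necessarily what any particular statistics package does internally. The function names, the example loadings and the 30° starting rotation are illustrative assumptions.

```python
import numpy as np

def varimax(loadings, max_iter=500, tol=1e-6):
    """Orthogonally rotate a (variables x factors) loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation
        gradient = loadings.T @ (rotated**3 - rotated * (rotated**2).sum(axis=0) / p)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt  # nearest orthogonal matrix to the gradient
        current = s.sum()
        if current < previous * (1 + tol):
            break
        previous = current
    return loadings @ rotation, rotation

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return float(np.var(loadings**2, axis=0).sum())

# Hypothetical simple structure, deliberately blurred by a 30 degree rotation
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
                   [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
angle = np.deg2rad(30)
blur = np.array([[np.cos(angle), -np.sin(angle)],
                 [np.sin(angle), np.cos(angle)]])
unrotated = simple @ blur

rotated, rotation = varimax(unrotated)
print(rotated.round(2))
```

Because the rotation matrix is orthogonal, the communality of every variable is unchanged; only the distribution of the loadings across factors changes.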
Oblimin / Promax rotation:
- Frequently used technique:
· Factors lose their orthogonality: correlations between factors are allowed in order to achieve ‘simple structure’
(factors nevertheless provide non-redundant information, since they rotate principal components, see map illustration)
· Promax is simpler and quicker than Oblimin
If we have no external validators (e.g. a priori knowledge about the underlying latent variables from different sources)…
neither factor solution is right or wrong; they are equivalent
Bartlett test of sphericity
If significant, the covariance matrix is suitable for factor analysis.
- If a covariance matrix is spherical, there is no need to run a factor analysis because the raw data variables are already (mostly) orthogonal in the first place
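A sketch of the test statistic, assuming the standard chi-square approximation χ² = -(n - 1 - (2p + 5)/6) · ln|R| with p(p - 1)/2 degrees of freedom, where R is the correlation matrix. The function name and the simulated data are illustrative assumptions. The p-value is left out to keep the example NumPy-only; it would come from the upper tail of the chi-square distribution.

```python
import numpy as np

def bartlett_sphericity(data):
    """Return Bartlett's chi-square statistic and its degrees of freedom."""
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    # slogdet is numerically safer than log(det(...)) for near-singular matrices
    _, log_det = np.linalg.slogdet(corr)
    statistic = -(n - 1 - (2 * p + 5) / 6) * log_det
    dof = p * (p - 1) // 2
    return statistic, dof

rng = np.random.default_rng(1)
# Hypothetical correlated items: two latent variables drive three items each
latent = rng.normal(size=(500, 2))
correlated = latent[:, [0, 0, 0, 1, 1, 1]] + rng.normal(scale=0.5, size=(500, 6))
# Hypothetical uncorrelated items: nothing to factor
independent = rng.normal(size=(500, 6))

stat_corr, dof = bartlett_sphericity(correlated)
stat_indep, _ = bartlett_sphericity(independent)
print(round(stat_corr, 1), round(stat_indep, 1), dof)
```

For uncorrelated variables the determinant of R is close to 1, its logarithm is close to 0, and the statistic stays small, which is the case in which a factor analysis is not worthwhile.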
- Kaiser-Meyer-Olkin (KMO) test for “sampling adequacy”
- Reports the proportion of variance across variables that is shared (with at least one other variable), relative to the total variance (= shared variance + sum of variance unique to each variable)
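A sketch of the overall KMO value, assuming the usual formula: the sum of squared correlations divided by the sum of squared correlations plus squared partial correlations (off-diagonal elements only). The partial correlations are taken from the inverse of the correlation matrix. The function name and the simulated data are illustrative assumptions.

```python
import numpy as np

def kmo_overall(data):
    """Overall Kaiser-Meyer-Olkin measure, between 0 and 1."""
    corr = np.corrcoef(data, rowvar=False)
    inverse = np.linalg.inv(corr)
    # Partial correlation of i and j, controlling for all other variables
    scale = np.sqrt(np.outer(np.diag(inverse), np.diag(inverse)))
    partial = -inverse / scale
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    shared = np.sum(corr[off_diagonal] ** 2)
    unique = np.sum(partial[off_diagonal] ** 2)
    return shared / (shared + unique)

# Hypothetical data: two latent variables drive three items each
rng = np.random.default_rng(2)
latent = rng.normal(size=(500, 2))
data = latent[:, [0, 0, 0, 1, 1, 1]] + rng.normal(scale=0.5, size=(500, 6))

kmo = kmo_overall(data)
print(round(kmo, 2))
```

Values closer to 1 mean that most correlation is shared among several variables rather than confined to isolated pairs.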
Number of variables over factors, participants (or observations) etc…
· N (participants) / P (items) : between 5:1 and 10:1 recommendable; minimum 100 participants
· P (items) / M (factors) : 4:1
· N (subjects) / M (factors): 6:1
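The three ratios are simple divisions; a small helper makes them easy to check for a planned study. The function name and the example numbers (120 participants, 20 items, 4 factors) are hypothetical, and the helper only reports the ratios without judging them.

```python
def sample_size_ratios(n_participants, n_items, n_factors):
    """Return the N/P, P/M and N/M ratios plus the minimum-N check."""
    return {
        "N/P": n_participants / n_items,      # participants per item
        "P/M": n_items / n_factors,           # items per factor
        "N/M": n_participants / n_factors,    # participants per factor
        "N>=100": n_participants >= 100,      # minimum of 100 participants
    }

ratios = sample_size_ratios(120, 20, 4)
print(ratios)
```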
- Cross-loadings
· Ideally, there are not too many cross-loadings.
· A cross-loading is present when a variable has factor loadings of >0.3 on more than one factor, unless the difference to the highest loading is at least 0.2 (e.g. loading on F1: 0.7 and on F2: 0.4, then not considered a problem)
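A sketch of this check in plain Python. It follows the example on the card (0.7 vs 0.4 is not a problem) and therefore assumes that a gap of at least 0.2 between the two highest loadings clears a variable. The function name, the parameter names and the example loadings are illustrative assumptions.

```python
def problematic_cross_loadings(loadings, threshold=0.3, min_gap=0.2):
    """Return the row indices of variables with a problematic cross-loading."""
    flagged = []
    for index, row in enumerate(loadings):
        sizes = sorted((abs(value) for value in row), reverse=True)
        salient = [size for size in sizes if size > threshold]
        # Problematic: two or more salient loadings that are close together
        if len(salient) >= 2 and sizes[0] - sizes[1] < min_gap:
            flagged.append(index)
    return flagged

example = [
    [0.7, 0.4],   # gap of about 0.3: not considered a problem
    [0.5, 0.45],  # gap of 0.05: problematic cross-loading
    [0.8, 0.1],   # only one salient loading
]
flagged = problematic_cross_loadings(example)
print(flagged)
```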
- Communalities h²
Indicates how much of the variance of one (original, empirically measured) variable is explained by all extracted factors together
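For orthogonal (uncorrelated) factors, the communality of a variable is the sum of its squared loadings across all extracted factors. A minimal sketch under that assumption; the function name and the example loadings are hypothetical, and for oblique rotations the factor correlations would also have to be taken into account.

```python
def communalities(loadings):
    """Sum of squared loadings per variable (orthogonal factors assumed)."""
    return [sum(value ** 2 for value in row) for row in loadings]

# Hypothetical loadings of two variables on two factors
h2 = communalities([[0.7, 0.4], [0.8, 0.1]])
print([round(value, 2) for value in h2])
```

Here both variables end up with the same communality (0.49 + 0.16 and 0.64 + 0.01), even though their loading patterns differ.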
- Communality & Sample Size
· All communalities > 0.6: N ≥ 100 is sufficient
· Communalities ≈ 0.5 & only a few factors: 100 < N < 200 is sufficient
· Communalities <0.5 & many factors: N>500 needed