M8 - EFA Flashcards
Why do we need to know about factor analysis?
- 5% of recent papers in Australian Journal of Psychology rely on FA or multivariate design
- need to be able to understand results of FA and critically assess scientific articles even as a practitioner
- multivariate analysis also relies on FA
What is exploratory factor analysis and what types are there?
Factor analysis is a statistical analysis process that examines the relationships between measured variables and their relationships with underlying latent factors.
Types of EFA fitting techniques
- Principal Components
- -> estimates components that account for 100% of TOTAL variance in variables
- Factor Techniques
- -> estimates factors that account for 100% of SHARED variance between variables
- -> Least Squares - minimises the difference between data and FA model
- -> Maximum Likelihood - finds most probable FA
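The total versus shared variance distinction can be sketched with a small correlation matrix. This is a minimal illustration, assuming NumPy is available; the matrix R is hypothetical, and squared multiple correlations (SMC) are used here as one common starting estimate of shared variance, not necessarily the one your software uses.

```python
import numpy as np

# Hypothetical correlation matrix for three measured variables.
R = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])

# Principal components: analyse R as it is, with 1s on the diagonal,
# so all of the variance in every variable is analysed.
total_variance = np.trace(R)

# Factor techniques: replace the diagonal with communality estimates,
# so only variance shared with the other variables is analysed.
# SMC = 1 - 1 / (diagonal of the inverse correlation matrix).
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
R_reduced = R.copy()
np.fill_diagonal(R_reduced, smc)
shared_variance = np.trace(R_reduced)

print("variance analysed by principal components:", total_variance)
print("variance analysed by factor techniques:", round(shared_variance, 3))
```

The shared variance is always smaller than the total, because each variable also carries unique and error variance.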
What can exploratory factor analysis be used for?
It helps us to examine the relationships among multiple variables at once, and their relationships to latent variables, unlike multiple regression (MR)
What is meant by Dimensions?
Dimension refers to the number of variables (k)
Each variable contributes a dimension to the analysis
What is meant by Eigenvectors?
Eigenvector refers to factor direction
Factors relate to the relationship between two or more variables, so the eigenvector is the direction of this relationship
There are at most as many eigenvectors as there are dimensions/variables
What is meant by Eigenvalues?
Eigenvalue refers to the factor variance (length of the vector)
Sum total of the eigenvalues = number of variables (when a correlation matrix is analysed)
Eigenvalue = 1: the factor explains as much variance as one average variable
> 1 better than average
< 1 worse than average
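A quick check of these properties, as a sketch assuming NumPy and a hypothetical three-variable correlation matrix R:

```python
import numpy as np

# Hypothetical correlation matrix for k = 3 variables.
R = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])
k = R.shape[0]

# eigh is meant for symmetric matrices and returns eigenvalues in
# ascending order, so reverse both outputs to put the largest first.
eigenvalues, eigenvectors = np.linalg.eigh(R)
eigenvalues = eigenvalues[::-1]
eigenvectors = eigenvectors[:, ::-1]

print("number of variables:", k)
print("sum of eigenvalues:", round(eigenvalues.sum(), 6))

for number, value in enumerate(eigenvalues, start=1):
    verdict = "better than average" if value > 1 else "worse than average"
    print(f"factor {number}: eigenvalue {value:.3f} ({verdict})")

# Each column of eigenvectors is the direction belonging to one eigenvalue.
print("direction of first factor:", np.round(eigenvectors[:, 0], 3))
```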
What the basic requirements of a factor analysis are.
Sample size
n = 50 very poor - can’t generalise to wider population
n = 1000 excellent
n = 300 guide for minimum required
Monte Carlo testing reveals 5000 cases provide robust solution
Missing data - deal with prior to analysis
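1) pairwise
- a case is dropped only from the correlations that involve the variable it is missing
- keeps more data, but each correlation rests on a different subset of cases, which can bias results
2) listwise
- any case with missing data is deleted from the whole analysis
- every correlation is then based on the same cases
- but if the data are not missing at random, both approaches can give biased results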
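The two deletion approaches can be compared on a toy data set. A minimal sketch, assuming NumPy; the scores are made up, and np.nan marks a missing value. Here listwise means a case with any missing value is dropped entirely, and pairwise means a case is dropped only from correlations involving the variable it lacks.

```python
import numpy as np

# Hypothetical scores: 6 cases (rows) on 3 variables (columns).
data = np.array([
    [4.0, 5.0, 3.0],
    [2.0, np.nan, 1.0],
    [5.0, 4.0, 4.0],
    [3.0, 3.0, np.nan],
    [1.0, 2.0, 2.0],
    [4.0, 4.0, 5.0],
])
k = data.shape[1]

# Listwise: keep only the cases that are complete on every variable.
complete = data[~np.isnan(data).any(axis=1)]
listwise_n = complete.shape[0]
listwise_R = np.corrcoef(complete, rowvar=False)

# Pairwise: for each pair of variables, use every case that has both scores.
pairwise_R = np.eye(k)
pairwise_n = np.zeros((k, k), dtype=int)
for i in range(k):
    for j in range(k):
        both = ~np.isnan(data[:, i]) & ~np.isnan(data[:, j])
        pairwise_n[i, j] = both.sum()
        if i != j:
            pairwise_R[i, j] = np.corrcoef(data[both, i], data[both, j])[0, 1]

print("listwise: cases used for every correlation =", listwise_n)
print("pairwise: cases used per correlation =")
print(pairwise_n)
```

The pairwise counts differ from cell to cell, which is why a pairwise correlation matrix can be internally inconsistent.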
Replace missing data
- regression replacement - tends to reduce variance
- expectation maximisation (EM) Replacement - tends to confirm assumptions about the data
- multiple Imputation - MOST ROBUST, technically difficult, tends to confirm assumptions
Assumptions
- Normality constraints
- Multivariate normality (across all variables)
- -> Mardia’s coefficient
- -> Mahalanobis distance
- Linearity between variables - transform or use non-linear stats
- Outliers - should be removed or reduced
- Multicollinearity and singularity
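Mahalanobis distance can be used to screen for multivariate outliers. A minimal sketch, assuming NumPy; the data are simulated and one outlying case is planted deliberately. In practice the squared distances are usually compared against a chi-square critical value with degrees of freedom equal to the number of variables.

```python
import numpy as np

# Simulated data: 200 cases on 3 variables, plus one planted outlier.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 3))
data[0] = [6.0, -6.0, 6.0]

# Squared Mahalanobis distance of each case from the centroid,
# taking the covariance between variables into account.
centre = data.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(data, rowvar=False))
diff = data - centre
d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)

worst = int(np.argmax(d2))
print("most extreme case:", worst)
print("its squared distance:", round(float(d2[worst]), 2))
```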
When one should stop extracting factors.
The total number of factors can theoretically max out at the total number of variables - but we are trying to reduce the data
Tools to decide - different researchers use different tools
- Variance explained (eigenvalue > some value eg .6)
- Variance explained (eigenvalue > 1) - 1 is the mean eigenvalue
- Residual correlations - not > .2, ideally not > .05
- scree plot - stop at or before the elbow
- specify a fixed # of factors
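Several of these tools work directly from the list of eigenvalues. A small sketch in plain Python, using hypothetical eigenvalues for six variables:

```python
# Hypothetical eigenvalues for six variables (they sum to 6).
eigenvalues = [2.8, 1.4, 0.7, 0.5, 0.4, 0.2]
k = len(eigenvalues)

# Rule: keep factors whose eigenvalue is above 1 (the mean eigenvalue).
n_above_one = sum(1 for value in eigenvalues if value > 1)

# Proportion of total variance explained by each factor, and the running total.
proportions = [value / k for value in eigenvalues]
cumulative = []
running = 0.0
for share in proportions:
    running += share
    cumulative.append(running)

# Drops between successive eigenvalues: what a scree plot shows visually.
# The elbow is where the drops become small and roughly level.
drops = [eigenvalues[i] - eigenvalues[i + 1] for i in range(k - 1)]

print("factors with eigenvalue > 1:", n_above_one)
print("cumulative variance explained:", [round(c, 2) for c in cumulative])
print("drops between eigenvalues:", [round(d, 2) for d in drops])
```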
How to assess if the factor analysis has worked.
- Goodness of fit
chi2 significant means not a good fit
- Check Reproduced Correlation matrix
If prime diagonal values are the same as in the communalities matrix (R2)
Then look at correlations off the main diagonal and compare them to the original correlation matrix
- Look at residuals
> .2 not enough factors
> .05 might consider adding factor
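The reproduced correlation matrix and its residuals can be computed by hand from the loadings. A minimal sketch, assuming NumPy; both the correlation matrix R and the one-factor loadings are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical observed correlation matrix.
R = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])

# Hypothetical loadings of the three variables on a single factor.
loadings = np.array([[0.85], [0.70], [0.60]])

# Reproduced correlations: loadings multiplied by their transpose.
reproduced = loadings @ loadings.T

# The main diagonal of the reproduced matrix holds the communalities.
communalities = np.diag(reproduced)

# Residuals: observed minus reproduced, looking off the diagonal only.
residuals = R - reproduced
off_diagonal = residuals[~np.eye(R.shape[0], dtype=bool)]
largest_residual = np.abs(off_diagonal).max()

print("communalities:", np.round(communalities, 3))
print("largest residual:", round(float(largest_residual), 3))
```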
How to make the factor analysis fit better - rotations.
Rotation helps to change the balance of the variance explained between the factors. It doesn’t increase the variance.
- orderly movement of the axes such that they are uncorrelated
- improves the interpretation
- moves the coordinates not the actual data
Different types of rotations
Orthogonal types
- to deal with multicollinearity
- Quartimax - reduces number of factors to explain each variable
- Varimax (most common) - reduces # of variables with high loadings - simplified interpretation
- Equamax - combination of above two
Oblique types
- Direct oblimin
- Promax
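A rotation can be sketched directly in NumPy. This is one widely circulated SVD-based formulation of varimax, offered as an illustration and not as the routine any particular package uses; the unrotated loadings are hypothetical. It shows that rotation redistributes variance between factors without changing each variable's communality.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation; returns rotated loadings and the rotation matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient-like target matrix for the varimax criterion.
        target = rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        # Stop once the criterion no longer improves meaningfully.
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation, rotation

# Hypothetical unrotated loadings: 4 variables on 2 factors.
unrotated = np.array([
    [0.7, 0.5],
    [0.6, 0.6],
    [0.7, -0.5],
    [0.6, -0.6],
])

rotated, rotation = varimax(unrotated)

print("rotated loadings:")
print(np.round(rotated, 3))
print("communalities before:", np.round((unrotated ** 2).sum(axis=1), 3))
print("communalities after: ", np.round((rotated ** 2).sum(axis=1), 3))
```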
What is KMO and Sphericity?
When running an EFA, the KMO and Sphericity are tests to help justify the running of the FA
Run FA and Check Output
- KMO (Kaiser-Meyer-Olkin) test
> .5 then doing FA is justified
- Bartlett's test of sphericity tests the significance of the difference between the correlation matrix values and a hypothetical situation where all correlations are 0
p < .05 then this justifies doing an FA
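Both statistics can be computed from the correlation matrix alone. A minimal sketch, assuming NumPy; the matrix R and the sample size n are hypothetical. The p value for Bartlett's test is not computed here; it would come from comparing the statistic against a chi-square distribution with the degrees of freedom shown.

```python
import numpy as np

# Hypothetical correlation matrix and sample size.
R = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])
n = 200
k = R.shape[0]
off_diagonal = ~np.eye(k, dtype=bool)

# KMO: squared correlations relative to squared correlations plus
# squared partial correlations (taken from the inverse of R).
R_inv = np.linalg.inv(R)
scale = np.sqrt(np.outer(np.diag(R_inv), np.diag(R_inv)))
partial = -R_inv / scale
sum_r2 = (R[off_diagonal] ** 2).sum()
sum_partial2 = (partial[off_diagonal] ** 2).sum()
kmo = sum_r2 / (sum_r2 + sum_partial2)

# Bartlett's test of sphericity: how far R is from an identity matrix,
# i.e. from the situation where every correlation is 0.
chi2 = -(n - 1 - (2 * k + 5) / 6) * np.log(np.linalg.det(R))
df = k * (k - 1) // 2

print("KMO:", round(float(kmo), 3))
print("Bartlett chi-square:", round(float(chi2), 2), "on", df, "df")
```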