Exploratory Factor Analysis Flashcards
What techniques are utilised for identifying clusters of variables?
Factor analysis and principal component analysis (PCA)
What three uses do these techniques have?
- Understanding the structure of a set of variables
- Constructing a questionnaire to measure an underlying variable
- Reducing a dataset to a more manageable size while retaining as much of the original information as possible
How does factor analysis attempt to achieve parsimony?
By explaining the maximum amount of common variance in a correlation matrix using the smallest number of explanatory constructs (latent variables)
How does this differ to PCA?
PCA attempts to explain the maximum amount of total variance in a correlation matrix by transforming the original variables into linear components
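A minimal NumPy sketch of the PCA side of this contrast, assuming made-up data (the sample size, weights and noise level are illustrative, not from the flashcards). It decomposes a correlation matrix into components and shows that the eigenvalues together account for the total variance, which for a correlation matrix equals the number of variables.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 respondents, 4 variables that share one source plus noise
shared = rng.normal(size=(200, 1))
X = shared @ np.array([[0.8, 0.7, 0.6, 0.5]]) + rng.normal(scale=0.6, size=(200, 4))

R = np.corrcoef(X, rowvar=False)        # 4 x 4 correlation matrix

# eigh is for symmetric matrices and returns eigenvalues in ascending order
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]       # re-sort so the largest component comes first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

total_variance = eigvals.sum()          # trace of R, i.e. the number of variables
explained = eigvals / total_variance    # proportion of total variance per component
```

Factor analysis would instead work only with the variance the variables share, which is why its estimates depend on communalities rather than on the full diagonal of the correlation matrix.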
What is meant by the term factor loading?
A factor loading refers to the coordinate of a variable along a classification axis (e.g. Pearson correlation between factor and variable)
What does a factor loading tell us?
It tells us something about the relative contribution that a variable makes to a factor.
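A small sketch of the "loading as a correlation" idea, using PCA as a stand-in because its loadings have a closed form (eigenvector scaled by the square root of its eigenvalue). The data are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 300 respondents, 3 variables with unequal ties to a shared source
shared = rng.normal(size=(300, 1))
X = shared @ np.array([[0.9, 0.8, 0.3]]) + rng.normal(scale=0.5, size=(300, 3))

Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardised variables
R = np.corrcoef(X, rowvar=False)

eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

loadings = eigvecs * np.sqrt(eigvals)   # each column scaled by sqrt of its eigenvalue
scores = Z @ eigvecs                    # component scores for every respondent

# Pearson correlation between each variable and the first component
corr_with_first = np.array(
    [np.corrcoef(Z[:, j], scores[:, 0])[0, 1] for j in range(Z.shape[1])]
)
```

Here `corr_with_first` matches `loadings[:, 0]`, so a variable with a larger absolute loading contributes more to that component.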
How are scores on the measured variables predicted in factor analysis?
From the means of those variables plus the person’s scores on the common factors multiplied by their factor loadings, plus scores on any unique factors within the data
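The prediction described above can be sketched for a single person as follows. All numbers (means, loadings, factor scores) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np

means = np.array([3.0, 2.5, 4.0])        # hypothetical means of 3 measured variables
loadings = np.array([[0.8, 0.1],
                     [0.7, 0.2],
                     [0.1, 0.9]])        # 3 variables x 2 common factors (made up)
common = np.array([1.2, -0.5])           # one person's scores on the common factors
unique = np.array([0.05, -0.10, 0.20])   # that person's scores on the unique factors

# mean + (common factor scores weighted by loadings) + unique part
predicted = means + loadings @ common + unique
```

For the first variable this is 3.0 + (0.8 × 1.2 + 0.1 × -0.5) + 0.05 = 3.96.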
What is meant by common factors?
Factors that explain the correlations between variables
How are components predicted in PCA?
In PCA, the components are predicted from the measured variables.
Name a major assumption of factor analysis
One major assumption of factor analysis is that the algebraic factors represent real-world dimensions.
How is the weighted average calculated after PCA?
By multiplying a person's scores on the measured variables by the corresponding factor loadings and summing the results
Why is the weighted average rarely used?
Because it is overly simplistic and is influenced by the measurement scales of the variables
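A quick sketch of why the measurement scale matters, assuming hypothetical loadings and one person's answers. The same top-of-scale response is recorded once on a 1-5 scale and once on a 0-100 scale.

```python
import numpy as np

loadings = np.array([0.8, 0.6, 0.4])      # hypothetical loadings on one factor

# All three items answered on a 1-5 scale
scores_a = np.array([4.0, 3.0, 5.0])
weighted_a = np.sum(scores_a * loadings)  # 3.2 + 1.8 + 2.0

# Same person, but the third item is recorded on a 0-100 scale instead
scores_b = np.array([4.0, 3.0, 100.0])
weighted_b = np.sum(scores_b * loadings)  # 3.2 + 1.8 + 40.0
```

The third item has the smallest loading, yet it dominates `weighted_b` purely because of its scale.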
What is the simplest technique for calculating factor score coefficients?
The regression technique
What do the mean and variance look like using the regression technique?
The resulting factor scores have a mean of 0 and a variance equal to the squared multiple correlations between the estimated factor scores and the true factor values.
What is a downside to the regression model?
A downside is that the scores can correlate with factor scores from a different factor, even when the factors are orthogonal
What can be done to overcome this problem of the regression model? (2)
The Bartlett method and the Anderson-Rubin method can be used to overcome this problem. The Bartlett method produces factor scores that are unbiased and that correlate only with their own factor. The Anderson-Rubin method produces factor scores that are uncorrelated and standardized.
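A hedged sketch comparing the three scoring methods on a made-up orthogonal two-factor model. The loadings are hypothetical, and the weight formulas are the commonly cited textbook forms for orthogonal factors, not something stated in these flashcards: regression uses R^-1 L, Bartlett uses (L' P^-1 L)^-1 L' P^-1, and Anderson-Rubin uses (L' P^-1 R P^-1 L)^-1/2 L' P^-1, where L holds the loadings and P is the diagonal matrix of unique variances.

```python
import numpy as np

# Hypothetical loadings: 6 variables on 2 orthogonal factors
L = np.array([[0.8, 0.0], [0.7, 0.1], [0.6, 0.2],
              [0.1, 0.7], [0.0, 0.8], [0.2, 0.6]])
psi = 1.0 - np.sum(L**2, axis=1)      # unique variances, so the diagonal of R is 1
R = L @ L.T + np.diag(psi)            # model-implied correlation matrix

G = L / psi[:, None]                  # P^-1 L (each row divided by its unique variance)

# Regression weights (2 x 6) and the implied covariance of the estimated scores
W_reg = np.linalg.solve(R, L).T
cov_reg = W_reg @ R @ W_reg.T         # off-diagonal is non-zero: scores correlate

# Bartlett weights: (L' P^-1 L)^-1 L' P^-1
W_bartlett = np.linalg.solve(L.T @ G, G.T)

# Anderson-Rubin weights: (L' P^-1 R P^-1 L)^-1/2 L' P^-1
M = G.T @ R @ G
w, V = np.linalg.eigh(M)              # M is symmetric positive definite
M_inv_sqrt = V @ np.diag(w**-0.5) @ V.T
W_ar = M_inv_sqrt @ G.T
cov_ar = W_ar @ R @ W_ar.T            # identity: uncorrelated, unit variance
```

In this sketch `W_bartlett @ L` is the identity matrix, which is the sense in which each Bartlett score relates only to its own factor, while `cov_ar` is the identity matrix, which is the Anderson-Rubin property of uncorrelated, standardized scores.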
How is the matrix of factor score coefficients obtained?
By multiplying the inverse of the original correlation matrix (the R-matrix) by the matrix of factor loadings; conceptually, this is like dividing the factor loadings by the correlation coefficients
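The calculation above can be sketched with NumPy. The loadings matrix `A` is hypothetical and is treated as if it had come from a fitted factor model; the data are simulated from that same model so the example is self-contained.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical loadings: 4 variables on 2 orthogonal factors
A = np.array([[0.8, 0.0], [0.7, 0.1], [0.1, 0.7], [0.0, 0.8]])
psi = 1.0 - np.sum(A**2, axis=1)                  # unique variances

# Simulate 500 respondents from the common factor model
n = 500
factors = rng.normal(size=(n, 2))
unique = rng.normal(size=(n, 4)) * np.sqrt(psi)
X = factors @ A.T + unique

Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardised variables
R = np.corrcoef(X, rowvar=False)                  # the R-matrix

# Factor score coefficients B = R^-1 A, solved without forming the inverse explicitly
B = np.linalg.solve(R, A)
scores = Z @ B                                    # regression-method factor scores

score_means = scores.mean(axis=0)                 # zero, up to rounding error
score_vars = scores.var(axis=0, ddof=1)           # equals diag(A' R^-1 A), below 1 here
```

Using `np.linalg.solve` rather than `np.linalg.inv` is a numerical choice only; both express the same R^-1 A product.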
When is the Anderson-Rubin method best?
When uncorrelated scores are required
What does the method for discovering factors depend on?
(1) Whether the results should be generalized from the sample to the population, and (2) whether you are exploring your data or testing a specific hypothesis.