Data Reduction: Exploratory Factor Analysis (EFA) Flashcards
What is the difference between PCA and EFA?
in PCA we don’t care why our variables are correlated, our only goal is to reduce the number of variables
in EFA we believe there are underlying causes as to why our variables are correlated and we have 2 goals:
1) reduce the number of variables
2) learn about and model the underlying (latent) causes of variables
What are latent variables?
latent variables are a theorised common cause of responses to a set of variables
- they explain correlations between measured variables
- their existence is assumed (held to be true)
- there is no direct test of this theory
PCA does not have latent variables
Practical steps
How do we move from data and correlations to EFA?
1) check the appropriateness of the data and decide the appropriate estimator
2) decide which methods to use to select the number of factors (same methods as for PCA)
3) decide conceptually whether to apply rotation and how to do so
4) decide criteria to assess and modify your solution
5) run the analysis
6) evaluate the solution (decided in step 4)
7) select a final solution and interpret the model, labelling the results
8) report your results
Interpreting EFA output
Factor loadings
the loading table has one column per factor (e.g. M1, M2) and one row per measured variable, with the loadings as the numbers
the numbers show the relationship of each measured variable to each factor
we interpret our factor models by the pattern and size of these loadings
Interpreting EFA output
Factor loadings
What are primary loadings
= a variable's highest loading, i.e. its loading on the factor it loads on most strongly
Interpreting EFA output
Factor loadings
What are cross-loadings?
= all the other factor loadings (everything except the primary loading) for a given measured variable
Interpreting EFA output
h2
= communality = explained item variance
the squared factor loadings (summed across factors) tell us how much of an item's variance is explained
Interpreting EFA output
u2
= uniqueness
= unexplained item variance (1 − h2)
Interpreting EFA output
com
= item complexity; roughly, how many factors an item loads substantially on (values near 1 mean the item loads mainly on one factor)
Interpreting EFA output
SS loadings
= the sum of squared loadings for each factor (same as PCA)
they give the total variance in the items accounted for by that factor; dividing by the number of items gives the proportion of variance explained
Differences in PCA vs EFA
Dependent variable
PCA = the component
EFA = observed measures
Differences in PCA vs EFA
Independent variable
PCA = observed measures (x1, x2 …)
EFA = the factor (each item is regressed on the factor)
Differences in PCA vs EFA
Aim
PCA = explains as much variance in the measures (x1, x2, …) as possible
EFA = models the relationship (correlation) between the variables
Differences in PCA vs EFA
Components vs factors
PCA = components = are determinate (there is only one solution for the components)
EFA = factors = are indeterminate (there are infinitely many factor solutions that fit a given dataset equally well)
What does it mean to model the data in EFA
EFA tries to explain patterns of correlations
e.g. if there is a correlation between y1 and y2 due to some factor, then controlling for (partialling out) that factor should leave no correlation
if the model (the factors) is good, it will explain all the interrelationships between the items
What is the modification index?
it is the software (e.g. R) flagging: "the data show a correlation here that your model says is not there"
Variance in EFA
total variance
total variance = common variance + specific variance + error variance
Variance in EFA
true variance
common variance + specific variance
common variance = variance common to one item and at least one other item
specific variance = variance specific to an item that is not shared with any other items
Variance in EFA
unique variance
specific variance + error variance
specific variance = variance specific to an item that is not shared with any other items
error variance = random noise
EFA assumptions
1) the residuals/error terms should be uncorrelated
2) the residuals/error terms should not correlate with factors
3) relationships between items and factors should be linear (there are models that can account for non-linear relationships)
Data suitability
How do we know our data is suitable for EFA?
this boils down to: “is the data correlated?”
so initially we inspect the correlations to check they are at least moderate (>0.2)
Data suitability
squared multiple correlations (SMC)
this is another way to check our data is suitable for EFA
SMC = tells us how much of the variance in an item is explained by all the other items
the SMC for an item is the squared multiple correlation (R²) from regressing that item on the other (p-1) items
- this tells us how much variation is shared between an item and all the other items
this is one way to estimate communalities
Estimating EFA
What do we do?
for PCA we use eigendecomposition, but this is not an estimation method, it is simply a calculation
as we have a model for the data in EFA, we need to estimate the model parameters (primarily the factor loadings)
Estimating EFA
what are communalities?
communalities = estimates of how much of a variable's variance is common (shared with the other variables)
therefore they also indicate how much variance in an item is explained by the other variables
if we consider that EFA is trying to explain common variance, then communalities are more useful to us than total variance
Estimating EFA
Estimating communalities
Difficulties
Estimating communalities is hard as population communalities are unknown
- they range from 0 (no shared variance) to 1 (all variance is shared)
- Occasionally estimates will be >1 (Heywood cases)
- methods of estimation are often iterative and ‘mechanical’
Estimating EFA
Methods of estimating communalities
Principal axis factoring (PAF)
this approach uses SMCs as initial estimates of the communalities on the diagonal of the correlation matrix
1) compute initial communality estimates from the SMCs
2) eigendecomposition = once we have these reasonable lower bounds, we substitute the 1s on the diagonal of our correlation matrix with the SMCs from step 1
3) obtain the factor loadings from the eigenvalues and eigenvectors of the reduced matrix obtained in step 2
some versions of PAF iterate: replace the diagonal with the communalities implied by the loadings from step 3, redo step 3, replace the diagonal again, and so on until the estimates stabilise
Estimating EFA
Methods of estimating communalities
Method of minimum residuals (MINRES)
this is an iterative approach and the default of the fa() procedure in R
rather than working on the diagonal, it minimises the off-diagonal residual correlations
1) starts with some other solution e.g. PCA or principal axes, extracting a set number of factors
2) adjust the loadings of all factors on each variable so as to minimize the residual correlations for that variable
Estimating EFA
Methods of estimating communalities
Maximum likelihood estimation (MLE)
this is generally the best estimation method BUT it doesn't always work (the other two will work no matter how bad your data are)
the procedure finds values for the model parameters (the factor loadings) that maximise the likelihood of obtaining the observed covariance matrix
Estimating EFA
Methods of estimating communalities
Maximum likelihood estimation (MLE)
Advantages
- provides numerous 'fit' statistics that you can use to evaluate how well your model fits the data and to compare competing models
- MLE assumes a distribution for your data (e.g. normal distribution)
Estimating EFA
Methods of estimating communalities
Maximum likelihood estimation (MLE)
Disadvantages
- it is sometimes not possible to find a set of factor loadings that maximises the likelihood - this is referred to as non-convergence
- MLE may produce impossible values of factor loadings (e.g. Heywood cases) or factor correlations (e.g. >1)
*MLE assumes data is continuous and this is not always the case
How to select a number of factors to keep in EFA?
We use the same methods mentioned in PCA
- variance explained …
- scree plots
- MAP
- parallel analysis
Use all of these to decide a plausible number of factors
- use MAP as the minimum
- use parallel analysis as the maximum
Factor rotation
Why are factor solutions hard to interpret?
- the pattern of factor loadings is not always clear
- the difference between primary and cross loadings can be small
Factor rotation
What is rotational indeterminacy?
= it means that there are an infinite number of pairs of factor loadings and factor score matrices which will fit the data equally well and are thus indistinguishable by any numeric criteria
in other words - there is no one unique solution to the factor problem
this is why the theoretical coherence of a model plays a bigger role in EFA than PCA
Factor rotation
Analytic rotation
Rotation aims to maximise the relationship of a measured item with a factor
= make primary loadings big and cross loadings small
the unrotated loadings are often noisy and hard to see patterns in, so we rotate to simplify them
although we can't tell the rotated solutions apart numerically, we can select the rotation that gives the most coherent solution
Factor rotation
Simple structure
All factor rotations seek to optimize one or more aspects of simple structure:
1) each variable (row) should have at least one 0 loading
2) each factor (column) should have at least as many 0 loadings as there are factors
3) every pair of factors (columns) should have several variables which load on one factor but not the other
4) when four or more factors are extracted, each pair of factors should have a large proportion of variables that do not load on either factor
5) every pair of factors should have only a few variables that load on both factors
Factor rotation
Orthogonal rotation
Correlations between factors are 0
Axes are at right angles
Includes varimax and quartimax rotations
Factor rotation
Oblique rotation
This method is most recommended
Correlations between factors are NOT 0
this is useful because it is closer to reality, and since the analysis is exploratory there is no need to impose the constraint that factors are uncorrelated
Axes are NOT at right angles
Includes promax and oblimin rotations
Factor rotation
Oblique rotation interpretation
Pattern matrix
pattern matrix = matrix of regression weights (loadings) from factors to variables
Factor rotation
Oblique rotation interpretation
Structure matrix
structure matrix = matrix of correlations between factors and variables
Structure matrix = pattern matrix multiplied by factor correlations
(in orthogonal rotations, structure and pattern matrices are the same)
Evaluating results
Checking the results
start by examining how much variance each factor accounts for and the total amount of variance
we evaluate factors based on the size and sign (+/-) of the loadings deemed salient (generally loadings >0.3)
Evaluating results
Checking for trouble
REMEMBER - if you delete any items you must re-run FA starting from when you figure out how many factors to extract
*Heywood cases
= items with loadings >1 → this means something is wrong and you should not trust these results
*items with no salient loadings?
= could signal a problem item, which should be removed
= could signal an additional factor
*items with multiple salient loadings (cross-loadings)?
= indicated by item complexity values
*do any factors load on <3 items?
= 3 should be the minimum
= may have over-extracted
= might have too few items
Evaluating results
EFA checklist
✅ all factors load on 3+ items at salient levels
✅ all items have at least one loading above the salient cut off
✅ No Heywood cases
✅ complex items are removed (in accordance with the research goals)
✅ solution accounts for an acceptable level of variance (given in the research goals)
✅ item content of factors is coherent and substantively meaningful
Factor Congruence
Replicability
It is always good to test whether your study replicates well. This can be done by:
- collecting data on another sample
- splitting one large sample into two
then we can use one sample for exploratory analysis and the other for confirmatory analysis
There are numerous methods for this
Factor Congruence
Replicability
congruence coefficients
= correlations between vectors of factor loadings across samples
“how similar are the loadings for M1 across the two samples?”
“how similar are they for M2”
Calculating congruence:
1) run factor model on sample 1
2) run factor model on sample 2
* ensure the same items are included and same number of factors specified
3) Calculate congruence
Factor Congruence
Replicability
(Tucker’s) congruence coefficients
= measures similarity independent of the mean size of the loadings
it is insensitive to a change in the sign of any pair of loadings
Basics:
<0.68 = terrible
>0.9 = good
>0.98 = excellent
Factor Congruence
Replicability
Confirmatory FA (CFA)
This is the better solution
In EFA all factors load on all items - these loadings are purely data driven
In CFA we specify a model and test how well it fits the data
- we explicitly state which items relate to which factor
- we can test if the loadings are the same in different samples / groups / across time etc
Factor scores
What are factor scores?
they provide variables representing the factors we extracted in EFA, so we can use the constructs in further analyses
they use different pieces of information from the factor solution to compute a weighted score
- the scores are a combination of observations, factor loadings and factor correlations (method dependent)
Factor scoring
Types of scores
Sum scoring (unit weighting)
This is the simplest approach to factor scoring
= sum the raw scores on the observed variables which have primary loadings on each factor
- which items to sum is a matter of defining what loadings are salient
These require strict properties to be present in the data (but these are rarely tested)
Factor scoring
Types of scores
Ten Berge Scores
this is the preferred method
= focuses on producing factor scores whose correlations match the factor correlations
Factor scoring
Types of scores
Structural equation modelling (SEM)
= includes a measurement component (CFA) and a structural component (regression)
- doesn’t require you to compute factor scores
- requires good theory of measurement and structure
if your constructs don’t approximate simple structure you may have to turn to alternatives
How do you determine sample size in FA?
in the past, rules were based around the participant to item (N:p) ratio
BUT the crucial determinants are the communalities and the item-to-factor ratio (p:m)
fewer participants are needed if:
- communalities are high and wide
- p:m is high (e.g. 20:7) - communality level and p:m also interact
general rule of psychology = more is better
GIGO
garbage in = garbage out
always check the quality of your data
PCA and FA cannot turn bad data into good data
Reliability
Aim of measurement
= to develop and use measurements of constructs to test psychological theories
Reliability
Measurement
Classical test theory
describes data from any measure as a combination of:
- the signal of the construct / the ‘true score’
- noise or 'error': measurement of other, unintended things
observed score = true score + error
Reliability
Measurement
True score theory
if we assume about our test that:
1) it measures some ability or trait
2) in the world, there is a ‘true’ value or score for this test for each individual
then the reliability of the test is a measure of how well it reflects the true score
Reliability
Parallel tests
under certain assumptions (parallelism, tau equivalence, congeneric tests) the correlation between two parallel tests of the same construct provides an estimate of reliability
Parallel tests can come from several sources
Reliability
Sources of Parallel tests
Alternative forms of reliability
= correlation between two variants of a test
e.g. randomisation of stimuli, similar but not identical tests
Alternative tests (should) have equal means and variances
if the tests are perfectly reliable, then they should correlate perfectly
↳ since they won’t - this deviation provides the measure of reliability
alternatives can be expensive and time consuming - but they are becoming easier
Reliability
Sources of Parallel tests
Split-half reliability
= indicates how internally consistent a measure is
1) split the n items (randomly) into a pair of equal subsets
2) score the two subsets
3) correlate the scores
With an increasing number of items, the number of random splits becomes increasingly large
Reliability
Sources of Parallel tests
Split-half reliability
Cronbach’s alpha
the best known estimate for split half reliability is Cronbach’s alpha
- tells us “to what extent are observations consistent across items”
BUT it does not indicate whether items measure one unidimensional construct
Cronbach's alpha increases as you increase the number of items
this relationship between test length and reliability is described by the Spearman-Brown prophecy formula
Reliability
Sources of Parallel tests
Split-half reliability
McDonald’s Omega
Any item may measure:
- a general factor that loads on all items
- a group or specific factor that loads on a subset of items
Given this, we can derive two internal consistency measures
1) Omega hierarchical (ωh) = the proportion of item variance that is due to the general factor
2) Omega total (ωt) = the total proportion of reliable item variance
These are much more robust to the structure of your data and to how the measure will behave in practice
Reliability
Sources of Parallel tests
Test-retest reliability
= correlation between tests taken at (at least) two different points in time
poses tricky questions:
- what is the appropriate time between measures?
- how stable should the construct be if we are to consider it a trait?
Reliability
Sources of Parallel tests
Interrater reliability
= do all the raters involved give consistent ratings?
We can determine interrater reliability by means of intraclass correlation coefficients
Reliability
Sources of Parallel tests
Interrater reliability
Intraclass correlation coefficients
this splits variance of ratings into multiple components:
- variance between subjects (across targets)
- variance within subjects (across raters, same targets)
- variance due to raters (same rater, across targets)
Uses of reliability?
it is useful to know how reliable our measure is for:
- its implications for validity
- it also allows us to correct for attenuation (estimates of effects are limited by the reliability of the measures)
Validity
What is validity?
there are debates over the definition but basically:
it determines whether a test really measures what it is supposed to measure
debates over the definition lead to debates over what counts as evidence for validity
Evidence of validity …
… related to content
Content validity
= a test should contain only content relevant to the intended construct
= it should measure what it is intended to measure
Evidence of validity …
… related to content
Face validity
= does the test ‘appear’ to measure what it was designed to measure?
Evidence of validity …
… related to scale
Construct validity
= do the items measure a single intended construct
this is the most important
FA provides limited information towards this
Evidence of validity …
… relationships with other concepts
Convergent validity
= measure should have high correlations with other measures of the same construct
Evidence of validity …
… relationships with other concepts
Discriminant validity
= measure should have low correlations with measures of different constructs
Evidence of validity …
… relationships with other concepts
Nomological Net validity
= measure should show the expected pattern of (+/-) correlations with different sets of constructs
also, scores on some measures should change in expected ways in response to manipulations
Evidence of validity …
… relationships in terms of temporal sequence
Concurrent validity
= correlations with contemporaneous measures (tests done at the same time)
Evidence of validity …
… relationships in terms of temporal sequence
Predictive validity
= related to expected future outcomes (longitudinal)
Evidence of validity …
… related to response processes
not commonly considered in validation studies
e.g. do tests of intelligence engage ‘problem solving’ behaviours
Evidence of validity …
… related to consequences
= should the potential consequences of the test be considered part of the evidence for the test's validity?
Important considerations for the use of tests:
- is the measure systematically biased or is it fair for all groups of test takers?
- does bias have social ramifications?
Reliability vs Validity
reliability = relation of true score to observed score
validity = correlations with other measures plays a key role
A score/measure cannot correlate with anything more strongly than it correlates with itself; therefore reliability places an upper limit on validity