10. Testing scales and CFA Flashcards
What is CFA?
CFA is a statistical technique used to verify the factor structure of a set of observed variables. It allows the researcher to test the hypothesis that a relationship exists between observed variables and their underlying latent constructs.
- CFA is an extremely versatile method for testing different psychometric properties of a scale
- Psychometric properties- the quality of the scale, determining how we can use it in further research and how trustworthy the results we get with the scale are
- CFA confirms the specific structure of the scale- its dimensions
- CFA shows how reliable the indicators are- which are significant and how strongly they correlate with specific factors
- CFA confirms the integrity of the scale- that it is separated from other constructs
- CFA shows whether we can use the scale for different groups of respondents
Explain the structure of the CFA model
Estimated parameters of the model:
If we provide only the variance-covariance matrix of the items:
* Factor loadings- lambda λ
* Variances (and covariances) of errors- theta θ
* Factor variances (and covariances, if there is more than one factor)- phi φ; the latent factors themselves are denoted xi ξ
Can be extended by providing the means of the items beside the var-cov matrix:
* Means of latent factors- kappa κ
* Intercepts for each indicator/item- tau τ (hence tau-equivalence)
How to do model identification
In order to estimate a CFA the model has to be IDENTIFIED
* How much information are we feeding into the analysis?
* The variances and covariances of the items are known
* We estimate loadings, factor variances and error variances
* Identification formula: number of knowns b = p(p+1)/2, where p is the number of items
* The difference between knowns and unknowns is the degrees of freedom of the model- used when testing against the chi-squared distribution
* Underidentified model- knowns < unknowns
* Just identified model- knowns = unknowns
* Overidentified model- knowns > unknowns
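A quick arithmetic sketch of this counting rule for a hypothetical one-factor model with four items (marker loading fixed to 1); the numbers are only an illustration:

```python
# Counting rule for CFA identification (illustrative one-factor example)
p = 4                          # number of observed items
knowns = p * (p + 1) // 2      # non-redundant elements of the var-cov matrix = 10

# Free parameters when the first loading is fixed to 1 (marker variable):
free_loadings = p - 1          # 3 remaining loadings
error_variances = p            # one error variance per item
factor_variance = 1            # variance of the single factor
unknowns = free_loadings + error_variances + factor_variance   # = 8

df = knowns - unknowns         # degrees of freedom = 2 -> overidentified
print(knowns, unknowns, df)    # 10 8 2
```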
WHAT ABOUT THE SCALE OF THE FACTOR?
* Marker variable- one loading fixed to 1 serves as the reference for the factor scale
* Another way- set the variance of the factor to 1; this parameter is then not estimated
What does CFA do?
- Tries to estimate parameters (factor loadings) so that the sample var-cov matrix can be “reproduced” as closely as possible
- Very similar logic as in EFA- wrap up the relationships between multiple indicators into as few factors as possible
- In most models the var-cov matrix will never be completely reproduced by the CFA solution; the goal is to get as close as possible
- Maximizes the probability of observing the available data if it were to be drawn from the same population again (maximum likelihood estimation)
Why use CFA?
Testing errors and cross-loadings
Helps us refine scales, possibly drop items, and notice problems with phrasing
Allows for comparison of models (see the sketch after this list)
* Statistical evaluation of the fit with and without restrictions on the model
* Nested models- one model's freely estimated parameters are a subset of another's
* Fixing parts of the model to a specific value (e.g. zero)
* Constraining parts of the model (e.g. factor variances constrained to be equal)
* All of this is necessary when evaluating the quality of scales- for example tau-equivalence
Validity of the scale and checking for common method variance bias
* Restricting parts of the model between groups or methods
Checking whether the scale works the same for different groups of individuals
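As a sketch of how two nested models are compared in practice, the chi-squared difference test below assumes you already have the chi-squared values and degrees of freedom of an unrestricted and a restricted model; the numbers are made up purely for illustration:

```python
from scipy.stats import chi2

# Hypothetical fit of two nested models: the restricted model fixes or
# constrains some parameters of the unrestricted one (e.g. equal loadings).
chisq_unrestricted, df_unrestricted = 52.3, 24
chisq_restricted, df_restricted = 61.8, 29

# The difference in chi-squared is itself chi-squared distributed,
# with df equal to the difference in degrees of freedom.
delta_chisq = chisq_restricted - chisq_unrestricted
delta_df = df_restricted - df_unrestricted
p_value = chi2.sf(delta_chisq, delta_df)

# A small p-value means the restrictions significantly worsen the fit.
print(f"delta chi2 = {delta_chisq:.1f}, delta df = {delta_df}, p = {p_value:.3f}")
```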
Explain the logic of significance testing
In most models the var-cov matrix will never be completely reproduced by the CFA solution; the goal is to get as close as possible
CFA fitting function
F_ML = ln|Σ| − ln|S| + trace(SΣ⁻¹) − p
χ² = F_ML × (N − 1)
(S = sample var-cov matrix, Σ = model-implied var-cov matrix, p = number of items, N = sample size)
The classic old-school chi-squared test whether H0: Σ=S
* Serves as base for calculation of other indices and can be used to compare nested models
* Needed for the test: degrees of freedom and the critical value
* P-value- in null hypothesis significance testing, the p-value is the probability of obtaining test results at least as extreme as the results actually observed, under the assumption that the null hypothesis is correct
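A minimal numpy sketch of this fitting function, assuming you already have a sample var-cov matrix S and a model-implied matrix Σ (both matrices below are made up just to show the arithmetic):

```python
import numpy as np

# Hypothetical 3-item example: S is the sample var-cov matrix,
# Sigma the var-cov matrix implied by the estimated CFA parameters.
S = np.array([[1.00, 0.45, 0.40],
              [0.45, 1.00, 0.50],
              [0.40, 0.50, 1.00]])
Sigma = np.array([[1.00, 0.44, 0.42],
                  [0.44, 1.00, 0.48],
                  [0.42, 0.48, 1.00]])
N = 300          # sample size
p = S.shape[0]   # number of observed items

# F_ML = ln|Sigma| - ln|S| + trace(S Sigma^-1) - p
F_ML = (np.log(np.linalg.det(Sigma)) - np.log(np.linalg.det(S))
        + np.trace(S @ np.linalg.inv(Sigma)) - p)

chi_sq = F_ML * (N - 1)   # compared against chi-squared with the model's df
print(F_ML, chi_sq)
```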
Explain goodness of fit evaluation
The classic old-school chi-squared test whether H0: Σ=S
* The chi-squared distribution is not a great approximation in many instances (small datasets)
* For big datasets the statistic is severely inflated- the hypothesis of exact fit is practically never true, so it is almost always rejected
* Serves as base for calculation of other indices and can be used to compare nested models
OTHER GOODNESS OF FIT MEASURES
* SRMR- standardized root mean square residual- absolute fit
* Hu and Bentler (1999) Cut off point- lower than or close to 0.08
* RMSEA- root mean square error of approximation
* Hu and Bentler (1999) Cut off point- lower than or close to 0.06
* CFI-comparative fit index
* TLI- Tucker-Lewis index
* Hu and Bentler (1999) Cut off point- higher than or close to 0.95
AIC- Akaike information criterion and BIC- Bayesian information criterion
* No cut-off points, but smaller is better
* Can compare non-nested models
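As a rough illustration of how some of these indices are derived from the chi-squared statistics of the fitted model and the baseline (independence) model, here is a sketch with made-up numbers; in practice the software reports these directly:

```python
import math

# Hypothetical chi-squared results for the fitted CFA model and the baseline
# (independence) model, plus the sample size.
chisq_m, df_m = 52.3, 24      # target CFA model
chisq_b, df_b = 880.0, 36     # baseline model with uncorrelated items
N = 300

# RMSEA: misfit per degree of freedom, adjusted for sample size
rmsea = math.sqrt(max(chisq_m - df_m, 0) / (df_m * (N - 1)))

# CFI: relative improvement over the baseline model, bounded between 0 and 1
cfi = 1 - max(chisq_m - df_m, 0) / max(chisq_b - df_b, chisq_m - df_m, 1e-12)

# TLI: also compares against the baseline, penalizing complexity via chi2/df
tli = ((chisq_b / df_b) - (chisq_m / df_m)) / ((chisq_b / df_b) - 1)

print(f"RMSEA = {rmsea:.3f}, CFI = {cfi:.3f}, TLI = {tli:.3f}")
```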
How to go from EFA to CFA
Take the same data as in EFA, but the other half of the sample (in the application video, Alexander looks at only 2 factors for simplicity, but you could easily test all three, exactly as they came out of EFA)
* You expect the three-factor structure based on EFA- set F5, F9 and F13 as the marker variables- loadings fixed to 1
* Allow for covariance between factors (you expect correlation)
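As a minimal sketch, this is roughly how such a CFA could be specified in Python with the third-party semopy package (lavaan-style syntax). The item-to-factor assignments, the file name, and the assumption that semopy fixes the first loading of each factor to 1 by default (as lavaan does) are placeholders/assumptions, not the actual setup from the video:

```python
import pandas as pd
import semopy

# Placeholder model description: three factors, each measured by four items;
# F5, F9 and F13 are listed first and are therefore used as marker variables.
desc = """
Factor1 =~ F5 + F6 + F7 + F8
Factor2 =~ F9 + F10 + F11 + F12
Factor3 =~ F13 + F14 + F15 + F16
Factor1 ~~ Factor2
Factor1 ~~ Factor3
Factor2 ~~ Factor3
"""

data = pd.read_csv("holdout_half.csv")   # hypothetical second half of the sample
model = semopy.Model(desc)
model.fit(data)
print(model.inspect())                   # loadings, (co)variances, significance
print(semopy.calc_stats(model))          # chi-squared, df, CFI, TLI, RMSEA, AIC, BIC
```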
Is the construct reliable?
- A question of accuracy- often shown as the level of “inter-connectedness between the items”, their ability to uncover the true construct
- Should be shown in every sample (standard report of Cronbach alpha)
- Test-retest
- Internal consistency measures
Is the construct valid?
Are we really measuring what we want to measure? Is the measure distinguishable from other constructs, and does it fit into the nomological net- does it predict relevant behaviour/attitudes?
* Content validity
* Criterion validity
* Construct validity
Is the construct invariant?
Is the construct stable enough to compare individuals across genders,
nationalities, language groups, age groups etc…
* Measurement invariance (different levels)
How do we test reliability with CFA?
Cronbach alpha- one of the most frequently used measures for sum scales
* α = (k² × mean inter-item covariance) / (sum of all elements of the items' var-cov matrix)
* k- number of items; the numerator is k squared times the average covariance between items, the denominator is the sum of all variances and covariances in the item var-cov matrix
* Will increase with more items and positive correlation between the items
* Proportion of variance the scale would explain in the “true scale” (that is imagined)
* CRONBACH ALPHA ASSUMES THAT THE CONSTRUCT IS PARALLEL OR AT LEAST TAU-EQUIVALENT
* For absorption Cronbach alpha would be 0.757
McDonald’s omega
* ω = (sum of standardized factor loadings)² / [(sum of standardized factor loadings)² + sum of the error variances] (for uncorrelated errors only)
* Does not assume parallel or tau-equivalent model
* Same interpretation as alpha- how much true variance does our scale explain?
* For Absorption omega would be 0.765
The more the construct fails to fulfil these assumptions, the more biased Cronbach alpha becomes
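A small numpy sketch of both formulas; the item scores and the standardized loadings below are simulated/hypothetical, not the Absorption data:

```python
import numpy as np

# Simulated raw item scores: rows = respondents, columns = k items
rng = np.random.default_rng(0)
items = rng.normal(size=(300, 4)) + rng.normal(size=(300, 1))  # shared factor makes items correlate

# Cronbach alpha: k^2 * mean inter-item covariance / sum of the item var-cov matrix
C = np.cov(items, rowvar=False)
k = C.shape[0]
mean_cov = C[~np.eye(k, dtype=bool)].mean()       # average off-diagonal covariance
alpha = (k**2 * mean_cov) / C.sum()

# McDonald's omega from a standardized one-factor solution:
# (sum of loadings)^2 / [(sum of loadings)^2 + sum of error variances]
loadings = np.array([0.71, 0.68, 0.74, 0.65])     # hypothetical standardized loadings
error_var = 1 - loadings**2                       # uncorrelated errors assumed
omega = loadings.sum()**2 / (loadings.sum()**2 + error_var.sum())

print(f"alpha = {alpha:.3f}, omega = {omega:.3f}")
```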
Name the 3 constraining parameters
To justify testing reliability with Cronbach alpha we need to establish which of these models holds:
congeneric, tau-equivalent or parallel indicators (see the equations after this list)
* Congeneric indicators just have to have independent errors and predict the same factor
* Tau-equivalent indicators have to predict the same factor to the same degree- equal factor loadings
* Parallel indicators have to measure the same construct with the same precision- they must be psychometrically interchangeable: same factor, same factor loadings, same error variances
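In equation form (x_i is item i, ξ the common factor, ε_i the error term; this is the standard textbook way of writing the three models, shown here only as a reminder):
Congeneric:     x_i = τ_i + λ_i·ξ + ε_i              (loadings and error variances free)
Tau-equivalent: x_i = τ_i + λ·ξ + ε_i                (equal loadings: λ_i = λ for all items)
Parallel:       x_i = τ_i + λ·ξ + ε_i, Var(ε_i) = θ  (equal loadings and equal error variances)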
Validity with CFA
- Content validity- are the items a good representation of the targeted construct? Do they cover the content domain?
- Criterion validity- convergent validity- can the construct predict a criterion variable?
- Concurrent and predictive validity- is the data on both collected at the same time or separately?
- Construct validity- does the construct fit with other constructs already in existence?
- Discriminant validity- is it distinguished from the constructs it should be distinguished from (especially if they could be very similar- like cultural intelligence and global mindset)?
Measurement invariance with CFA
What can cause measurement variance and why do we need to establish invariance?
* We develop constructs to draw comparisons
* We assume that the observed relative or absolute differences on the construct are the result of true differences and not of measurement error
* Measurement invariance testing- ARE WE COMPARING ON THE SAME SCALE?
* Measurement invariance testing shows that the construct is conceptualized/understood and scaled in a similar manner across groups