M6 - SEM Flashcards
Part C - Question: Which of the following is true about modifying your model? (Tick all that apply)
If the MI values suggest a directional relationship between IV and DV (IV --> DV) and a covariance (IV <-> DV), and supposing a relationship makes sense here, you need to choose one of the two to include, but cannot include both in your revised model.
Inclusion of new paths should be justified on theoretical and statistical grounds.
Smaller MI values indicate greater improvement in model fit from including a recommended path.
Removal of paths from your model should be based on modification index (MI) values.
If the MI values suggest a directional relationship between IV and DV (IV --> DV) and a covariance (IV <-> DV), and supposing a relationship makes sense here, you need to choose one of the two to include, but cannot include both in your revised model.
Inclusion of new paths should be justified on theoretical and statistical grounds.
Model 1
Parenting --> Loneliness --> Depression
Model 2
Parenting --> Depression --> Loneliness
The models should be compared using…
- BIC values since the models are nested.
- a chi-square difference test since the models are non-nested.
- a chi-square difference test since the models are nested.
- BIC values since the models are non-nested.
- BIC values since the models are non-nested.
Model
- Negative Mood --> Binge Eating
- Body Dissatisfaction --> Binge Eating
- Dietary Restraint --> Binge Eating
- Body Dissatisfaction --> Negative Mood
- Body Dissatisfaction --> Dietary Restraint
- eBE --> Binge Eating
- eNM --> Negative Mood
- eDR --> Dietary Restraint
Is the model…
- Unclear from information provided.
- Under-identified.
- Over-identified.
- Just identified
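Over-identified
v(v+1)/2 = (4*5)/2 = 10 things that we can estimate (the model has 4 measured variables)
Things to estimate: 3 error terms (eBE, eNM, eDR), 1 variance (Body Dissatisfaction), and 5 relationships between variables = 9 things to estimate
df = 10 - 9 = 1 left over, so the model is over-identified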
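The count above can be checked with a few lines of Python. This is only a sketch: the function name `identification` and the shortened variable names are made up for illustration, and it assumes a recursive path model with no covariances between exogenous variables or between error terms (true for this model, not in general).

```python
# Paths taken from the model above; variable names are shortened for this sketch.
paths = [
    ("NegativeMood", "BingeEating"),
    ("BodyDissatisfaction", "BingeEating"),
    ("DietaryRestraint", "BingeEating"),
    ("BodyDissatisfaction", "NegativeMood"),
    ("BodyDissatisfaction", "DietaryRestraint"),
]

def identification(paths):
    """Return (sample moments, parameters to estimate, df).

    Assumes a path model with no covariances, so the parameters are:
    one per path, one error variance per DV, one variance per pure IV.
    """
    variables = {name for path in paths for name in path}
    endogenous = {dv for _, dv in paths}      # each receives an error term
    exogenous = variables - endogenous        # each has its own variance
    v = len(variables)
    moments = v * (v + 1) // 2                # v(v+1)/2, means not included
    n_par = len(paths) + len(endogenous) + len(exogenous)
    return moments, n_par, moments - n_par

moments, n_par, df = identification(paths)
if df > 0:
    status = "over-identified"
elif df == 0:
    status = "just identified"
else:
    status = "under-identified"
print(moments, n_par, df, status)
```

With the five paths listed, the function reproduces the hand count: 10 moments, 9 parameters, 1 df.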
What is structural equation modelling?
- Umbrella term for Path analysis, CFA and combination of the two
- Specific term of the combination of Path Analysis and CFA in the one model
It looks at the relationships between measured variables and latent factors, based on statistical measurement of how variables load onto factors, while also accounting for error
What is a “structural model” and what is a “measurement model”?
Structural model is Path Analysis
- contains measured variables + errors
- only the error variance of the DV is accounted for, not the error variance of the IV
Measurement Model is CFA
- contains measured variables + error
- contains implied latent constructs
- accounts for error of the variables that load onto construct
- more pure estimate of true variance
In an SEM
- the structural component connects the constructs via paths
- the measurement component estimates the indicators of each construct and partitions out error variance
How does SEM differ from Path Analysis?
SEM contains both path analysis (structural) components and CFA (measurement) components
SEM contains latent factors that are estimated in the measurement component. Their relationships are specified via the structural (path analysis) component.
Path Analysis contains no latent variables, just measured ones
SEM is able to partition out the error of a latent factor by measuring the variables that load onto that factor and estimating their error component, thus forming a purer, more accurate estimate of the relationships between the variables and factors
Path Analysis leaves the error variance of each IV bundled in with the measured variable itself
Explain the assumptions of SEM (excluding identification)?
Same as regression
- independence of error
- normality of distribution
- outliers
- linearity
- multicollinearity/singularity
- sample size
additional assumptions
- Nature of the DV - categorical DVs are difficult to analyse. Not impossible, but tricky in AMOS
- Missing data - needs to be dealt with (e.g. in SPSS) before entering the data into AMOS
- Sample size - a bit more complicated to assess in SEM. Rules of thumb: 5:1, 10:1 or 20:1 cases per parameter
Note - If assumptions are violated and not dealt with, the SEM model and analysis will be invalid
What does identification mean and how is it expressed as an equation in SEM?
Identification refers to the limit on how many relationships (parameters) can be estimated simultaneously, which depends on the number of measured variables
This can be expressed as:
v(v+1)/2 = number of distinct sample moments, i.e. the df available for use in the model (testing without means), where v = number of measured variables
NPAR = number of parameters to be estimated (the df used by the model)
DF = v(v+1)/2 - NPAR = degrees of freedom left over
What are the three types of identification in SEM and what do they mean?
just identified --> all df used, none left over (df = 0)
under-identified --> too many df used (df < 0): more parameters to estimate than there is information available, so there is no unique solution. Model won't run
over-identified --> df left over (df > 0). We have not used all the df available to be used in the analysis
An over-identified model is more parsimonious, and there are still df left to examine global goodness of fit and to add paths if desired
What is empirical under-identification and what are potential causes?
Empirical under-identification: the model is identified on paper (by the df count) but cannot be estimated from the actual data
- AMOS gives a warning message --> Model won't run
- this is due to:
- --> Feedback loops (non-recursive model: paths that go round in circles rather than running through to an end point)
- --> Multicollinearity
How many variables are needed for sufficient identification for one, two or more factors and what is ideal?
Number of variables needed for a sufficiently identified model:
- One factor --> at least three variables
- Two or more factors --> two variables per factor, provided each factor correlates with at least one other factor (oblique relationship)
- Generally --> three variables per factor
- Ideally --> four variables per factor, to be able to get meaningful global fit statistics
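The df arithmetic behind these rules can be sketched in Python. The helper `cfa_df` is hypothetical, and it assumes a simple-structure CFA: each indicator loads on one factor only, one loading per factor is fixed to 1 to set the scale, there are no correlated errors, and all factors are allowed to correlate. Note that a non-negative df is necessary for identification but does not guarantee it.

```python
def cfa_df(indicators_per_factor):
    """df for a simple-structure CFA with all factors correlated.

    indicators_per_factor: e.g. [3] for one factor with three indicators,
    or [2, 2] for two factors with two indicators each.
    """
    k = sum(indicators_per_factor)        # total measured variables
    f = len(indicators_per_factor)        # number of factors
    moments = k * (k + 1) // 2            # v(v+1)/2
    free_loadings = k - f                 # one loading per factor fixed to 1
    error_variances = k
    factor_variances = f
    factor_covariances = f * (f - 1) // 2
    n_par = (free_loadings + error_variances
             + factor_variances + factor_covariances)
    return moments - n_par

for spec in ([2], [3], [4], [2, 2]):
    print(spec, "df =", cfa_df(spec))
```

Under these assumptions, one factor with two indicators has negative df, three indicators gives df = 0 (just identified, so no global fit test), and four indicators leaves df for fit statistics. Two correlated factors with two indicators each also leave a positive df.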
What should you do when you have a poor fitting model?
Look to the local fit and global fit statistics to see where changes might be made or necessary
Look to the modification indices
- the larger the MI, the more the chi2 is expected to decrease if that path is added (by roughly the amount of the MI), and the more the model will be improved
- an MI > 3.84 will significantly improve the model at the p = .05 level (chi2 critical value for 1 df)
Add paths that make sense: this increases complexity, explains more of the data relationships and decreases chi2 (improving fit statistics)
Trim non-significant paths to increase parsimony
- this will generally worsen fit statistics, except those that penalise complexity, such as TLI and RMSEA
Change only one path at a time
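To see where the 3.84 cut-off comes from, an MI can be treated as a chi2 value with 1 df and converted to a p-value. The sketch below uses only the Python standard library. The function name and the MI values are made up for illustration, and a significant MI still needs theoretical justification before the path is added.

```python
import math

def mi_p_value(mi):
    """p-value of a modification index treated as chi-square with 1 df.

    For 1 df the upper tail of the chi-square distribution equals
    erfc(sqrt(x / 2)), so no statistics package is needed.
    """
    return math.erfc(math.sqrt(mi / 2))

# Hypothetical MI values, not taken from any real output.
for mi in (2.0, 4.0, 10.0):
    p = mi_p_value(mi)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"MI = {mi:5.2f}   p = {p:.4f}   {verdict} at .05")

# The cut-off itself sits almost exactly at p = .05.
print(round(mi_p_value(3.84), 3))
```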
How would you compare several competing models? (Nested)
Need to look at overall fit statistics - both global and local, to assess different models
Nested
nested models need to be compared using the LRT - Likelihood Ratio Test
the LRT is the difference in chi2 between the two models
--> chi2 model 1 - chi2 model 2
--> df model 1 - df model 2
Significant (p < .05) if the chi2 change is > 3.84 for a 1 df difference
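The chi2 difference test above can be sketched in plain Python. Everything here is illustrative: `chi2_sf` and `chi2_difference_test` are hypothetical helper names, the fit values are invented, and the p-value routine only handles whole-number df.

```python
import math

def chi2_sf(x, df):
    """Upper-tail p-value of a chi-square statistic x with integer df >= 1."""
    z = x / 2
    # Closed-form starting points: df = 1 (odd) or df = 2 (even).
    if df % 2:
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        a, q = 1.0, math.exp(-z)
    # Step a up to df / 2 using Q(a+1, z) = Q(a, z) + z^a e^-z / Gamma(a+1).
    while a < df / 2:
        q += z ** a * math.exp(-z) / math.gamma(a + 1)
        a += 1
    return q

def chi2_difference_test(chi2_restricted, df_restricted, chi2_full, df_full):
    """LRT for nested models: the restricted model has fewer paths, more df."""
    d_chi2 = chi2_restricted - chi2_full
    d_df = df_restricted - df_full
    if d_df <= 0 or d_chi2 < 0:
        raise ValueError("restricted model must have more df and a larger chi2")
    return d_chi2, d_df, chi2_sf(d_chi2, d_df)

# Hypothetical fit results for two nested models.
d_chi2, d_df, p = chi2_difference_test(25.3, 6, 18.1, 5)
print(f"chi2 change = {d_chi2:.2f}, df change = {d_df}, p = {p:.4f}")
```

A significant result means the extra path in the fuller model improves fit by more than chance, so the fuller model is preferred. A non-significant result favours the more parsimonious model.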
Compare using global and local fit statistics also:
- R2: more the better
- RMSEA < .08 / .11 if n = 200
- TLI, CFI > .95
- SRMR < .06
- normative chi2 (chi2/df) < 3
- chi2: lower the better
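As a quick aid, the cut-offs listed above can be put into a small checker. This is a sketch only: `check_fit` is a made-up helper, the fit values are invented, and the thresholds are the rules of thumb from these notes (using the .08 RMSEA cut-off), not universal standards.

```python
def check_fit(chi2, df, rmsea, tli, cfi, srmr):
    """Return a dict of rule-of-thumb checks (True = cut-off met)."""
    return {
        "chi2/df < 3": chi2 / df < 3,
        "RMSEA < .08": rmsea < 0.08,
        "TLI > .95": tli > 0.95,
        "CFI > .95": cfi > 0.95,
        "SRMR < .06": srmr < 0.06,
    }

# Hypothetical fit statistics for one model.
result = check_fit(chi2=18.1, df=5, rmsea=0.07, tli=0.94, cfi=0.96, srmr=0.05)
for rule, met in result.items():
    print(f"{rule:12s} {'met' if met else 'NOT met'}")
```

Running the same checker over each competing model gives a side-by-side view, but it does not replace the formal comparison (LRT for nested models, BIC for non-nested models).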
How would you compare several competing models? (Non-nested)
Non-nested
- it is meaningless to compare the chi2 of non-nested models
- use the BIC (Bayesian Information Criterion) instead
- the model with the lower BIC is the better fitting
- a BIC difference > 10 is very strong support for the model with the lower BIC (can't get a significance test because there is no chi2 comparison)
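The BIC rule above amounts to a sort and a subtraction. In this sketch the helper name `compare_bic` and the BIC values are invented, and the only threshold used is the "> 10" rule from these notes.

```python
def compare_bic(bic_by_model):
    """Return (best model, BIC gap to the runner-up, very_strong flag)."""
    ranked = sorted(bic_by_model.items(), key=lambda item: item[1])
    (best, best_bic), (_, next_bic) = ranked[0], ranked[1]
    gap = next_bic - best_bic
    return best, gap, gap > 10   # > 10 = very strong support for the lower BIC

# Hypothetical BIC values for two non-nested models.
best, gap, very_strong = compare_bic({"Model 1": 112.4, "Model 2": 98.7})
print(f"{best} preferred, BIC difference = {gap:.1f}, very strong: {very_strong}")
```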
Compare using global and local fit statistics also:
- R2: more the better
- RMSEA < .08 / .11 if n = 200
- TLI, CFI > .95
- SRMR < .06
- normative chi2 (chi2/df) < 3
- chi2: lower the better
- standardised b weights (beta)
What you gain from losing a non-significant pathway (parsimony) usually outweighs what you lose from having less of the data explained (complexity)