SEM- structural equation modelling Flashcards

Question

Total effects in more complex models

Answer 1

• 1 direct effect = 0.30
• 2 indirect effects
  o N – Avoid – Dep: 0.34 * -0.03 = -0.01
  o N – Avoid – Cog Inflex – Dep: 0.34 * 0.15 * 0.45 = 0.02
• The total causal effect of N on Dep is the sum of the direct and indirect effects
  o Direct effect = 0.30
  o Sum of indirect effects = (-0.01 + 0.02) = 0.01
  o Total effect = 0.31
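The arithmetic above can be reproduced with a short sketch. It uses only the path coefficients quoted in the answer; the variable names are illustrative and not taken from any SEM package.

```python
from math import prod

# Standardised path coefficients quoted in the answer above.
direct_effect = 0.30  # N -> Dep

# Each indirect route is listed as the coefficients along its path.
indirect_paths = {
    "N -> Avoid -> Dep": [0.34, -0.03],
    "N -> Avoid -> Cog Inflex -> Dep": [0.34, 0.15, 0.45],
}

# An indirect effect is the product of the coefficients along the route.
indirect_effects = {name: prod(coefs) for name, coefs in indirect_paths.items()}

# The total effect is the direct effect plus the sum of the indirect effects.
total_indirect = sum(indirect_effects.values())
total_effect = direct_effect + total_indirect

for name, value in indirect_effects.items():
    print(f"{name}: {value:.2f}")
print(f"Sum of indirect effects: {total_indirect:.2f}")
print(f"Total effect: {total_effect:.2f}")
```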

Answer 2

• 4 empirical conditions must be met to support causal inference (e.g. Kline, 2005/11)
  o Relationship: X should be correlated with Y
  o Temporal precedence: X must precede Y in time
  o Non-spuriousness: the X-Y relationship should hold after controlling for other variables, experimentally or statistically (e.g. the third-variable issue)
  o Correct effect priority: there are no reciprocal relationships between X and Y, or reversals of this relationship
• The better specified a model is in terms of theory, logic etc., the more persuasive the case that it is a real model if it turns out to ‘fit’ the data
  o Specifying the direction of relationships a priori strengthens the plausibility of the model
  o Path weights can also be specified a priori
  o Omitting paths from the model can also help with plausibility (see parsimony)

Answer 3

• df is the difference in the number of estimated paths between the full and the reduced model
• The reduced model is simpler and more parsimonious
• Parsimonious models (if plausible) have several advantages:
  o (i) The simplest (but sufficient) models are preferred in science
    - Occam's razor: ‘all other things being equal, the simplest model is the most preferred’
  o (ii) It is easy for a reduced model to be a statistically worse fit than the full model, so if it survives this test of fit it has more credibility as a plausible model

Answer 4

o Bad model fit: the model doesn’t explain the data as well as other models might (e.g. a model with paths dropped/added), so refine or discard the model
o Good model fit: the data fail to disconfirm your model, i.e. the data are well explained by the paths specified in the model
o But remember: you may have a good model, but ‘fit’ is only with reference to the variables in your model
  (i) an alternative model with a different specification of paths might be even better, so it is still worth testing alternative models
  (ii) there may be a more complete model (more variables)

Answer 5

o Remember that correlation = direct + indirect + unanalysed effects, i.e. summing all effects gives the original sample correlation
o Sample correlation: if all possible paths (i.e. effects) are included in the model (saturated model), they will sum to the original sample correlation
o Implied correlation: if only some paths are estimated (reduced model), the sum of effects will not automatically equal the sample correlation, but gives a predicted or implied correlation
o Most measures of model fit are based on the discrepancy between sample and implied correlations (residual correlations)
• If the implied correlations from the reduced model are not that different from the sample correlations reproduced by the saturated model, then you have a ‘good’ model

Answer 6

• Saturated model
  o all paths are estimated in the model (0 df)
  o the sample correlation matrix can therefore be reproduced perfectly (by adding up all effects)
• Reduced model (called ‘default’ in AMOS)
  o not all paths are included (e.g. earlier example)
  o implied or predicted correlations are therefore usually different from the sample correlations

Answer 7

1) Chi-square test
2) SRMR (Standardised Root Mean Square Residual)
3) RMSEA (Root Mean Square Error of Approximation)
4) GFI (Goodness of Fit Index)

Answer 8

* The χ² value is in a column called 'CMIN' in AMOS. Look at the row for the default model, which is the reduced model. A non-significant value indicates good model fit
* χ²M (model chi-square) = 0 for a saturated model with 0 df (all paths have been estimated); it is analogous to ‘error variance’
* χ²M (‘error’) increases as more paths are omitted
* If χ²M is significant, the reduced model is significantly worse than a saturated model, i.e. it is a ‘bad fit’ to the data
* Look for non-significance, which indicates good fit (i.e. not significantly different from the saturated model despite fewer paths)
* Problem: with large samples, your model is likely to be significantly worse even when differences in fit are substantively small

Answer 9

A model fit index: Standardised Root Mean Square Residual (SRMR)
• a residual correlation is the difference between a sample correlation and the implied correlation
• the SRMR summarises the average absolute size of the residual correlations
• an SRMR of zero would equal perfect fit (no residuals: the implied correlations of the reduced model are identical to the sample correlations reproduced by the full model, so the reduced model is no worse than the full model and we opt for it as the more parsimonious)
• SRMR < .10 indicates good fit
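A minimal sketch of how residual correlations feed a summary index. The correlations below are made-up numbers for three hypothetical variables, not values from the lecture. Software usually reports a root-mean-square version and typically also counts the diagonal elements; this sketch simplifies by using the off-diagonal correlations only.

```python
from math import sqrt

# Hypothetical sample and model-implied correlations (made-up numbers).
sample = {("x", "y"): 0.40, ("x", "z"): 0.30, ("y", "z"): 0.50}
implied = {("x", "y"): 0.38, ("x", "z"): 0.33, ("y", "z"): 0.50}

# Residual correlation = sample correlation - implied correlation.
residuals = {pair: sample[pair] - implied[pair] for pair in sample}

# Two ways of summarising the typical size of the residuals.
mean_abs_residual = sum(abs(r) for r in residuals.values()) / len(residuals)
rms_residual = sqrt(sum(r * r for r in residuals.values()) / len(residuals))

print(f"Mean absolute residual: {mean_abs_residual:.3f}")
print(f"Root mean square residual: {rms_residual:.3f}")
print("Below the .10 guideline:", rms_residual < 0.10)
```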

Answer 10

A model fit index: Root Mean Square Error of Approximation (RMSEA)
• popular fit measure designed to assess the approximate fit of a model while rewarding parsimony
• of two models with similar explanatory power, the simpler model (fewer paths, so more df) will be favoured
• Browne and Cudeck (1993) suggested:
  o RMSEA < .05: good fit
  o RMSEA < .08: reasonable fit
  o RMSEA above .10: poor fit
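The way RMSEA rewards parsimony can be seen in a commonly used formula, sqrt(max(χ² - df, 0) / (df * (N - 1))). The sketch below assumes that formula (some programs divide by N rather than N - 1) and a hypothetical sample size of 200; the function name is illustrative.

```python
from math import sqrt


def rmsea(chi_square: float, df: int, n_cases: int) -> float:
    """RMSEA from the model chi-square, its df and the sample size.

    Assumes the sqrt(max(chi2 - df, 0) / (df * (N - 1))) form.
    """
    if df <= 0:
        raise ValueError("RMSEA is undefined for a model with 0 df")
    if n_cases <= 1:
        raise ValueError("need more than one case")
    return sqrt(max(chi_square - df, 0.0) / (df * (n_cases - 1)))


# Hypothetical N = 200. A chi-square smaller than its df gives RMSEA = 0.
print(round(rmsea(3.25, 4, 200), 3))
# A chi-square far larger than its df gives a large RMSEA.
print(round(rmsea(51.08, 5, 200), 3))
```

Because df sits in the denominator, a model with the same excess χ² but more df (fewer paths) gets a smaller RMSEA.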

Answer 11

A model fit index: Goodness of Fit Index (GFI)
• different approach to model fit
• compares the researcher’s model with the independence model
  o the independence model predicts all variables are independent (i.e. zero correlations)
  o analogous to R²: estimates the total variance accounted for by our model
• Hu & Bentler (1999) guidelines:
  o GFI > .95 = good fit
  o GFI > .90 = adequate fit

Answer 12

• A full SEM is simply a combination of a measurement model (i.e. a CFA model) and a structural model (i.e. a path model with observed variables)
• Full SEM extends observed-variable path analysis (structural model) by creating a latent variable measurement model, and then examining relationships between these latent variable factors
• This essentially involves a two-step process (n.b. some approaches break these steps down further):
  o Specify and estimate a candidate measurement model (aka Confirmatory Factor Analysis)
  o Once you have a viable measurement model (CFA model), re-specify the model as a structural model and examine the relationships between the latent factors

Answer 13

* The typical difference between the two is that in CFA we constrain some factor loadings (usually to be 0), i.e. we do not allow all observed items/indicators to load freely on all of the factors
* We can also specify the number of factors we want to extract
* So the CFA model is a more constrained version of the EFA model

Answer 14

• Factors
  o are latent (or unmeasured) variables, e.g. extraversion, impulsivity, verbal IQ etc.
  o are represented by circles in path notation
  o are typically assumed to cause variation in the indicators, so the single-headed directional arrow runs from the factor to the indicator, e.g. general verbal IQ causes one to think of more words on a specific vocabulary test

Answer 15

• Indicators
  o are measured (observed) variables
  o are represented by squares (like any observed variable)
  o represent the actual items or measures directly assessed

Answer 16

• Factor Loadings
  o estimate the relationship between the factor and the observed indicator
  o can be thought of as the correlation between the factor and the indicator in standard CFA models
  o we would typically like these to be > .50

Answer 17

• Factor Covariances
  o estimate the relationship between the latent factors
  o we can use this information to examine the convergent and discriminant validity of the factors

Answer 18

• Error terms
  o these model variation in the indicator variable that is not accounted for by the factor, e.g. anything other than verbal IQ that accounts for vocabulary scores
  o these error terms are usually uncorrelated with each other, but you could model error correlations if you expected that responses across indicators would be caused by something other than the factors, e.g. method effects (?)

Answer 19

1. Refer to theory/previous research to ascertain an appropriate model
2. Specify the model
3. Model identification
4. Model estimation
5. Testing model fit
6. Interpret model effects
7. Modifying models
8. Reporting results

Answer 20

• Error terms
  o cannot know the variance of unmeasured variables
  o fix the error variances to 1 in model specification
  o or, fix raw error loadings to 1 (AMOS default), which sets the error variance based on the indicator variance
  o important for model identification
• Factor variance
  o factors are unmeasured, so their variance is also unknown
  o fix the factor variance to 1 or set a raw factor loading to 1
  o only one factor loading per factor needs to be set to 1
  o again, important for identification of the model

Answer 21

• CFA uses known values in the variance/covariance matrix to estimate unknown values, e.g. factor loadings
• To find out whether a model is identified:
  o knowns: calculate the number of observed variances and covariances, i.e. v * (v + 1)/2, where v equals the number of observed variables
  o unknowns: count the number of paths in the model (excluding the one factor loading per factor that has been set to 1) plus the error variances (count only the error circles; their arrows have been set to 1, so the small arrows are not counted)
  o calculate knowns minus unknowns to get the model df
  o if the model df is greater than or equal to 0, then proceed
  o if not, the model needs to be re-specified
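The counting rule can be sketched for a standard CFA. The sketch assumes each indicator loads on one factor, errors are uncorrelated, one loading per factor is fixed to 1, and all factor variances and covariances are freely estimated. The function names and the five-indicator example are hypothetical.

```python
from typing import Sequence


def knowns(n_indicators: int) -> int:
    """Number of observed variances and covariances: v * (v + 1) / 2."""
    return n_indicators * (n_indicators + 1) // 2


def standard_cfa_unknowns(indicators_per_factor: Sequence[int]) -> int:
    """Free parameters in a standard CFA (assumptions in the lead-in)."""
    n_indicators = sum(indicators_per_factor)
    n_factors = len(indicators_per_factor)
    free_loadings = n_indicators - n_factors  # one loading per factor fixed to 1
    error_variances = n_indicators
    factor_variances = n_factors
    factor_covariances = n_factors * (n_factors - 1) // 2
    return free_loadings + error_variances + factor_variances + factor_covariances


def model_df(indicators_per_factor: Sequence[int]) -> int:
    """Model df = knowns - unknowns; a negative value means under-identified."""
    n_indicators = sum(indicators_per_factor)
    return knowns(n_indicators) - standard_cfa_unknowns(indicators_per_factor)


# Hypothetical example: five indicators as one factor vs. split 3 + 2.
print(model_df([5]))     # one-factor model
print(model_df([3, 2]))  # two-factor model
```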

Answer 22

• Simple heuristics for standard CFA models, e.g. models with uncorrelated error terms and where each indicator loads on just one factor
  o (A) If a model with a single factor has 3 or more indicators, it will be identified
  o (B) If a model with 2 or more factors has 2 or more indicators per factor, it will be identified
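Heuristics (A) and (B) can be written as a small check. This is a sketch for standard CFA models only, and the function name is hypothetical. A result of False means only that the heuristics do not guarantee identification, not that the model is necessarily unidentified.

```python
from typing import Sequence


def identified_by_heuristic(indicators_per_factor: Sequence[int]) -> bool:
    """Apply heuristics (A) and (B) to a standard CFA model.

    indicators_per_factor holds the number of indicators for each factor.
    """
    if len(indicators_per_factor) == 0:
        raise ValueError("the model needs at least one factor")
    if len(indicators_per_factor) == 1:
        # (A) a single factor needs 3 or more indicators
        return indicators_per_factor[0] >= 3
    # (B) with 2 or more factors, each factor needs 2 or more indicators
    return all(count >= 2 for count in indicators_per_factor)


print(identified_by_heuristic([3]))     # single factor, three indicators
print(identified_by_heuristic([2]))     # single factor, two indicators
print(identified_by_heuristic([3, 2]))  # two factors
```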

Answer 23

• factor correlations > .75-.80 suggest that the model is ‘overfactored’
  o a more plausible model might involve collapsing the factors into one and re-estimating the model
  o this is where you also need to rely on what theory and previous research suggest
• factor loadings > .5 are good
  o this suggests each indicator is doing a good job of representing the factor
  o if factor loadings are low, it may suggest you should remove this indicator from your measurement in the future

Answer 24

Way of refining CFA model
o Model building
  - Starts with a bare-bones model, then adds path(s)
  - If extra paths significantly improve fit, these are added to the model

Answer 25

Way of refining CFA model
o Model trimming
  - Usually (but not necessarily) starts with a saturated model and simplifies it by eliminating paths
  - If the model fit does not significantly deteriorate, then the paths can be removed (the model is no worse but is simpler, and parsimonious models are more favourable)

Answer 26

* Modification indices (MI) can be used to add individual paths to the model
* The larger the MI, the greater the improvement in model fit.
* The usual convention is that an MI > 4 suggests an improvement in model fit and that the path should be added.
* In the case below, neither of the added paths improves the model fit.

Answer 27

• A chi-square difference test can be conducted using the chi-square values and degrees of freedom from any two ‘nested’ models.
• A nested model is a model that uses the same variables (and cases!) as another model but specifies at least one additional parameter (path?) to be estimated.
  A. Calculate the X² (chi-square) for the first model (X² m1)
  B. Calculate the X² for the second model with paths added (X² m2)
  C. Calculate the difference X² D (i.e. X² m1 - X² m2), with df equal to the difference in df between the two models
• If X² D is significant, then the model is significantly improved by adding paths and these can be retained in your refined model (see below for explanation of X² D significance)
• Example:
  o the X² for the 2 factor model was 3.25 (4df)
  o the X² for the 1 factor model was 51.08 (5df)
  o so, to test the difference in fit we calculate the difference in X² between the models (X² D = 51.08 - 3.25 = 47.83) and evaluate this against the chi-square critical value for 1 df (the difference between the 1 factor and 2 factor model df, i.e. one path removed)
  o if X² D exceeds the critical value, then the test is significant
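The worked example can be checked with a short sketch. It assumes an alpha level of .05 (not stated in the card), for which the chi-square critical value at 1 df is 3.841. The p-value shortcut used here is valid only when the difference has exactly 1 df.

```python
from math import erfc, sqrt

# Values quoted in the example above.
chi_sq_one_factor, df_one_factor = 51.08, 5
chi_sq_two_factor, df_two_factor = 3.25, 4

# Difference test: simpler model minus the model with the extra path.
chi_sq_diff = chi_sq_one_factor - chi_sq_two_factor
df_diff = df_one_factor - df_two_factor

# Assumption: alpha = .05, so the 1 df critical value is 3.841.
CRITICAL_VALUE_1DF = 3.841
significant = chi_sq_diff > CRITICAL_VALUE_1DF

# For 1 df only, the chi-square upper-tail probability is erfc(sqrt(x / 2)).
p_value = erfc(sqrt(chi_sq_diff / 2))

print(f"Chi-square difference: {chi_sq_diff:.2f} on {df_diff} df")
print("Significant at .05:", significant)
print(f"p-value: {p_value:.2e}")
```

A significant difference here means the two-factor model fits significantly better than the one-factor model, so the extra parameter is retained.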

Answer 28

A. Calculate the X² (chi-square) for the first model (X² m1)
B. Calculate the X² for the second model with paths added (X² m2)
C. Calculate the difference X² D (i.e. X² m1 - X² m2)
• If X² D is significant, then the model is significantly improved with more paths.
• If it is non-significant, the model is not significantly worse with fewer paths, so we accept the trimmed/reduced model for parsimony.

Answer 29

• Theoretical approach
  o model trimming/building is guided by theoretical a priori considerations
• Empirical approach
  o paths are added to or deleted from the model purely on the basis of statistical criteria
  o in model building, MIs for all paths are examined to see which ones significantly improve the model
  o can capitalise on chance correlations
  o this type of SEM is more exploratory (you cannot claim you are ‘confirming’ theory)
  o credibility of the model is improved if the model structure is replicated in another sample

Answer 30

* Test an SEM across a categorical variable, e.g. gender, cultural group etc.
* We might want to look at model estimates in different groups, or see whether a particular model holds across groups, i.e. whether it is invariant across groups
* This can be done for a CFA model or a full SEM
* Uses the principle of iteratively constraining parameters in the model to equality across the groups (implying they are the same in each group), and then looking to see whether this produces a significant decrement in model fit
* If a significant decrease in model fit occurs, you then have to identify which parameter(s) caused it, i.e. you can iteratively free parameters to identify the source of the misfit

Answer 31

• The assumptions largely follow from those for correlation/regression analyses (see the appropriate lecture).
• Linearity: dependent (endogenous) variables should be linearly related to independent variables
• SEM programmes can handle continuous and categorical variables, but check the coding of categorical variables and make sure the programme knows what codes are being used
• Normality: residuals should be normally distributed and homoscedastic
• Identification: models cannot be under-identified
• Adequate sample size
  o Kline recommends at least 10 times as many cases as parameters (paths), ideally 20 times
• Proper model specification
  o specification error occurs when common causal variables are left out of the model
• Disturbances uncorrelated with exogenous variables (same as in multiple regression, where errors are uncorrelated with independent variables)
• No multicollinearity
• Exogenous variables are reliably measured
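The sample-size rule of thumb above is simple arithmetic. The sketch below applies it to a hypothetical model with 11 free parameters; the function name is illustrative.

```python
from typing import Tuple


def recommended_sample_sizes(n_parameters: int) -> Tuple[int, int]:
    """Return (minimum, ideal) cases: 10 and 20 cases per parameter."""
    if n_parameters <= 0:
        raise ValueError("the model needs at least one free parameter")
    return 10 * n_parameters, 20 * n_parameters


# Hypothetical model with 11 free parameters.
minimum, ideal = recommended_sample_sizes(11)
print(f"At least {minimum} cases, ideally {ideal}")
```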
