SEM Flashcards
What is SEM a combination of?
Confirmatory factor analysis (the measurement model)
Path analysis (the structural model)
What is path analysis?
An extension of multiple regression
What is the aim of path analysis?
Its aim is to provide estimates of the magnitude and significance of hypothesised causal connections between sets of variables.
In path analysis, we are interested in…
The size and direction of the direct and indirect effects between multiple variables.
So simple path models are essentially…
Mediated regression.
Mediation implies a…
Causal chain, as there is a series of relationships.
But path analysis…
Makes a causal claim that is separate from the statistical analysis; the claim is theory driven.
When specifying a model, we must…
– Use theory, previous research and logical relations between variables to justify the path model
– Then draw the model using a path diagram
Name the components of the path diagram
– Observed variables = Squares
– Unobserved (latent) variables = Oval / Circle
– Single headed arrows = causal relationships (direct paths)
– Double-headed arrows = correlation
– Error terms / disturbances
– Endogenous variables
– Exogenous variables
What are endogenous variables?
– Considered the DVs
– Have directional arrow inputs
– Can also have an arrow output, turning the DV into a mediator
– ‘Downstream’ variables caused by exogenous variables
– Each has an extra term accounting for the other possible causal inputs not specified in the model (there will always be unspecified, unmeasured variables having an effect; we model these with an ERROR or DISTURBANCE term)
– Error or disturbance terms are drawn as ovals because they are latent variables
What are exogenous variables?
– They are the IVs in the model
– No arrow input is specified for them in the model
– You can have multiple exogenous variables (they can be correlated)
In order for analysis to be run, the model needs…
to be identified
What is model identification?
– Requires sufficient UNIQUE pieces of information (i.e. correlations in the observed data); this allows mathematical estimation of the model
– Is tricky with more complex models, but there is a rule of thumb
What is the rule of thumb?
The maximum number of single connections between observed variables must equal or exceed the number of paths specified in the model.
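How do you calculate model identification?
Calculate the knowns (variances and covariances) using v × (v + 1) / 2, where v = number of observed variables
Then compare to the unknowns (the free parameters to be estimated)
For CFA / SEM the unknowns include:
– Errors (error variances)
– Factors (factor variances)
– Factor loadings (not including any path fixed to 1, e.g. the fixed path for an error term)
– Covariances between factors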
What are the three types of identified models?
– Over-identified model = more correlations than free paths in the model
– Just-identified model (saturated model) = correlations equal to the number of free paths in the model
– Under-identified model = fewer correlations than free paths; the model cannot be estimated
What can you also calculate with the v × (v + 1) / 2 formula?
Degrees of freedom (knowns minus unknowns)
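As a rough sketch of the counting described in the cards above, the Python below computes the knowns from the number of observed variables, subtracts the number of free parameters to get the degrees of freedom, and labels the result. The function names and the example numbers (4 observed variables, 8 free parameters) are made up for illustration. Passing this counting check is necessary, but it does not by itself guarantee that a model is identified.

```python
def count_knowns(n_observed: int) -> int:
    """Unique variances and covariances among v observed variables: v(v + 1)/2."""
    return n_observed * (n_observed + 1) // 2


def degrees_of_freedom(n_observed: int, n_free_params: int) -> int:
    """Knowns minus unknowns (free parameters to be estimated)."""
    return count_knowns(n_observed) - n_free_params


def identification_status(n_observed: int, n_free_params: int) -> str:
    """Label the model using the counting rule only (illustrative helper)."""
    df = degrees_of_freedom(n_observed, n_free_params)
    if df > 0:
        return "over-identified"
    if df == 0:
        return "just-identified"
    return "under-identified"


# Hypothetical example: 4 observed variables, 8 free parameters.
print(count_knowns(4))              # 10
print(degrees_of_freedom(4, 8))     # 2
print(identification_status(4, 8))  # over-identified
```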
Recursive models are… and are always… in comparison to reciprocal models
Recursive = all connections move in the same direction (always identified)
Reciprocal = more complex, and identification is more complex; not common in psychology
When estimating the model, what three things do we estimate?
– Direct effects
– Indirect effects
– Global model fit
What are direct and indirect effects analogous to?
Regression coefficients for individual predictors
What is global model fit analogous to?
The ANOVA for R²
Formal definition of a direct effect?
The path regression coefficients reflect DIRECT relations between one variable and another (controlling for the effect of any other variable also affecting the endogenous variable)
– Same as beta weights in MR
Formal definition of an indirect effect?
The effect of one variable on another variable via a mediator
To calculate indirect effects we…
Simply multiply the standardised beta weights together
– Difficult with two or more mediators
– Significance is tested with the Sobel test (a z test on the ratio of the unstandardised indirect effect to its standard error; needs large samples) or with bootstrapping (MacKinnon)
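A minimal sketch of the Sobel z test mentioned above, assuming the common first-order formula for the standard error of the indirect effect, sqrt(b²·SEa² + a²·SEb²). The function name `sobel_test` and the coefficient values are hypothetical and not taken from any real data set.

```python
import math


def sobel_test(a: float, se_a: float, b: float, se_b: float):
    """Return (indirect effect, z, two-tailed p) using the first-order Sobel SE."""
    indirect = a * b
    se_indirect = math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)
    z = indirect / se_indirect
    # Two-tailed p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return indirect, z, p


# Hypothetical unstandardised estimates: a is the X -> M path, b is the M -> Y path.
indirect, z, p = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
print(f"indirect = {indirect:.3f}, z = {z:.3f}, p = {p:.4f}")
```

Because the test relies on the indirect effect being roughly normally distributed, it suits large samples; bootstrapping avoids that assumption.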
Total effects are…
– The total causal effects (DIRECT AND INDIRECT) of one variable on another
– Calculated by summing all direct and indirect effects
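The multiplying and summing described in these cards can be sketched in a few lines of Python. The helper names and the path values (a = 0.5, b = 0.4, c = 0.3) are invented for illustration, with X → M labelled a, M → Y labelled b and the direct X → Y path labelled c.

```python
def indirect_effect(*path_coefficients: float) -> float:
    """Product of the coefficients along one indirect route."""
    product = 1.0
    for coefficient in path_coefficients:
        product *= coefficient
    return product


def total_effect(direct: float, indirect_routes: list) -> float:
    """Direct effect plus the sum of every indirect route."""
    return direct + sum(indirect_effect(*route) for route in indirect_routes)


# Hypothetical standardised paths: X -> M (a), M -> Y (b), X -> Y (c).
a, b, c = 0.5, 0.4, 0.3
print(round(indirect_effect(a, b), 3))      # 0.2
print(round(total_effect(c, [(a, b)]), 3))  # 0.5
```

With more than one mediator, each route is passed as its own tuple of coefficients and the routes are summed.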
What are the “tracing rules”?
– You cannot enter and exit a variable on an arrowhead
– You cannot enter a variable twice on the same trace
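As a small worked illustration of the tracing rules, assume a standardised three-variable model with paths X → M (labelled a), M → Y (labelled b) and X → Y (labelled c); the labels are chosen here for illustration only. Each implied correlation is the sum, over every permitted trace, of the product of the coefficients along that trace:

```latex
\begin{aligned}
r_{XM} &= a \\
r_{XY} &= c + ab \\
r_{MY} &= b + ac
\end{aligned}
```

The ac term in the last line comes from tracing backwards from M to X and then forwards from X to Y. This is permitted because the trace passes through X on two arrow tails, so it never enters and exits a variable on an arrowhead.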
Even if you find good model fit, it is important to remember…
Just because you find a good model fit, that doesn’t exclude the possibility that another model will explain the data better
If a path weight is set to 0, what does it mean?
The path is omitted
Omitted paths are as important to the model as included paths. Their absence makes a theoretical statement (even if not explicitly expressed); e.g. ‘I hypothesise there are no direct effects of ethnicity and family background on grades’
Why would we omit paths?
For model parsimony
The model is therefore simpler, or more parsimonious, than a full model in which all possible paths have to be estimated
Parsimonious models (if plausible) have several advantages:
– Simplest (but sufficient) models are preferred in science (Occam's razor: ‘all other things being equal, the simplest model is the most preferred’)
– It is easy for a reduced model to be a statistically worse fit than the full model; if it survives this test of fit, it has more credibility as a plausible model
– Explaining more with less: a saturated model would explain everything, but a model using just 2 variables that still explains 85% would be a great parsimonious model
Take-home message…
Constraining the model in various ways is a way of testing a theory or a particular set of research questions: we want to test our data as rigorously as we can by having the most parsimonious model.
Recap: error terms reflect the…
unmeasured
Be wary of models that are
close to saturation
What is a basic summary of quantifying model fit?
The basic notion is that the difference between the observed correlations (the saturated, full-sample correlations) and the correlations implied by the reduced model is the RESIDUAL, and we want this to be as small as possible
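To make the idea of a residual concrete, here is a minimal sketch with made-up correlations among three variables X, M and Y. It assumes a reduced model X → M → Y in which the direct X → Y path is omitted, so the implied X–Y correlation is simply the product of the two remaining paths. All names and numbers are hypothetical.

```python
# Hypothetical observed correlations among X, M and Y.
observed = {("X", "M"): 0.5, ("M", "Y"): 0.4, ("X", "Y"): 0.3}

# Reduced model: X -> M -> Y with the direct X -> Y path omitted (fixed to 0).
# With one predictor per equation, each standardised path equals the
# corresponding observed correlation.
a = observed[("X", "M")]
b = observed[("M", "Y")]

# Correlations implied by the reduced model.
implied = {("X", "M"): a, ("M", "Y"): b, ("X", "Y"): a * b}

# Residual = observed correlation minus implied correlation.
residuals = {pair: round(observed[pair] - implied[pair], 3) for pair in observed}
print(residuals)  # {('X', 'M'): 0.0, ('M', 'Y'): 0.0, ('X', 'Y'): 0.1}
```

Here the only non-zero residual (0.1) sits on the X–Y correlation, which is exactly where the omitted path constrains the model.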