SEM Flashcards

1
Q

What is SEM a combination of?

A

Confirmatory factor analysis (the measurement model)

Path analysis (the structural model)

2
Q

What is path analysis?

A

An extension of multiple regression

3
Q

What is the aim of path analysis?

A

Its aim is to provide estimates of the magnitude and significance of hypothesised causal connections between sets of variables.

4
Q

In path analysis, we are interested in…

A

The size and direction of the direct and indirect effects between multiple variables.

5
Q

Simple path models are essentially…

A

mediated regression.

6
Q

Mediation implies a…

A

causal chain, as there is a series of relationships between the variables.

7
Q

But path analysis…

A

makes a causal claim, and that claim is separate from the statistical analysis: it is driven by theory, not by the statistics.

8
Q

When specifying a model, we must…

A

- Use theory, previous research and logical relations between variables to justify the path model
- Then draw the model using a path diagram

9
Q

Name the components of the path diagram

A

- Observed variables = squares
- Unobserved (latent) variables = ovals / circles
- Single-headed arrows = causal relationships (direct paths)
- Double-headed arrows = correlations
- Error terms / disturbances

- Endogenous variables
- Exogenous variables

10
Q

What are endogenous variables?

A

- Considered the DVs
- Have directional arrow inputs
- Can also have an arrow output, which turns the DV into a mediator
- ‘Downstream’ variables caused by exogenous variables
- Each also has an ERROR or DISTURBANCE term, which accounts for the other possible causal inputs not specified in the model (there will always be unspecified variables that we haven’t measured taking effect)
- Error terms or disturbance terms are represented by ovals because they are latent variables

11
Q

What are exogenous variables?

A

- They are the IVs in the model
- They have no arrow input: we don’t specify their causes in the model
- You can have multiple exogenous variables (and they can be correlated)

12
Q

In order for analysis to be run, the model needs…

A

to be identified

13
Q

What is model identification?

A

- Requires sufficient UNIQUE pieces of information (i.e. correlations in the observed data), which allows mathematical estimation of the model
- Is tricky with more complex models, but there is a rule of thumb

14
Q

What is the rule of thumb for identification?

A

Maximum number of single connections between observed variables must equal or exceed the number of paths specified in the model

15
Q

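How do you calculate model identification?

A

Calculate the knowns using v × (v + 1) / 2, where v = the number of observed variables (this counts the observed variances and covariances)

Then compare to the unknowns (the free parameters to be estimated).
These include:

FOR CFA / SEM
- error variances
- factor variances
- factor loadings (not including paths fixed to 1, such as those for the error terms)
- covariances between factors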
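
As a hedged worked example of the counting above (the model is hypothetical, not taken from these notes): a one-factor CFA with v = 4 indicators, where one factor loading is fixed to 1 and the other three are free.

```latex
\begin{align*}
\text{knowns}   &= \frac{v(v+1)}{2} = \frac{4 \times 5}{2} = 10 \\
\text{unknowns} &= \underbrace{4}_{\text{error variances}}
                 + \underbrace{1}_{\text{factor variance}}
                 + \underbrace{3}_{\text{free loadings}} = 8 \\
df              &= \text{knowns} - \text{unknowns} = 10 - 8 = 2
\end{align*}
```

Here the knowns exceed the unknowns, so on this counting rule the model can be estimated.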
16
Q

What are the three identification statuses a model can have?

A

- Over-identified model = more correlations than free paths in the model
- Just-identified model (saturated model) = number of correlations equal to the number of free paths in the model
- Under-identified model = fewer correlations than free paths, so the model cannot be estimated

17
Q

What can you also calculate with the v × (v + 1) / 2 formula?

A

Degrees of freedom (knowns minus unknowns)

18
Q

Recursive models are… and are always… in comparison to reciprocal models

A

Recursive models = those with connections all moving in the same direction (always identified)
Reciprocal (non-recursive) models = more complex, so identification is more complex; not common in psychology

19
Q

When estimating the model, which three things do we obtain?

A

- Direct effects and indirect effects

- Global model fit

20
Q

What are direct and indirect effects analogous to?

A

Regression coefficients for individual predictors

21
Q

What is global model fit analogous to?

A

The ANOVA (F test) for R² in regression

22
Q

Formal definition of a direct effect?

A

The path regression coefficients reflect DIRECT relations between one variable and another (controlling for the effect of any other variable also affecting the endogenous variable).
- Same as beta weights in MR (multiple regression)

23
Q

Formal definition of an indirect effect?

A

The effect of one variable on another variable via a mediator

24
Q

To calculate indirect effects we…

A

simply multiply the standardised beta weights along the pathway together

- Difficult with two or more mediators
- Significance can be tested with the Sobel test (a z test on the ratio of the unstandardised indirect effect to its standard error; needs large samples) or with bootstrapping (MacKinnon)
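
A minimal sketch of the product rule and the Sobel z test for a single mediator. The path values and standard errors are made up for illustration, the function names are hypothetical, and the standard error uses the common first-order (Sobel) approximation, which is an assumption here rather than something stated in these notes.

```python
import math


def indirect_effect(a: float, b: float) -> float:
    """Indirect effect: product of the X -> M path (a) and the M -> Y path (b)."""
    return a * b


def sobel_z(a: float, se_a: float, b: float, se_b: float) -> float:
    """Sobel z: unstandardised indirect effect divided by its standard error.

    Uses the first-order approximation se_ab = sqrt(b^2 * se_a^2 + a^2 * se_b^2).
    """
    se_ab = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    return (a * b) / se_ab


# Illustrative (made-up) unstandardised paths and standard errors.
a, se_a = 0.50, 0.10
b, se_b = 0.40, 0.10

ab = indirect_effect(a, b)
z = sobel_z(a, se_a, b, se_b)
print(f"indirect effect = {ab:.3f}, Sobel z = {z:.3f}")
```

A |z| above 1.96 would be read as significant at the .05 level, two-tailed; bootstrapping avoids the normality assumption this test relies on.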

25
Q

Total effects are…

A

- These represent the total causal effect (DIRECT AND INDIRECT) of one variable on another

Calculated by summing all direct and indirect effects
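
The sum can be sketched as below, assuming each indirect pathway is given as the list of standardised coefficients along it. The coefficients and the function name are hypothetical.

```python
import math


def total_effect(direct: float, indirect_pathways: list[list[float]]) -> float:
    """Total effect = direct path + sum over pathways of the product of their coefficients."""
    indirect = sum(math.prod(pathway) for pathway in indirect_pathways)
    return direct + indirect


# Made-up standardised coefficients:
# direct X -> Y = .30
# X -> M1 -> Y       = .50 * .40
# X -> M1 -> M2 -> Y = .20 * .30 * .50 (three-way multiplication)
total = total_effect(0.30, [[0.50, 0.40], [0.20, 0.30, 0.50]])
print(round(total, 3))
```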

26
Q

What are the “tracing rules”?

A

- You cannot enter and exit a variable on an arrowhead

- You cannot enter a variable twice on the same trace

27
Q

Even if you find good model fit, it is important to remember…

A

A good model fit does not exclude the possibility that another model will explain the data better

28
Q

If a path weight is set to 0, what does it mean?

A

The path is omitted.

Omitted paths are as important to the model as included paths. Their absence makes a theoretical statement (even if not explicitly expressed); e.g. ‘I hypothesise there are no direct effects of ethnicity and family background on grades’

29
Q

Why would we omit paths?

A

For model parsimony.

The model is therefore simpler, or more parsimonious, than a full model in which all possible paths must be estimated.

Parsimonious models (if plausible) have several advantages:
- The simplest (but sufficient) models are preferred in science. Occam's razor: ‘all other things being equal, the simplest model is the most preferred’
- It is easy for a reduced model to be a statistically worse fit than the full model, so if it survives this test of fit it has more credibility as a plausible model

Explaining more with less: a saturated model would explain everything, but a model that used just 2 variables and still explained 85% would be a great parsimonious model.

30
Q

Take-home message…

A

We constrain the model in various ways to test a theory or a particular set of research questions: we want to test our data as rigorously as we can by using the most parsimonious model.

31
Q

Recap: error terms reflect the…

A

unmeasured (influences on a variable that are not specified in the model)

32
Q

Be wary of models that are

A

close to saturation

33
Q

What is a basic summary of quantifying model fit?

A

The basic notion is that the difference between the observed correlations (the full-sample correlations, which a saturated model reproduces) and the implied correlations (from the reduced model) is the RESIDUAL, and we want this to be as small as possible

34
Q

What are the four ways of quantifying model fit in path analysis, SEM and confirmatory factor analysis?

A

come in sr. mr. Ramseay: go fuck it

1) Chi-square test (as a minimum – CMIN in AMOS)
2) Standardised Root Mean Square residual (SRMR)
3) Root mean square error of approximation (RMSEA)
4) Goodness of fit index (GFI)

35
Q

A χ² of 0 would mean…

A

- The model is the saturated model

36
Q

As the number of paths is reduced, χ² … and why?

A

goes up: removing paths means fewer parameters are estimated (more degrees of freedom), so the model reproduces the data less exactly and error increases

37
Q

What do we want the χ² value to be?

A

Low, with a high (non-significant) p value.

Significance of χ² is a measure of bad fit.

So we are looking for non-significance, because we want to show that the results are not significantly different from the saturated model despite having fewer paths.

38
Q

What is the Standardised Root Mean Square Residual (SRMR)?

A

- A residual correlation is the difference between a sample correlation and the implied correlation

- The SRMR is based on the average absolute value of the residual correlations
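
A rough sketch of the idea, assuming both matrices are already correlation matrices. It uses the usual root-mean-square form over the unique (lower-triangle) elements rather than a plain average of absolute values, so treat it as one common definition and not necessarily the one a given SEM package reports. The matrices are made up.

```python
import math


def srmr(sample: list[list[float]], implied: list[list[float]]) -> float:
    """Root mean square of residual correlations over the lower triangle (incl. diagonal)."""
    n = len(sample)
    squared_residuals = [
        (sample[i][j] - implied[i][j]) ** 2
        for i in range(n)
        for j in range(i + 1)  # unique elements only
    ]
    return math.sqrt(sum(squared_residuals) / len(squared_residuals))


# Made-up sample and model-implied correlation matrices for three variables.
sample_r = [
    [1.00, 0.50, 0.30],
    [0.50, 1.00, 0.40],
    [0.30, 0.40, 1.00],
]
implied_r = [
    [1.00, 0.50, 0.20],
    [0.50, 1.00, 0.40],
    [0.20, 0.40, 1.00],
]

value = srmr(sample_r, implied_r)
print(round(value, 4))
```

When the implied matrix equals the sample matrix, every residual is zero and the function returns 0.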

39
Q

What SRMR value are we looking for?

A

- An SRMR of zero would equal perfect fit (no residual)

- SRMR

40
Q

What is the Root Mean Square Error of Approximation (RMSEA)?

A

- Popular fit measure

- Designed to assess the approximate fit of a model while rewarding parsimony: of two models with similar explanatory power, the simpler model (fewer paths, so more df) will be favoured

41
Q

What RMSEA value are we looking for?

A

- Browne and Cudeck (1993) suggested:
  • RMSEA < .05: good fit
  • RMSEA < .08: reasonable fit
  • RMSEA above .10: poor fit
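
A small sketch of how these cut-offs might be applied. The point-estimate formula sqrt(max(χ² − df, 0) / (df × (N − 1))) is a commonly used form and is an assumption here (some software divides by N instead); the input values and function names are hypothetical.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """RMSEA point estimate, assuming the sqrt(max(chi2 - df, 0) / (df * (n - 1))) form."""
    if df <= 0:
        raise ValueError("RMSEA needs a model with positive degrees of freedom")
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def describe_rmsea(value: float) -> str:
    """Label a value using the Browne and Cudeck (1993) cut-offs quoted above."""
    if value < 0.05:
        return "good fit"
    if value < 0.08:
        return "reasonable fit"
    if value > 0.10:
        return "poor fit"
    return "between the quoted cut-offs"  # .08 to .10 is not labelled in these notes


# Made-up model results: chi-square = 36 on 20 df, N = 201.
value = rmsea(chi2=36.0, df=20, n=201)
label = describe_rmsea(value)
print(round(value, 3), label)
```

Because df sits in the denominator, a model with the same misfit but more df (fewer paths) gets a lower RMSEA, which is how the index rewards parsimony.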
42
Q

What is the goodness of fit index (GFI)?

A

- Analogous to R² (estimates the total variance accounted for by the model)

43
Q

What GFI values are good?

A

- Values closer to 1 indicate better fit

- Hu & Bentler (1999): > .95 = good, > .90 = adequate

44
Q

Theoretical issues to remember with model fit…

A

The purpose of model fitting is to rule out bad models; it cannot prove a model is good.

Bad model fit: the model doesn’t explain the data as well as other models might (e.g. a model with paths dropped/added), so refine or discard the model.

Good model fit: fails to disconfirm your model, so you may have a good model. But ‘fit’ is with reference to the variables in your model. Your model is not the one and only model:

(i) an alternative model with a different specification of paths might be even better, so it is still worth testing alternative models
(ii) there may be a more complete model (more variables)

But the status of ‘not yet disconfirmed’ is powerful in science.
45
Q

the FIRST step toward building a full SEM is?

A

Confirmatory Factor Analysis

46
Q

What does CFA do?

A

- Helps to confirm a structure and test a theoretically driven model of psychological measures

- i.e. once we have an EFA-derived measure, we can administer it to a new sample and see if we can confirm the original measurement model
- Provides important information on how a measurement tool is structured and/or how latent factors are related to each other

47
Q

Principal difference between EFA and CFA?

A

- In CFA we CONSTRAIN factor loadings (usually to 0), i.e. we do NOT allow observed items/indicators to load freely on all of the other factors (cutting some indicators off from some factors). EFA is saturated, as items can load freely on all factors without any constraints. CFA is about CONFIRMING.
- So the CFA model is more constrained than the EFA model

48
Q

What shows the strength of the relationship between an indicator and its factor?

A

Factor loadings: these estimate the relationship between an indicator and its factor. Standardised loadings can be thought of as correlations. They need to be > .50

49
Q

What are factor covariances?

A

Factor covariances: these estimate the relationships between latent factors. They are used to examine the convergent and discriminant validity of factors

50
Q

What do error terms represent?

A

- They model variation in the indicator variable that is not accounted for by the factor, e.g. anything that accounts for word vocabulary other than verbal IQ

These error terms are usually uncorrelated with each other, but you could model error correlations if you expected that responses across indicators were caused by something other than the factors, e.g. method effects

51
Q

7 Steps to setting up a CFA model?

A

1) Specify the model
2) Model identification
3) Model estimation
4) Testing model fit
5) Interpret model effects
6) Modifying models
7) Reporting results

sperm molesting inmates effect modest reporter

52
Q

When we specify the model, we are generally…

A

(SETTING UP STRUCTURE)

- We cannot know the variance of unmeasured variables

- So the error terms are fixed to 1 in model specification
- Factors are also unmeasured, so again their variance is unknown
- Fix to 1 again, but only one factor loading per factor needs to be fixed
- IMPORTANT FOR IDENTIFICATION
- Software does it for you

53
Q

How do you calculate model identification?

A

- Knowns: calculate the number of observed variances and covariances,
  i.e. v * (v + 1)/2, where v equals the number of observed variables

- Unknowns: count up the number of free paths and variances

- Calculate knowns minus unknowns for the model df

- If the model df is greater than or equal to 0, then proceed

- If not, the model needs to be re-specified
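
The counting steps above can be sketched as a small helper. The function name and the example numbers are hypothetical, and a non-negative df is only the counting rule from these notes, not a guarantee that a complex model is identified.

```python
def identification(n_observed: int, n_free_parameters: int) -> tuple[int, int, str]:
    """Return (knowns, df, status) using the v * (v + 1) / 2 counting rule."""
    knowns = n_observed * (n_observed + 1) // 2  # observed variances + covariances
    df = knowns - n_free_parameters              # knowns minus unknowns
    if df > 0:
        status = "over-identified"
    elif df == 0:
        status = "just-identified"
    else:
        status = "under-identified"
    return knowns, df, status


# Hypothetical one-factor CFA: 4 indicators, 8 free parameters
# (4 error variances + 1 factor variance + 3 free loadings).
print(identification(4, 8))
```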

54
Q

Model estimation is really 2 things…

A

- Estimate model parameters (factor loadings and covariances)
- Test global model fit (and against alternative models)

55
Q

How do we test model fit?

A

- We use the same model fit indices as earlier, e.g. model chi-square, RMSEA etc.

- There is no gold-standard fit index

- There is a lot of debate about golden rules (and otherwise) for various fit indices

- We need to consider and report a range of fit indices

- Think about the fit indices in the context of your specific model, rather than blindly applying rules of thumb

56
Q

What is the Standardised Root Mean Square Residual (SRMR)?

A

- A residual correlation is the difference between a sample correlation and the implied correlation

- The SRMR is based on the average absolute value of the residual correlations

- An SRMR of zero would equal perfect fit (no residual)

- SRMR

57
Q

What is χ²?

A

- A χ² of 0 means the model is the saturated model
- Error increases as paths are removed, and χ² goes up
- Significance of χ² is a measure of bad fit. So we are looking for non-significance, because we want to show that the results are not significantly different from the saturated model despite having fewer paths.

58
Q

What is the goodness of fit index (GFI)?

A

- Analogous to R² (estimates the total variance accounted for by the model)
- Values closer to 1 indicate better fit
- Hu & Bentler (1999): > .95 = good, > .90 = adequate

59
Q

What is the Root Mean Square Error of Approximation (RMSEA)?

A

- Popular fit measure

- Designed to assess the approximate fit of a model while rewarding parsimony: of two models with similar explanatory power, the simpler model (fewer paths, so more df) will be favoured

- Browne and Cudeck (1993) suggested:
  • RMSEA < .05: good fit
  • RMSEA < .08: reasonable fit
  • RMSEA above .10: poor fit
60
Q

After testing all this, what can we do?

A

We can test different factor models against each other – the hunt for parsimony

61
Q

And the model can be refined by…

A

building or trimming, i.e. adding paths to or deleting paths from the original model

62
Q

Model building is…

A

- Starts with a bare-bones model, then adds path(s)

- If extra paths significantly improve fit, these are added to the model

63
Q

Model trimming is…

A

- Typically starts with a saturated model and simplifies it by eliminating paths
- If the model fit does not significantly deteriorate, then paths can be removed (the model is no worse but is simpler)
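
One common way to check whether fit "significantly deteriorates" is a chi-square difference test between nested models. The sketch below is deliberately limited to the case where the two models differ by a single path (1 df), because the 1-df chi-square tail probability can be written with `math.erfc` from the standard library. The function names and chi-square values are hypothetical.

```python
import math


def chi2_diff_p_value_1df(chi2_simpler: float, chi2_fuller: float) -> float:
    """p value for a chi-square difference test when the models differ by one path (1 df)."""
    diff = chi2_simpler - chi2_fuller
    if diff < 0:
        raise ValueError("the simpler (nested) model cannot have the smaller chi-square")
    # For 1 df, P(chi-square > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(diff / 2.0))


def can_trim_path(chi2_simpler: float, chi2_fuller: float, alpha: float = 0.05) -> bool:
    """True if dropping the path does not significantly worsen fit at the chosen alpha."""
    return chi2_diff_p_value_1df(chi2_simpler, chi2_fuller) > alpha


# Made-up values: removing one path raises chi-square from 4.2 to 6.3.
p = chi2_diff_p_value_1df(6.3, 4.2)
print(round(p, 3), can_trim_path(6.3, 4.2))
```

The same test run in the other direction supports model building: a path is added if it produces a significant drop in chi-square.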

64
Q

To achieve SEM, once we have a viable CFA measurement model, we…

A

re-specify the model as a path model

65
Q

This is reflected by the fact that what could have been…

A

DOUBLE-HEADED ARROWS are specified as PATHS. A measurement model (CFA) turns into a full SEM by changing double-headed arrows into direct single-headed arrow paths.

66
Q

So we have taken the principles of path analysis and CFA and combined them to get…

A

a structural equation model (SEM).

67
Q

You can still test alternative models using the same methods as earlier, and deletion/adding of paths can be…

A

theoretically or empirically driven

68
Q

Theoretical approach includes…

A

- Model trimming/building guided by theoretical a priori considerations

e.g. ‘I hypothesise that ethnicity & family background have no direct effect on grades (effects are likely to be indirect ones) and therefore adding them as direct paths will not result in a significantly improved model’

69
Q

Empirical approach includes?

A

- Paths are added or deleted from the model purely on the basis of statistical criteria

- In model building, Modification Indices (MI), another route (the improvement in the χ² value), are examined for all paths to see which ones significantly improve the model

- Can capitalise on chance correlations

- This type of SEM is more exploratory (you cannot claim you are ‘confirming’ theory)

- Credibility of the model is improved if the model structure is replicated in another sample

70
Q

name some extensions to SEM

A

Categorical variable = Multiple-group SEM (testing SEM across a categorical variable like gender)

Hierarchical data = Multi-level SEM for data with hierarchical structure

Repeated measures = latent growth modelling

Categorical Latent variables = Mixture modelling

71
Q

Name the assumptions

A

The assumptions largely follow from those for correlation/regression analyses (see the appropriate lecture).

Linearity
- Dependent (endogenous) variables should be linearly related to independent variables
- SEM programmes can handle continuous and categorical variables, but check the coding of categorical variables and make sure the programme knows what codes are being used

Normality
- Residuals should be normally distributed and homoscedastic

Identification
- Models cannot be under-identified

Adequate sample size
- Kline recommends at least 10 times as many cases as parameters (paths), ideally 20 times
- 5 times as many cases is often insufficient

Proper model specification
- Specification error occurs when common causal variables are left out of the model

Disturbances uncorrelated with exogenous variables
- Same as MR: errors are uncorrelated with independent variables

No multicollinearity

Exogenous variables are reliably measured

72
Q

sperm molesting inmates effect modest reporter

A

7 Steps to setting up a CFA model?

1) Specify the model
2) Model identification
3) Model estimation
4) Testing model fit
5) Interpret model effects
6) Modifying models
7) Reporting results

73
Q

What is the Comparative Fit Index (CFI) in SEM?

A

The comparative fit index (CFI) analyzes the model fit

74
Q

how does CFI work?

A

by examining the discrepancy between the data and the hypothesized model

75
Q

What does CFI also account and adjust for?

A

The issues of sample size inherent in the chi-squared test of model fit and in the normed fit index

76
Q

what is the mnemonic for CFI/GFI

A

god fucks in 1 huge bentley

77
Q

what does god fucks in 1 huge bentley stand for?

A

CFI values range from 0 to 1, with larger values indicating better fit. Previously, a CFI value of .90 or larger was considered to indicate acceptable model fit. However, Hu & Bentler (1999) indicated that a value greater than .90 (around .95) is needed to ensure that misspecified models are not deemed acceptable.

78
Q

RMSEA mnemonic ?

A

run d-m-c cued the brown note on the decks in 1993

79
Q

what does run d-m-c cued the brown note on the decks in 1993 – mean?

A

Browne and Cudeck (1993) suggested RMSEA fit:
  • RMSEA < .05: good fit
  • RMSEA < .08: reasonable fit
  • RMSEA above .10: poor fit
80
Q

What 3 advantages are there to estimating this model as an SEM model as opposed to running separate regressions on scale totals?

A

1) You get an overall test of model fit, which can disconfirm your model if it does not fit the data.
2) You also get indices of approximate fit. Parameter estimates are better estimated in one go where possible, because unnecessary multi-step estimation introduces bias.
3) It is possible to estimate the unreliability of the composite measures and its impact on the regression coefficients; this is a key advantage of SEM.

81
Q

What is a composite variable?

A

A composite variable bundles different variables together into a single causal effect, which also lets us evaluate the relative contribution of each of those variables within a model.

In SEM, this is known as the creation of a Composite Variable. This composite is still an unmeasured quantity – like a latent variable – but with no error variance, and with “indicators” actually driving the variable, rather than having the unmeasured variable causing the expression of its indicators.

82
Q

What further information (beyond the overall model; the trick was to search the information given) do you need to assess the adequacy (or otherwise) of the model?

A

Sample size, Standard errors of estimates, whether estimates are standardized or not.

83
Q

Describe what we mean by ‘exogenous’ and ‘endogenous’ variables in path models.

A
Exogenous variables are specified to have no causal predictor in the model, although they
can co-vary with other exogenous variables. Endogenous variables are predicted by
exogenous variables and other endogenous variables included in the model, as well as
unspecified variables (via an error term).
84
Q

When calculating the total effect of one variable on another in a path model, one must calculate…

A

1) The indirect pathways (as well as the direct path)
2) Any pathway where a mediator leads to another variable that also leads to the DV = a three-way multiplication of the constituent paths
3) Add up all the pathways