SEM Flashcards

1
Q

What is path analysis typically used for?

A

Examine the size and direction of direct and indirect effects between multiple variables

Examine the goodness of model fit between the researcher’s hypothesised model and the observed data

Compare the observed model fit of competing theoretical models

2
Q

How does path analysis relate to structural equation modelling?

A

Path analysis is a very simple form of structural equation modelling

3
Q

When is the term ‘path analysis’ typically used?

A

When we are modelling observed variables

This means we have a single measure of the construct, e.g. a word vocabulary test

4
Q

When is the term SEM used?

A

When we have multiple indicators of a construct and we create latent variables

5
Q

How is confirmatory factor analysis used in SEM?

A

Confirmatory factor analysis is used to create a measurement model

In SEM we then examine the relationship between these latent variables

6
Q

Outline a full SEM model

A

A full SEM is simply a combination of a measurement model (confirmatory factor analysis) and a structural model (path analysis)

7
Q

Give an example of a simple path model

A

A mediated regression is a simple path model

In a mediated model, the relationship between an IV and an outcome is accounted for, or ‘mediated’, by a third variable

Mediation implies a causal chain of relationships between the three variables. (The researcher must have clear theoretical or logical grounds for choosing the mediator and IV variables.)

8
Q

What are the requirements for mediation?

A
  1. Predictor (X) must predict mediator (Z)
  2. Mediator (Z) must predict criterion (Y)
  3. Predictor (X) must predict criterion (Y)
  4. The X and Y relationship must shrink in the presence of (Z)
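
As a rough illustration of these four conditions, the sketch below derives standardised path coefficients from three correlations using the standard two-predictor regression formula. The function name `mediation_paths` and the correlation values are hypothetical and chosen only for illustration. Significance testing is left out, so this only checks the size of the coefficients.

```python
def mediation_paths(r_xz, r_zy, r_xy):
    """Standardised paths for X -> Z -> Y plus a direct X -> Y path.

    Inputs are correlations between predictor X, mediator Z and
    criterion Y (hypothetical values in the example below).
    """
    a = r_xz                                  # X -> Z (simple regression)
    denom = 1 - r_xz ** 2
    b = (r_zy - r_xz * r_xy) / denom          # Z -> Y, controlling for X
    c_prime = (r_xy - r_xz * r_zy) / denom    # X -> Y, controlling for Z
    return {"a": a, "b": b, "c": r_xy, "c_prime": c_prime}


# Hypothetical correlations, for illustration only
paths = mediation_paths(r_xz=0.34, r_zy=0.5748, r_xy=0.39)

# Condition 4: the X-Y relationship shrinks once Z is in the model
shrinks = abs(paths["c_prime"]) < abs(paths["c"])
print(paths, shrinks)
```

With these made-up numbers the X-Y coefficient drops from .39 to about .22 once Z is included, which would be consistent with partial mediation if the remaining path were still significant.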
9
Q

When assessing a mediation effect, if the relationship between the predictor and the criterion shrinks (beta weight gets smaller) but is still significant, what does this mean?

A

Possibly partial mediation

10
Q

What needs to happen in order to conclude that full mediation has occurred?

A

The X-Y beta weight should be 0 (or at least non-significant) for full mediation

11
Q

When do researchers argue for partial mediation?

A

If the beta weight drops substantively but does not reach 0

Supported by a Sobel test of the indirect effect

12
Q

State the 6 steps that should be undertaken when conducting a path analysis

A

Specify the model

Model identification

Model estimation

Interpret model effects

Evaluate model fit

Modifying the model (examining alternative models)

(So If Emma Interviewed Everyone’s Mum)

13
Q

What are the advantages of running an SEM over multiple ordinary least squares regressions?

A

When testing a large number of effects, the analysis of multiple regressions can become very complex, whereas SEM uses maximum-likelihood-based methods to calculate the effects simultaneously.

Advantages of path analysis

  • simpler and quicker estimation of model effects
  • obtain global model fit indices
  • encourages researcher to specify causal relationships between variables beforehand
  • more direct and easier tests of alternative theoretical models and their fit
14
Q

The first step in path analysis is to specify the model, how should this be done?

A

Using theory and/or previous research, as well as logical relations between variables, to justify your path model

The path model can then be drawn using a path diagram

15
Q

In a path diagram what are typically represented by squares?

A

Observed variables

16
Q

In a path diagram what are typically represented by circles or ellipses?

A

Latent (unobserved) variables

17
Q

What do single-headed arrows represent in a path diagram?

A

Causal relationships between variables

18
Q

What is an exogenous variable?

A

These variables are considered IVs in the model

They have no specified predicted cause in the model, hence they have no single-headed arrow going into them

You can have multiple exogenous variables in the model; these are usually free to correlate with each other, although you can specify that they be uncorrelated (correlations between two or more exogenous variables are represented by a double-headed arrow between the variables)

19
Q

In a path diagram what does a double-headed arrow represent?

A

Correlations

20
Q

What are endogenous variables?

A

These variables are considered DVs in the model

They will have a directional arrow coming into them and may also have one or more directional arrows moving away if they are mediator variables

Basically these are downstream variables caused by exogenous variables

21
Q

Which variables typically have an error or disturbance term associated with them?

A

Endogenous variables

This reflects that there are also unmeasured and unspecified causal effects on these variables

These disturbance terms are usually modelled as latent variables, hence they are represented by circles

22
Q

What does a path model need to be in order to be analysed?

A

It needs to be identified

There need to be sufficient unique pieces of information (i.e. correlations in the observed data) to allow mathematical estimation of the specified model.

Identification can become tricky when dealing with complex latent variables and non-recursive models, but there are some shorthand methods for checking identification in observed variable path models.

23
Q

What is the basic rule for model identification?

A

Maximum number of single connections between observed variables must equal or exceed the number of paths specified in the model

24
Q

What is the formula to calculate the maximum number of single connections between observed variables?

A

V*(V-1)/2

Where V = number of variables

25
Q

Using the formula for the maximum number of single connections between observed variables, what do you compare this number to in order to check model identification?

A

Count all of the model pathways (ignoring disturbance/error terms)

And then compare these two numbers

The maximum number must equal or exceed the number of paths counted in the model

26
Q

There are three outcomes when checking model identification, name and explain these & state which outcomes enable models to then be estimated?

A

Over-identified model (more correlations than free paths in the model)

Just-identified model (saturated model) (number of correlations equals the number of free paths in the model)

Under-identified model (fewer correlations than free paths; the model cannot be estimated)

Only over-identified or just-identified models can be estimated
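
The shorthand identification check for observed-variable path models can be sketched in a few lines. The function names and the example models are hypothetical. This is only the counting rule from the cards above, not a general identification test for latent or non-recursive models.

```python
def max_connections(n_variables):
    """Maximum number of single connections between observed variables: V*(V-1)/2."""
    return n_variables * (n_variables - 1) // 2


def identification_status(n_variables, n_paths):
    """Compare the available connections with the number of free paths."""
    df = max_connections(n_variables) - n_paths
    if df > 0:
        return "over-identified", df
    if df == 0:
        return "just-identified", df
    return "under-identified", df


# Hypothetical models with 4 observed variables (6 possible connections)
print(identification_status(4, 5))  # one fewer path than connections
print(identification_status(4, 6))  # saturated model
print(identification_status(4, 7))  # more paths than connections
```

The second value returned is the difference between the two counts, which is the model DF when it is zero or positive.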

27
Q

What is a recursive model?

A

This is a model where all causal pathways move in the same direction, i.e. effects are uni-directional. (This is the most common form of model and is always identified.)

28
Q

What is a non-recursive model?

A

This is where there are reciprocal relationships between variables

  • more complex to analyse
  • identification issues can be very problematic in complex non-recursive models
  • not as common in the psychology literature
29
Q

After the model has been specified and constructed what happens?

A

Model estimation

30
Q

There are two primary interests estimated by the model; what are they?

A

The direct and indirect effects between variables

Global model fit

In the context of regression these would be:
  • regression coefficients for individual predictors
  • the test of overall regression model fit, i.e. the ANOVA for R squared

31
Q

Paths in models can be decomposed into what?

A

Direct and indirect effects (& error)

32
Q

Explain direct paths

A

The path regression coefficients reflect direct relations between one variable and another (controlling for the effect of any other variable also affecting the endogenous variable).

These are the same as the beta weights in normal MR (we can obtain these by simply running separate OLS regression models)

33
Q

The number next to a path in a path diagram is what?

A

They are standardised regression coefficients (the beta weights from a regression output)

34
Q

How do you interpret direct effect results from a path diagram?

A

You can test the significance of the (unstandardised) direct effects

However, you should consider the magnitude of direct effects, not just their significance

(Use past research as a guide, consider the substantive real-world meaning of effects, and use Cohen’s rule of thumb: .1 = small, .3 = medium and .5 = large)

35
Q

What are indirect effects in model estimation?

A

These are the effects of one variable on another variable via a mediator variable

In a standard one-mediator mediated regression there is one indirect effect: the effect of the IV on the DV via the mediator.

36
Q

How do you calculate an indirect effect from a path diagram which shows the standardised regression coefficients?

A

By multiplying the constituent paths.

Then compare this number with the direct pathway: the direct relationship between the two variables should shrink in the presence of the mediator.

Note that the indirect path need not be lower than the direct path; what matters is that the direct path shrinks once the mediator is included.

37
Q

Take two variables, neuroticism and depression. Say they have a direct pathway with a standardised regression coefficient of .22 and there is a mediator, avoid:
Neuroticism - avoid = .34
Avoid - depression = .5

Calculate the indirect pathway explain what these pathways show and how the indirect pathway can be interpreted

A

.34 * .5 = .17 - indirect path neuroticism to depression

Neuroticism has a .34 direct effect on avoid, but only .5 of this is transmitted to depression via avoid

The indirect pathway means that depression increases by .17 SD units for every 1 SD unit increase in neuroticism via the effects of avoid

38
Q

What does total effects refer to in model estimation?

A

Total effects represents the total causal effect of one variable on another

This is calculated by summing all of the direct and indirect effects
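
The arithmetic for the neuroticism example above can be written out as a short sketch. The variable names are illustrative only; the coefficients are the ones given in that card.

```python
# Standardised path coefficients from the neuroticism example
direct = 0.22   # neuroticism -> depression
a = 0.34        # neuroticism -> avoid
b = 0.50        # avoid -> depression

indirect = a * b             # multiply the constituent paths
total = direct + indirect    # sum the direct and indirect effects

print(f"indirect = {indirect:.2f}, total = {total:.2f}")
```

So the total effect of neuroticism on depression in that example is .22 + .17 = .39.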

39
Q

What are the two tracing rules? (NB given in the lecture slides)

A

You cannot enter and exit a variable on an arrowhead

You cannot enter a variable twice on the same trace

40
Q

What is the primary goal of constructing a causal model?

A

To express relationships between variables in terms of direct and indirect effects, based on a causal model assumed to be correct (to qualify degrees of causality)

41
Q

Discuss model fit statistics in the context of plausibility

A

A plausible model is constructed independently of the analysis using non-statistical means

Model fit statistics can give some indication of model plausibility

NB correlation does not equal causation & we cannot determine causal direction statistically

42
Q

List 4 processes that can be used to help specify causal direction of paths (& thus construct a plausible model…prior to data collection)

A
  1. Time precedence
  2. theory
  3. Previous research
  4. Logic/sound rationale
43
Q

Kline, 2005 discussed 4 empirical conditions that must be met to support causal inference, discuss these…

A
  1. Relationship: X should be correlated with Y
  2. Temporal precedence (X must precede Y in time)
  3. Non-spuriousness (X-Y relationship should hold after controlling for other variables experimentally or statistically, e.g. the third-variable issue)
  4. Correct effect priority (there are no reciprocal relationships between X and Y, or reversals of this relationship)
44
Q

How do you work out the degrees of freedom in a path model?

A

The difference between the full saturated model and the reduced model

E.g. if the full model could have 10 pathways and 8 were specified, DF = 2

45
Q

What is path analysis?

A

In its most basic form it is a simple extension of multiple regression

46
Q

When would you use a disturbance term and an error term in SEM?

A

Disturbance terms point towards latent factors and error terms to measured variables

47
Q

What is the major question asked by SEM?

A

Does the model produce an estimated population covariance matrix that is consistent with the sample (observed) covariance matrix?

Basically, is the constrained model consistent with the saturated model?

48
Q

What will the chi-square statistic and degrees of freedom be for a saturated model?

A

Both will be 0

49
Q

If the chi-square statistic for the reduced (default) model is significant, what can you conclude?

A

Bad fit of the reduced model to the data

50
Q

What is a problem with the chi-square test?

A

Sample size: with large samples, your model is likely to fit significantly worse even when differences in fit are substantively small

51
Q

What is the independence model?

A

It’s a model that specifies that all of the relationships between the variables are 0 so it will always be a bad fit to the data

It is used because some fit indices compare the default model with the independence model (‘how much better is it?’)

52
Q

Describe the standardised root mean square residual (SRMR) and say what a good fit would be

A

A residual correlation is the difference between a sample correlation and the implied correlation

The SRMR is based on the average absolute values of the residual correlations

An SRMR of zero would equal perfect fit (no residual)

SRMR
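
A minimal sketch of the idea, assuming one common definition of SRMR as the root mean square of the residual correlations over the unique elements of the matrix (the card describes it more loosely as an average absolute residual). The function names and the two correlation matrices are hypothetical.

```python
import math


def residual_correlations(sample, implied):
    """Sample minus implied correlation for each element on or below the diagonal."""
    n = len(sample)
    return [sample[i][j] - implied[i][j] for i in range(n) for j in range(i + 1)]


def srmr(sample, implied):
    """Root mean square of the residual correlations."""
    residuals = residual_correlations(sample, implied)
    return math.sqrt(sum(r ** 2 for r in residuals) / len(residuals))


# Hypothetical correlation matrices for three observed variables
sample = [[1.00, 0.34, 0.39],
          [0.34, 1.00, 0.50],
          [0.39, 0.50, 1.00]]
implied = [[1.00, 0.34, 0.39],
           [0.34, 1.00, 0.57],
           [0.39, 0.57, 1.00]]

print(round(srmr(sample, implied), 3))
```

If the implied matrix reproduced the sample matrix exactly, every residual would be 0 and so would the SRMR.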

53
Q

Describe the root mean square error of approximation (RMSEA) and Browne and Cudeck’s (1993) suggested fit level

A

Popular fit measure

Designed to assess the approximate fit of a model, rewarding parsimony

Of two models with similar explanatory power, the simpler model (fewer paths, more DF) will be favoured
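
To see how parsimony is rewarded, here is a sketch using one commonly cited formula, RMSEA = sqrt(max(chi-square - DF, 0) / (DF * (N - 1))). Some software uses N rather than N - 1, so treat this as an assumption. The function name and the chi-square, DF and sample-size values are hypothetical.

```python
import math


def rmsea(chi_square, df, n):
    """RMSEA from a model chi-square, its DF and the sample size n."""
    if df <= 0:
        raise ValueError("RMSEA is undefined when DF is 0 (saturated model)")
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Hypothetical: two models with the same chi-square but different DF
complex_model = rmsea(chi_square=12.0, df=2, n=201)  # more paths, fewer DF
simple_model = rmsea(chi_square=12.0, df=6, n=201)   # fewer paths, more DF

print(round(complex_model, 3), round(simple_model, 3))
```

With the same chi-square, the model with more DF gets the lower RMSEA, which is the sense in which the index favours the simpler model.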

54
Q

Describe the goodness of fit index (GFI) and Hu and Bentler (1999) guidelines for fit

A
  • different approach to model fit

Compares the researcher’s model with the independence model (the independence model predicts all variables are independent, i.e. zero correlations)

Analogous to R squared: estimates the total variance accounted for by our model.

GFI > .95 = good fit
GFI > .90 = adequate fit

55
Q

What is the purpose of model fitting and what limitations does it have?

A

The purpose is to rule out bad models

The limitation is that it cannot prove a good model

Bad model fit means that the model doesn’t explain the data as well as others might

Good model fit fails to disconfirm your model. You may have a good model, but ‘fit’ is with reference to the variables in your model: alternative models with different specified paths might be even better, so it is still worth testing alternative models, and there may be a more complete model (with more variables)

56
Q

What is full SEM?

A

Extends observed variable path analysis by creating a latent variable measurement model, and then examining relationships between these latent variable factors

57
Q

What is the two-step process for SEM?

A

Specify and estimate a candidate measurement model (aka confirmatory factor analysis)

Once you have a viable measurement model, you re-specify the model as a structural model and examine the relationships between latent factors

58
Q

How do confirmatory factor analysis (CFA) models test measurement models, and what do they do?

A

They are used to test theoretically derived models of psychological measures

Often used in the development of psychological measures after having used EFA (exploratory factor analysis) to initially develop and refine the measure

Once we have an EFA-derived measure we can administer it to a new sample and see if we can confirm the original measurement model

Can tell us important information about how a measurement tool is structured and/or how latent factors relate to each other

59
Q

Discuss CFA vs EFA

A

The principles underlying CFA are largely the same as those in EFA

Before undertaking a CFA we should use the same assumption checks & data screening as EFA

The typical difference between the two is that in CFA we constrain factor loadings (usually to be 0), i.e. we do not allow all observed items/indicators to load freely on all of the factors

So the CFA model is a more constrained version of the EFA model

60
Q

What is an indicator variable and how is it represented in SEM?

A

Indicator variables are measured (observed) variables

They are represented by a square

61
Q

What do the factor loadings do?

A

Estimate the relationship between the factor and the observed indicator

Can be thought of as the correlation between the factor and the indicator in standard CFA models

We typically like these to be > .50

62
Q

What are the factor covariances in CFA?

A

Estimate the relationship between latent factors

We can use this information to examine the convergent and discriminant validity of the factors

63
Q

What are error terms in CFA?

A

These model variation in the indicator variable not accounted for by the factor, e.g. anything else that accounts for variance in the indicator variable (other influences and error)

These error terms are usually uncorrelated with each other, but you could model error correlations if you expected that responses across indicators would be caused by something other than the factors, e.g. method effects

64
Q

Describe the 8 steps of designing a CFA model

A
  1. Refer to theory/previous research to ascertain the appropriate level
  2. Specify the model
  3. Model identification
  4. Model estimation
  5. Testing model fit
  6. Interpret model effects
  7. Modifying models
  8. Reporting results

65
Q

When designing a CFA model you first need to specify the model what should you fix the error terms to?

A

As you cannot know the variance of unmeasured variables

Fix the error variances to 1 in model specification

Or fix raw error loadings to 1 (AMOS default) - sets error variance based on indicator variance

This is important for model identification

66
Q

When designing a CFA model you first need to specify the model what should you fix the factor variance to?

A

Factors are unmeasured so variance is unknown

Fix factor variance to 1 or set raw factor loadings to 1

(Only need one factor loading to be set to 1 per factor)

This is important for identification of the model

67
Q

What does CFA use to estimate unknown values, e.g. factor loadings, in the variance/covariance matrix?

A

Known values

Number of knowns = v*(v+1)/2

Where v equals the number of variables

68
Q

How can you find out the model is identified?

A

Calculate the knowns: v*(v+1)/2

And the unknowns (count up the number of free paths and variances)

Subtract the unknowns from the knowns to get DF

If model DF is greater than or equal to 0 then proceed, i.e. the model is identified

If not, you need to re-specify your model
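
The counting check above can be sketched as follows. The function name and the example models are hypothetical, and the check only counts parameters; it does not by itself guarantee that every part of a model can be estimated.

```python
def cfa_degrees_of_freedom(n_indicators, n_free_parameters):
    """DF = knowns - unknowns, where knowns = v*(v+1)/2."""
    knowns = n_indicators * (n_indicators + 1) // 2
    return knowns - n_free_parameters


# Hypothetical two-factor CFA: 6 indicators, 3 per factor,
# factor variances fixed to 1.
# Unknowns: 6 loadings + 6 error variances + 1 factor covariance = 13
df = cfa_degrees_of_freedom(n_indicators=6, n_free_parameters=13)
print(df)  # 21 knowns - 13 unknowns

# Hypothetical single-factor CFA with 3 indicators:
# 3 loadings + 3 error variances = 6 unknowns against 6 knowns
df_single = cfa_degrees_of_freedom(n_indicators=3, n_free_parameters=6)
print(df_single)
```

The first model has DF above 0 and the second has DF equal to 0, so both pass the counting rule.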

69
Q

There is a simple heuristic for standard CFA models, e.g. models with uncorrelated error terms and where each indicator loads on just one factor. What is it?

A

If a model with a single factor has 3 or more indicators it will be identified

If a model with 2 or more factors has 2 or more indicators per factor it will be identified

70
Q

In CFA what are we typically looking to do? (Model estimation)

A

Estimate model parameters, e.g. factor loadings and factor covariances

Test global model fit
(We can also then compare the fit of competing measurement models, specify alternative models etc)

71
Q

In a CFA model what rule of thumb is used to suggest that two factors may be redundant?

A

If the factor correlations are > .75-.80 then this may suggest that the model is ‘over-factored’ or that one of the factors is redundant. A more plausible model might involve collapsing the factors into one and re-estimating (this is where you would also need to rely on what theory and previous research suggest)

72
Q

In a CFA model if factor loadings are low on to an indicator then what may this suggest?

A

That you should possibly remove this indicator from your measure in the future (i.e. if you have a questionnaire and your factor does not load highly on to item 6, maybe this item is not really tapping into the factor that you want, so remove it as it just adds noise to your data)

73
Q

In CFA what model fit indices should you use to evaluate the global model fit?

A

The same as in path analysis:
  • Residual correlations (sample correlations minus implied correlations; sample correlations are observed correlations and implied correlations are calculated from the model loadings; smaller residual correlations = better-fitting model; larger specific residual correlations may indicate that part of the model is misspecified)
  • Chi-square (examines the fit of an individual model by comparing the model with the observed data, so we want a non-significant chi-square value, i.e. no significant difference between model and data; we can also directly test differences between nested (hierarchical) models using the chi-square difference, with the difference between model DF giving the critical chi-square value)
  • RMSEA & GFI as well
  • SRMR (average absolute value of the residual correlations, so the closer to 0 the better; 0 means perfect fit) - SRMR

74
Q

Where do you look in Amos to check the chi-square and GFI for a CFA?

A

Check the default model in Amos (non-significant = good)

75
Q

What does the RMSEA need to be to be a good fit?

A

Very close to 0

76
Q

Explain model building in CFA context

A

Start with a bare bones model and then add path(s)

If extra paths significantly improve fit these are added to the model

77
Q

Explain model trimming in the context of CFA

A

Typically start with a saturated model and simplify it by eliminating paths

If the model fit does not significantly deteriorate then paths can be removed (the model is no worse but simpler, i.e. more parsimonious)

78
Q

Discuss a model building example

A

Calculate chi-square for first model

& then second

Calculate difference between the chi-square statistic for each model

If the chi-square difference is significant then the model is significantly improved by adding paths, and these can be retained in your refined model

NB when checking whether the chi-square difference is significant, you use the difference between the models’ DF to look up the critical value in the table

79
Q

What are modification indices (MI)

A

MI can be used to add individual paths to the model

These are an output from Amos

The larger the MI, the greater the improvement in model fit

The usual convention is that MI > 4 suggests an improvement in model fit and the path should be added

80
Q

If you have 2 models, one is the saturated model so has 0 DF and the second model has a DF of 2. The difference between the chi-square statistics for the 2 models is 4.03. Looking up a chi-square for DF 2 in a table gives a critical value of 5.99. Is the new model a better fit to the data or not?

A

Yes: 4.03 is smaller than 5.99, so there is no significant difference. The new model does not fit significantly worse than the saturated model, so as it is more parsimonious it is accepted.
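
The table lookup in this example can be reproduced with a small sketch. The helper `chi2_sf_even_df` is a hypothetical name; it uses a closed form for the chi-square upper tail that is only valid for even DF, which is enough for the DF = 2 case here. A statistics library would normally be used instead.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail p-value of a chi-square statistic, valid for even DF only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive, even DF")
    half = x / 2
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms


delta_chi2 = 4.03   # difference in chi-square between the two models
delta_df = 2        # difference in DF between the two models

p_value = chi2_sf_even_df(delta_chi2, delta_df)
critical = -2 * math.log(0.05)   # 5% critical value when DF = 2

print(round(p_value, 3), round(critical, 2))
```

The difference of 4.03 falls below the critical value, so the p-value is above .05 and the more parsimonious model is retained, matching the answer above.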

81
Q

To use a chi-square test for differences between models the models must be nested, what model indices do you use if the models are not nested?

A

AIC and BIC

82
Q

When model trimming/building is guided by theoretical a priori considerations what is this approach called?

A

Theoretical approach

83
Q

Explain the empirical approach of respecification

A

Paths are added or deleted from model purely based on statistical criteria

In model building MIs for all paths are examined to see which ones significantly improve the model

Can capitalise on chance correlations

This type of SEM is more exploratory

The credibility of the model improves if the model structure is replicated in another sample

84
Q

Describe the extension to SEM that looks at multiple-group SEM

A
  • test an SEM across a categorical variable, e.g. gender

We might want to look at model estimates in different groups, or see whether a particular model holds across groups, i.e. it is invariant across groups

This can be done for a CFA model or a full SEM

Uses the principle of iteratively constraining parameters in the model to equality across the groups (implying they are the same in each group), and then looking to see if this produces a significant decrement in model fit

If a significant decrease in model fit occurs, you then have to identify which parameters have caused this problem, i.e. you can iteratively free parameters to identify the source of the misfit

85
Q

Multiple-group SEM is often undertaken across a series of steps, what are these?

A

Estimate the model simultaneously in the groups, freely estimating all of the model parameters. This is often referred to as a test of configural invariance

If the above model shows good fit, you could then test a further model that constrains the factor loadings and/or path coefficients to equality across the groups. If this model shows good fit, you can then assume the model parameters are consistent across groups. If not, you can iteratively free paths to diagnose the ill-fit and establish what is referred to as partial invariance

86
Q

What are the assumptions for SEM?

A

Assumptions that also apply to correlation/regression:
Linearity (dependent (endogenous) variables should be linearly related to IVs (exogenous))

SEM programmes can handle continuous and categorical variables, but check the coding of categorical variables and make sure the programme knows what codes are being used

Normality (residuals should be normally distributed and homoscedastic)

Disturbances uncorrelated with endogenous variables

No multicollinearity

Exogenous variables are reliably measured

Additional
Identification (models cannot be under-identified)

Adequate sample size
(Kline recommends at least 10 times as many cases as parameters (paths) - ideally 20)

Proper model specification (specification errors occur when common causal variables are left out of the model)