Exam 2 Flashcards
Describe the primary advantages of structural equation modeling.
- Reduces measurement error by having multiple indicators of a latent variable
- Can test overall models and individual parameters
- Can statistically compare nested and non-nested models
- Can test models with multiple DVs
- Can model mediator variables (i.e., processes)
- Can model error terms
- Can model relations across groups, across time.
What is the goal of SEM?
develop a model that explains why observed variables are related (i.e., explains the variance-covariance matrix).
What are the similarities between CFA and SEM?
- Both model shared variance amongst variables (unique variance becomes error).
- Both provide factor loadings: direct relations between observed and latent variables
- Factor variances/covariances (when standardized, these are factor correlations)
Explain confirmatory factor analysis.
- A priori measurement model: THIS IS NOT AN EXPLANATORY MODEL.
- Models direct relations between observed and latent variable(s)
What are the types of variables in SEM?
- Observed (OV): measured (items, subscales, scales)
- Latent (LV): theoretical constructs defined by the observed variables
What is the goal of SEM?
Goal is to model the commonality in the observed variables (as in factor analysis)
Then examine relations between LVs (as in multiple regression)
What are the types of latent variables (LVs)?
- Exogenous: LVs that “cause” other LVs
- Endogenous: LVs that are caused by other LVs
- Pure DVs are only “caused”
- Mediators are a “cause” and a “caused”
What are the 10 commandments of SEM according to Thompson?
- Do not make conclusions that one model definitely describes data.
- After modifying models, re-test your model on an independent sample.
- Test multiple models, not just one.
- Evaluate measurement models prior to evaluating a structural model (we can still adjust the measurement model after the fact if the structural model indicates that would be beneficial).
- Use multiple criteria to evaluate model fit, to account for theoretical and practical considerations.
- Use multiple fit statistics.
- If your multivariate test requires normality, make sure your observed variables are also normally distributed.
- All other things equal, prefer more parsimonious models; they are less likely to be “nearly just-identified”.
- Consider scale and distribution of variables (measured/observed) when selecting matrix of associations to analyze.
- Do NOT use SEM with small samples.
Using your own example, describe a structural equation model with 4 latent
variables, each with 3 indicators. As part of your answer, write the equations that
would test your model.
4 LVs X 3 Indicators = 12 DVs; you write an equation for each DV:
• S1 = l(Academic Self-Esteem) + e1
• S2 = l(Academic Self-Esteem) + e2
• .....
• S5 = l(Relationship Self-Esteem) + e5
• S6 = l(Relationship Self-Esteem) + e6
• ....
• S9 = l(Exercise) + e9
• S10 = l(Diet) + e10
• S11 = l(Diet) + e11
• S12 = l(Diet) + e12
– l represents a regression coefficient (factor loading)
– e represents error (or residual)
Relation between an observed and a latent variable:
OV1 = l(LV) + e1
e.g., BDI = l11(Depression) + e1
Relation among latent variables:
endogenous LV = b(exogenous LV) + d1
e.g., Depression LV = b21(Coping LV) + d1
these equations imply a model
this model attempts to explain the variance-covariance matrix (S)
i.e., relations among observed variables
Lambda (l): the weight (factor loading), i.e., the partial relationship between the observed variable and the outcome
These regression models imply a model. If your model fits well, the implied correlations look like the original correlations.
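To make "the equations imply a model" concrete, here is a minimal sketch of how a set of loadings implies a covariance (here, correlation) matrix. It is not taken from the notes: the two-factor layout, the loading values, the factor correlation of .5, and the function names are all hypothetical, and it assumes standardized variables and uncorrelated errors.

```python
# Hypothetical two-factor model, two standardized indicators per factor.
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def implied_covariance(loadings, factor_cov, error_var):
    """Return loadings * factor_cov * loadings' + diag(error_var)."""
    loadings_t = [list(col) for col in zip(*loadings)]
    common = matmul(matmul(loadings, factor_cov), loadings_t)
    return [[common[i][j] + (error_var[i] if i == j else 0.0)
             for j in range(len(common))]
            for i in range(len(common))]


# Assumed values for illustration only.
loadings = [[0.8, 0.0], [0.7, 0.0], [0.0, 0.6], [0.0, 0.9]]
factor_cov = [[1.0, 0.5], [0.5, 1.0]]  # standardized, so this is a correlation
# With standardized indicators, error variance = 1 - squared loading.
error_var = [1 - row[0] ** 2 - row[1] ** 2 for row in loadings]

implied = implied_covariance(loadings, factor_cov, error_var)
for row in implied:
    print([round(value, 2) for value in row])
```

Under these assumptions, two indicators of the same factor are implied to correlate at the product of their loadings, and indicators of different factors at that product times the factor correlation. A well-fitting model is one where these implied values are close to the observed ones.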
Explain to me how a confirmatory factor analysis is conducted.
- Design a measurement model we hypothesize will fit the data well.
- WE ESTIMATE THE MODEL USING MAXIMUM LIKELIHOOD ESTIMATION
• In order for the model to run, we need to specify the scale of a latent variable in CFA: we can either fix the variance of the LV to 1, or fix a factor loading for each LV to 1.
With CFA, how do we determine model fit?
WE DETERMINE MODEL FIT ON TWO LEVELS: THE OVERALL MODEL FIT, AND THE INDIVIDUAL PARAMETER FIT.
Define overall model fit in context of CFA/SEM.
We assess the fit of our measurement model by using several indices:
- Comparative fit index (CFI): values greater than .95 indicate reasonable model fit, and values greater than .90 indicate a plausible model.
- Root mean square error of approximation (RMSEA): an absolute index of overall model fit; values less than .08 indicate acceptable model fit, and values less than .05 indicate good model fit.
- Standardized root mean square residual (SRMR): an absolute index of overall model fit; values less than .08 indicate acceptable model fit, and values less than .05 indicate good model fit.
We also report the chi-square goodness-of-fit test, for completeness.
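The cutoffs above can be written as a small helper. This is only a sketch of the rules of thumb stated in these notes; the function name and the "below cutoff" label are assumptions added for illustration, and the example values are made up.

```python
def describe_fit(cfi, rmsea, srmr):
    """Label each index using the rule-of-thumb cutoffs from the notes."""

    def label_absolute_index(value):
        # RMSEA and SRMR: smaller is better.
        if value < 0.05:
            return "good"
        if value < 0.08:
            return "acceptable"
        return "below cutoff"  # label is an assumption, not from the notes

    # CFI: larger is better.
    if cfi > 0.95:
        cfi_label = "reasonable"
    elif cfi > 0.90:
        cfi_label = "plausible"
    else:
        cfi_label = "below cutoff"

    return {
        "CFI": cfi_label,
        "RMSEA": label_absolute_index(rmsea),
        "SRMR": label_absolute_index(srmr),
    }


# Hypothetical fit statistics for one model.
print(describe_fit(cfi=0.96, rmsea=0.06, srmr=0.04))
```

Because the indices can disagree, the labels are reported separately rather than collapsed into a single verdict.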
Define individual parameter fit in context of CFA/SEM.
If our model fits well descriptively, we would interpret the parameters of the model (i.e., the factor loadings and the interfactor correlation, if there are multiple factors).
o The statistical tests we use for the factor loadings and covariances are the critical ratios (CRs), which are distributed as z-values. Each CR has an associated p-value, and we want it to be < 0.05.
o If the parameters do not fit, we report that, and then respecify the model.
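As a sketch of the critical-ratio test: a CR is a parameter estimate divided by its standard error, and it is referred to the standard normal distribution. The function name and the example numbers below are hypothetical; the p-value is two-tailed.

```python
import math


def critical_ratio(estimate, standard_error):
    """Return the critical ratio (a z-value) and its two-tailed p-value."""
    z = estimate / standard_error
    # Two-tailed p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical factor loading and standard error.
z, p_value = critical_ratio(estimate=0.62, standard_error=0.15)
print(f"CR = {z:.2f}, p = {p_value:.4f}, significant: {p_value < 0.05}")
```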
Define some practical issues to be aware of in CFA
o Identification: we need our model to have enough DF (to be at least just-identified) in order to run the model.
o DF = (nonredundant elements in S, the var/covar matrix) - (parameters estimated)
- elements in S = # variances & covariances
- this equals p (p+1) / 2, where p = # OVs
- parameters estimated:
• count up factor loadings, factor covariances, and IV variances estimated
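The DF count can be sketched for a simple CFA. The helper below is illustrative only and rests on assumptions not stated in the notes: each indicator loads on exactly one factor, the scale of each LV is set by fixing one loading to 1, all factors are allowed to covary, and error terms are uncorrelated.

```python
def cfa_degrees_of_freedom(n_observed, n_factors):
    """DF for a simple-structure CFA with one loading per factor fixed to 1."""
    # Nonredundant elements in S: p(p+1)/2 variances and covariances.
    elements_in_s = n_observed * (n_observed + 1) // 2

    free_loadings = n_observed - n_factors       # one loading per factor is fixed
    error_variances = n_observed                 # one per indicator
    factor_variances = n_factors
    factor_covariances = n_factors * (n_factors - 1) // 2

    parameters_estimated = (free_loadings + error_variances
                            + factor_variances + factor_covariances)
    return elements_in_s - parameters_estimated


# The 4-LV, 3-indicators-each example: 78 elements - 30 parameters.
print(cfa_degrees_of_freedom(n_observed=12, n_factors=4))
# A single factor with 3 indicators is just-identified (DF = 0).
print(cfa_degrees_of_freedom(n_observed=3, n_factors=1))
```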
What are modification indices?
Modification indices are useful when our model is ill-fitting and we need to consider changing it. They can be used to build a well-fitting model. They should not be used to generate theory, since they are purely math-driven, and might not make any theoretical sense. If we use modification indices, we need to be extremely careful, as SEM is a theory-driven technique.
Describe the goal(s) of multiple group analysis.
compare model fit across groups
groups can be “anything” (e.g., gender, ethnicity, age, disorder, etc.)
How do you conduct MGA?
- establish a baseline measurement model and fit for that model separately in each group using CFA
- establish baseline fit of the structural model using SEM
- Assess for invariance across groups in the following parameters:
- factor loadings
- factor variances/covariances
- structural/path coefficients - test each group to a baseline model
What is configural or pattern invariance?
- Configural or pattern invariance: Do the groups have the same parameters? We test this by fitting the “best” model in each group.
What is metric invariance?
Only factor loadings are constrained to be invariant (equal) across groups. All other model parameters are freely estimated for each group separately (including factor variances/covariances, error variances).
i. With this model, you explicitly constrain the value of each of these factor loadings to be the same between the groups we care about comparing.
ii. If these constraints are true, then these loadings are equivalent between these two groups. If it’s true (we have measurement equivalence), then our model should fit well (we’re measuring the same construct in these two groups equivalently).
iii. This model is going to have more DF because we’re estimating fewer parameters (only factor loadings, but we’re saying they’re the same for males and females). This is a more parsimonious model.
iv. This model here is nested within the step-1 models. Its fit is evaluated relative to the configural invariance model in step 1.
What is the factor variance/covariance invariance estimation?
i. Factor loadings are constrained to be invariant across groups (as in step 2)
ii. Factor variances/covariances are constrained to be invariant across groups
iii. Only error variances are freely estimated in each group separately.
Summarize the model steps that you go through with Multiple Group Analysis (MGA).
Model 1: Free estimation
Model 2: constrains factor loadings b/w groups
Model 3: constrains factor variance/covariance across groups
If model 3 fits best, we have the utmost confidence that we’re measuring the same construct in each group.
What is the chi square difference test?
To determine which model fits best, we use the chi-square difference test. If the chi-square difference is significant, then the baseline (less restrictive) model fits better. If it is not significant, the metric invariance model is preferred because it fits no worse and is more parsimonious.
• Difference in chi-square = chi-square of the more restrictive model (fewer parameters) - chi-square of the less restrictive model (more parameters)
• Difference in df = df of the more restrictive model - df of the less restrictive model
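A minimal sketch of the chi-square difference test for two nested models follows. The fit values are invented for illustration, and the function names are hypothetical. To stay dependency-free, the chi-square tail probability is computed with a series expansion of the incomplete gamma function rather than a statistics library.

```python
import math


def chi_square_p_value(statistic, df):
    """Upper-tail probability of a chi-square variate (series expansion)."""
    if statistic <= 0:
        return 1.0
    a, z = df / 2.0, statistic / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def chi_square_difference(chi_restricted, df_restricted, chi_free, df_free):
    """More restrictive model minus less restrictive model."""
    delta_chi = chi_restricted - chi_free
    delta_df = df_restricted - df_free
    return delta_chi, delta_df, chi_square_p_value(delta_chi, delta_df)


# Hypothetical values: metric invariance model vs. baseline model.
delta_chi, delta_df, p_value = chi_square_difference(
    chi_restricted=112.4, df_restricted=56, chi_free=105.1, df_free=48)
print(f"delta chi-square = {delta_chi:.1f}, delta df = {delta_df}, "
      f"significant: {p_value < 0.05}")
```

With these made-up numbers the difference is not significant, so the more restrictive (more parsimonious) model would be retained.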
Describe how one conducts a multiple group confirmatory factor analysis.
Our overall goal in MGA is to compare model fit across groups (e.g., gender). Basic question: are the groups equal (in terms of their model parameters)?
We determine model fit in the same way we do in SEM/CFA by using model fit indices (RMSEA, CFI, and SRMR), chi-square test for completeness, modification indices.
First, we establish a baseline fit of the measurement model in each group (using CFA)
Second, we establish the baseline fit of the structural model in each group (using SEM)
Third, we test the invariance across groups for the factor loadings, factor variances/covariances, and structural/path coefficients.
How to establish invariance in the measurement model:
- Configural or pattern invariance: the same parameters exist for all groups (e.g., same four factor loadings in both male and female models). We do this by testing the “best”/baseline model in each group.
- Metric invariance: We constrain the factor loadings to be equal across groups; all other model parameters are estimated freely for each group (including factor variances/covariances and error variances).
- Factor variance/covariance invariance: factor loadings still constrained to be equal, factor variances/covariances also constrained to be equal, only error variances are freely estimated.
Next, we interpret invariance at each step:
• Configural invariance: In this case, we have the “same” latent variable(s) in each group. I.e., “we eye-ball the factor loadings and make sure they’re in the same direction/approximate magnitude across groups”.
• Metric invariance: “we test what we eye-balled in the previous step”. If we have this, then we have the same LV in each group.
o The item has the same scaling units across groups
o This is a requirement in order to make substantive comparisons between groups on the LV
• Factor variances: groups use the same range of the construct continuum. i.e., the variance on the LV is the same across groups.
• Factor covariances: if these are equal, then we have the same associations between factors (LVs) across groups.
How are models directly compared in confirmatory factor analysis and structural equation modeling?
- We assess the fit of our measurement model by using several indices: the comparative fit index (CFI), with values greater than .95 indicating reasonable model fit and values greater than .90 indicating a plausible model; the root mean square error of approximation (RMSEA), an absolute index of overall model fit with values less than .08 indicative of acceptable model fit and values less than .05 indicative of good model fit; and the standardized root mean square residual (SRMR), an absolute index of overall model fit with values less than .08 indicative of acceptable model fit and values less than .05 indicative of good model fit. We also report the chi-square goodness-of-fit test, for completeness.
- In the case of nested models, we use a chi-square difference test to directly compare models. If the chi-square difference is significant, then the baseline (less restrictive) model fits better. If it is not significant, the metric invariance model is preferred because it fits no worse and is more parsimonious.
- Difference in chi-square = chi-square of the more restrictive model (fewer parameters) - chi-square of the less restrictive model (more parameters)
- Difference in df = df of the more restrictive model - df of the less restrictive model
Define the latent intercept and slope factors from a latent growth curve analysis.
As part of your answer, describe what the correlation between these two latent
variables tells you.
Correlation between latent slope and intercept would indicate that those with higher intercepts have steeper (or flatter) slopes than those with lower intercepts, for example.
Describe how you would conduct and interpret analytic results from a latent
growth curve analysis.
Purpose: LGCM is a way to explain change over time; it can be used to determine risk levels and assess variation in risk over time. With LGCM, we can model nonlinear relationships and we can plot individual trajectories to measure how individuals vary from an overall trajectory/trend. We can also statistically test whether trajectories change over time.
Conduct:
• To conduct LGCM, we must have at least 3 time-points.
• In LGCM, we fix the factor loadings for both LVs (slope and intercept) to specific values
How do you interpret analytic results from a latent growth curve analysis?
Interpret: Interpret the intercept, interpret the slope. Examine model fit indices.
• Intercept: Value of the outcome variable when the predictors are equal to 0. As such, the intercept can provide a test of initial status (i.e., outcome at first timepoint). You can set the intercept value to a different time-point depending on your interests. We can statistically test whether the intercept differs from 0 (if 0 is a meaningful value). We also statistically test whether there is variation around this value.
• Slope: Determines the shape of your trajectory, indicates both direction and magnitude. It represents the mean growth rate. We statistically test whether it differs from 0 and whether there is significant variation around the mean trajectory.
• Variation: significant variation around the intercept/slope provides us with interesting information (e.g., individuals change at different rates, or in different directions, individuals start in different places).
o When there is significant variation, we try to explain it:
Intercept and slope “factors” become outcome variables, “other” explanatory variables become predictors.
We do this via the confirmatory factor analysis component of SEM. LVs are slope and intercept, time is defined in the measurement model by factor loadings (time scores).
Importantly, we estimate the mean of the slope/intercept, but also the covariance structures of these new latent variables.
How are time scores used in latent growth curve modeling?
Time scores are used to build a theoretical model. Linear time scores, for example, model a linear trend (e.g., 0, 1, 2, 3), whereas quadratic time scores model a quadratic trend (e.g., 0, 1, 4, 9). Time scores are the factor loadings. We fix them to certain values to model different types of growth. When examining the intercept, we fix the factor loadings/time scores to 1 to obtain the mean value at time 1. For the slope LV, we set the time scores according to our hypotheses (e.g., linear growth, quadratic change, etc.).
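The role of time scores can be sketched by computing the means a linear growth model implies at each time point. The function name and the intercept and slope means below are hypothetical, and the sketch covers only the mean structure (no variances or errors).

```python
def implied_means(intercept_mean, slope_mean, time_scores):
    """Implied mean at each time point: 1 * intercept + time score * slope."""
    intercept_loading = 1.0  # intercept loadings are fixed to 1 at every wave
    return [intercept_loading * intercept_mean + score * slope_mean
            for score in time_scores]


# Linear time scores with 0 at the first wave: intercept = initial status.
first_wave_coding = [0, 1, 2, 3]
print(implied_means(intercept_mean=10.0, slope_mean=2.0,
                    time_scores=first_wave_coding))

# Placing the 0 at the last wave moves the intercept to the final time point.
last_wave_coding = [-3, -2, -1, 0]
print(implied_means(intercept_mean=16.0, slope_mean=2.0,
                    time_scores=last_wave_coding))
```

Both codings describe the same straight-line trajectory. Only the time point the intercept refers to changes, which is why the placement of the 0 time score should follow the research question.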
What is hierarchical linear modeling and when should it be used?
We use HLM when we have data that are clustered, or grouped and therefore vary on both an individual level, and vary based on a group/cluster-level variable. Nested data structures are commonly observed (e.g., multiple measurements per person, multiple individuals within classrooms, members of the same family, etc.). Generally, in nested data structures, sub-units are grouped within larger units. Sub-units are the level-1 variables, and the larger units are the level-2 variables. Variables are measured at both levels.
Structural equation modeling is just ______ _______ with latent variables
path analysis
What are latent variables?
theoretical constructs defined by other observed variables
SEM merges that logic of what 2 concepts?
factor analysis and multiple regression
Why do we run CFA prior to SEM?
make sure the measurement models are equivalent across groups (if we’re doing a group analysis)
CFA tells us what type of relationship between variables
correlational (NOT causal)
How are latent variables developed?
CFA
How do we determine model fit in CFA/SEM?
2 levels:
- overall model fit
- parameter fit
How do we determine overall model fit?
Use 3 fit indices:
- Comparative fit index (CFI): want values greater than .95
- Root mean square error of approximation (RMSEA): want values less than .08 (acceptable) or .05 (good)
- Standardized root mean square residual (SRMR): want values less than .08 (acceptable) or .05 (good)
- report chi square goodness of fit for completeness
How do we determine individual parameter fit?
Critical ratios (CRs), which are distributed as z-values. Each CR has an associated p-value, and we want it to be < 0.05.
What type of observed variables cannot be used in a CFA?
Our observed variables can be on any scale except for purely categorical (e.g., gender).
What are the two types of modification indices we typically use?
- Lagrange Multiplier test (identifies parameters to add to the model)
- Wald test (identifies paths that you can remove from your model).
SEM is a _____-driven technique
theory
What must you specify a priori in the CFA model?
- factor loadings (direct relations between observed and latent variables—WE NEED TO SPECIFY THE FACTOR LOADING MATRIX ACCORDING TO THE THEORY)
- error term variances (error terms are the “leftovers”)
- Factor variances/covariances (Typically we are interested in the standardized covariances between the factors, these are the factor correlations)