Lecture 13: Path Analysis using MLR Flashcards
An introduction
What is Path Analysis (PA)?
a method for decomposing correlation coefficients into component parts (direct and indirect effects) within a system of causally related variables (another way of applying MLR).
– requires the investigator to theorize about the causal relations among a set of variables and apply this framework to decompose the correlations among the variables.
What are Path coefficients?
standardized regression coefficients (β) obtained from a set of inter-related regression models.
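As a minimal sketch of this definition, assume a hypothetical three-variable recursive model in which variable 1 affects 2, and both 1 and 2 affect 3. With standardized variables, the βs can be computed directly from the correlations. The function name and the correlation values below are made up for illustration only.

```python
def path_coefficients(r12, r13, r23):
    """Path coefficients for the hypothetical model 1 -> 2, 1 -> 3, 2 -> 3.

    Each coefficient is the standardized regression weight (beta) from
    regressing an endogenous variable on the variables that precede it.
    """
    # Regression of 2 on 1: with one predictor, beta equals the correlation.
    p21 = r12
    # Regression of 3 on 1 and 2: standard two-predictor beta formulas.
    denom = 1 - r12 ** 2
    p31 = (r13 - r23 * r12) / denom
    p32 = (r23 - r13 * r12) / denom
    return p21, p31, p32


# Illustrative (made-up) correlations, not data from the lecture
p21, p31, p32 = path_coefficients(r12=0.50, r13=0.40, r23=0.60)
print(round(p21, 3), round(p31, 3), round(p32, 3))
```

Two regressions are needed here because the model has two endogenous variables (2 and 3); that is what "a set of inter-related regression models" amounts to in this small case.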
What are three explanations for why X and Y appear correlated?
X causes Y
Y causes X
X and Y have a common cause Z
What are the component parts of a correlation?
– Direct Effects (DE)
– Indirect Effects (IE)
– Spurious Effects (S) - due to common causes
– Unanalyzed Effects (U) - due to correlated causes
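To make the decomposition concrete, here is a small sketch assuming a hypothetical recursive model (1 -> 2, 1 -> 3, 2 -> 3, with variable 1 as the only exogenous variable). The path coefficient values are made up. With a single exogenous variable there is no unanalyzed component; that would arise only with two or more correlated exogenous variables.

```python
# Hypothetical path coefficients for the model 1 -> 2, 1 -> 3, 2 -> 3
p21, p31, p32 = 0.50, 0.20, 0.40

# r13: variable 1 affects 3 directly and indirectly through 2.
direct_13 = p31
indirect_13 = p21 * p32
r13 = direct_13 + indirect_13

# r23: variable 2 affects 3 directly; the remainder is spurious because
# variable 1 is a common cause of both 2 and 3.
direct_23 = p32
spurious_23 = p21 * p31
r23 = direct_23 + spurious_23

print(f"r13 = {direct_13:.2f} (DE) + {indirect_13:.2f} (IE) = {r13:.2f}")
print(f"r23 = {direct_23:.2f} (DE) + {spurious_23:.2f} (S) = {r23:.2f}")
```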
How could a path model provide support for the proposed theory?
If the estimates from the path model are consistent with the observed data
What are the 5 assumptions of Path Analysis?
- The relations among the variables are linear, additive, and causal.
- The residuals are uncorrelated (each residual is not correlated with the variables that precede it in the model).
- There is a one-way causal flow in the system. Reciprocal causation between variables is ruled out.
- The variables are measured on an interval-level scale (categorical variables are not accommodated because no dummy coding is used).
- The variables are measured without error (use variables with high reliability).
What is an exogenous variable?
a variable whose variability is assumed to be determined by causes outside the model.
– No attempt is made to explain its variability or its relations to other exogenous variables.
– Treated as “givens” and remain unanalyzed.
What is an endogenous variable?
a variable that is predicted by other variables [variation is explained by exogenous (or other endogenous) variables in the system under study].
– can also serve as a predictor of other endogenous variables
Describe the notation for a path coefficient (p₃₁ or p₃₂)
– the first number indicates the variable that the arrow points to
– the second number indicates where the arrow originates
What is a just-identified path model?
number of path coefficients = the number of correlation coefficients
– has as many parameters as data points.
– always reproduces the correlations perfectly
What is an over-identified path model?
has fewer path coefficients than correlations (more knowns than unknowns)
– may reproduce the correlations adequately
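A rough sketch of why an over-identified model can fail to reproduce a correlation, assuming a hypothetical three-variable model in which the direct path from 1 to 3 has been deleted (the chain 1 -> 2 -> 3). The function name and correlation values are illustrative, not taken from the lecture.

```python
def reproduced_r13_chain(r12, r23):
    """Correlation between 1 and 3 implied by the chain model 1 -> 2 -> 3.

    With p31 fixed to 0, each endogenous variable has a single predictor,
    so p21 = r12 and p32 = r23, and the only route from 1 to 3 is the
    indirect one through 2.
    """
    p21 = r12
    p32 = r23
    return p21 * p32


# Illustrative (made-up) observed correlations
r12, r13, r23 = 0.50, 0.40, 0.60
implied = reproduced_r13_chain(r12, r23)
discrepancy = r13 - implied
print(f"observed r13 = {r13:.2f}, implied r13 = {implied:.2f}, "
      f"discrepancy = {discrepancy:.2f}")
```

The model has two path coefficients but three correlations, so r13 is not guaranteed to be matched; whether a given discrepancy counts as adequate is a judgment about fit.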
What is Theory Trimming in Path Analysis?
Deleting trivial paths to build a simpler model
– If deleting the paths does not significantly degrade the fit of the model, parsimony dictates that the simpler (over-identified) theory is to be preferred
How can we assess the fit of an over-identified path model against a just-identified model?
in terms of variance explained.
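• The proportion of variance explained by the just-identified model is defined as:
R²(sub-m) = 1 - [(1 - R²₁)(1 - R²₂) … (1 - R²ᵢ)]
where each R²ᵢ comes from one regression equation of the just-identified model.
• The proportion of variance explained by the over-identified model is defined as:
M = 1 - [(1 - R²₁)(1 - R²₂) … (1 - R²ᵢ)]
where each R²ᵢ comes from the corresponding regression equation of the over-identified model (after the paths have been deleted).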
** The smaller M is, when compared to R²(sub-m), the poorer the fit of the over-identified model to the data.
(M << R²(sub-m) = over-identified model not good)
What is a measure of relative goodness-of-fit?
Q = [1 - R²(sub-m)] / [1 - M]
• When Q is close to 1.0, the over-identified model fits the data well relative to the just-identified model
• A transformation of Q is approximately distributed as χ² and can be tested for significance (desired result is a nonsignificant test).
χ² = - (N - d) ln(Q)
[N = sample size and d = # of paths dropped (fixed to 0) in the over-identified model; d is also the degrees of freedom for the test]
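The fit calculation can be sketched in a few lines. This is an illustration under stated assumptions, not a definitive implementation: the function names and all R² values are hypothetical, and the test is assumed to use d degrees of freedom, with 3.841 as the standard χ² critical value for df = 1 at α = .05.

```python
import math


def generalized_r2(r2_values):
    """1 minus the product of (1 - R2) over the model's regression equations."""
    unexplained = 1.0
    for r2 in r2_values:
        unexplained *= (1.0 - r2)
    return 1.0 - unexplained


def fit_test(r2_just, r2_over, n, d):
    """Return (R2m, M, Q, chi-square) comparing the two models.

    r2_just: R2 from each equation of the just-identified model
    r2_over: R2 from each equation of the over-identified model
    n: sample size; d: number of paths fixed to 0
    """
    r2_m = generalized_r2(r2_just)
    m = generalized_r2(r2_over)
    q = (1.0 - r2_m) / (1.0 - m)
    chi_sq = -(n - d) * math.log(q)
    return r2_m, m, q, chi_sq


# Illustrative (made-up) values: two equations, one path dropped, N = 100
r2_m, m, q, chi_sq = fit_test(
    r2_just=[0.25, 0.40], r2_over=[0.25, 0.38], n=100, d=1
)
critical_05_df1 = 3.841  # chi-square critical value, alpha = .05, df = 1 (assumed df = d)
print(f"R2m = {r2_m:.3f}, M = {m:.3f}, Q = {q:.3f}, chi-square = {chi_sq:.2f}")
print("significant" if chi_sq > critical_05_df1 else "nonsignificant")
```

Because M can never exceed R²(sub-m) when paths are only deleted, Q stays at or below 1 and the χ² value is non-negative; the closer Q is to 1, the smaller the statistic.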
Why do we want a nonsignificant Chi square test when measuring the fit of an over-identified path model?
a nonsignificant test indicates that the simpler (parsimonious) model is not significantly worse than the just-identified model