4 introduction to mediation analysis Flashcards
What is mediation analysis?
Mediation Analysis relies on the principles of regression to investigate if the relationship between variable X and variable Y is in any way mediated by a third variable (M).
It asks what X does to Y and through which underlying mechanism (M) it does so.
A change in variable X leads to a change in the mediator variable M, which subsequently changes the outcome variable Y.
The mediator is itself affected by X and in turn affects Y.
What is the difference between a mediator variable and confounding variables?
a confounder influences both X and Y, so it can produce an X-Y association even when X has no causal effect on Y
a confounder is not caused by X
-> no causal chain X-M-Y
Give two examples of possible mediator paths.
e.g. a therapeutic method (X) might affect symptoms experienced after the termination of therapy (Y) because the method influences how people interpret negative events that occur in life (M), and those interpretations then influence the extent to which symptoms are manifested.
e.g. traumatic experiences (X) might negatively influence happiness one gets from interpersonal interactions (Y) because traumatic experiences result in the manifestation of certain behaviors that others find uncomfortable to witness (M), and this in turn produces less pleasant interactions
What are the pathways of a simple mediation model?
two pathways:
indirect effect through M
direct effect X on Y
Does M have a causal influence on Y?
Yes, M causally influences Y and accounts for part of the variation in Y
however, the causal influence does not eliminate the association between X and Y
M = mediator variable, also called intermediary variable, surrogate variable, or intermediate endpoint
Do X and Y need to be associated for a possible mediation?
“lack of correlation does not disprove causation”
“correlation is neither a necessary nor a sufficient condition of causality”
-> a simple association between X and Y is no longer a precondition for mediation
EXAMPLE: Consider a scenario where a new educational program (X) is designed to improve students’ test scores (Y) by increasing their motivation (M). If the program doesn’t directly improve test scores but significantly boosts motivation, which in turn leads to better scores, a direct correlation between X (program) and Y (scores) might be weak or absent. However, the program still has a causal effect on the scores through the mediator (motivation).
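One way to see why a simple X-Y association is not required: when the models are estimated by OLS on the same data, the total effect c splits into the direct effect c´ and the indirect effect ab. The worked numbers below are hypothetical and only illustrate how the two parts can offset each other.

```latex
c = c' + ab
% hypothetical values: a = 0.5, b = 0.8, c' = -0.4
c = -0.4 + (0.5)(0.8) = 0
```

Here the indirect effect ab = 0.4 is clearly nonzero even though the total effect c is zero. In the educational program example, where c´ is zero, c equals ab, which can still be small and hard to detect as a simple correlation.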
What if X and M interact with each other? Does it change the statistical analysis?
if effect of M on Y is not straightforward
-> changes depending on X
-> this needs to be accounted for
-> include an interaction term XM (like in moderation analysis)
-> coefficient b alone no longer describes the effect of M on Y
-> direct effect of X on Y is affected
-> there is no longer a single direct effect, because it changes depending on M
(key difference of mediation to moderation analysis!)
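A sketch of what including the interaction term looks like in the model for Y. The coefficient name d is an assumption introduced here for illustration; it does not come from the flashcards.

```latex
Y = i_Y + c'X + bM + dXM + e_Y
% effect of M on Y now depends on X; effect of X on Y (M held constant) depends on M
\frac{\partial Y}{\partial M} = b + dX
\qquad
\frac{\partial Y}{\partial X} = c' + dM
```

If d = 0 the model reduces to the simple mediation model, with b and c´ as constant effects.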
Should there be testing for a possible interaction XM?
No
selective testing
evidence-based decision
no reason for prioritisation
-> equal possibility for correlations!
overfitting a model is unnecessary
What is sufficient to conclude an indirect effect/mediation of X-M-Y?
A rejection of the null hypothesis that the indirect effect is zero (or an interval estimate that doesn’t include zero) is sufficient to support a claim of mediation of the effect of X on Y through M.
tests of significance for the individual paths a and b are not required to determine whether M mediates the effect of X on Y, contrary to the causal steps logic which requires that both a and b are statistically significant.
Indeed, one does not even need to establish that the total effect of X as quantified by c is different from zero, since the size of c does not determine or constrain the size of ab.
⇒ Rather, all that matters is whether ab is different from zero by some kind of inferential standard such as a null hypothesis test or confidence interval.
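One common way to get an interval estimate for ab is a percentile bootstrap. The following is a minimal sketch under stated assumptions: the data are simulated with hypothetical path values (a = 0.5, b = 0.7, c´ = 0.2), and the helper names `ols`, `indirect_effect` and `bootstrap_ci` are made up for this example.

```python
import numpy as np

def ols(predictors, y):
    """Return OLS coefficients [intercept, slopes...] for y on the predictors."""
    design = np.column_stack([np.ones(len(y))] + list(predictors))
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def indirect_effect(x, m, y):
    """Estimate ab: a from the model of M on X, b from the model of Y on X and M."""
    a = ols([x], m)[1]
    b = ols([x, m], y)[2]
    return a * b

def bootstrap_ci(x, m, y, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for ab, resampling cases with replacement."""
    rng = np.random.default_rng(seed)
    n = len(x)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)
        estimates[i] = indirect_effect(x[idx], m[idx], y[idx])
    alpha = (1 - level) / 2
    lo, hi = np.quantile(estimates, [alpha, 1 - alpha])
    return lo, hi

# Simulated data (hypothetical values, not from the flashcards)
rng = np.random.default_rng(42)
n = 500
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)
y = 0.2 * x + 0.7 * m + rng.normal(size=n)

ab = indirect_effect(x, m, y)
lo, hi = bootstrap_ci(x, m, y)
print(f"ab = {ab:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```

If the interval excludes zero, that supports a claim of mediation without separate significance tests of a and b.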
What are three principles of mediation analysis?
- empirical claims should be based on a quantification of the effect most directly relevant to that claim
- if ab quantifies the movement of Y by X through M, measure that
not a and b separately - significance tests of a and b on their own are not a test of ab
- a claim should be based on as few inferential tests as required in order to support it
- every inferential test is fallible by nature
- why require three tests when one test of ab is enough
- convey information about the uncertainty attached to estimates of quantities
- e.g. report an interval estimate instead of only a dichotomous decision about whether M mediates
What is evidence of an existing mediation effect?
⇒ if the effect of X on Y when M is held constant (coefficient c’ in equation (3), called the direct effect of X) is closer to zero than is X’s effect without controlling for M (coefficient c in equation (1), the total effect of X), then M can be deemed a mediator of X’s effect on Y.
⇒ if M is held constant, the magnitude of the direct effect of X on Y diminishes
What is partial mediation, what is complete mediation?
partial mediation = pattern of findings where mediation is established in the presence of a significant total effect of X and the direct effect of X (c´) is different from zero
-> effect of X on Y is not fully explained by X-M-Y
complete/full mediation = all of the effect of X on Y is carried through the mediation process, meaning ab=c and c´=0
Which two linear models are required for a mediation analysis?
see notes.
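M = i_M + aX + e_M
Y = i_Y + c´X + bM + e_Y
i_M, i_Y = intercepts; e_M, e_Y = errors
a = effect of X on M
b = effect of M on Y, holding X constant
c´ = effect of X on Y, holding M constant (direct effect)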
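The two models can be estimated with ordinary least squares. The sketch below assumes simulated data with hypothetical path values (a = 0.6, b = 0.5, c´ = 0.3) and a made-up helper `fit`. It also fits the total-effect model of Y on X alone, which is not one of the two required models, only to check that c = c´ + ab.

```python
import numpy as np

def fit(design, target):
    """Return the OLS coefficients for target regressed on the design matrix."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

# Simulated data (hypothetical values, not from the flashcards)
rng = np.random.default_rng(1)
n = 1000
x = rng.normal(size=n)
m = 1.0 + 0.6 * x + rng.normal(size=n)
y = 2.0 + 0.3 * x + 0.5 * m + rng.normal(size=n)

ones = np.ones(n)

# Model 1: M = i_M + aX + e_M
i_m, a = fit(np.column_stack([ones, x]), m)

# Model 2: Y = i_Y + c'X + bM + e_Y
i_y, c_prime, b = fit(np.column_stack([ones, x, m]), y)

# Total-effect model (extra check only): Y = i + cX + e
_, c = fit(np.column_stack([ones, x]), y)

print(f"a = {a:.3f}, b = {b:.3f}, c' = {c_prime:.3f}")
print(f"indirect effect ab = {a * b:.3f}")
print(f"total effect c = {c:.3f}, c' + ab = {c_prime + a * b:.3f}")
```

With OLS on the same cases, the last line prints two equal numbers, because the total effect is the sum of the direct and indirect effects.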
What is OLS regression analysis?
fundamental statistical method used to estimate the relationships between a dependent variable and one or more independent variables
it answers the question: what is the linear equation that best predicts the dependent variable based on the independent variables? ("best" = smallest sum of squared residuals)
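A minimal sketch of the "best predicting line" idea for a single predictor, using the closed-form OLS estimates. The data are simulated with a hypothetical intercept of 1.5 and slope of 2.0, and the helper `sse` is made up for this example.

```python
import numpy as np

# Simulated data (hypothetical values, not from the flashcards)
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 1.5 + 2.0 * x + rng.normal(size=200)

# Closed-form OLS estimates for one predictor
slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
intercept = y.mean() - slope * x.mean()

def sse(b0, b1):
    """Sum of squared residuals for the line y = b0 + b1 * x."""
    return np.sum((y - (b0 + b1 * x)) ** 2)

best = sse(intercept, slope)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}, SSE = {best:.2f}")

# Moving either coefficient away from the OLS estimate increases the SSE
print(sse(intercept + 0.1, slope) > best, sse(intercept, slope - 0.1) > best)
```

The OLS line is the one with the smallest sum of squared residuals, so any other intercept or slope fits these data worse by that criterion.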
What are some assumptions the OLS regression analysis makes?
- Linearity: The relationship between the independent and dependent variables is linear.
- Independence: The residuals (errors) are independent of each other.
- Homoscedasticity: The variance of the error terms is constant across all levels of the independent variables.
- Normality: The residuals are normally distributed (particularly important for hypothesis testing regarding coefficients).
What is the direct effect of X on Y?
c´ = adjusted mean difference
two cases that differ by one unit on X but are equal on M are estimated to differ c´ units on Y
-> adjusted for M (held constant)
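The "one unit on X, equal on M" statement can be checked directly from the model for Y. The values x and m below stand for any fixed values of X and M.

```latex
\hat{Y}(X = x + 1,\, M = m) - \hat{Y}(X = x,\, M = m)
  = \left[ i_Y + c'(x + 1) + bm \right] - \left[ i_Y + c'x + bm \right]
  = c'
```

The intercept and the bM term cancel because M is the same in both cases, which leaves only c´.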
Why is M held constant in the estimation of the direct effect of X on Y?
Keeping M constant (or controlling for M) ensures that the direct effect of X on Y is isolated. This way, we can see how X influences Y directly, not through its effect on M. It’s like holding all other variables steady to focus solely on the relationship between X and Y.