Week 4 Mediation Flashcards
What is mediation?
Mediation looks at cases where a variable X predicts Y via a third variable M, called the mediator. The phrase ‘X influences Y via M’ captures the essence of what is happening, although there are other ways this relationship can work. We can examine multiple mediators simultaneously (e.g. the effect of stress on wellbeing is mediated by each of the Big Five personality traits, but the weight of each will differ). We can also control for other variables while this mediation is going on. More complex models can be fitted using SEM methods; in fact, you could use SEM instead of regression.
Mediation is where one i/v causes a mediating variable, which then causes a d/v, i.e. by definition there is a causal relationship. Mediation asks ‘HOW’ questions, e.g. “How does this intervention influence health?” (Moderation, by contrast, is not causal, and asks “for whom” and “when” questions.)
The causal nature is not always clear, and is sometimes only theoretical.
eg.
Testing for mediation
There are many ways to test for mediation effects, often using structural equation modelling (SEM). This is complicated, though. One example of a testing approach is the “causal steps mediation model”, which is no longer recommended because it is too cumbersome.
B&K (Baron and Kenny) terminology
Although the B&K “causal steps mediation model” is no longer used, many articles still describe things in its terms. In this model, you ask whether the direct relationship (X to Y) is significant, and then whether the indirect relationship (X to Y via the mediator M) is significant. In partial mediation, both the direct relationship and the mediation are significant. If you expected Y to depend totally on M, partial mediation would indicate that there are other mediators besides M that have not yet been identified. Total mediation occurs when the direct path is not significant but the indirect path is, i.e. there is only a significant relationship because of mediation. BUT we’re not really using this model anymore. The problems with the B&K method are overcome by the Preacher and Hayes path-analysis bootstrap technique, which can be used in SPSS.
Direct and Indirect Effects
Direct effect = the i/v predicting the d/v directly, not via the mediator (it is estimated with the mediator in the model). This effect may or may not be present when there is mediation.
The indirect effect is the effect on the d/v that passes through a mediator which is affected by the i/v. There may be multiple mediator effects. When you have 1 i/v, multiple mediators and 1 d/v, the pathways are labelled, e.g. the pathway from X to the mediator is a, the pathway from the mediator to Y is b, etc.
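A minimal sketch of the a and b pathways for one mediator, using ordinary least squares on simulated data. The data, the helper name `ols` and the labels `c` (total effect) and `c_prime` (direct effect, estimated with the mediator in the model) are assumptions for illustration, not taken from these notes or from any SPSS output.

```python
import numpy as np

# Simulated data (hypothetical): X influences M, and both influence Y.
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)
y = 0.3 * x + 0.4 * m + rng.normal(size=n)


def ols(outcome, *predictors):
    """Least-squares coefficients: intercept first, then one slope per predictor."""
    design = np.column_stack([np.ones(len(outcome)), *predictors])
    coefs, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coefs


a = ols(m, x)[1]               # path a: X -> M
_, c_prime, b = ols(y, x, m)   # c_prime: X -> Y with M in the model; b: M -> Y
c = ols(y, x)[1]               # total effect: X -> Y, mediator left out
indirect = a * b               # indirect effect: the product of the two paths

print(f"a={a:.3f}  b={b:.3f}  direct={c_prime:.3f}  "
      f"indirect={indirect:.3f}  total={c:.3f}")
```

With least-squares estimates from the same sample, the total effect equals the direct effect plus the indirect effect, which is a handy check that the paths have been labelled correctly.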
These models can be fitted with a fairly arbitrary number of mediators. When you develop equations using these Xs, Ms, as, bs and Ys, the value of ‘a’ is simply a function of the standard deviations of your predictor and your mediator and the correlation between them.
Finding ‘b’ can involve a long and complex formula. However, for that simple drawing, you can write down the equations that are needed. These are composed of correlations, standard deviations, and figures you can get straight out of the descriptives in an SPSS output. And these get derived via the bootstrapping process of re-sampling and re-sampling the data.
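As a rough illustration for the single-mediator case, the sketch below builds a and b from correlations and standard deviations only, then checks them against a least-squares fit. The formulas are the standard ones for a regression with one and two predictors; the simulated data and variable names are assumptions for illustration.

```python
import numpy as np

# Simulated data (hypothetical values, one mediator).
rng = np.random.default_rng(1)
n = 300
x = rng.normal(size=n)
m = 0.6 * x + rng.normal(size=n)
y = 0.2 * x + 0.5 * m + rng.normal(size=n)

# The "descriptives": standard deviations and pairwise correlations.
sd_x, sd_m, sd_y = (np.std(v, ddof=1) for v in (x, m, y))
r = np.corrcoef([x, m, y])
r_xm, r_xy, r_my = r[0, 1], r[0, 2], r[1, 2]

# a: slope of M on X.
a_formula = r_xm * sd_m / sd_x
# b: slope of Y on M, with X also in the model.
b_formula = (r_my - r_xy * r_xm) / (1 - r_xm**2) * sd_y / sd_m

# Same quantities from least squares, for comparison.
ones = np.ones(n)
a_ols = np.linalg.lstsq(np.column_stack([ones, x]), m, rcond=None)[0][1]
b_ols = np.linalg.lstsq(np.column_stack([ones, x, m]), y, rcond=None)[0][2]

print(f"a: formula={a_formula:.4f}  least squares={a_ols:.4f}")
print(f"b: formula={b_formula:.4f}  least squares={b_ols:.4f}")
```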
Bootstrapping
The Preacher & Hayes macros and the Hayes macros are examples of statistical bootstrapping. The process of bootstrapping is used extensively to test for mediation effects.
The process involves recognising variance in some data, and then looking at the spread within and between groups, and considering its likelihood. You can figure this out by making some assumptions about the underlying distributions. But what if you couldn’t make those assumptions? What if you didn’t know? In that case, you could run a simulation.
Take the data set—let’s say you’ve got 100 cases in your sample, and you sample 100 points of data. You are taking a sample from your own sample of 100 cases. Doesn’t that mean you just take the sample? No, what you do is take 100 data points from your actual sample, but do it with replacement. So, the same data point could come up multiple times in a particular sample, and you calculate the means, standard deviations, correlations and so forth for each sample. And then you repeat that experiment thousands of times and look at the distribution. In this way, you empirically derive a sampling distribution and use that to make inferences.
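The resampling idea described above can be sketched as follows. This is only an illustration on simulated data: the statistic chosen (a correlation), the 2,000 resamples and the function name `bootstrap_correlations` are assumptions, not something prescribed in these notes.

```python
import numpy as np

# A hypothetical sample of 100 cases.
rng = np.random.default_rng(42)
n = 100
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)


def bootstrap_correlations(x, y, n_boot, rng):
    """Resample cases with replacement and recompute the correlation each time."""
    n = len(x)
    stats = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # 100 case indices, repeats allowed
        stats[i] = np.corrcoef(x[idx], y[idx])[0, 1]
    return stats


boot_r = bootstrap_correlations(x, y, n_boot=2000, rng=rng)

# The spread of boot_r is the empirically derived sampling distribution.
lower, upper = np.percentile(boot_r, [2.5, 97.5])
print(f"observed r = {np.corrcoef(x, y)[0, 1]:.3f}")
print(f"95% percentile interval: [{lower:.3f}, {upper:.3f}]")
```

Because whole cases are resampled (the same index is used for x and y), the relationship between the variables is preserved within each resample.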
In fact, in the mediation analysis that we use, there are some significant problems with understanding the sampling distribution of the mediated path—that is, that path from x via a through M via b, to the DV, y. There are some significant problems with trying to figure out how that should be distributed—an empirical study of the pathway distributions can help overcome those problems.
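A hedged sketch of an empirical study of the mediated path: resample the cases, re-estimate a and b each time, and look at the distribution of the product a x b. The percentile interval, the simulated data and the function names are assumptions for illustration; this is not a reproduction of what any particular macro does internally.

```python
import numpy as np


def indirect_effect(x, m, y):
    """Estimate a (X -> M) and b (M -> Y, X in the model); return a * b."""
    ones = np.ones(len(x))
    a = np.linalg.lstsq(np.column_stack([ones, x]), m, rcond=None)[0][1]
    b = np.linalg.lstsq(np.column_stack([ones, x, m]), y, rcond=None)[0][2]
    return a * b


def bootstrap_indirect(x, m, y, n_boot=2000, seed=0):
    """Percentile bootstrap interval for the indirect effect."""
    rng = np.random.default_rng(seed)
    n = len(x)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample whole cases with replacement
        estimates[i] = indirect_effect(x[idx], m[idx], y[idx])
    lower, upper = np.percentile(estimates, [2.5, 97.5])
    return lower, upper


# Simulated data (hypothetical): the effect of X on Y runs through M.
rng = np.random.default_rng(7)
n = 100
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)
y = 0.5 * m + rng.normal(size=n)

point = indirect_effect(x, m, y)
lower, upper = bootstrap_indirect(x, m, y)
print(f"indirect effect = {point:.3f}, 95% interval [{lower:.3f}, {upper:.3f}]")
```

If the interval does not contain zero, the indirect effect is treated as significant; no assumption about the shape of the distribution of a x b is needed.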
In bootstrapping tools such as PROCESS, you tell SPSS what relationships you expect, and it is then able to solve the equations for a, b, c and so on.
Now we have these possible outcomes:
1. NO EFFECT, NO MEDIATION; the direct effect (X to Y) is not significant on its own, and the indirect effect (a x b) is not significant, i.e. nothing is happening here.
2. DIRECT ONLY; the direct effect (X to Y) is significant but there is no mediation effect.
3. COMPLEMENTARY MEDIATION; there are both direct and indirect effects, and they are in the same direction. So the indirect effect adds to the direct effect.
4. COMPETITIVE MEDIATION; there are both direct and indirect effects, but they are in opposite directions. This doesn’t really fit the B&K model, but could be a “partial effect”.
5. INDIRECT-ONLY MEDIATION; this is the “full mediation” in B&K. Note that you can have full mediation with a non-significant c!
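The five outcomes can be written as a small decision rule. This is only a sketch: the function name `classify_mediation` and its arguments are hypothetical, and the significance flags would come from your own output (e.g. whether each interval excludes zero).

```python
def classify_mediation(direct, indirect, direct_sig, indirect_sig):
    """Map the direct and indirect effects onto the five outcomes listed above.

    direct, indirect: estimated effects (their signs give the direction).
    direct_sig, indirect_sig: True if the corresponding effect is significant.
    """
    if not indirect_sig:
        return "direct only" if direct_sig else "no effect, no mediation"
    if not direct_sig:
        return "indirect-only mediation"
    # Both significant: compare directions via the sign of the product.
    if direct * indirect > 0:
        return "complementary mediation"
    return "competitive mediation"


print(classify_mediation(0.30, 0.12, True, True))
print(classify_mediation(0.30, -0.12, True, True))
print(classify_mediation(0.02, 0.12, False, True))
```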
Mediation with the PROCESS macro
Usually use labels such as X (predictor) for the i/v, Outcome (or Y) for the d/v, and Med1, Med2 etc. for the mediators.
Select model 4 for simple mediation.
Dichotomous and polytomous IVs
It has been stressed that regression analysis really requires quantitative variables and quasi-continuous ones (ones that have a smooth distribution). It depends on a number of factors, but you can use variables that have six, seven, or eight points in regression—that’s often close enough to continuous for it to work, as long as the distribution nicely covers that whole domain.
You can create a regression analysis with dichotomous variables using the macro, as long as you don’t use too many variables. Too many variables can make the regression hard to interpret.
The regression is done by dummy coding. A common example is something like gender: code one group as 0 and the other group as 1. In the context of a regression equation, X is going to be 0 or 1, so X turns the b on and off. If X is 0, B1X1 is 0. If X is 1, B1X1 is just B1. This means that a variable coded 0/1 gives you a b weight that is just the mean difference between your two groups (in this example, males and females). Putting an interaction between a dummy-coded dichotomous variable and some other variable will produce a difference of slopes between, in this example, the two genders.
The PROCESS macro can do some analysis using categorical predictors, and Hayes’s book contains more details on how those models work. SPSS also has some built-in techniques, including ‘CATREG’ (categorical regression), and HETCORR, which can compute correlations for polytomous variables. These kinds of outputs can be used in factor analysis, which is very useful.
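A quick check of the dummy-coding point on simulated data: with a 0/1 predictor, the b weight equals the difference between the two group means, and the intercept equals the mean of the group coded 0. The data and variable names are assumptions for illustration.

```python
import numpy as np

# Two hypothetical groups of 50 cases, coded 0 and 1.
rng = np.random.default_rng(3)
group = np.repeat([0, 1], 50)
y = 10 + 2.0 * group + rng.normal(size=group.size)

# Regress y on the dummy variable: y = b0 + b1 * group.
design = np.column_stack([np.ones(group.size), group])
b0, b1 = np.linalg.lstsq(design, y, rcond=None)[0]

mean_0 = y[group == 0].mean()
mean_1 = y[group == 1].mean()
mean_diff = mean_1 - mean_0

print(f"b0 = {b0:.3f}   mean of group coded 0 = {mean_0:.3f}")
print(f"b1 = {b1:.3f}   mean difference       = {mean_diff:.3f}")
```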
There are some far more advanced programs also.
Sample and effect size
As discussed last week for moderation, when deciding how many cases you need for your sample, you need to consider that the sample size is not just a regression sample size. You need a sample size sufficient to detect the mediation effect itself.
So how many cases do you really need? Remember that rules of thumb are not very accurate, but you can use them as basic minimum suggested sample sizes. You can use G*Power to calculate power for the total R squared of a model, or even use SPSS’s new power module, which is quite helpful. Unfortunately, G*Power does not give you the correct power for a mediation analysis, for the same reason that it doesn’t give you the correct power off the bat for a moderation analysis. To choose an expected effect size, you might look at the relevant literature, e.g. ‘most social psychology experiments in this area typically have a medium effect, so I will use that as my benchmark’. Or, if it is a clinical sample, you might decide ‘I really need to have an effect that’s at least large or else the therapy is not worth it’.
So even when you’ve done the power analysis correctly, the bit that’s missing is what you really wanted to know: is the moderator itself significant? To answer that, you have to use the R-squared difference formula. Unfortunately, there is no analytic model for this particular situation, so you have to run a simulation to make an empirical estimate. This way, you can use known data and see if the simulation will give you the right answer.
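One way such a simulation could look for a simple mediation model: generate many data sets with known a and b paths, test the indirect effect in each with a percentile bootstrap, and count how often it comes out significant. Everything here (function names, path values, standard-normal errors, the small numbers of simulations and resamples) is an assumption chosen to keep the sketch quick; a real power study would use far more repetitions.

```python
import numpy as np


def indirect_ci_excludes_zero(x, m, y, n_boot, rng):
    """Percentile bootstrap test of a * b: True if the 95% interval excludes 0."""
    n = len(x)
    ones = np.ones(n)
    ab = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample cases with replacement
        xs, ms, ys = x[idx], m[idx], y[idx]
        a = np.linalg.lstsq(np.column_stack([ones, xs]), ms, rcond=None)[0][1]
        b = np.linalg.lstsq(np.column_stack([ones, xs, ms]), ys, rcond=None)[0][2]
        ab[i] = a * b
    lower, upper = np.percentile(ab, [2.5, 97.5])
    return bool(lower > 0 or upper < 0)


def simulate_power(n, a, b, c_prime=0.0, n_sims=200, n_boot=500, seed=0):
    """Proportion of simulated samples in which the indirect effect is detected."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        # Known population model: X -> M (a), M -> Y (b), X -> Y (c_prime).
        x = rng.normal(size=n)
        m = a * x + rng.normal(size=n)
        y = c_prime * x + b * m + rng.normal(size=n)
        hits += indirect_ci_excludes_zero(x, m, y, n_boot, rng)
    return hits / n_sims


# Small demonstration run; the estimate is noisy with so few simulations.
power = simulate_power(n=100, a=0.3, b=0.3, n_sims=50, n_boot=200)
print(f"estimated power with n=100: {power:.2f}")
```

Because the true a and b are known, you can also run the simulation with both set to zero and confirm that the test rarely reports an effect, which is the "known data" check on whether the procedure gives the right answer.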