NOTE: We are not fully confident that the methods on this page are valid for testing for mediated effects in multilevel models. Proceed at your own risk.
Mediator variables are variables that sit between the independent variable and dependent variable and mediate the effect of the IV on the DV. A model with one mediator is shown in the figure below.
The idea in mediation analysis is that some of the effect of the predictor variable, the IV, is transmitted to the DV through the mediator variable, the MV, while some of the effect of the IV passes directly to the DV. The portion of the effect of the IV that passes through the MV is the indirect effect. The program ml_mediation will compute direct and indirect effects for multilevel data (see How can I use the search command to search for programs and get additional help? for more information about using search). The approach used in ml_mediation was adapted from Krull & MacKinnon (2001).
When you have multilevel data, the variables may come from different levels of the model. The DV will always be a level 1 variable. Depending on your data, the IV and MV may be either level 1 or level 2 variables. According to Krull & MacKinnon (2001), a predictor variable may be mediated by a variable at the same level or lower. Thus a level 2 predictor may be mediated by a level 2 or level 1 variable. A level 1 predictor may only be mediated by another level 1 variable because, logically, a level 1 predictor cannot affect a level 2 mediator.
ml_mediation computes the indirect effect as the product of coefficients, i.e., indirect effect = coef[a]*coef[b]. When the response variable of an equation is at level 1, ml_mediation uses the xtmixed, reml command by default, with xtmixed, mle available as an option. When the response variable is at level 2, i.e., the MV is level 2, ml_mediation uses the xtreg, be command. The ml_mediation program will detect which variables are level 1 and which are level 2.
The DV and MV must be continuous variables. The IV may be a continuous or binary predictor variable, while the covariates (CVs) may be continuous, binary or factor variables.
We will illustrate the use of the ml_mediation command with a simulated multilevel dataset, ml_med.dta. Let’s look at the data.
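For the case in which all variables are at level 1, the product-of-coefficients approach can be written out as three equations. This is a minimal sketch that assumes random-intercept models; the symbols Y, X and M (for the DV, IV and MV), the subscripts i (observation) and j (cluster), and the intercept and error terms are notation introduced here for illustration and are not taken from ml_mediation itself.

```latex
\begin{aligned}
Y_{ij} &= \beta_{0} + c\,X_{ij} + u_{0j} + e_{ij}
  && \text{(equation 1: } c \text{ path)} \\
M_{ij} &= \gamma_{0} + a\,X_{ij} + v_{0j} + r_{ij}
  && \text{(equation 2: } a \text{ path)} \\
Y_{ij} &= \delta_{0} + b\,M_{ij} + c'\,X_{ij} + w_{0j} + \varepsilon_{ij}
  && \text{(equation 3: } b \text{ path and } c' \text{)} \\[4pt]
\text{indirect effect} &= a \times b, \qquad
\text{direct effect} = c', \qquad
\text{total effect} = a\,b + c'
\end{aligned}
```

In a multilevel model the total effect computed as ab + c' need not equal the c coefficient from equation 1 exactly.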
    use https://stats.idre.ucla.edu/stat/data/ml_med, clear

    summarize, sep(0)   /* descriptive statistics */

        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
              id |       200       100.5     57.87918         1        200
           write |       200      52.775     9.478586        31         67
           socst |       200      52.405     10.73579        26         71
             cid |       200       10.43     5.801152         1         20
            abil |       200     156.725     25.75063       104        215
       mean_abil |       200     156.725     25.21654  114.0909      205.7
        mean_ses |       200       2.055     .3142828  1.444444   2.727273
             hon |       200        .545     .4992205         0          1
The variables write, socst, abil and hon are all level 1 variables. The variable cid is the cluster (level 2) identifier, hon is a binary variable that indicates membership in the honor society, and abil is a composite measure of academic ability. Now we are ready to try a multilevel mediation model in which all of the variables are at level 1.
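One way to confirm which variables are level 1 and which are level 2 is to check whether a variable ever varies within a cluster. The sketch below is only an illustration and is not necessarily how ml_mediation performs its own detection; the helper variable names varies_abil and varies_mses are made up for this example.

```stata
* Assumes ml_med.dta is already in memory (see the use command above).
* A variable is level 2 if it is constant within every cluster (cid).
* Sorting each cluster by the variable puts its minimum first and its
* maximum last, so the two differ only if the variable varies.
bysort cid (abil): generate byte varies_abil = abil[1] != abil[_N]
bysort cid (mean_ses): generate byte varies_mses = mean_ses[1] != mean_ses[_N]

* 1 = varies within the cluster (level 1); 0 = constant (level 2)
tabulate varies_abil
tabulate varies_mses

* Remove the helper variables so they do not clutter later analyses
drop varies_abil varies_mses
```

The check assumes there are no missing values in the variables being tested, which holds for this dataset since every variable has 200 observations.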
    ml_mediation, dv(write) iv(hon) mv(abil) l2id(cid)

    Equation 1 (c_path): write = hon

    Performing EM optimization:

    Performing gradient-based optimization:

    Iteration 0:   log restricted-likelihood = -628.62552
    Iteration 1:   log restricted-likelihood = -628.62552

    Computing standard errors:

    Mixed-effects REML regression                   Number of obs      =       200
    Group variable: cid                             Number of groups   =        20

                                                    Obs per group: min =         7
                                                                   avg =      10.0
                                                                   max =        12

                                                    Wald chi2(1)       =     32.80
    Log restricted-likelihood = -628.62552          Prob > chi2        =    0.0000

    ------------------------------------------------------------------------------
           write |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             hon |   4.138289   .7225934     5.73   0.000     2.722032    5.554546
           _cons |   50.64367    1.84665    27.42   0.000      47.0243    54.26304
    ------------------------------------------------------------------------------

    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    cid: Identity                |
                       sd(_cons) |    7.91701   1.331807      5.693395    11.00908
    -----------------------------+------------------------------------------------
                    sd(Residual) |   4.823492   .2549056       4.34889    5.349889
    ------------------------------------------------------------------------------
    LR test vs. linear regression: chibar2(01) =   191.99 Prob >= chibar2 = 0.0000

    Equation 2 (a_path): abil = hon

    Performing EM optimization:

    Performing gradient-based optimization:

    Iteration 0:   log restricted-likelihood = -659.69204
    Iteration 1:   log restricted-likelihood = -659.69204

    Computing standard errors:

    Mixed-effects REML regression                   Number of obs      =       200
    Group variable: cid                             Number of groups   =        20

                                                    Obs per group: min =         7
                                                                   avg =      10.0
                                                                   max =        12

                                                    Wald chi2(1)       =     31.36
    Log restricted-likelihood = -659.69204          Prob > chi2        =    0.0000

    ------------------------------------------------------------------------------
            abil |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             hon |  -4.265397   .7616216    -5.60   0.000    -5.758148   -2.772647
           _cons |   159.3095   5.751541    27.70   0.000     148.0367    170.5823
    ------------------------------------------------------------------------------

    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    cid: Identity                |
                       sd(_cons) |   25.60223   4.169551      18.60596    35.22926
    -----------------------------+------------------------------------------------
                    sd(Residual) |   5.074532   .2681952      4.575188    5.628375
    ------------------------------------------------------------------------------
    LR test vs. linear regression: chibar2(01) =   537.80 Prob >= chibar2 = 0.0000

    Equation 3 (b_path & c_prime): write = abil hon

    Performing EM optimization:

    Performing gradient-based optimization:

    Iteration 0:   log restricted-likelihood = -528.74216
    Iteration 1:   log restricted-likelihood = -528.74216

    Computing standard errors:

    Mixed-effects REML regression                   Number of obs      =       200
    Group variable: cid                             Number of groups   =        20

                                                    Obs per group: min =         7
                                                                   avg =      10.0
                                                                   max =        12

                                                    Wald chi2(2)       =    665.58
    Log restricted-likelihood = -528.74216          Prob > chi2        =    0.0000

    ------------------------------------------------------------------------------
           write |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            abil |  -.8056925   .0348556   -23.12   0.000    -.8740083   -.7373768
             hon |    .671848   .3882241     1.73   0.084    -.0890572    1.432753
           _cons |   179.0213   8.446553    21.19   0.000     162.4664    195.5763
    ------------------------------------------------------------------------------

    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    cid: Identity                |
                       sd(_cons) |   28.44004   4.705583      20.56333    39.33388
    -----------------------------+------------------------------------------------
                    sd(Residual) |    2.38897   .1268631      2.152825    2.651018
    ------------------------------------------------------------------------------
    LR test vs. linear regression: chibar2(01) =   247.90 Prob >= chibar2 = 0.0000

    The mediator, abil, is a level 1 variable
    c_path  = 4.1382892
    a_path  = -4.2653975
    b_path  = -.80569254
    c_prime = .67184798  same as dir_eff
    ind_eff = 3.4365989
    dir_eff = .67184798
    tot_eff = 4.1084469

    proportion of total effect mediated = .83647154
    ratio of indirect to direct effect  = 5.1151437
    ratio of total to direct effect     = 6.1151437
The output includes the results of three equations: 1) the DV on the IV, 2) the MV on the IV, and 3) the DV on the MV and IV. The direct, indirect and total effects along with various proportions and ratios are shown below the results of the three equations.
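The three equations can also be fit by hand, which makes the arithmetic behind the summary lines explicit. This is a rough sketch that assumes random-intercept xtmixed models matching the output above (group variable cid, REML estimation); it is not taken from the ml_mediation source code, and the scalar names are chosen here for illustration.

```stata
* Assumes ml_med.dta is in memory.

* Equation 2 (a path): mediator regressed on the IV
quietly xtmixed abil hon || cid:, reml
scalar a_path = _b[hon]

* Equation 3 (b path and c prime): DV regressed on the mediator and the IV
quietly xtmixed write abil hon || cid:, reml
scalar b_path  = _b[abil]
scalar c_prime = _b[hon]

* Product-of-coefficients indirect effect and the derived quantities
scalar ind_eff = a_path * b_path
scalar tot_eff = ind_eff + c_prime

display "indirect effect     = " ind_eff
display "direct effect       = " c_prime
display "total effect        = " tot_eff
display "proportion mediated = " ind_eff / tot_eff
display "indirect / direct   = " ind_eff / c_prime
```

With the estimates shown above, a = -4.2653975 and b = -.80569254, so the indirect effect is about 3.4366; adding c' = .67184798 gives the total effect of about 4.1084. This total differs slightly from the c path estimate of 4.1382892 in equation 1. In Stata 13 and later, xtmixed has been renamed mixed.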
We see that hon is significant in equation 1 and is also a significant predictor of the mediator variable, abil, in equation 2. However, hon is not significant (p = 0.084) in equation 3, when the mediator is included in the model. This suggests that there is mediation. The output includes the indirect, direct and total effects. It does not, however, include standard errors or confidence intervals for them. To get these, you need to bootstrap the results. You can bootstrap any of the effects found in the return list.
    return list

    scalars:
                r(tot_eff) =  4.108446903443488
                r(dir_eff) =  .6718479771360948
                r(ind_eff) =  3.436598926307393
                 r(b_path) =  -.8056925398919483
                 r(a_path) =  -4.265397476273364
                 r(c_path) =  4.13828918116252
We will illustrate this by bootstrapping the ml_mediation command with 500 replications. You may want to do more than 500 reps, maybe a lot more. You will probably also want to use a different seed value. Please note that we are resampling whole clusters, so we need the cluster option. We also need to give the clusters a new id when they are resampled, thus the idcluster option. Note that we now have to use the new cluster name, ncid, in the ml_mediation command.
    bootstrap indeff=r(ind_eff) direff=r(dir_eff) toteff=r(tot_eff), ///
        reps(500) seed(1) cluster(cid) idcluster(ncid): ///
        ml_mediation, dv(write) iv(hon) mv(abil) l2id(ncid)

    Bootstrap results                               Number of obs      =       200
                                                    Replications       =       500

          command:  ml_mediation, dv(write) iv(hon) mv(abil) l2id(ncid)
           indeff:  r(ind_eff)
           direff:  r(dir_eff)
           toteff:  r(tot_eff)

                                        (Replications based on 20 clusters in cid)
    ------------------------------------------------------------------------------
                 |   Observed   Bootstrap                         Normal-based
                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          indeff |   3.436599   .7181118     4.79   0.000     2.029126    4.844072
          direff |    .671848   .3500109     1.92   0.055    -.0141608    1.357857
          toteff |   4.108447   .7714546     5.33   0.000     2.596424     5.62047
    ------------------------------------------------------------------------------
If you have concerns about the normal-based confidence intervals, you can obtain percentile or bias-corrected (BC) confidence intervals with the estat boot command.
    estat boot, percentile bc

    Bootstrap results                               Number of obs      =       200
                                                    Replications       =       500

          command:  ml_mediation, dv(write) iv(hon) mv(abil) l2id(ncid)
           indeff:  r(ind_eff)
           direff:  r(dir_eff)
           toteff:  r(tot_eff)

                                        (Replications based on 20 clusters in cid)
    ------------------------------------------------------------------------------
                 |    Observed              Bootstrap
                 |       Coef.       Bias   Std. Err.    [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          indeff |   3.4365989   .0173307   .71811179    2.092823    5.00083   (P)
                 |                                        2.18301   5.032196  (BC)
          direff |   .67184798  -.0004241   .35001093    .0312456   1.423976   (P)
                 |                                       .0567802   1.446936  (BC)
          toteff |   4.1084469   .0169066   .77145463    2.610976   5.739329   (P)
                 |                                       2.601489    5.61782  (BC)
    ------------------------------------------------------------------------------
    (P)    percentile confidence interval
    (BC)   bias-corrected confidence interval
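Based on the percentile and bias-corrected confidence intervals, the direct, indirect and total effects all appear to be statistically significant at the alpha = .05 level, since none of these intervals contains zero. By contrast, the normal-based interval for the direct effect in the bootstrap output includes zero (p = 0.055).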
References
Krull, J. L., & MacKinnon, D. P. (2001). Multilevel modeling of individual and group level mediated effects. Multivariate Behavioral Research, 36(2), 249-277.