Let’s start off by downloading some data and running a binary mixed model with random intercepts and slopes.
use https://stats.idre.ucla.edu/stat/data/hsbdemo, clear

meqrlogit honors read i.female || cid: read, var covar(unstr)

Refining starting values:

Iteration 0:   log likelihood = -119.30049  (not concave)
Iteration 1:   log likelihood = -105.82055  (not concave)
Iteration 2:   log likelihood = -88.907174

Performing gradient-based optimization:

Iteration 0:   log likelihood = -88.907174  (not concave)
Iteration 1:   log likelihood = -84.544443  (not concave)
[output omitted]
Iteration 12:  log likelihood = -78.268094
Iteration 13:  log likelihood = -78.268094

Mixed-effects logistic regression               Number of obs      =       200
Group variable: cid                             Number of groups   =        20

                                                Obs per group: min =         7
                                                               avg =      10.0
                                                               max =        12

Integration points =   7                        Wald chi2(2)       =      8.29
Log likelihood = -78.268094                     Prob > chi2        =    0.0159

------------------------------------------------------------------------------
      honors |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        read |   .1084052   .0671505     1.61   0.106    -.0232074    .2400177
    1.female |   1.309964   .5183077     2.53   0.011        .2941    2.325829
       _cons |   -8.18226   3.894754    -2.10   0.036    -15.81584    -.548682
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
cid: Unstructured            |
                   var(read) |   .0239163   .0244789      .0032171    .1777943
                  var(_cons) |   102.4491   96.65305      16.12367    650.9578
             cov(read,_cons) |   -1.55161   1.532405     -4.555068    1.451848
------------------------------------------------------------------------------
LR test vs. logistic regression:     chi2(3) =   14.35   Prob > chi2 = 0.0025

Note: LR test is conservative and provided only for reference.
Let’s try using the margins command to get the predictive means (probabilities) for males and females, holding the reading score at 50.
margins female, at(read=50) predict(mu)

prediction is a function of possibly stochastic quantities other than e(b)
The command fails because the model contains random effects in addition to the fixed effects. To get margins to work, we need to add the predict(mu fixed) option, which requests predictions based on the fixed portion of the model only.
margins female, at(read=50) predict(mu fixed)

Adjusted predictions                              Number of obs   =        200

Expression   : Predicted mean, fixed portion only, predict(mu fixed)
at           : read            =          50

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      female |
          0  |   .0594124   .0466851     1.27   0.203    -.0320887    .1509134
          1  |   .1896881   .1177394     1.61   0.107    -.0410769    .4204531
------------------------------------------------------------------------------
Since margins won’t compute predictive margins that include the random effects, we will have to compute them manually. We will begin with the easier task of computing predicted probabilities that include both the fixed and random effects. Of course, there is an option in predict that will do this, so we will use predict, mu to check the results of our computation.
predict mu, mu // mu contains both fixed effects and random effects
Now we will replicate mu using the predicted random effects for both the intercept and the slope. These are the Best Linear Unbiased Predictors (BLUPs), also called empirical Bayes predictions.
predict re*, reffects

describe re1 re2

              storage   display    value
variable name   type    format     label      variable label
-------------------------------------------------------------------------------
re1             float   %9.0g                 random effects for cid: read
re2             float   %9.0g                 random effects for cid: _cons
We will combine these BLUPs with the coefficients from our model: re2 is added to the constant and re1 is added to the coefficient for read. The resulting linear predictor, rxb, is in the log-odds metric, so we apply the inverse logit, exp(rxb)/(1+exp(rxb)), to obtain predicted probabilities.
generate rxb = _b[1.female]*female + (_b[read]+re1)*read + (_b[_cons]+re2)
generate mu2 = exp(rxb)/(1+exp(rxb))

list mu mu2 in 1/20

     +---------------------+
     |       mu        mu2 |
     |---------------------|
  1. | .0068868   .0068868 |
  2. | .0018676   .0018676 |
  3. | .0041634   .0041634 |
  4. | .0030219   .0030219 |
  5. | .0041634   .0041634 |
     |---------------------|
  6. | .0244843   .0244843 |
  7. | .0011538   .0011538 |
  8. | .0239256   .0239256 |
  9. | .0152577   .0152577 |
 10. | .0018676   .0018676 |
     |---------------------|
 11. | .0068868   .0068868 |
 12. | .0039202   .0039202 |
 13. | .0142891   .0142891 |
 14. | .0088657   .0088658 |
 15. | .0075425   .0075425 |
     |---------------------|
 16. | .0088119   .0088119 |
 17. | .0024226   .0024226 |
 18. | .0232304   .0232304 |
 19. | .0017392   .0017392 |
 20. | .0017392   .0017392 |
     +---------------------+
The listing above shows that our manually computed mu2 matches (up to rounding in the last digit) the value mu generated by the predict command.
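For a programmatic check rather than an eyeball comparison, something along these lines should work (assert and reldif() are built-in Stata functions; the tolerance is an arbitrary choice):

* verify that the manual calculation reproduces predict's mu for every observation
assert reldif(mu, mu2) < 1e-5    // throws an error if any pair of values disagrees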
Let’s return to computing the predictive margins, including the random effects, for males and females while holding reading at 50. In the code below rxb0 and rxb1 are the linear predictors in the log-odds metric for males and females, respectively. Applying the inverse logit gives the predicted probabilities p0 and p1, and their means are the predictive margins. Again, the zero suffix is for males and the one suffix is for females.
generate rxb0 = _b[1.female]*0 + (_b[read]+re1)*50 + (_b[_cons]+re2)
generate rxb1 = _b[1.female]*1 + (_b[read]+re1)*50 + (_b[_cons]+re2)
generate p0 = exp(rxb0)/(1+exp(rxb0))
generate p1 = exp(rxb1)/(1+exp(rxb1))

summarize p0 p1

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          p0 |        200    .1755775    .2356444   .0125389    .829953
          p1 |        200    .3156575     .303865   .0449446   .9476115
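For reference, the fixed-portion-only predictions that margins reported earlier can be reproduced by hand with Stata’s invlogit() function, which makes the contrast with the averages of p0 and p1 explicit. This is just a sketch of the arithmetic using the current model’s coefficients; the displayed values are approximate.

* fixed portion only (random effects set to zero) at read = 50
display invlogit(_b[1.female]*0 + _b[read]*50 + _b[_cons])    // males,   about .0594
display invlogit(_b[1.female]*1 + _b[read]*50 + _b[_cons])    // females, about .1897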
The means of p0 and p1 are rather different from the values we computed earlier using just the fixed effects. This highlights the fact that averaging predicted values over the random effects (as we just did with predict) is not the same as computing predicted values with the random effects set to zero (as margins did with predict(mu fixed)). If we look at p0 and p1 within cid equal to 1, we see that all the values for each variable are the same.
list p0 p1 if cid==1

     +---------------------+
     |       p0         p1 |
     |---------------------|
  1. | .0239256   .0832776 |
  2. | .0239256   .0832776 |
  3. | .0239256   .0832776 |
  4. | .0239256   .0832776 |
  5. | .0239256   .0832776 |
     |---------------------|
  6. | .0239256   .0832776 |
  7. | .0239256   .0832776 |
  8. | .0239256   .0832776 |
  9. | .0239256   .0832776 |
 10. | .0239256   .0832776 |
     |---------------------|
 11. | .0239256   .0832776 |
     +---------------------+
This means that the mean values computed by summarize above are influenced by the number of observations within each cid; that is, larger cids influence the mean more than smaller cids. To get around this we can use just one value from each cid, which produces a more balanced set of predicted values. Here is one way to do this.
bysort cid: gen seq=_n
summarize p0 p1 if seq==1

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          p0 |         20    .1790172    .2416508   .0125389    .829953
          p1 |         20    .3212767    .3125198   .0449446   .9476115
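An equivalent way to keep one observation per cid, if you prefer not to create a sequence variable, is egen’s tag() function. This is only an alternative sketch (pickone is an illustrative variable name), not what was run above.

egen pickone = tag(cid)       // flags exactly one observation in each cid
summarize p0 p1 if pickone    // should reproduce the seq==1 results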
Now we are ready to tackle a more complex model and set of predictive margins. We will add socst to our model as a covariate. But first, we will drop the variables we created above.
drop mu-p1

meqrlogit honors read female socst || cid: read, var covar(unstr)

Refining starting values:

Iteration 0:   log likelihood = -118.63766  (not concave)
Iteration 1:   log likelihood = -105.09158  (not concave)
Iteration 2:   log likelihood = -88.297585

Performing gradient-based optimization:

Iteration 0:   log likelihood = -88.297585  (not concave)
Iteration 1:   log likelihood = -83.769166  (not concave)
[output omitted]
Iteration 8:   log likelihood = -77.974955
Iteration 9:   log likelihood = -77.974949

Mixed-effects logistic regression               Number of obs      =       200
Group variable: cid                             Number of groups   =        20

                                                Obs per group: min =         7
                                                               avg =      10.0
                                                               max =        12

Integration points =   7                        Wald chi2(3)       =      9.12
Log likelihood = -77.974949                     Prob > chi2        =    0.0277

------------------------------------------------------------------------------
      honors |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        read |   .0980242   .0658639     1.49   0.137    -.0310668    .2271151
      female |   1.263858    .514295     2.46   0.014     .2558583    2.271858
       socst |   .0243635   .0317934     0.77   0.443    -.0379504    .0866774
       _cons |  -8.900997   3.842395    -2.32   0.021    -16.43195   -1.370041
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
cid: Unstructured            |
                   var(read) |   .0209259   .0228458      .0024626    .1778206
                  var(_cons) |   89.13344   89.85624      12.35767     642.902
             cov(read,_cons) |  -1.353323   1.426992     -4.150175    1.443529
------------------------------------------------------------------------------
LR test vs. logistic regression:     chi2(3) =   12.38   Prob > chi2 = 0.0062

Note: LR test is conservative and provided only for reference.

predict re*, reffects
This time we want the predictive margins for males and females holding reading at values running from 30 to 70 in increments of five, while leaving socst free to vary. We will do this with some temporary variables and three matrices: m holds the predictive margins for males, f the predictive margins for females, and d the difference in predictive margins between females and males. Because the code below uses temporary variables and local macros, you will need to run the whole block at one time, not one line at a time.
*** begin block ***
tempvar rxb0 rxb1 p0 p1
generate `rxb0' = .                    // initialize temp var
generate `rxb1' = .                    // initialize temp var
generate `p0' = .                      // initialize temp var
generate `p1' = .                      // initialize temp var
matrix m = J(9,1,0)                    // initialize matrix for males
matrix f = J(9,1,0)                    // initialize matrix for females
matrix d = J(9,1,0)                    // initialize matrix for difference
local i = 1                            // initialize local macro
forvalues j=30(5)70 {
    quietly replace `rxb0' = _b[female]*0 + (_b[read]+re1)*`j' + _b[socst]*socst + _b[_cons] + re2
    quietly replace `rxb1' = _b[female]*1 + (_b[read]+re1)*`j' + _b[socst]*socst + _b[_cons] + re2
    quietly replace `p0' = exp(`rxb0')/(1+exp(`rxb0'))
    quietly sum `p0'
    matrix m[`i',1] = r(mean)
    quietly replace `p1' = exp(`rxb1')/(1+exp(`rxb1'))
    quietly sum `p1'
    matrix f[`i',1] = r(mean)
    matrix d[`i',1] = f[`i',1] - m[`i',1]
    local i = `i' + 1                  // increment local macro
}
*** end block ***
We can look at the predictive margins for males by listing matrix m.
matrix list m

m[9,1]
           c1
r1  .17859283
r2  .17596309
r3  .17393653
r4  .17354885
r5  .17695161
r6  .18797502
r7  .21236354
r8   .2572271
r9  .32805038
The first value above is the predictive margin for males with a reading score of 30, averaged across the observed values of socst. So it does not matter that the number of observations differs across cids, because the values of socst vary within each cid.
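As a quick sanity check on the block above, the first entry of m can be reproduced directly. This is just an illustration (chk is a made-up variable name); re1 and re2 here are the BLUPs from the second model.

generate chk = invlogit(_b[female]*0 + (_b[read]+re1)*30 + _b[socst]*socst + _b[_cons] + re2)
summarize chk                 // the mean should match m[1,1] = .17859283
drop chk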
Next, we will save the matrices as variables in our dataset using svmat, which adds each matrix column as a new variable.
matrix r = (30\35\40\45\50\55\60\65\70)
svmat r
svmat m
svmat f
svmat d

list r1 m1 f1 d1 in 1/9

     +-------------------------------------+
     | r1         m1         f1         d1 |
     |-------------------------------------|
  1. | 30   .1785928   .2603539   .0817611 |
  2. | 35   .1759631   .2647262   .0887631 |
  3. | 40   .1739365   .2729119   .0989754 |
  4. | 45   .1735488   .2877812   .1142324 |
  5. | 50   .1769516   .3135495   .1365979 |
     |-------------------------------------|
  6. | 55    .187975   .3555776   .1676026 |
  7. | 60   .2123635   .4193873   .2070238 |
  8. | 65   .2572271   .5054632   .2482362 |
  9. | 70   .3280504   .5996285   .2715782 |
     +-------------------------------------+
Now we are ready to graph the predictive margins for males and females.
twoway (line m1 r1)(line f1 r1), ytitle(Predictive Margins) xtitle(Reading Score) ///
    legend(order(1 "Males" 2 "Females")) name(predictive, replace) ylabel(0(.2).6)
And finally, we can plot the difference between males and females in their predictive margins.
twoway (line d1 r1), ytitle(Difference in Predictive Margins) ///
    xtitle(Reading Score) name(difference, replace)
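Since both graphs were given names with the name() option, they can, if desired, be displayed side by side with graph combine; this step is optional and was not part of the original run.

graph combine predictive difference, cols(2)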
This general approach can be extended to more complex models, with a concomitant increase in the complexity of our code.