Consider an anova design with three between-subject factors; t (2 levels), c (3 levels) and p (4 levels) and one within-subject factor s with 2 levels. The within-subject factor is a pre/post variable for 350 subjects. If we were to run an anova the Stata code would look like this:
anova y t##c##p / id|t#c#p s##t##c##p
With 350 subjects this design is huge. It will not run in Stata/IC and will only run in Stata/SE when given lots of memory, variables, matsize and time. One alternative is to run the model using mixed something like this:
mixed y t##c##p##s || id:
This works fine but Stata’s factor variables use indicator (dummy) coding so it is not easy to produce anova type results with main effects and interactions. In fact the only term that is equivalent to anova is the s#t#c#p interaction term. It is possible to compute the other interactions and the main effects using linear combinations of coefficients but with a design this large it would almost be prohibitively difficult.
There is another way we could get the results that we want and that is to use a different coding scheme, say an orthogonal scheme such as Helmert coding. We can do this using a program written by John Hendrickx called desmat (search desmat). To find out more about desmat go to the Desmat home page. Here is the desmat command for our model and its output.
desmat t*c*p*s, defcon(hel)
Desmat generated the following design matrix:
nr Variables Term Parameterization
First Last
1 _x_1 t helm(F)
2 _x_2 _x_3 c helm(F)
3 _x_4 _x_5 t.c helm(F).helm(F)
4 _x_6 _x_8 p helm(F)
5 _x_9 _x_11 t.p helm(F).helm(F)
6 _x_12 _x_17 c.p helm(F).helm(F)
7 _x_18 _x_23 t.c.p helm(F).helm(F).helm(F)
8 _x_24 s helm(F)
9 _x_25 t.s helm(F).helm(F)
10 _x_26 _x_27 c.s helm(F).helm(F)
11 _x_28 _x_29 t.c.s helm(F).helm(F).helm(F)
12 _x_30 _x_32 p.s helm(F).helm(F)
13 _x_33 _x_35 t.p.s helm(F).helm(F).helm(F)
14 _x_36 _x_41 c.p.s helm(F).helm(F).helm(F)
15 _x_42 _x_47 t.c.p.s helm(F).helm(F).helm(F).helm(F)
The desmat output shows us that there are 15 terms in the model with 47 parameters to be estimated. Now we can go ahead and run mixed with these new variables.
mixed y _x* || id:
Performing EM optimization:
Performing gradient-based optimization:
Iteration 0: log restricted-likelihood = 870.89244
Iteration 1: log restricted-likelihood = 870.89251
Computing standard errors:
Mixed-effects REML regression Number of obs = 644
Group variable: id Number of groups = 322
Obs per group: min = 2
avg = 2.0
max = 2
Wald chi2(47) = 185.80
Log restricted-likelihood = 870.89251 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_x_1 | .025292 .0057073 4.43 0.000 .014106 .036478
_x_2 | -.0446694 .0060619 -7.37 0.000 -.0565505 -.0327883
_x_3 | -.0217396 .0069802 -3.11 0.002 -.0354205 -.0080588
_x_4 | .0057967 .0121238 0.48 0.633 -.0179655 .029559
_x_5 | -.0017563 .0139603 -0.13 0.900 -.0291181 .0256054
_x_6 | .0149388 .0062873 2.38 0.017 .002616 .0272616
_x_7 | .008656 .0071202 1.22 0.224 -.0052993 .0226114
_x_8 | .0027882 .0082792 0.34 0.736 -.0134388 .0190152
_x_9 | -.0026387 .0125745 -0.21 0.834 -.0272843 .0220069
_x_10 | .0084589 .0142404 0.59 0.553 -.0194519 .0363696
_x_11 | .0024302 .0165585 0.15 0.883 -.0300238 .0348842
_x_12 | .0012552 .0136317 0.09 0.927 -.0254625 .0279728
_x_13 | .0378146 .0150399 2.51 0.012 .0083369 .0672922
_x_14 | -.0144293 .0173664 -0.83 0.406 -.0484668 .0196082
_x_15 | .0026577 .015053 0.18 0.860 -.0268455 .032161
_x_16 | .024938 .0175149 1.42 0.155 -.0093906 .0592666
_x_17 | -.0291296 .0205043 -1.42 0.155 -.0693172 .011058
_x_18 | .0225859 .0272634 0.83 0.407 -.0308494 .0760211
_x_19 | -.0221501 .0300798 -0.74 0.462 -.0811053 .0368052
_x_20 | .0183917 .0347328 0.53 0.596 -.0496834 .0864668
_x_21 | .0079867 .0301059 0.27 0.791 -.0510199 .0669932
_x_22 | .0471881 .0350298 1.35 0.178 -.0214691 .1158453
_x_23 | -.0065515 .0410085 -0.16 0.873 -.0869267 .0738237
_x_24 | .01577 .0029548 5.34 0.000 .0099786 .0215613
_x_25 | -.0178688 .0059096 -3.02 0.002 -.0294515 -.0062861
_x_26 | -.0090746 .0062769 -1.45 0.148 -.0213771 .0032278
_x_27 | -.0075168 .0072277 -1.04 0.298 -.0216828 .0066492
_x_28 | .0105155 .0125538 0.84 0.402 -.0140894 .0351205
_x_29 | -.0079447 .0144554 -0.55 0.583 -.0362767 .0203873
_x_30 | .0078044 .0065102 1.20 0.231 -.0049554 .0205642
_x_31 | -.0186991 .0073727 -2.54 0.011 -.0331494 -.0042489
_x_32 | .010444 .0085728 1.22 0.223 -.0063585 .0272464
_x_33 | .0325985 .0130204 2.50 0.012 .0070789 .0581181
_x_34 | .0001508 .0147454 0.01 0.992 -.0287497 .0290513
_x_35 | .0054314 .0171457 0.32 0.751 -.0281735 .0390362
_x_36 | .0088749 .0141151 0.63 0.530 -.0187902 .03654
_x_37 | .0133038 .0155732 0.85 0.393 -.0172192 .0438267
_x_38 | -.0356037 .0179823 -1.98 0.048 -.0708483 -.0003592
_x_39 | -.0075208 .0155868 -0.48 0.629 -.0380703 .0230287
_x_40 | -.0120109 .018136 -0.66 0.508 -.0475569 .0235351
_x_41 | -.004214 .0212314 -0.20 0.843 -.0458268 .0373987
_x_42 | -.0390237 .0282302 -1.38 0.167 -.0943538 .0163065
_x_43 | -.0165987 .0311465 -0.53 0.594 -.0776446 .0444473
_x_44 | -.0121021 .0359645 -0.34 0.736 -.0825913 .058387
_x_45 | -.0069893 .0311735 -0.22 0.823 -.0680883 .0541097
_x_46 | -.0095715 .0362721 -0.26 0.792 -.0806634 .0615204
_x_47 | -.0971743 .0424627 -2.29 0.022 -.1803998 -.0139489
_cons | -.3523807 .0028536 -123.49 0.000 -.3579737 -.3467877
------------------------------------------------------------------------------
------------------------------------------------------------------------------
Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval]
-----------------------------+------------------------------------------------
id: Identity |
sd(_cons) | .0425494 .0024652 .037982 .0476662
-----------------------------+------------------------------------------------
sd(Residual) | .0364142 .0014916 .033605 .0394582
------------------------------------------------------------------------------
LR test vs. linear regression: chibar2(01) = 120.77 Prob >= chibar2 = 0.0000
In addition to creating the new variables desmat also creates global macros that contain the variables for each term. Say we were interested in term 6, the c#p interaction. We could test it by typing, testparm $term6. Since the results are given in terms of chi-square we can convert it to an F-ratio by dividing the chi-square value by its degrees of freedom.
That’s fine, but what we really want is the anova type III F-ratios for all of the terms in the model. To do that we will write a short forvalues loop.
forvalues i=1/15 {
local term = "$"+"term"+"`i'"
testparm `term'
display "F = " r(chi2)/r(df)
}
( 1) [y]_x_1 = 0
chi2( 1) = 19.64
Prob > chi2 = 0.0000
F = 19.638667
( 1) [y]_x_2 = 0
( 2) [y]_x_3 = 0
chi2( 2) = 62.33
Prob > chi2 = 0.0000
F = 31.166002
( 1) [y]_x_4 = 0
( 2) [y]_x_5 = 0
chi2( 2) = 0.25
Prob > chi2 = 0.8828
F = .12470646
( 1) [y]_x_6 = 0
( 2) [y]_x_7 = 0
( 3) [y]_x_8 = 0
chi2( 3) = 7.55
Prob > chi2 = 0.0563
F = 2.5163986
( 1) [y]_x_9 = 0
( 2) [y]_x_10 = 0
( 3) [y]_x_11 = 0
chi2( 3) = 0.44
Prob > chi2 = 0.9325
F = .14565519
( 1) [y]_x_12 = 0
( 2) [y]_x_13 = 0
( 3) [y]_x_14 = 0
( 4) [y]_x_15 = 0
( 5) [y]_x_16 = 0
( 6) [y]_x_17 = 0
chi2( 6) = 9.73
Prob > chi2 = 0.1366
F = 1.621271
( 1) [y]_x_18 = 0
( 2) [y]_x_19 = 0
( 3) [y]_x_20 = 0
( 4) [y]_x_21 = 0
( 5) [y]_x_22 = 0
( 6) [y]_x_23 = 0
chi2( 6) = 3.52
Prob > chi2 = 0.7416
F = .58630685
( 1) [y]_x_24 = 0
chi2( 1) = 28.48
Prob > chi2 = 0.0000
F = 28.483828
( 1) [y]_x_25 = 0
chi2( 1) = 9.14
Prob > chi2 = 0.0025
F = 9.1425429
( 1) [y]_x_26 = 0
( 2) [y]_x_27 = 0
chi2( 2) = 3.06
Prob > chi2 = 0.2164
F = 1.530474
( 1) [y]_x_28 = 0
( 2) [y]_x_29 = 0
chi2( 2) = 1.04
Prob > chi2 = 0.5944
F = .52026247
( 1) [y]_x_30 = 0
( 2) [y]_x_31 = 0
( 3) [y]_x_32 = 0
chi2( 3) = 9.00
Prob > chi2 = 0.0293
F = 2.999999
( 1) [y]_x_33 = 0
( 2) [y]_x_34 = 0
( 3) [y]_x_35 = 0
chi2( 3) = 6.55
Prob > chi2 = 0.0877
F = 2.1835276
( 1) [y]_x_36 = 0
( 2) [y]_x_37 = 0
( 3) [y]_x_38 = 0
( 4) [y]_x_39 = 0
( 5) [y]_x_40 = 0
( 6) [y]_x_41 = 0
chi2( 6) = 5.37
Prob > chi2 = 0.4977
F = .89444705
( 1) [y]_x_42 = 0
( 2) [y]_x_43 = 0
( 3) [y]_x_44 = 0
( 4) [y]_x_45 = 0
( 5) [y]_x_46 = 0
( 6) [y]_x_47 = 0
chi2( 6) = 8.19
Prob > chi2 = 0.2248
F = 1.3643889
If you don’t mind the having chi-square as the test statistic there is an easier way to display your results using destest (included with desmat).
destest ------------------------------------------------------------------------------------------------------------------------------ Term Wald chi2 df P > chi2 ------------------------------------------------------------------------------------------------------------------------------ t 19.639** 1 0.000 c 62.332** 2 0.000 p 7.549 3 0.056 t.c 0.249 2 0.883 t.p 0.437 3 0.933 c.p 9.728 6 0.137 t.c.p 3.518 6 0.742 s 28.484** 1 0.000 c.s 3.061 2 0.216 t.c.s 1.041 2 0.594 p.s 9.000* 3 0.029 t.p.s 6.551 3 0.088 c.p.s 5.367 6 0.498 t.c.p.s 8.186 6 0.225 ------------------------------------------------------------------------------------------------------------------------------ * p < .05 ** p < .01
So that’s how to compute anova type III results using mixed.
