How can I get denominator degrees of freedom for mixed?

At first glance this may seem to be a very silly question. Everyone knows that mixed reports chi-square and that chi-square does not have denominator degrees of freedom. Certainly, mixed with its chi-square works very well on large datasets. But what about small datasets of the experimental design type? The problem with chi-square in small datasets is that the p-values are on the optimistic side, that is, too small. Anova, with its F-ratios, adjusts for the small sample size through the denominator degrees of freedom.

Rescaling chi-square as an F-ratio is easy: just divide the chi-square value by its degrees of freedom. So a chi-square value of 6.9 with 3 df rescales to an F-ratio of 2.3 with 3 numerator degrees of freedom. The trick is to estimate a reasonable value for the denominator degrees of freedom.
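As a quick illustration of the rescaling, the lines below convert the chi-square of 6.9 on 3 df to an F-ratio and compare the two tail probabilities. This is only a sketch: the denominator df of 14 is an assumed value chosen for illustration. chi2tail() and Ftail() are Stata's built-in upper-tail probability functions.

```stata
* chi-square of 6.9 on 3 df, rescaled to an F-ratio
display 6.9/3

* p-value when the statistic is treated as chi-square with 3 df
display chi2tail(3, 6.9)

* p-value when treated as F with 3 and 14 df
* (the 14 is an assumption made only for this illustration)
display Ftail(3, 14, 6.9/3)
```

For a statistic in the upper tail like this one, the F-based p-value is the larger of the two, which is the small-sample correction we are after.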

Consider the following two-group (a) design in which each subject receives four treatments (b) in a counterbalanced order.

use https://stats.idre.ucla.edu/stat/data/repeated_missing, clear

tab s b

           |                      b
         s |         1          2          3          4 |     Total
-----------+--------------------------------------------+----------
         1 |         1          1          1          1 |         4 
         2 |         1          1          1          1 |         4 
         3 |         1          1          0          1 |         3 
         4 |         1          1          1          1 |         4 
         5 |         1          0          1          1 |         3 
         6 |         1          1          1          1 |         4 
         7 |         0          1          1          1 |         3 
         8 |         1          1          0          1 |         3 
-----------+--------------------------------------------+----------
     Total |         7          7          6          8 |        28

Due to random instrument failure, one observation on each of four subjects is missing. If we were to run this as a traditional repeated measures anova, we would have to drop all of the data for subjects 3, 5, 7 and 8. By running the analysis using mixed, we can retain all of the observations.

mixed y a##b || s:, var reml

Performing EM optimization: 

Performing gradient-based optimization: 

Iteration 0:   log restricted-likelihood = -31.286348  
Iteration 1:   log restricted-likelihood =  -31.28616  
Iteration 2:   log restricted-likelihood =  -31.28616  

Computing standard errors:

Mixed-effects REML regression                   Number of obs      =        28
Group variable: s                               Number of groups   =         8

                                                Obs per group: min =         3
                                                               avg =       3.5
                                                               max =         4


                                                Wald chi2(7)       =    224.12
Log restricted-likelihood =  -31.28616          Prob > chi2        =    0.0000

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         2.a |  -.4684204    .703144    -0.67   0.505    -1.846557    .9097166
             |
           b |
          2  |        .25   .5749169     0.43   0.664    -.8768165    1.376816
          3  |   3.263198   .6279667     5.20   0.000     2.032406     4.49399
          4  |       4.25   .5749169     7.39   0.000     3.123184    5.376816
             |
         a#b |
        2 2  |   1.861466   .8920747     2.09   0.037     .1130314      3.6099
        2 3  |   .7925351   .9271515     0.85   0.393    -1.024648    2.609719
        2 4  |    2.71842   .8515934     3.19   0.001     1.049328    4.387513
             |
       _cons |       3.75   .4638207     8.09   0.000     2.840928    4.659072
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
s: Identity                  |
                   sd(_cons) |   .4466089   .2491581      .1496413    1.332917
-----------------------------+------------------------------------------------
                sd(Residual) |   .8130553   .1495124       .567013    1.165862
------------------------------------------------------------------------------
LR test vs. linear regression: chibar2(01) =     1.45 Prob >= chibar2 = 0.1139

We can check what is and is not significant according to mixed using the contrast command.

contrast a##b

Contrasts of marginal linear predictions

Margins      : asbalanced

------------------------------------------------
             |         df        chi2     P>chi2
-------------+----------------------------------
y            |
           a |          1        3.88     0.0489
             |
           b |          3      207.84     0.0000
             |
         a#b |          3       11.56     0.0091
------------------------------------------------

These results imply that the interaction and both main effects are statistically significant. However, there are only four subjects nested in each level of variable a. If there were no missing observations across time, a repeated measures anova would be our best bet. But since there are missing observations, we will rescale the chi-square values to F-ratios and try to estimate the denominator degrees of freedom that can be used with the F-distribution.

The way we will do this is to run anova to obtain the between and within degrees of freedom. Although we are running anova, we won’t look at the anova results but only at the degrees of freedom. We also won’t bother with the repeated option for anova. The degrees of freedom of interest are those in the s|a and Residual rows.

anova y a / s|a b a#b

                           Number of obs =      28     R-squared     =  0.9446
                           Root MSE      = .825177     Adj R-squared =  0.8932

                  Source |  Partial SS    df       MS           F     Prob > F
              -----------+----------------------------------------------------
                   Model |  162.574315    13  12.5057165      18.37     0.0000
                         |
                       a |  5.12395138     1  5.12395138       3.86     0.0971
                     s|a |  7.96717172     6  1.32786195   
              -----------+----------------------------------------------------
                       b |  136.083668     3  45.3612228      66.62     0.0000
                     a#b |  7.95033498     3  2.65011166       3.89     0.0325
                         |
                Residual |  9.53282828    14  .680916306   
              -----------+----------------------------------------------------
                   Total |  172.107143    27  6.37433862

You didn’t look at the F-ratios, did you? Just look at the degrees of freedom for s|a and Residual.
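The same two values can be recovered by simple counting, which is a useful check on the table. This sketch uses only numbers already shown above: 8 subjects in 2 groups, and 27 total df of which 13 belong to the model.

```stata
* between-subjects error df (the s|a row): subjects minus groups
display 8 - 2

* within-subjects error df (the Residual row): total df minus model df
display 27 - 13
```

These display 6 and 14, matching the s|a and Residual rows of the anova table.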

So now we know that the denominator degrees of freedom are 6 and 14. We can now rescale the chi-square values from mixed as F-ratios and obtain p-values.

First the a#b interaction.

chi-square = 11.56 df = 3

F = 11.56/3 = 3.8533333 df = 3 & 14  p-value = Ftail(3,14,3.8533333) = .03348207

The main effect for b has the same denominator degrees of freedom as the interaction.

chi-square = 207.84 df = 3

F = 207.84/3 = 69.28 df = 3 & 14  p-value = Ftail(3,14,69.28) = 1.218e-08

Finally, the main effect for a, which has six degrees of freedom in the denominator.

chi-square = 3.88 df = 1

F = 3.88/1 = 3.88 df = 1 & 6  p-value = Ftail(1,6,3.88) = .09638074

While the conclusions for b and a#b do not change, the F-ratio for the a main effect is not significant even though the mixed chi-square suggested that it was.
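The three hand calculations can be wrapped in a small helper so that the chi-square and F-based p-values print side by side. This is a minimal sketch for a do-file: chi2tof is a hypothetical program name, not a built-in Stata command, and the chi-square values and degrees of freedom are typed in by hand from the contrast and anova tables.

```stata
* chi2tof is a hypothetical helper, not a built-in Stata command
capture program drop chi2tof
program define chi2tof
    * arguments: chi-square value, its df, denominator df
    args chi2 df1 df2
    * rescale the chi-square to an F-ratio
    local F = `chi2'/`df1'
    display "F(`df1',`df2') = " %9.4f `F' ///
        "   chi2 p = " %6.4f chi2tail(`df1', `chi2') ///
        "   F p = " %10.8f Ftail(`df1', `df2', `F')
end

* a#b interaction, b main effect, a main effect
chi2tof 11.56  3 14
chi2tof 207.84 3 14
chi2tof 3.88   1 6
```

Only the a main effect moves across the .05 line when the F-distribution is used in place of chi-square.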