At first glance this may seem to be a very silly question. Everyone knows that mixed
reports chi-square and that chi-square does not have denominator degrees of freedom.
Certainly, mixed with its chi-square works very well on large datasets. But what about
small, experimental-design-type data? The problem with chi-square in small datasets is
that the p-values are on the optimistic side. Anova, with its F-ratios, adjusts for the
small sample size through the denominator degrees of freedom.
Rescaling chi-square as an F-ratio is easy: just divide the chi-square value by its degrees of freedom. So a chi-square value of 6.9 with 3 df rescales to an F-ratio of 2.3 with 3 numerator degrees of freedom. The trick is to estimate a reasonable value for the denominator degrees of freedom.
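As a rough illustration of the rescaling, the arithmetic can be checked with Stata's display command and its chi2tail() and Ftail() functions. The denominator degrees of freedom of 10 used here is an arbitrary placeholder chosen only for illustration, not a recommended value.

```stata
* Rescale a chi-square of 6.9 on 3 df to an F-ratio
display 6.9/3

* p-value from the chi-square distribution (no denominator df)
display chi2tail(3, 6.9)

* p-value from the F distribution; ddf = 10 is a hypothetical value
local ddf = 10
display Ftail(3, `ddf', 6.9/3)
```

The F-based p-value is larger than the chi-square one, and the gap shrinks as the assumed denominator degrees of freedom grow.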
Consider the following two-group (a) design in which each subject receives four
treatments (b) in a counterbalanced order.
use https://stats.idre.ucla.edu/stat/data/repeated_missing, clear
tab s b
           |                      b
         s |         1          2          3          4 |     Total
-----------+--------------------------------------------+----------
         1 |         1          1          1          1 |         4
         2 |         1          1          1          1 |         4
         3 |         1          1          0          1 |         3
         4 |         1          1          1          1 |         4
         5 |         1          0          1          1 |         3
         6 |         1          1          1          1 |         4
         7 |         0          1          1          1 |         3
         8 |         1          1          0          1 |         3
-----------+--------------------------------------------+----------
     Total |         7          7          6          8 |        28
Due to random instrument failure one observation on each of four subjects is missing.
If we were to run this as a traditional repeated measures anova we would have to
drop all of the data for subjects 3, 5, 7 and 8. By running the analysis using
mixed we can retain all of the observations.
mixed y a##b || s:, var reml
Performing EM optimization:

Performing gradient-based optimization:

Iteration 0:   log restricted-likelihood = -31.286348
Iteration 1:   log restricted-likelihood =  -31.28616
Iteration 2:   log restricted-likelihood =  -31.28616

Computing standard errors:

Mixed-effects REML regression                   Number of obs      =        28
Group variable: s                               Number of groups   =         8

                                                Obs per group: min =         3
                                                               avg =       3.5
                                                               max =         4

                                                Wald chi2(7)       =    224.12
Log restricted-likelihood =  -31.28616          Prob > chi2        =    0.0000

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         2.a |  -.4684204    .703144    -0.67   0.505    -1.846557    .9097166
             |
           b |
          2  |        .25   .5749169     0.43   0.664    -.8768165    1.376816
          3  |   3.263198   .6279667     5.20   0.000     2.032406     4.49399
          4  |       4.25   .5749169     7.39   0.000     3.123184    5.376816
             |
         a#b |
        2 2  |   1.861466   .8920747     2.09   0.037     .1130314      3.6099
        2 3  |   .7925351   .9271515     0.85   0.393    -1.024648    2.609719
        2 4  |    2.71842   .8515934     3.19   0.001     1.049328    4.387513
             |
       _cons |       3.75   .4638207     8.09   0.000     2.840928    4.659072
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
s: Identity                  |
                   sd(_cons) |   .4466089   .2491581      .1496413    1.332917
-----------------------------+------------------------------------------------
                sd(Residual) |   .8130553   .1495124       .567013    1.165862
------------------------------------------------------------------------------
LR test vs. linear regression: chibar2(01) =     1.45 Prob >= chibar2 = 0.1139
We can check what is and is not significant according to mixed using the
contrast command.
contrast a##b
Contrasts of marginal linear predictions

Margins      : asbalanced

------------------------------------------------
             |         df        chi2     P>chi2
-------------+----------------------------------
y            |
           a |          1        3.88     0.0489
             |
           b |          3      207.84     0.0000
             |
         a#b |          3       11.56     0.0091
------------------------------------------------
These results imply that the interaction and both main effects are statistically
significant. However, there are only four subjects nested in each level of variable a.
If there were no missing observations across time, a repeated measures anova would be
our best bet. But since there are missing observations we will rescale the chi-square
values to F-ratios and try to estimate the denominator degrees of freedom that can be
used with the F-distribution.
The way we will do this is to run anova to obtain the between and within degrees
of freedom. Although we are running anova we won’t look at the anova results
but only at the degrees of freedom. We also won’t bother with the repeated
option for anova. The degrees of freedom of interest are those in the s|a and
Residual rows.
anova y a / s|a b a#b
                Number of obs =      28     R-squared     =  0.9446
                Root MSE      = .825177     Adj R-squared =  0.8932

       Source |  Partial SS    df       MS           F     Prob > F
   -----------+----------------------------------------------------
        Model |  162.574315    13  12.5057165      18.37     0.0000
              |
            a |  5.12395138     1  5.12395138       3.86     0.0971
          s|a |  7.96717172     6  1.32786195
   -----------+----------------------------------------------------
            b |  136.083668     3  45.3612228      66.62     0.0000
          a#b |  7.95033498     3  2.65011166       3.89     0.0325
              |
     Residual |  9.53282828    14  .680916306
   -----------+----------------------------------------------------
        Total |  172.107143    27  6.37433862
You didn’t look at the F-ratios, did you? Just look at the two degrees of freedom of
interest: 6 for s|a and 14 for Residual. These are our denominator degrees of freedom.
We can now rescale the chi-square values from mixed as F-ratios and obtain p-values.
First, the a#b interaction.
chi-square = 11.56  df = 3
F = 11.56/3 = 3.8533333  df = 3 & 14
p-value = Ftail(3,14,3.8533333) = .03348207
The main effect for b has the same denominator degrees of freedom as the interaction.
chi-square = 207.84  df = 3
F = 207.84/3 = 69.28  df = 3 & 14
p-value = Ftail(3,14,69.28) = 1.218e-08
Finally, the a main effect, which has six degrees of freedom in the denominator.
chi-square = 3.88  df = 1
F = 3.88/1 = 3.88  df = 1 & 6
p-value = Ftail(1,6,3.88) = .09638074
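The three calculations above can be reproduced in one pass with display. This is only a sketch: the chi-square values and numerator degrees of freedom are typed in by hand from the contrast output, and the denominator degrees of freedom (14 and 6) are typed in from the anova table.

```stata
* a#b interaction: chi2(3) = 11.56, denominator df = 14
display Ftail(3, 14, 11.56/3)

* b main effect: chi2(3) = 207.84, denominator df = 14
display Ftail(3, 14, 207.84/3)

* a main effect: chi2(1) = 3.88, denominator df = 6
display Ftail(1, 6, 3.88/1)
```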
While the conclusions for b and a#b do not change, the F-ratio for the a main effect
is not significant even though the mixed chi-square suggested that it was.
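One way to see why the a main effect changes status is to compare 5% critical values on the two scales. The sketch below uses Stata's invchi2tail() and invFtail() functions; the 0.05 level is an assumption, since the text does not name a significance level explicitly.

```stata
* 5% critical value for chi-square with 1 df (about 3.84)
display invchi2tail(1, 0.05)

* 5% critical value for F with 1 and 6 df (about 5.99)
display invFtail(1, 6, 0.05)
```

The observed value of 3.88 just clears the chi-square threshold but falls well short of the F threshold.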
See also
xtmixed & denominator degrees of freedom: myth or magic — 2011 Chicago Stata Conference