xtreg with its various options performs regression analysis on panel datasets. In this FAQ we will try to explain the differences between xtreg, re and xtreg, fe with an example that is taken from analysis of variance. The example (below) has 32 observations taken on eight subjects, that is, each subject is observed four times. The eight subjects are evenly divided into two groups of four.
The design is a mixed model with both within-subject and between-subject factors. The within-subject factor (b) has four levels and the between-subject factor (a) has two levels. To keep the analysis simple we will not consider the a*b interaction.
----------------------------------
        s |   b1   b2   b3   b4
----------+-----------------------
       a1 |
        1 |    3    4    7    7
        2 |    6    5    8    8
        3 |    3    4    7    9
        4 |    3    3    6    8
----------+-----------------------
       a2 |
        5 |    1    2    5   10
        6 |    2    3    6   10
        7 |    2    4    5    9
        8 |    2    3    6   11
----------------------------------
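For readers who cannot reach the online dataset, the table can be typed in directly. This is only a sketch: it assumes the long-format variables are named y, a, b and s, as they are in the commands in this FAQ, and the stub names y1-y4 are hypothetical intermediate names that are not part of the original dataset.

```stata
* Sketch: enter the table in wide form, one row per subject.
* s = subject, a = between-subject group, y1-y4 = scores at b1-b4.
clear
input s a y1 y2 y3 y4
1 1 3 4 7 7
2 1 6 5 8 8
3 1 3 4 7 9
4 1 3 3 6 8
5 2 1 2 5 10
6 2 2 3 6 10
7 2 2 4 5 9
8 2 2 3 6 11
end
* Reshape to long form: one row per subject-by-level observation.
reshape long y, i(s) j(b)
* Declare s as the panel (group) variable.
xtset s
list in 1/4
```

After the reshape there should be 32 rows, four per subject, with a constant within each subject.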
We will begin by looking at the within-subject factor using xtreg, fe. The fe option stands for fixed effects, which in this design corresponds to the within-subjects part of the analysis. Notice that coefficients are reported only for the within-subject variable b; the between-subject factor a is constant within each subject, so it is collinear with the subject effects and is omitted. Following xtreg, we will use the test command to obtain the three-degree-of-freedom test of the levels of b.
use https://stats.idre.ucla.edu/stat/stata/faq/spf24
xtreg y i.a i.b, i(s) fe
note: 2.a omitted because of collinearity
Fixed-effects (within) regression               Number of obs      =        32
Group variable: s                               Number of groups   =         8
R-sq:                                           Obs per group:
     within  = 0.8722                                          min =         4
     between =      .                                          avg =       4.0
     overall = 0.8259                                          max =         4
                                                F(3,21)            =     47.77
corr(u_i, Xb)  = -0.0000                        Prob > F           =    0.0000
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         2.a |          0  (omitted)
             |
           b |
          2  |        .75   .5824824     1.29   0.212    -.4613384    1.961338
          3  |        3.5   .5824824     6.01   0.000     2.288662    4.711338
          4  |       6.25   .5824824    10.73   0.000     5.038662    7.461338
             |
       _cons |       2.75   .4118772     6.68   0.000     1.893454    3.606546
-------------+----------------------------------------------------------------
     sigma_u |   .6681531
     sigma_e |  1.1649647
         rho |  .24752475   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(7, 21) =     1.32               Prob > F = 0.2914
test 2.b 3.b 4.b
( 1) 2.b = 0
( 2) 3.b = 0
( 3) 4.b = 0
F( 3, 21) = 47.77
Prob > F = 0.0000
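One way to see that fe is the within-subjects analysis is to fit the same model by ordinary regression with the subject indicators included. The sketch below is an illustration rather than part of the original FAQ; it assumes the spf24 dataset loads from the URL used above. With these data, both approaches should give the same coefficients for b and the same F(3, 21) test as xtreg, fe.

```stata
* Sketch: two other ways to obtain the within (fixed-effects) estimates of b.
use https://stats.idre.ucla.edu/stat/stata/faq/spf24, clear

* (1) Absorb the subject indicators instead of estimating them.
areg y i.b, absorb(s)
testparm i.b

* (2) Include the subject indicators explicitly.
regress y i.b i.s
testparm i.b
```

Here testparm i.b is shorthand for the joint test of 2.b, 3.b and 4.b.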
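Next, we will use the be option to look at the between-subject effect. This time only the coefficient for a is reported, because a is the between-subjects factor. The be estimator regresses the subject means of y on the subject means of the predictors; every subject is observed once at each level of b, so the subject means of the b indicators are identical across subjects and b is omitted.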
xtreg y i.a i.b, i(s) be
note: 2.b omitted because of collinearity
note: 3.b omitted because of collinearity
note: 4.b omitted because of collinearity
Between regression (regression on group means)  Number of obs      =        32
Group variable: s                               Number of groups   =         8
R-sq:                                           Obs per group:
     within  =      .                                          min =         4
     between = 0.2500                                          avg =       4.0
     overall = 0.0133                                          max =         4
                                                F(1,6)             =      2.00
sd(u_i + avg(e_i.))=  .625                      Prob > F           =    0.2070
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         2.a |      -.625   .4419417    -1.41   0.207    -1.706392    .4563925
             |
           b |
          2  |          0  (omitted)
          3  |          0  (omitted)
          4  |          0  (omitted)
             |
       _cons |     5.6875      .3125    18.20   0.000      4.92284     6.45216
------------------------------------------------------------------------------
test 2.a
( 1) 2.a = 0
F( 1, 6) = 2.00
Prob > F = 0.2070
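The header of the be output describes it as a regression on group means, and that can be checked by hand. The following is a minimal sketch, assuming the spf24 dataset loads from the URL used above: it collapses the data to one row per subject and runs an ordinary regression of the subject means on a. With balanced data such as these, the coefficient for a and the F(1, 6) test should match the xtreg, be results.

```stata
* Sketch: reproduce the between estimator with an ordinary regression
* on the eight subject means.
use https://stats.idre.ucla.edu/stat/stata/faq/spf24, clear
preserve
* One row per subject: the mean of y and the (constant) value of a.
collapse (mean) y (first) a, by(s)
regress y i.a
restore
```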
Now it is time to get both the within and between effects with a single xtreg, re command. Notice that there are now estimates for both a and b. Because the test command after xtreg, re reports a chi-square and not an F-ratio, we rescale the chi-square by dividing it by its degrees of freedom. The coefficients, standard errors, and rescaled test statistics for a and b in the re model are the same as those from the separate fe and be models, although the p-values and confidence intervals differ slightly because re reports z rather than t statistics, and the constant is different in each model (this agreement will likely happen only if the data are balanced, as they are here).
xtreg y i.a i.b, i(s) re
Random-effects GLS regression                   Number of obs      =        32
Group variable: s                               Number of groups   =         8
R-sq:                                           Obs per group:
     within  = 0.8722                                          min =         4
     between = 0.2500                                          avg =       4.0
     overall = 0.8392                                          max =         4
                                                Wald chi2(4)       =    145.32
corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         2.a |      -.625   .4419417    -1.41   0.157     -1.49119    .2411899
             |
           b |
          2  |        .75   .5824824     1.29   0.198    -.3916445    1.891644
          3  |        3.5   .5824824     6.01   0.000     2.358356    4.641644
          4  |       6.25   .5824824    10.73   0.000     5.108356    7.391644
             |
       _cons |     3.0625    .474224     6.46   0.000     2.133038    3.991962
-------------+----------------------------------------------------------------
     sigma_u |  .22658174
     sigma_e |  1.1649647
         rho |  .03645008   (fraction of variance due to u_i)
------------------------------------------------------------------------------
test 2.a
( 1) 2.a = 0
chi2( 1) = 2.00
Prob > chi2 = 0.1573
test 2.b 3.b 4.b
( 1) 2.b = 0
( 2) 3.b = 0
( 3) 4.b = 0
chi2( 3) = 143.32
Prob > chi2 = 0.0000
/* convert chi-square to F */
display "F = " r(chi2)/r(df)
F = 47.77193
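The rescaled statistic can also be given a p-value from the F distribution. The sketch below assumes the xtreg, re results are still the active estimates, and it borrows the denominator degrees of freedom (21 for b, 6 for a) from the fe and be output above; that choice is an assumption made for this illustration and is not something the test command supplies. The scalar names Fb and Fa are hypothetical.

```stata
* Sketch: attach F-based p-values to the rescaled chi-squares.
* Assumes xtreg y i.a i.b, i(s) re is the most recent estimation command.

* Within-subject factor b: denominator df = 21, taken from the fe model.
test 2.b 3.b 4.b
scalar Fb = r(chi2)/r(df)
display "F = " scalar(Fb) "   p = " Ftail(3, 21, scalar(Fb))

* Between-subject factor a: denominator df = 6, taken from the be model.
test 2.a
scalar Fa = r(chi2)/r(df)
display "F = " scalar(Fa) "   p = " Ftail(1, 6, scalar(Fa))
```

Ftail() returns the upper-tail probability of the F distribution, so these values can be compared with the Prob > F lines reported after the fe and be models.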
Stata's xtreg random-effects estimator is a matrix-weighted average of the fixed-effects (within) and between-effects estimators. In our example the within- and between-subject effects are orthogonal, so re produces the same results as the separate fe and be models. With more general panel datasets, the fe and be results won't necessarily combine in this way.
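To see the weight that re uses in combining the two estimators, xtreg, re accepts a theta option that adds the estimated theta to the output. The sketch below also computes theta by hand from the stored variance components, using the standard balanced-panel expression with T = 4 observations per subject; treat it as an illustration under those assumptions. A theta of 0 corresponds to pooled ordinary regression and a theta of 1 to the fixed-effects estimator.

```stata
* Sketch: display the weight used by the random-effects estimator.
use https://stats.idre.ucla.edu/stat/stata/faq/spf24, clear
xtreg y i.a i.b, i(s) re theta

* By hand for a balanced panel with T = 4 observations per subject:
* theta = 1 - sqrt(sigma_e^2 / (T*sigma_u^2 + sigma_e^2))
display "theta = " 1 - sqrt(e(sigma_e)^2/(4*e(sigma_u)^2 + e(sigma_e)^2))
```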
