A single model may contain a number of linear equations. In such a model it is often unrealistic to expect the equation errors to be uncorrelated. A set of equations with contemporaneous cross-equation error correlation (i.e., the error terms of the regression equations are correlated) is called a seemingly unrelated regression (SUR) system. At first glance the equations seem unrelated, but they are related through the correlation in their errors. The Stata command for seemingly unrelated regression is sureg.
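For a system of two equations, as used in the example that follows, the setup can be sketched in matrix notation. The symbols are our own notation rather than anything produced by Stata: y_i is the outcome vector of equation i, X_i its predictor matrix, and N the number of observations.

```latex
\begin{aligned}
y_{1} &= X_{1}\beta_{1} + \varepsilon_{1} \\
y_{2} &= X_{2}\beta_{2} + \varepsilon_{2} \\
\operatorname{E}[\varepsilon_{i}] &= 0, \qquad
\operatorname{E}[\varepsilon_{i}\varepsilon_{j}'] = \sigma_{ij} I_{N},
\qquad i, j \in \{1, 2\}
\end{aligned}
```

The off-diagonal term \sigma_{12} is the contemporaneous covariance between the errors of the two equations. When it is nonzero, the equations are linked even though they share no outcome variable.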
We will illustrate sureg using the file hsb2.dta which contains 200 observations from the High School and Beyond study. hsb2.dta can be accessed directly over the Internet from the ATS website with the use command below.
use https://stats.idre.ucla.edu/stat/stata/notes/hsb2, clear
describe

Contains data from https://stats.idre.ucla.edu/stat/stata/notes/hsb2.dta
  obs:           200                          highschool and beyond (200 cases)
 vars:            11                          8 May 1999 14:55
 size:         9,600 (98.9% of memory free)
-------------------------------------------------------------------------------
   1. id        float  %9.0g
   2. female    float  %9.0g       gl
   3. race      float  %12.0g      rl
   4. ses       float  %9.0g       sl
   5. schtyp    float  %9.0g       scl        type of school
   6. prog      float  %9.0g       sel        type of program
   7. read      float  %9.0g                  reading score
   8. write     float  %9.0g                  writing score
   9. math      float  %9.0g                  math score
  10. science   float  %9.0g                  science score
  11. socst     float  %9.0g                  social studies score
-------------------------------------------------------------------------------
Sorted by:
We will fit two equations with the sureg command: one in which read is predicted by female, ses, and socst, and one in which math is predicted by female, ses, and science. Each equation is specified in its own set of parentheses, with the dependent variable (outcome) listed first, followed by the independent (predictor) variables. The "relationship" between the two equations is that their error terms are allowed to correlate.
sureg (read female ses socst)(math female ses science)

Seemingly unrelated regression
------------------------------------------------------------------
Equation          Obs  Parms        RMSE    "R-sq"       Chi2      P
------------------------------------------------------------------
read              200      3    7.940579    0.3972   117.2329 0.0000
math              200      3    7.200735    0.4063   116.6664 0.0000

------------------------------------------------------------------------------
         |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
read     |
  female |  -1.399691   1.139324     -1.229   0.219      -3.632726    .8333432
     ses |   1.495314   .8298821      1.802   0.072       -.131225    3.121853
   socst |   .5155857   .0548183      9.405   0.000       .4081438    .6230277
   _cons |   24.30038   3.343654      7.268   0.000       17.74694    30.85382
---------+--------------------------------------------------------------------
math     |
  female |   1.031629   1.031014      1.001   0.317      -.9891222     3.05238
     ses |   1.657043   .7340503      2.257   0.024       .2183303    3.095755
 science |   .5058777   .0529013      9.563   0.000        .402193    .6095624
   _cons |   21.41615   3.416379      6.269   0.000       14.72017    28.11213
------------------------------------------------------------------------------
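Whether the errors of the two equations are in fact correlated can be checked with sureg's corr option, which displays the correlation matrix of the residuals and a Breusch-Pagan test of independence. The sketch below assumes hsb2.dta is still in memory. It also shows a cross-equation test, included here only as an illustration of the [equation]variable syntax. No output is reproduced because these commands were not part of the original example.

```stata
* Refit the same system and ask for the residual correlation matrix
* together with the Breusch-Pagan test of independence
sureg (read female ses socst) (math female ses science), corr

* Illustrative cross-equation test: does female have the same
* coefficient in the read equation as in the math equation?
test [read]female = [math]female
```

A small p-value on the Breusch-Pagan test would indicate that the errors are correlated, which is the situation in which sureg can differ noticeably from separate regressions.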
Let’s contrast the results of the sureg command with two separate regressions using the regress command.
regress read female ses socst

  Source |       SS       df       MS                  Number of obs =     200
---------+------------------------------               F(  3,   196) =   43.56
   Model |  8368.53693     3  2789.51231               Prob > F      =  0.0000
Residual |  12550.8831   196  64.0351177               R-squared     =  0.4000
---------+------------------------------               Adj R-squared =  0.3909
   Total |    20919.42   199  105.122714               Root MSE      =  8.0022

------------------------------------------------------------------------------
    read |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
  female |  -1.511128   1.151079     -1.313   0.191      -3.781219    .7589629
     ses |   1.218366   .8399004      1.451   0.148      -.4380365    2.874768
   socst |   .5699327   .0562967     10.124   0.000       .4589077    .6809578
   _cons |   22.19363   3.400423      6.527   0.000       15.48751    28.89974
------------------------------------------------------------------------------

regress math female ses science

  Source |       SS       df       MS                  Number of obs =     200
---------+------------------------------               F(  3,   196) =   45.62
   Model |  7181.43086     3  2393.81029               Prob > F      =  0.0000
Residual |  10284.3641   196  52.4712456               R-squared     =  0.4112
---------+------------------------------               Adj R-squared =  0.4022
   Total |   17465.795   199  87.7678141               Root MSE      =  7.2437

------------------------------------------------------------------------------
    math |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
  female |   1.160903   1.041641      1.114   0.266      -.8933606    3.215167
     ses |   1.399639   .7423902      1.885   0.061      -.0644595    2.863737
 science |   .5753302    .054328     10.590   0.000       .4681876    .6824727
   _cons |   18.14428   3.481754      5.211   0.000       11.27777    25.01079
------------------------------------------------------------------------------
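Note that the regression coefficients, standard errors, R-squared values, etc. from sureg differ from those in the separate regressions. This is because sureg takes the correlation between the errors of the two equations into account when estimating the system, whereas regress treats each equation in isolation.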
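The role of the differing predictors (socst in one equation, science in the other) can be seen with a small experiment. It is a standard property of SUR that when every equation has exactly the same predictors, the coefficient estimates coincide with those from separate OLS regressions. The sketch below assumes hsb2.dta is in memory and uses socst in both equations purely for illustration. This variant is not part of the original example, so no output is shown.

```stata
* Both equations now share the same predictors
sureg (read female ses socst) (math female ses socst)

* Separate OLS fits for comparison: the coefficients should match
* those reported by sureg above
regress read female ses socst
regress math female ses socst
```

The standard errors may still differ slightly between the two approaches, because sureg by default reports large-sample (z) statistics while regress reports t statistics with a degrees-of-freedom adjustment.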
See also
Stata Code Fragment: Fitting a seemingly unrelated regression (sureg) manually