Figure 14.3 on page 380 using the data file hartnagl. We use the in option with the use command to omit the first four observations, which have missing values.
use https://stats.idre.ucla.edu/stat/stata/examples/ara/hartnagl in 5/l, clear
(From Fox, Applied Regression Analysis.  Use 'notes' command for source of data)

graph twoway scatter ftheft year, connect(l) ylabel(0(25)75) xlabel(1935(5)1970)
OLS column of Table 14.1 on page 380.
regress ftheft fertil labor postsec mtheft

  Source |       SS       df       MS              Number of obs =      34
---------+------------------------------           F(  4,    29) =  146.98
   Model |  8545.52157     4  2136.38039           Prob > F      =  0.0000
Residual |  421.508959    29  14.5347917           R-squared     =  0.9530
---------+------------------------------           Adj R-squared =  0.9465
   Total |  8967.03053    33  271.728198           Root MSE      =  3.8125

------------------------------------------------------------------------------
  ftheft |      Coef.   Std. Err.       t     P>|t|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
  fertil |  -.0060904   .0014495     -4.202   0.000      -.0090549   -.0031258
   labor |   .1199416   .0234097      5.124   0.000       .0720633    .1678198
 postsec |   .5515575   .0432509     12.753   0.000       .4630995    .6400156
  mtheft |   .0393248   .0185559      2.119   0.043       .0013738    .0772758
   _cons |  -7.334148   9.437921     -0.777   0.443      -26.63686    11.96857
------------------------------------------------------------------------------
Figure 14.4 on page 381, showing the residuals from the OLS regression above.
predict r, res

graph twoway scatter r year, connect(l) yline(0) ylabel(-10(5)10) xlabel(1935(5)1970)
Continuing Table 14.1 on page 380: EGLS(1).
tsset year
        time variable:  year, 1935 to 1968

corrgram r, lags(1)   /*get the lag-one autocorrelation of the residuals*/

                                          -1       0       1 -1       0       1
 LAG       AC       PAC      Q     Prob>Q  [Autocorrelation]  [Partial Autocor]
-------------------------------------------------------------------------------
1        0.2442   0.2673   2.2118  0.1370          |-                |--

gen fth1 = ftheft - 0.244*ftheft[_n-1]    /*using the transformation on page 378*/
gen fer1 = fertil - 0.244*fertil[_n-1]
gen lab1 = labor - 0.244*labor[_n-1]
gen pos1 = postsec - 0.244*postsec[_n-1]
gen mth1 = mtheft - 0.244*mtheft[_n-1]
(1 missing value generated by each of the commands above)

gen cons = .756                                        /*transformed constant, 1 - rho*/
replace fth1 = sqrt(1-.244*.244)*ftheft  if (_n==1)    /*transform the first observation*/
replace fer1 = sqrt(1-.244*.244)*fertil  if (_n==1)
replace lab1 = sqrt(1-.244*.244)*labor   if (_n==1)
replace pos1 = sqrt(1-.244*.244)*postsec if (_n==1)
replace mth1 = sqrt(1-.244*.244)*mtheft  if (_n==1)
replace cons = sqrt(1-.244*.244)         if (_n==1)
(1 real change made by each of the commands above)

regress fth1 cons fer1 lab1 pos1 mth1, noco

  Source |       SS       df       MS              Number of obs =      34
---------+------------------------------           F(  5,    29) =  333.75
   Model |   22417.159     5  4483.43181           Prob > F      =  0.0000
Residual |  389.566176    29  13.4333164           R-squared     =  0.9829
---------+------------------------------           Adj R-squared =  0.9800
   Total |  22806.7252    34  670.786035           Root MSE      =  3.6651

------------------------------------------------------------------------------
    fth1 |      Coef.   Std. Err.       t     P>|t|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
    cons |   -6.64294   11.12007     -0.597   0.555      -29.38604    16.10016
    fer1 |  -.0058795    .001779     -3.305   0.003      -.0095179    -.002241
    lab1 |   .1156293   .0270923      4.268   0.000       .0602194    .1710393
    pos1 |   .5358824   .0500467     10.708   0.000       .4335254    .6382394
    mth1 |   .0399334   .0219769      1.817   0.080      -.0050144    .0848812
------------------------------------------------------------------------------
Continuing Table 14.1 on page 380: EGLS(2).
regress fth1 cons fer1 lab1 pos1 mth1 in 2/l, noco

  Source |       SS       df       MS              Number of obs =      33
---------+------------------------------           F(  5,    28) =  318.04
   Model |  22027.4862     5  4405.49724           Prob > F      =  0.0000
Residual |  387.855476    28  13.8519813           R-squared     =  0.9827
---------+------------------------------           Adj R-squared =  0.9796
   Total |  22415.3417    33  679.252779           Root MSE      =  3.7218

------------------------------------------------------------------------------
    fth1 |      Coef.   Std. Err.       t     P>|t|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
    cons |  -5.518543   11.73656     -0.470   0.642       -29.5598    18.52272
    fer1 |  -.0060796   .0018941     -3.210   0.003      -.0099596   -.0021996
    lab1 |   .1136506   .0280815      4.047   0.000       .0561283     .171173
    pos1 |   .5342096    .051043     10.466   0.000       .4296526    .6387665
    mth1 |   .0407056   .0224247      1.815   0.080      -.0052293    .0866404
------------------------------------------------------------------------------
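Stata's built-in prais command automates this transformation, given the tsset above. The following is only a sketch, not part of the book's table: by default prais estimates rho from a regression of the residuals on their lag, so we request rhotype(tscorr) to come closer to the lag-one autocorrelation used above, and the coefficients will still differ somewhat from the hand-transformed results.

prais ftheft fertil labor postsec mtheft, twostep rhotype(tscorr)         /*Prais-Winsten, analogous to EGLS(1)*/
prais ftheft fertil labor postsec mtheft, corc twostep rhotype(tscorr)    /*Cochrane-Orcutt, analogous to EGLS(2)*/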
Figure 14.5 on page 382. The ac command produces a correlogram (the autocorrelations) with pointwise confidence intervals obtained from the Q statistic.
ac r, lags(7) yline(0) ylabel(-.5(.25).5)
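As an aside (not needed for the figure), the corrgram command used earlier lists the same autocorrelations in tabular form, along with the partial autocorrelations and Q statistics:

corrgram r, lags(7)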
Figure 14.9 on page 400 using the data file US-pop.
use https://stats.idre.ucla.edu/stat/stata/examples/ara/US-pop, clear
(From Fox, Applied Regression Analysis.  Use 'notes' command for source of data)

gen myear = year - 1790

nl log3 pop myear, init(b1=350, b2=0.3, b3=15)   /*nonlinear least squares using the built-in model log3*/
(obs = 21)

Iteration 0:   residual SS =  1312135
Iteration 1:   residual SS =  100157.2
Iteration 2:   residual SS =  88085.9
   ...more iterations in between...
Iteration 26:  residual SS =  356.4
Iteration 27:  residual SS =  356.4

  Source |       SS       df       MS              Number of obs =      21
---------+------------------------------           F(  3,    18) = 4664.56
   Model |  277074.795     3  92358.2651           Prob > F      =  0.0000
Residual |  356.399974    18  19.7999986           R-squared     =  0.9987
---------+------------------------------           Adj R-squared =  0.9985
   Total |  277431.195    21  13211.0093           Root MSE      = 4.449719
                                                   Res. dev.     = 119.0576

3-parameter logistic function, pop = b1/(1+exp(-b2*(myear-b3)))
------------------------------------------------------------------------------
     pop |      Coef.   Std. Err.       t     P>|t|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
      b1 |   389.1659    30.8114     12.631   0.000       324.4335    453.8982
      b2 |    .022662   .0010857     20.873   0.000        .020381    .0249429
      b3 |   176.0811   7.244577     24.305   0.000       160.8608    191.3014
------------------------------------------------------------------------------
(SE's, P values, CI's, and correlations are asymptotic approximations)

predict mpop if e(sample)
(option yhat assumed; fitted values)

graph twoway scatter mpop pop year, connect(l i) msymbol(i O)
predict res, r

graph twoway scatter res year, yline(0) xlabel(1750(50)2000)
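The nl call above uses the older function-program syntax for the built-in log3 model. In Stata 9 and later, nl expects a substitutable expression instead; a minimal sketch of the equivalent call with the same model and starting values (the fitted parameters should agree with the output above up to convergence details):

nl (pop = {b1=350}/(1 + exp(-{b2=0.3}*(myear - {b3=15}))))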
Table 14.8 on page 414 using the data file duncan.
use https://stats.idre.ucla.edu/stat/stata/examples/ara/duncan, clear
(From Fox, Applied Regression Analysis.  Use 'notes' command for source of data)
Estimator: least squares.
regress prestige income educ

  Source |       SS       df       MS              Number of obs =      45
---------+------------------------------           F(  2,    42) =  101.22
   Model |  36180.9458     2  18090.4729           Prob > F      =  0.0000
Residual |  7506.69865    42   178.73092           R-squared     =  0.8282
---------+------------------------------           Adj R-squared =  0.8200
   Total |  43687.6444    44   992.90101           Root MSE      =  13.369

------------------------------------------------------------------------------
prestige |      Coef.   Std. Err.       t     P>|t|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
  income |   .5987328   .1196673      5.003   0.000       .3572343    .8402313
    educ |   .5458339   .0982526      5.555   0.000       .3475521    .7441158
   _cons |  -6.064663   4.271941     -1.420   0.163      -14.68579    2.556463
------------------------------------------------------------------------------
Estimator: least squares* (least squares with the occupations minister and railroad conductor omitted).
regress prestige income educ if(occtitle !="minister" & occtitle !="railroad_conductor")

  Source |       SS       df       MS              Number of obs =      43
---------+------------------------------           F(  2,    40) =  141.26
   Model |  36815.4292     2  18407.7146           Prob > F      =  0.0000
Residual |  5212.57076    40  130.314269           R-squared     =  0.8760
---------+------------------------------           Adj R-squared =  0.8698
   Total |    42028.00    42  1000.66667           Root MSE      =  11.416

------------------------------------------------------------------------------
prestige |      Coef.   Std. Err.       t     P>|t|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
  income |   .8673986   .1219756      7.111   0.000       .6208767     1.11392
    educ |   .3322408   .0987498      3.364   0.002       .1326599    .5318217
   _cons |  -6.408986   3.652627     -1.755   0.087      -13.79122    .9732494
------------------------------------------------------------------------------
Estimator: least absolute values.
qreg prestige income educ

Iteration  1:  WLS sum of weighted deviations =  435.45438

Iteration  1: sum of abs. weighted deviations =  448.85635
Iteration  2: sum of abs. weighted deviations =  420.65054
Iteration  3: sum of abs. weighted deviations =   420.1346
Iteration  4: sum of abs. weighted deviations =  416.71435
Iteration  5: sum of abs. weighted deviations =  415.99351
Iteration  6: sum of abs. weighted deviations =  415.97706

Median regression                                    Number of obs =        45
  Raw sum of deviations     1249 (about 41)
  Min sum of deviations 415.9771                     Pseudo R2     =    0.6670

------------------------------------------------------------------------------
prestige |      Coef.   Std. Err.       t     P>|t|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
  income |   .7477064   .1016554      7.355   0.000       .5425576    .9528553
    educ |   .4587156   .0852699      5.380   0.000       .2866339    .6307972
   _cons |  -6.408257   4.068319     -1.575   0.123      -14.61846    1.801944
------------------------------------------------------------------------------
Estimator: bisquare (biweight). Stata's rreg first takes Huber iterations and then switches to biweight iterations, as the iteration log below shows.
rreg prestige income educ, tolerance(0.0005)

   Huber iteration 1:  maximum difference in weights = .6047203
     ..more iterations...
   Huber iteration 6:  maximum difference in weights = .001508
Biweight iteration 7:  maximum difference in weights = .28607984
     ..more iterations...
Biweight iteration 16: maximum difference in weights = .00046242

Robust regression estimates                          Number of obs =        45
                                                     F(  2,    42) =    141.84
                                                     Prob > F      =    0.0000

------------------------------------------------------------------------------
prestige |      Coef.   Std. Err.       t     P>|t|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
  income |   .8183817   .1053366      7.769   0.000       .6058039     1.03096
    educ |   .4039986   .0864864      4.671   0.000        .229462    .5785352
   _cons |  -7.480399   3.760355     -1.989   0.053       -15.0691    .1083048
------------------------------------------------------------------------------
Estimator: Huber.
We modified the rreg.ado file so that it also displays the results from the Huber estimation iterations. The modified file, rregh.ado, is available at https://stats.idre.ucla.edu/wp-content/uploads/2016/02/rregh.ado.
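One way to make rregh.ado available to Stata (a sketch; you can equally download the file by hand and save it in the current working directory or anywhere on your adopath) is to fetch it with the copy command:

copy https://stats.idre.ucla.edu/wp-content/uploads/2016/02/rregh.ado rregh.ado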
rregh prestige income educ, tolerance(0.0005)
   Huber iteration 1:  maximum difference in weights = .6047203
   Huber iteration 2:  maximum difference in weights = .16902569
   Huber iteration 3:  maximum difference in weights = .04468015
   Huber iteration 4:  maximum difference in weights = .01946043
   Huber iteration 5:  maximum difference in weights = .00284761
   Huber iteration 6:  maximum difference in weights = .001508
   Huber iteration 7:  maximum difference in weights = .00041577

------------------------------------------------------------------------------
    prestige |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      income |   .7101061   .1021612     6.95   0.000     .5039364    .9162758
        educ |     .48201   .0836081     5.77   0.000      .313282     .650738
       _cons |  -7.283356    3.32942    -2.19   0.034     -14.0024   -.5643149
------------------------------------------------------------------------------

Biweight iteration 8:  maximum difference in weights = .28604323
Biweight iteration 9:  maximum difference in weights = .09851044
Biweight iteration 10: maximum difference in weights = .15075821
Biweight iteration 11: maximum difference in weights = .07122389
Biweight iteration 12: maximum difference in weights = .02099558
Biweight iteration 13: maximum difference in weights = .01118912
Biweight iteration 14: maximum difference in weights = .00423083
Biweight iteration 15: maximum difference in weights = .00229196
Biweight iteration 16: maximum difference in weights = .00086014
Biweight iteration 17: maximum difference in weights = .00046868

Robust regression                                    Number of obs =        45
                                                     F(  2,    42) =    141.84
                                                     Prob > F      =    0.0000

------------------------------------------------------------------------------
    prestige |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      income |   .8183817   .1053366     7.77   0.000     .6058038     1.03096
        educ |   .4039985   .0864864     4.67   0.000     .2294619    .5785352
       _cons |  -7.480396   3.760356    -1.99   0.053     -15.0691     .1083092
------------------------------------------------------------------------------
Figure 14.14 on page 415, showing the final weights from the bisquare estimator.
rreg prestige income educ, tolerance(.0005) genwt(mywt)   /*generate the final weights; intermediate output omitted here*/

gen index = _n

graph twoway (scatter mywt index) ///
             (scatter mywt index if mywt <= .25, mlabel(occtitle) mlabangle(15)), ylabel(0 .5 1)
Figure 14.15 on page 418 using the data file prestige. The focal value is the 80th ordered income, income[80] = 8403; we keep its 40 nearest neighbors, so the remaining 62 observations are set to missing.
use https://stats.idre.ucla.edu/stat/stata/examples/ara/prestige, clear
(From Fox, Applied Regression Analysis.  Use 'notes' command for source of data)

sort income
gen neb = abs(income - income[80])   /*get obs close to income[80]*/
sort neb
gen mpr = prestige in 1/40
(62 missing values generated)
gen minc = income in 1/40            /*get x-coordinates for the reference lines*/
(62 missing values generated)

display minc[1]
8403

display minc[40]
5902
Figure 14.15 (a) on page 418 using a symmetric neighborhood. For example, to get the upper bound we compute 8403 + (8403 - 5902) = 10904.
graph twoway (scatter prestige income) (scatter mpr income), ///
    xline(5902 8403 10904) xlabel(0(5000)30000) ylabel(0(40)120)
Figure 14.15 (b) on page 418 using the tricube weight function.
gen stinc = abs(minc - 8403)/2501
(62 missing values generated)
gen fv = (1 - abs(stinc)^3)^3
(62 missing values generated)
replace fv = 0 if (fv < 0)
(0 real changes made)

sort minc
graph twoway scatter fv minc, connect(l) msymbol(i) ///
    xline(5902 8403 10904) xlabel(0(5000)30000) ylabel(0 .5 1)
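For reference, the weight computed above is the tricube function W(z) = (1 - |z|^3)^3 for |z| < 1 and W(z) = 0 otherwise, where z = (income - 8403)/2501 is the distance from the focal value scaled by the half-width of the neighborhood, 2501 = 8403 - 5902.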
Figure 14.15 (c) on page 418 showing local weighted regression.
regress prestige minc
(output omitted here)

matrix list e(b)

e(b)[1,2]
          minc      _cons
y1   .00343774    25.4807

gen pred = 25.4807 + .00343774*income

graph twoway scatter pred mpr income, connect(l i) msymbol(i O) ///
    xline(5902 8403 10904) xlabel(0(5000)30000) ylabel(0(40)120)
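Note that the regress command above fits an ordinary (unweighted) least-squares line to the 40 observations inside the window (minc is missing elsewhere). To bring in the tricube weights fv computed for panel (b), one possibility is analytic weights; this is only a sketch, not part of the original example, and the fitted line will differ somewhat:

regress prestige minc [aweight=fv]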
Figure 14.15 (d) on page 418, showing lowess smoothing. The bwidth(0.4) option sets the span, that is, the fraction of the data used in each local fit.
lowess prestige income, bwidth(0.4) xlabel(0(5000)30000) ylabel(0(40)120)
Figure 14.16 on page 421.
lowess prestige income, bwidth(0.5) xlabel(0(5000)30000) ylabel(0(40)120) gen(newp)
gen res = prestige - newp

lowess res income, bwidth(0.5) xlabel(0(5000)30000) ylabel(0(40)120) yline(0)
For the calculations on page 423 and Figure 14.17 on page 424, please see the SAS version of this page; SAS handles them directly with proc loess.