Figure 6.1, page 184.
use https://stats.idre.ucla.edu/stat/stata/examples/rwg/airpol, clear
gen lnhc = ln(hc)
label variable lnhc "natural log of hydrocarbon pollution potential"
regress mort lnhc
Source | SS df MS Number of obs = 60
---------+------------------------------ F( 1, 58) = 1.35
Model | 5179.87999 1 5179.87999 Prob > F = 0.2506
Residual | 223117.041 58 3846.84554 R-squared = 0.0227
---------+------------------------------ Adj R-squared = 0.0058
Total | 228296.921 59 3869.43934 Root MSE = 62.023
------------------------------------------------------------------------------
mort | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
lnhc | 7.968576 6.867098 1.160 0.251 -5.777414 21.71457
_cons | 918.4252 20.53273 44.730 0.000 877.3245 959.5259
------------------------------------------------------------------------------
graph twoway (scatter mort lnhc) (lfit mort lnhc), xlabel(0(2)6) ylabel(800(100)1100)
Figure 6.2, page 185.
rreg mort lnhc
Huber iteration 1: maximum difference in weights = .58511763
Huber iteration 2: maximum difference in weights = .12109939
Huber iteration 3: maximum difference in weights = .07054585
Huber iteration 4: maximum difference in weights = .02080019
Biweight iteration 5: maximum difference in weights = .20680335
Biweight iteration 6: maximum difference in weights = .06324705
Biweight iteration 7: maximum difference in weights = .05913415
Biweight iteration 8: maximum difference in weights = .02922746
Biweight iteration 9: maximum difference in weights = .01978239
Biweight iteration 10: maximum difference in weights = .01178611
Biweight iteration 11: maximum difference in weights = .0036652
Robust regression estimates Number of obs = 60
F( 1, 58) = 8.81
Prob > F = 0.0043
------------------------------------------------------------------------------
mort | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
lnhc | 19.45727 6.553716 2.969 0.004 6.338583 32.57596
_cons | 891.75 19.59571 45.507 0.000 852.525 930.9751
------------------------------------------------------------------------------
predict yhat
graph twoway (scatter mort lnhc) (lfit mort lnhc) (line yhat lnhc) ///
(scatter mort lnhc if lnhc >=4.5, mlabel(smsa)), ///
legend(off) xlabel(0(2)6) ylabel(800(100)1100)

Table 6.1, page 186. The genwt(rweight) option saves the final robust-regression weights in a new variable named rweight. The format rweight %3.2f command displays rweight with only two digits after the decimal.
use https://stats.idre.ucla.edu/stat/stata/examples/rwg/airpol, clear
gen lnhc = ln(hc)
rreg mort lnhc, genwt(rweight)
Huber iteration 1: maximum difference in weights = .58511763
Huber iteration 2: maximum difference in weights = .12109939
Huber iteration 3: maximum difference in weights = .07054585
Huber iteration 4: maximum difference in weights = .02080019
Biweight iteration 5: maximum difference in weights = .20680335
Biweight iteration 6: maximum difference in weights = .06324705
Biweight iteration 7: maximum difference in weights = .05913415
Biweight iteration 8: maximum difference in weights = .02922746
Biweight iteration 9: maximum difference in weights = .01978239
Biweight iteration 10: maximum difference in weights = .01178611
Biweight iteration 11: maximum difference in weights = .0036652
Robust regression estimates Number of obs = 60
F( 1, 58) = 8.81
Prob > F = 0.0043
------------------------------------------------------------------------------
mort | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
lnhc | 19.45727 6.553716 2.969 0.004 6.338583 32.57596
_cons | 891.75 19.59571 45.507 0.000 852.525 930.9751
------------------------------------------------------------------------------
predict yhat
predict e, r
sort rweight
format mort yhat e lnhc %3.1f
format rweight %3.2f
list smsa hc mort yhat e rweight
smsa hc mort yhat e rweight
1. SanJose 105 790.7 982.3 -191.6 0.08
2. NewOrleans 20 1113.0 950.0 163.0 0.23
3. LosAngeles 648 861.8 1017.7 -155.9 0.27
4. SanDiego 144 839.7 988.4 -148.7 0.32
5. Baltimore 43 1071.0 964.9 106.1 0.60
6. Wichita 4 823.8 918.7 -94.9 0.68
7. Lancaster 11 844.1 938.4 -94.3 0.68
8. Minneapolis 20 857.6 950.0 -92.4 0.69
9. SanFrancisco 311 911.7 1003.4 -91.7 0.70
10. Richmond 12 1026.0 940.1 85.9 0.73
11. Portland 56 894.0 970.1 -76.1 0.79
12. Denver 17 871.8 946.9 -75.1 0.79
13. Birmingham 30 1030.0 957.9 72.1 0.81
14. Chattanooga 18 1018.0 948.0 70.0 0.82
15. Albany 8 997.9 932.2 65.7 0.84
16. Memphis 15 1006.0 944.4 61.6 0.86
17. Wilmington 14 1004.0 943.1 60.9 0.86
18. Philadelphia 29 1015.0 957.3 57.7 0.87
19. Rochester 7 874.3 929.6 -55.3 0.88
20. Buffalo 18 1002.0 948.0 54.0 0.89
21. GrandRapids 5 871.3 923.1 -51.8 0.90
22. Miami 3 861.4 913.1 -51.7 0.90
23. Seattle 20 899.3 950.0 -50.7 0.90
24. Chicago 88 1025.0 978.9 46.1 0.92
25. Hartford 7 887.5 929.6 -42.1 0.93
26. Greensboro 8 971.1 932.2 38.9 0.94
27. Allentown 6 962.4 926.6 35.8 0.95
28. Atlanta 18 982.3 948.0 34.3 0.95
29. Toledo 11 972.5 938.4 34.1 0.95
30. Worcester 7 895.7 929.6 -33.9 0.96
31. Dallas 1 860.1 891.8 -31.7 0.96
32. NewYork 41 994.6 964.0 30.6 0.96
33. Milwaukee 33 929.2 959.8 -30.6 0.96
34. Akron 21 921.9 951.0 -29.1 0.97
35. Canton 12 912.3 940.1 -27.8 0.97
36. Cleveland 31 986.0 958.6 27.4 0.97
37. Bridgeport 6 899.5 926.6 -27.1 0.97
38. Indianapolis 13 968.7 941.7 27.0 0.97
39. Louisville 38 989.3 962.5 26.8 0.97
40. Houston 6 952.5 926.6 25.9 0.97
41. Pittsburgh 45 991.3 965.8 25.5 0.97
42. York 8 911.8 932.2 -20.4 0.98
43. Springfield 5 904.2 923.1 -18.9 0.99
44. Syracuse 8 950.7 932.2 18.5 0.99
45. Boston 21 934.7 951.0 -16.3 0.99
46. Cincinnati 26 970.5 955.1 15.4 0.99
47. Nashville 17 961.0 946.9 14.1 0.99
48. Providence 6 938.5 926.6 11.9 0.99
49. Youngstown 14 954.4 943.1 11.3 0.99
50. Utica 5 912.2 923.1 -10.9 1.00
51. KansasCity 7 919.7 929.6 -9.9 1.00
52. Dayton 6 936.2 926.6 9.6 1.00
53. Detroit 52 959.2 968.6 -9.4 1.00
54. Reading 11 946.2 938.4 7.8 1.00
55. Columbus 23 958.8 952.8 6.0 1.00
56. Washington 65 967.8 973.0 -5.2 1.00
57. StLouis 31 953.6 958.6 -5.0 1.00
58. NewHaven 4 923.2 918.7 4.5 1.00
59. Flint 11 941.2 938.4 2.8 1.00
60. FortWorth 1 891.7 891.8 -0.1 1.00
Figure 6.3, page 188.
format rweight %3.2f
graph twoway (scatter mort lnhc, mlabel(rweight) msymbol(i)) ///
(line yhat lnhc), xlabel(0(2)6) ylabel(800(100)1100)
Figure 6.7, page 197.
NOTE: The graphs for the following figure are produced manually, one iteration at a time. Alternatively, one can specify the graph option in the rreg command to get a slideshow of how the weights change during the iteratively reweighted least squares (IRLS) estimation.
use https://stats.idre.ucla.edu/stat/stata/examples/rwg/airpol, clear
Iteration 1
gen lnhc = ln(hc)
regress mort lnhc
Source | SS df MS Number of obs = 60
---------+------------------------------ F( 1, 58) = 1.35
Model | 5179.87999 1 5179.87999 Prob > F = 0.2506
Residual | 223117.041 58 3846.84554 R-squared = 0.0227
---------+------------------------------ Adj R-squared = 0.0058
Total | 228296.921 59 3869.43934 Root MSE = 62.023
------------------------------------------------------------------------------
mort | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
lnhc | 7.968576 6.867098 1.160 0.251 -5.777414 21.71457
_cons | 918.4252 20.53273 44.730 0.000 877.3245 959.5259
------------------------------------------------------------------------------
predict e, resid
summarize e, detail
Residuals
-------------------------------------------------------------
Percentiles Smallest
1% -164.8106 -164.8106
5% -106.9425 -118.3275
10% -76.94942 -108.2129 Obs 60
25% -40.61413 -105.672 Sum of Wgt. 60
50% 6.803905 Mean 7.55e-08
Largest Std. Dev. 61.49508
75% 40.52672 84.4721
90% 70.31952 87.77363 Variance 3781.645
95% 86.12286 122.6034 Skewness -.0801513
99% 170.7031 170.7031 Kurtosis 3.362466
The median of the residuals is the 50th percentile, 6.803905.
gen m0=6.803905
gen ee=abs(e-m0)
summarize ee, detail
ee
-------------------------------------------------------------
Percentiles Smallest
1% 1.00688 1.00688
5% 2.174106 1.00688
10% 5.724127 1.863093 Obs 60
25% 20.17187 2.485118 Sum of Wgt. 60
50% 35.41085 Mean 47.77389
Largest Std. Dev. 38.82904
75% 65.78219 115.7995
90% 106.3564 125.1314 Variance 1507.694
95% 120.4655 163.8992 Skewness 1.190757
99% 171.6145 171.6145 Kurtosis 4.333584
gen m00=35.41085
gen s0=m00/.6745
gen es0=abs(e)/s0
gen w1=.
(60 missing values generated)
replace w1=1 if es0<=1.345
(48 real changes made)
replace w1=1.345/es0 if es0>1.345
(12 real changes made)
gen z=1
graph twoway scatter w1 z, connect(l) xlabel(0 .5 1) ylabel(0 .5 1)
Iteration 2
gen mortw1=mort*sqrt(w1)
gen lnhcw0=sqrt(w1)
gen lnhcw1=lnhc*sqrt(w1)
regress mortw1 lnhcw1 lnhcw0, nocons
Source | SS df MS Number of obs = 60
---------+------------------------------ F( 2, 58) = 8902.25
Model | 50024076.4 2 25012038.2 Prob > F = 0.0000
Residual | 162958.613 58 2809.63127 R-squared = 0.9968
---------+------------------------------ Adj R-squared = 0.9966
Total | 50187035.0 60 836450.584 Root MSE = 53.006
------------------------------------------------------------------------------
mortw1 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
lnhcw1 | 12.08538 6.277814 1.925 0.059 -.4810322 24.65179
lnhcw0 | 908.4398 18.31743 49.594 0.000 871.7735 945.1061
------------------------------------------------------------------------------
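NOTE: The regression above is weighted least squares carried out by rescaling the variables. A short derivation follows, with the symbols w_i, b_0 and b_1 introduced here for illustration only (w_i is the weight w1 for observation i):

```latex
\sum_{i=1}^{60} w_i \bigl(\mathrm{mort}_i - b_0 - b_1\,\mathrm{lnhc}_i\bigr)^2
  = \sum_{i=1}^{60} \bigl(\sqrt{w_i}\,\mathrm{mort}_i
      - b_0\,\sqrt{w_i}
      - b_1\,\sqrt{w_i}\,\mathrm{lnhc}_i\bigr)^2
```

Minimizing the right-hand side is an ordinary regression of mortw1 on lnhcw0 and lnhcw1 with no constant, so the coefficient on lnhcw0 (the square root of the weight) plays the role of the intercept and the coefficient on lnhcw1 is the slope. The residuals of this regression equal sqrt(w_i) times the residuals on the original scale, which is why they are divided by lnhcw0 before the next median is taken. Because the model has no constant, the R-squared and F statistics in the output are not comparable to those of the unweighted regression.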
predict e1, resid
gen e11=e1/lnhcw0
summarize e11, detail
e11
-------------------------------------------------------------
Percentiles Smallest
1% -173.9847 -173.9847
5% -113.1364 -128.8018
10% -78.96231 -124.8792 Obs 60
25% -40.35685 -101.3937 Sum of Wgt. 60
50% 4.943437 Mean -1.349338
Largest Std. Dev. 61.68531
75% 36.87691 80.45545
90% 65.34878 87.52921 Variance 3805.077
95% 83.99233 117.1047 Skewness -.2116352
99% 168.3556 168.3556 Kurtosis 3.577332
gen m1=4.943437
gen ee1=abs(e11-m1)
summarize ee1, detail
ee1
-------------------------------------------------------------
Percentiles Smallest
1% 1.16269 1.16269
5% 1.609889 1.162691
10% 3.902509 1.284287 Obs 60
25% 17.60206 1.935491 Sum of Wgt. 60
50% 33.28572 Mean 47.06011
Largest Std. Dev. 39.91437
75% 63.93033 129.8226
90% 102.2999 133.7453 Variance 1593.157
95% 131.7839 163.4122 Skewness 1.300015
99% 178.9281 178.9281 Kurtosis 4.602989
gen m11=33.28572
gen s1=m11/.6745
gen es1=abs(e11)/s1
gen w2=.
(60 missing values generated)
replace w2=1 if es1<=1.345
(48 real changes made)
replace w2=1.345/es1 if es1>1.345
(12 real changes made)
graph twoway (scatter w2 w1) (line w1 w1, sort), xlabel(0 .5 1) ylabel(0 .5 1)
Iteration 3
gen mortw2=mort*sqrt(w2)
gen lnhcw2=sqrt(w2)
gen lnhcw3=lnhc*sqrt(w2)
regress mortw2 lnhcw2 lnhcw3, nocons
Source | SS df MS Number of obs = 60
---------+------------------------------ F( 2, 58) = 9248.78
Model | 49588476.9 2 24794238.5 Prob > F = 0.0000
Residual | 155487.033 58 2680.81092 R-squared = 0.9969
---------+------------------------------ Adj R-squared = 0.9968
Total | 49743964.0 60 829066.066 Root MSE = 51.777
------------------------------------------------------------------------------
mortw2 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
lnhcw2 | 905.4311 18.09045 50.050 0.000 869.2191 941.6431
lnhcw3 | 13.4653 6.237472 2.159 0.035 .9796466 25.95096
------------------------------------------------------------------------------
predict e2, resid
gen e22=e2/lnhcw2
summarize e22, detail
e22
-------------------------------------------------------------
Percentiles Smallest
1% -177.3981 -177.3981
5% -115.551 -132.6511
10% -79.9754 -130.804 Obs 60
25% -40.03335 -100.298 Sum of Wgt. 60
50% 4.820021 Mean -2.139968
Largest Std. Dev. 61.83381
75% 34.83411 78.77076
90% 64.34794 87.10891 Variance 3823.419
95% 82.93983 114.9232 Skewness -.2607152
99% 167.2305 167.2305 Kurtosis 3.656614
gen m2=4.820021
gen ee2=abs(e22-m2)
summarize ee2, detail
ee2
-------------------------------------------------------------
Percentiles Smallest
1% 1.339484 1.339484
5% 2.356566 1.339485
10% 4.189049 1.822319 Obs 60
25% 17.43784 2.890813 Sum of Wgt. 60
50% 32.98885 Mean 46.90808
Largest Std. Dev. 40.43538
75% 62.89881 135.624
90% 101.7788 137.4711 Variance 1635.02
95% 136.5476 162.4104 Skewness 1.342615
99% 182.2181 182.2181 Kurtosis 4.699437
gen m22=32.98885
gen s2=m22/.6745
gen es2=abs(e22)/s2
gen w3=.
(60 missing values generated)
replace w3=1 if es2<=1.345
(47 real changes made)
replace w3=1.345/es2 if es2>1.345
(13 real changes made)
graph twoway (scatter w3 w2) (line w2 w2, sort), xlabel(0 .5 1) ylabel(0 .5 1)
Iteration 4
gen mortw3=mort*sqrt(w3)
gen lnhcw4=sqrt(w3)
gen lnhcw5=lnhc*sqrt(w3)
regress mortw3 lnhcw4 lnhcw5, nocons
Source | SS df MS Number of obs = 60
---------+------------------------------ F( 2, 58) = 9332.34
Model | 49471461.0 2 24735730.5 Prob > F = 0.0000
Residual | 153731.199 58 2650.53791 R-squared = 0.9969
---------+------------------------------ Adj R-squared = 0.9968
Total | 49625192.2 60 827086.537 Root MSE = 51.483
------------------------------------------------------------------------------
mortw3 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
lnhcw4 | 904.103 18.09088 49.976 0.000 867.8902 940.3159
lnhcw5 | 14.08514 6.258612 2.251 0.028 1.557165 26.61311
------------------------------------------------------------------------------
predict e3, resid
gen e33=e3/lnhcw4
summarize e33, detail
e33
-------------------------------------------------------------
Percentiles Smallest
1% -178.9547 -178.9547
5% -116.6589 -134.4035
10% -80.97359 -133.4887 Obs 60
25% -39.90723 -99.82915 Sum of Wgt. 60
50% 4.111221 Mean -2.518487
Largest Std. Dev. 61.91423
75% 34.29168 77.99062
90% 64.13075 86.89668 Variance 3833.372
95% 82.44365 113.92 Skewness -.2834424
99% 166.7017 166.7017 Kurtosis 3.693297
gen m3=4.111221
gen ee3=abs(e33-m3)
summarize ee3, detail
ee3
-------------------------------------------------------------
Percentiles Smallest
1% .7889185 .7889185
5% 2.865521 .7889187
10% 4.604212 2.748589 Obs 60
25% 16.45549 2.982452 Sum of Wgt. 60
50% 33.48549 Mean 46.83978
Largest Std. Dev. 40.58225
75% 61.8055 137.5999
90% 100.9146 138.5147 Variance 1646.919
95% 138.0573 162.5905 Skewness 1.361655
99% 183.0659 183.0659 Kurtosis 4.749048
gen m33=33.48549
gen s3=m33/.6745
gen es3=abs(e33)/s3
gen w4=.
(60 missing values generated)
replace w4=1 if es3<=1.345
(46 real changes made)
replace w4=1.345/es3 if es3>1.345
(14 real changes made)
graph twoway (scatter w4 w3) (line w3 w3, sort), xlabel(0 .5 1) ylabel(0 .5 1)
Iteration 5 (change to biweight)
gen mortw4=mort*sqrt(w4)
gen lnhcw6=sqrt(w4)
gen lnhcw7=lnhc*sqrt(w4)
regress mortw4 lnhcw6 lnhcw7, nocons
Source | SS df MS Number of obs = 60
---------+------------------------------ F( 2, 58) = 9291.05
Model | 49565578.2 2 24782789.1 Prob > F = 0.0000
Residual | 154708.151 58 2667.38191 R-squared = 0.9969
---------+------------------------------ Adj R-squared = 0.9968
Total | 49720286.4 60 828671.44 Root MSE = 51.647
------------------------------------------------------------------------------
mortw4 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
lnhcw6 | 903.8434 18.15556 49.783 0.000 867.5011 940.1857
lnhcw7 | 14.21256 6.283636 2.262 0.027 1.634493 26.79062
------------------------------------------------------------------------------
predict e4, resid
gen e44=e4/lnhcw6
summarize e44, detail
e44
-------------------------------------------------------------
Percentiles Smallest
1% -179.2881 -179.2881
5% -116.9001 -134.7772
10% -81.27055 -134.054 Obs 60
25% -39.77162 -99.74622 Sum of Wgt. 60
50% 3.95209 Mean -2.609724
Largest Std. Dev. 61.93181
75% 34.16674 77.81688
90% 64.08532 86.83968 Variance 3835.549
95% 82.32828 113.7003 Skewness -.2881632
99% 166.5795 166.5795 Kurtosis 3.700915
gen m4=3.95209
gen ee4=abs(e44-m4)
summarize ee4, detail
ee4
-------------------------------------------------------------
Percentiles Smallest
1% .6757395 .6757395
5% 2.970146 .67574
10% 4.538526 2.939001 Obs 60
25% 16.38228 3.00129 Sum of Wgt. 60
50% 33.58758 Mean 46.82574
Largest Std. Dev. 40.61432
75% 61.58075 138.0061
90% 100.737 138.7292 Variance 1649.523
95% 138.3677 162.6274 Skewness 1.365385
99% 183.2402 183.2402 Kurtosis 4.758903
gen m44=33.58758
gen s4=m44/.6745
gen es4=e44/s4
gen w5=.
(60 missing values generated)
replace w5=(1-(es4/4.685)^2)^2 if (abs(e44)/s4)<=4.685
(60 real changes made)
replace w5=0 if (abs(e44)/s4)>4.685
(0 real changes made)
graph twoway (scatter w5 w4) (line w4 w4, sort), xlabel(0 .5 1) ylabel(0 .5 1)
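NOTE: The two weighting rules used in these iterations can be written compactly. The formulas below restate the gen and replace commands above; the symbols e_i, s and u_i are notation introduced here for illustration.

```latex
s = \frac{\operatorname{median}_i \bigl| e_i - \operatorname{median}_j e_j \bigr|}{0.6745},
\qquad
u_i = \frac{e_i}{s}

w_i^{\mathrm{Huber}} =
\begin{cases}
  1, & |u_i| \le 1.345 \\
  1.345 / |u_i|, & |u_i| > 1.345
\end{cases}

w_i^{\mathrm{biweight}} =
\begin{cases}
  \bigl(1 - (u_i / 4.685)^2\bigr)^2, & |u_i| \le 4.685 \\
  0, & |u_i| > 4.685
\end{cases}
```

Dividing the median absolute deviation by 0.6745 puts s on the scale of a standard deviation when the errors are normal. Huber weights leave every observation with a small scaled residual at exactly 1, whereas the biweight equals 1 only when the residual is exactly zero. This is why the switch to the biweight reports 60 real changes rather than about a dozen.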
Iteration 6
gen mortw5=mort*sqrt(w5)
gen lnhcw8=sqrt(w5)
gen lnhcw9=lnhc*sqrt(w5)
regress mortw5 lnhcw8 lnhcw9, nocons
Source | SS df MS Number of obs = 60
---------+------------------------------ F( 2, 58) =10199.02
Model | 46795992.5 2 23397996.2 Prob > F = 0.0000
Residual | 133060.189 58 2294.1412 R-squared = 0.9972
---------+------------------------------ Adj R-squared = 0.9971
Total | 46929052.7 60 782150.878 Root MSE = 47.897
------------------------------------------------------------------------------
mortw5 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
lnhcw8 | 900.2209 17.37487 51.812 0.000 865.4413 935.0005
lnhcw9 | 15.63724 6.060739 2.580 0.012 3.505358 27.76913
------------------------------------------------------------------------------
predict e5, resid
gen e55=e5/lnhcw8
summarize e55, detail
e55
-------------------------------------------------------------
Percentiles Smallest
1% -182.296 -182.296
5% -118.1669 -139.6547
10% -83.87064 -138.2351 Obs 60
25% -37.53524 -98.09869 Sum of Wgt. 60
50% 2.892957 Mean -2.909743
Largest Std. Dev. 62.15267
75% 34.47182 76.59371
90% 64.29753 86.92204 Variance 3862.955
95% 81.75788 111.9643 Skewness -.3419659
99% 165.9341 165.9341 Kurtosis 3.78783
gen m5=2.892957
gen ee5=abs(e55-m5)
summarize ee5, detail
ee5
-------------------------------------------------------------
Percentiles Smallest
1% .589726 .589726
5% 2.40177 .5897262
10% 5.645023 1.591628 Obs 60
25% 16.05141 3.211911 Sum of Wgt. 60
50% 32.65288 Mean 46.70808
Largest Std. Dev. 40.97041
75% 61.40457 141.1281
90% 98.751 142.5477 Variance 1678.574
95% 141.8379 163.0411 Skewness 1.408325
99% 185.1889 185.1889 Kurtosis 4.871201
gen m55=32.65288
gen s5=m55/.6745
gen es5=e55/s5
gen w6=.
(60 missing values generated)
replace w6=(1-(es5/4.685)^2)^2 if (abs(e55)/s5)<=4.685
(60 real changes made)
replace w6=0 if (abs(e55)/s5)>4.685
(0 real changes made)
graph twoway (scatter w6 w5) (line w5 w5, sort), xlabel(0 .5 1) ylabel(0 .5 1)
Iteration 7 (Note that iterations 7 and 8 are not graphed in the book.)
gen mortw6=mort*sqrt(w6)
gen lnhcw10=sqrt(w6)
gen lnhcw11=lnhc*sqrt(w6)
regress mortw6 lnhcw10 lnhcw11, nocons
Source | SS df MS Number of obs = 60
---------+------------------------------ F( 2, 58) =10569.43
Model | 46495826.3 2 23247913.2 Prob > F = 0.0000
Residual | 127573.451 58 2199.54225 R-squared = 0.9973
---------+------------------------------ Adj R-squared = 0.9972
Total | 46623399.8 60 777056.663 Root MSE = 46.899
------------------------------------------------------------------------------
mortw6 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
lnhcw10 | 897.4257 17.15824 52.303 0.000 863.0798 931.7717
lnhcw11 | 16.85949 6.015829 2.803 0.007 4.817501 28.90148
------------------------------------------------------------------------------
predict e6, resid
gen e66=e6/lnhcw10
summarize e66, detail
e66
-------------------------------------------------------------
Percentiles Smallest
1% -185.1891 -185.1891
5% -119.2561 -144.7722
10% -86.414 -141.5143 Obs 60
25% -35.92926 -96.99795 Sum of Wgt. 60
50% 2.874535 Mean -3.479797
Largest Std. Dev. 62.3774
75% 34.60612 75.23182
90% 64.16693 86.67999 Variance 3890.94
95% 80.95591 110.1623 Skewness -.3894314
99% 165.0678 165.0678 Kurtosis 3.864709
gen m6=2.874535
gen ee6=abs(e66-m6)
summarize ee6, detail
ee6
-------------------------------------------------------------
Percentiles Smallest
1% .4724467 .4724467
5% 3.736962 .4724476
10% 5.664235 2.87833 Obs 60
25% 15.30565 4.595593 Sum of Wgt. 60
50% 31.832 Mean 46.64581
Largest Std. Dev. 41.46435
75% 61.2924 144.3889
90% 98.25005 147.6468 Variance 1719.292
95% 146.0178 162.1933 Skewness 1.450591
99% 188.0636 188.0636 Kurtosis 4.977333
gen m66=31.832
gen s6=m66/.6745
gen es6=e66/s6
gen w7=.
(60 missing values generated)
replace w7=(1-(es6/4.685)^2)^2 if (abs(e66)/s6)<=4.685
(60 real changes made)
replace w7=0 if (abs(e66)/s6)>4.685
(0 real changes made)
Iteration 8
gen mortw7=mort*sqrt(w7)
gen lnhcw12=sqrt(w7)
gen lnhcw13=lnhc*sqrt(w7)
regress mortw7 lnhcw12 lnhcw13, nocons
Source | SS df MS Number of obs = 60
---------+------------------------------ F( 2, 58) =10952.16
Model | 46220949.9 2 23110475.0 Prob > F = 0.0000
Residual | 122387.492 58 2110.12917 R-squared = 0.9974
---------+------------------------------ Adj R-squared = 0.9973
Total | 46343337.4 60 772388.957 Root MSE = 45.936
------------------------------------------------------------------------------
mortw7 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
lnhcw12 | 894.7121 16.95266 52.777 0.000 860.7777 928.6466
lnhcw13 | 18.07344 5.972631 3.026 0.004 6.117918 30.02896
------------------------------------------------------------------------------
predict e7, resid
gen e77=e7/lnhcw12
summarize e77, detail
e77
-------------------------------------------------------------
Percentiles Smallest
1% -188.1251 -188.1251
5% -120.4005 -149.9176
10% -89.00262 -144.8338 Obs 60
25% -34.39677 -95.96723 Sum of Wgt. 60
50% 3.291201 Mean -4.108537
Largest Std. Dev. 62.63243
75% 33.61028 73.81649
90% 63.97465 86.37708 Variance 3922.822
95% 80.09678 108.3101 Skewness -.4375264
99% 164.1446 164.1446 Kurtosis 3.942903
gen m7=3.291201
gen ee7=abs(e77-m7)
summarize ee7, detail
ee7
-------------------------------------------------------------
Percentiles Smallest
1% .1415799 .1415799
5% 4.492992 .1415806
10% 6.058385 4.127497 Obs 60
25% 15.0027 4.858488 Sum of Wgt. 60
50% 32.00533 Mean 46.59337
Largest Std. Dev. 42.0805
75% 60.68345 148.125
90% 98.24997 153.2088 Variance 1770.768
95% 150.6669 160.8534 Skewness 1.492862
99% 191.4163 191.4163 Kurtosis 5.089983
gen m77=32.00533
gen s7=m77/.6745
gen es7=e77/s7
gen w8=.
(60 missing values generated)
replace w8=(1-(es7/4.685)^2)^2 if (abs(e77)/s7)<=4.685
(60 real changes made)
replace w8=0 if (abs(e77)/s7)>4.685
(0 real changes made)
Iteration 9
gen mortw8=mort*sqrt(w8)
gen lnhcw14=sqrt(w8)
gen lnhcw15=lnhc*sqrt(w8)
regress mortw8 lnhcw14 lnhcw15, nocons
Source | SS df MS Number of obs = 60
---------+------------------------------ F( 2, 58) =11032.70
Model | 46288732.5 2 23144366.2 Prob > F = 0.0000
Residual | 121672.237 58 2097.79719 R-squared = 0.9974
---------+------------------------------ Adj R-squared = 0.9973
Total | 46410404.7 60 773506.745 Root MSE = 45.802
------------------------------------------------------------------------------
mortw8 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
lnhcw14 | 893.1722 16.95102 52.691 0.000 859.2411 927.1034
lnhcw15 | 18.78453 5.985631 3.138 0.003 6.802987 30.76607
------------------------------------------------------------------------------
predict e8, resid
gen e88=e8/lnhcw14
summarize e88, detail
e88
-------------------------------------------------------------
Percentiles Smallest
1% -189.8946 -189.8946
5% -121.1205 -152.9813
10% -90.56861 -146.8279 Obs 60
25% -33.54876 -95.41312 Sum of Wgt. 60
50% 3.485634 Mean -4.526487
Largest Std. Dev. 62.79643
75% 32.97723 72.93785
90% 63.81236 86.14997 Variance 3943.392
95% 79.54391 107.1754 Skewness -.4660449
99% 163.5544 163.5544 Kurtosis 3.989447
gen m8=3.485634
gen ee8=abs(e88-m8)
summarize ee8, detail
ee8
-------------------------------------------------------------
Percentiles Smallest
1% .5012119 .5012119
5% 3.871048 .5012128
10% 6.578306 3.243295 Obs 60
25% 14.83559 4.4988 Sum of Wgt. 60
50% 32.01626 Mean 46.58111
Largest Std. Dev. 42.45112
75% 60.54584 150.3135
90% 98.25001 156.4669 Variance 1802.098
95% 153.3902 160.0687 Skewness 1.518493
99% 193.3803 193.3803 Kurtosis 5.161077
gen m88=32.01626
gen s8=m88/.6745
gen es8=e88/s8
gen w9=.
(60 missing values generated)
replace w9=(1-(es8/4.685)^2)^2 if (abs(e88)/s8)<=4.685
(60 real changes made)
replace w9=0 if (abs(e88)/s8)>4.685
(0 real changes made)
graph twoway (scatter w9 w8) (line w8 w8, sort), xlabel(0 .5 1) ylabel(0 .5 1)
Iteration 10
gen mortw9=mort*sqrt(w9)
gen lnhcw16=sqrt(w9)
gen lnhcw17=lnhc*sqrt(w9)
regress mortw9 lnhcw16 lnhcw17, nocons
Source | SS df MS Number of obs = 60
---------+------------------------------ F( 2, 58) =11114.84
Model | 46295327.8 2 23147663.9 Prob > F = 0.0000
Residual | 120790.281 58 2082.59105 R-squared = 0.9974
---------+------------------------------ Adj R-squared = 0.9973
Total | 46416118.0 60 773601.967 Root MSE = 45.635
------------------------------------------------------------------------------
mortw9 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
lnhcw16 | 892.1295 16.92993 52.695 0.000 858.2405 926.0184
lnhcw17 | 19.26631 5.987742 3.218 0.002 7.280545 31.25208
------------------------------------------------------------------------------
predict e9, resid
gen e99=e9/lnhcw16
summarize e99, detail
e99
-------------------------------------------------------------
Percentiles Smallest
1% -191.0941 -191.0941
5% -121.6088 -155.0575
10% -91.63017 -148.1794 Obs 60
25% -32.97474 -95.03825 Sum of Wgt. 60
50% 3.61685 Mean -4.810208
Largest Std. Dev. 62.91361
75% 32.54779 72.342
90% 63.70186 85.99552 Variance 3958.123
95% 79.16876 106.4061 Skewness -.4854825
99% 163.1539 163.1539 Kurtosis 4.021253
gen m9=3.61685
gen ee9=abs(e99-m9)
summarize ee9, detail
ee9
-------------------------------------------------------------
Percentiles Smallest
1% .7448692 .7448692
5% 3.345251 .7448692
10% 6.870948 2.644254 Obs 60
25% 14.72238 4.046248 Sum of Wgt. 60
50% 32.31808 Mean 46.5728
Largest Std. Dev. 42.71478
75% 60.51368 151.7963
90% 98.25004 158.6744 Variance 1824.553
95% 155.2353 159.537 Skewness 1.535109
99% 194.711 194.711 Kurtosis 5.209073
gen m99=32.31808
gen s9=m99/.6745
gen es9=e99/s9
gen w10=.
(60 missing values generated)
replace w10=(1-(es9/4.685)^2)^2 if (abs(e99)/s9)<=4.685
(60 real changes made)
replace w10=0 if (abs(e99)/s9)>4.685
(0 real changes made)
graph twoway (scatter w10 w9) (line w9 w9, sort), xlabel(0 .5 1) ylabel(0 .5 1)
Iteration 11
gen mortw10=mort*sqrt(w10)
gen lnhcw18=sqrt(w10)
gen lnhcw19=lnhc*sqrt(w10)
regress mortw10 lnhcw18 lnhcw19, nocons
Source | SS df MS Number of obs = 60
---------+------------------------------ F( 2, 58) =11069.26
Model | 46399353.4 2 23199676.7 Prob > F = 0.0000
Residual | 121560.159 58 2095.86482 R-squared = 0.9974
---------+------------------------------ Adj R-squared = 0.9973
Total | 46520913.5 60 775348.559 Root MSE = 45.781
------------------------------------------------------------------------------
mortw10 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
lnhcw18 | 891.8404 16.97908 52.526 0.000 857.8531 925.8278
lnhcw19 | 19.41061 6.006624 3.232 0.002 7.387048 31.43418
------------------------------------------------------------------------------
predict e10, resid
gen e1010=e10/lnhcw18
summarize e1010, detail
e1010
-------------------------------------------------------------
Percentiles Smallest
1% -191.4766 -191.4766
5% -121.7784 -155.7027
10% -91.97139 -148.6076 Obs 60
25% -32.82613 -94.94926 Sum of Wgt. 60
50% 3.632838 Mean -4.918479
Largest Std. Dev. 62.94965
75% 32.3959 72.14027
90% 63.64549 85.926 Variance 3962.659
95% 79.03314 106.1523 Skewness -.4913203
99% 163.0105 163.0105 Kurtosis 4.030817
gen m10=3.632838
gen ee10=abs(e1010-m10)
summarize ee10, detail
ee10
-------------------------------------------------------------
Percentiles Smallest
1% .8178842 .8178842
5% 3.119054 .8178847
10% 6.824313 2.464844 Obs 60
25% 14.68851 3.773263 Sum of Wgt. 60
50% 32.40845 Mean 46.57032
Largest Std. Dev. 42.79567
75% 60.50405 152.2404
90% 98.25003 159.3355 Variance 1831.47
95% 155.7879 159.3777 Skewness 1.539966
99% 195.1095 195.1095 Kurtosis 5.2234
gen m1010=32.40845
gen s10=m1010/.6745
gen es10=e1010/s10
gen w11=.
(60 missing values generated)
replace w11=(1-(es10/4.685)^2)^2 if (abs(e1010)/s10)<=4.685
(60 real changes made)
replace w11=0 if (abs(e1010)/s10)>4.685
(0 real changes made)
graph twoway (scatter w11 w10) (line w10 w10, sort), xlabel(0 .5 1) ylabel(0 .5 1)
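NOTE: The eleven manual iterations above can also be written as a loop. The following is a minimal sketch under stated assumptions, not the algorithm rreg itself runs: it switches from Huber to biweight weights after a fixed four updates, as in the walkthrough, rather than by a convergence rule. It reads the medians from r(p50) instead of typing them in, and it uses analytic weights, which give the same coefficients as regressing the sqrt(w)-scaled variables without a constant. The variable names (w, res, dev, u) and scalar names (irls_*) are illustrative. The preserve and restore commands keep the data currently in memory intact.

```stata
* Sketch of the manual IRLS walkthrough as a loop (all names are illustrative)
preserve
use https://stats.idre.ucla.edu/stat/stata/examples/rwg/airpol, clear
gen lnhc = ln(hc)

* iteration 1 is OLS, so every weight starts at 1
gen double w = 1
gen double res = .
gen double dev = .
gen double u = .

forvalues it = 1/11 {
    * weighted fit; with all weights equal to 1 this is ordinary least squares
    quietly regress mort lnhc [aweight = w]
    scalar irls_b1 = _b[lnhc]
    scalar irls_b0 = _b[_cons]
    display "iteration `it': slope = " %9.5f irls_b1 "  constant = " %9.4f irls_b0

    * residuals on the original scale
    quietly replace res = mort - irls_b0 - irls_b1*lnhc

    * scale estimate: median absolute deviation from the median, over .6745
    quietly summarize res, detail
    scalar irls_med = r(p50)
    quietly replace dev = abs(res - irls_med)
    quietly summarize dev, detail
    scalar irls_s = r(p50)/.6745
    quietly replace u = res/irls_s

    * Huber weights for the first four updates, biweight afterwards
    if `it' <= 4 {
        quietly replace w = cond(abs(u) <= 1.345, 1, 1.345/abs(u))
    }
    else {
        quietly replace w = cond(abs(u) <= 4.685, (1 - (u/4.685)^2)^2, 0)
    }
}
restore
```

The coefficients displayed at each pass should track the regress output of the corresponding manual iteration. Small differences are possible because the loop uses unrounded medians and double-precision storage.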
Figure 6.8, page 197.
rreg mort lnhc
Huber iteration 1: maximum difference in weights = .58511763
Huber iteration 2: maximum difference in weights = .12109939
Huber iteration 3: maximum difference in weights = .07054585
Huber iteration 4: maximum difference in weights = .02080019
Biweight iteration 5: maximum difference in weights = .20680335
Biweight iteration 6: maximum difference in weights = .06324705
Biweight iteration 7: maximum difference in weights = .05913415
Biweight iteration 8: maximum difference in weights = .02922746
Biweight iteration 9: maximum difference in weights = .01978239
Biweight iteration 10: maximum difference in weights = .01178611
Biweight iteration 11: maximum difference in weights = .0036652
Robust regression estimates Number of obs = 60
F( 1, 58) = 8.81
Prob > F = 0.0043
------------------------------------------------------------------------------
mort | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
lnhc | 19.45727 6.553716 2.969 0.004 6.338583 32.57596
_cons | 891.75 19.59571 45.507 0.000 852.525 930.9751
------------------------------------------------------------------------------
predict e15, r
gen se=e15/48.09
sort se
graph twoway scatter w11 se, xlabel(-5(1)5) ylabel(0(.2)1) xline(0)
Figure 6.9, page 202.
Note: The robust regression line in this graph does not match the one in the book. The next figure, which is drawn after the two outlying points are removed, does match.
use https://stats.idre.ucla.edu/stat/stata/examples/rwg/airpol, clear
regress mort hc
Source | SS df MS Number of obs = 60
---------+------------------------------ F( 1, 58) = 1.88
Model | 7181.1855 1 7181.1855 Prob > F = 0.1752
Residual | 221115.736 58 3812.34027 R-squared = 0.0315
---------+------------------------------ Adj R-squared = 0.0148
Total | 228296.921 59 3869.43934 Root MSE = 61.744
------------------------------------------------------------------------------
mort | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hc | -.1199471 .0873952 -1.372 0.175 -.2948875 .0549934
_cons | 944.905 8.630252 109.488 0.000 927.6297 962.1803
------------------------------------------------------------------------------
predict h, leverage
rreg mort hc, genwt(tempwt)
Huber iteration 1: maximum difference in weights = .47764579
Huber iteration 2: maximum difference in weights = .01386864
Biweight iteration 3: maximum difference in weights = .15809529
Biweight iteration 4: maximum difference in weights = .00210305
Robust regression estimates Number of obs = 60
F( 1, 58) = 1.63
Prob > F = 0.2075
------------------------------------------------------------------------------
mort | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hc | -.1138386 .089299 -1.275 0.207 -.2925899 .0649127
_cons | 944.1757 8.818254 107.071 0.000 926.524 961.8273
------------------------------------------------------------------------------
predict yhat2
sort hc
graph twoway (scatter mort hc) (lfit mort hc) (line yhat2 hc)
Figure 6.10, page 202.
drop if h>=.166
(2 observations deleted)
regress mort hc
Source | SS df MS Number of obs = 58
---------+------------------------------ F( 1, 56) = 0.04
Model | 157.327497 1 157.327497 Prob > F = 0.8424
Residual | 220947.205 56 3945.4858 R-squared = 0.0007
---------+------------------------------ Adj R-squared = -0.0171
Total | 221104.532 57 3879.02688 Root MSE = 62.813
------------------------------------------------------------------------------
mort | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hc | -.0636877 .318936 -0.200 0.842 -.7025932 .5752178
_cons | 943.6545 10.95789 86.116 0.000 921.7032 965.6057
------------------------------------------------------------------------------
rreg mort hc
Huber iteration 1: maximum difference in weights = .63453732
Huber iteration 2: maximum difference in weights = .23102242
Huber iteration 3: maximum difference in weights = .04856842
Biweight iteration 4: maximum difference in weights = .26244177
Biweight iteration 5: maximum difference in weights = .10250554
Biweight iteration 6: maximum difference in weights = .01126607
Biweight iteration 7: maximum difference in weights = .00181083
Robust regression estimates Number of obs = 57
F( 1, 55) = 15.93
Prob > F = 0.0002
------------------------------------------------------------------------------
mort | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
hc | 1.408462 .3529216 3.991 0.000 .7011917 2.115733
_cons | 918.531 10.21261 89.941 0.000 898.0645 938.9975
------------------------------------------------------------------------------
predict yhat6
graph twoway (scatter mort hc) (lfit mort hc) (line yhat6 hc, sort), ///
xlabel(0(50)150) ylabel(800(100)1100)
Figure 6.11, page 204.
use https://stats.idre.ucla.edu/stat/stata/examples/rwg/airpol, clear
gen lnhc = ln(hc)
graph twoway scatter lnhc hc, connect(l) sort xlabel(0(200)600) ylabel(0(2)6)
gen lnpopd = ln(popden)
graph twoway scatter lnpopd popden, connect(l) sort xlabel(0 5000 10000) ylabel(7 8 9)
gen nrrpoor = -(1/(sqrt(poor)))
graph twoway scatter nrrpoor poor, connect(l) sort xlabel(10(5)30) ylabel(-.35(.05)-.2)
gen srnonw = sqrt(nonw)
graph twoway scatter srnonw nonw, connect(l) sort xlabel(0(10)40) ylabel(0(2)6)
Figure 6.12, page 205.
regress mort rain jan educ srnonw
Source | SS df MS Number of obs = 60
---------+------------------------------ F( 4, 55) = 26.00
Model | 149326.539 4 37331.6348 Prob > F = 0.0000
Residual | 78970.3819 55 1435.82513 R-squared = 0.6541
---------+------------------------------ Adj R-squared = 0.6289
Total | 228296.921 59 3869.43934 Root MSE = 37.892
------------------------------------------------------------------------------
mort | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
rain | 1.038763 .5972583 1.739 0.088 -.1581692 2.235696
jan | -1.9212 .5579225 -3.443 0.001 -3.039302 -.8030985
educ | -21.13074 6.844689 -3.087 0.003 -34.8478 -7.413674
srnonw | 32.40913 4.662617 6.951 0.000 23.06503 41.75322
_cons | 1094.805 86.29434 12.687 0.000 921.8676 1267.743
------------------------------------------------------------------------------
avplot lnhc, mlabel(smsa) msymbol(i) xlabel(-4(2)2) ylabel(-100(50)50)
Figure 6.13, page 206.
use https://stats.idre.ucla.edu/stat/stata/examples/rwg/airpol, clear
gen lnhc=log(hc)
gen srnonw=sqrt(nonw)
rreg mort lnhc rain jan educ srnonw, genwt(rweight)
Huber iteration 1: maximum difference in weights = .45915614
Huber iteration 2: maximum difference in weights = .04128307
Biweight iteration 3: maximum difference in weights = .14223798
Biweight iteration 4: maximum difference in weights = .00447933
Robust regression estimates Number of obs = 60
F( 5, 54) = 28.12
Prob > F = 0.0000
------------------------------------------------------------------------------
mort | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
lnhc | 17.76648 4.625325 3.841 0.000 8.493262 27.0397
rain | 2.317299 .6382043 3.631 0.001 1.037776 3.596821
jan | -2.110483 .5029979 -4.196 0.000 -3.118933 -1.102033
educ | -19.10964 6.190165 -3.087 0.003 -31.52017 -6.699102
srnonw | 26.21364 4.38846 5.973 0.000 17.41531 35.01197
_cons | 1001.758 82.48887 12.144 0.000 836.3781 1167.139
------------------------------------------------------------------------------
predict e, r
gen e1=e/48.09
sort e1
graph twoway scatter rweight e1, connect(l) xlabel(-5(1)5) ylabel(0(.2)1) xline(0)
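NOTE: The divisor 48.09 above is typed in by hand, and this page does not show how it was obtained. As a hedged sketch only, one common robust scale is the median absolute deviation of the residuals divided by 0.6745. The commands below compute that scale from the residuals e generated above. The names abs_dev, s_mad and e_scaled are hypothetical, and the result is not guaranteed to equal 48.09.

```stata
* Sketch: MAD-based residual scale (an assumption, not necessarily the book's value)
quietly summarize e, detail
gen abs_dev = abs(e - r(p50))      // absolute deviations from the median residual
quietly summarize abs_dev, detail
scalar s_mad = r(p50)/0.6745       // MAD rescaled to be comparable to sigma under normality
display "MAD-based scale = " s_mad
gen e_scaled = e/s_mad             // scaled residuals, analogous to e1 above
```

Plotting rweight against e_scaled instead of e1 would then show the weight function on this scale.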
Table 6.3, page 206.
use https://stats.idre.ucla.edu/stat/stata/examples/rwg/airpol, clear
gen lnhc = ln(hc)
gen srnonw = sqrt(nonw)
regress mort lnhc rain jan educ srnonw
Source | SS df MS Number of obs = 60
---------+------------------------------ F( 5, 54) = 28.63
Model | 165769.597 5 33153.9194 Prob > F = 0.0000
Residual | 62527.3244 54 1157.91341 R-squared = 0.7261
---------+------------------------------ Adj R-squared = 0.7008
Total | 228296.921 59 3869.43934 Root MSE = 34.028
------------------------------------------------------------------------------
mort | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
lnhc | 17.4691 4.635721 3.768 0.000 8.175039 26.76316
rain | 2.352107 .6396387 3.677 0.001 1.069709 3.634505
jan | -2.1316 .5041284 -4.228 0.000 -3.142316 -1.120883
educ | -17.95806 6.204078 -2.895 0.005 -30.39649 -5.519631
srnonw | 27.3349 4.398323 6.215 0.000 18.5168 36.15301
_cons | 986.261 82.67427 11.929 0.000 820.509 1152.013
------------------------------------------------------------------------------
predict h, leverage
format h %3.2f
list smsa h if h>.23
smsa h
16. Dallas 0.23
21. FortWorth 0.25
29. LosAngeles 0.28
32. Miami 0.36
46. SanDiego 0.25
Table 6.14, page 209.
use https://stats.idre.ucla.edu/stat/stata/examples/rwg/bays, clear
list bay pcb84 pcb85
bay pcb84 pcb85
1. Casco Bay 95.28 77.55
2. Merrimack River 52.97 29.23
3. Salem Harbor 533.58 403.1
4. Boston Harbor 17104.86 736
5. Buzzards' Bay 308.46 192.15
6. Narragansett Bay 159.96 220.6
7. East Long Island Sound 10 8.62
8. West Long Island Sound 234.43 174.31
9. Raritan Bay 443.89 529.28
10. Delaware Bay 2.5 130.67
11. Lower Chesapeake Bay 51 39.74
12. Pamilico Sound 0 0
13. Charleston Harbor 9.1 8.43
14. Sapelo Sound 0 0
15. St. Johns River 140 120.04
16. Tampa Bay 0 0
17. Apalachicola Bay 12 11.93
18. Mobile Bay 0 0
19. Round Island 0 0
20. Mississippi River Delta 34 30.14
21. Barataria Bay 0 0
22. San Antonio Bay 0 0
23. Corpus Christi Bay 0 0
24. San Diego Harbor 422.1 531.67
25. San Diego Bay 6.74 9.3
26. Dana Point 7.06 5.74
27. Seal Beach 46.71 46.47
28. San Pedro Canyon 159.56 176.9
29. Santa Monica Bay 14 13.69
30. Bodega Bay 4.18 4.89
31. Coos Bay 3.19 6.6
32. Columbia River Mouth 8.77 6.73
33. Nisqually Beach 4.23 4.28
34. Commencement Bay 20.6 20.5
35. Elliott Bay 329.97 414.5
36. Lutak Inlet 5.5 5.8
37. Nahku Bay 6.6 5.08
Figure 6.15, page 210.
use https://stats.idre.ucla.edu/stat/stata/examples/rwg/bays, clear
qreg pcb85 pcb84
Iteration 1: WLS sum of weighted deviations = 3864.9265
Iteration 1: sum of abs. weighted deviations = 16672.456
Iteration 2: sum of abs. weighted deviations = 4312.7581
Iteration 3: sum of abs. weighted deviations = 2969.6801
Median regression Number of obs = 37
Raw sum of deviations 3821.07 (about 11.93)
Min sum of deviations 2969.68 Pseudo R2 = 0.2228
------------------------------------------------------------------------------
pcb85 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
pcb84 | .0425018 .0005854 72.608 0.000 .0413134 .0436901
_cons | 9.013539 9.6191 0.937 0.355 -10.51427 28.54135
------------------------------------------------------------------------------
predict h1
graph twoway (scatter pcb85 pcb84) (lfit pcb85 pcb84) (line h1 pcb84, sort), ///
xlabel(0(4000)16000) ylabel(0(200)800)
Figure 6.16, page 210.
NOTE: The predicted value for id #4 lies far above the highest point on the y-axis scale, so it is excluded from the plotted line (predict ... if id~=4) to reproduce the line shown in the text.
use https://stats.idre.ucla.edu/stat/stata/examples/rwg/bays, clear
regress pcb85 pcb84
Source | SS df MS Number of obs = 37
---------+------------------------------ F( 1, 35) = 21.77
Model | 462349.858 1 462349.858 Prob > F = 0.0000
Residual | 743254.192 35 21235.8341 R-squared = 0.3835
---------+------------------------------ Adj R-squared = 0.3659
Total | 1205604.05 36 33489.0014 Root MSE = 145.73
------------------------------------------------------------------------------
pcb85 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
pcb84 | .0404537 .0086698 4.666 0.000 .0228532 .0580543
_cons | 85.0138 24.42159 3.481 0.001 35.43533 134.5923
------------------------------------------------------------------------------
predict h, leverage
summarize h, detail
Leverage
-------------------------------------------------------------
Percentiles Smallest
1% .0270276 .0270276
5% .0270645 .0270645
10% .0271934 .0270821 Obs 37
25% .0277486 .0271934 Sum of Wgt. 37
50% .0280503 Mean .0540541
Largest Std. Dev. .1594038
75% .0280756 .0280853
90% .0280853 .0280853 Variance .0254096
95% .0280853 .0280853 Skewness 5.833291
99% .9974615 .9974615 Kurtosis 35.02746
Note that the 90th percentile is .0280853.
gen wh=.
(37 missing values generated)
replace wh=1 if h<=.0280853
(36 real changes made)
replace wh=(.0280853/h)^2 if h>.0280853
(1 real change made)
qreg pcb85 pcb84 [aw=wh]
(sum of wgt is 3.6001e+001)
Iteration 1: WLS sum of weighted deviations = 13713.003
(sum of wgt is 3.6001e+001)
Iteration 1: sum of abs. weighted deviations = 1122.2417
Iteration 2: sum of abs. weighted deviations = 942.67567
Iteration 3: sum of abs. weighted deviations = 921.36162
Iteration 4: sum of abs. weighted deviations = 921.36163
Median regression Number of obs = 37
Raw sum of deviations 3183.548 (about 11.93)
Min sum of deviations 921.3616 Pseudo R2 = 0.7106
------------------------------------------------------------------------------
pcb85 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
pcb84 | .994862 .0009638 1032.218 0.000 .9929053 .9968186
_cons | -7.92e-07 .1708606 0.000 1.000 -.3468662 .3468646
------------------------------------------------------------------------------
predict yhat3 if id~=4
(option xb assumed; fitted values)
(1 missing value generated)
graph twoway (scatter pcb85 pcb84) (line yhat3 pcb84, sort) ///
(scatter pcb85 pcb84 if pcb84 >= 16000, mlabel(bay) mlabposition(9)), ///
xlabel(0(4000)16000) ylabel(0(200)800)
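NOTE: The leverage cutoff above is copied by hand from the summarize output. As a minimal sketch, the same bounded-influence weights (1 at or below the 90th percentile of leverage, (cutoff/h)^2 above it) can be built from the stored result r(p90). The names h_auto, cut and wh_auto are hypothetical. The [aw=] weight type follows the command used on this page; which weight types qreg accepts may differ across Stata releases.

```stata
* Sketch: bounded-influence weights without a hand-typed cutoff
use https://stats.idre.ucla.edu/stat/stata/examples/rwg/bays, clear
quietly regress pcb85 pcb84
predict h_auto, leverage                 // diagonal of the hat matrix
quietly summarize h_auto, detail
scalar cut = r(p90)                      // 90th percentile of leverage
gen wh_auto = cond(h_auto <= cut, 1, (cut/h_auto)^2)
qreg pcb85 pcb84 [aw=wh_auto]            // weight type as on this page (assumption for newer Stata)
```

Reading the cutoff from r(p90) avoids rounding the percentile when retyping it, and the same lines apply unchanged to the log-scale model in the next figure.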
Figure 6.17, page 211.
use https://stats.idre.ucla.edu/stat/stata/examples/rwg/bays, clear
gen log84=log(pcb84+1)
gen log85=log(pcb85+1)
regress log85 log84
Source | SS df MS Number of obs = 37
---------+------------------------------ F( 1, 35) = 251.17
Model | 145.581687 1 145.581687 Prob > F = 0.0000
Residual | 20.2863138 35 .579608967 R-squared = 0.8777
---------+------------------------------ Adj R-squared = 0.8742
Total | 165.868001 36 4.60744448 Root MSE = .76132
------------------------------------------------------------------------------
log85 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
log84 | .8508259 .0536852 15.848 0.000 .741839 .9598127
_cons | .4251097 .202327 2.101 0.043 .014364 .8358553
------------------------------------------------------------------------------
predict h, leverage
summarize h, detail
Leverage
-------------------------------------------------------------
Percentiles Smallest
1% .0270889 .0270889
5% .0273455 .0273455
10% .0286045 .0278075 Obs 37
25% .0311876 .0286045 Sum of Wgt. 37
50% .0415393 Mean .0540541
Largest Std. Dev. .038911
75% .0706273 .0743968
90% .0743968 .0759508 Variance .0015141
95% .0818475 .0818475 Skewness 3.870526
99% .2560128 .2560128 Kurtosis 20.8041
Note that the 90th percentile is .0743968. The [aw=wh] option tells Stata to use wh as the aweight.
gen wh=.
(37 missing values generated)
replace wh=1 if h<=.0743968
(33 real changes made)
replace wh=(.0743968/h)^2 if h>.0743968
(4 real changes made)
qreg log85 log84 [aw=wh]
(sum of wgt is 3.5870e+001)
Iteration 1: WLS sum of weighted deviations = 14.000723
(sum of wgt is 3.5870e+001)
Iteration 1: sum of abs. weighted deviations = 11.648956
Iteration 2: sum of abs. weighted deviations = 9.4317616
Iteration 3: sum of abs. weighted deviations = 9.3947206
Median regression Number of obs = 37
Raw sum of deviations 63.89834 (about 2.332144)
Min sum of deviations 9.394721 Pseudo R2 = 0.8530
------------------------------------------------------------------------------
log85 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
log84 | .9922884 .0076806 129.194 0.000 .9766959 1.007881
_cons | -1.49e-09 .0211771 0.000 1.000 -.0429919 .0429919
------------------------------------------------------------------------------
predict yhat1
graph twoway (scatter log85 log84) (lfit log85 log84) (line yhat1 log84), ///
xlabel(0(2)10) ylabel(0(2)10)
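NOTE: The fitted values yhat1 are on the log(pcb85+1) scale. Because a median is preserved under a monotone transformation, a hedged sketch of the fitted medians in the original PCB units is to invert the transformation. The variable name pcb85_med is hypothetical, and the commands assume yhat1 from the weighted qreg above is still in memory.

```stata
* Sketch: back-transform fitted medians from log(pcb85 + 1) to original units
gen pcb85_med = exp(yhat1) - 1           // inverse of log(pcb85 + 1)
label variable pcb85_med "fitted median PCB 1985 (original units)"
list bay pcb84 pcb85 pcb85_med in 1/5
```

The same back-transformation applied to the OLS fit from regress would not give a fitted mean on the original scale, which is one reason to restrict it to the median regression here.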























