The data files used for the examples in this text can be downloaded as a zip file from the Stata Web site. Use any unzip utility to extract the data files.
Example 15.1 on page 455 using mroz.dta. The robust standard errors in the output below differ slightly from those in the book. This is because Stata applies a finite-sample correction to the robust variance estimate, scaling it by n/(n-k), where n is the number of observations and k is the number of estimated coefficients.
use mroz, clear
reg inlf nwifeinc educ exper expersq age kidslt6 kidsge6

      Source |       SS       df       MS              Number of obs =     753
-------------+------------------------------           F(  7,   745) =   38.22
       Model |  48.8080578     7  6.97257969           Prob > F      =  0.0000
    Residual |  135.919698   745  .182442547           R-squared     =  0.2642
-------------+------------------------------           Adj R-squared =  0.2573
       Total |  184.727756   752  .245648611           Root MSE      =  .42713

------------------------------------------------------------------------------
        inlf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0034052   .0014485    -2.35   0.019    -.0062488   -.0005616
        educ |   .0379953    .007376     5.15   0.000      .023515    .0524756
       exper |   .0394924   .0056727     6.96   0.000     .0283561    .0506287
     expersq |  -.0005963   .0001848    -3.23   0.001    -.0009591   -.0002335
         age |  -.0160908   .0024847    -6.48   0.000    -.0209686    -.011213
     kidslt6 |  -.2618105   .0335058    -7.81   0.000    -.3275875   -.1960335
     kidsge6 |   .0130122    .013196     0.99   0.324    -.0128935    .0389179
       _cons |   .5855192    .154178     3.80   0.000     .2828442    .8881943
------------------------------------------------------------------------------

reg inlf nwifeinc educ exper expersq age kidslt6 kidsge6, robust

Regression with robust standard errors                 Number of obs =     753
                                                       F(  7,   745) =   62.48
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.2642
                                                       Root MSE      =  .42713

------------------------------------------------------------------------------
             |               Robust
        inlf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0034052   .0015249    -2.23   0.026    -.0063988   -.0004115
        educ |   .0379953    .007266     5.23   0.000      .023731    .0522596
       exper |   .0394924     .00581     6.80   0.000     .0280864    .0508983
     expersq |  -.0005963     .00019    -3.14   0.002    -.0009693   -.0002233
         age |  -.0160908    .002399    -6.71   0.000    -.0208004   -.0113812
     kidslt6 |  -.2618105   .0317832    -8.24   0.000    -.3242058   -.1994152
     kidsge6 |   .0130122   .0135329     0.96   0.337     -.013555    .0395795
       _cons |   .5855192   .1522599     3.85   0.000     .2866098    .8844287
------------------------------------------------------------------------------
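
The size of the finite-sample correction can be checked by hand. The sketch below rescales a few of the robust standard errors reported above by sqrt((n-k)/n), which removes the n/(n-k) adjustment from the variance. It assumes n = 753 and k = 8 (seven regressors plus the constant), as read from the output. Whether the rescaled values match the book's figures exactly is not verified here.

```stata
* Sketch only: rescale the robust standard errors reported above so that
* they no longer include the n/(n-k) degrees-of-freedom adjustment.
* Assumption: n = 753 observations and k = 8 coefficients (7 regressors + _cons).
local n 753
local k 8
local adj = sqrt((`n' - `k')/`n')

* Robust standard errors copied from the -reg ..., robust- output.
display "adjustment factor = " `adj'
display "nwifeinc: " .0015249*`adj'
display "educ:     " .007266*`adj'
display "kidslt6:  " .0317832*`adj'
```

Because 745/753 is close to one, the adjustment changes the standard errors only slightly, which is consistent with the small discrepancy noted above.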
Example 15.2 on page 468 using mroz.dta.
reg inlf nwifeinc educ exper expersq age kidslt6 kidsge6

      Source |       SS       df       MS              Number of obs =     753
-------------+------------------------------           F(  7,   745) =   38.22
       Model |  48.8080578     7  6.97257969           Prob > F      =  0.0000
    Residual |  135.919698   745  .182442547           R-squared     =  0.2642
-------------+------------------------------           Adj R-squared =  0.2573
       Total |  184.727756   752  .245648611           Root MSE      =  .42713

------------------------------------------------------------------------------
        inlf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0034052   .0014485    -2.35   0.019    -.0062488   -.0005616
        educ |   .0379953    .007376     5.15   0.000      .023515    .0524756
       exper |   .0394924   .0056727     6.96   0.000     .0283561    .0506287
     expersq |  -.0005963   .0001848    -3.23   0.001    -.0009591   -.0002335
         age |  -.0160908   .0024847    -6.48   0.000    -.0209686    -.011213
     kidslt6 |  -.2618105   .0335058    -7.81   0.000    -.3275875   -.1960335
     kidsge6 |   .0130122    .013196     0.99   0.324    -.0128935    .0389179
       _cons |   .5855192    .154178     3.80   0.000     .2828442    .8881943
------------------------------------------------------------------------------

* Store the fitted values from the linear probability model.
* This step is required before lpm can be used in the next command.
predict lpm
(option xb assumed; fitted values)

gen lpm_c = (lpm>=.5)
tab inlf lpm_c

  =1 if in |
 lab frce, |         lpm_c
      1975 |         0          1 |     Total
-----------+----------------------+----------
         0 |       203        122 |       325
         1 |        78        350 |       428
-----------+----------------------+----------
     Total |       281        472 |       753

di (203+350)/753
.73439575

logit inlf nwifeinc educ exper expersq age kidslt6 kidsge6

Logit estimates                                   Number of obs   =        753
                                                  LR chi2(7)      =     226.22
                                                  Prob > chi2     =     0.0000
Log likelihood = -401.76515                       Pseudo R2       =     0.2197

------------------------------------------------------------------------------
        inlf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0213452   .0084214    -2.53   0.011    -.0378509   -.0048394
        educ |   .2211704   .0434396     5.09   0.000     .1360303    .3063105
       exper |   .2058695   .0320569     6.42   0.000     .1430391    .2686999
     expersq |  -.0031541   .0010161    -3.10   0.002    -.0051456   -.0011626
         age |  -.0880244    .014573    -6.04   0.000     -.116587   -.0594618
     kidslt6 |  -1.443354   .2035849    -7.09   0.000    -1.842373   -1.044335
     kidsge6 |   .0601122   .0747897     0.80   0.422     -.086473    .2066974
       _cons |   .4254524   .8603696     0.49   0.621    -1.260841    2.111746
------------------------------------------------------------------------------

* Stata 8 code.
lstat

* Stata 9 code and output.
estat classification

Logistic model for inlf

              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |       347           118  |        465
     -     |        81           207  |        288
-----------+--------------------------+-----------
   Total   |       428           325  |        753

Classified + if predicted Pr(D) >= .5
True D defined as inlf != 0
--------------------------------------------------
Sensitivity                     Pr( +| D)   81.07%
Specificity                     Pr( -|~D)   63.69%
Positive predictive value       Pr( D| +)   74.62%
Negative predictive value       Pr(~D| -)   71.88%
--------------------------------------------------
False + rate for true ~D        Pr( +|~D)   36.31%
False - rate for true D         Pr( -| D)   18.93%
False + rate for classified +   Pr(~D| +)   25.38%
False - rate for classified -   Pr( D| -)   28.13%
--------------------------------------------------
Correctly classified                        73.57%
--------------------------------------------------

probit inlf nwifeinc educ exper expersq age kidslt6 kidsge6

Probit estimates                                  Number of obs   =        753
                                                  LR chi2(7)      =     227.14
                                                  Prob > chi2     =     0.0000
Log likelihood = -401.30219                       Pseudo R2       =     0.2206

------------------------------------------------------------------------------
        inlf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0120237   .0048398    -2.48   0.013    -.0215096   -.0025378
        educ |   .1309047   .0252542     5.18   0.000     .0814074     .180402
       exper |   .1233476   .0187164     6.59   0.000     .0866641    .1600311
     expersq |  -.0018871      .0006    -3.15   0.002     -.003063   -.0007111
         age |  -.0528527   .0084772    -6.23   0.000    -.0694678   -.0362376
     kidslt6 |  -.8683285   .1185223    -7.33   0.000    -1.100628    -.636029
     kidsge6 |    .036005   .0434768     0.83   0.408     -.049208    .1212179
       _cons |   .2700768    .508593     0.53   0.595    -.7267473    1.266901
------------------------------------------------------------------------------

* Stata 8 code.
lstat

* Stata 9 code and output.
estat classification

Probit model for inlf

              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |       348           120  |        468
     -     |        80           205  |        285
-----------+--------------------------+-----------
   Total   |       428           325  |        753

Classified + if predicted Pr(D) >= .5
True D defined as inlf != 0
--------------------------------------------------
Sensitivity                     Pr( +| D)   81.31%
Specificity                     Pr( -|~D)   63.08%
Positive predictive value       Pr( D| +)   74.36%
Negative predictive value       Pr(~D| -)   71.93%
--------------------------------------------------
False + rate for true ~D        Pr( +|~D)   36.92%
False - rate for true D         Pr( -| D)   18.69%
False + rate for classified +   Pr(~D| +)   25.64%
False - rate for classified -   Pr( D| -)   28.07%
--------------------------------------------------
Correctly classified                        73.44%
--------------------------------------------------
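
The summary rates in a classification table follow directly from its four cell counts. As a minimal sketch, the lines below recompute three of them from the logit counts shown above. The local macro names (tp, fp, fn, tn) are illustrative and not part of the original example.

```stata
* Cell counts copied from the logit classification table above.
* tp: classified +, inlf = 1     fp: classified +, inlf = 0
* fn: classified -, inlf = 1     tn: classified -, inlf = 0
local tp 347
local fp 118
local fn 81
local tn 207

* Share of true positives among all women in the labor force.
display "Sensitivity (%):          " 100*`tp'/(`tp' + `fn')
* Share of true negatives among all women out of the labor force.
display "Specificity (%):          " 100*`tn'/(`tn' + `fp')
* Share of all observations on the main diagonal.
display "Correctly classified (%): " 100*(`tp' + `tn')/(`tp' + `fp' + `fn' + `tn')
```

Substituting the probit counts (348, 120, 80, 205) gives the corresponding probit figures, and the same diagonal-share calculation is what `di (203+350)/753` does for the linear probability model.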
Example 15.3 on page 474, testing for exogeneity of education.
reg educ nwifeinc exper expersq age kidslt6 kidsge6 motheduc fatheduc huseduc

      Source |       SS       df       MS              Number of obs =     753
-------------+------------------------------           F(  9,   743) =   74.07
       Model |  1849.07781     9   205.45309           Prob > F      =  0.0000
    Residual |  2060.96203   743  2.77383853           R-squared     =  0.4729
-------------+------------------------------           Adj R-squared =  0.4665
       Total |  3910.03984   752  5.19952106           Root MSE      =  1.6655

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |   .0156893   .0058267     2.69   0.007     .0042506     .027128
       exper |   .0577544   .0220604     2.62   0.009     .0144462    .1010625
     expersq |   -.000784    .000721    -1.09   0.277    -.0021994    .0006314
         age |  -.0059011   .0098709    -0.60   0.550    -.0252792     .013477
     kidslt6 |   .1195954   .1307071     0.91   0.360    -.1370038    .3761945
     kidsge6 |  -.0731404   .0515299    -1.42   0.156     -.174302    .0280212
    motheduc |   .1300347   .0225669     5.76   0.000     .0857322    .1743373
    fatheduc |   .0950702   .0214618     4.43   0.000     .0529373    .1372032
     huseduc |   .3475092   .0235063    14.78   0.000     .3013626    .3936558
       _cons |    5.43695   .5873755     9.26   0.000     4.283837    6.590064
------------------------------------------------------------------------------

predict u, res

probit inlf nwifeinc educ exper expersq age kidslt6 kidsge6 u

Probit estimates                                  Number of obs   =        753
                                                  LR chi2(8)      =     227.90
                                                  Prob > chi2     =     0.0000
Log likelihood = -400.92551                       Pseudo R2       =     0.2213

------------------------------------------------------------------------------
        inlf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0102851   .0052347    -1.96   0.049     -.020545   -.0000253
        educ |   .1035752   .0403061     2.57   0.010     .0245767    .1825737
       exper |   .1262477   .0190256     6.64   0.000     .0889582    .1635373
     expersq |  -.0019432   .0006032    -3.22   0.001    -.0031254   -.0007609
         age |  -.0543808   .0086633    -6.28   0.000    -.0713605   -.0374012
     kidslt6 |  -.8630859   .1187394    -7.27   0.000    -1.095811    -.630361
     kidsge6 |   .0313802   .0437901     0.72   0.474    -.0544468    .1172071
           u |   .0433658    .050021     0.87   0.386    -.0546736    .1414051
       _cons |   .6209105   .6497413     0.96   0.339     -.652559     1.89438
------------------------------------------------------------------------------
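
The exogeneity test reduces to the z statistic on the first-stage residual u in the probit above. The sketch below recomputes that statistic and its two-sided p-value from the reported coefficient and standard error. It assumes the `normal()` function, which is the name used in Stata 9 and later for the standard normal distribution function.

```stata
* Coefficient and standard error on u, copied from the probit output above.
local b  .0433658
local se .050021

* z statistic for H0: coefficient on u = 0 (education is exogenous).
local z = `b'/`se'
display "z = " `z'

* Two-sided p-value from the standard normal distribution.
display "p = " 2*(1 - normal(abs(`z')))
```

The p-value should agree with the 0.386 shown in the output, so the null hypothesis that education is exogenous is not rejected at conventional levels. Running `test u` immediately after the probit command would report the equivalent Wald test.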
Example 15.4 on page 498 using keane.dta.
use keane, clear

* Stata 8 code.
mlogit status educ exper expersq black if year ==87, basecategory(1)

* Stata 9 code and output.
mlogit status educ exper expersq black if year ==87, baseoutcome(1)

Multinomial logistic regression                   Number of obs   =       1717
                                                  LR chi2(8)      =     583.72
                                                  Prob > chi2     =     0.0000
Log likelihood = -907.85723                       Pseudo R2       =     0.2433

------------------------------------------------------------------------------
      status |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
2            |
        educ |  -.6736313   .0698999    -9.64   0.000    -.8106325     -.53663
       exper |  -.1062149    .173282    -0.61   0.540    -.4458414    .2334116
     expersq |  -.0125152   .0252291    -0.50   0.620    -.0619633     .036933
       black |   .8130166   .3027231     2.69   0.007     .2196902    1.406343
       _cons |   10.27787   1.133336     9.07   0.000     8.056578    12.49917
-------------+----------------------------------------------------------------
3            |
        educ |  -.3146573   .0651096    -4.83   0.000    -.4422699   -.1870448
       exper |   .8487367   .1569856     5.41   0.000     .5410507    1.156423
     expersq |  -.0773003   .0229217    -3.37   0.001    -.1222261   -.0323746
       black |   .3113612   .2815339     1.11   0.269     -.240435    .8631574
       _cons |   5.543798   1.086409     5.10   0.000     3.414475    7.673121
------------------------------------------------------------------------------
(Outcome status==1 is the comparison group)

predict p1 p2 p3 if e(sample), p
(11006 missing values generated)

gen pstatus = .
(12723 missing values generated)

egen atest = rmax(p1 p2 p3) if e(sample)
(11006 missing values generated)

foreach n of numlist 1/3 {
    replace pstatus = `n' if p`n'==atest & year ==87
}
(36 real changes made)
(214 real changes made)
(1530 real changes made)

gen pdiff = (status == pstatus)
tab pdiff if year==87 & exper~=. & black ~=. & status ~=.

      pdiff |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        351       20.44       20.44
          1 |      1,366       79.56      100.00
------------+-----------------------------------
      Total |      1,717      100.00

di 1366/1717
.79557368

test [2]: exper expersq

 ( 1)  [2]exper = 0
 ( 2)  [2]expersq = 0

           chi2(  2) =    6.12
         Prob > chi2 =    0.0468
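
To see how `predict ..., p` turns the coefficients into probabilities, the sketch below evaluates the multinomial logit formulas by hand for one hypothetical person. The values educ = 12, exper = 3 and black = 0 are assumptions chosen for illustration and do not come from the example. With outcome 1 as the base, its linear index is zero, so P(status=1) = 1/(1 + exp(xb2) + exp(xb3)) and P(status=j) = exp(xbj)/(1 + exp(xb2) + exp(xb3)) for j = 2, 3.

```stata
* Hypothetical covariate values (assumptions, not from the example).
local educ  12
local exper 3
local black 0

* Linear indexes for outcomes 2 and 3, using the coefficients reported above.
* The index for the base outcome (status==1) is zero by construction.
local xb2 = 10.27787 - .6736313*`educ' - .1062149*`exper' - .0125152*`exper'^2 + .8130166*`black'
local xb3 = 5.543798 - .3146573*`educ' + .8487367*`exper' - .0773003*`exper'^2 + .3113612*`black'

* Common denominator of the three response probabilities.
local den = 1 + exp(`xb2') + exp(`xb3')

display "P(status=1) = " 1/`den'
display "P(status=2) = " exp(`xb2')/`den'
display "P(status=3) = " exp(`xb3')/`den'
```

The three probabilities sum to one. The example's prediction rule then assigns each observation to the outcome with the largest of the three.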
Example 15.5 on page 507 using pension.dta.
use pension, clear
reg pctstck choice age educ female black married finc25 finc35 finc50 finc75 finc100 ///
    finc101 wealth89 prftshr

      Source |       SS       df       MS              Number of obs =     194
-------------+------------------------------           F( 14,   179) =    1.42
       Model |  30402.0516    14  2171.57511           Prob > F      =  0.1486
    Residual |  274134.031   179  1531.47503           R-squared     =  0.0998
-------------+------------------------------           Adj R-squared =  0.0294
       Total |  304536.082   193  1577.90716           Root MSE      =  39.134

------------------------------------------------------------------------------
     pctstck |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      choice |   12.04773   6.298171     1.91   0.057    -.3804881    24.47594
         age |  -1.625967   .7748246    -2.10   0.037    -3.154932   -.0970012
        educ |   .7538685   1.207392     0.62   0.533    -1.628684    3.136421
      female |   1.302856   7.163775     0.18   0.856    -12.83346    15.43917
       black |   3.967391   9.782799     0.41   0.686    -15.33706    23.27184
     married |   3.303436   7.997618     0.41   0.680    -12.47831    19.08518
      finc25 |  -18.18567   14.12026    -1.29   0.199    -46.04924    9.677906
      finc35 |  -3.925374   14.48565    -0.27   0.787    -32.50999    24.65924
      finc50 |  -8.128784   14.34191    -0.57   0.572    -36.42976    20.17219
      finc75 |  -17.57921   16.07766    -1.09   0.276    -49.30534    14.14693
     finc100 |   -6.74559   15.79116    -0.43   0.670    -37.90637    24.41519
     finc101 |  -28.34407    17.9049    -1.58   0.115    -63.67591    6.987774
    wealth89 |  -.0026918   .0124603    -0.22   0.829    -.0272797    .0218961
     prftshr |   15.80791   7.332677     2.16   0.032     1.338299    30.27752
       _cons |   134.1161   55.70525     2.41   0.017      24.1926    244.0395
------------------------------------------------------------------------------

oprobit pctstck choice age educ female black married finc25 finc35 finc50 finc75 finc100 finc101 ///
    wealth89 prftshr

Ordered probit estimates                          Number of obs   =        194
                                                  LR chi2(14)     =      20.77
                                                  Prob > chi2     =     0.1077
Log likelihood =  -201.9865                       Pseudo R2       =     0.0489

------------------------------------------------------------------------------
     pctstck |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      choice |    .371171   .1841121     2.02   0.044      .010318    .7320241
         age |  -.0500516   .0226063    -2.21   0.027    -.0943591    -.005744
        educ |   .0261382   .0352561     0.74   0.458    -.0429626    .0952389
      female |   .0455642    .206004     0.22   0.825    -.3581963    .4493246
       black |   .0933923   .2820403     0.33   0.741    -.4593965    .6461811
     married |   .0935981   .2332114     0.40   0.688    -.3634878     .550684
      finc25 |  -.5784299    .423162    -1.37   0.172    -1.407812    .2509524
      finc35 |  -.1346721   .4305242    -0.31   0.754    -.9784841    .7091399
      finc50 |  -.2620401   .4265936    -0.61   0.539    -1.098148    .5740681
      finc75 |  -.5662312   .4780035    -1.18   0.236    -1.503101    .3706385
     finc100 |  -.2278963   .4685942    -0.49   0.627    -1.146324    .6905316
     finc101 |  -.8641109   .5291111    -1.63   0.102     -1.90115    .1729279
    wealth89 |  -.0000956   .0003737    -0.26   0.798    -.0008279    .0006368
     prftshr |   .4817182   .2161233     2.23   0.026     .0581243     .905312
-------------+----------------------------------------------------------------
       _cut1 |  -3.087373   1.623765          (Ancillary parameters)
       _cut2 |  -2.053553   1.618611
------------------------------------------------------------------------------

gen pclass=.
(226 missing values generated)

predict c1 if e(sample), outcome(0)
(option p assumed; predicted probability)
(32 missing values generated)

predict c2 if e(sample), outcome(50)
(option p assumed; predicted probability)
(32 missing values generated)

predict c3 if e(sample), outcome(100)
(option p assumed; predicted probability)
(32 missing values generated)

egen atest = rmax(c1 c2 c3)
(32 missing values generated)

foreach n of numlist 1/3 {
    replace pclass = `n' if c`n' ==atest
}
(97 real changes made)
(113 real changes made)
(80 real changes made)

tab pclass pctstck if e(sample)

           | 0=mstbnds,50=mixed,100=mststcks
    pclass |         0         50        100 |     Total
-----------+---------------------------------+----------
         1 |        33         21         11 |        65
         2 |        25         31         25 |        81
         3 |         6         20         22 |        48
-----------+---------------------------------+----------
     Total |        64         72         58 |       194

di (33+31+22)/194
.44329897
di 33/64
.515625
di 31/72
.43055556
di 22/58
.37931034
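
The predicted probabilities c1, c2 and c3 come from the two cut points in the ordered probit output. For a linear index xb, Stata's parameterization gives P(pctstck=0) = Φ(cut1 - xb), P(pctstck=50) = Φ(cut2 - xb) - Φ(cut1 - xb), and P(pctstck=100) = 1 - Φ(cut2 - xb). The sketch below evaluates these for a hypothetical index of -2.5 with choice = 0, then again after adding the choice coefficient. The index value is an assumption for illustration only, and `normal()` assumes Stata 9 or later.

```stata
* Cut points and the coefficient on choice, copied from the oprobit output.
local cut1     -3.087373
local cut2     -2.053553
local b_choice .371171

* Hypothetical linear index for a person with choice = 0 (an assumption).
local xb0 = -2.5
* Same person with choice = 1: the index shifts by the choice coefficient.
local xb1 = `xb0' + `b_choice'

foreach xb in `xb0' `xb1' {
    display "index = " `xb'
    display "  P(pctstck=0)   = " normal(`cut1' - (`xb'))
    display "  P(pctstck=50)  = " normal(`cut2' - (`xb')) - normal(`cut1' - (`xb'))
    display "  P(pctstck=100) = " 1 - normal(`cut2' - (`xb'))
}
```

Because the coefficient on choice is positive, the second set of probabilities shifts weight away from pctstck = 0 and toward pctstck = 100. The size of the shift depends on the index at which it is evaluated.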