The data files used for the examples in this text can be downloaded as a zip file from the Stata website. You can then extract the data files with any unzip utility.
Example 16.3 on page 527, using mroz.dta. In this example, we also show how to compute the R-squared for the tobit model, following the description on page 527.
use mroz, clear
reg hours nwifeinc educ exper expersq age kidslt6 kidsge6

      Source |       SS       df       MS              Number of obs =     753
-------------+------------------------------           F(  7,   745) =   38.50
       Model |   151647606     7  21663943.7           Prob > F      =  0.0000
    Residual |   419262118   745  562767.944           R-squared     =  0.2656
-------------+------------------------------           Adj R-squared =  0.2587
       Total |   570909724   752  759188.463           Root MSE      =  750.18

------------------------------------------------------------------------------
       hours |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -3.446636      2.544    -1.35   0.176    -8.440898    1.547626
        educ |   28.76112   12.95459     2.22   0.027     3.329284    54.19297
       exper |   65.67251   9.962983     6.59   0.000     46.11365    85.23138
     expersq |  -.7004939   .3245501    -2.16   0.031    -1.337635   -.0633524
         age |  -30.51163   4.363868    -6.99   0.000    -39.07858   -21.94469
     kidslt6 |  -442.0899    58.8466    -7.51   0.000    -557.6148    -326.565
     kidsge6 |  -32.77923   23.17622    -1.41   0.158     -78.2777    12.71924
       _cons |   1330.482   270.7846     4.91   0.000     798.8906    1862.074
------------------------------------------------------------------------------

tobit hours nwifeinc educ exper expersq age kidslt6 kidsge6, ll(0)

Tobit estimates                                   Number of obs   =        753
                                                  LR chi2(7)      =     271.59
                                                  Prob > chi2     =     0.0000
Log likelihood = -3819.0946                       Pseudo R2       =     0.0343

------------------------------------------------------------------------------
       hours |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -8.814243   4.459096    -1.98   0.048    -17.56811   -.0603726
        educ |   80.64561   21.58322     3.74   0.000     38.27453    123.0167
       exper |   131.5643   17.27938     7.61   0.000     97.64231    165.4863
     expersq |  -1.864158   .5376615    -3.47   0.001    -2.919667   -.8086479
         age |  -54.40501   7.418496    -7.33   0.000    -68.96862    -39.8414
     kidslt6 |  -894.0217   111.8779    -7.99   0.000    -1113.655   -674.3887
     kidsge6 |    -16.218   38.64136    -0.42   0.675    -92.07674    59.64075
       _cons |   965.3053   446.4358     2.16   0.031      88.8853    1841.725
-------------+----------------------------------------------------------------
         _se |   1122.022   41.57903           (Ancillary parameter)
------------------------------------------------------------------------------

  Obs. summary:        325  left-censored observations at hours<=0
                       428     uncensored observations

matrix b = e(b)
local se = el(b,1, 9)
di `se'
1122.0217

predict xb, xb
gen yhat = norm(xb/`se')*xb + `se'*normden(xb/`se')
reg hours yhat

      Source |       SS       df       MS              Number of obs =     753
-------------+------------------------------           F(  1,   751) =  283.78
       Model |   156568646     1   156568646           Prob > F      =  0.0000
    Residual |   414341078   751  551719.144           R-squared     =  0.2742
-------------+------------------------------           Adj R-squared =  0.2733
       Total |   570909724   752  759188.463           Root MSE      =  742.78

------------------------------------------------------------------------------
       hours |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        yhat |   .9634449   .0571918    16.85   0.000     .8511702     1.07572
       _cons |   45.52784   49.34596     0.92   0.356    -51.34458    142.4003
------------------------------------------------------------------------------
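
The gen yhat command above evaluates the tobit estimate of the conditional mean of hours given the regressors. As a sketch in notation that does not appear elsewhere on this page (x_i β̂ stands for the linear prediction xb, σ̂ for the ancillary parameter _se, and Φ and φ for the standard normal cdf and density, which is what norm() and normden() compute), the calculation can be written as:

```latex
\hat{y}_i
  = \Phi\!\left(\frac{x_i\hat{\beta}}{\hat{\sigma}}\right) x_i\hat{\beta}
  + \hat{\sigma}\,\phi\!\left(\frac{x_i\hat{\beta}}{\hat{\sigma}}\right),
\qquad
R^2 = \left[\widehat{\operatorname{corr}}\!\left(\mathit{hours}_i,\ \hat{y}_i\right)\right]^2 .
```

Because a regression on a single regressor has an R-squared equal to the squared correlation between the two variables, reg hours yhat reports this quantity directly: 0.2742, compared with 0.2656 from the OLS regression.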
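Now we compare the current model with a model that adds two variables, unem and city. We refit the original tobit model first, because the last estimation command run was reg and fitstat works from the most recent estimation results.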
tobit hours nwifeinc educ exper expersq age kidslt6 kidsge6, ll(0)

Tobit estimates                                   Number of obs   =        753
                                                  LR chi2(7)      =     271.59
                                                  Prob > chi2     =     0.0000
Log likelihood = -3819.0946                       Pseudo R2       =     0.0343

------------------------------------------------------------------------------
       hours |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -8.814243   4.459096    -1.98   0.048    -17.56811   -.0603726
        educ |   80.64561   21.58322     3.74   0.000     38.27453    123.0167
       exper |   131.5643   17.27938     7.61   0.000     97.64231    165.4863
     expersq |  -1.864158   .5376615    -3.47   0.001    -2.919667   -.8086479
         age |  -54.40501   7.418496    -7.33   0.000    -68.96862    -39.8414
     kidslt6 |  -894.0217   111.8779    -7.99   0.000    -1113.655   -674.3887
     kidsge6 |    -16.218   38.64136    -0.42   0.675    -92.07674    59.64075
       _cons |   965.3053   446.4358     2.16   0.031      88.8853    1841.725
-------------+----------------------------------------------------------------
         _se |   1122.022   41.57903           (Ancillary parameter)
------------------------------------------------------------------------------

  Obs. summary:        325  left-censored observations at hours<=0
                       428     uncensored observations
The fitstat program must be installed before it can be used. You can download it from within Stata by typing search fitstat on the command line and following the installation links (see "How can I use the search command to search for programs and get additional help?" for more information about using search).
fitstat, saving(m0)

Measures of Fit for tobit of hours

Log-Lik Intercept Only:    -3954.892     Log-Lik Full Model:         -3819.095
D(744):                     7638.189     LR(7):                        271.594
                                         Prob > LR:                      0.000
McFadden's R2:                 0.034     McFadden's Adj R2:              0.032
Maximum Likelihood R2:         0.303     Cragg & Uhler's R2:             0.303
McKelvey and Zavoina's R2:     0.357
Variance of y*:          1956786.696     Variance of error:        1258932.624
AIC:                          10.168     AIC*n:                       7656.189
BIC:                        2709.885     BIC':                        -225.226

(Indices saved in matrix fs_m0)

tobit hours nwifeinc educ exper expersq age kidslt6 kidsge6 unem city, ll(0)

Tobit estimates                                   Number of obs   =        753
                                                  LR chi2(9)      =     274.01
                                                  Prob > chi2     =     0.0000
Log likelihood = -3817.8867                       Pseudo R2       =     0.0346

------------------------------------------------------------------------------
       hours |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -8.700003   4.544473    -1.91   0.056    -17.62152    .2215131
        educ |   83.11418   21.68775     3.83   0.000     40.53771    125.6907
       exper |   133.4305   17.31206     7.71   0.000     99.44423    167.4168
     expersq |  -1.934666   .5393865    -3.59   0.000    -2.993567   -.8757655
         age |  -53.13497   7.468394    -7.11   0.000     -67.7966   -38.47333
     kidslt6 |  -887.5036   111.6144    -7.95   0.000     -1106.62   -668.3869
     kidsge6 |  -13.28932   38.60969    -0.34   0.731    -89.08622    62.50759
        unem |   -23.3407   15.06687    -1.55   0.122    -52.91933    6.237931
        city |   11.36026   99.61766     0.11   0.909    -184.2049    206.9254
       _cons |   1060.127   449.8033     2.36   0.019     177.0922    1943.162
-------------+----------------------------------------------------------------
         _se |   1119.474   41.47684           (Ancillary parameter)
------------------------------------------------------------------------------

  Obs. summary:        325  left-censored observations at hours<=0
                       428     uncensored observations

fitstat, using(m0)

Measures of Fit for tobit of hours

                             Current            Saved       Difference
Model:                         tobit            tobit
N:                               753              753                0
Log-Lik Intercept Only:    -3954.892        -3954.892            0.000
Log-Lik Full Model:        -3817.887        -3819.095            1.208
D:                     7635.773(742)    7638.189(744)         2.416(2)
LR:                       274.010(9)       271.594(7)         2.416(2)
Prob > LR:                     0.000            0.000            0.299
McFadden's R2:                 0.035            0.034            0.000
McFadden's Adj R2:             0.032            0.032           -0.000
Maximum Likelihood R2:         0.305            0.303            0.002
Cragg & Uhler's R2:            0.305            0.303            0.002
McKelvey and Zavoina's R2:     0.359            0.357            0.003
Variance of y*:          1955865.260      1956786.696         -921.436
Variance of error:       1253221.510      1258932.624        -5711.114
AIC:                          10.170           10.168            0.002
AIC*n:                      7657.773         7656.189            1.584
BIC:                        2720.717         2709.885           10.833
BIC':                       -214.393         -225.226           10.833

Difference of 10.833 in BIC' provides very strong support for saved model.

Note: p-value for difference in LR is only valid if models are nested.
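
The LR difference and its p-value in the fitstat comparison can be checked by hand from the two tobit log likelihoods. The following is a worked sketch; the symbols ℓ_U and ℓ_R (log likelihoods of the model with and without unem and city) are notation introduced here, and the closed form for the p-value relies on the fact that a chi-squared variable with 2 degrees of freedom has survival function e^{-x/2}:

```latex
LR = 2\,(\ell_U - \ell_R)
   = 2\,\bigl[-3817.8867 - (-3819.0946)\bigr]
   = 2.4158,
\qquad
p = \Pr\!\left(\chi^2_2 > 2.4158\right)
  = e^{-2.4158/2}
  \approx 0.299 .
```

With a p-value of about 0.30, unem and city are not jointly significant at conventional levels, which agrees with the BIC' comparison favoring the saved (smaller) model.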