The data files used for the examples in this text can be downloaded as a zip archive from the Stata website. You can then extract the data files with any unzip utility.
Example 16.3 on page 527, using mroz.dta. In this example we also show how to compute an R-squared for the tobit model, following the description on page 527.
use mroz, clear
reg hours nwifeinc educ exper expersq age kidslt6 kidsge6
Source | SS df MS Number of obs = 753
-------------+------------------------------ F( 7, 745) = 38.50
Model | 151647606 7 21663943.7 Prob > F = 0.0000
Residual | 419262118 745 562767.944 R-squared = 0.2656
-------------+------------------------------ Adj R-squared = 0.2587
Total | 570909724 752 759188.463 Root MSE = 750.18
------------------------------------------------------------------------------
hours | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
nwifeinc | -3.446636 2.544 -1.35 0.176 -8.440898 1.547626
educ | 28.76112 12.95459 2.22 0.027 3.329284 54.19297
exper | 65.67251 9.962983 6.59 0.000 46.11365 85.23138
expersq | -.7004939 .3245501 -2.16 0.031 -1.337635 -.0633524
age | -30.51163 4.363868 -6.99 0.000 -39.07858 -21.94469
kidslt6 | -442.0899 58.8466 -7.51 0.000 -557.6148 -326.565
kidsge6 | -32.77923 23.17622 -1.41 0.158 -78.2777 12.71924
_cons | 1330.482 270.7846 4.91 0.000 798.8906 1862.074
------------------------------------------------------------------------------
tobit hours nwifeinc educ exper expersq age kidslt6 kidsge6, ll(0)
Tobit estimates Number of obs = 753
LR chi2(7) = 271.59
Prob > chi2 = 0.0000
Log likelihood = -3819.0946 Pseudo R2 = 0.0343
------------------------------------------------------------------------------
hours | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
nwifeinc | -8.814243 4.459096 -1.98 0.048 -17.56811 -.0603726
educ | 80.64561 21.58322 3.74 0.000 38.27453 123.0167
exper | 131.5643 17.27938 7.61 0.000 97.64231 165.4863
expersq | -1.864158 .5376615 -3.47 0.001 -2.919667 -.8086479
age | -54.40501 7.418496 -7.33 0.000 -68.96862 -39.8414
kidslt6 | -894.0217 111.8779 -7.99 0.000 -1113.655 -674.3887
kidsge6 | -16.218 38.64136 -0.42 0.675 -92.07674 59.64075
_cons | 965.3053 446.4358 2.16 0.031 88.8853 1841.725
-------------+----------------------------------------------------------------
_se | 1122.022 41.57903 (Ancillary parameter)
------------------------------------------------------------------------------
Obs. summary: 325 left-censored observations at hours<=0
428 uncensored observations
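* e(b) is the 1 x 9 row vector of tobit estimates; its last element is _se (sigma)
matrix b = e(b)
local se = el(b,1,9)
di `se'
1122.0217
* yhat = E(hours|x) = Phi(xb/se)*xb + se*phi(xb/se); norm() and normden() are the older names of normal() and normalden()
predict xb, xb
gen yhat = norm(xb/`se')*xb + `se'*normden(xb/`se')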
reg hours yhat
Source | SS df MS Number of obs = 753
-------------+------------------------------ F( 1, 751) = 283.78
Model | 156568646 1 156568646 Prob > F = 0.0000
Residual | 414341078 751 551719.144 R-squared = 0.2742
-------------+------------------------------ Adj R-squared = 0.2733
Total | 570909724 752 759188.463 Root MSE = 742.78
------------------------------------------------------------------------------
hours | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
yhat | .9634449 .0571918 16.85 0.000 .8511702 1.07572
_cons | 45.52784 49.34596 0.92 0.356 -51.34458 142.4003
------------------------------------------------------------------------------
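The R-squared of 0.2742 reported above is the squared correlation between hours and the tobit prediction of E(hours|x). As a cross-check, the same fitted values can likely be obtained without extracting sigma by hand, using the ystar() option of predict after tobit. The sketch below assumes mroz.dta is still in memory and that your Stata release supports ystar(); the variable name yhat2 is ours and does not appear in the text.

```stata
* Refit the tobit so that its results are the active estimates
* (the last estimation command above was reg hours yhat)
tobit hours nwifeinc educ exper expersq age kidslt6 kidsge6, ll(0)

* Expected value of hours when hours is censored below at 0
* yhat2 is a hypothetical variable name chosen for this sketch
predict yhat2, ystar(0,.)

* The R-squared described in the text is the squared correlation
correlate hours yhat2
display r(rho)^2
```

If the two constructions agree, the displayed value should match the R-squared from the regression of hours on yhat.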
Now we compare the current model with a model that adds two variables, unem and city.
tobit hours nwifeinc educ exper expersq age kidslt6 kidsge6, ll(0)
Tobit estimates Number of obs = 753
LR chi2(7) = 271.59
Prob > chi2 = 0.0000
Log likelihood = -3819.0946 Pseudo R2 = 0.0343
------------------------------------------------------------------------------
hours | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
nwifeinc | -8.814243 4.459096 -1.98 0.048 -17.56811 -.0603726
educ | 80.64561 21.58322 3.74 0.000 38.27453 123.0167
exper | 131.5643 17.27938 7.61 0.000 97.64231 165.4863
expersq | -1.864158 .5376615 -3.47 0.001 -2.919667 -.8086479
age | -54.40501 7.418496 -7.33 0.000 -68.96862 -39.8414
kidslt6 | -894.0217 111.8779 -7.99 0.000 -1113.655 -674.3887
kidsge6 | -16.218 38.64136 -0.42 0.675 -92.07674 59.64075
_cons | 965.3053 446.4358 2.16 0.031 88.8853 1841.725
-------------+----------------------------------------------------------------
_se | 1122.022 41.57903 (Ancillary parameter)
------------------------------------------------------------------------------
Obs. summary: 325 left-censored observations at hours<=0
428 uncensored observations
The fitstat program must be installed before it can be used. You can install it from within Stata by typing search fitstat on the command line and following the installation links (see "How can I use the search command to search for programs and get additional help?" for more information about using search).
fitstat, saving(m0)
Measures of Fit for tobit of hours
Log-Lik Intercept Only: -3954.892 Log-Lik Full Model: -3819.095
D(744): 7638.189 LR(7): 271.594
Prob > LR: 0.000
McFadden's R2: 0.034 McFadden's Adj R2: 0.032
Maximum Likelihood R2: 0.303 Cragg & Uhler's R2: 0.303
McKelvey and Zavoina's R2: 0.357
Variance of y*: 1956786.696 Variance of error: 1258932.624
AIC: 10.168 AIC*n: 7656.189
BIC: 2709.885 BIC': -225.226
(Indices saved in matrix fs_m0)
tobit hours nwifeinc educ exper expersq age kidslt6 kidsge6 unem city, ll(0)
Tobit estimates Number of obs = 753
LR chi2(9) = 274.01
Prob > chi2 = 0.0000
Log likelihood = -3817.8867 Pseudo R2 = 0.0346
------------------------------------------------------------------------------
hours | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
nwifeinc | -8.700003 4.544473 -1.91 0.056 -17.62152 .2215131
educ | 83.11418 21.68775 3.83 0.000 40.53771 125.6907
exper | 133.4305 17.31206 7.71 0.000 99.44423 167.4168
expersq | -1.934666 .5393865 -3.59 0.000 -2.993567 -.8757655
age | -53.13497 7.468394 -7.11 0.000 -67.7966 -38.47333
kidslt6 | -887.5036 111.6144 -7.95 0.000 -1106.62 -668.3869
kidsge6 | -13.28932 38.60969 -0.34 0.731 -89.08622 62.50759
unem | -23.3407 15.06687 -1.55 0.122 -52.91933 6.237931
city | 11.36026 99.61766 0.11 0.909 -184.2049 206.9254
_cons | 1060.127 449.8033 2.36 0.019 177.0922 1943.162
-------------+----------------------------------------------------------------
_se | 1119.474 41.47684 (Ancillary parameter)
------------------------------------------------------------------------------
Obs. summary: 325 left-censored observations at hours<=0
428 uncensored observations
fitstat, using(m0)
Measures of Fit for tobit of hours
Current Saved Difference
Model: tobit tobit
N: 753 753 0
Log-Lik Intercept Only: -3954.892 -3954.892 0.000
Log-Lik Full Model: -3817.887 -3819.095 1.208
D: 7635.773(742) 7638.189(744) 2.416(2)
LR: 274.010(9) 271.594(7) 2.416(2)
Prob > LR: 0.000 0.000 0.299
McFadden's R2: 0.035 0.034 0.000
McFadden's Adj R2: 0.032 0.032 -0.000
Maximum Likelihood R2: 0.305 0.303 0.002
Cragg & Uhler's R2: 0.305 0.303 0.002
McKelvey and Zavoina's R2: 0.359 0.357 0.003
Variance of y*: 1955865.260 1956786.696 -921.436
Variance of error: 1253221.510 1258932.624 -5711.114
AIC: 10.170 10.168 0.002
AIC*n: 7657.773 7656.189 1.584
BIC: 2720.717 2709.885 10.833
BIC': -214.393 -225.226 10.833
Difference of 10.833 in BIC' provides very strong support for saved model.
Note: p-value for difference in LR is only valid if models are nested.
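The LR difference of 2.416 and its p-value in the comparison above can be reproduced by hand from the two log likelihoods printed in the tobit output. A minimal sketch, using only values copied from that output, with 2 degrees of freedom because the larger model adds two variables:

```stata
* LR = 2 * (logL of larger model - logL of smaller model)
* both log likelihoods are copied from the tobit output above
display 2*(-3817.8867 - (-3819.0946))

* Upper-tail chi-squared probability with 2 degrees of freedom
display chi2tail(2, 2*(-3817.8867 - (-3819.0946)))
```

The second value should agree, up to rounding, with the 0.299 that fitstat reports for Prob > LR. This indicates that unem and city are not jointly significant at conventional levels, in line with the BIC' comparison favoring the saved (smaller) model.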
