Table 3.2, page 51.
use chdage.dta, clear
(Hosmer and Lemeshow - from chapter 1)

gen aged=0
replace aged=1 if age >= 55
(27 real changes made)

sort aged
by aged: tabulate chd

_______________________________________________________________________________
-> aged = 0

        chd |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |         51       69.86       69.86
          1 |         22       30.14      100.00
------------+-----------------------------------
      Total |         73      100.00

_______________________________________________________________________________
-> aged = 1

        chd |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |          6       22.22       22.22
          1 |         21       77.78      100.00
------------+-----------------------------------
      Total |         27      100.00
Table 3.3, page 52.
logit chd aged

Iteration 0:   log likelihood = -68.331491
Iteration 1:   log likelihood = -59.020453
Iteration 2:   log likelihood = -58.979594
Iteration 3:   log likelihood = -58.979565

Logit estimates                                   Number of obs   =        100
                                                  LR chi2(1)      =      18.70
                                                  Prob > chi2     =     0.0000
Log likelihood = -58.979565                       Pseudo R2       =     0.1369

------------------------------------------------------------------------------
         chd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        aged |   2.093546   .5285335     3.96   0.000     1.057639    3.129453
       _cons |  -.8407832   .2550733    -3.30   0.001    -1.340718   -.3408487
------------------------------------------------------------------------------
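As a quick hand check, the aged coefficient should equal the log of the cross-product ratio of the 2 x 2 table in Table 3.2, and the constant should equal the log odds of chd when aged = 0. The sketch below uses only the cell counts printed above; the display commands are plain arithmetic and do not need the data in memory.

```stata
* Cross-product odds ratio from the aged-by-chd table (counts copied from Table 3.2)
display (21*51)/(22*6)
* Its natural log; compare with the aged coefficient, 2.093546
display ln((21*51)/(22*6))
* Log odds of chd in the aged = 0 group; compare with _cons, -.8407832
display ln(22/51)
```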
Table 3.5, page 56.
clear
input race chd cnt
1 1 5
2 1 20
3 1 15
4 1 10
1 0 20
2 0 10
3 0 10
4 0 10
end

expand cnt
(92 observations created)

tab chd race

           |                    race
       chd |         1          2          3          4 |     Total
-----------+--------------------------------------------+----------
         0 |        20         10         10         10 |        50
         1 |         5         20         15         10 |        50
-----------+--------------------------------------------+----------
     Total |        25         30         25         20 |       100

tabulate race, gen(race_)

       race |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         25       25.00       25.00
          2 |         30       30.00       55.00
          3 |         25       25.00       80.00
          4 |         20       20.00      100.00
------------+-----------------------------------
      Total |        100      100.00

logistic chd race_2 race_3 race_4

Logit estimates                                   Number of obs   =        100
                                                  LR chi2(3)      =      14.04
                                                  Prob > chi2     =     0.0028
Log likelihood = -62.293721                       Pseudo R2       =     0.1013

------------------------------------------------------------------------------
         chd | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      race_2 |          8   5.059619     3.29   0.001     2.316037    27.63341
      race_3 |          6   3.872965     2.78   0.006      1.69319    21.26165
      race_4 |          4    2.68327     2.07   0.039     1.074136     14.8957
------------------------------------------------------------------------------

logit

Logit estimates                                   Number of obs   =        100
                                                  LR chi2(3)      =      14.04
                                                  Prob > chi2     =     0.0028
Log likelihood = -62.293721                       Pseudo R2       =     0.1013

------------------------------------------------------------------------------
         chd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      race_2 |   2.079442   .6324524     3.29   0.001     .8398576    3.319026
      race_3 |   1.791759   .6454942     2.78   0.006     .5266141    3.056905
      race_4 |   1.386294   .6708175     2.07   0.039     .0715163    2.701072
       _cons |  -1.386294   .4999961    -2.77   0.006    -2.366269   -.4063201
------------------------------------------------------------------------------
NOTE: The logistic command reports the odds ratios and their 95% confidence intervals. Typing logit with no arguments redisplays the same model as coefficients; these are the ln(OR) values given in the bottom row of the table.
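The link between the two displays can be checked by hand. The sketch below exponentiates the logit coefficients and, separately, rebuilds the same odds ratios from the cell counts of the chd by race table, taking race = 1 as the reference group. It is arithmetic only and uses no data.

```stata
* exp() of each logit coefficient; compare with the logistic odds ratios 8, 6 and 4
display exp(2.079442)
display exp(1.791759)
display exp(1.386294)
* The same ratios from the cell counts, each race group against race = 1
display (20*20)/(5*10)
display (15*20)/(5*10)
display (10*20)/(5*10)
```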
Table 3.7, page 58.
logit chd race_2 race_3 race_4

Iteration 0:   log likelihood = -69.314718
Iteration 1:   log likelihood = -62.368156
Iteration 2:   log likelihood = -62.293897
Iteration 3:   log likelihood = -62.293721

Logit estimates                                   Number of obs   =        100
                                                  LR chi2(3)      =      14.04
                                                  Prob > chi2     =     0.0028
Log likelihood = -62.293721                       Pseudo R2       =     0.1013

------------------------------------------------------------------------------
         chd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      race_2 |   2.079442   .6324524     3.29   0.001     .8398576    3.319026
      race_3 |   1.791759   .6454942     2.78   0.006     .5266141    3.056905
      race_4 |   1.386294   .6708175     2.07   0.039     .0715163    2.701072
       _cons |  -1.386294   .4999961    -2.77   0.006    -2.366269   -.4063201
------------------------------------------------------------------------------
Table 3.8, page 59.
NOTE: We first drop the indicator variables race_1 through race_4 created above. We then copy race into new variables named race_2, race_3 and race_4 and recode them to create the design variables for this table.
drop race_1 - race_4
gen race_2=race
gen race_3=race
gen race_4=race

recode race_2 1=-1 2=1 3/4=0
(100 changes made)
recode race_3 1=-1 2=0 3=1 4=0
(100 changes made)
recode race_4 1=-1 2=0 3=0 4=1
(100 changes made)

list race race_2 race_3 race_4 in 1/4

          race     race_2     race_3     race_4
  1.         1         -1         -1         -1
  2.         2          1          0          0
  3.         3          0          1          0
  4.         4          0          0          1
Table 3.9, page 60.
logit chd race_2 race_3 race_4

Iteration 0:   log likelihood = -69.314718
Iteration 1:   log likelihood = -62.368156
Iteration 2:   log likelihood = -62.293897
Iteration 3:   log likelihood = -62.293721

Logit estimates                                   Number of obs   =        100
                                                  LR chi2(3)      =      14.04
                                                  Prob > chi2     =     0.0028
Log likelihood = -62.293721                       Pseudo R2       =     0.1013

------------------------------------------------------------------------------
         chd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      race_2 |   .7650677   .3505944     2.18   0.029     .0779153     1.45222
      race_3 |   .4773856   .3622841     1.32   0.188    -.2326781    1.187449
      race_4 |   .0719205    .384599     0.19   0.852    -.6818797    .8257208
       _cons |  -.0719205   .2188982    -0.33   0.742    -.5009531    .3571121
------------------------------------------------------------------------------
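With the -1/0/1 coding used here, the constant appears to be the unweighted mean of the four group logits, and each coefficient is the distance of that group's logit from the mean. The sketch below checks this reading against the output using only the cell counts from the chd by race table. The scalar names g1 to g4 and gbar are made up for the illustration.

```stata
* Log odds of chd within each race group (counts from the chd by race table)
scalar g1 = ln(5/20)
scalar g2 = ln(20/10)
scalar g3 = ln(15/10)
scalar g4 = ln(10/10)
* Unweighted mean of the four group logits; compare with _cons, -.0719205
scalar gbar = (g1 + g2 + g3 + g4)/4
display gbar
* Deviations from that mean; compare with race_2, race_3 and race_4
display g2 - gbar
display g3 - gbar
display g4 - gbar
```

The log likelihood and LR chi2 are unchanged from Table 3.7 because only the parameterization of race differs, not the fitted model.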
Tables 3.10 – 3.14, pages 67 – 73. These data are hypothetical and are not available.
Table 3.14, page 77.
NOTE: The models are fit from the largest to the smallest so that each likelihood-ratio test compares a model with the next smaller model nested in it. This is the reverse of the order of presentation in the table in the book.
use lowbwt.dta, clear
(Hosmer and Lemeshow - from appendix 1)

gen lwd=(lwt<110)
gen lwdage=lwd*age

logit low lwd age lwdage

Iteration 0:   log likelihood =   -117.336
Iteration 1:   log likelihood = -110.71804
Iteration 2:   log likelihood = -110.57024
Iteration 3:   log likelihood = -110.56997

Logit estimates                                   Number of obs   =        189
                                                  LR chi2(3)      =      13.53
                                                  Prob > chi2     =     0.0036
Log likelihood = -110.56997                       Pseudo R2       =     0.0577

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lwd |  -1.944089   1.724804    -1.13   0.260    -5.324643    1.436465
         age |  -.0795722   .0396343    -2.01   0.045     -.157254   -.0018904
      lwdage |   .1321967   .0756982     1.75   0.081    -.0161691    .2805626
       _cons |   .7744952   .9100949     0.85   0.395    -1.009258    2.558248
------------------------------------------------------------------------------

lrtest, saving(3)

logit low lwd age

Iteration 0:   log likelihood =   -117.336
Iteration 1:   log likelihood = -112.19831
Iteration 2:   log likelihood = -112.14339
Iteration 3:   log likelihood = -112.14338

Logit estimates                                   Number of obs   =        189
                                                  LR chi2(2)      =      10.39
                                                  Prob > chi2     =     0.0056
Log likelihood = -112.14338                       Pseudo R2       =     0.0443

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lwd |   1.010122   .3642627     2.77   0.006     .2961806    1.724064
         age |   -.044232   .0322248    -1.37   0.170    -.1073913    .0189274
       _cons |   -.026891   .7621481    -0.04   0.972    -1.520674    1.466892
------------------------------------------------------------------------------

lrtest, using(3)

Logit:  likelihood-ratio test                         chi2(1)     =       3.15
                                                      Prob > chi2 =     0.0761

lrtest, saving(2)

logit low lwd

Iteration 0:   log likelihood =   -117.336
Iteration 1:   log likelihood =   -113.161
Iteration 2:   log likelihood = -113.12058
Iteration 3:   log likelihood = -113.12058

Logit estimates                                   Number of obs   =        189
                                                  LR chi2(1)      =       8.43
                                                  Prob > chi2     =     0.0037
Log likelihood = -113.12058                       Pseudo R2       =     0.0359

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lwd |   1.053762   .3615635     2.91   0.004     .3451102    1.762413
       _cons |  -1.053762   .1883882    -5.59   0.000    -1.422996   -.6845277
------------------------------------------------------------------------------

lrtest, using(2)

Logit:  likelihood-ratio test                         chi2(1)     =       1.95
                                                      Prob > chi2 =     0.1621

lrtest, saving(1)

logit low

Iteration 0:   log likelihood =   -117.336

Logit estimates                                   Number of obs   =        189
                                                  LR chi2(0)      =       0.00
                                                  Prob > chi2     =          .
Log likelihood =   -117.336                       Pseudo R2       =     0.0000

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   -.789997    .156976    -5.03   0.000    -1.097664   -.4823297
------------------------------------------------------------------------------

lrtest, using(1)

Logit:  likelihood-ratio test                         chi2(1)     =       8.43
                                                      Prob > chi2 =     0.0037
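Each likelihood-ratio statistic above is twice the difference between the log likelihoods of the larger and the smaller model. As a hand check, the sketch below recomputes the three statistics from the log likelihoods printed in the output. It is arithmetic only; chi2tail() is Stata's upper-tail chi-square probability function.

```stata
* G = -2*(lnL of smaller model - lnL of larger model), values copied from the output
* lwd age versus lwd age lwdage; compare with chi2(1) = 3.15
display -2*(-112.14338 - (-110.56997))
* lwd versus lwd age; compare with chi2(1) = 1.95
display -2*(-113.12058 - (-112.14338))
* constant only versus lwd; compare with chi2(1) = 8.43
display -2*(-117.336 - (-113.12058))
* p-value of the first test on 1 degree of freedom; compare with 0.0761
display chi2tail(1, 3.14682)
```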
Figure 3.3, page 78.
logit low age lwd lwdage

Iteration 0:   log likelihood =   -117.336
Iteration 1:   log likelihood = -110.71804
Iteration 2:   log likelihood = -110.57024
Iteration 3:   log likelihood = -110.56997

Logit estimates                                   Number of obs   =        189
                                                  LR chi2(3)      =      13.53
                                                  Prob > chi2     =     0.0036
Log likelihood = -110.56997                       Pseudo R2       =     0.0577

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0795722   .0396343    -2.01   0.045     -.157254   -.0018904
         lwd |  -1.944089   1.724804    -1.13   0.260    -5.324643    1.436465
      lwdage |   .1321967   .0756982     1.75   0.081    -.0161691    .2805626
       _cons |   .7744952   .9100949     0.85   0.395    -1.009258    2.558248
------------------------------------------------------------------------------

predict el, xb
graph twoway scatter el age, xlabel(10(5)45) ylabel(-3 .6)
Table 3.15, page 78.
estat vce

Covariance matrix of coefficients of logit model

        e(V) |        age         lwd      lwdage       _cons
-------------+------------------------------------------------
         age |  .00157088
         lwd |  .03526621    2.974949
      lwdage | -.00157088  -.12760349   .00573022
       _cons | -.03526621  -.82827277   .03526621   .82827277
Table 3.16, page 79.
logit low age lwd lwdage

Iteration 0:   log likelihood =   -117.336
Iteration 1:   log likelihood = -110.71804
Iteration 2:   log likelihood = -110.57024
Iteration 3:   log likelihood = -110.56997

Logit estimates                                   Number of obs   =        189
                                                  LR chi2(3)      =      13.53
                                                  Prob > chi2     =     0.0036
Log likelihood = -110.56997                       Pseudo R2       =     0.0577

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0795722   .0396343    -2.01   0.045     -.157254   -.0018904
         lwd |  -1.944089   1.724804    -1.13   0.260    -5.324643    1.436465
      lwdage |   .1321967   .0756982     1.75   0.081    -.0161691    .2805626
       _cons |   .7744952   .9100949     0.85   0.395    -1.009258    2.558248
------------------------------------------------------------------------------

lincom lwd + 15*lwdage , or

 ( 1)  lwd + 15.0 lwdage = 0.0

------------------------------------------------------------------------------
         low | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   1.039627   .6865828     0.06   0.953      .284927     3.79334
------------------------------------------------------------------------------

lincom lwd + 20*lwdage , or

 ( 1)  lwd + 20.0 lwdage = 0.0

------------------------------------------------------------------------------
         low | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   2.013443     .81264     1.73   0.083     .9128263    4.441098
------------------------------------------------------------------------------

lincom lwd + 25*lwdage , or

 ( 1)  lwd + 25.0 lwdage = 0.0

------------------------------------------------------------------------------
         low | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   3.899427   1.636664     3.24   0.001     1.712913    8.877003
------------------------------------------------------------------------------

lincom lwd + 30*lwdage , or

 ( 1)  lwd + 30.0 lwdage = 0.0

------------------------------------------------------------------------------
         low | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   7.552007   5.210013     2.93   0.003     1.953582    29.19397
------------------------------------------------------------------------------
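The lincom results can be reproduced by hand from the coefficients and the covariance matrix in Table 3.15: the log odds ratio for lwd at age a is lwd + a*lwdage, and its variance is Var(lwd) + a^2*Var(lwdage) + 2*a*Cov(lwd, lwdage). The sketch below does this for a = 15. The scalar names are invented for the illustration, and 1.96 is used as the normal quantile, so the limits should agree with lincom only to about three decimals.

```stata
* Coefficients and (co)variances copied from the logit and estat vce output
scalar b_lwd    = -1.944089
scalar b_lwdage =  .1321967
scalar v_lwd    = 2.974949
scalar v_lwdage = .00573022
scalar c_lwd_lwdage = -.12760349
scalar a = 15
* Log odds ratio for lwd at age a, and its standard error
scalar lnor    = b_lwd + a*b_lwdage
scalar se_lnor = sqrt(v_lwd + a^2*v_lwdage + 2*a*c_lwd_lwdage)
* Odds ratio and 95% limits; compare with 1.039627, .284927 and 3.79334
display exp(lnor)
display exp(lnor - 1.96*se_lnor)
display exp(lnor + 1.96*se_lnor)
```

Changing the scalar a to 20, 25 or 30 should reproduce the remaining three rows in the same way.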
Table 3.17, page 80.
tab2 low smoke

-> tabulation of low by smoke

           |         smoke
   < 2500g |         0          1 |     Total
-----------+----------------------+----------
         0 |        86         44 |       130
         1 |        29         30 |        59
-----------+----------------------+----------
     Total |       115         74 |       189
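The crude odds ratio for smoke can be read off this table as a cross-product ratio. As a small check, its log should match the smoke coefficient in the logit low smoke model fit under Table 3.20 (.7040592). The commands are arithmetic only.

```stata
* Crude odds ratio of low birth weight for smokers versus non-smokers
display (30*86)/(44*29)
* Its log; compare with the smoke coefficient in logit low smoke, .7040592
display ln((30*86)/(44*29))
```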
Table 3.18, page 81.
sort race
by race: tabulate low smoke

_______________________________________________________________________________
-> race = white

           |         smoke
   < 2500g |         0          1 |     Total
-----------+----------------------+----------
         0 |        40         33 |        73
         1 |         4         19 |        23
-----------+----------------------+----------
     Total |        44         52 |        96

_______________________________________________________________________________
-> race = black

           |         smoke
   < 2500g |         0          1 |     Total
-----------+----------------------+----------
         0 |        11          4 |        15
         1 |         5          6 |        11
-----------+----------------------+----------
     Total |        16         10 |        26

_______________________________________________________________________________
-> race = other

           |         smoke
   < 2500g |         0          1 |     Total
-----------+----------------------+----------
         0 |        35          7 |        42
         1 |        20          5 |        25
-----------+----------------------+----------
     Total |        55         12 |        67
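The race-specific odds ratios for smoke follow directly from the three tables above. The sketch below computes each cross-product ratio; they should agree with the odds ratios that lincom reports for the interaction model in Table 3.19 (5.757576, 3.3 and 1.25).

```stata
* Odds ratio for smoke among white women
display (19*40)/(33*4)
* Odds ratio for smoke among black women
display (6*11)/(4*5)
* Odds ratio for smoke among women of other race
display (5*35)/(7*20)
```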
Table 3.19, page 82.
xi i.race
i.race            _Irace_1-3          (naturally coded; _Irace_1 omitted)

gen race2sm = _Irace_2*smoke
gen race3sm = _Irace_3*smoke

logit low smoke _Irace_2 _Irace_3 race2sm race3sm

Iteration 0:   log likelihood =   -117.336
Iteration 1:   log likelihood =  -108.9189
Iteration 2:   log likelihood = -108.42021
Iteration 3:   log likelihood =  -108.4089
Iteration 4:   log likelihood = -108.40889

Logit estimates                                   Number of obs   =        189
                                                  LR chi2(5)      =      17.85
                                                  Prob > chi2     =     0.0031
Log likelihood = -108.40889                       Pseudo R2       =     0.0761

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       smoke |   1.750517   .5982759     2.93   0.003     .5779173    2.923116
    _Irace_2 |   1.514128   .7522689     2.01   0.044     .0397077    2.988548
    _Irace_3 |   1.742969   .5946183     2.93   0.003     .5775389      2.9084
     race2sm |   -.556594   1.032235    -0.54   0.590    -2.579738     1.46655
     race3sm |  -1.527373   .8828152    -1.73   0.084    -3.257659     .202913
       _cons |  -2.302585   .5244039    -4.39   0.000    -3.330398   -1.274772
------------------------------------------------------------------------------

lincom smoke, or

 ( 1)  smoke = 0.0

------------------------------------------------------------------------------
         low | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   5.757576   3.444619     2.93   0.003     1.782322    18.59915
------------------------------------------------------------------------------

lincom smoke

 ( 1)  smoke = 0.0

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   1.750517   .5982759     2.93   0.003     .5779173    2.923116
------------------------------------------------------------------------------

lincom smoke+race2sm, or

 ( 1)  smoke + race2sm = 0.0

------------------------------------------------------------------------------
         low | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |        3.3   2.775878     1.42   0.156     .6346062    17.16025
------------------------------------------------------------------------------

lincom smoke+race2sm

 ( 1)  smoke + race2sm = 0.0

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   1.193922   .8411752     1.42   0.156    -.4547507    2.842596
------------------------------------------------------------------------------

lincom smoke+race3sm, or

 ( 1)  smoke + race3sm = 0.0

------------------------------------------------------------------------------
         low | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |       1.25   .8114691     0.34   0.731      .350212    4.461584
------------------------------------------------------------------------------

lincom smoke+race3sm

 ( 1)  smoke + race3sm = 0.0

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   .2231436   .6491753     0.34   0.731    -1.049217    1.495504
------------------------------------------------------------------------------
NOTE: The estimated variances of the log odds ratios and their inverses, the weights w, are not computed here because they are needed only for the hand computation. The value of chi-square-h (at the top of page 83) can be obtained using the test command, as shown below.
test race2sm race3sm

 ( 1)  race2sm = 0.0
 ( 2)  race3sm = 0.0

           chi2(  2) =    3.02
         Prob > chi2 =    0.2213
Table 3.20, page 84.
logit low smoke

Iteration 0:   log likelihood =   -117.336
Iteration 1:   log likelihood =  -114.9123
Iteration 2:   log likelihood =  -114.9023

Logit estimates                                   Number of obs   =        189
                                                  LR chi2(1)      =       4.87
                                                  Prob > chi2     =     0.0274
Log likelihood =  -114.9023                       Pseudo R2       =     0.0207

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       smoke |   .7040592   .3196386     2.20   0.028     .0775791    1.330539
       _cons |  -1.087051   .2147299    -5.06   0.000    -1.507914   -.6661886
------------------------------------------------------------------------------

lrtest, saving(1)

logit low smoke _Irace_2 _Irace_3

Iteration 0:   log likelihood =   -117.336
Iteration 1:   log likelihood = -110.10441
Iteration 2:   log likelihood = -109.98749
Iteration 3:   log likelihood = -109.98736

Logit estimates                                   Number of obs   =        189
                                                  LR chi2(3)      =      14.70
                                                  Prob > chi2     =     0.0021
Log likelihood = -109.98736                       Pseudo R2       =     0.0626

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       smoke |   1.116004   .3692258     3.02   0.003     .3923346    1.839673
    _Irace_2 |   1.084088   .4899845     2.21   0.027     .1237362     2.04444
    _Irace_3 |   1.108563   .4003054     2.77   0.006     .3239787    1.893147
       _cons |  -1.840539   .3528633    -5.22   0.000    -2.532138   -1.148939
------------------------------------------------------------------------------

lrtest, saving(2)

logit low smoke _Irace_2 _Irace_3 race2sm race3sm

Iteration 0:   log likelihood =   -117.336
Iteration 1:   log likelihood =  -108.9189
Iteration 2:   log likelihood = -108.42021
Iteration 3:   log likelihood =  -108.4089
Iteration 4:   log likelihood = -108.40889

Logit estimates                                   Number of obs   =        189
                                                  LR chi2(5)      =      17.85
                                                  Prob > chi2     =     0.0031
Log likelihood = -108.40889                       Pseudo R2       =     0.0761

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       smoke |   1.750517   .5982759     2.93   0.003     .5779173    2.923116
    _Irace_2 |   1.514128   .7522689     2.01   0.044     .0397077    2.988548
    _Irace_3 |   1.742969   .5946183     2.93   0.003     .5775389      2.9084
     race2sm |   -.556594   1.032235    -0.54   0.590    -2.579738     1.46655
     race3sm |  -1.527373   .8828152    -1.73   0.084    -3.257659     .202913
       _cons |  -2.302585   .5244039    -4.39   0.000    -3.330398   -1.274772
------------------------------------------------------------------------------

lrtest, saving(3)

lrtest, using(2) model(1)

Logit:  likelihood-ratio test                         chi2(2)     =       9.83
                                                      Prob > chi2 =     0.0073

lrtest, using(3) model(2)

Logit:  likelihood-ratio test                         chi2(2)     =       3.16
                                                      Prob > chi2 =     0.2063
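The lrtest, saving() and using() options shown above belong to an older syntax. In more recent Stata releases the same comparisons are usually written with estimates store and lrtest. The sketch below is one way to do that; it assumes lowbwt.dta is in memory and that _Irace_2, _Irace_3, race2sm and race3sm have already been created as above. The names m1, m2 and m3 are arbitrary labels chosen for the illustration.

```stata
* Fit the three nested models quietly and store each set of estimates
quietly logit low smoke
estimates store m1
quietly logit low smoke _Irace_2 _Irace_3
estimates store m2
quietly logit low smoke _Irace_2 _Irace_3 race2sm race3sm
estimates store m3
* Each test compares a larger model with the smaller model nested in it
lrtest m2 m1
lrtest m3 m2
```

If the setup matches, the two tests should correspond to the chi2(2) = 9.83 and chi2(2) = 3.16 results above.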
Figure 3.4, page 86.
logit low lwt _Irace_2 _Irace_3

Iteration 0:   log likelihood =   -117.336
Iteration 1:   log likelihood =  -111.7491
Iteration 2:   log likelihood = -111.62983
Iteration 3:   log likelihood = -111.62955

Logit estimates                                   Number of obs   =        189
                                                  LR chi2(3)      =      11.41
                                                  Prob > chi2     =     0.0097
Log likelihood = -111.62955                       Pseudo R2       =     0.0486

------------------------------------------------------------------------------
         low |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lwt |  -.0152231   .0064393    -2.36   0.018    -.0278439   -.0026023
    _Irace_2 |   1.081066   .4880512     2.22   0.027     .1245034    2.037629
    _Irace_3 |   .4806033   .3566733     1.35   0.178    -.2184636     1.17967
       _cons |   .8057535   .8451625     0.95   0.340    -.8507345    2.462241
------------------------------------------------------------------------------

predict p1, xb
predict sep1, stdp
gen ulp1 = p1+1.96*sep1
gen llp1 = p1-1.96*sep1

graph twoway (scatter p1 lwt if race == 1, sort connect(l)) ///
             (line ulp1 llp1 lwt if race ==1, sort pstyle(p3 p3)), ///
             xlabel(90(40)250) ylabel(-4.25 .1)
Figure 3.5, page 87.
gen odds1 = exp(ulp1)
gen ulprob1 = odds1/(1+odds1)
gen odds2 = exp(p1)
gen prob1 = odds2/(1+odds2)
gen odds3 = exp(llp1)
gen llprob1 = odds3/(1+odds3)

graph twoway (scatter prob1 lwt if race == 1, sort connect(l)) ///
             (line llprob1 ulprob1 lwt if race ==1, sort pstyle(p3 p3) ), ///
             xlabel(90(40)250) ylabel(0(.15).6)
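The odds-to-probability steps above apply the inverse logit transformation, exp(x)/(1 + exp(x)), to the linear predictor and to its confidence limits. Stata's invlogit() function does the same in one step. The sketch below is an alternative route, not the one used for the figure; it assumes p1, ulp1, llp1 and prob1 exist as created above, and the _alt variable names are made up for the illustration.

```stata
* invlogit(x) returns exp(x)/(1 + exp(x))
gen prob1_alt   = invlogit(p1)
gen ulprob1_alt = invlogit(ulp1)
gen llprob1_alt = invlogit(llp1)
* Optional check that both routes agree up to rounding in float storage
assert reldif(prob1, prob1_alt) < 1e-5
```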