Table 3.2, page 51.
use chdage.dta, clear
(Hosmer and Lemeshow - from chapter 1)
gen aged=0
replace aged=1 if age >= 55
(27 real changes made)
sort aged
by aged: tabulate chd
_______________________________________________________________________________
-> aged = 0
chd | Freq. Percent Cum.
------------+-----------------------------------
0 | 51 69.86 69.86
1 | 22 30.14 100.00
------------+-----------------------------------
Total | 73 100.00
_______________________________________________________________________________
-> aged = 1
chd | Freq. Percent Cum.
------------+-----------------------------------
0 | 6 22.22 22.22
1 | 21 77.78 100.00
------------+-----------------------------------
Total | 27 100.00
Table 3.3, page 52.
logit chd aged
Iteration 0: log likelihood = -68.331491
Iteration 1: log likelihood = -59.020453
Iteration 2: log likelihood = -58.979594
Iteration 3: log likelihood = -58.979565
Logit estimates Number of obs = 100
LR chi2(1) = 18.70
Prob > chi2 = 0.0000
Log likelihood = -58.979565 Pseudo R2 = 0.1369
------------------------------------------------------------------------------
chd | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
aged | 2.093546 .5285335 3.96 0.000 1.057639 3.129453
_cons | -.8407832 .2550733 -3.30 0.001 -1.340718 -.3408487
------------------------------------------------------------------------------
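NOTE: As a quick check that is not part of the book's table, the coefficient for aged should equal the log of the cross-product odds ratio from the 2x2 table in Table 3.2. The sketch below assumes the logit model just fit is still the active estimation result; the scalar name or_hand is ours.

```stata
* odds ratio by hand from the cell counts in Table 3.2
scalar or_hand = (21*51)/(22*6)
display "hand OR      = " or_hand
display "hand ln(OR)  = " ln(or_hand)
* the same quantities recovered from the fitted model
display "model OR     = " exp(_b[aged])
display "model ln(OR) = " _b[aged]
```

Both routes should give an odds ratio of about 8.11 and a log odds ratio of about 2.09.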
Table 3.5, page 56.
clear
input race chd cnt
1 1 5
2 1 20
3 1 15
4 1 10
1 0 20
2 0 10
3 0 10
4 0 10
end
expand cnt
(92 observations created)
tab chd race
| race
chd | 1 2 3 4 | Total
-----------+--------------------------------------------+----------
0 | 20 10 10 10 | 50
1 | 5 20 15 10 | 50
-----------+--------------------------------------------+----------
Total | 25 30 25 20 | 100
tabulate race, gen(race_)
race | Freq. Percent Cum.
------------+-----------------------------------
1 | 25 25.00 25.00
2 | 30 30.00 55.00
3 | 25 25.00 80.00
4 | 20 20.00 100.00
------------+-----------------------------------
Total | 100 100.00
logistic chd race_2 race_3 race_4
Logit estimates Number of obs = 100
LR chi2(3) = 14.04
Prob > chi2 = 0.0028
Log likelihood = -62.293721 Pseudo R2 = 0.1013
------------------------------------------------------------------------------
chd | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
race_2 | 8 5.059619 3.29 0.001 2.316037 27.63341
race_3 | 6 3.872965 2.78 0.006 1.69319 21.26165
race_4 | 4 2.68327 2.07 0.039 1.074136 14.8957
------------------------------------------------------------------------------
logit
Logit estimates Number of obs = 100
LR chi2(3) = 14.04
Prob > chi2 = 0.0028
Log likelihood = -62.293721 Pseudo R2 = 0.1013
------------------------------------------------------------------------------
chd | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
race_2 | 2.079442 .6324524 3.29 0.001 .8398576 3.319026
race_3 | 1.791759 .6454942 2.78 0.006 .5266141 3.056905
race_4 | 1.386294 .6708175 2.07 0.039 .0715163 2.701072
_cons | -1.386294 .4999961 -2.77 0.006 -2.366269 -.4063201
------------------------------------------------------------------------------
NOTE: The logistic command reports the odds ratios and their 95% confidence intervals. Typing logit with no arguments redisplays the same model as coefficients, which are the ln(OR) values shown on the bottom row of the book's table.
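NOTE: The link between the two displays can be checked directly. A minimal sketch, assuming the data and design variables created above are still in memory: the or option on logit reports odds ratios, and exponentiating a stored coefficient reproduces the matching entry in the logistic table.

```stata
* fit with logit but ask for odds ratios instead of coefficients
logit chd race_2 race_3 race_4, or
* exponentiate the stored coefficients by hand
display exp(_b[race_2])
display exp(_b[race_3])
display exp(_b[race_4])
```

The three display lines should print values of about 8, 6, and 4.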
Table 3.7, page 58.
logit chd race_2 race_3 race_4
Iteration 0: log likelihood = -69.314718
Iteration 1: log likelihood = -62.368156
Iteration 2: log likelihood = -62.293897
Iteration 3: log likelihood = -62.293721
Logit estimates Number of obs = 100
LR chi2(3) = 14.04
Prob > chi2 = 0.0028
Log likelihood = -62.293721 Pseudo R2 = 0.1013
------------------------------------------------------------------------------
chd | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
race_2 | 2.079442 .6324524 3.29 0.001 .8398576 3.319026
race_3 | 1.791759 .6454942 2.78 0.006 .5266141 3.056905
race_4 | 1.386294 .6708175 2.07 0.039 .0715163 2.701072
_cons | -1.386294 .4999961 -2.77 0.006 -2.366269 -.4063201
------------------------------------------------------------------------------
Table 3.8, page 59.
NOTE: We first drop the design variables race_1 through race_4 that tabulate created above. We then copy race into new variables named race_2, race_3, and race_4 and recode each copy to create the new design variables.
drop race_1 - race_4
gen race_2=race
gen race_3=race
gen race_4=race
recode race_2 1=-1 2=1 3/4=0
(100 changes made)
recode race_3 1=-1 2=0 3=1 4=0
(100 changes made)
recode race_4 1=-1 2=0 3=0 4=1
(100 changes made)
list race race_2 race_3 race_4 in 1/4
race race_2 race_3 race_4
1. 1 -1 -1 -1
2. 2 1 0 0
3. 3 0 1 0
4. 4 0 0 1
Table 3.9, page 60.
logit chd race_2 race_3 race_4
Iteration 0: log likelihood = -69.314718
Iteration 1: log likelihood = -62.368156
Iteration 2: log likelihood = -62.293897
Iteration 3: log likelihood = -62.293721
Logit estimates Number of obs = 100
LR chi2(3) = 14.04
Prob > chi2 = 0.0028
Log likelihood = -62.293721 Pseudo R2 = 0.1013
------------------------------------------------------------------------------
chd | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
race_2 | .7650677 .3505944 2.18 0.029 .0779153 1.45222
race_3 | .4773856 .3622841 1.32 0.188 -.2326781 1.187449
race_4 | .0719205 .384599 0.19 0.852 -.6818797 .8257208
_cons | -.0719205 .2188982 -0.33 0.742 -.5009531 .3571121
------------------------------------------------------------------------------
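NOTE: With the -1/0/1 coding created for Table 3.8, the constant is the average of the four group logits and each coefficient is that group's logit minus the average. The sketch below is our own check, not part of the book's table. It recomputes these values from the cell counts in Table 3.5, and the scalar names are ours.

```stata
* group logits ln(chd present / chd absent) from the counts in Table 3.5
scalar g1 = ln(5/20)
scalar g2 = ln(20/10)
scalar g3 = ln(15/10)
scalar g4 = ln(10/10)
* average logit across the four race groups
scalar gbar = (g1 + g2 + g3 + g4)/4
display "_cons  = " gbar
display "race_2 = " g2 - gbar
display "race_3 = " g3 - gbar
display "race_4 = " g4 - gbar
```

The four values should match the coefficients in the output above (-.0719205, .7650677, .4773856, and .0719205) up to rounding.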
Tables 3.10 – 3.14, pages 67 – 73. These data are hypothetical and are not available.
Table 3.14, page 77.
NOTE: We have run the logistic regression models from the largest to the smallest so that the difference between the larger and the smaller model can be determined. This is the reverse of the presentation in the table in the book.
use lowbwt.dta, clear
(Hosmer and Lemeshow - from appendix 1)
gen lwd=(lwt<110)
gen lwdage=lwd*age
logit low lwd age lwdage
Iteration 0: log likelihood = -117.336
Iteration 1: log likelihood = -110.71804
Iteration 2: log likelihood = -110.57024
Iteration 3: log likelihood = -110.56997
Logit estimates Number of obs = 189
LR chi2(3) = 13.53
Prob > chi2 = 0.0036
Log likelihood = -110.56997 Pseudo R2 = 0.0577
------------------------------------------------------------------------------
low | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lwd | -1.944089 1.724804 -1.13 0.260 -5.324643 1.436465
age | -.0795722 .0396343 -2.01 0.045 -.157254 -.0018904
lwdage | .1321967 .0756982 1.75 0.081 -.0161691 .2805626
_cons | .7744952 .9100949 0.85 0.395 -1.009258 2.558248
------------------------------------------------------------------------------
lrtest, saving(3)
logit low lwd age
Iteration 0: log likelihood = -117.336
Iteration 1: log likelihood = -112.19831
Iteration 2: log likelihood = -112.14339
Iteration 3: log likelihood = -112.14338
Logit estimates Number of obs = 189
LR chi2(2) = 10.39
Prob > chi2 = 0.0056
Log likelihood = -112.14338 Pseudo R2 = 0.0443
------------------------------------------------------------------------------
low | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lwd | 1.010122 .3642627 2.77 0.006 .2961806 1.724064
age | -.044232 .0322248 -1.37 0.170 -.1073913 .0189274
_cons | -.026891 .7621481 -0.04 0.972 -1.520674 1.466892
------------------------------------------------------------------------------
lrtest, using(3)
Logit: likelihood-ratio test chi2(1) = 3.15
Prob > chi2 = 0.0761
lrtest, saving(2)
logit low lwd
Iteration 0: log likelihood = -117.336
Iteration 1: log likelihood = -113.161
Iteration 2: log likelihood = -113.12058
Iteration 3: log likelihood = -113.12058
Logit estimates Number of obs = 189
LR chi2(1) = 8.43
Prob > chi2 = 0.0037
Log likelihood = -113.12058 Pseudo R2 = 0.0359
------------------------------------------------------------------------------
low | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lwd | 1.053762 .3615635 2.91 0.004 .3451102 1.762413
_cons | -1.053762 .1883882 -5.59 0.000 -1.422996 -.6845277
------------------------------------------------------------------------------
lrtest, using(2)
Logit: likelihood-ratio test chi2(1) = 1.95
Prob > chi2 = 0.1621
lrtest, saving(1)
logit low
Iteration 0: log likelihood = -117.336
Logit estimates Number of obs = 189
LR chi2(0) = 0.00
Prob > chi2 = .
Log likelihood = -117.336 Pseudo R2 = 0.0000
------------------------------------------------------------------------------
low | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_cons | -.789997 .156976 -5.03 0.000 -1.097664 -.4823297
------------------------------------------------------------------------------
lrtest, using(1)
Logit: likelihood-ratio test chi2(1) = 8.43
Prob > chi2 = 0.0037
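NOTE: The lrtest, saving() and lrtest, using() forms shown here are an older syntax. A sketch of the same comparisons using estimates store follows, assuming lowbwt.dta is loaded and lwd and lwdage exist as created above. The stored names m3, m2, m1, and m0 are our own labels. The last two lines recompute the first test by hand as twice the difference in log likelihoods.

```stata
quietly logit low lwd age lwdage
estimates store m3
quietly logit low lwd age
estimates store m2
quietly logit low lwd
estimates store m1
quietly logit low
estimates store m0
* each test compares a larger model with the model nested in it
lrtest m3 m2
lrtest m2 m1
lrtest m1 m0
* hand computation of the first test from the log likelihoods above
display 2*(-110.56997 - (-112.14338))
display chi2tail(1, 2*(-110.56997 - (-112.14338)))
```

The hand-computed statistic should be about 3.15, in line with the first lrtest result above.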
Figure 3.3, page 78.
logit low age lwd lwdage
Iteration 0: log likelihood = -117.336
Iteration 1: log likelihood = -110.71804
Iteration 2: log likelihood = -110.57024
Iteration 3: log likelihood = -110.56997
Logit estimates Number of obs = 189
LR chi2(3) = 13.53
Prob > chi2 = 0.0036
Log likelihood = -110.56997 Pseudo R2 = 0.0577
------------------------------------------------------------------------------
low | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age | -.0795722 .0396343 -2.01 0.045 -.157254 -.0018904
lwd | -1.944089 1.724804 -1.13 0.260 -5.324643 1.436465
lwdage | .1321967 .0756982 1.75 0.081 -.0161691 .2805626
_cons | .7744952 .9100949 0.85 0.395 -1.009258 2.558248
------------------------------------------------------------------------------
predict el, xb
graph twoway scatter el age, xlabel(10(5)45) ylabel(-3 .6)
Table 3.15, page 78.
estat vce
Covariance matrix of coefficients of logit model
e(V) | age lwd lwdage _cons
-------------+------------------------------------------------
age | .00157088
lwd | .03526621 2.974949
lwdage | -.00157088 -.12760349 .00573022
_cons | -.03526621 -.82827277 .03526621 .82827277
Table 3.16, page 79.
logit low age lwd lwdage
Iteration 0: log likelihood = -117.336
Iteration 1: log likelihood = -110.71804
Iteration 2: log likelihood = -110.57024
Iteration 3: log likelihood = -110.56997
Logit estimates Number of obs = 189
LR chi2(3) = 13.53
Prob > chi2 = 0.0036
Log likelihood = -110.56997 Pseudo R2 = 0.0577
------------------------------------------------------------------------------
low | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age | -.0795722 .0396343 -2.01 0.045 -.157254 -.0018904
lwd | -1.944089 1.724804 -1.13 0.260 -5.324643 1.436465
lwdage | .1321967 .0756982 1.75 0.081 -.0161691 .2805626
_cons | .7744952 .9100949 0.85 0.395 -1.009258 2.558248
------------------------------------------------------------------------------
lincom lwd + 15*lwdage , or
( 1) lwd + 15.0 lwdage = 0.0
------------------------------------------------------------------------------
low | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | 1.039627 .6865828 0.06 0.953 .284927 3.79334
------------------------------------------------------------------------------
lincom lwd + 20*lwdage , or
( 1) lwd + 20.0 lwdage = 0.0
------------------------------------------------------------------------------
low | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | 2.013443 .81264 1.73 0.083 .9128263 4.441098
------------------------------------------------------------------------------
lincom lwd + 25*lwdage , or
( 1) lwd + 25.0 lwdage = 0.0
------------------------------------------------------------------------------
low | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | 3.899427 1.636664 3.24 0.001 1.712913 8.877003
------------------------------------------------------------------------------
lincom lwd + 30*lwdage , or
( 1) lwd + 30.0 lwdage = 0.0
------------------------------------------------------------------------------
low | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | 7.552007 5.210013 2.93 0.003 1.953582 29.19397
------------------------------------------------------------------------------
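NOTE: Each lincom above estimates the odds ratio for lwd at a given age a as exp(b_lwd + a*b_lwdage). The standard error of the log odds ratio comes from the covariance matrix in Table 3.15. The following is a sketch of that hand computation for age 15, using values copied from the output above; the scalar names are ours.

```stata
* log odds ratio for lwd at age 15
scalar lnor15 = -1.944089 + 15*.1321967
* Var(b_lwd) + 15^2*Var(b_lwdage) + 2*15*Cov(b_lwd, b_lwdage)
scalar var15 = 2.974949 + 15^2*.00573022 + 2*15*(-.12760349)
display "OR     = " exp(lnor15)
display "95% CI = " exp(lnor15 - 1.96*sqrt(var15)) ", " exp(lnor15 + 1.96*sqrt(var15))
```

The results should agree with the first lincom table: an odds ratio of about 1.04 with an interval of roughly .28 to 3.79.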
Table 3.17, page 80.
tab2 low smoke
-> tabulation of low by smoke
| smoke
< 2500g | 0 1 | Total
-----------+----------------------+----------
0 | 86 44 | 130
1 | 29 30 | 59
-----------+----------------------+----------
Total | 115 74 | 189
Table 3.18, page 81.
sort race
by race: tabulate low smoke
_______________________________________________________________________________
-> race = white
| smoke
< 2500g | 0 1 | Total
-----------+----------------------+----------
0 | 40 33 | 73
1 | 4 19 | 23
-----------+----------------------+----------
Total | 44 52 | 96
_______________________________________________________________________________
-> race = black
| smoke
< 2500g | 0 1 | Total
-----------+----------------------+----------
0 | 11 4 | 15
1 | 5 6 | 11
-----------+----------------------+----------
Total | 16 10 | 26
_______________________________________________________________________________
-> race = other
| smoke
< 2500g | 0 1 | Total
-----------+----------------------+----------
0 | 35 7 | 42
1 | 20 5 | 25
-----------+----------------------+----------
Total | 55 12 | 67
Table 3.19, page 82.
xi i.race
i.race _Irace_1-3 (naturally coded; _Irace_1 omitted)
gen race2sm = _Irace_2*smoke
gen race3sm = _Irace_3*smoke
logit low smoke _Irace_2 _Irace_3 race2sm race3sm
Iteration 0: log likelihood = -117.336
Iteration 1: log likelihood = -108.9189
Iteration 2: log likelihood = -108.42021
Iteration 3: log likelihood = -108.4089
Iteration 4: log likelihood = -108.40889
Logit estimates Number of obs = 189
LR chi2(5) = 17.85
Prob > chi2 = 0.0031
Log likelihood = -108.40889 Pseudo R2 = 0.0761
------------------------------------------------------------------------------
low | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
smoke | 1.750517 .5982759 2.93 0.003 .5779173 2.923116
_Irace_2 | 1.514128 .7522689 2.01 0.044 .0397077 2.988548
_Irace_3 | 1.742969 .5946183 2.93 0.003 .5775389 2.9084
race2sm | -.556594 1.032235 -0.54 0.590 -2.579738 1.46655
race3sm | -1.527373 .8828152 -1.73 0.084 -3.257659 .202913
_cons | -2.302585 .5244039 -4.39 0.000 -3.330398 -1.274772
------------------------------------------------------------------------------
lincom smoke, or
( 1) smoke = 0.0
------------------------------------------------------------------------------
low | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | 5.757576 3.444619 2.93 0.003 1.782322 18.59915
------------------------------------------------------------------------------
lincom smoke
( 1) smoke = 0.0
------------------------------------------------------------------------------
low | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | 1.750517 .5982759 2.93 0.003 .5779173 2.923116
------------------------------------------------------------------------------
lincom smoke+race2sm, or
( 1) smoke + race2sm = 0.0
------------------------------------------------------------------------------
low | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | 3.3 2.775878 1.42 0.156 .6346062 17.16025
------------------------------------------------------------------------------
lincom smoke+race2sm
( 1) smoke + race2sm = 0.0
------------------------------------------------------------------------------
low | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | 1.193922 .8411752 1.42 0.156 -.4547507 2.842596
------------------------------------------------------------------------------
lincom smoke+race3sm, or
( 1) smoke + race3sm = 0.0
------------------------------------------------------------------------------
low | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | 1.25 .8114691 0.34 0.731 .350212 4.461584
------------------------------------------------------------------------------
lincom smoke+race3sm
( 1) smoke + race3sm = 0.0
------------------------------------------------------------------------------
low | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | .2231436 .6491753 0.34 0.731 -1.049217 1.495504
------------------------------------------------------------------------------
NOTE: The estimated variances of the ln(estimated odds ratios) and their inverses, w, are not calculated here because they are needed only for the hand computation. The value of chi-square-h (at the top of page 83) can be obtained with the test command, as shown below.
test race2sm race3sm
( 1) race2sm = 0.0
( 2) race3sm = 0.0
chi2( 2) = 3.02
Prob > chi2 = 0.2213
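NOTE: For readers who want to reproduce the hand computation, the sketch below works from the three lincom tables above. It rebuilds the weights w as the inverse of the squared standard errors, forms the weighted average of the ln(OR) values, and sums the weighted squared deviations. This is our reading of the calculation, with scalar names of our choosing; the exact formula should be checked against the book.

```stata
* ln(OR) for white, black, and other, copied from the lincom output above
scalar b1 = 1.750517
scalar b2 = 1.193922
scalar b3 = .2231436
* weights: inverse of the squared standard errors
scalar w1 = 1/.5982759^2
scalar w2 = 1/.8411752^2
scalar w3 = 1/.6491753^2
* weighted average of the stratum-specific ln(OR)
scalar bw = (w1*b1 + w2*b2 + w3*b3)/(w1 + w2 + w3)
* weighted sum of squared deviations, 2 degrees of freedom
scalar chi2h = w1*(b1 - bw)^2 + w2*(b2 - bw)^2 + w3*(b3 - bw)^2
display "chi2 = " chi2h
display "p    = " chi2tail(2, chi2h)
```

The statistic should come out close to the 3.02 reported by the test command.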
Table 3.20, page 84.
logit low smoke
Iteration 0: log likelihood = -117.336
Iteration 1: log likelihood = -114.9123
Iteration 2: log likelihood = -114.9023
Logit estimates Number of obs = 189
LR chi2(1) = 4.87
Prob > chi2 = 0.0274
Log likelihood = -114.9023 Pseudo R2 = 0.0207
------------------------------------------------------------------------------
low | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
smoke | .7040592 .3196386 2.20 0.028 .0775791 1.330539
_cons | -1.087051 .2147299 -5.06 0.000 -1.507914 -.6661886
------------------------------------------------------------------------------
lrtest, saving(1)
logit low smoke _Irace_2 _Irace_3
Iteration 0: log likelihood = -117.336
Iteration 1: log likelihood = -110.10441
Iteration 2: log likelihood = -109.98749
Iteration 3: log likelihood = -109.98736
Logit estimates Number of obs = 189
LR chi2(3) = 14.70
Prob > chi2 = 0.0021
Log likelihood = -109.98736 Pseudo R2 = 0.0626
------------------------------------------------------------------------------
low | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
smoke | 1.116004 .3692258 3.02 0.003 .3923346 1.839673
_Irace_2 | 1.084088 .4899845 2.21 0.027 .1237362 2.04444
_Irace_3 | 1.108563 .4003054 2.77 0.006 .3239787 1.893147
_cons | -1.840539 .3528633 -5.22 0.000 -2.532138 -1.148939
------------------------------------------------------------------------------
lrtest, saving(2)
logit low smoke _Irace_2 _Irace_3 race2sm race3sm
Iteration 0: log likelihood = -117.336
Iteration 1: log likelihood = -108.9189
Iteration 2: log likelihood = -108.42021
Iteration 3: log likelihood = -108.4089
Iteration 4: log likelihood = -108.40889
Logit estimates Number of obs = 189
LR chi2(5) = 17.85
Prob > chi2 = 0.0031
Log likelihood = -108.40889 Pseudo R2 = 0.0761
------------------------------------------------------------------------------
low | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
smoke | 1.750517 .5982759 2.93 0.003 .5779173 2.923116
_Irace_2 | 1.514128 .7522689 2.01 0.044 .0397077 2.988548
_Irace_3 | 1.742969 .5946183 2.93 0.003 .5775389 2.9084
race2sm | -.556594 1.032235 -0.54 0.590 -2.579738 1.46655
race3sm | -1.527373 .8828152 -1.73 0.084 -3.257659 .202913
_cons | -2.302585 .5244039 -4.39 0.000 -3.330398 -1.274772
------------------------------------------------------------------------------
lrtest, saving(3)
lrtest, using(2) model(1)
Logit: likelihood-ratio test chi2(2) = 9.83
Prob > chi2 = 0.0073
lrtest, using(3) model(2)
Logit: likelihood-ratio test chi2(2) = 3.16
Prob > chi2 = 0.2063
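NOTE: In Stata 11 and later, the xi prefix and the hand-built interaction terms used for Tables 3.19 and 3.20 can be replaced by factor-variable notation. The following is a sketch rather than output from the book. It assumes race is coded 1, 2, 3 and smoke is coded 0, 1, as the tabulations above indicate.

```stata
* main effects and the smoke-by-race interaction in one specification
logit low i.smoke##i.race
* odds ratio for smoking within each race group (white, black, other)
lincom 1.smoke, or
lincom 1.smoke + 1.smoke#2.race, or
lincom 1.smoke + 1.smoke#3.race, or
* joint Wald test of the two interaction terms
testparm i.smoke#i.race
```

Because this is the same model in different notation, the odds ratios and the Wald test should reproduce the values shown for Table 3.19.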
Figure 3.4, page 86.
logit low lwt _Irace_2 _Irace_3
Iteration 0: log likelihood = -117.336
Iteration 1: log likelihood = -111.7491
Iteration 2: log likelihood = -111.62983
Iteration 3: log likelihood = -111.62955
Logit estimates Number of obs = 189
LR chi2(3) = 11.41
Prob > chi2 = 0.0097
Log likelihood = -111.62955 Pseudo R2 = 0.0486
------------------------------------------------------------------------------
low | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lwt | -.0152231 .0064393 -2.36 0.018 -.0278439 -.0026023
_Irace_2 | 1.081066 .4880512 2.22 0.027 .1245034 2.037629
_Irace_3 | .4806033 .3566733 1.35 0.178 -.2184636 1.17967
_cons | .8057535 .8451625 0.95 0.340 -.8507345 2.462241
------------------------------------------------------------------------------
predict p1, xb
predict sep1, stdp
gen ulp1 = p1+1.96*sep1
gen llp1 = p1-1.96*sep1
graph twoway (scatter p1 lwt if race == 1, sort connect(l)) ///
(line ulp1 llp1 lwt if race ==1, sort pstyle(p3 p3)), ///
xlabel(90(40)250) ylabel(-4.25 .1)
Figure 3.5, page 87.
gen odds1 = exp(ulp1)
gen ulprob1 = odds1/(1+odds1)
gen odds2 = exp(p1)
gen prob1 = odds2/(1+odds2)
gen odds3 = exp(llp1)
gen llprob1 = odds3/(1+odds3)
graph twoway (scatter prob1 lwt if race == 1, sort connect(l)) ///
(line llprob1 ulprob1 lwt if race ==1, sort pstyle(p3 p3) ), ///
xlabel(90(40)250) ylabel(0(.15).6)
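NOTE: The three conversions from the logit scale to the probability scale can also be written with Stata's invlogit() function. A minimal sketch, assuming p1, ulp1, llp1, and prob1 exist as created above; the _alt variable names are ours, chosen to avoid clashing with the existing variables.

```stata
* invlogit(x) = exp(x)/(1 + exp(x))
gen prob1_alt   = invlogit(p1)
gen ulprob1_alt = invlogit(ulp1)
gen llprob1_alt = invlogit(llp1)
* the two versions should agree up to floating-point error
summarize prob1 prob1_alt
```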



