Figure 15.1 on page 440 on the data file chile.
use https://stats.idre.ucla.edu/stat/stata/examples/ara/chile, clear gen voting=1 if (intvote==1) replace voting = 0 if (intvote==2) logistic voting statquo predict pred, p graph twoway (scatter voting statquo, jitter(1)) (lfit voting statquo) /// (lowess voting statquo, bwidth(.4)) (scatter pred statquo, connect(l) msymbol(i) sort), /// ylabel(0 .5 1)
Table 15.1 and 15.2 on page 452 using the data file womenlf.
use https://stats.idre.ucla.edu/stat/stata/examples/ara/womenlf, clear * Generating a new work status variable and an interaction variable gen ws = 1 if( workstat==1 | workstat==2) replace ws=0 if ( workstat==0) gen ik = husbinc*chilpres * Contrast between model 1 and 0 is the overall likelihood ratio of * model 1 and the p-value is the overall Prob > chi2 xi: logistic ws husbinc i.chilpres i.region ik /*model 1 */ (Some output omitted here.) Logit estimates Number of obs = 263 LR chi2(7) = 39.61 Prob > chi2 = 0.0000 Log likelihood = -158.27076 Pseudo R2 = 0.1112 (More output is omitted.) lrtest, saving(m1)/* saving for further contrast test */ display -2*e(ll) /* displaying deviance */ 316.54152 logistic ws /* model 0 */ display -2*e(ll)/* displaying deviance */ 356.15089 xi: logistic ws husbinc i.chilpres i.region /* model 2 */ display -2*e(ll) 317.30107 lrtest, saving(m2) lrtest, using(m1) /* contrast 2-1 */ Logistic: likelihood-ratio test chi2(1) = 0.76 Prob > chi2 = 0.3835 xi: logistic ws husbinc ik i.chilpres /* model 3 */ display -2*e(ll) 319.12422 lrtest, using(m1)/* constrast 3-1 */ Logistic: likelihood-ratio test chi2(4) = 2.58 Prob > chi2 = 0.6299 xi: logistic ws husbinc i.region /* model 4 */ display -2*e(ll) 347.84936 lrtest, using(m2)/* contrast 4-2 */ Logistic: likelihood-ratio test chi2(1) = 30.55 Prob > chi2 = 0.0000 xi: logistic ws i.chilpres i.region /* model 5 */ display -2*e(ll) 322.4267 lrtest, using(m2)/* contrast 5-2 */ Logistic: likelihood-ratio test chi2(1) = 5.13 Prob > chi2 = 0.0236
Figure 15.4 and formula on page 453.
xi: logistic ws i.chilpres husbinc i.chilpres Ichilp_0-1 (naturally coded; Ichilp_0 omitted) Logit estimates Number of obs = 263 LR chi2(2) = 36.42 Prob > chi2 = 0.0000 Log likelihood = -159.86627 Pseudo R2 = 0.1023 ------------------------------------------------------------------------------ ws | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- Ichilp_1 | .2068734 .0604614 -5.391 0.000 .1166622 .3668421 husbinc | .9585741 .0189607 -2.139 0.032 .9221229 .9964661 ------------------------------------------------------------------------------ logit Logit estimates Number of obs = 263 LR chi2(2) = 36.42 Prob > chi2 = 0.0000 Log likelihood = -159.86627 Pseudo R2 = 0.1023 ------------------------------------------------------------------------------ ws | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- Ichilp_1 | -1.575648 .2922629 -5.391 0.000 -2.148473 -1.002824 husbinc | -.0423084 .0197801 -2.139 0.032 -.0810767 -.0035401 _cons | 1.33583 .3837632 3.481 0.000 .5836677 2.087992 ------------------------------------------------------------------------------ predict pw, p xi: regress ws i.chilpres husbinc i.chilpres Ichilp_0-1 (naturally coded; Ichilp_0 omitted) Source | SS df MS Number of obs = 263 ---------+------------------------------ F( 2, 260) = 20.43 Model | 8.64321054 2 4.32160527 Prob > F = 0.0000 Residual | 55.0069796 260 .211565306 R-squared = 0.1358 ---------+------------------------------ Adj R-squared = 0.1291 Total | 63.6501901 262 .242939657 Root MSE = .45996 ------------------------------------------------------------------------------ ws | Coef. Std. Err. t P>|t| [95% Conf. Interval] ---------+-------------------------------------------------------------------- Ichilp_1 | -.3673736 .061906 -5.934 0.000 -.4892745 -.2454727 husbinc | -.0085375 .0039351 -2.170 0.031 -.0162863 -.0007887 _cons | .7936535 .0766814 10.350 0.000 .6426578 .9446491 ------------------------------------------------------------------------------ predict lw, xb graph twoway (scatter pw lw husbinc if chilpres ==1, sort connect(l l) msymbol(i i) clcolor(black blue)) /// (scatter pw lw husbinc if chilpres ==0, sort connect(l l) msymbol(i i) clcolor(black blue)), /// ttext(0.15 10 "Children Present") ttext(0.75 30 "Children Absent") /// ylabel(0(.25)1)
Figure 15.5 on page 459. We can’t get the exact lowess smooth as in the book as it needs more than one iterative reweighting steps as there are outliers in the data and Stata’s command lowess does not have an option on that yet. See the corresponding part of SAS for a better lowess smoothing result.
logistic ws chilpres husbinc predict prob gen par=(ws-prob)/(prob*(1-prob))-.0423*husbinc /*creating partial residual*/ graph twoway (scatter par husbinc) (lowess par husbinc, bwidth(.9)) (lfit par husbinc), /// ylabel(-5 0 5)
Figure 15.6 on page 461. Notice that in Stata all the diagnostic statistics for logistic regression are adjusted for the number of covariate patterns, so called m-asymptotic instead of n-asymptotic, i.e., for the number of observations. So we have to do some calculations here in order to get the results in the text book.
logistic ws chilpres husbinc predict yhat /*the predicted value*/ gen pr=(ws-yhat)/sqrt(yhat*(1-yhat)) /* generating Pearson residual on a case by case basis */ predict myhat, hat predict pattern, number egen count=count(obs), by(pattern) /*number of obs sharing a c.p.*/ gen nhat=myhat/count /*hat diagonal on per observation basis*/ gen sr = pr/sqrt(1-nhat)/*studentized pearson residual*/ graph twoway scatter sr nhat, ylabel(-2(2)4) yline(-2 0 2) xlabel(0(.01).06) xline(0.0228 .034)
Figure 15.7 (a) and (b) on page 462. Stata does not have built-in command for Dfbeta. We use formula [15.21] to create the required statistics for these figures. This example is continuation of the previous one.
gen id=1 set matsize 300 mkmat chilpres husbinc id, matrix(X) matrix d=e(V)*X' matrix dt=d' svmat dt, name(mydf) gen md2=mydf2*(ws-yhat)/(1-nhat)/*husbincDfbeta*/ graph twoway scatter md2 obs, ylab(-0.005(0.005)0.01) yline(0) xlabel(0(50)300)
gen md1=mydf1*(ws-yhat)/(1-nhat) /*kidDfbeta*/ graph twoway scatter md1 obs, ylabel(-0.04(0.04)0.08) yline(0) xlab(0(50)300)
Calculation and Figure 15.8 on page 468 and 469.
mlogit workstat husbinc chilpre, base(0) Iteration 0: log likelihood = -250.24628 Iteration 1: log likelihood = -214.24438 Iteration 2: log likelihood = -211.495 Iteration 3: log likelihood = -211.44102 Iteration 4: log likelihood = -211.44096 Multinomial regression Number of obs = 263 LR chi2(4) = 77.61 Prob > chi2 = 0.0000 Log likelihood = -211.44096 Pseudo R2 = 0.1551 ------------------------------------------------------------------------------ workstat | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- parttime | husbinc | .0068921 .0234548 0.294 0.769 -.0390784 .0528627 chilpres | .0214911 .4690366 0.046 0.963 -.8978037 .940786 _cons | -1.432307 .5924623 -2.418 0.016 -2.593512 -.2711023 ---------+-------------------------------------------------------------------- fulltime | husbinc | -.0972307 .0280958 -3.461 0.001 -.1522975 -.0421639 chilpres | -2.558595 .362199 -7.064 0.000 -3.268492 -1.848698 _cons | 1.982822 .4841771 4.095 0.000 1.033853 2.931792 ------------------------------------------------------------------------------ (Outcome workstat==not work is the comparison group) predict p0 p1 p2 graph twoway (line p0 husbinc if chilpres==1, sort) (line p1 husbinc if chilpres==1, sort) /// (line p2 husbinc if chilpres==1, sort) , ylabel(0(.25).75) /// ttext(0.75 5 "Children Present") ttext(0.70 30 "Not Working") /// ttext(0.25 8.5 "Full-Time") ttext(0.20 35 "Part-Time")
graph twoway (line p0 husbinc if chilpres==0, sort) (line p1 husbinc if chilpres==0, sort) /// (line p2 husbinc if chilpres==0, sort) , ylabel(0(.25)1) /// ttext(1 5 "Children Absent") ttext(0.50 30 "Not Working") /// ttext(0.80 8.5 "Full-Time") ttext(0.05 17.5 "Part-Time")
Calculation on page 473.
gen wk=(workstat==0) logistic wk husbinc chilpres Logit estimates Number of obs = 263 LR chi2(2) = 36.42 Prob > chi2 = 0.0000 Log likelihood = -159.86627 Pseudo R2 = 0.1023 ------------------------------------------------------------------------------ wk | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- husbinc | 1.043216 .0206349 2.139 0.032 1.003546 1.084454 chilpres | 4.833875 1.412762 5.391 0.000 2.725968 8.57176 ------------------------------------------------------------------------------ logit Logit estimates Number of obs = 263 LR chi2(2) = 36.42 Prob > chi2 = 0.0000 Log likelihood = -159.86627 Pseudo R2 = 0.1023 ------------------------------------------------------------------------------ wk | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- husbinc | .0423084 .0197801 2.139 0.032 .0035401 .0810767 chilpres | 1.575648 .2922629 5.391 0.000 1.002824 2.148473 _cons | -1.33583 .3837632 -3.481 0.000 -2.087992 -.5836677 ------------------------------------------------------------------------------ lrtest, saving(0) logistic wk chilpres Logit estimates Number of obs = 263 LR chi2(1) = 31.59 Prob > chi2 = 0.0000 Log likelihood = -162.27945 Pseudo R2 = 0.0887 ------------------------------------------------------------------------------ wk | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- chilpres | 4.781119 1.379609 5.422 0.000 2.71589 8.416797 ------------------------------------------------------------------------------ lrtest /*test on the bottom of page 473*/ Logistic: likelihood-ratio test chi2(1) = 4.83 Prob > chi2 = 0.0280 gen ptime=. replace ptime = 0 if (workstat==1) replace ptime=1 if(workstat==2) logistic ptime husbinc chilpres Logit estimates Number of obs = 108 LR chi2(2) = 39.85 Prob > chi2 = 0.0000 Log likelihood = -52.247423 Pseudo R2 = 0.2761 ------------------------------------------------------------------------------ ptime | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- husbinc | .898285 .0351699 -2.740 0.006 .8319318 .9699305 chilpres | .0705484 .0381719 -4.900 0.000 .0244301 .2037278 ------------------------------------------------------------------------------ logit Logit estimates Number of obs = 108 LR chi2(2) = 39.85 Prob > chi2 = 0.0000 Log likelihood = -52.247423 Pseudo R2 = 0.2761 ------------------------------------------------------------------------------ ptime | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- husbinc | -.1072679 .0391522 -2.740 0.006 -.1840048 -.0305309 chilpres | -2.651456 .5410738 -4.900 0.000 -3.711941 -1.59097 _cons | 3.477773 .7671069 4.534 0.000 1.974272 4.981275 ------------------------------------------------------------------------------ lrtest, saving(p1) logistic ptime chilpres Logit estimates Number of obs = 108 LR chi2(1) = 30.87 Prob > chi2 = 0.0000 Log likelihood = -56.738094 Pseudo R2 = 0.2138 ------------------------------------------------------------------------------ ptime | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- chilpres | .0869565 .04288 -4.953 0.000 .0330794 .2285846 ------------------------------------------------------------------------------ lrtest, using(p1)/*test on the top of page 474*/ Logistic: likelihood-ratio test chi2(1) = 8.98 Prob > chi2 = 0.0027
Calculation on page 477.
xi: ologit workstat husbinc i.chilpres i.chilpres Ichilp_0-1 (naturally coded; Ichilp_0 omitted) Iteration 0: log likelihood = -250.24628 Iteration 1: log likelihood = -221.36758 Iteration 2: log likelihood = -220.83242 Iteration 3: log likelihood = -220.83148 Ordered logit estimates Number of obs = 263 LR chi2(2) = 58.83 Prob > chi2 = 0.0000 Log likelihood = -220.83148 Pseudo R2 = 0.1175 ------------------------------------------------------------------------------ workstat | Coef. Std. Err. z P>|z| [95% Conf. Interval] ---------+-------------------------------------------------------------------- husbinc | -.0539007 .01949 -2.766 0.006 -.0921004 -.0157009 Ichilp_1 | -1.971957 .2869478 -6.872 0.000 -2.534364 -1.40955 ---------+-------------------------------------------------------------------- _cut1 | -1.852037 .3862995 (Ancillary parameters) _cut2 | -.9409253 .3699303 ------------------------------------------------------------------------------
The analysis in this section is based on table 15.3, which is based on data from an example from The American Voter (Campbell, et al., 1960). The first data set we create here is based on table 15.3 for figure 15.3 (page 480).
input logv1 logvc inten 847 .9 0 904 1.318 1 981 2.084 2 end label define scale 0 Weak 1 Medium 2 Strong label values inten scale label variable logv1 "One-Sided" label variable logvc "Close" graph twoway scatter logv1 logvc inten, connect(l l) xlabel(0 1 2)
Now we create another data set below to do the logistic regression and tests over different models. Thus we will have table 15.4 and table 15.5 on page 482.
input perclose inten1 inten2 voted wv 0 0 0 1 91 0 0 0 0 39 0 1 0 1 121 0 1 0 0 49 0 0 1 1 64 0 0 1 0 24 1 0 0 1 214 1 0 0 0 87 1 1 0 1 284 1 1 0 0 76 1 0 1 1 201 1 0 1 0 25 end gen clspref1=perclose*inten1 gen clspref2=perclose*inten2 logistic voted perclose inten1 inten2 clspref1 clspref2 [fweight=wv] (Output omitted.) di -2*e(ll) /*deviance for model 1*/ 1356.4343 lrtest, saving(t1) logistic voted perclose inten1 inten2 [fweight=wv] (Output omitted.) di -2*e(ll) /*model 2*/ 1363.5529 lrtest, saving(t2) lrtest, using(t1)/*contrast 2-1*/ Logistic: likelihood-ratio test chi2(2) = 7.12 Prob > chi2 = 0.0285 logistic voted perclose clspref1 clspref2 [fweight=wv] (Output omitted.) di -2*e(ll) /*model 3 */ 1356.6253 lrtest, saving(t3) lrtest, using(t1) /*contrast 3-1 */ Logistic: likelihood-ratio test chi2(2) = 0.19 Prob > chi2 = 0.9089 logistic voted inten1 inten2 clspref1 clspref2 [fweight=wv] (Output omitted.) di -2*e(ll) /*model 4*/ 1356.4869 lrtest, using(t1) Logistic: likelihood-ratio test chi2(1) = 0.05 Prob > chi2 = 0.8186 logistic voted perclose [fweight=wv] (Output omitted.) di -2*e(ll) /*model 5 */ 1382.6582 lrtest, using(t2) Logistic: likelihood-ratio test chi2(2) = 19.11 Prob > chi2 = 0.0001 logistic voted inten1 inten2 [fweight=wv] (Output omitted.) di -2*e(ll) /*model 6*/ 1371.8382 lrtest, using(t2) Logistic: likelihood-ratio test chi2(1) = 8.29 Prob > chi2 = 0.0040