This page has been updated to work with Stata 12.
This chapter makes use of the user written program fitstat program, which is not part of base Stata. Prior to using the fitstat command, it needs to be downloaded by typing search fitstat in the command line (see How can I use the search command to search for programs and get additional help? for more information about using search).
Table 7.2 on page 178 using data on usage of alcohol, cigarettes and marijuana.
use https://stats.idre.ucla.edu/stat/stata/examples/icda/drug, clear *MODEL 1 quietly glm count i.gender#i.race i.alcohol i.smoke i.marij, fam(poi) display "deviance = " e(deviance) " df=" e(df) deviance = 1325.1408 df=25 *MODEL 2 quietly glm count i.gender#i.race i.gender#i.alcohol i.gender#i.smoke /// i.gender#i.marij i.race#i.alcohol i.race#i.smoke i.race#i.marij /// i.alcohol#i.smoke i.alcohol#i.marij i.smoke#i.marij, fam(poi) display "deviance = " e(deviance) " df=" e(df) deviance = 15.340343 df=16 *MODEL 3 quietly glm count i.gender#i.race#i.alcohol i.gender#i.race#i.smoke /// i.gender#i.race#i.marij i.gender#i.alcohol#i.smoke /// i.gender#i.alcohol#i.marij i.gender#i.smoke#i.marij /// i.race#i.alcohol#i.smoke i.race#i.alcohol#i.marij /// i.race#i.smoke#i.marij i.alcohol#i.smoke#i.marij, fam(poi) display "deviance = " e(deviance) " df=" e(df) deviance = 5.2720008 df=6 *MODEL 4a quietly glm count i.gender#i.race i.gender#i.alcohol i.gender#i.smoke /// i.gender#i.marij i.race#i.alcohol i.race#i.smoke i.race#i.marij /// i.alcohol#i.marij i.smoke#i.marij, fam(poi) display "deviance = " e(deviance) " df=" e(df) deviance = 201.19931 df=17 *MODEL 4b quietly glm count i.gender#i.race i.gender#i.alcohol i.gender#i.smoke /// i.gender#i.marij i.race#i.alcohol i.race#i.smoke i.race#i.marij /// i.alcohol#i.smoke i.smoke#i.marij, fam(poi) display "deviance = " e(deviance) " df=" e(df) deviance = 106.958 df=17 *MODEL 4c quietly glm count i.gender#i.race i.gender#i.alcohol i.gender#i.smoke /// i.gender#i.marij i.race#i.alcohol i.race#i.smoke i.race#i.marij /// i.alcohol#i.smoke i.alcohol#i.marij, fam(poi) display "deviance = " e(deviance) " df=" e(df) deviance = 513.47218 df=17 *MODEL 4d quietly glm count i.gender#i.race i.gender#i.smoke /// i.gender#i.marij i.race#i.alcohol i.race#i.smoke i.race#i.marij /// i.alcohol#i.smoke i.alcohol#i.marij i.smoke#i.marij, fam(poi) display "deviance = " e(deviance) " df=" e(df) deviance = 18.716951 df=17 *MODEL 4e quietly glm count i.gender#i.race i.gender#i.alcohol i.gender#i.smoke /// i.gender#i.marij i.race#i.smoke i.race#i.marij /// i.alcohol#i.smoke i.alcohol#i.marij i.smoke#i.marij, fam(poi) display "deviance = " e(deviance) " df=" e(df) deviance = 20.320861 df=17 *MODEL 4f quietly glm count i.gender#i.race i.gender#i.alcohol /// i.gender#i.marij i.race#i.alcohol i.race#i.smoke i.race#i.marij /// i.alcohol#i.smoke i.alcohol#i.marij i.smoke#i.marij, fam(poi) display "deviance = " e(deviance) " df=" e(df) deviance = 16.317184 df=17 *MODEL 4g quietly glm count i.gender#i.race i.gender#i.alcohol i.gender#i.smoke /// i.gender#i.marij i.race#i.alcohol i.race#i.marij /// i.alcohol#i.smoke i.alcohol#i.marij i.smoke#i.marij, fam(poi) display "deviance = " e(deviance) " df=" e(df) deviance = 15.783467 df=17 *MODEL 4h quietly glm count i.gender#i.race i.gender#i.alcohol i.gender#i.smoke /// i.race#i.alcohol i.race#i.smoke i.race#i.marij /// i.alcohol#i.smoke i.alcohol#i.marij i.smoke#i.marij, fam(poi) display "deviance = " e(deviance) " df=" e(df) deviance = 25.161008 df=17 *MODEL 4i quietly glm count i.gender#i.race i.gender#i.alcohol i.gender#i.smoke /// i.gender#i.marij i.race#i.alcohol i.race#i.smoke /// i.alcohol#i.smoke i.alcohol#i.marij i.smoke#i.marij, fam(poi) display "deviance = " e(deviance) " df=" e(df) deviance = 18.928935 df=17 *MODEL 5 quietly glm count i.alcohol#i.smoke i.alcohol#i.marij /// i.smoke#i.marij i.alcohol#i.gender i.alcohol#i.race /// i.gender#i.marij i.gender#i.race i.marij#i.race, fam(poi) display "deviance = " e(deviance) " df=" e(df) deviance = 16.73504 df=18 *MODEL 6 quietly glm count i.alcohol#i.smoke i.alcohol#i.marij /// i.smoke#i.marij i.alcohol#i.gender i.alcohol#i.race /// i.gender#i.marij i.gender#i.race, fam(poi) display "deviance = " e(deviance) " df=" e(df) deviance = 19.908587 df=19 *MODEL 7 quietly glm count i.alcohol#i.smoke i.alcohol#i.marij /// i.smoke#i.marij i.alcohol#i.gender i.alcohol#i.race /// i.gender#i.race, fam(poi) display "deviance = " e(deviance) " df=" e(df) deviance = 28.80508 df=20
Table 7.3 on page 181.
use https://stats.idre.ucla.edu/stat/stata/examples/icda/sex, clear glm count i.premar i.birth, fam(poi) Iteration 0: log likelihood = -111.95331 Iteration 1: log likelihood = -109.17365 Iteration 2: log likelihood = -109.16457 Iteration 3: log likelihood = -109.16457 Generalized linear models No. of obs = 16 Optimization : ML Residual df = 9 Scale parameter = 1 Deviance = 127.6529373 (1/df) Deviance = 14.18366 Pearson = 128.6835978 (1/df) Pearson = 14.29818 Variance function: V(u) = u [Poisson] Link function : g(u) = ln(u) [Log] AIC = 14.52057 Log likelihood = -109.164565 BIC = 102.6996 ------------------------------------------------------------------------------ | OIM count | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- premar | 2 | -.9767888 .1216605 -8.03 0.000 -1.215239 -.7383387 3 | -.3446024 .0988072 -3.49 0.000 -.538261 -.1509438 4 | .5092049 .0805088 6.32 0.000 .3514105 .6669993 | birth | 2 | .1885912 .1072271 1.76 0.079 -.02157 .3987523 3 | .7118393 .0968283 7.35 0.000 .5220592 .9016194 4 | .4565487 .1013576 4.50 0.000 .2578914 .6552061 | _cons | 3.747418 .0962184 38.95 0.000 3.558834 3.936003 ------------------------------------------------------------------------------ predict pred (option mu assumed; predicted mean count) predict resid, p predict h, h gen aresid = resid/sqrt(1-h) drop h table premar birth, cont(mean count mean pred mean aresid) ------------------------------------------------------ | birth premar | 1 2 3 4 ----------+------------------------------------------- 1 | 81 68 60 38 | 42.41145 51.21382 86.42332 66.9514 | 7.603182 3.076708 -4.116706 -4.83965 | 2 | 24 26 29 14 | 15.96868 19.28294 32.53996 25.20842 | 2.328322 1.811479 -.8114836 -2.756819 | 3 | 18 41 74 42 | 30.0486 36.2851 61.2311 47.4352 | -2.681746 .9762287 2.247295 -1.026372 | 4 | 36 57 161 157 | 70.57127 85.21814 143.8056 111.405 | -6.063329 -4.603856 2.384558 6.784545 ------------------------------------------------------
Table 7.4 on page 184.
glm count i.premar i.birth c.premar#c.birth, fam(poi) Iteration 0: log likelihood = -53.12473 Iteration 1: log likelihood = -51.110631 Iteration 2: log likelihood = -51.104943 Iteration 3: log likelihood = -51.104943 Generalized linear models No. of obs = 16 Optimization : ML Residual df = 8 Scale parameter = 1 Deviance = 11.53369221 (1/df) Deviance = 1.441712 Pearson = 11.5084629 (1/df) Pearson = 1.438558 Variance function: V(u) = u [Poisson] Link function : g(u) = ln(u) [Log] AIC = 7.388118 Log likelihood = -51.1049425 BIC = -10.64702 ------------------------------------------------------------------------------ | OIM count | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- premar | 2 | -1.645964 .1347333 -12.22 0.000 -1.910037 -1.381892 3 | -1.770023 .1646445 -10.75 0.000 -2.09272 -1.447326 4 | -1.753685 .2343172 -7.48 0.000 -2.212938 -1.294432 | birth | 2 | -.464105 .1195237 -3.88 0.000 -.6983672 -.2298428 3 | -.7245224 .1620067 -4.47 0.000 -1.04205 -.406995 4 | -1.879664 .2491019 -7.55 0.000 -2.367895 -1.391433 | c.premar#| c.birth | .2858355 .028238 10.12 0.000 .2304901 .3411809 | _cons | 4.106841 .0895105 45.88 0.000 3.931404 4.282279 ------------------------------------------------------------------------------ predict pred2 (option mu assumed; predicted mean count) table premar birth, cont(mean count mean pred2) -------------------------------------------------- | birth premar | 1 2 3 4 ----------+--------------------------------------- 1 | 81 68 60 38 | 80.85658 67.65406 69.39574 29.09363 | 2 | 24 26 29 14 | 20.75004 23.1065 31.5435 17.59996 | 3 | 18 41 74 42 | 24.3937 36.15178 65.68137 48.77315 | 4 | 36 57 161 157 | 32.99969 65.08765 157.3794 155.5333 --------------------------------------------------
Table 7.5 and calculation on page 186.
use https://stats.idre.ucla.edu/stat/stata/examples/icda/cmh, clear
The command tab3way can be downloaded from the internet by typing search tab3way in the command line (see How can I use the search command to search for programs and get additional help? for more information about using search).
tab3way income satisf female [fw=count] Frequency weights are based on the expression: count Table entries are cell frequencies Missing categories ignored ------------------------------------------------------------ | female and satisf | ---------- M --------- ---------- F --------- income | 1 3 4 5 1 3 4 5 ----------+------------------------------------------------- 3 | 1 1 2 1 1 3 11 2 10 | 3 5 1 2 3 17 3 20 | 7 3 1 8 5 35 | 1 9 6 2 4 2 ------------------------------------------------------------ quietly poisson count i.female##i.income i.female##i.satisf fitstat, saving(m0) Measures of Fit for poisson of count Log-Lik Intercept Only: -96.116 Log-Lik Full Model: -48.006 D(18): 96.012 LR(13): 96.220 Prob > LR: 0.000 McFadden's R2: 0.501 McFadden's Adj R2: 0.355 ML (Cox-Snell) R2: 0.951 Cragg-Uhler(Nagelkerke) R2: 0.953 AIC: 3.875 AIC*n: 124.012 BIC: 33.628 BIC': -51.165 BIC used by Stata: 144.532 AIC used by Stata: 124.012 (Indices saved in matrix fs_m0) quietly poisson count i.income##i.satisf i.female##i.income i.female##i.satisf fitstat, using(m0) Measures of Fit for poisson of count Current Saved Difference Model: poisson poisson N: 32 32 0 Log-Lik Intercept Only -96.116 -96.116 0.000 Log-Lik Full Model -41.868 -48.006 6.137 D 83.737(9) 96.012(18) 12.275(9) LR 108.495(22) 96.220(13) 12.275(9) Prob > LR 0.000 0.000 0.198 McFadden's R2 0.564 0.501 0.064 McFadden's Adj R2 0.325 0.355 -0.030 ML (Cox-Snell) R2 0.966 0.951 0.016 Cragg-Uhler(Nagelkerke) R2 0.969 0.953 0.016 AIC 4.054 3.875 0.179 AIC*n 129.737 124.012 5.725 BIC 52.545 33.628 18.917 BIC' -32.248 -51.165 18.917 BIC used by Stata 163.449 144.532 18.917 AIC used by Stata 129.737 124.012 5.725 Difference of 18.917 in BIC' provides very strong support for saved model. Note: p-value for difference in LR is only valid if models are nested. .
Section 7.3.4.
recode income (3 = 1)(10 = 2)(20 = 3)(35 = 4), gen(score1) (32 differences between income and score) gen score2 = 1 replace score2 = satisf - 1 if satisf ~=1 (24 real changes made) poisson count i.female##i.income i.female##i.satisf c.score1#c.score2 Iteration 0: log likelihood = -45.68759 Iteration 1: log likelihood = -44.488147 Iteration 2: log likelihood = -44.485004 Iteration 3: log likelihood = -44.485002 Poisson regression Number of obs = 32 LR chi2(14) = 103.26 Prob > chi2 = 0.0000 Log likelihood = -44.485002 Pseudo R2 = 0.5372 ------------------------------------------------------------------------------- count | Coef. Std. Err. z P>|z| [95% Conf. Interval] --------------+---------------------------------------------------------------- 1.female | 1.395172 1.208614 1.15 0.248 -.9736688 3.764013 | income | 10 | -.5137616 .6911784 -0.74 0.457 -1.868446 .8409231 20 | -1.586161 1.030348 -1.54 0.124 -3.605606 .4332843 35 | -2.360276 1.476453 -1.60 0.110 -5.25407 .5335184 | female#income | 1 10 | -.19793 .6440272 -0.31 0.759 -1.4602 1.06434 1 20 | -.8761533 .6673139 -1.31 0.189 -2.184064 .4317579 1 35 | -1.894792 .6889112 -2.75 0.006 -3.245033 -.5445512 | satisf | 3 | .7452832 1.12599 0.66 0.508 -1.461616 2.952182 4 | 1.233372 1.207738 1.02 0.307 -1.133751 3.600495 5 | -.7053711 1.552878 -0.45 0.650 -3.748956 2.338213 | female#satisf | 1 3 | -.3253637 1.28563 -0.25 0.800 -2.845153 2.194425 1 4 | -.1124099 1.201721 -0.09 0.925 -2.467741 2.242921 1 5 | -.3043782 1.271368 -0.24 0.811 -2.796214 2.187458 | c.score1#| c.score2 | .3878149 .1547002 2.51 0.012 .0846081 .6910216 | _cons | -1.354202 1.054484 -1.28 0.199 -3.420953 .7125493 ------------------------------------------------------------------------------- fitstat, using(m0) Measures of Fit for poisson of count Current Saved Difference Model: poisson poisson N: 32 32 0 Log-Lik Intercept Only -96.116 -96.116 0.000 Log-Lik Full Model -44.485 -48.006 3.521 D 88.970(17) 96.012(18) 7.042(1) LR 103.261(14) 96.220(13) 7.042(1) Prob > LR 0.000 0.000 0.008 McFadden's R2 0.537 0.501 0.037 McFadden's Adj R2 0.381 0.355 0.026 ML (Cox-Snell) R2 0.960 0.951 0.010 Cragg-Uhler(Nagelkerke) R2 0.963 0.953 0.010 AIC 3.718 3.875 -0.158 AIC*n 118.970 124.012 -5.042 BIC 30.052 33.628 -3.576 BIC' -54.741 -51.165 -3.576 BIC used by Stata 140.956 144.532 -3.576 AIC used by Stata 118.970 124.012 -5.042 Difference of 3.576 in BIC' provides positive support for current model. Note: p-value for difference in LR is only valid if models are nested. test c.score1#c.score2 ( 1) [count]c.score1#c.score2 = 0 chi2( 1) = 6.28 Prob > chi2 = 0.0122