This page has been updated to work with Stata 12.
This chapter makes use of the user written program fitstat program, which is not part of base Stata. Prior to using the fitstat command, it needs to be downloaded by typing search fitstat in the command line (see How can I use the search command to search for programs and get additional help? for more information about using search).
Table 7.2 on page 178 using data on usage of alcohol, cigarettes and marijuana.
use https://stats.idre.ucla.edu/stat/stata/examples/icda/drug, clear
*MODEL 1
quietly glm count i.gender#i.race i.alcohol i.smoke i.marij, fam(poi)
display "deviance = " e(deviance) " df=" e(df)
deviance = 1325.1408 df=25
*MODEL 2
quietly glm count i.gender#i.race i.gender#i.alcohol i.gender#i.smoke ///
i.gender#i.marij i.race#i.alcohol i.race#i.smoke i.race#i.marij ///
i.alcohol#i.smoke i.alcohol#i.marij i.smoke#i.marij, fam(poi)
display "deviance = " e(deviance) " df=" e(df)
deviance = 15.340343 df=16
*MODEL 3
quietly glm count i.gender#i.race#i.alcohol i.gender#i.race#i.smoke ///
i.gender#i.race#i.marij i.gender#i.alcohol#i.smoke ///
i.gender#i.alcohol#i.marij i.gender#i.smoke#i.marij ///
i.race#i.alcohol#i.smoke i.race#i.alcohol#i.marij ///
i.race#i.smoke#i.marij i.alcohol#i.smoke#i.marij, fam(poi)
display "deviance = " e(deviance) " df=" e(df)
deviance = 5.2720008 df=6
*MODEL 4a
quietly glm count i.gender#i.race i.gender#i.alcohol i.gender#i.smoke ///
i.gender#i.marij i.race#i.alcohol i.race#i.smoke i.race#i.marij ///
i.alcohol#i.marij i.smoke#i.marij, fam(poi)
display "deviance = " e(deviance) " df=" e(df)
deviance = 201.19931 df=17
*MODEL 4b
quietly glm count i.gender#i.race i.gender#i.alcohol i.gender#i.smoke ///
i.gender#i.marij i.race#i.alcohol i.race#i.smoke i.race#i.marij ///
i.alcohol#i.smoke i.smoke#i.marij, fam(poi)
display "deviance = " e(deviance) " df=" e(df)
deviance = 106.958 df=17
*MODEL 4c
quietly glm count i.gender#i.race i.gender#i.alcohol i.gender#i.smoke ///
i.gender#i.marij i.race#i.alcohol i.race#i.smoke i.race#i.marij ///
i.alcohol#i.smoke i.alcohol#i.marij, fam(poi)
display "deviance = " e(deviance) " df=" e(df)
deviance = 513.47218 df=17
*MODEL 4d
quietly glm count i.gender#i.race i.gender#i.smoke ///
i.gender#i.marij i.race#i.alcohol i.race#i.smoke i.race#i.marij ///
i.alcohol#i.smoke i.alcohol#i.marij i.smoke#i.marij, fam(poi)
display "deviance = " e(deviance) " df=" e(df)
deviance = 18.716951 df=17
*MODEL 4e
quietly glm count i.gender#i.race i.gender#i.alcohol i.gender#i.smoke ///
i.gender#i.marij i.race#i.smoke i.race#i.marij ///
i.alcohol#i.smoke i.alcohol#i.marij i.smoke#i.marij, fam(poi)
display "deviance = " e(deviance) " df=" e(df)
deviance = 20.320861 df=17
*MODEL 4f
quietly glm count i.gender#i.race i.gender#i.alcohol ///
i.gender#i.marij i.race#i.alcohol i.race#i.smoke i.race#i.marij ///
i.alcohol#i.smoke i.alcohol#i.marij i.smoke#i.marij, fam(poi)
display "deviance = " e(deviance) " df=" e(df)
deviance = 16.317184 df=17
*MODEL 4g
quietly glm count i.gender#i.race i.gender#i.alcohol i.gender#i.smoke ///
i.gender#i.marij i.race#i.alcohol i.race#i.marij ///
i.alcohol#i.smoke i.alcohol#i.marij i.smoke#i.marij, fam(poi)
display "deviance = " e(deviance) " df=" e(df)
deviance = 15.783467 df=17
*MODEL 4h
quietly glm count i.gender#i.race i.gender#i.alcohol i.gender#i.smoke ///
i.race#i.alcohol i.race#i.smoke i.race#i.marij ///
i.alcohol#i.smoke i.alcohol#i.marij i.smoke#i.marij, fam(poi)
display "deviance = " e(deviance) " df=" e(df)
deviance = 25.161008 df=17
*MODEL 4i
quietly glm count i.gender#i.race i.gender#i.alcohol i.gender#i.smoke ///
i.gender#i.marij i.race#i.alcohol i.race#i.smoke ///
i.alcohol#i.smoke i.alcohol#i.marij i.smoke#i.marij, fam(poi)
display "deviance = " e(deviance) " df=" e(df)
deviance = 18.928935 df=17
*MODEL 5
quietly glm count i.alcohol#i.smoke i.alcohol#i.marij ///
i.smoke#i.marij i.alcohol#i.gender i.alcohol#i.race ///
i.gender#i.marij i.gender#i.race i.marij#i.race, fam(poi)
display "deviance = " e(deviance) " df=" e(df)
deviance = 16.73504 df=18
*MODEL 6
quietly glm count i.alcohol#i.smoke i.alcohol#i.marij ///
i.smoke#i.marij i.alcohol#i.gender i.alcohol#i.race ///
i.gender#i.marij i.gender#i.race, fam(poi)
display "deviance = " e(deviance) " df=" e(df)
deviance = 19.908587 df=19
*MODEL 7
quietly glm count i.alcohol#i.smoke i.alcohol#i.marij ///
i.smoke#i.marij i.alcohol#i.gender i.alcohol#i.race ///
i.gender#i.race, fam(poi)
display "deviance = " e(deviance) " df=" e(df)
deviance = 28.80508 df=20
Table 7.3 on page 181.
use https://stats.idre.ucla.edu/stat/stata/examples/icda/sex, clear
glm count i.premar i.birth, fam(poi)
Iteration 0: log likelihood = -111.95331
Iteration 1: log likelihood = -109.17365
Iteration 2: log likelihood = -109.16457
Iteration 3: log likelihood = -109.16457
Generalized linear models No. of obs = 16
Optimization : ML Residual df = 9
Scale parameter = 1
Deviance = 127.6529373 (1/df) Deviance = 14.18366
Pearson = 128.6835978 (1/df) Pearson = 14.29818
Variance function: V(u) = u [Poisson]
Link function : g(u) = ln(u) [Log]
AIC = 14.52057
Log likelihood = -109.164565 BIC = 102.6996
------------------------------------------------------------------------------
| OIM
count | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
premar |
2 | -.9767888 .1216605 -8.03 0.000 -1.215239 -.7383387
3 | -.3446024 .0988072 -3.49 0.000 -.538261 -.1509438
4 | .5092049 .0805088 6.32 0.000 .3514105 .6669993
|
birth |
2 | .1885912 .1072271 1.76 0.079 -.02157 .3987523
3 | .7118393 .0968283 7.35 0.000 .5220592 .9016194
4 | .4565487 .1013576 4.50 0.000 .2578914 .6552061
|
_cons | 3.747418 .0962184 38.95 0.000 3.558834 3.936003
------------------------------------------------------------------------------
predict pred
(option mu assumed; predicted mean count)
predict resid, p
predict h, h
gen aresid = resid/sqrt(1-h)
drop h
table premar birth, cont(mean count mean pred mean aresid)
------------------------------------------------------
| birth
premar | 1 2 3 4
----------+-------------------------------------------
1 | 81 68 60 38
| 42.41145 51.21382 86.42332 66.9514
| 7.603182 3.076708 -4.116706 -4.83965
|
2 | 24 26 29 14
| 15.96868 19.28294 32.53996 25.20842
| 2.328322 1.811479 -.8114836 -2.756819
|
3 | 18 41 74 42
| 30.0486 36.2851 61.2311 47.4352
| -2.681746 .9762287 2.247295 -1.026372
|
4 | 36 57 161 157
| 70.57127 85.21814 143.8056 111.405
| -6.063329 -4.603856 2.384558 6.784545
------------------------------------------------------
Table 7.4 on page 184.
glm count i.premar i.birth c.premar#c.birth, fam(poi)
Iteration 0: log likelihood = -53.12473
Iteration 1: log likelihood = -51.110631
Iteration 2: log likelihood = -51.104943
Iteration 3: log likelihood = -51.104943
Generalized linear models No. of obs = 16
Optimization : ML Residual df = 8
Scale parameter = 1
Deviance = 11.53369221 (1/df) Deviance = 1.441712
Pearson = 11.5084629 (1/df) Pearson = 1.438558
Variance function: V(u) = u [Poisson]
Link function : g(u) = ln(u) [Log]
AIC = 7.388118
Log likelihood = -51.1049425 BIC = -10.64702
------------------------------------------------------------------------------
| OIM
count | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
premar |
2 | -1.645964 .1347333 -12.22 0.000 -1.910037 -1.381892
3 | -1.770023 .1646445 -10.75 0.000 -2.09272 -1.447326
4 | -1.753685 .2343172 -7.48 0.000 -2.212938 -1.294432
|
birth |
2 | -.464105 .1195237 -3.88 0.000 -.6983672 -.2298428
3 | -.7245224 .1620067 -4.47 0.000 -1.04205 -.406995
4 | -1.879664 .2491019 -7.55 0.000 -2.367895 -1.391433
|
c.premar#|
c.birth | .2858355 .028238 10.12 0.000 .2304901 .3411809
|
_cons | 4.106841 .0895105 45.88 0.000 3.931404 4.282279
------------------------------------------------------------------------------
predict pred2
(option mu assumed; predicted mean count)
table premar birth, cont(mean count mean pred2)
--------------------------------------------------
| birth
premar | 1 2 3 4
----------+---------------------------------------
1 | 81 68 60 38
| 80.85658 67.65406 69.39574 29.09363
|
2 | 24 26 29 14
| 20.75004 23.1065 31.5435 17.59996
|
3 | 18 41 74 42
| 24.3937 36.15178 65.68137 48.77315
|
4 | 36 57 161 157
| 32.99969 65.08765 157.3794 155.5333
--------------------------------------------------
Table 7.5 and calculation on page 186.
use https://stats.idre.ucla.edu/stat/stata/examples/icda/cmh, clear
The command tab3way can be downloaded from the internet by typing search tab3way in the command line (see How can I use the search command to search for programs and get additional help? for more information about using search).
tab3way income satisf female [fw=count]
Frequency weights are based on the expression: count
Table entries are cell frequencies
Missing categories ignored
------------------------------------------------------------
| female and satisf
| ---------- M --------- ---------- F ---------
income | 1 3 4 5 1 3 4 5
----------+-------------------------------------------------
3 | 1 1 2 1 1 3 11 2
10 | 3 5 1 2 3 17 3
20 | 7 3 1 8 5
35 | 1 9 6 2 4 2
------------------------------------------------------------
quietly poisson count i.female##i.income i.female##i.satisf
fitstat, saving(m0)
Measures of Fit for poisson of count
Log-Lik Intercept Only: -96.116 Log-Lik Full Model: -48.006
D(18): 96.012 LR(13): 96.220
Prob > LR: 0.000
McFadden's R2: 0.501 McFadden's Adj R2: 0.355
ML (Cox-Snell) R2: 0.951 Cragg-Uhler(Nagelkerke) R2: 0.953
AIC: 3.875 AIC*n: 124.012
BIC: 33.628 BIC': -51.165
BIC used by Stata: 144.532 AIC used by Stata: 124.012
(Indices saved in matrix fs_m0)
quietly poisson count i.income##i.satisf i.female##i.income i.female##i.satisf
fitstat, using(m0)
Measures of Fit for poisson of count
Current Saved Difference
Model: poisson poisson
N: 32 32 0
Log-Lik Intercept Only -96.116 -96.116 0.000
Log-Lik Full Model -41.868 -48.006 6.137
D 83.737(9) 96.012(18) 12.275(9)
LR 108.495(22) 96.220(13) 12.275(9)
Prob > LR 0.000 0.000 0.198
McFadden's R2 0.564 0.501 0.064
McFadden's Adj R2 0.325 0.355 -0.030
ML (Cox-Snell) R2 0.966 0.951 0.016
Cragg-Uhler(Nagelkerke) R2 0.969 0.953 0.016
AIC 4.054 3.875 0.179
AIC*n 129.737 124.012 5.725
BIC 52.545 33.628 18.917
BIC' -32.248 -51.165 18.917
BIC used by Stata 163.449 144.532 18.917
AIC used by Stata 129.737 124.012 5.725
Difference of 18.917 in BIC' provides very strong support for saved model.
Note: p-value for difference in LR is only valid if models are nested.
.
Section 7.3.4.
recode income (3 = 1)(10 = 2)(20 = 3)(35 = 4), gen(score1)
(32 differences between income and score)
gen score2 = 1
replace score2 = satisf - 1 if satisf ~=1
(24 real changes made)
poisson count i.female##i.income i.female##i.satisf c.score1#c.score2
Iteration 0: log likelihood = -45.68759
Iteration 1: log likelihood = -44.488147
Iteration 2: log likelihood = -44.485004
Iteration 3: log likelihood = -44.485002
Poisson regression Number of obs = 32
LR chi2(14) = 103.26
Prob > chi2 = 0.0000
Log likelihood = -44.485002 Pseudo R2 = 0.5372
-------------------------------------------------------------------------------
count | Coef. Std. Err. z P>|z| [95% Conf. Interval]
--------------+----------------------------------------------------------------
1.female | 1.395172 1.208614 1.15 0.248 -.9736688 3.764013
|
income |
10 | -.5137616 .6911784 -0.74 0.457 -1.868446 .8409231
20 | -1.586161 1.030348 -1.54 0.124 -3.605606 .4332843
35 | -2.360276 1.476453 -1.60 0.110 -5.25407 .5335184
|
female#income |
1 10 | -.19793 .6440272 -0.31 0.759 -1.4602 1.06434
1 20 | -.8761533 .6673139 -1.31 0.189 -2.184064 .4317579
1 35 | -1.894792 .6889112 -2.75 0.006 -3.245033 -.5445512
|
satisf |
3 | .7452832 1.12599 0.66 0.508 -1.461616 2.952182
4 | 1.233372 1.207738 1.02 0.307 -1.133751 3.600495
5 | -.7053711 1.552878 -0.45 0.650 -3.748956 2.338213
|
female#satisf |
1 3 | -.3253637 1.28563 -0.25 0.800 -2.845153 2.194425
1 4 | -.1124099 1.201721 -0.09 0.925 -2.467741 2.242921
1 5 | -.3043782 1.271368 -0.24 0.811 -2.796214 2.187458
|
c.score1#|
c.score2 | .3878149 .1547002 2.51 0.012 .0846081 .6910216
|
_cons | -1.354202 1.054484 -1.28 0.199 -3.420953 .7125493
-------------------------------------------------------------------------------
fitstat, using(m0)
Measures of Fit for poisson of count
Current Saved Difference
Model: poisson poisson
N: 32 32 0
Log-Lik Intercept Only -96.116 -96.116 0.000
Log-Lik Full Model -44.485 -48.006 3.521
D 88.970(17) 96.012(18) 7.042(1)
LR 103.261(14) 96.220(13) 7.042(1)
Prob > LR 0.000 0.000 0.008
McFadden's R2 0.537 0.501 0.037
McFadden's Adj R2 0.381 0.355 0.026
ML (Cox-Snell) R2 0.960 0.951 0.010
Cragg-Uhler(Nagelkerke) R2 0.963 0.953 0.010
AIC 3.718 3.875 -0.158
AIC*n 118.970 124.012 -5.042
BIC 30.052 33.628 -3.576
BIC' -54.741 -51.165 -3.576
BIC used by Stata 140.956 144.532 -3.576
AIC used by Stata 118.970 124.012 -5.042
Difference of 3.576 in BIC' provides positive support for current model.
Note: p-value for difference in LR is only valid if models are nested.
test c.score1#c.score2
( 1) [count]c.score1#c.score2 = 0
chi2( 1) = 6.28
Prob > chi2 = 0.0122
