This unit makes extensive use of the ipf (iterated proportional fitting) command written by Adrian Mander. Type search ipf in Stata to locate and install the command (see "How can I use the search command to search for programs and get additional help?" for more information about using search). We also use the glm command with the Poisson family and log link to obtain the loglinear model coefficients.
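The lines below sketch one way to get ipf onto a machine before running the examples. The `ssc install` line is an assumption: it presumes the command is also distributed through the SSC archive under the package name `ipf`, which this page does not state. If that line fails, follow the links that `search ipf` returns instead.

```stata
* Locate ipf through Stata's search facility, as described above
search ipf

* Assumption: the package is also available from SSC under the name "ipf"
ssc install ipf

* Confirm that Stata can now find the command
which ipf
```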
Table 6.1, page 147.
use https://stats.idre.ucla.edu/stat/stata/examples/icda/afterlife, clear

list

          gender   aftlife   freq
      1. females       yes    435
      2. females        no    147
      3.   males       yes    375
      4.   males        no    134

table gender aftlife [fw=freq], cont(freq)

    ----------------------
              | belief in
              | afterlife
       gender |   no   yes
    ----------+-----------
         male |  134   375
      females |  147   435
    ----------------------

ipf [fw=freq], fit(gender+aftlife) save(aftlif) exp nolog

    Deleting all matrices......

    Expansion of the various marginal models
    ----------------------------------------
    marginal model 1 varlist :  gender
    marginal model 2 varlist :  aftlife
    unique varlist  gender aftlife
    -------------------------------------------------------------------
    N.B. structural/sampling zeroes may lead to an incorrect df
    Residual degrees of freedom = 1

    Goodness of Fit Tests
    ---------------------
    df = 1
    Likelihood Ratio Statistic G^2 =   0.1620   p-value = 0.687
    Pearson Statistic X^2          =   0.1621   p-value = 0.687

       gender   aftlife       Efreq   Ofreq        prob
            0         0   131.09899     134   .12016406
            0         1   377.90101     375   .34638039
            1         0   149.90101     147   .13739781
            1         1   432.09899     435   .39605774

use aftlif, clear

table gender aftlife, cont(mean Efreq)

    --------------------------------
              |       aftlife
       gender |         0          1
    ----------+---------------------
            0 | 131.09899  377.90101
            1 | 149.90101  432.09899
    --------------------------------

generate lefreq = ln(Efreq)

table gender aftlife, cont(mean lefreq)

    ------------------------------
              |      aftlife
       gender |        0         1
    ----------+-------------------
            0 | 4.875953  5.934632
            1 | 5.009975  6.068655
    ------------------------------

use https://stats.idre.ucla.edu/stat/stata/examples/icda/afterlife, clear

glm freq gender aftlife, fam(pois) link(log)

    Generalized linear models                          No. of obs      =         4
    Optimization     : ML: Newton-Raphson              Residual df     =         1
                                                       Scale param     =         1
    Deviance         =  .1619951194                    (1/df) Deviance =  .1619951
    Pearson          =   .162083973                    (1/df) Pearson  =   .162084

    Variance function: V(u) = u                        [Poisson]
    Link function    : g(u) = ln(u)                    [Log]
    Standard errors  : OIM

    Log likelihood   = -14.70362649                    AIC             =  8.851813
    BIC              = -3.996887964

    ------------------------------------------------------------------------------
            freq |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          gender |   .1340224   .0606865     2.21   0.027     .0150791    .2529658
         aftlife |    1.05868   .0692336    15.29   0.000     .9229843    1.194375
           _cons |   4.875953   .0678732    71.84   0.000     4.742924    5.008982
    ------------------------------------------------------------------------------

generate g2 = ~gender
generate a2 = ~aftlife

glm freq g2 a2, fam(pois) link(log)

    Generalized linear models                          No. of obs      =         4
    Optimization     : ML: Newton-Raphson              Residual df     =         1
                                                       Scale param     =         1
    Deviance         =  .1619951194                    (1/df) Deviance =  .1619951
    Pearson          =   .162083973                    (1/df) Pearson  =   .162084

    Variance function: V(u) = u                        [Poisson]
    Link function    : g(u) = ln(u)                    [Log]
    Standard errors  : OIM

    Log likelihood   = -14.70362649                    AIC             =  8.851813
    BIC              = -3.996887964

    ------------------------------------------------------------------------------
            freq |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              g2 |  -.1340224   .0606865    -2.21   0.027    -.2529658   -.0150791
              a2 |   -1.05868   .0692336   -15.29   0.000    -1.194375   -.9229843
           _cons |   6.068655   .0451242   134.49   0.000     5.980213    6.157096
    ------------------------------------------------------------------------------

generate g3 = gender - g2
generate a3 = aftlife - a2

list

          gender   aftlife   freq   g2   a2   g3   a3
      1. females       yes    435    0    0    1    1
      2. females        no    147    0    1    1   -1
      3.    male       yes    375    1    0   -1    1
      4.    male        no    134    1    1   -1   -1

glm freq g3 a3, fam(pois) link(log)

    Generalized linear models                          No. of obs      =         4
    Optimization     : ML: Newton-Raphson              Residual df     =         1
                                                       Scale param     =         1
    Deviance         =  .1619951194                    (1/df) Deviance =  .1619951
    Pearson          =   .162083973                    (1/df) Pearson  =   .162084

    Variance function: V(u) = u                        [Poisson]
    Link function    : g(u) = ln(u)                    [Log]
    Standard errors  : OIM

    Log likelihood   = -14.70362649                    AIC             =  8.851813
    BIC              = -3.996887964

    ------------------------------------------------------------------------------
            freq |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              g3 |   .0670112   .0303432     2.21   0.027     .0075396    .1264829
              a3 |   .5293398   .0346168    15.29   0.000     .4614921    .5971874
           _cons |   5.472304   .0346763   157.81   0.000     5.404339    5.540268
    ------------------------------------------------------------------------------
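The three glm fits above share the same deviance and log likelihood because they are reparameterizations of one independence model: 0/1 coding (gender, aftlife), reversed 0/1 coding (g2, a2), and 1/-1 effect coding (g3, a3). As a worked check using only the printed coefficients, the fitted log count for the cell gender = 1, aftlife = 1 (females, yes) comes out the same under each coding, and equals the 6.068655 shown in the lefreq table built from the ipf fitted values. The symbol \(\hat\mu_{11}\) is introduced here for illustration only.

```latex
\begin{aligned}
\text{0/1 coding } (gender = 1,\ aftlife = 1):\quad
  \log\hat\mu_{11} &= 4.875953 + 0.1340224 + 1.05868 \approx 6.068655 \\
\text{reversed coding } (g2 = 0,\ a2 = 0):\quad
  \log\hat\mu_{11} &= 6.068655 \\
\text{effect coding } (g3 = 1,\ a3 = 1):\quad
  \log\hat\mu_{11} &= 5.472304 + 0.0670112 + 0.5293398 \approx 6.068655
\end{aligned}
```

Note also that each effect-coded coefficient is half the corresponding 0/1 coefficient (0.0670112 = 0.1340224 / 2), since the two levels are two units apart under 1/-1 coding.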
Table 6.3, page 152.
use https://stats.idre.ucla.edu/stat/stata/examples/icda/acm, clear

describe

    Contains data from acm.dta
      obs:             8
     vars:             4                          28 Nov 2001 14:28
     size:            72 (99.7% of memory free)
    -------------------------------------------------------------------------------
                  storage  display     value
    variable name   type   format      label      variable label
    -------------------------------------------------------------------------------
    a               byte   %8.0g       yn         alcohol use
    c               byte   %8.0g       yn         cigarette use
    m               byte   %8.0g       yn         marijuana use
    freq            int    %8.0g
    -------------------------------------------------------------------------------

list

            a     c     m   freq
      1.  yes    no   yes     44
      2.   no    no   yes      2
      3.   no   yes   yes      3
      4.  yes   yes   yes    911
      5.   no    no    no    279
      6.   no   yes    no     43
      7.  yes    no    no    456
      8.  yes   yes    no    538

table c m [fw=freq], by(a)

    ----------------------
    alcohol   |
    use and   | marijuana
    cigarette |    use
    use       |  yes    no
    ----------+-----------
    yes       |
          yes |  911   538
           no |   44   456
    ----------+-----------
    no        |
          yes |    3    43
           no |    2   279
    ----------------------
Table 6.4, page 152, output edited.
/* (A, C, M) */
ipf [fw=freq], fit(a+c+m) exp

        a   c   m       Efreq
        1   1   1   539.98258
        1   1   2   740.22612
        1   2   1   282.09123
        1   2   2   386.70007
        2   1   1   90.597385
        2   1   2   124.19392
        2   2   1   47.328801
        2   2   2   64.879898

/* (AC, M) */
ipf [fw=freq], fit(a*c+m) exp

        a   c   m       Efreq
        1   1   1    611.1775
        1   1   2    837.8225
        1   2   1   210.89631
        1   2   2   289.10369
        2   1   1    19.40246
        2   1   2    26.59754
        2   2   1   118.52373
        2   2   2   162.47627

/* (AM, CM) */
ipf [fw=freq], fit(a*m+c*m) exp

        a   m   c       Efreq
        1   1   1   909.23958
        1   1   2   45.760417
        1   2   1   438.84043
        1   2   2   555.15957
        2   1   1   4.7604167
        2   1   2   .23958333
        2   2   1   142.15957
        2   2   2   179.84043

/* (AC, AM, CM) */
ipf [fw=freq], fit(a*c+a*m+c*m) exp

        a   c   m       Efreq
        1   1   1   910.38316
        1   1   2   538.61683
        1   2   1   44.616829
        1   2   2   455.38327
        2   1   1   3.6168352
        2   1   2   42.383171
        2   2   1   1.3831706
        2   2   2   279.61673

/* (ACM) */
ipf [fw=freq], fit(a*c*m) exp

        a   c   m   Efreq
        1   1   1     911
        1   1   2     538
        1   2   1      44
        1   2   2     456
        2   1   1       3
        2   1   2      43
        2   2   1       2
        2   2   2     279
Table 6.6, page 155, output edited.
ipf [fw=freq], fit(a+c+m)

    df = 4
    Likelihood Ratio Statistic G^2 = 1286.0199   p-value = 0.000
    Pearson Statistic X^2          = 1411.3860   p-value = 0.000

ipf [fw=freq], fit(a+c*m)

    df = 3
    Likelihood Ratio Statistic G^2 =  534.2117   p-value = 0.000
    Pearson Statistic X^2          =  505.5977   p-value = 0.000

ipf [fw=freq], fit(c+a*m)

    df = 3
    Likelihood Ratio Statistic G^2 =  939.5626   p-value = 0.000
    Pearson Statistic X^2          =  824.1630   p-value = 0.000

ipf [fw=freq], fit(m+a*c)

    df = 3
    Likelihood Ratio Statistic G^2 =  843.8267   p-value = 0.000
    Pearson Statistic X^2          =  704.9071   p-value = 0.000

ipf [fw=freq], fit(a*c+a*m)

    df = 2
    Likelihood Ratio Statistic G^2 =  497.3693   p-value = 0.000
    Pearson Statistic X^2          =  443.7611   p-value = 0.000

ipf [fw=freq], fit(a*c+c*m)

    df = 2
    Likelihood Ratio Statistic G^2 =   92.0184   p-value = 0.000
    Pearson Statistic X^2          =   80.8148   p-value = 0.000

ipf [fw=freq], fit(a*m+c*m)

    df = 2
    Likelihood Ratio Statistic G^2 =  187.7543   p-value = 0.000
    Pearson Statistic X^2          =  177.6149   p-value = 0.000

ipf [fw=freq], fit(a*c+a*m+c*m)

    df = 1
    Likelihood Ratio Statistic G^2 =    0.3740   p-value = 0.541
    Pearson Statistic X^2          =    0.4011   p-value = 0.527

ipf [fw=freq], fit(a*c*m)

    df = 0
    Likelihood Ratio Statistic G^2 =    0.0000   p-value = .
    Pearson Statistic X^2          =    0.0000   p-value = .
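The goodness-of-fit statistics above can also be used to compare nested models, since the difference in G^2 between two nested loglinear models is itself a likelihood-ratio statistic on the difference in df. The sketch below uses Stata's `chi2tail()` function with the numbers printed above; the choice of which pair of models to compare is only an illustration, not something the original page does.

```stata
* p-value for the G^2 of (AC, AM, CM): 0.3740 on 1 df
* (should agree with the 0.541 that ipf prints)
display chi2tail(1, 0.3740)

* LR test for dropping the AC term: (AM, CM) versus (AC, AM, CM)
* G^2 difference = 187.7543 - 0.3740, df difference = 2 - 1 = 1
display 187.7543 - 0.3740
display chi2tail(1, 187.7543 - 0.3740)
```

A very small p-value on the last line would indicate that the simpler model (AM, CM) fits significantly worse than the model with all two-factor terms.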
Table 6.7, page 156.
use https://stats.idre.ucla.edu/stat/stata/examples/icda/acm, clear

gen ac=a*c
gen am=a*m
gen cm=c*m

glm freq a c m am cm, fam(poi)

    Iteration 0:   log likelihood = -306.78871
    Iteration 1:   log likelihood = -134.68656
    Iteration 2:   log likelihood = -119.80666
    Iteration 3:   log likelihood = -118.41883
    Iteration 4:   log likelihood = -118.39888
    Iteration 5:   log likelihood = -118.39887
    Iteration 6:   log likelihood = -118.39887

    Generalized linear models                          No. of obs      =         8
    Optimization     : ML: Newton-Raphson              Residual df     =         2
                                                       Scale parameter =         1
    Deviance         =  187.7543029                    (1/df) Deviance =  93.87715
    Pearson          =  177.6148606                    (1/df) Pearson  =  88.80743

    Variance function: V(u) = u                        [Poisson]
    Link function    : g(u) = ln(u)                    [Log]
    Standard errors  : OIM

    Log likelihood   = -118.3988656                    AIC             =  31.09972
    BIC              =  183.5954198

    ------------------------------------------------------------------------------
            freq |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               a |  -9.377361   .8990551   -10.43   0.000    -11.13948   -7.615246
               c |  -6.213498   .3072696   -20.22   0.000    -6.815735   -5.611261
               m |  -8.077869   .4938394   -16.36   0.000    -9.045777   -7.109962
              am |   4.125088   .4529445     9.11   0.000     3.237333    5.012843
              cm |   3.224309   .1609812    20.03   0.000     2.908792    3.539826
           _cons |   23.13194   .9652276    23.97   0.000     21.24013    25.02375
    ------------------------------------------------------------------------------

predict fit1

    (option mu assumed; predicted mean freq)

predict h1, h
predict res1, p
gen ares1 = res1/sqrt(1-h1)

glm freq a c m ac am cm, fam(poi)

    Iteration 0:   log likelihood = -142.34193
    Iteration 1:   log likelihood = -37.961044
    Iteration 2:   log likelihood = -25.867183
    Iteration 3:   log likelihood = -24.719804
    Iteration 4:   log likelihood = -24.708713
    Iteration 5:   log likelihood = -24.708707
    Iteration 6:   log likelihood = -24.708707

    Generalized linear models                          No. of obs      =         8
    Optimization     : ML: Newton-Raphson              Residual df     =         1
                                                       Scale parameter =         1
    Deviance         =  .3739858701                    (1/df) Deviance =  .3739859
    Pearson          =  .4011005168                    (1/df) Pearson  =  .4011005

    Variance function: V(u) = u                        [Poisson]
    Link function    : g(u) = ln(u)                    [Log]
    Standard errors  : OIM

    Log likelihood   = -24.70870712                    AIC             =  7.927177
    BIC              = -1.705455672

    ------------------------------------------------------------------------------
            freq |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               a |  -10.56882   .9109278   -11.60   0.000     -12.3542    -8.78343
               c |  -7.918178   .3476245   -22.78   0.000    -8.599509   -7.236846
               m |  -6.358765   .4957275   -12.83   0.000    -7.330373   -5.387157
              ac |   2.054534   .1740643    11.80   0.000     1.713374    2.395694
              am |   2.986014    .464678     6.43   0.000     2.075262    3.896767
              cm |   2.847889   .1638394    17.38   0.000      2.52677    3.169009
           _cons |   23.77119   .9484083    25.06   0.000     21.91234    25.63003
    ------------------------------------------------------------------------------

predict fit2

    (option mu assumed; predicted mean freq)

predict h2, h
predict res2, p
gen ares2 = res2/sqrt(1-h2)

list a c m freq fit1 fit2 ares1 ares2

         +----------------------------------------------------------------------+
         |   a     c     m   freq       fit1       fit2       ares1       ares2 |
         |----------------------------------------------------------------------|
      1. |  no    no   yes      2   .2395833    1.38317    3.695589    .6333249 |
      2. |  no   yes   yes      3   4.760417    3.61683   -3.695589   -.6333249 |
      3. |  no   yes    no     43   142.1596   42.38317   -12.80459    .6333254 |
      4. | yes    no   yes     44   45.76042   44.61683   -3.695596   -.6333249 |
      5. |  no    no    no    279   179.8404   279.6168    12.80459   -.6333253 |
         |----------------------------------------------------------------------|
      6. | yes    no    no    456   555.1595   455.3832   -12.80459    .6333241 |
      7. | yes   yes    no    538   438.8404   538.6168    12.80459   -.6333285 |
      8. | yes   yes   yes    911   909.2396   910.3832    3.695599    .6333305 |
         +----------------------------------------------------------------------+
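The variables ares1 and ares2 above are adjusted (standardized Pearson) residuals: the Pearson residual from `predict ..., p` divided by the square root of one minus the hat diagonal from `predict ..., h`. Written out for a Poisson loglinear model, with \(n_i\) the observed count, \(\hat\mu_i\) the fitted count, and \(h_i\) the hat value (symbols introduced here for illustration):

```latex
e_i = \frac{n_i - \hat\mu_i}{\sqrt{\hat\mu_i}},
\qquad
r_i = \frac{e_i}{\sqrt{1 - h_i}}
    = \frac{n_i - \hat\mu_i}{\sqrt{\hat\mu_i\,(1 - h_i)}}
```

As a rough check against the first listed row under model (AM, CM), the Pearson residual is \((2 - 0.2395833)/\sqrt{0.2395833} \approx 3.60\); dividing by \(\sqrt{1 - h_1}\), which is below 1, raises it to the listed ares1 of 3.695589. Values of \(|r_i|\) well above 2 or 3, as in every cell for fit1, point to lack of fit, whereas the ares2 values near 0.63 in magnitude do not.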
Table 6.8, page 159.
use https://stats.idre.ucla.edu/stat/stata/examples/icda/injury, clear

describe

    Contains data from injury.dta
      obs:            16
     vars:             5                          29 Nov 2001 08:11
     size:           160 (100.0% of memory free)
    -------------------------------------------------------------------------------
                  storage  display     value
    variable name   type   format      label      variable label
    -------------------------------------------------------------------------------
    g               byte   %8.0g       gen        gender
    l               byte   %8.0g       loc        location
    s               byte   %8.0g       yn         seat-belt
    j               byte   %8.0g       yn         injury
    freq            int    %8.0g
    -------------------------------------------------------------------------------

list

               g       l     s     j    freq
      1.  female   urban    no    no    7287
      2.  female   urban    no   yes     996
      3.  female   urban   yes    no   11587
      4.  female   urban   yes   yes     759
      5.  female   rural    no    no    3246
      6.  female   rural    no   yes     973
      7.  female   rural   yes    no    6134
      8.  female   rural   yes   yes     757
      9.    male   urban    no    no   10381
     10.    male   urban    no   yes     812
     11.    male   urban   yes    no   10969
     12.    male   urban   yes   yes     380
     13.    male   rural    no    no    6123
     14.    male   rural    no   yes    1084
     15.    male   rural   yes    no    6693
     16.    male   rural   yes   yes     513

table s j [fw=freq], by(g l)

    --------------------------
    gender,   |
    location  |
    and       |     injury
    seat-belt |     no     yes
    ----------+---------------
    female    |
    urban     |
           no |  7,287     996
          yes | 11,587     759
    ----------+---------------
    female    |
    rural     |
           no |  3,246     973
          yes |  6,134     757
    ----------+---------------
    male      |
    urban     |
           no | 10,381     812
          yes | 10,969     380
    ----------+---------------
    male      |
    rural     |
           no |  6,123   1,084
          yes |  6,693     513
    --------------------------

ipf [fw=freq], fit(g*j+g*l+g*s+j*l+j*s+l*s) exp save(inj2)

    Deleting all matrices......

    Expansion of the various marginal models
    ----------------------------------------
    marginal model 1 varlist :  g j
    marginal model 2 varlist :  g l
    marginal model 3 varlist :  g s
    marginal model 4 varlist :  j l
    marginal model 5 varlist :  j s
    marginal model 6 varlist :  l s
    unique varlist  g j l s
    -------------------------------------------------------------------
    N.B. structural/sampling zeroes may lead to an incorrect df
    Residual degrees of freedom = 13

    Goodness of Fit Tests
    ---------------------
    df = 13
    Likelihood Ratio Statistic G^2 =  23.3510   p-value = 0.038
    Pearson Statistic X^2          =  23.3752   p-value = 0.037

        g   j   l   s       Efreq   Ofreq        prob
        1   1   1   1   7166.3695    7287   .10432308
        1   1   1   2   11748.308   11587   .17102379
        1   1   2   1   3353.8303    3246   .04882275
        1   1   2   2   5985.4936    6134    .0871327
        1   2   1   1   993.01641     996   .01445565
        1   2   1   2   721.30528     759   .01050027
        1   2   2   1   988.78428     973   .01439404
        1   2   2   2   781.89238     757   .01138225
        2   1   1   1   10471.495   10381   .15243682
        2   1   1   2   10837.827   10969   .15776963
        2   1   2   1   6045.3055    6123    .0880034
        2   1   2   2   6811.3709    6693   .09915525
        2   2   1   1   845.11924     812   .01230266
        2   2   1   2   387.55922     380   .00564182
        2   2   2   1   1038.0799    1084   .01511165
        2   2   2   2    518.2432     513   .00754423

use inj2, clear

table s j, by(g l) cont(mean Efreq)

    --------------------------------
    g, l and  |          j
    s         |         1          2
    ----------+---------------------
    1         |
    1         |
            1 | 7166.3695  993.01641
            2 | 11748.308  721.30528
    ----------+---------------------
    1         |
    2         |
            1 | 3353.8303  988.78428
            2 | 5985.4936  781.89238
    ----------+---------------------
    2         |
    1         |
            1 | 10471.495  845.11924
            2 | 10837.827  387.55922
    ----------+---------------------
    2         |
    2         |
            1 | 6045.3055  1038.0799
            2 | 6811.3709   518.2432
    --------------------------------

use https://stats.idre.ucla.edu/stat/stata/examples/icda/injury, clear

ipf [fw=freq], fit(g*l*s+g*j+j*l+j*s) exp save(inj3)

    Deleting all matrices......

    Expansion of the various marginal models
    ----------------------------------------
    marginal model 1 varlist :  g l s
    marginal model 2 varlist :  g j
    marginal model 3 varlist :  j l
    marginal model 4 varlist :  j s
    unique varlist  g l s j
    -------------------------------------------------------------------
    N.B. structural/sampling zeroes may lead to an incorrect df
    Residual degrees of freedom = 12

    Goodness of Fit Tests
    ---------------------
    df = 12
    Likelihood Ratio Statistic G^2 =   7.4645   p-value = 0.825
    Pearson Statistic X^2          =   7.4874   p-value = 0.824

        g   l   s   j       Efreq   Ofreq        prob
        1   1   1   1   7273.2141    7287   .10587845
        1   1   1   2   1009.7858     996   .01469977
        1   1   2   1   11632.621   11587   .16933969
        1   1   2   2   713.37784     759   .01038486
        1   2   1   1   3254.6633    3246   .04737915
        1   2   1   2    964.3383     973   .01403817
        1   2   2   1    6093.502    6134   .08870501
        1   2   2   2   797.49773     757   .01160942
        2   1   1   1   10358.931   10381   .15079819
        2   1   1   2   834.06847     812    .0121418
        2   1   2   1   10959.234   10969   .15953699
        2   1   2   2   389.76793     380   .00567397
        2   2   1   1   6150.1915    6123   .08953026
        2   2   1   2   1056.8074    1084   .01538428
        2   2   2   1   6697.6432    6693   .09749968
        2   2   2   2    508.3565     513    .0074003

use inj3

table s j, by(g l) cont(mean Efreq)

    --------------------------------
    g, l and  |          j
    s         |         1          2
    ----------+---------------------
    1         |
    1         |
            1 | 7273.2141  1009.7858
            2 | 11632.621  713.37784
    ----------+---------------------
    1         |
    2         |
            1 | 3254.6633   964.3383
            2 |  6093.502  797.49773
    ----------+---------------------
    2         |
    1         |
            1 | 10358.931  834.06847
            2 | 10959.234  389.76793
    ----------+---------------------
    2         |
    2         |
            1 | 6150.1915  1056.8074
            2 | 6697.6432   508.3565
    --------------------------------
Table 6.9, page 160, output edited.
use https://stats.idre.ucla.edu/stat/stata/examples/icda/injury, clear

ipf [fw=freq], fit(g+j+l+s)

    df = 11
    Likelihood Ratio Statistic G^2 = 2792.7710   p-value = 0.000
    Pearson Statistic X^2          = 2758.3408   p-value = 0.000

ipf [fw=freq], fit(g*j+g*l+g*s+j*l+j*s+l*s)

    df = 13
    Likelihood Ratio Statistic G^2 =   23.3510   p-value = 0.038
    Pearson Statistic X^2          =   23.3752   p-value = 0.037

ipf [fw=freq], fit(g*j*l+g*j*s+g*l*s+j*l*s)

    df = 7
    Likelihood Ratio Statistic G^2 =    1.3253   p-value = 0.988
    Pearson Statistic X^2          =    1.3246   p-value = 0.988

ipf [fw=freq], fit(g*j*l+g*s+j*s+l*s)

    df = 10
    Likelihood Ratio Statistic G^2 =   18.5693   p-value = 0.046
    Pearson Statistic X^2          =   18.5391   p-value = 0.047

ipf [fw=freq], fit(g*j*s+g*l+j*l+l*s)

    df = 10
    Likelihood Ratio Statistic G^2 =   22.8468   p-value = 0.011
    Pearson Statistic X^2          =   22.8250   p-value = 0.011

ipf [fw=freq], fit(g*l*s+g*j+j*l+j*s)

    df = 12
    Likelihood Ratio Statistic G^2 =    7.4645   p-value = 0.825
    Pearson Statistic X^2          =    7.4874   p-value = 0.824

ipf [fw=freq], fit(j*l*s+g*j+g*l+g*s)

    df = 10
    Likelihood Ratio Statistic G^2 =   20.6334   p-value = 0.024
    Pearson Statistic X^2          =   20.6131   p-value = 0.024
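Residual degrees of freedom for these models can be cross-checked by hand as the number of cells minus the number of free loglinear parameters (one constant, four main effects, one parameter per two-factor or three-factor term, since every variable is binary). The count below is a worked sketch under the assumption that the table has no structural zeros. It reproduces the df = 11 printed for (G, J, L, S), but gives smaller values than ipf prints for the models with interaction terms, so the printed df and the p-values based on them are worth confirming against another source, for example the residual df that glm reports for the same model.

```latex
\begin{aligned}
\text{cells} &= 2^4 = 16 \\
(G,\,J,\,L,\,S):\quad df &= 16 - (1 + 4) = 11 \\
(GJ,\,GL,\,GS,\,JL,\,JS,\,LS):\quad df &= 16 - (1 + 4 + 6) = 5 \\
(GLS,\,GJ,\,JL,\,JS):\quad df &= 16 - (1 + 4 + 6 + 1) = 4 \\
(GJL,\,GJS,\,GLS,\,JLS):\quad df &= 16 - (1 + 4 + 6 + 4) = 1
\end{aligned}
```

The G^2 and X^2 statistics themselves do not depend on the df, so only the reference distribution used for the p-value changes.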