So far in this course we have analyzed data in which the response variable has exactly two levels, but what about situations in which there are more than two levels? In this chapter of Logistic Regression with Stata, we cover the commands used for multinomial and ordered logistic regression, which allow for more than two response categories. Multinomial response models have much in common with the logistic regression models that we have covered so far. However, you will find differences in some of the assumptions, in the analyses and in the interpretation of these models.
4.2 Ordered Logistic Regression
4.2.1 Example 1
Let's begin our discussion of ordered logistic regression with an example that has a binary outcome variable, honcomp, which indicates whether a student is enrolled in an "honors composition" course. We begin with an ordinary logistic regression.
use https://stats.idre.ucla.edu/stat/stata/webbooks/logistic/hsblog, clear

logit honcomp female

Logit estimates                                   Number of obs   =        200
                                                  LR chi2(1)      =       3.94
                                                  Prob > chi2     =     0.0473
Log likelihood = -113.6769                        Pseudo R2       =     0.0170

------------------------------------------------------------------------------
     honcomp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      female |   .6513707   .3336752     1.95   0.051    -.0026207    1.305362
       _cons |  -1.400088   .2631619    -5.32   0.000    -1.915876   -.8842998
------------------------------------------------------------------------------

Next, we will run an ordered logistic regression for the same model using Stata's ologit command.
ologit honcomp female

Ordered logit estimates                           Number of obs   =        200
                                                  LR chi2(1)      =       3.94
                                                  Prob > chi2     =     0.0473
Log likelihood = -113.6769                        Pseudo R2       =     0.0170

------------------------------------------------------------------------------
     honcomp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      female |   .6513707   .3336752     1.95   0.051    -.0026207    1.305362
-------------+----------------------------------------------------------------
       _cut1 |   1.400088   .2631619          (Ancillary parameter)
------------------------------------------------------------------------------

As you can see, the values of the coefficients and the standard errors are the same, except that the sign of _cut1 is reversed from that of _cons. We will explain shortly what _cut1 is, although it is already clear that it is related to the constant found in the logistic regression model.
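For a binary outcome the two parameterizations give identical predicted probabilities: logit uses P(y=1) = invlogit(_cons + b*x), while ologit uses P(y=1) = invlogit(b*x - _cut1). As a quick numeric cross-check (in Python rather than Stata, purely as a calculator), using the estimates above:

```python
import math

def invlogit(z):
    """Inverse logit (logistic) function: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

b_female = 0.6513707   # coefficient on female from both models
cons     = -1.400088   # _cons from logit
cut1     = 1.400088    # _cut1 from ologit (same magnitude, opposite sign)

for female in (0, 1):
    p_logit  = invlogit(cons + b_female * female)   # binary logit probability
    p_ologit = invlogit(b_female * female - cut1)   # ologit probability of y=1
    print(female, round(p_logit, 6), round(p_ologit, 6))
```

The two columns match exactly, confirming that with a binary response _cut1 is simply the negative of the logit constant.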
4.2.2 Example 2
For our next example we will select ses as the response variable. It has three ordered categories. Here are the frequencies for each of the categories.

tabulate ses

        ses |      Freq.     Percent        Cum.
------------+-----------------------------------
        low |         47       23.50       23.50
     middle |         95       47.50       71.00
       high |         58       29.00      100.00
------------+-----------------------------------
      Total |        200      100.00

We can also obtain much of the same information using the codebook command.
codebook ses

ses ------------------------------------------------------------- (unlabeled)

                  type:  numeric (float)
                 label:  sl

                 range:  [1,3]                     units:  1
         unique values:  3                 coded missing:  0 / 200

            tabulation:  Freq.   Numeric  Label
                            47         1  low
                            95         2  middle
                            58         3  high

For a predictor variable we will use academic, a dummy variable indicating whether or not students are in an academic program. Here is the ordered logistic model predicting ses using academic.
ologit ses academic

Ordered logit estimates                           Number of obs   =        200
                                                  LR chi2(1)      =      11.83
                                                  Prob > chi2     =     0.0006
Log likelihood = -204.66504                       Pseudo R2       =     0.0281

------------------------------------------------------------------------------
         ses |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    academic |   .9299309   .2745004     3.39   0.001       .39192    1.467942
-------------+----------------------------------------------------------------
       _cut1 |  -.7643189   .2042487          (Ancillary parameters)
       _cut2 |    1.41461    .225507
------------------------------------------------------------------------------

The format of these results may seem confusing at first. What isn't clear from the output is that ordered logistic regression is a multi-equation model. In this example there are two equations, each with the same coefficient for academic. This is known as the proportional odds model. Ordered logistic regression models that do not assume proportional odds have their own constants and coefficients for each of the k-1 equations.
In our example, the results are formatted like a single equation model when, in fact, this is a two equation model because there are three levels of ses. In ordered logistic regression, Stata sets the constant to zero and estimates the cut points for separating the various levels of the response variable. Other programs may parameterize the model differently by estimating the constant and setting the first cut point to zero. In order to show the multi-equation nature of this model, we will redisplay the results in a different format.
/* output showing the multi-equation nature of ordered logistic regression */

------------------------------------------------------------------------------
         ses |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
low          |
    academic |   .9299306   .2745004     3.39   0.001     .3919197    1.467941
       _cons |   .7643188   .2042487     3.74   0.000     .3639987    1.164639
-------------+----------------------------------------------------------------
middle       |
    academic |   .9299306   .2745004     3.39   0.001     .3919197    1.467941
       _cons |  -1.414609    .225507    -6.27   0.000    -1.856595   -.9726238
------------------------------------------------------------------------------
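In the proportional odds parameterization, the cut points define cumulative probabilities: P(ses > category k | x) = invlogit(xb - cut_k). The following Python sketch (again just arithmetic on the Stata estimates above) recovers the three category probabilities for each value of academic:

```python
import math

def invlogit(z):
    """Inverse logit (logistic) function."""
    return 1.0 / (1.0 + math.exp(-z))

b_academic = 0.9299309          # single coefficient shared by both equations
cut1, cut2 = -0.7643189, 1.41461  # _cut1, _cut2 from ologit

for academic in (0, 1):
    xb = b_academic * academic
    p_gt_low = invlogit(xb - cut1)   # P(ses > low)  = P(middle or high)
    p_gt_mid = invlogit(xb - cut2)   # P(ses > middle) = P(high)
    p_low    = 1 - p_gt_low
    p_middle = p_gt_low - p_gt_mid
    p_high   = p_gt_mid
    print(academic, round(p_low, 4), round(p_middle, 4), round(p_high, 4))
```

The three probabilities sum to one within each group, and both implied equations share the single coefficient on academic; that shared coefficient is exactly the proportional odds restriction.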
With ordered logistic regression there are other possible methods that do not involve the proportional odds assumption. There is a program omodel (available from the Stata website) which can be used to test the proportional odds assumption. You can download omodel from within Stata by typing search omodel (see How can I use the search command to search for programs and get additional help? for more information about using search).
omodel logit ses academic

Ordered logit estimates                           Number of obs   =        200
                                                  LR chi2(1)      =      11.83
                                                  Prob > chi2     =     0.0006
Log likelihood = -204.66504                       Pseudo R2       =     0.0281

------------------------------------------------------------------------------
         ses |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    academic |   .9299309   .2745004     3.39   0.001       .39192    1.467942
-------------+----------------------------------------------------------------
       _cut1 |  -.7643189   .2042487          (Ancillary parameters)
       _cut2 |    1.41461    .225507
------------------------------------------------------------------------------
Approximate likelihood-ratio test of proportionality of odds
across response categories:
        chi2(1) =      2.01
    Prob > chi2 =    0.1563
These results suggest that the proportional odds approach is reasonable, since the chi-square test is not significant. If the test of proportionality had been significant, we could have tried the gologit2 program by Richard Williams of the University of Notre Dame. You can download gologit2 from within Stata by typing search gologit2 (see How can I use the search command to search for programs and get additional help? for more information about using search). gologit2 with the npl option does not assume proportional odds. Let's try it, just for "fun."
gologit2 ses academic, npl

Generalized Ordered Logit Estimates               Number of obs   =        200
                                                  LR chi2(2)      =      13.83
                                                  Prob > chi2     =     0.0010
Log likelihood = -203.66708                       Pseudo R2       =     0.0328

------------------------------------------------------------------------------
         ses |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
low          |
    academic |   .6374202   .3389678     1.88   0.060    -.0269444    1.301785
       _cons |   .8724881   .2250326     3.88   0.000     .4314324    1.313544
-------------+----------------------------------------------------------------
middle       |
    academic |   1.191394   .3388816     3.52   0.000     .5271982     1.85559
       _cons |  -1.596859     .27415    -5.82   0.000    -2.134183   -1.059535
------------------------------------------------------------------------------
These results clearly show the multiple equation nature of ordered logistic regression with different constants, coefficients and standard errors.
The gologit2 command provides us with an alternative method for testing the proportionality assumption. If the assumption of proportional odds is tenable then there should not be a significant difference between the coefficients for academic in the two equations. The test command computes a Wald test across the two equations.
test [low=middle]

 ( 1)  [low]academic - [middle]academic = 0

           chi2(  1) =    1.98
         Prob > chi2 =    0.1595
The results of this Wald test of proportionality are very similar to those found using the omodel command.
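A third way to see the same result is a likelihood-ratio test comparing the constrained ologit fit to the unconstrained gologit2 fit, using the two log likelihoods reported above. A quick Python check (Python here is just a calculator; the models themselves were fit in Stata):

```python
# LR test of the proportional odds restriction: twice the gap in
# log likelihoods, with df equal to the number of extra parameters
# in the unconstrained model (one extra coefficient here).
ll_ologit  = -204.66504   # proportional odds model (constrained)
ll_gologit = -203.66708   # generalized ordered logit (unconstrained)

lr_chi2 = 2 * (ll_gologit - ll_ologit)
print(round(lr_chi2, 2))
```

The statistic is about 2.0, in line with the 2.01 from omodel's approximate test and the 1.98 from the Wald test above.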
Let's rerun the ologit command followed by the listcoef and fitstat commands.
ologit ses academic

Ordered logit estimates                           Number of obs   =        200
                                                  LR chi2(1)      =      11.83
                                                  Prob > chi2     =     0.0006
Log likelihood = -204.66504                       Pseudo R2       =     0.0281

------------------------------------------------------------------------------
         ses |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    academic |   .9299309   .2745004     3.39   0.001       .39192    1.467942
-------------+----------------------------------------------------------------
       _cut1 |  -.7643189   .2042487          (Ancillary parameters)
       _cut2 |    1.41461    .225507
------------------------------------------------------------------------------
listcoef
ologit (N=200): Factor Change in Odds
Odds of: >m vs <=m
----------------------------------------------------------------------
         ses |       b         z     P>|z|      e^b   e^bStdX    SDofX
-------------+--------------------------------------------------------
    academic |   0.92993    3.388    0.001   2.5343    1.5929   0.5006
----------------------------------------------------------------------
fitstat
Measures of Fit for ologit of ses
Log-Lik Intercept Only:      -210.583   Log-Lik Full Model:       -204.665
D(197):                       409.330   LR(1):                      11.835
                                        Prob > LR:                   0.000
McFadden's R2:                  0.028   McFadden's Adj R2:           0.014
Maximum Likelihood R2:          0.057   Cragg & Uhler's R2:          0.065
McKelvey and Zavoina's R2:      0.062
Variance of y*:                 3.507   Variance of error:           3.290
Count R2:                       0.475   Adj Count R2:                0.000
AIC:                            2.077   AIC*n:                     415.330
BIC:                         -634.438   BIC':                       -6.537
From the listcoef output, we see that the relative risk ratio for academic is approximately 2.5, which means that the odds of being in high ses versus middle and low ses are 2.5 times greater for students in the academic program. The same relative risk ratio applies to the comparison of middle and high ses versus low ses.
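The e^b column that listcoef reports is just the exponentiated coefficient, and e^bStdX is the coefficient scaled by the predictor's standard deviation before exponentiating. A short Python check against the values above:

```python
import math

b_academic = 0.9299309   # ologit coefficient on academic
sd_academic = 0.5006     # SDofX reported by listcoef

odds_ratio = math.exp(b_academic)               # listcoef's e^b
std_odds_ratio = math.exp(b_academic * sd_academic)  # listcoef's e^bStdX

print(round(odds_ratio, 4), round(std_odds_ratio, 4))
```

Both values agree with listcoef's 2.5343 and 1.5929 (the latter up to rounding of SDofX).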
4.2.3 Example 3
The variable academic that we used in the previous example is a dichotomization of the three category variable prog (program type). Let's look at the frequencies for each of the levels of prog and create dummy coded variables at the same time using the tabulate command.
tabulate prog, generate(prog)

    type of |
    program |      Freq.     Percent        Cum.
------------+-----------------------------------
    general |         45       22.50       22.50
   academic |        105       52.50       75.00
   vocation |         50       25.00      100.00
------------+-----------------------------------
      Total |        200      100.00
Now we can use prog1 and prog3 in an ordered logistic regression so that the academic group will be our comparison group.
ologit ses prog1 prog3

Ordered logit estimates                           Number of obs   =        200
                                                  LR chi2(2)      =      12.06
                                                  Prob > chi2     =     0.0024
Log likelihood = -204.55398                       Pseudo R2       =     0.0286

------------------------------------------------------------------------------
         ses |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       prog1 |  -1.030315   .3479667    -2.96   0.003    -1.712317   -.3483126
       prog3 |  -.8500258   .3223129    -2.64   0.008    -1.481747   -.2183042
-------------+----------------------------------------------------------------
       _cut1 |  -1.695676   .2334022          (Ancillary parameters)
       _cut2 |   .4852592    .195606
------------------------------------------------------------------------------
Individually, prog1 and prog3 are statistically significant, and we can determine from the likelihood ratio chi-square (chi2(2) = 12.06) that they are jointly significant, i.e., that the variable prog is significant.
We will follow this analysis with the omodel command to check on the proportional odds assumption.
omodel logit ses prog1 prog3

Ordered logit estimates                           Number of obs   =        200
                                                  LR chi2(2)      =      12.06
                                                  Prob > chi2     =     0.0024
Log likelihood = -204.55398                       Pseudo R2       =     0.0286

------------------------------------------------------------------------------
         ses |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       prog1 |  -1.030315   .3479667    -2.96   0.003    -1.712317   -.3483126
       prog3 |  -.8500258   .3223129    -2.64   0.008    -1.481747   -.2183042
-------------+----------------------------------------------------------------
       _cut1 |  -1.695676   .2334022          (Ancillary parameters)
       _cut2 |   .4852592    .195606
------------------------------------------------------------------------------
Approximate likelihood-ratio test of proportionality of odds
across response categories:
        chi2(2) =      4.74
    Prob > chi2 =    0.0933
The test of proportionality is not significant, thus we can continue looking at the results for the ologit command by following up with listcoef and fitstat.
listcoef

ologit (N=200): Factor Change in Odds

Odds of: >m vs <=m

----------------------------------------------------------------------
         ses |       b         z     P>|z|      e^b   e^bStdX    SDofX
-------------+--------------------------------------------------------
       prog1 |  -1.03031   -2.961    0.003   0.3569    0.6497   0.4186
       prog3 |  -0.85003   -2.637    0.008   0.4274    0.6914   0.4341
----------------------------------------------------------------------
fitstat
Measures of Fit for ologit of ses
Log-Lik Intercept Only:      -210.583   Log-Lik Full Model:       -204.554
D(196):                       409.108   LR(2):                      12.057
                                        Prob > LR:                   0.002
McFadden's R2:                  0.029   McFadden's Adj R2:           0.010
Maximum Likelihood R2:          0.059   Cragg & Uhler's R2:          0.067
McKelvey and Zavoina's R2:      0.064
Variance of y*:                 3.513   Variance of error:           3.290
Count R2:                       0.475   Adj Count R2:                0.000
AIC:                            2.086   AIC*n:                     417.108
BIC:                         -629.362   BIC':                       -1.460
Note that if the ones and zeros were reversed in both prog1 and prog3 then the relative risk ratio for prog1 would be 1/.3569 = 2.80 and for prog3 would be 1/.4274 = 2.34.
The fitstat output gives a deviance of 409.11, which is lower than the deviance of 409.33 for the model that used the dichotomous variable academic. This is not a very big change in the deviance. If you look at the AIC you will see that the value for the current model (2.086) is actually larger than for the model with academic (2.077). Again, this is a very small change, which suggests that the three-category predictor, prog, is not really any better than the dichotomous predictor academic.
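The deviance and AIC figures that fitstat reports follow directly from the log likelihood: deviance = -2*LL, and the per-observation AIC is (-2*LL + 2k)/N, where k counts all estimated parameters (coefficients plus cut points). The comparison above can be checked with a few lines of Python:

```python
N = 200

# model with academic: 1 coefficient + 2 cut points = 3 parameters
ll_academic = -204.665
dev_academic = -2 * ll_academic
aic_academic = (dev_academic + 2 * 3) / N

# model with prog1 and prog3: 2 coefficients + 2 cut points = 4 parameters
ll_prog = -204.554
dev_prog = -2 * ll_prog
aic_prog = (dev_prog + 2 * 4) / N

print(round(dev_academic, 2), round(aic_academic, 3))
print(round(dev_prog, 2), round(aic_prog, 3))
```

This reproduces fitstat's figures: deviances 409.33 and 409.11, AICs 2.077 and 2.086. The extra parameter in the prog model is what pushes its AIC above the academic model's despite the slightly lower deviance.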
4.2.4 Example 4
Next we will look at a model that has both categorical and continuous predictor variables and their interaction.
generate mathacad = math*academic

ologit ses academic math mathacad

Ordered logit estimates                           Number of obs   =        200
                                                  LR chi2(3)      =      19.02
                                                  Prob > chi2     =     0.0003
Log likelihood = -201.07214                       Pseudo R2       =     0.0452

------------------------------------------------------------------------------
         ses |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    academic |   .4449579    1.73113     0.26   0.797    -2.947995    3.837911
        math |   .0423708   .0243203     1.74   0.081     -.005296    .0900376
    mathacad |   .0025625   .0327299     0.08   0.938     -.061587    .0667119
-------------+----------------------------------------------------------------
       _cut1 |   1.255304   1.181954          (Ancillary parameters)
       _cut2 |     3.4974    1.21058
------------------------------------------------------------------------------
We can tell from the test of the individual coefficients that the interaction term is not significant but let's run a likelihood ratio test anyway, just to confirm what we already know.
lrtest, saving(0)

ologit ses academic math

Ordered logit estimates                           Number of obs   =        200
                                                  LR chi2(2)      =      19.01
                                                  Prob > chi2     =     0.0001
Log likelihood = -201.07521                       Pseudo R2       =     0.0451

------------------------------------------------------------------------------
         ses |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    academic |    .578395   .3035933     1.91   0.057    -.0166369    1.173427
        math |   .0437666   .0165564     2.64   0.008     .0113166    .0762166
-------------+----------------------------------------------------------------
       _cut1 |   1.322609   .8117558          (Ancillary parameters)
       _cut2 |   3.564826    .851694
------------------------------------------------------------------------------
lrtest

Ologit:  likelihood-ratio test                    chi2(1)     =       0.01
                                                  Prob > chi2 =     0.9376
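The lrtest statistic is simply twice the difference between the log likelihoods of the model with and without the interaction term, both reported above. A quick Python confirmation:

```python
# Likelihood-ratio test for the mathacad interaction (df = 1).
ll_full    = -201.07214   # ologit ses academic math mathacad
ll_reduced = -201.07521   # ologit ses academic math

lr_chi2 = 2 * (ll_full - ll_reduced)
print(round(lr_chi2, 3))
```

The statistic is about 0.006, which matches the chi2(1) = 0.01 that Stata displays once it rounds to two decimals.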
Now we see that math is significant and academic is marginally significant (p = 0.057). However, the coefficient for math is for a one point change in the math test score, which is not very meaningful. Let's create a new variable math10 which is the math test score divided by ten. A change of ten points on the math test will be more meaningful than a one point change. The ologit command will be followed by listcoef and fitstat.
generate math10 = math/10

ologit ses academic math10

Ordered logit estimates                           Number of obs   =        200
                                                  LR chi2(2)      =      19.01
                                                  Prob > chi2     =     0.0001
Log likelihood = -201.07521                       Pseudo R2       =     0.0451

------------------------------------------------------------------------------
         ses |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    academic |    .578395   .3035933     1.91   0.057    -.0166369    1.173427
      math10 |   .4376661   .1655641     2.64   0.008     .1131664    .7621657
-------------+----------------------------------------------------------------
       _cut1 |   1.322609   .8117558          (Ancillary parameters)
       _cut2 |   3.564826    .851694
------------------------------------------------------------------------------
listcoef
ologit (N=200): Factor Change in Odds
Odds of: >m vs <=m
----------------------------------------------------------------------
         ses |       b         z     P>|z|      e^b   e^bStdX    SDofX
-------------+--------------------------------------------------------
    academic |   0.57840    1.905    0.057   1.7832    1.3358   0.5006
      math10 |   0.43767    2.643    0.008   1.5491    1.5069   0.9368
----------------------------------------------------------------------
fitstat
Measures of Fit for ologit of ses
Log-Lik Intercept Only:      -210.583   Log-Lik Full Model:       -201.075
D(196):                       402.150   LR(2):                      19.015
                                        Prob > LR:                   0.000
McFadden's R2:                  0.045   McFadden's Adj R2:           0.026
Maximum Likelihood R2:          0.091   Cragg & Uhler's R2:          0.103
McKelvey and Zavoina's R2:      0.099
Variance of y*:                 3.651   Variance of error:           3.290
Count R2:                       0.480   Adj Count R2:                0.010
AIC:                            2.051   AIC*n:                     410.150
BIC:                         -636.320   BIC':                       -8.418
From the listcoef results we see that for every ten-point increase in math the odds of being in high ses versus middle and low ses are about 1.5 times greater. The same holds for the odds of middle and high ses versus low ses. The relative risk ratio for math10 is less than that for academic, which indicates that the odds are about 1.8 times greater for students in the academic program.
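Note that rescaling a predictor by a constant simply rescales its coefficient by the same factor: dividing math by ten multiplies its coefficient by ten, which in turn changes the odds ratio from a per-point to a per-ten-point factor. A Python check against the math coefficient from the previous model:

```python
import math

b_math   = 0.0437666     # ologit coefficient per 1-point change in math
b_math10 = b_math * 10   # implied coefficient per 10-point change

print(b_math10)               # matches the math10 coefficient, .4376661
print(math.exp(b_math10))     # matches listcoef's e^b for math10, 1.5491
```

So refitting with math10 changes nothing about the model fit (the log likelihood is identical); it only puts the coefficient on a more interpretable scale.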
From the fitstat results we can see that the deviance has dropped to 402.15 and the AIC is down to 2.05, both of which indicate that this model fits better than the model without math.