This page shows an example of an ordered logistic regression analysis with footnotes explaining the output. The data were collected on 200 high school students and are scores on various tests, including science, math, reading and social studies. The outcome measure in this analysis is socio-economic status (ses), with three levels (low, medium and high), and we will examine its relationships with science test scores (science), social studies test scores (socst) and gender (female). Our response variable, ses, is going to be treated as ordinal under the assumption that the levels of ses have a natural ordering (low to high), but that the distances between adjacent levels are unknown. The dataset used in this page can be downloaded from SAS Web Books Regression with SAS.
proc logistic data = "C:\temp\hsb2" descending;
  model ses = science socst female;
run;

The LOGISTIC Procedure

Model Information
Data Set                    TMP1.HSB2
Response Variable           ses
Number of Response Levels   3
Number of Observations      200
Model                       cumulative logit
Optimization Technique      Fisher's scoring

Response Profile
Ordered              Total
  Value    ses   Frequency
      1      3          58
      2      2          95
      3      1          47

Probabilities modeled are cumulated over the lower Ordered Values.

Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.

Score Test for the Proportional Odds Assumption
Chi-Square    DF    Pr > ChiSq
    2.1498     3        0.5419

Model Fit Statistics
              Intercept    Intercept and
Criterion          Only       Covariates
AIC             425.165          399.605
SC              431.762          416.096
-2 Log L        421.165          389.605

Testing Global Null Hypothesis: BETA=0
Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio       31.5604     3        <.0001
Score                  28.9853     3        <.0001
Wald                   29.0022     3        <.0001

Analysis of Maximum Likelihood Estimates
                               Standard        Wald
Parameter     DF    Estimate      Error    Chi-Square    Pr > ChiSq
Intercept 3    1     -5.1055     0.9226       30.6238        <.0001
Intercept 2    1     -2.7547     0.8607       10.2431        0.0014
science        1      0.0300     0.0159        3.5838        0.0583
socst          1      0.0532     0.0149       12.7778        0.0004
female         1     -0.4824     0.2785        3.0004        0.0832

Odds Ratio Estimates
             Point        95% Wald
Effect    Estimate    Confidence Limits
science      1.030       0.999    1.063
socst        1.055       1.024    1.086
female       0.617       0.358    1.066

Association of Predicted Probabilities and Observed Responses
Percent Concordant     68.1    Somers' D    0.368
Percent Discordant     31.3    Gamma        0.370
Percent Tied            0.6    Tau-a        0.235
Pairs                 12701    c            0.684
Model Information
Model Information
Data Set^a                    TMP1.HSB2
Response Variable^b           ses
Number of Response Levels^c   3
Number of Observations^d      200
Model^e                       cumulative logit
Optimization Technique^f      Fisher's scoring

Response Profile
Ordered                  Total
  Value^g    ses^g   Frequency^i
      1        3            58
      2        2            95
      3        1            47

Probabilities modeled are cumulated over the lower Ordered Values.
a. Data Set – This is the SAS dataset that the ordered logistic regression was done on.
b. Response Variable – This is the dependent variable in the ordered logistic regression.
c. Number of Response Levels – This is the number of levels of the dependent variable. Our dependent variable has three levels: low, medium and high.
d. Number of Observations – This is the number of observations used in the ordered logistic regression. It may be less than the number of cases in the dataset if there are missing values for some variables in the equation. By default, SAS does a listwise deletion of incomplete cases.
e. Model – This is the type of model that SAS is fitting; here it is the cumulative logit model, the form used for ordered (proportional odds) logistic regression.
f. Optimization Technique – This refers to the iterative method used to estimate the regression parameters. In SAS, the default method is Fisher's scoring, whereas in Stata it is the Newton-Raphson algorithm. Both techniques yield the same estimates for the regression coefficients; however, the standard errors differ between the two methods. For further discussion, see Regression Models for Categorical and Limited Dependent Variables by J. Scott Long (page 56).
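If you wanted PROC LOGISTIC to use Newton-Raphson instead (for example, to compare standard errors with Stata's), recent releases of SAS/STAT provide a TECHNIQUE= option on the MODEL statement. A minimal sketch, not part of the output above; check your SAS version's documentation for availability:

proc logistic data = "C:\temp\hsb2" descending;
  model ses = science socst female / technique = newton;  * request Newton-Raphson instead of Fisher scoring;
run;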
g. Ordered Value and ses – Ordered Value refers to how SAS orders/models the levels of the dependent variable, ses. Because we specified the descending option on the proc statement, SAS treats the levels of ses in descending order (high to low), so that when the ordered logit regression coefficients are estimated, a positive coefficient corresponds to a positive relationship with ses (i.e., increased values of the respective predictor produce higher levels of ses) and a negative coefficient corresponds to a negative relationship with ses (i.e., increased values of the respective predictor produce lower levels of ses). Pay close attention to the ordered values, since misreading them can lead to erroneous interpretation.
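For comparison, here is a minimal sketch of the same model run without the descending option (not part of the original analysis). In that case SAS cumulates probabilities from the low end of ses, so the signs of the estimated coefficients would flip relative to the output above even though the fit is the same:

proc logistic data = "C:\temp\hsb2";
  * without DESCENDING, Ordered Value 1 corresponds to ses = 1 (low), so a
    positive coefficient would indicate higher odds of LOWER ses;
  model ses = science socst female;
run;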
i. Total Frequency – This is the observed frequency distribution of subjects in the dependent variable. Of our 200 subjects, 47 were of low ses, 95 were of middle ses and 58 reported high ses.
Model Fit Statistics
Model Convergence Status^k
Convergence criterion (GCONV=1E-8) satisfied.

Score Test for the Proportional Odds Assumption^l
Chi-Square    DF    Pr > ChiSq
    2.1498     3        0.5419

Model Fit Statistics
                Intercept    Intercept and
Criterion^m        Only^n     Covariates^o
AIC               425.165          399.605
SC                431.762          416.096
-2 Log L          421.165          389.605

Testing Global Null Hypothesis: BETA=0
Test^p              Chi-Square^q    DF^q    Pr > ChiSq^q
Likelihood Ratio       31.5604       3          <.0001
Score                  28.9853       3          <.0001
Wald                   29.0022       3          <.0001
k. Model Convergence Status – This describes whether the maximum-likelihood algorithm converged and which convergence criterion was used. The default convergence criterion is the relative gradient convergence criterion (GCONV), and the default precision is 10^-8.
l. Score Test for the Proportional Odds Assumption – This is the Chi-Square score test for the proportional odds assumption. The ordered logit model estimates one set of coefficients over all levels of the dependent variable (as compared to the multinomial logit model, which, with low ses as the referent level, would fit one equation for medium ses versus low ses and another for high ses versus low ses), so the test for proportional odds tests whether our one-equation model is valid. If we were to reject the null hypothesis, we would conclude that the ordered logit coefficients are not equal across the levels of the outcome and we would fit a less restrictive model (i.e., a multinomial logit model). If we fail to reject the null hypothesis, we do not have evidence that the assumption is violated. For our model, the proportional odds assumption appears to hold (p = 0.5419).
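If the score test had rejected the proportional odds assumption, one less restrictive alternative in SAS is the multinomial (generalized) logit model, which PROC LOGISTIC fits via the LINK=GLOGIT option on the MODEL statement. A minimal sketch, not part of the original analysis:

proc logistic data = "C:\temp\hsb2" descending;
  model ses = science socst female / link = glogit;  * separate coefficients for each non-referent level of ses;
run;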
m. Criterion – Listed underneath are various measures used to assess model fit. The first two, the Akaike Information Criterion (AIC) and the Schwarz Criterion (SC), are adjustments of negative two times the Log-Likelihood (-2 Log L): both penalize -2 Log L for the number of parameters in the model.
AIC – This is the Akaike Information Criterion. It is calculated as AIC = -2 Log L + 2((k-1) + s), where k is the number of levels of the dependent variable and s is the number of predictors in the model. AIC is used for the comparison of models from different samples or nonnested models. Ultimately, the model with the smallest AIC is considered the best.
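As a check with the numbers above: here k = 3 (levels of ses) and s = 3 (science, socst and female), so for the fitted model AIC = 389.605 + 2*((3-1) + 3) = 389.605 + 10 = 399.605, and for the intercept-only model AIC = 421.165 + 2*(3-1) = 425.165, matching the Model Fit Statistics table.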
SC – This is the Schwarz Criterion. It is defined as SC = -2 Log L + ((k-1) + s)*log(Σ fi), where the fi's are the frequencies of the individual observations (so Σ fi is the total sample size), and k and s are defined as above. Like AIC, SC penalizes for the number of predictors in the model, and the smallest SC is most desirable.
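As a check with the numbers above: each observation here has frequency 1, so Σ fi = 200 and log(200) ≈ 5.298. For the fitted model, SC = 389.605 + 5*5.298 ≈ 416.10, and for the intercept-only model, SC = 421.165 + 2*5.298 ≈ 431.76, which match the reported 416.096 and 431.762 up to rounding.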
-2 Log L – This is negative two times the log likelihood. The -2 Log L is used in hypothesis tests for nested models.
n. Intercept Only – This column refers to the respective criterion statistics with no predictors.
o. Intercept and Covariates – This column corresponds to the respective criterion statistics for the fitted model. A fitted model includes all independent variables and the intercepts. We can compare the values in this column with the corresponding Intercept Only values to assess model fit/significance.
p. Test – These are three asymptotically equivalent Chi-Square tests. They test the null hypothesis that all of the predictors' regression coefficients in the model are equal to zero. The alternative hypothesis is that at least one of the predictors' regression coefficients is not equal to zero. The difference between the three tests is where on the log-likelihood function each is evaluated. For further discussion, see Categorical Data Analysis, Second Edition, by Alan Agresti (pages 11-13).
Likelihood Ratio – This is the Likelihood Ratio (LR) Chi-Square test that at least one of the predictors' regression coefficients is not equal to zero in the model. The LR Chi-Square statistic is calculated as [-2 Log L(null model)] - [-2 Log L(fitted model)] = 421.165 - 389.605 = 31.560, where the null model refers to the Intercept Only model and the fitted model refers to the Intercept and Covariates model.
Score – This is the Score Chi-Square test that at least one of the predictors' regression coefficients is not equal to zero in the model.
Wald – This is the Wald Chi-Square test that at least one of the predictors' regression coefficients is not equal to zero in the model.
q. Chi-Square, DF and Pr > ChiSq – These are the Chi-Square test statistic, degrees of freedom (DF) and associated p-value (Pr > ChiSq) corresponding to each test that all of the predictors are simultaneously equal to zero. Pr > ChiSq is the probability of observing a Chi-Square statistic as extreme as, or more extreme than, the observed one under the null hypothesis that all of the regression coefficients in the model are equal to zero. The DF defines the Chi-Square distribution of the test statistic and equals the number of predictors in the model. Typically, Pr > ChiSq is compared to a specified alpha level, our willingness to accept a type I error, which is commonly set at 0.05 or 0.01. The small p-values from all three tests lead us to conclude that at least one of the regression coefficients in the model is not equal to zero.
Analysis of Maximum Likelihood Estimates
Analysis of Maximum Likelihood Estimates
                                  Standard          Wald
Parameter^s    DF^t  Estimate^u    Error^v   Chi-Square^w   Pr > ChiSq^w
Intercept 3     1     -5.1055      0.9226        30.6238         <.0001
Intercept 2     1     -2.7547      0.8607        10.2431         0.0014
science         1      0.0300      0.0159         3.5838         0.0583
socst           1      0.0532      0.0149        12.7778         0.0004
female          1     -0.4824      0.2785         3.0004         0.0832

Odds Ratio Estimates
                Point           95% Wald
Effect^x    Estimate^y    Confidence Limits^z
science         1.030        0.999    1.063
socst           1.055        1.024    1.086
female          0.617        0.358    1.066
s. Parameter – These refer to the independent variables in the model as well as intercepts (a.k.a. constants) for the adjacent levels of the dependent variable.
t. DF – This column gives the degrees of freedom corresponding to each parameter. Each parameter estimated in the model requires one DF, and the DF defines the Chi-Square distribution used to test whether the individual regression coefficient is zero given that the other variables are in the model (see superscript w).
u. Estimate – These are the ordered log-odds (logit) regression coefficients.
Intercept 3 and Intercept 2 are the estimated ordered logits for the adjacent groupings of the dependent variable (high versus medium and low, and high and medium versus low, respectively) when the independent variables are evaluated at zero. To identify the model, a constraint is needed; SAS and Stata use different but equivalent parameterizations (SAS reports intercepts, while Stata sets the constant to zero and reports cutpoints, a.k.a. thresholds). The different constraints do not result in different regression coefficient estimates or predicted probabilities. For further discussion of the parameterization with respect to intercepts and cutpoints, see Regression Models for Categorical and Limited Dependent Variables by J. Scott Long and the Stata FAQ: Fitting ordered logistic and probit models with constraints.
Intercept 3 – This is the estimated log odds of high ses versus low and middle ses when the predictor variables are evaluated at zero. The log odds of high ses versus low and middle ses for a male (female evaluated at zero) with science and socst test scores of zero is -5.11. Note that evaluating science and socst at zero is outside the range of plausible test scores; if the test scores were mean-centered, the intercept would have a more natural interpretation: the log odds of high ses versus low and middle ses for a male with average science and socst test scores.
Intercept 2 – This is the estimated log odds for high and middle ses versus low ses when the predictor variables are evaluated at zero. The log odds of high and middle ses versus low ses for a male with a zero science and socst test score is -2.75.
The standard interpretation of an ordered logit coefficient is that for a one unit increase in the predictor, the ordered log-odds of being in a higher level of the dependent variable change by the respective regression coefficient, holding the other variables in the model constant.
science – This is the ordered log-odds estimate for a one unit increase in science score on the expected ses level, given that the other variables in the model are held constant. If a subject were to increase his science score by one point, his ordered log-odds of being in a higher ses category would be expected to increase by 0.03, holding the other variables in the model constant.
socst – This is the ordered log-odds estimate for a one unit increase in socst score on the expected ses level given the other variables are held constant in the model. A one unit increase in socst test scores would result in a 0.053 unit increase in the expected value of ses in the ordered logit scale while the other variables in the model are held constant.
female – This is the ordered log-odds estimate for comparing females to males on expected ses, given that the other variables in the model are held constant. Going from males to females, we expect a 0.4824 unit decrease in the expected value of ses on the ordered logit scale, holding the other variables in the model constant.
v. Standard Error – These are the standard errors of the individual regression coefficients. They are used in both the calculation of the Wald Chi-Square test statistic, superscript w, and the 95% Wald Confidence Limits, superscript z.
w. Wald Chi-Square & Pr > ChiSq – These are the test statistics and p-values, respectively, for the hypothesis test that an individual predictor's regression coefficient is zero given that the rest of the predictors are in the model. The Wald Chi-Square test statistic is the squared ratio of the Estimate to the Standard Error of the respective predictor, and Pr > ChiSq is the probability of observing a Wald Chi-Square statistic as extreme as, or more extreme than, the observed one under the null hypothesis. For science, (0.0300/0.0159)^2 ≈ 3.56, reported as 3.5838 (the difference is due to rounding of the estimate and standard error), with an associated p-value of 0.0583. If we set our alpha level to 0.05, we would fail to reject the null hypothesis and conclude that the regression coefficient for science has not been found to be statistically different from zero in estimating ses, given that socst and female are in the model. For socst, (0.0532/0.0149)^2 ≈ 12.75, reported as 12.7778, with an associated p-value of 0.0004. If we again set our alpha level to 0.05, this time we would reject the null hypothesis and conclude that the regression coefficient for socst has been found to be statistically different from zero in estimating ses, given that science and female are in the model. The interpretation for a dichotomous variable parallels that of a continuous variable.
x. Effect – Underneath are the independent variables that are to be interpreted in terms of proportional odds.
y. Point Estimate – These are the proportional odds ratios. They can be obtained by exponentiating the ordered logit estimates, e^Estimate. The model assumes that, as one moves across the levels of the response variable, the regression coefficients stay the same and only the intercept changes. If we view the change in levels in a cumulative sense and interpret the coefficients in terms of odds, we are comparing people who are in groups greater than k to those who are in groups less than or equal to k, where k is a level of the response variable. A standard interpretation is that for a one unit change in the predictor variable, the odds of being in a level of the outcome greater than k versus less than or equal to k are multiplied by the proportional odds ratio.
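As a quick check, the proportional odds ratios can be reproduced by exponentiating the estimates from the Analysis of Maximum Likelihood Estimates table. A minimal SAS data step sketch (the dataset name check_or is arbitrary and not part of the original analysis):

data check_or;
  or_science = exp(0.0300);   * = 1.030;
  or_socst   = exp(0.0532);   * = 1.055;
  or_female  = exp(-0.4824);  * = 0.617;
  put or_science= or_socst= or_female=;
run;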
science – This is the proportional odds ratio for a one unit increase in science score on ses level, given that the other variables in the model are held constant. For a one unit increase in science test score, the odds of high ses versus the combined middle and low ses categories are 1.03 times greater, given that all the other variables are held constant. Likewise, for a one unit increase in science test score, the odds of middle or high ses versus low ses are 1.03 times greater, given that all the other variables are held constant.
socst – This is the proportional odds ratio for a one unit increase in socst score on ses level, given that the other variables in the model are held constant. For a one unit increase in socst test score, the odds of high ses versus the combined middle and low ses categories are 1.05 times greater, given that all the other variables are held constant. Likewise, for a one unit increase in socst test score, the odds of middle or high ses versus low ses are 1.05 times greater, given that all the other variables are held constant.
female – This is the proportional odds ratio comparing females to males on ses, given that the other variables in the model are held constant. For females relative to males, the odds of high ses versus the combined middle and low ses categories are 0.617 times as large (i.e., about 38% lower), given that all the other variables are held constant. Likewise, for females relative to males, the odds of middle or high ses versus low ses are 0.617 times as large, given that all the other variables are held constant.
z. 95% Wald Confidence Limits – This is the confidence interval (CI) for the proportional odds ratio, given that the other predictors are in the model. With 95% confidence, we estimate that the "true" population proportional odds ratio lies between the lower and upper limits of the interval. The CI is equivalent to the Wald Chi-Square test: if the CI includes 1, we fail to reject, at an alpha level of 0.05, the null hypothesis that the corresponding ordered logit regression coefficient is zero, given the other predictors are in the model. The CI is more illustrative than the Wald Chi-Square test statistic alone.
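The limits are computed on the ordered logit scale and then exponentiated, i.e., exp(Estimate ± 1.96 × Standard Error). For science, exp(0.0300 - 1.96*0.0159) ≈ 0.999 and exp(0.0300 + 1.96*0.0159) ≈ 1.063, matching the Odds Ratio Estimates table.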
Association of Predicted Probabilities and Observed Responses
Association of Predicted Probabilities and Observed Responses
Percent Concordant^a1     68.1    Somers' D^e1    0.368
Percent Discordant^b1     31.3    Gamma^f1        0.370
Percent Tied^c1            0.6    Tau-a^g1        0.235
Pairs^d1                 12701    c^h1            0.684
a1. Percent Concordant – A pair of observations with different observed responses is said to be concordant if the observation with the lower ordered response value has a lower predicted mean score than the observation with the higher ordered response value.
b1. Percent Discordant – If the observation with the lower ordered response value has a higher predicted mean score than the observation with the higher ordered response value, then the pair is discordant.
c1. Percent Tied – If a pair of observations with different responses is neither concordant nor discordant, it is a tie.
d1. Pairs – This is the total number of distinct pairs.
e1. Somers' D – Somers' D is used to determine the strength and direction of the relation between pairs of variables. Its values range from -1.0 (all pairs disagree) to 1.0 (all pairs agree). It is defined as (nc-nd)/t, where nc is the number of pairs that are concordant, nd is the number of pairs that are discordant, and t is the total number of pairs with different responses. In our example, it equals the difference between the percent concordant and the percent discordant divided by 100: (68.1-31.3)/100 = 0.368.
f1. Gamma – The Goodman-Kruskal Gamma does not penalize for ties on either variable. Its values range from -1.0 (perfect negative association) to 1.0 (perfect positive association), with 0 indicating no association. Because it does not penalize for ties, its value will generally be greater than the value of Somers' D.
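Gamma can be computed from the same quantities as Somers' D, but with tied pairs excluded from the denominator: Gamma = (nc-nd)/(nc+nd), which here is (68.1-31.3)/(68.1+31.3) = 36.8/99.4 = 0.370.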
g1. Tau-a – Kendall's Tau-a is a modification of Somers' D that takes into account the difference between the number of possible paired observations and the number of paired observations with different responses. It is defined as the ratio of the difference between the number of concordant and discordant pairs to the total number of possible pairs: 2(nc-nd)/(N(N-1)). Tau-a is usually much smaller than Somers' D, since there are typically many paired observations with the same response.
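As a check with the numbers above: nc - nd = (0.681 - 0.313) × 12701 ≈ 4674, and N(N-1)/2 = 200 × 199/2 = 19900, so Tau-a ≈ 4674/19900 = 0.235.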
h1. c – This is another measure of rank correlation between ordinal variables. It ranges from 0 to 1, with 0.5 indicating no association and 1 indicating perfect association. It is a variant of Somers' D.
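For the output above, c can be recovered as the proportion of concordant pairs plus half the proportion of tied pairs: 0.681 + 0.5 × 0.006 = 0.684; equivalently, c = (Somers' D + 1)/2 = (0.368 + 1)/2 = 0.684.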