You may also want to check out the FAQ "How do I use odds ratio to interpret logistic regression?" on our General FAQ page.
Introduction
Let’s begin with probability. Let’s say that the probability of success is .8, thus
- p = .8
Then the probability of failure is
- q = 1 – p = .2
The odds of success are defined as
- odds(success) = p/q = .8/.2 = 4,
that is, the odds of success are 4 to 1. The odds of failure would be
- odds(failure) = q/p = .2/.8 = .25.
This looks a little strange but it is really saying that the odds of failure are 1 to 4. The odds of success and the odds of failure are just reciprocals of one another, i.e., 1/4 = .25 and 1/.25 = 4. Next, we will add another variable to the equation so that we can compute an odds ratio.
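The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the helper name `odds` is our own and does not come from any library.

```python
def odds(p: float) -> float:
    """Convert a probability p (0 <= p < 1) into odds p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


p_success = 0.8
odds_success = odds(p_success)        # about 4, i.e. 4 to 1
odds_failure = odds(1.0 - p_success)  # about 0.25, i.e. 1 to 4

print(round(odds_success, 4), round(odds_failure, 4))
# The two odds are reciprocals, so their product is (up to rounding) 1.
print(round(odds_success * odds_failure, 4))
```

The results are rounded before printing because floating-point division does not return exactly 4 and .25.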
Another example
This example is adapted from Pedhazur (1997). Suppose that seven out of 10 males are admitted to an engineering school while three of 10 females are admitted. The probabilities for admitting a male are,
- p = 7/10 = .7
- q = 1 – .7 = .3
Here are the same probabilities for females,
- p = 3/10 = .3
- q = 1 – .3 = .7
Now we can use the probabilities to compute the admission odds for both males and females,
- odds(male) = .7/.3 = 2.33333
- odds(female) = .3/.7 = .42857
Next, we compute the odds ratio for admission,
- OR = 2.3333/.42857 = 5.44
Thus, for a male, the odds of being admitted are 5.44 times as large as the odds for a female being admitted.
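The same steps, from counts to probabilities to odds to the odds ratio, can be sketched in Python. The dictionary and function names below are illustrative choices of ours, not part of the original example.

```python
# Counts from the admissions example: 10 applicants of each gender.
admitted = {"male": 7, "female": 3}
applicants = {"male": 10, "female": 10}


def admission_odds(group: str) -> float:
    """Odds of admission p/q for one group, with p = admitted / applicants."""
    p = admitted[group] / applicants[group]
    q = 1.0 - p
    return p / q


odds_male = admission_odds("male")      # about 2.33333
odds_female = admission_odds("female")  # about .42857
odds_ratio = odds_male / odds_female    # about 5.44

print(round(odds_male, 5), round(odds_female, 5), round(odds_ratio, 2))
```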
Logistic regression in SAS
Here are the SAS logistic regression command and output for the example above. In this example admit is coded 1 for yes and 0 for no and gender is coded 1 for male and 0 for female. In the call to proc logistic, we use the desc option (which is short for descending) to indicate that SAS should model the 1s in the outcome variable rather than the 0s, which it models by default. Also, we use the expb option on the model statement to have SAS display the odds ratios in the output.
data temp;
  input admit gender freq;
  cards;
1 1 7
1 0 3
0 1 3
0 0 7
;
run;

proc logistic data = temp desc;
  weight freq;
  model admit = gender / expb;
run;

The LOGISTIC Procedure

Analysis of Maximum Likelihood Estimates

                            Standard        Wald
Parameter   DF   Estimate      Error  Chi-Square  Pr > ChiSq  Exp(Est)
Intercept    1    -0.8473     0.6901      1.5076      0.2195     0.429
gender       1     1.6946     0.9759      3.0152      0.0825     5.444
Note that the Wald chi-square of 3.0152 applies both to the coefficient for gender and to the odds ratio for gender, because the coefficient and the odds ratio are two ways of saying the same thing.
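Because this model has a single 0/1 predictor, the fitted model reproduces the observed admission proportions. The estimates in the output above can therefore be recovered by hand from the four cell counts. The Python sketch below does that as a cross-check. It is not how proc logistic computes its estimates in general (that is an iterative maximum likelihood fit), and the variable names are our own.

```python
import math

# Cell counts from the data step: (admit, gender) -> freq
counts = {(1, 1): 7, (1, 0): 3, (0, 1): 3, (0, 0): 7}

# Intercept: log odds of admission when gender = 0 (female).
intercept = math.log(counts[(1, 0)] / counts[(0, 0)])

# Coefficient: log odds for males minus log odds for females.
slope = math.log(counts[(1, 1)] / counts[(0, 1)]) - intercept

# Large-sample standard errors from the cell counts of a 2x2 table.
se_intercept = math.sqrt(1 / counts[(1, 0)] + 1 / counts[(0, 0)])
se_slope = math.sqrt(sum(1 / n for n in counts.values()))

# Wald chi-square (1 df) and its upper-tail probability.
wald = (slope / se_slope) ** 2
p_value = math.erfc(math.sqrt(wald / 2))

print(round(intercept, 4), round(se_intercept, 4))
print(round(slope, 4), round(se_slope, 4), round(wald, 4), round(p_value, 4))
print(round(math.exp(intercept), 3), round(math.exp(slope), 3))
```

The p-value line uses the fact that a chi-square variable with one degree of freedom is a squared standard normal, so its upper tail can be written with the complementary error function.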
About logits
There is a direct relationship between the coefficients and the odds ratios. First, let’s define what is meant by a logit: A logit is defined as the log base e (log) of the odds,
- [1] logit(p) = log(odds) = log(p/q)
Logistic regression works much like ordinary regression with the logit as the response variable, although the coefficients are estimated by maximum likelihood rather than least squares,
- [2] logit(p) = a + bX
or
- [3] log(p/q) = a + bX
This means that the coefficients in logistic regression are in terms of the log odds, that is, the coefficient 1.6946 implies that a one unit change in gender results in a 1.6946 unit change in the log of the odds.
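To make the log odds interpretation concrete, the short Python sketch below computes the logit for each gender from the admission probabilities used earlier (.7 and .3) and takes the difference. The function name `logit` is our own helper.

```python
import math


def logit(p: float) -> float:
    """Natural log of the odds p / (1 - p), for 0 < p < 1."""
    return math.log(p / (1.0 - p))


logit_female = logit(0.3)  # log odds when gender = 0; about -0.8473
logit_male = logit(0.7)    # log odds when gender = 1; about 0.8473

# Moving gender from 0 to 1 changes the log odds by the coefficient b.
change_in_log_odds = logit_male - logit_female

print(round(logit_female, 4), round(logit_male, 4), round(change_in_log_odds, 4))
```

The difference agrees with the gender coefficient of 1.6946 reported by SAS, and the logit for females agrees with the intercept.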
Equation [3] can be expressed in terms of odds by getting rid of the log. This is done by raising e to the power of each side of the equation.
- [4] p/q = e^(a + bX)
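Because gender is coded 0 and 1, equation [4] gives odds of e^a for females (X = 0) and e^(a + b) for males (X = 1). Dividing the second by the first leaves e^b, so the odds ratio can be computed by raising e to the power of the logistic coefficient,
- [5] OR = e^(a + b)/e^a = e^b = e^1.694596 = 5.444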
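As a final check, the Python sketch below plugs the estimates into equation [4]. It takes a = -0.8473 from the SAS output and b = 1.694596 from equation [5]. The helper names are hypothetical, and because a is rounded to four decimals the results match only to about three decimals.

```python
import math

a = -0.8473   # intercept, as printed in the SAS output (rounded)
b = 1.694596  # coefficient for gender, as used in equation [5]


def odds_at(x: float) -> float:
    """Odds p/q implied by equation [4] for a given value of X."""
    return math.exp(a + b * x)


def prob_from_odds(odds: float) -> float:
    """Invert odds = p / (1 - p) to recover the probability p."""
    return odds / (1.0 + odds)


odds_female = odds_at(0)                # about .4286
odds_male = odds_at(1)                  # about 2.3333
odds_ratio = odds_male / odds_female    # equals e^b, about 5.444

print(round(odds_female, 3), round(odds_male, 3), round(odds_ratio, 3))
print(round(prob_from_odds(odds_female), 2), round(prob_from_odds(odds_male), 2))
```

Converting the odds back to probabilities recovers the admission rates of .3 for females and .7 for males, which shows that the fitted model reproduces the original data.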