You may also want to check out the FAQ "How do I use odds ratio to interpret logistic regression?" on our General FAQ page.

## Introduction

Let’s begin with probability. Probabilities range between 0 and 1. Let’s say that the probability of success is .8, thus

**p = .8**

Then the probability of failure is

**q = 1 – p = .2**

Odds are determined from probabilities and range between 0 and infinity. Odds are defined as the ratio of the probability of success to the probability of failure. The odds of success are

**odds(success) = p/(1-p) = p/q = .8/.2 = 4,**

that is, the odds of success are 4 to 1. The odds of failure would be

**odds(failure) = q/p = .2/.8 = .25.**

This looks a little strange but it is really saying that the odds of failure are 1 to 4. The odds of success and the odds of failure are just reciprocals of one another, i.e., 1/4 = .25 and 1/.25 = 4. Next, we will add another variable to the equation so that we can compute an odds ratio.
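The arithmetic above is easy to verify with a few lines of Python (used here as a quick numeric check, since the Stata commands come later in the page):

```python
# Odds from probabilities: the odds of success and failure are reciprocals.
p = 0.8                 # probability of success
q = 1 - p               # probability of failure

odds_success = p / q    # about 4
odds_failure = q / p    # about 0.25

print(round(odds_success, 4))                 # 4.0
print(round(odds_failure, 4))                 # 0.25
print(round(odds_success * odds_failure, 4))  # 1.0 -- reciprocals multiply to 1
```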

## Another example

Suppose that seven of 10 male dogs are admitted to an obedience school, while three of 10 female dogs are admitted. The probabilities of admission for a male are,

**p = 7/10 = .7
q = 1 – .7 = .3**

If the dog is male, the probability of being admitted is 0.7 and the probability of not being admitted is 0.3.

Here are the same probabilities for females,

**p = 3/10 = .3
q = 1 – .3 = .7**

If the dog is female, it is just the opposite: the probability of being admitted is 0.3 and the probability of not being admitted is 0.7.

Now we can use the probabilities to compute the odds of admission for both males and females,

**odds(male) = .7/.3 = 2.33333
odds(female) = .3/.7 = .42857**

Next, we compute the odds ratio for admission,

**OR = 2.3333/.42857 = 5.44**

Thus, for a male, the odds of being admitted are 5.44 times as large as the odds for a female being admitted.
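The same odds and odds ratio can be reproduced in a few lines of Python (a quick check of the arithmetic, not part of the Stata workflow below):

```python
# Odds of admission for male and female dogs, and the odds ratio.
p_male, p_female = 7 / 10, 3 / 10

odds_male = p_male / (1 - p_male)        # about 2.3333
odds_female = p_female / (1 - p_female)  # about 0.4286
odds_ratio = odds_male / odds_female

print(round(odds_male, 5))    # 2.33333
print(round(odds_female, 5))  # 0.42857
print(round(odds_ratio, 2))   # 5.44
```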

## Logistic regression in Stata

Here are the Stata logistic regression commands and
output for the example above. In this example **admit** is coded 1 for
yes and 0 for no
and **gender** is coded 1 for male and 0 for female. In Stata, the **logistic**
command produces results in terms of odds ratios, while **logit** produces results in
terms of coefficients scaled in log odds.

```
input admit gender freq
1 1 7
1 0 3
0 1 3
0 0 7
end
```

These data represent a 2×2 table that looks like this:

|                     | Admit = 1 | Admit = 0 |
|---------------------|-----------|-----------|
| Gender = 1 (male)   | 7         | 3         |
| Gender = 0 (female) | 3         | 7         |

```
. logit admit gender [fweight=freq], nolog or
(frequency weights assumed)

Logistic regression                               Number of obs   =         20
                                                  LR chi2(1)      =       3.29
                                                  Prob > chi2     =     0.0696
Log likelihood = -12.217286                       Pseudo R2       =     0.1187

------------------------------------------------------------------------------
       admit | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gender |   5.444444   5.313234     1.74   0.082     .8040183    36.86729
------------------------------------------------------------------------------

/* Note: the above command is equivalent to --
   logistic admit gender [weight=freq], nolog */

. logit admit gender [weight=freq], nolog
(frequency weights assumed)

Logistic regression                               Number of obs   =         20
                                                  LR chi2(1)      =       3.29
                                                  Prob > chi2     =     0.0696
Log likelihood = -12.217286                       Pseudo R2       =     0.1187

------------------------------------------------------------------------------
       admit |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gender |   1.694596   .9759001     1.74   0.082    -.2181333    3.607325
       _cons |  -.8472979   .6900656    -1.23   0.220    -2.199801    .5052058
------------------------------------------------------------------------------
```

Note that **z** = 1.74 both for the coefficient for gender and for the odds ratio for gender.
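For a 2×2 table like this one, the odds ratio is simply the cross-product ratio of the cell counts, which reproduces Stata's estimate exactly (a quick Python check, not Stata syntax):

```python
# Cell counts from the 2x2 table: rows = gender (1, 0), columns = admit (1, 0)
male_yes, male_no = 7, 3
female_yes, female_no = 3, 7

# Cross-product ratio: (7 * 7) / (3 * 3) = 49/9
odds_ratio = (male_yes * female_no) / (male_no * female_yes)
print(round(odds_ratio, 6))  # 5.444444 -- matches Stata's 5.444444
```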

## About logits

There is a direct relationship between the
coefficients produced by **logit** and the odds ratios produced by **logistic**.
First, let’s define what is meant by a logit: a logit is the natural log
(log base e) of the odds:

**[1] logit(p) = log(odds) = log(p/q)**
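Evaluating [1] at a few probabilities shows how the transform behaves (a small Python sketch):

```python
import math

def logit(p):
    """Log odds: maps a probability in (0, 1) onto the whole real line."""
    return math.log(p / (1 - p))

print(round(logit(0.5), 4))  # 0.0 -- even odds
print(round(logit(0.8), 4))  # 1.3863
print(round(logit(0.2), 4))  # -1.3863 -- symmetric around p = .5
```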

The range is negative infinity to positive infinity. In regression it is easiest to model unbounded outcomes. Logistic regression is in reality an ordinary regression using the logit as the response variable. The logit transformation allows for a linear relationship between the response variable and the coefficients:

**[2] logit(p) = a + bX**

or

**[3] log(p/q) = a + bX**

This means that the coefficients in a simple logistic regression are in terms of
the log odds, that is, the coefficient 1.694596 implies that a one unit change in gender
results in a 1.694596 unit change in the log of the odds. Equation [3] can be expressed in odds by getting rid of the **log**. This is done by taking **e** to the power for both sides of the equation.

**[4] e ^{log(p/q)} = e^{a + bX}**

or

**[5] p/q = e^{a + bX}**
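As a numeric check, plugging the coefficients from the **logit** output above into [5] recovers the raw odds computed earlier (a Python sketch, with the values hard-coded from the output):

```python
import math

a = -0.8472979  # _cons from the logit output
b = 1.694596    # gender coefficient from the logit output

odds_female = math.exp(a + b * 0)  # gender = 0
odds_male = math.exp(a + b * 1)    # gender = 1

print(round(odds_female, 5))  # 0.42857 = .3/.7
print(round(odds_male, 5))    # 2.33333 = .7/.3
```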

From this, let us define the odds of being admitted for females and males separately:

**[5a] odds_{female} = p_0/q_0**

**[5b] odds_{male} = p_1/q_1**

The *odds ratio* for gender is defined as the odds of being admitted for males over the odds of being admitted for females:

**[6] OR = odds_{male}/odds_{female}**

For this particular example (which can be generalized for all simple logistic regression models), the coefficient *b* for a two-category predictor can be defined as

**[7a] b = log(odds_{male}) – log(odds_{female}) = log(odds_{male}/odds_{female})**

by the quotient rule of logarithms. Using the inverse property of the log function, you can exponentiate both sides of the equality [7a] to result in [6]:

**[8] e^{b} = e^{log(odds_{male}/odds_{female})} = odds_{male}/odds_{female} = OR**

which means that the exponentiated value of the coefficient *b* is the odds ratio for gender. In our particular example, e^{1.694596} = 5.44, which implies that the odds of being admitted for males are 5.44 times the odds for females.
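Equation [8] is easy to verify numerically (a Python sketch using the coefficient from the **logit** output):

```python
import math

b = 1.694596  # logit coefficient for gender
odds_ratio = math.exp(b)
print(round(odds_ratio, 2))  # 5.44 -- the odds ratio reported by logistic
```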