How can I compute power for contingency tables in Stata?

There are many different programs that will compute power for contingency tables including online power calculators. This FAQ page will show you how to compute contingency table power using the Stata command power twoproportions.

The observed likelihood-ratio chi-square statistic can be used as an estimate of the noncentrality parameter for a noncentral chi-square distribution. The noncentral chi-square distribution is the distribution of the test statistic under the alternative hypothesis. To see how this works, consider a 2-by-2 contingency table with a likelihood-ratio chi-square value of 4.6. The critical value of chi-square for a 2-by-2 table, which has one degree of freedom, is 3.84 at alpha = .05. Using the nchi2() function, which gives the cumulative noncentral chi-squared distribution, we can compute the power as follows.

display 1 - nchi2(1, 4.6, 3.84)

.57347213

This calculates a power of about .57. Notice that we haven’t mentioned anything about the sample size. Say we double the sample size by multiplying each cell frequency by two. All of the cell proportions will remain the same, but the chi-square value will double, which means that the noncentrality parameter is doubled as well. Here is the power for the doubled sample size, obtained by multiplying the noncentrality parameter by 2.

display 1 - nchi2(1, 2*4.6, 3.84)

.85848997

Doubling the sample size has increased the power to about .86. This means that we can get the power for any multiple of the sample size by multiplying the noncentrality parameter by a sample size factor. The power twoproportions command computes noncentrality for varying sample size factors to come up with estimates of power for the different sample sizes.
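The scaling idea described above can be sketched as a short loop. This is only an illustration: it reuses the chi-square value of 4.6 from the example, obtains the critical value from invchi2tail() instead of typing 3.84, and the scalar name crit and the range of factors are arbitrary choices.

```stata
* Sketch: power for several sample size factors k, using the
* observed LR chi-square (4.6) as the noncentrality parameter
* critical value for 1 df at alpha = .05 (about 3.84)
scalar crit = invchi2tail(1, 0.05)
forvalues k = 1/4 {
    * multiplying the sample size by k multiplies the noncentrality by k
    display "factor `k': power = " %6.4f 1 - nchi2(1, `k'*4.6, crit)
}
```

The first two lines of output should agree, to about two decimals, with the .57 and .86 computed by hand above.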

The power twoproportions command has the following syntax for computing power:

 power twoproportions p1 p2, n1(numlist) n2(numlist) [options]

where p1 and p2 are the two expected sample proportions, and n1(numlist) and n2(numlist) specify the number of observations in each group.

To compute the prospective power for a 2-by-2 table we need to come up with some reasonably good guesses as to how the observations will be distributed among the cells of the contingency table. Let’s say that in a sample of 100 subjects, we believe the row variable (e.g. treatment=control or experimental) will be split 70/30 between rows one and two. Further, within the first row, we expect the proportions in the two columns (e.g. diseased=no or yes) to be .5 and .5 (35 and 35 observations), while within the second row, we expect the proportions to be .667 and .333 (20 and 10 observations).

We can use tabi with frequencies taken from our guesses to compute a likelihood-ratio chi-square.

tabi 35 35 \ 20 10, row lrchi2

           |          col
       row |         1          2 |     Total
-----------+----------------------+----------
         1 |        35         35 |        70 
           |     50.00      50.00 |    100.00 
-----------+----------------------+----------
         2 |        20         10 |        30 
           |     66.67      33.33 |    100.00 
-----------+----------------------+----------
     Total |        55         45 |       100 
           |     55.00      45.00 |    100.00 

 likelihood-ratio chi2(1) =   2.3963   Pr = 0.122

This table has 100 observations with a likelihood-ratio chi-square of 2.396 and a p-value of .122. We can get the power of this analysis by using power twoproportions, specifying the two row proportions from either column (here, column 2) and the row totals as the sample sizes.

power twoproportions .5 .3333, n1(70) n2(30) test(lrchi2)

Estimated power for a two-sample proportions test
Likelihood-ratio test
Ho: p2 = p1  versus  Ha: p2 != p1

Study parameters:

        alpha =    0.0500
            N =       100
           N1 =        70
           N2 =        30
        N2/N1 =    0.4286
        delta =   -0.1667  (difference)
           p1 =    0.5000
           p2 =    0.3333

Estimated power:

        power =    0.3405

This shows that the power for the sample with our guesses is approximately .341. Now, let’s see what happens to the power when we multiply each cell frequency by 2.

tabi 70 70 \ 40 20, row lrchi2

           |          col
       row |         1          2 |     Total
-----------+----------------------+----------
         1 |        70         70 |       140 
           |     50.00      50.00 |    100.00 
-----------+----------------------+----------
         2 |        40         20 |        60 
           |     66.67      33.33 |    100.00 
-----------+----------------------+----------
     Total |       110         90 |       200 
           |     55.00      45.00 |    100.00 

 likelihood-ratio chi2(1) =   4.7926   Pr = 0.029
 
power twoproportions .5 .3333, n1(140) n2(60) test(lrchi2)

Estimated power for a two-sample proportions test
Likelihood-ratio test
Ho: p2 = p1  versus  Ha: p2 != p1

Study parameters:

        alpha =    0.0500
            N =       200
           N1 =       140
           N2 =        60
        N2/N1 =    0.4286
        delta =   -0.1667  (difference)
           p1 =    0.5000
           p2 =    0.3333

Estimated power:

        power =    0.5909

There are 200 observations and the power has increased to .591.
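As a cross-check, the two power values can be approximated directly from the noncentral chi-square distribution, as in the opening example. The sketch below assumes that tabi leaves the likelihood-ratio statistic in r(chi2_lr), as tabulate does when lrchi2 is specified; the scalar names lambda and crit are arbitrary.

```stata
* Sketch: use the LR chi-square from tabi as the noncentrality parameter
tabi 35 35 \ 20 10, lrchi2
* about 2.3963 for N = 100 (assumes tabi returns r(chi2_lr))
scalar lambda = r(chi2_lr)
* critical value for 1 df at alpha = .05
scalar crit = invchi2tail(1, 0.05)
display "N = 100: power = " %6.4f 1 - nchi2(1, lambda, crit)
* doubling every cell doubles the LR chi-square
display "N = 200: power = " %6.4f 1 - nchi2(1, 2*lambda, crit)
```

Both values should land close to the .3405 and .5909 reported by power twoproportions.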

Let’s view the power for total sample sizes ranging from 100 to 500, using the same row ratio (70/30) and column proportions (.5 and .333) as before. We will specify a list of sample sizes within n1() and n2(). However, we must also specify the parallel option so that Stata pairs corresponding numbers in the n1() and n2() numlists. Otherwise, it will cross all the sample sizes in the two lists and calculate power for every combination.

power twoproportions .5 .3333, n1(70 140 210 280 350) n2(30 60 90 120 150) test(lrchi2) parallel


  +-----------------------------------------------------------------+
  |   alpha   power       N      N1      N2   delta      p1      p2 |
  |-----------------------------------------------------------------|
  |     .05   .3405     100      70      30  -.1667      .5   .3333 |
  |     .05   .5909     200     140      60  -.1667      .5   .3333 |
  |     .05   .7648     300     210      90  -.1667      .5   .3333 |
  |     .05   .8722     400     280     120  -.1667      .5   .3333 |
  |     .05   .9335     500     350     150  -.1667      .5   .3333 |
  +-----------------------------------------------------------------+

We see the power estimates for N=100 and N=200 replicated from above. The power for a sample of 300 is about .76 while for 400 it is about .87.
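The sample size lists do not have to be typed out in full. A minimal sketch, assuming the same proportions and 70/30 split as above, writes them with Stata's begin(increment)end numlist shorthand. It should reproduce the five-row table.

```stata
* Sketch: same five sample size pairs, written as begin(increment)end numlists
* n1: 70 140 210 280 350   n2: 30 60 90 120 150
power twoproportions .5 .3333, n1(70(70)350) n2(30(30)150) test(lrchi2) parallel
```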

We can use an alternate syntax to examine the effects of varying total sample sizes and row ratios on power. We might not be sure exactly how many subjects we can recruit, nor what proportion can be allocated to the experimental group (perhaps the drug is costly). This alternate syntax is:

 power twoproportions p1 p2, n(numlist) nratio(numlist) [options]

where n(numlist) is a list of total sample sizes and nratio(numlist) is a list of group-size ratios, N2/N1. In the code below, we study power for total sample sizes ranging from 100 to 300 in increments of 100, with ratios of control to experimental group sizes decreasing from 5 to 1 in steps of 1. Because test(lrchi2) is not specified here, power is computed for the default Pearson chi-squared test. We use the numlist syntax “begin(increment)end”:

 power twoproportions .5 .3333, n(100(100)300) nratio(5(-1)1)

  +-------------------------------------------------------------------------+
  |   alpha   power       N      N1      N2  nratio   delta      p1      p2 |
  |-------------------------------------------------------------------------|
  |     .05   .2533     100      16      83       5  -.1667      .5   .3333 |
  |     .05   .2877     100      20      80       4  -.1667      .5   .3333 |
  |     .05   .3229     100      25      75       3  -.1667      .5   .3333 |
  |     .05    .362     100      33      66       2  -.1667      .5   .3333 |
  |     .05   .3924     100      50      50       1  -.1667      .5   .3333 |
  |     .05   .4466     200      33     166       5  -.1667      .5   .3333 |
  |     .05   .4989     200      40     160       4  -.1667      .5   .3333 |
  |     .05   .5581     200      50     150       3  -.1667      .5   .3333 |
  |     .05   .6215     200      66     133       2  -.1667      .5   .3333 |
  |     .05   .6691     200     100     100       1  -.1667      .5   .3333 |
  |     .05   .6071     300      50     250       5  -.1667      .5   .3333 |
  |     .05   .6648     300      60     240       4  -.1667      .5   .3333 |
  |     .05   .7295     300      75     225       3  -.1667      .5   .3333 |
  |     .05   .7958     300     100     200       2  -.1667      .5   .3333 |
  |     .05   .8371     300     150     150       1  -.1667      .5   .3333 |
  +-------------------------------------------------------------------------+

As expected, power increases as N grows from 100 to 300. It also increases as nratio decreases from 5 to 1, which suggests that balanced group allocation yields more power.
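In some rows of the table N1 and N2 do not add up to N (for example, 16 + 83 = 99), because the group sizes implied by n() and nratio() are rounded down to whole numbers. The sketch below uses the nfractional option of the power commands to allow fractional group sizes. It is only an illustration, and the resulting power values are not shown here.

```stata
* Sketch: allow fractional group sizes so that N1 + N2 equals N exactly
* (with N = 100 and N2/N1 = 5, N1 is 100/6 rather than 16)
power twoproportions .5 .3333, n(100) nratio(5 2) nfractional
```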

All of the above results used the default alpha of .05. Suppose we are worried about making a false positive claim and set alpha to .01 instead. Power will go down, as shown in the next example.

power twoproportions .5 .3333, n(100(100)300) nratio(5(-1)1) alpha(0.01)

  +-------------------------------------------------------------------------+
  |   alpha   power       N      N1      N2  nratio   delta      p1      p2 |
  |-------------------------------------------------------------------------|
  |     .01   .1033     100      16      83       5  -.1667      .5   .3333 |
  |     .01   .1227     100      20      80       4  -.1667      .5   .3333 |
  |     .01   .1434     100      25      75       3  -.1667      .5   .3333 |
  |     .01   .1671     100      33      66       2  -.1667      .5   .3333 |
  |     .01   .1846     100      50      50       1  -.1667      .5   .3333 |
  |     .01   .2322     200      33     166       5  -.1667      .5   .3333 |
  |     .01   .2732     200      40     160       4  -.1667      .5   .3333 |
  |     .01   .3232     200      50     150       3  -.1667      .5   .3333 |
  |     .01   .3812     200      66     133       2  -.1667      .5   .3333 |
  |     .01   .4256     200     100     100       1  -.1667      .5   .3333 |
  |     .01   .3725     300      50     250       5  -.1667      .5   .3333 |
  |     .01   .4307     300      60     240       4  -.1667      .5   .3333 |
  |     .01   .5026     300      75     225       3  -.1667      .5   .3333 |
  |     .01    .585     300     100     200       2  -.1667      .5   .3333 |
  |     .01   .6397     300     150     150       1  -.1667      .5   .3333 |
  +-------------------------------------------------------------------------+
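Because alpha() also accepts a numlist, several significance levels can be compared in a single call. The sketch below fixes the total sample size at 300 with equal group sizes (the default nratio of 1). The particular alpha values are chosen only for illustration.

```stata
* Sketch: compare power across significance levels at N = 300, equal groups
power twoproportions .5 .3333, n(300) alpha(0.01 0.05 0.10)
```

The rows for alpha = .01 and alpha = .05 should match the N = 300, nratio = 1 entries in the two tables above (.6397 and .8371).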

Reference

Satorra, A. and Saris, W.E. 1985. Power of the likelihood ratio test in covariance structure analysis. Psychometrika, 50: 83-90.