It is fairly easy to generate power curves for the chi-square distribution using Stata. This FAQ page will show you one way to do it. Here is a code fragment that you can paste into your do-file editor and run. We'll talk about it after you take a look at it. Please note that you can change the graph scheme to a different one if you don't care for lean1.
drop _all
local df = 1
range ncp 0 40 201
foreach a of numlist 1 5 10 50 {
    local alpha = `a'/100
    local cv = invchi2(`df', 1-`alpha')
    generate p`a' = 1-nchi2(`df', ncp, `cv')
}
twoway (line p1 p5 p10 p50 ncp, yline(.8)), legend(order(1 ".01" 2 ".05" 3 ".10" 4 ".50")) ///
    t2title("power curves for chi-square (df=`df')") aspect(.9) scheme(lean1)
In the code fragment above the range command generates the values of the noncentrality parameter, ncp (which we will talk about below), which ranges from 0 to 40 in steps of 0.2. The foreach loop steps through each of the four values to be used as the alpha levels: .01, .05, .10, and .50. The invchi2() function is used to get the critical value of the central chi-square distribution, i.e., the distribution of chi-square under the null hypothesis. To show you how this function works, we will run it manually from the command line.
display invchi2(1, 1-.05)
3.8414588
You might recognize that value, 3.84, as the critical value of chi-square for alpha equal to 0.05 with one degree of freedom. There is another chi-square function, nchi2(), which computes probabilities for the noncentral chi-square distribution, that is, the distribution of chi-square under the alternative hypothesis. The noncentrality parameter (ncp) indicates how different the noncentral distribution is from the central distribution: the larger the ncp, the greater the difference. Relative to the central chi-square distribution, the noncentral distribution is shifted to the right and has greater variability, as shown in the figure below with degrees of freedom equal to 6 and ncp equal to 10.
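As an aside, you can see both the right shift and the extra variability with a quick simulation. This is a Python sketch, not Stata, and ncx2_sample is our own helper; it uses the fact that a noncentral chi-square with df = k and noncentrality parameter ncp is the distribution of the sum of squares of k independent unit-variance normals whose squared means sum to ncp.

```python
import math
import random
import statistics

def ncx2_sample(df, ncp, rng):
    """One draw from a noncentral chi-square: the sum of squares of df
    independent unit-variance normals whose squared means sum to ncp."""
    mu = math.sqrt(ncp)  # put all of the noncentrality on one component
    z = [rng.gauss(mu, 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(df - 1)]
    return sum(v * v for v in z)

rng = random.Random(12345)
draws = [ncx2_sample(6, 10, rng) for _ in range(100_000)]

# The central chi-square(6) has mean 6 and variance 12; the noncentral
# version has mean df + ncp = 16 and variance 2*(df + 2*ncp) = 52.
print(statistics.fmean(draws))      # close to 16
print(statistics.pvariance(draws))  # close to 52
```

The simulated mean and variance should land near 16 and 52, well to the right of (and wider than) the central chi-square with six degrees of freedom.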
The noncentral chi-square function, nchi2(), requires both a critical value of chi-square and a noncentrality parameter. The trick to computing power for chi-square is to use the critical value from the central chi-square distribution along with a noncentrality parameter from a noncentral chi-square distribution to compute the probability of rejecting the null hypothesis when it is false.
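If you would like to see the arithmetic behind this trick outside of Stata, here is a rough pure-Python sketch. The helper names chi2_cdf, chi2_inv, and nchi2_cdf are ours, standing in for Stata's invchi2() and nchi2(); this is an illustration of the idea, not Stata's implementation.

```python
import math

def chi2_cdf(x, df):
    """Central chi-square CDF via the series for the regularized
    lower incomplete gamma function P(df/2, x/2)."""
    if x <= 0:
        return 0.0
    a, t = df / 2.0, x / 2.0
    term = total = 1.0 / a
    k = 0
    while term > total * 1e-15:
        k += 1
        term *= t / (a + k)
        total += term
    return total * math.exp(-t + a * math.log(t) - math.lgamma(a))

def chi2_inv(p, df):
    """Inverse central chi-square CDF by bisection (plays the role of invchi2())."""
    lo, hi = 0.0, 1000.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def nchi2_cdf(x, df, ncp):
    """Noncentral chi-square CDF as a Poisson(ncp/2)-weighted mixture
    of central chi-square CDFs (plays the role of nchi2())."""
    weight, total = math.exp(-ncp / 2.0), 0.0
    for j in range(500):
        total += weight * chi2_cdf(x, df + 2 * j)
        weight *= (ncp / 2.0) / (j + 1)
    return total

def power(df, ncp, alpha):
    """The trick: critical value from the central distribution,
    rejection probability from the noncentral distribution."""
    cv = chi2_inv(1 - alpha, df)
    return 1 - nchi2_cdf(cv, df, ncp)

print(round(chi2_inv(0.95, 1), 4))  # about 3.8415, matching invchi2(1, .95)
print(round(power(1, 0, 0.05), 2))  # 0.05: with ncp = 0, power is just alpha
```

With ncp = 0 the noncentral distribution collapses to the central one, so the power is exactly alpha; increasing ncp shifts the distribution to the right and pushes more of its mass past the critical value.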
So now we can go ahead and run the program and look at the graph it produces.
The power curves show that power increases as the noncentrality parameter increases and decreases as alpha gets smaller. This is exactly the way you would expect power to behave.
If you change the value of df near the top of the program to 6, you will get the power curves for six degrees of freedom.
And if we change df to 10, we get the following set of curves.
Now, let’s see how you might be able to make use of this information. To illustrate this we will download the hsblog dataset and run a logistic regression. By the way, it is just a coincidence that there are 200 observations in the hsblog dataset and 200 data points used in the plotting of the power curves.
use https://stats.idre.ucla.edu/stat/data/hsblog, clear
logit honcomp female, nolog

Logistic regression                               Number of obs   =        200
                                                  LR chi2(1)      =       3.94
                                                  Prob > chi2     =     0.0473
Log likelihood = -113.6769                        Pseudo R2       =     0.0170

------------------------------------------------------------------------------
     honcomp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      female |   .6513707   .3336752     1.95   0.051    -.0026207    1.305362
       _cons |  -1.400088   .2631619    -5.32   0.000    -1.915876   -.8842998
------------------------------------------------------------------------------
We can use the likelihood ratio chi-square value (3.94) as an estimate of the noncentrality parameter with one degree of freedom. Note also that the p-value is 0.0473, which is very close to our alpha level of 0.05. Next we will rerun the code fragment above, setting df back to one. After the program runs, list ncp and p5 (p5 is the variable that contains the power for alpha equal to 0.05).
clist ncp p5 in 1/25

            ncp         p5
  1.          0        .05
  2.         .2   .0732097
  3.         .4   .0969355
  4.         .6   .1210593
  5.         .8   .1454725
  6.          1    .170075
  7.        1.2   .1947752
  8.        1.4   .2194893
  9.        1.6   .2441412
 10.        1.8   .2686618
 11.          2   .2929889
 12.        2.2   .3170667
 13.        2.4   .3408451
 14.        2.6     .36428
 15.        2.8   .3873324
 16.          3   .4099681
 17.        3.2   .4321576
 18.        3.4   .4538757
 19.        3.6   .4751009
 20.        3.8   .4958155
 21.          4   .5160053
 22.        4.2   .5356588
 23.        4.4   .5547677
 24.        4.6   .5733261
 25.        4.8   .5913305
Our estimate of the noncentrality parameter was 3.94, which falls between 3.8 and 4.0 in the listing above, with power falling between .4958155 and .5160053, i.e., a power of approximately .5. That is about what we would expect when the observed p-value (0.0473) is essentially equal to our alpha of 0.05.
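If you want to double-check that interpolation outside of Stata, here is a small self-contained Python sketch; chi2_cdf and nchi2_cdf are our own stand-ins for Stata's functions, and we reuse the critical value 3.8414588 computed earlier.

```python
import math

def chi2_cdf(x, df):
    # series form of the regularized lower incomplete gamma P(df/2, x/2)
    a, t = df / 2.0, x / 2.0
    term = total = 1.0 / a
    k = 0
    while term > total * 1e-15:
        k += 1
        term *= t / (a + k)
        total += term
    return total * math.exp(-t + a * math.log(t) - math.lgamma(a))

def nchi2_cdf(x, df, ncp):
    # Poisson(ncp/2)-weighted mixture of central chi-square CDFs
    weight, total = math.exp(-ncp / 2.0), 0.0
    for j in range(500):
        total += weight * chi2_cdf(x, df + 2 * j)
        weight *= (ncp / 2.0) / (j + 1)
    return total

cv = 3.8414588                      # invchi2(1, .95), from earlier
print(1 - nchi2_cdf(cv, 1, 3.94))   # roughly .51, between the tabulated values
```

Evaluating the power directly at ncp = 3.94 lands between the two tabulated values, as the interpolation suggests.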
Let’s try one more example, this time using a two degree of freedom test.
use https://stats.idre.ucla.edu/stat/data/hsblog, clear
logit honcomp i.prog, nolog

Logistic regression                               Number of obs   =        200
                                                  LR chi2(2)      =      16.15
                                                  Prob > chi2     =     0.0003
Log likelihood = -107.5719                        Pseudo R2       =     0.0698

------------------------------------------------------------------------------
     honcomp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        prog |
   academic  |   1.206168   .4577746     2.63   0.008     .3089465     2.10339
   vocation  |  -.3007541   .5988045    -0.50   0.615    -1.474389    .8728812
             |
       _cons |  -1.691676   .4113064    -4.11   0.000    -2.497822   -.8855303
------------------------------------------------------------------------------
We will need to run the code fragment one more time, setting df to 2; then we can list the results for a range of ncp values.
clist ncp p5 if ncp>15.5 & ncp<17.5

            ncp         p5
 79.       15.6   .9519728
 80.       15.8   .9543857
 81.         16   .9566863
 82.       16.2   .9588793
 83.       16.4   .9609691
 84.       16.6   .9629601
 85.       16.8   .9648565
 86.         17   .9666622
 87.       17.2   .9683813
 88.       17.4   .9700173
Using the likelihood ratio chi-square of 16.15 as the ncp estimate we see that the estimated power falls between .9566863 and .9588793, which again seems reasonable given the very small p-value.
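The same kind of cross-check works here. Again this is a Python sketch with our own helper names rather than Stata; one convenience is that for df = 2 the central chi-square CDF has the closed form 1 - exp(-x/2), so the .05 critical value can be written directly instead of being looked up with invchi2().

```python
import math

def chi2_cdf(x, df):
    # series form of the regularized lower incomplete gamma P(df/2, x/2)
    a, t = df / 2.0, x / 2.0
    term = total = 1.0 / a
    k = 0
    while term > total * 1e-15:
        k += 1
        term *= t / (a + k)
        total += term
    return total * math.exp(-t + a * math.log(t) - math.lgamma(a))

def nchi2_cdf(x, df, ncp):
    # Poisson(ncp/2)-weighted mixture of central chi-square CDFs
    weight, total = math.exp(-ncp / 2.0), 0.0
    for j in range(500):
        total += weight * chi2_cdf(x, df + 2 * j)
        weight *= (ncp / 2.0) / (j + 1)
    return total

# With df = 2 the central CDF is 1 - exp(-x/2), so the .05 critical value
# is cv = -2*ln(.05) = 5.9915, the same number invchi2(2, .95) returns.
cv = -2 * math.log(0.05)
print(1 - nchi2_cdf(cv, 2, 16.15))  # about .958, between the tabulated values
```

The directly computed power at ncp = 16.15 falls between the two tabulated values, matching the interpolation above.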
Please note that we are not advocating post-hoc power analysis with these two examples; rather, we are just demonstrating how the power curves work.