I have to do an analysis of variance on some test scores that were given to me as percentile scores. My question is, “How should I analyze percentile rank data?”
The problem, of course, is that percentile rank data are not normally distributed. Percentile ranks are ordinal and, in the norming population, form a rectangular (uniform) distribution. The easiest solution is to transform the percentile ranks into z-scores (standard normal scores) using the inverse normal function. If the percentile ranks are uniformly distributed, the resulting z-scores will be approximately normally distributed with a mean of zero and a standard deviation of one. For percentile ranks of 1 through 99, the z-scores range from -2.33 to +2.33. In Stata, the transformation would look like this:
generate zscore = invnorm(pctrank/100)
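As a quick check on the transformation, here is a minimal sketch that is not part of the original analysis. It uses invnormal(), the name under which current Stata releases document the inverse normal function, and writes to a hypothetical variable zscore2 so that it does not collide with zscore. It assumes the percentile ranks are stored in pctrank as above. A percentile rank of exactly 0 or 100, if one is present, has no finite z-score and should come out missing.

```stata
* Sketch: same transformation, written with invnormal()
* zscore2 is a hypothetical name chosen to avoid overwriting zscore
generate zscore2 = invnormal(pctrank/100)

* Count cases that had a percentile rank but no finite z-score
* (expected only for ranks of exactly 0 or 100)
count if missing(zscore2) & !missing(pctrank)

* Inspect the mean, standard deviation, and range of the transformed scores
summarize zscore2
```

If the count is not zero, those cases will drop out of any later analysis of the transformed scores, so it is worth deciding how to handle them before going on.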
Specialists in testing often transform percentile ranks into NCE (normal curve equivalent) scores. NCEs are a type of standardized score with a mean of 50 and a standard deviation of 21.06. NCEs range from 1 to 99 and in many ways look a lot like percentile ranks; the two scales coincide at 1, 50, and 99. Here is how the NCE transformation would look in Stata:
generate nce = invnorm(pctrank/100)*21.06 + 50
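The constant 21.06 comes from requiring that an NCE of 99 line up with a percentile rank of 99 (and, by symmetry, 1 with 1). That leaves 49 NCE points between the median and the 99th percentile, so the standard deviation is 49 divided by the z-score at the 99th percentile. The line below is a sketch of that arithmetic, again using invnormal():

```stata
* 49 NCE points separate the median (50) from the 99th percentile (99)
display 49/invnormal(.99)
```

The result is approximately 21.06, the multiplier used in the transformation above.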
Here is a table that gives a range of percentile rank scores and their equivalent z-scores and NCE scores:
       pctrank      zscore        nce
  1.         1   -2.326348   1.007114
  2.         2   -2.053749   6.748048
  3.         3   -1.880794   10.39049
  4.         4   -1.750686   13.13055
  5.         5   -1.644854   15.35938
  6.        10   -1.281552   23.01052
  7.        20   -.8416212   32.27546
  8.        25   -.6744897   35.79525
  9.        30   -.5244005   38.95612
 10.        40   -.2533471   44.66451
 11.        50           0         50
 12.        60    .2533471   55.33549
 13.        70    .5244005   61.04388
 14.        75    .6744897   64.20476
 15.        80    .8416212   67.72454
 16.        90    1.281552   76.98948
 17.        95    1.644854   84.64062
 18.        96    1.750686   86.86945
 19.        97    1.880794   89.60951
 20.        98    2.053749   93.25195
 21.        99    2.326348   98.99289
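With the scores transformed, the analysis of variance from the original question can be run on the new variable. The following is only a sketch under stated assumptions: it assumes nce has already been created as shown earlier, and group is a hypothetical name for the grouping factor, since the question does not say what the factors are.

```stata
* Sketch only: "group" is a hypothetical grouping variable,
* not something named in the original question

* One-way analysis of variance with group means and standard deviations
oneway nce group, tabulate

* The same one-factor model using the anova command
anova nce group
```

The z-scores would give the same F-ratio as the NCEs in this model, because one is a linear transformation of the other. NCEs are often easier to report because they are on a scale that resembles percentile ranks.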