Among the many user-contributed packages, pastecs provides an easy-to-use function called stat.desc that displays a table of descriptive statistics for a set of variables. You can install the package and then load it into memory as shown below, assuming that your computer is connected to the internet. We will illustrate the function using the hs0 data file.
install.packages("pastecs")
library(pastecs)
hs0<-read.table("https://stats.idre.ucla.edu/stat/data/hs0.csv", sep=",", header=T)
head(hs0)

   id female  race    ses schtyp     prog read write math science socst
1  70   male white    low public  general   57    52   41      47    57
2 121 female white middle public vocation   68    59   53      63    61
3  86   male white   high public  general   44    33   54      58    31
4 141   male white   high public vocation   63    44   47      53    56
5 172   male white middle public academic   47    52   57      53    61
6 113   male white middle public academic   44    52   51      63    61
Let’s say we want a table of descriptive statistics for test scores.
attach(hs0)
scores<-cbind(read, write, math, science, socst)
stat.desc(scores)

                     read        write         math      science        socst
nbr.val      2.000000e+02 2.000000e+02 2.000000e+02 1.950000e+02 2.000000e+02
nbr.null     0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
nbr.na       0.000000e+00 0.000000e+00 0.000000e+00 5.000000e+00 0.000000e+00
min          2.800000e+01 3.100000e+01 3.300000e+01 2.600000e+01 2.600000e+01
max          7.600000e+01 6.700000e+01 7.500000e+01 7.400000e+01 7.100000e+01
range        4.800000e+01 3.600000e+01 4.200000e+01 4.800000e+01 4.500000e+01
sum          1.044600e+04 1.055500e+04 1.052900e+04 1.007400e+04 1.048100e+04
median       5.000000e+01 5.400000e+01 5.200000e+01 5.300000e+01 5.200000e+01
mean         5.223000e+01 5.277500e+01 5.264500e+01 5.166154e+01 5.240500e+01
SE.mean      7.249921e-01 6.702372e-01 6.624493e-01 7.065208e-01 7.591352e-01
CI.mean.0.95 1.429653e+00 1.321679e+00 1.306321e+00 1.393448e+00 1.496982e+00
var          1.051227e+02 8.984359e+01 8.776781e+01 9.733846e+01 1.152573e+02
std.dev      1.025294e+01 9.478586e+00 9.368448e+00 9.866026e+00 1.073579e+01
coef.var     1.963036e-01 1.796037e-01 1.779551e-01 1.909743e-01 2.048620e-01
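To make the lower rows of the table less opaque, here is a minimal base R sketch of how they appear to be computed. The vector x is hypothetical (it is not the hs0 data), and the formulas are an assumption based on the numbers above: for read, 10.25294 / sqrt(200) gives the SE.mean shown, and multiplying that by the t quantile gives CI.mean.0.95, which therefore looks like the half-width of the interval rather than its two endpoints.

```r
# Hypothetical scores with one missing value (illustration only, not hs0)
x <- c(57, 68, 44, 63, 47, NA, 52)

n        <- sum(!is.na(x))                  # non-missing count, as in nbr.val
m        <- mean(x, na.rm = TRUE)           # mean
s        <- sd(x, na.rm = TRUE)             # std.dev
se_mean  <- s / sqrt(n)                     # SE.mean: standard error of the mean
ci_half  <- qt(0.975, df = n - 1) * se_mean # CI.mean.0.95: half-width (assumed formula)
coef_var <- s / m                           # coef.var: std.dev divided by the mean

round(c(mean = m, SE.mean = se_mean, CI.mean.0.95 = ci_half,
        var = s^2, std.dev = s, coef.var = coef_var), 3)
```

Under this reading, a 95% confidence interval for the mean would be mean - CI.mean.0.95 to mean + CI.mean.0.95.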
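Well, you may not like the scientific notation that much. You can change the display format by setting two options: scipen, a penalty that biases R toward fixed rather than scientific notation, and digits, the suggested number of significant digits to print.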
options(scipen=100)
options(digits=2)
stat.desc(scores)

                 read    write     math  science    socst
nbr.val        200.00   200.00   200.00   195.00   200.00
nbr.null         0.00     0.00     0.00     0.00     0.00
nbr.na           0.00     0.00     0.00     5.00     0.00
min             28.00    31.00    33.00    26.00    26.00
max             76.00    67.00    75.00    74.00    71.00
range           48.00    36.00    42.00    48.00    45.00
sum          10446.00 10555.00 10529.00 10074.00 10481.00
median          50.00    54.00    52.00    53.00    52.00
mean            52.23    52.77    52.65    51.66    52.41
SE.mean          0.72     0.67     0.66     0.71     0.76
CI.mean.0.95     1.43     1.32     1.31     1.39     1.50
var            105.12    89.84    87.77    97.34   115.26
std.dev         10.25     9.48     9.37     9.87    10.74
coef.var         0.20     0.18     0.18     0.19     0.20
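Because options() changes the display for the rest of the R session, it may be worth restoring the previous settings afterwards. The sketch below shows one way to do that in base R; the vector x simply reuses three values from the read column above, and the name old is arbitrary.

```r
# Three values from the read column: nbr.val, SE.mean and sum
x <- c(2.000000e+02, 7.249921e-01, 1.044600e+04)

# options() returns the previous values of the settings it changes
old <- options(scipen = 100, digits = 2)
print(x)      # printed with the temporary settings

# Restore whatever was in effect before
options(old)
print(x)      # printed with the original settings
```

If only a single table needs tidying, rounding the result (for example with round(..., 2)) is an alternative that leaves the session options untouched.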
What if we want only the descriptive statistics, such as the mean, median and standard deviation? We can set basic=F to suppress the basic statistics, as shown below.
stat.desc(scores, basic=F)

                read   write    math science   socst
median         50.00   54.00   52.00   53.00   52.00
mean           52.23   52.77   52.65   51.66   52.41
SE.mean         0.72    0.67    0.66    0.71    0.76
CI.mean.0.95    1.43    1.32    1.31    1.39    1.50
var           105.12   89.84   87.77   97.34  115.26
std.dev        10.25    9.48    9.37    9.87   10.74
coef.var        0.20    0.18    0.18    0.19    0.20
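The table above still has seven rows. If exactly three statistics are wanted, such as the mean, median and standard deviation, one option is to build the table directly in base R. This is a sketch that does not use pastecs; scores_demo and pick are hypothetical names, and the data are made up for illustration.

```r
# Hypothetical stand-in for the scores (illustration only, not hs0)
scores_demo <- data.frame(
  read    = c(57, 68, 44, 63, 47, 44),
  science = c(47, 63, 58, 53, NA, 63)
)

# Compute the three statistics for one column, ignoring missing values
pick <- function(v) {
  c(mean    = mean(v, na.rm = TRUE),
    median  = median(v, na.rm = TRUE),
    std.dev = sd(v, na.rm = TRUE))
}

# sapply() applies pick() to each column and binds the results into a matrix
summary_tbl <- sapply(scores_demo, pick)
round(summary_tbl, 2)
```

The result has one row per statistic and one column per variable, the same orientation as the stat.desc output.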
In the same fashion, we can display only the basic statistics, such as the number of observations and the number of missing values, by setting desc=F.
stat.desc(scores, desc=F)

            read   write    math science   socst
nbr.val      200     200     200     195     200
nbr.null       0       0       0       0       0
nbr.na         0       0       0       5       0
min           28      31      33      26      26
max           76      67      75      74      71
range         48      36      42      48      45
sum        10446   10555   10529   10074   10481
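As a cross-check on what the basic rows mean, the counts can be reproduced in base R. The sketch below uses a hypothetical data frame, scores_demo, not the hs0 data. It assumes that nbr.val counts non-missing values (consistent with 195 + 5 = 200 for science above) and that nbr.null counts values equal to zero, which is my reading of the pastecs documentation rather than something shown in the output.

```r
# Hypothetical data with one zero and one missing value (illustration only)
scores_demo <- data.frame(
  read    = c(57, 68, 44, 63, 47, 44),
  science = c(47, 0, 58, 53, NA, 63)
)

basic_tbl <- rbind(
  nbr.val  = colSums(!is.na(scores_demo)),             # non-missing values
  nbr.null = colSums(scores_demo == 0, na.rm = TRUE),  # zeros (assumed meaning)
  nbr.na   = colSums(is.na(scores_demo)),              # missing values
  min      = sapply(scores_demo, min, na.rm = TRUE),
  max      = sapply(scores_demo, max, na.rm = TRUE),
  sum      = colSums(scores_demo, na.rm = TRUE)
)
basic_tbl
```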