Introduction

A table is a structured arrangement of data, typically organized in rows and columns. It helps you see and compare data easily, making it simpler to understand and communicate the results.

Key components of a statistical table:

  • Title: Clearly describes the content and context of the table.

  • Rows: Represent different categories, groups, or individual data points.

  • Columns: Indicate variables or measures being reported.

  • Cells: Contain the actual data values corresponding to the intersection of rows and columns.

  • Headings: Labels for rows and columns to clarify the data being presented.

  • Footnotes: Additional information or explanations about the data.

We will explore how to generate and present tables of data effectively in R, with a focus on using RMarkdown to create professional reports.

In this workshop we will cover the R packages kableExtra, flextable, gt, gtExtras, DT, table1, sjPlot, and gtsummary.

Basic Report and Result Tables in R

  • A data.frame in R is a type of data structure used to store data in a table format. It is one of the most common and versatile structures in R, allowing for the storage of different types of data (e.g., numeric, character, factor) in a single object.

There are two generic functions in R that are commonly used to display output in the console.

  • The print() function is used to display the contents of an object in the console. It’s the most basic way to output data or results in R.

  • The summary() function provides a quick overview of the main statistical features of an object.

Both functions above work on various object types, such as vectors, data frames, and models.

Note: Implicit Printing: When we type an object’s name and run it, R internally calls the print() function to display the object’s contents. This is why you see the output in the console even if you don’t explicitly use print().
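
As a minimal sketch of implicit printing and of how summary() adapts to the object it receives, the snippet below uses a short numeric vector with arbitrary illustrative values (the name scores is hypothetical and not part of the workshop data).

```r
# A small numeric vector with arbitrary values, standing in for any R object
scores <- c(34, 39, 42, 50, 63)

# Typing the name at top level auto-prints the object
scores

# The same console output, requested explicitly
print(scores)

# summary() dispatches on the class of its argument
summary(scores)                           # numeric: quartiles, mean, min and max
summary(factor(c("low", "high", "low")))  # factor: counts per level

# capture.output() records what each form prints, so the two can be compared
identical(capture.output(scores), capture.output(print(scores)))
```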

Review of R basic outputs

The first data set we use in this workshop is hsbdemo, a sample of high school performance data for 200 students.

The first step in any statistical analysis is to understand our data.

Note: The datasets used in this workshop are not real and are intended solely to demonstrate statistical analysis.

#Read the data
hsb <- read.csv("https://stats.idre.ucla.edu/stat/data/hsbdemo.csv")
#Names of columns
names(hsb)
##  [1] "id"      "female"  "ses"     "schtyp"  "prog"    "read"    "write"  
##  [8] "math"    "science" "socst"   "honors"  "awards"  "cid"
#Structure of data.frame
str(hsb)
## 'data.frame':    200 obs. of  13 variables:
##  $ id     : int  45 108 15 67 153 51 164 133 2 53 ...
##  $ female : chr  "female" "male" "male" "male" ...
##  $ ses    : chr  "low" "middle" "high" "low" ...
##  $ schtyp : chr  "public" "public" "public" "public" ...
##  $ prog   : chr  "vocation" "general" "vocation" "vocation" ...
##  $ read   : int  34 34 39 37 39 42 31 50 39 34 ...
##  $ write  : int  35 33 39 37 31 36 36 31 41 37 ...
##  $ math   : int  41 41 44 42 40 42 46 40 33 46 ...
##  $ science: int  29 36 26 33 39 31 39 34 42 39 ...
##  $ socst  : int  26 36 42 32 51 39 46 31 41 31 ...
##  $ honors : chr  "not enrolled" "not enrolled" "not enrolled" "not enrolled" ...
##  $ awards : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ cid    : int  1 1 1 1 1 1 1 1 1 1 ...
#Change categorical variables from character to factors
hsb <- within(hsb,{
             female <- factor(female)
             ses <- factor(ses)
             schtyp <- factor(schtyp)
             prog <- factor(prog)
             honors <- factor(honors)
             })
#Print first 6 rows of data
hsb6 <- head(hsb)
print(hsb6)
##    id female    ses schtyp     prog read write math science socst       honors
## 1  45 female    low public vocation   34    35   41      29    26 not enrolled
## 2 108   male middle public  general   34    33   41      36    36 not enrolled
## 3  15   male   high public vocation   39    39   44      26    42 not enrolled
## 4  67   male    low public vocation   37    37   42      33    32 not enrolled
## 5 153   male middle public vocation   39    31   40      39    51 not enrolled
## 6  51 female   high public  general   42    36   42      31    39 not enrolled
##   awards cid
## 1      0   1
## 2      0   1
## 3      0   1
## 4      0   1
## 5      0   1
## 6      0   1

Printing the first 6 rows of the data gives a tabular view of the first 6 observations.

We can use the summary() function to report summary statistics. For example, we can get summary statistics for the students' scores and honors enrollment.

#Summary statistics
summary(hsb[c("read", "write", "math", "science", "socst", "honors")])
##       read           write            math          science     
##  Min.   :28.00   Min.   :31.00   Min.   :33.00   Min.   :26.00  
##  1st Qu.:44.00   1st Qu.:45.75   1st Qu.:45.00   1st Qu.:44.00  
##  Median :50.00   Median :54.00   Median :52.00   Median :53.00  
##  Mean   :52.23   Mean   :52.77   Mean   :52.65   Mean   :51.85  
##  3rd Qu.:60.00   3rd Qu.:60.00   3rd Qu.:59.00   3rd Qu.:58.00  
##  Max.   :76.00   Max.   :67.00   Max.   :75.00   Max.   :74.00  
##      socst                honors   
##  Min.   :26.00   enrolled    : 53  
##  1st Qu.:46.00   not enrolled:147  
##  Median :52.00                     
##  Mean   :52.41                     
##  3rd Qu.:61.00                     
##  Max.   :71.00

Contingency Table with table() and xtabs()

The table() function from base R creates frequency tables that summarize categorical data. We can also use the xtabs() function from the stats package, which takes a formula.

In our data we want to cross-tabulate the variables ses and honors in a contingency table.

#Two-way contingency table with table()
tab1 <- table(hsb$ses, hsb$honors)
tab1
##         
##          enrolled not enrolled
##   high         26           32
##   low          11           36
##   middle       16           79
#Proportional table
prop.table(tab1)
##         
##          enrolled not enrolled
##   high      0.130        0.160
##   low       0.055        0.180
##   middle    0.080        0.395
#Proportional table by row
prop.table(tab1, margin = 1)
##         
##           enrolled not enrolled
##   high   0.4482759    0.5517241
##   low    0.2340426    0.7659574
##   middle 0.1684211    0.8315789
#Two-way contingency table with xtabs()
tab2 <- xtabs(~ ses + honors, data = hsb)
tab2
##         honors
## ses      enrolled not enrolled
##   high         26           32
##   low          11           36
##   middle       16           79
#Proportional table
prop.table(tab2)
##         honors
## ses      enrolled not enrolled
##   high      0.130        0.160
##   low       0.055        0.180
##   middle    0.080        0.395
#Proportional table by row
prop.table(tab2, margin = 1)
##         honors
## ses       enrolled not enrolled
##   high   0.4482759    0.5517241
##   low    0.2340426    0.7659574
##   middle 0.1684211    0.8315789
#Summary on a table object will perform a chi-squared test
summary(tab2)
## Call: xtabs(formula = ~ses + honors, data = hsb)
## Number of cases in table: 200 
## Number of factors: 2 
## Test for independence of all factors:
##  Chisq = 14.783, df = 2, p-value = 0.0006164
#Three-way cross tab
tab3 <- xtabs(~ ses + honors + female, data = hsb)
tab3
## , , female = female
## 
##         honors
## ses      enrolled not enrolled
##   high         15           14
##   low          10           22
##   middle       10           38
## 
## , , female = male
## 
##         honors
## ses      enrolled not enrolled
##   high         11           18
##   low           1           14
##   middle        6           41
ftable(tab3)
##                     female female male
## ses    honors                         
## high   enrolled                15   11
##        not enrolled            14   18
## low    enrolled                10    1
##        not enrolled            22   14
## middle enrolled                10    6
##        not enrolled            38   41
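
The examples above show overall and row proportions. As a small sketch of two related steps, column proportions and marginal totals, the snippet below uses the built-in mtcars data as a stand-in for hsb, with cyl and am playing the role of the two categorical variables.

```r
# mtcars ships with R; cyl and am stand in for two categorical variables
tab <- xtabs(~ cyl + am, data = mtcars)

# Column proportions: margin = 2 makes each column sum to 1
col_prop <- prop.table(tab, margin = 2)
round(col_prop, 3)

# Append row and column totals to the table of counts
tab_total <- addmargins(tab)
tab_total
```

The same calls work on a table built with table(), since both functions return objects of class table.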

Regression models

Regression models are often used to understand the relationship between a dependent variable and one or more independent variables.

In R we use the summary() function to extract and report the results of a regression model.

As an example, we use the hsb data to regress math score on reading score, writing score, and program type (prog).

#Run regression of math on read, write, and prog
m1 <- lm(math ~ read + write + prog, data = hsb)
lm.result <- summary(m1)
lm.result
## 
## Call:
## lm(formula = math ~ read + write + prog, data = hsb)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -19.257  -4.564  -0.211   4.271  17.527 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  19.20202    3.35561   5.722 3.91e-08 ***
## read          0.37186    0.05685   6.541 5.24e-10 ***
## write         0.29591    0.06149   4.812 2.98e-06 ***
## proggeneral  -2.87185    1.18968  -2.414  0.01670 *  
## progvocation -3.79862    1.23942  -3.065  0.00249 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.408 on 195 degrees of freedom
## Multiple R-squared:  0.5415, Adjusted R-squared:  0.5321 
## F-statistic: 57.57 on 4 and 195 DF,  p-value: < 2.2e-16
#Extracting coefficients table (it is a matrix)
lm.result$coefficients
##                Estimate Std. Error   t value     Pr(>|t|)
## (Intercept)  19.2020151 3.35561280  5.722357 3.911030e-08
## read          0.3718589 0.05684928  6.541138 5.240877e-10
## write         0.2959093 0.06149161  4.812190 2.984266e-06
## proggeneral  -2.8718518 1.18968055 -2.413969 1.670404e-02
## progvocation -3.7986171 1.23941526 -3.064846 2.486293e-03
#Adding Confidence interval for coefficients
lm.table <- cbind(lm.result$coefficients, confint(m1))
#Changing the names of columns
colnames(lm.table)[c(5,6)] <- c("LL", "UL")
#Round number to 4 digits and print
round(lm.table, 4)
##              Estimate Std. Error t value Pr(>|t|)      LL      UL
## (Intercept)   19.2020     3.3556  5.7224   0.0000 12.5841 25.8200
## read           0.3719     0.0568  6.5411   0.0000  0.2597  0.4840
## write          0.2959     0.0615  4.8122   0.0000  0.1746  0.4172
## proggeneral   -2.8719     1.1897 -2.4140   0.0167 -5.2181 -0.5256
## progvocation  -3.7986     1.2394 -3.0648   0.0025 -6.2430 -1.3542
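
The table packages introduced later generally expect a data frame rather than a matrix with row names. The sketch below shows one way to convert a coefficient matrix like lm.table; it fits a stand-in model on the built-in mtcars data, and the names fit, coef_mat, and coef_df are hypothetical.

```r
# mtcars ships with R; this model is only a stand-in for the workshop's m1
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Coefficient matrix plus 95% confidence limits
coef_mat <- cbind(coef(summary(fit)), confint(fit))
colnames(coef_mat)[5:6] <- c("LL", "UL")

# Move the row names into a regular column so table packages can display them;
# check.names = FALSE keeps headers such as "Std. Error" unchanged
coef_df <- data.frame(Term = rownames(coef_mat), round(coef_mat, 4),
                      row.names = NULL, check.names = FALSE)
coef_df
```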

Advanced Tables in R Using R Packages within RMarkdown

Advantages of Using R Markdown to Create Tables

  • Reproducibility: Embedding code directly in the document ensures that tables can be easily reproduced by anyone, reducing errors from manual copying.

  • Dynamic Updates: Changes to the data or analysis are automatically reflected in the tables when the document is re-rendered, eliminating the need for manual updates.

  • Automation: RMarkdown generates and formats tables automatically, streamlining the process and avoiding repetitive tasks like reformatting.

  • Consistency: Tables maintain a consistent format and style throughout the document, which is particularly useful for large reports.

  • Customization and Formatting: Advanced table customization options allow for professional and polished presentation, without needing to rely on external tools for formatting.

In summary, using RMarkdown to create tables ensures automation, reproducibility, and consistency, while also providing powerful customization and formatting options that are not available with simple copy-pasting from the console.

In the rest of the workshop we introduce some of the packages that provide these options, with examples.

knitr::kable and kableExtra

Advantages of knitr::kable and kableExtra

  • Simplicity: knitr::kable offers a straightforward way to create clean tables with minimal code. It’s easy to use for beginners and perfect for simple tables that don’t require extensive customization.

  • Integration with RMarkdown: kable() is designed to work seamlessly with RMarkdown, making it easy to generate tables that fit well within dynamic documents.

  • Flexibility with kableExtra: When paired with kableExtra, kable becomes highly customizable. You can add advanced features like multi-row headers, colors, borders, alignment adjustments, column spanning, custom styling, and footnotes.

  • Theme: kableExtra offers some alternative HTML table themes other than the default bootstrap theme.

The kable() function in package knitr is a very simple table generator. It only generates tables for strictly rectangular data such as matrices and data frames.

This function does have a large number of arguments for you to customize the appearance of tables:

kable(x, format, digits = getOption("digits"), row.names = NA, col.names = NA, align, caption = NULL, label = NULL, format.args = list(), escape = TRUE, ...)

The format argument is a character string. Possible values include latex, html, and pipe (Pandoc’s pipe tables).

If you only need one table format that is not the default format for a document, you can set the global R option knitr.table.format, e.g.,

options(knitr.table.format = "html")
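
To make a few of these arguments concrete, here is a minimal sketch that sets digits, col.names, align, and caption on a small slice of the built-in state.x77 matrix. The object names state_df and tab, the column labels, and the caption text are chosen only for illustration.

```r
library(knitr)

# state.x77 is a built-in matrix; move its row names into a State column
state_df <- data.frame(State = rownames(state.x77), state.x77, row.names = NULL)
state_df <- state_df[1:5, c("State", "Population", "Income", "Life.Exp")]

tab <- kable(
  state_df,
  format    = "pipe",   # Pandoc pipe table
  digits    = 1,        # round numeric columns to 1 decimal place
  col.names = c("State", "Population", "Income", "Life expectancy"),
  align     = "lrrr",   # left-align the first column, right-align the rest
  caption   = "Selected columns of state.x77"
)
tab
```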

We are using the built-in data set state.x77, keeping its first 7 rows.

state7 <- data.frame(state.x77)[1:7,]

knitr::kable(head(state7), format = "html")
Population Income Illiteracy Life.Exp Murder HS.Grad Frost Area
Alabama 3615 3624 2.1 69.05 15.1 41.3 20 50708
Alaska 365 6315 1.5 69.31 11.3 66.7 152 566432
Arizona 2212 4530 1.8 70.55 7.8 58.1 15 113417
Arkansas 2110 3378 1.9 70.66 10.1 39.9 65 51945
California 21198 5114 1.1 71.71 10.3 62.6 20 156361
Colorado 2541 4884 0.7 72.06 6.8 63.9 166 103766

If we use the pipe format, kable() produces a Pandoc pipe table, and how it is rendered depends on the R Markdown output format.

For example, since these slides were created with ioslides, if the format is not specified or format = "pipe" is used, the output table looks like this:

my.table <- knitr::kable(state7, format = "pipe")
my.table
Population Income Illiteracy Life.Exp Murder HS.Grad Frost Area
Alabama 3615 3624 2.1 69.05 15.1 41.3 20 50708
Alaska 365 6315 1.5 69.31 11.3 66.7 152 566432
Arizona 2212 4530 1.8 70.55 7.8 58.1 15 113417
Arkansas 2110 3378 1.9 70.66 10.1 39.9 65 51945
California 21198 5114 1.1 71.71 10.3 62.6 20 156361
Colorado 2541 4884 0.7 72.06 6.8 63.9 166 103766
Connecticut 3100 5348 1.1 72.48 3.1 56.0 139 4862

To learn more about knitr::kable() and its options, you can check out the link below:

rmarkdown-cookbook, 10.1 The function knitr::kable()

kableExtra

The kableExtra package extends knitr::kable(). Its goal is to help you build common complex tables and manipulate table styles. It imports the pipe symbol %>% from magrittr (it also works with the base R pipe, |>) and names its functions as verbs, so you can add “layers” to a kable output in a way that is similar to ggplot2.

The basic HTML output is just a plain HTML table without any styling.

#Load kableExtra, which provides kbl() and the %>% pipe
library(kableExtra)
#plain HTML
kbl(state7)
Population Income Illiteracy Life.Exp Murder HS.Grad Frost Area
Alabama 3615 3624 2.1 69.05 15.1 41.3 20 50708
Alaska 365 6315 1.5 69.31 11.3 66.7 152 566432
Arizona 2212 4530 1.8 70.55 7.8 58.1 15 113417
Arkansas 2110 3378 1.9 70.66 10.1 39.9 65 51945
California 21198 5114 1.1 71.71 10.3 62.6 20 156361
Colorado 2541 4884 0.7 72.06 6.8 63.9 166 103766
Connecticut 3100 5348 1.1 72.48 3.1 56.0 139 4862

Bootstrap theme

kable_styling() automatically applies the Twitter Bootstrap theme to the table.

To see more options for this function, please check the help file:

?kable_styling

state7 %>%
  kbl() %>%
  #twitter bootstrap theme
  kable_styling()
Population Income Illiteracy Life.Exp Murder HS.Grad Frost Area
Alabama 3615 3624 2.1 69.05 15.1 41.3 20 50708
Alaska 365 6315 1.5 69.31 11.3 66.7 152 566432
Arizona 2212 4530 1.8 70.55 7.8 58.1 15 113417
Arkansas 2110 3378 1.9 70.66 10.1 39.9 65 51945
California 21198 5114 1.1 71.71 10.3 62.6 20 156361
Colorado 2541 4884 0.7 72.06 6.8 63.9 166 103766
Connecticut 3100 5348 1.1 72.48 3.1 56.0 139 4862
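
Beyond the defaults, kable_styling() accepts several styling arguments. The following is a minimal sketch, not an exhaustive list; the particular combination of options is only an illustration, and state7 is rebuilt here so the snippet runs on its own.

```r
library(kableExtra)

# First 7 rows of the built-in state.x77 matrix, as in the examples above
state7 <- data.frame(state.x77)[1:7, ]

styled <- kbl(state7, format = "html") |>
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width = FALSE,   # do not stretch the table across the page
    position   = "left",  # place the table on the left
    font_size  = 12
  )
styled
```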

Alternative themes

kableExtra also offers 6 other alternative HTML table themes other than the default bootstrap theme. They are: kable_paper, kable_classic, kable_classic_2, kable_minimal, kable_material and kable_material_dark.

We can also use options in kable_styling() to customize the output table.

Here are some examples:

state7  %>%
  kbl() %>%
  #paper  theme with hover and full_width = F
  kable_paper("hover", full_width = F)
Population Income Illiteracy Life.Exp Murder HS.Grad Frost Area
Alabama 3615 3624 2.1 69.05 15.1 41.3 20 50708
Alaska 365 6315 1.5 69.31 11.3 66.7 152 566432
Arizona 2212 4530 1.8 70.55 7.8 58.1 15 113417
Arkansas 2110 3378 1.9 70.66 10.1 39.9 65 51945
California 21198 5114 1.1 71.71 10.3 62.6 20 156361
Colorado 2541 4884 0.7 72.06 6.8 63.9 166 103766
Connecticut 3100 5348 1.1 72.48 3.1 56.0 139 4862

Full width

state7 %>%
  kbl(caption = "Recreating booktabs style table") %>%
  # classic theme  and other options
  kable_classic(full_width = F, html_font = "Cambria",  position = "left")
Recreating booktabs style table
Population Income Illiteracy Life.Exp Murder HS.Grad Frost Area
Alabama 3615 3624 2.1 69.05 15.1 41.3 20 50708
Alaska 365 6315 1.5 69.31 11.3 66.7 152 566432
Arizona 2212 4530 1.8 70.55 7.8 58.1 15 113417
Arkansas 2110 3378 1.9 70.66 10.1 39.9 65 51945
California 21198 5114 1.1 71.71 10.3 62.6 20 156361
Colorado 2541 4884 0.7 72.06 6.8 63.9 166 103766
Connecticut 3100 5348 1.1 72.48 3.1 56.0 139 4862

Striped

state7 %>%
  kbl() %>%
  #material theme with striped rows
  kable_material(lightable_options= c("striped"))
Population Income Illiteracy Life.Exp Murder HS.Grad Frost Area
Alabama 3615 3624 2.1 69.05 15.1 41.3 20 50708
Alaska 365 6315 1.5 69.31 11.3 66.7 152 566432
Arizona 2212 4530 1.8 70.55 7.8 58.1 15 113417
Arkansas 2110 3378 1.9 70.66 10.1 39.9 65 51945
California 21198 5114 1.1 71.71 10.3 62.6 20 156361
Colorado 2541 4884 0.7 72.06 6.8 63.9 166 103766
Connecticut 3100 5348 1.1 72.48 3.1 56.0 139 4862

Column / Row Specification

kbl(state7) %>%
  #paper theme 
  kable_paper(full_width = F) %>%
  #Make first column bold and add border
  column_spec(1, bold = T, border_right = T) %>%
  #Make column 9 width larger and background yellow
  column_spec(9, width = "6em", background = "yellow")
Population Income Illiteracy Life.Exp Murder HS.Grad Frost Area
Alabama 3615 3624 2.1 69.05 15.1 41.3 20 50708
Alaska 365 6315 1.5 69.31 11.3 66.7 152 566432
Arizona 2212 4530 1.8 70.55 7.8 58.1 15 113417
Arkansas 2110 3378 1.9 70.66 10.1 39.9 65 51945
California 21198 5114 1.1 71.71 10.3 62.6 20 156361
Colorado 2541 4884 0.7 72.06 6.8 63.9 166 103766
Connecticut 3100 5348 1.1 72.48 3.1 56.0 139 4862
kbl(state7) %>%
  #paper theme
  kable_paper(full_width = F) %>%
  #Column 2 (Population): text color scaled by value from black to red
  column_spec(2, color = spec_color(state7$Population, palette = c("black", "red"))) %>%
  #Column 4 (Illiteracy): white text, green background if <= 1.5, red otherwise
  column_spec(4, color = "white",
              background = spec_color(state7$Illiteracy <= 1.5, palette = c("red", "green"))) %>%
  #Rotate the header row (row 0)
  row_spec(0, angle = -45)
Population Income Illiteracy Life.Exp Murder HS.Grad Frost Area
Alabama 3615 3624 2.1 69.05 15.1 41.3 20 50708
Alaska 365 6315 1.5 69.31 11.3 66.7 152 566432
Arizona 2212 4530 1.8 70.55 7.8 58.1 15 113417
Arkansas 2110 3378 1.9 70.66 10.1 39.9 65 51945
California 21198 5114 1.1 71.71 10.3 62.6 20 156361
Colorado 2541 4884 0.7 72.06 6.8 63.9 166 103766
Connecticut 3100 5348 1.1 72.48 3.1 56.0 139 4862
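
The grouped headers and footnotes mentioned in the list of kableExtra advantages can be sketched as follows. The group labels and the footnote text are made up for illustration, and state7 is rebuilt so the snippet is self-contained.

```r
library(kableExtra)

state7 <- data.frame(state.x77)[1:7, ]

grouped <- kbl(state7, format = "html") |>
  kable_paper(full_width = FALSE) |>
  # Widths must add up to 9: the row-name column plus 8 data columns
  add_header_above(c(" " = 1, "People" = 4, "Other measures" = 4)) |>
  # General footnote placed under the table
  footnote(general = "Data: the state.x77 matrix from the R datasets package.")
grouped
```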

One nice feature of kableExtra, available only for the HTML format, is the scroll box. If you have a huge table and want to include it in a website or HTML document without using a lot of space, a scroll box is a good solution.

kbl(state.x77) %>%
  kable_paper() %>%
  #Add scroll bar
  scroll_box(width = "400px", height = "200px")
Population Income Illiteracy Life Exp Murder HS Grad Frost Area
Alabama 3615 3624 2.1 69.05 15.1 41.3 20 50708
Alaska 365 6315 1.5 69.31 11.3 66.7 152 566432
Arizona 2212 4530 1.8 70.55 7.8 58.1 15 113417
Arkansas 2110 3378 1.9 70.66 10.1 39.9 65 51945
California 21198 5114 1.1 71.71 10.3 62.6 20 156361
Colorado 2541 4884 0.7 72.06 6.8 63.9 166 103766
Connecticut 3100 5348 1.1 72.48 3.1 56.0 139 4862
Delaware 579 4809 0.9 70.06 6.2 54.6 103 1982
Florida 8277 4815 1.3 70.66 10.7 52.6 11 54090
Georgia 4931 4091 2.0 68.54 13.9 40.6 60 58073
Hawaii 868 4963 1.9 73.60 6.2 61.9 0 6425
Idaho 813 4119 0.6 71.87 5.3 59.5 126 82677
Illinois 11197 5107 0.9 70.14 10.3 52.6 127 55748
Indiana 5313 4458 0.7 70.88 7.1 52.9 122 36097
Iowa 2861 4628 0.5 72.56 2.3 59.0 140 55941
Kansas 2280 4669 0.6 72.58 4.5 59.9 114 81787
Kentucky 3387 3712 1.6 70.10 10.6 38.5 95 39650
Louisiana 3806 3545 2.8 68.76 13.2 42.2 12 44930
Maine 1058 3694 0.7 70.39 2.7 54.7 161 30920
Maryland 4122 5299 0.9 70.22 8.5 52.3 101 9891
Massachusetts 5814 4755 1.1 71.83 3.3 58.5 103 7826
Michigan 9111 4751 0.9 70.63 11.1 52.8 125 56817
Minnesota 3921 4675 0.6 72.96 2.3 57.6 160 79289
Mississippi 2341 3098 2.4 68.09 12.5 41.0 50 47296
Missouri 4767 4254 0.8 70.69 9.3 48.8 108 68995
Montana 746 4347 0.6 70.56 5.0 59.2 155 145587
Nebraska 1544 4508 0.6 72.60 2.9 59.3 139 76483
Nevada 590 5149 0.5 69.03 11.5 65.2 188 109889
New Hampshire 812 4281 0.7 71.23 3.3 57.6 174 9027
New Jersey 7333 5237 1.1 70.93 5.2 52.5 115 7521
New Mexico 1144 3601 2.2 70.32 9.7 55.2 120 121412
New York 18076 4903 1.4 70.55 10.9 52.7 82 47831
North Carolina 5441 3875 1.8 69.21 11.1 38.5 80 48798
North Dakota 637 5087 0.8 72.78 1.4 50.3 186 69273
Ohio 10735 4561 0.8 70.82 7.4 53.2 124 40975
Oklahoma 2715 3983 1.1 71.42 6.4 51.6 82 68782
Oregon 2284 4660 0.6 72.13 4.2 60.0 44 96184
Pennsylvania 11860 4449 1.0 70.43 6.1 50.2 126 44966
Rhode Island 931 4558 1.3 71.90 2.4 46.4 127 1049
South Carolina 2816 3635 2.3 67.96 11.6 37.8 65 30225
South Dakota 681 4167 0.5 72.08 1.7 53.3 172 75955
Tennessee 4173 3821 1.7 70.11 11.0 41.8 70 41328
Texas 12237 4188 2.2 70.90 12.2 47.4 35 262134
Utah 1203 4022 0.6 72.90 4.5 67.3 137 82096
Vermont 472 3907 0.6 71.64 5.5 57.1 168 9267
Virginia 4981 4701 1.4 70.08 9.5 47.8 85 39780
Washington 3559 4864 0.6 71.72 4.3 63.5 32 66570
West Virginia 1799 3617 1.4 69.48 6.7 41.6 100 24070
Wisconsin 4589 4468 0.7 72.48 3.0 54.5 149 54464
Wyoming 376 4566 0.6 70.29 6.9 62.9 173 97203

To learn more about table styles and options in kable_styling(), you can check the link below:

Create Awesome HTML Table with knitr::kable and kableExtra

Using the flextable

flextable is designed to create and format tables that can be easily exported into Word and PowerPoint documents. It allows users to create richly formatted tables with features like text formatting, colors, borders, and alignment, making it ideal for generating professional-looking tables in document reports.

Advantages of flextable Compared to Other Packages

  • Extensive Customization: flextable offers detailed control over the formatting of tables, including text alignment, fonts, colors, borders, and cell-level styling. This level of customization goes beyond what simpler packages like kable can offer, allowing for professional and polished tables.

  • Integration with Word and PowerPoint: One of the standout features of flextable is its seamless integration with Microsoft Word and PowerPoint through the officer package. You can directly export beautifully formatted tables into these documents, making it ideal for users who frequently work with Word and PowerPoint.

  • Conditional Formatting: The package allows for conditional formatting based on the values in the table, which is useful for highlighting key data points or making tables more informative visually.

  • Predefined Themes: flextable offers built-in themes that provide consistent, aesthetically pleasing styles for tables. This reduces the effort needed to style tables while maintaining a professional appearance.

The main function is flextable which takes a data.frame as argument and returns a flextable object.

#Load flextable
library(flextable)
#Default flextable of the first 10 rows, dropping column 13 (cid)
ft <- flextable(hsb[1:10, -13])
ft

id   female  ses     schtyp  prog      read  write  math  science  socst  honors        awards
45   female  low     public  vocation  34    35     41    29       26     not enrolled  0
108  male    middle  public  general   34    33     41    36       36     not enrolled  0
15   male    high    public  vocation  39    39     44    26       42     not enrolled  0
67   male    low     public  vocation  37    37     42    33       32     not enrolled  0
153  male    middle  public  vocation  39    31     40    39       51     not enrolled  0
51   female  high    public  general   42    36     42    31       39     not enrolled  0
164  male    middle  public  vocation  31    36     46    39       46     not enrolled  0
133  male    middle  public  vocation  50    31     40    34       31     not enrolled  0
2    female  middle  public  vocation  39    41     33    42       41     not enrolled  0
53   male    middle  public  vocation  34    37     46    39       31     not enrolled  0

ft |>
  #add header row
  add_header_row(
    colwidths = c(3, 2, 5, 2),
    values = c("Student", "School", "Grades", "Achievements")) |>
  #Use theme_vanilla
  theme_vanilla() |>
  #Add footer
  add_footer_lines("This data is simulated and it is not real") |>
  color(part = "footer", color = "#666666") |>
  #set Caption
  set_caption(caption = "First 10 rows of a sample of high school data") |>
  #Align header to center
  align(align = "center", part = "header", i = 1)
First 10 rows of a sample of high school data
Student              School            Grades                             Achievements
id   female  ses     schtyp  prog      read  write  math  science  socst  honors        awards
45   female  low     public  vocation  34    35     41    29       26     not enrolled  0
108  male    middle  public  general   34    33     41    36       36     not enrolled  0
15   male    high    public  vocation  39    39     44    26       42     not enrolled  0
67   male    low     public  vocation  37    37     42    33       32     not enrolled  0
153  male    middle  public  vocation  39    31     40    39       51     not enrolled  0
51   female  high    public  general   42    36     42    31       39     not enrolled  0
164  male    middle  public  vocation  31    36     46    39       46     not enrolled  0
133  male    middle  public  vocation  50    31     40    34       31     not enrolled  0
2    female  middle  public  vocation  39    41     33    42       41     not enrolled  0
53   male    middle  public  vocation  34    37     46    39       31     not enrolled  0
This data is simulated and it is not real
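
Two of the features listed earlier, conditional formatting and export to Word, can be sketched together. This example uses the built-in mtcars data as a stand-in for hsb; the cutoff of 21 and the object names are arbitrary choices for illustration, and the file is written to a temporary location.

```r
library(flextable)

# mtcars ships with R and stands in for the workshop data
cars <- head(mtcars[, c("mpg", "cyl", "hp")], 8)

ft_cars <- flextable(cars) |>
  # Highlight mpg cells above 21 (an arbitrary cutoff for illustration)
  bg(i = ~ mpg > 21, j = "mpg", bg = "yellow") |>
  bold(i = ~ mpg > 21, j = "mpg") |>
  # Adjust column widths to the content
  autofit()

# Write the table to a Word file in a temporary location
out_file <- tempfile(fileext = ".docx")
save_as_docx(ft_cars, path = out_file)
file.exists(out_file)
```

In a real report you would replace tempfile() with a path of your choice.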

The flextable package will not aggregate data for you, but it helps you present aggregated data. It also has some useful functions for generating descriptive statistics.

Cross tab with proc_freq()

The function proc_freq() computes a contingency table and creates a flextable from the result. It aims to reproduce the output of SAS PROC FREQ.

proc_freq(hsb, "ses", "honors",
          include.row_percent = TRUE,
          include.column_percent = TRUE,
          include.table_percent = TRUE)

ses                   honors
                      enrolled       not enrolled   Total
high    Count         26 (13.0%)     32 (16.0%)     58 (29.0%)
        Mar. pct (1)  49.1% ; 44.8%  21.8% ; 55.2%
low     Count         11 (5.5%)      36 (18.0%)     47 (23.5%)
        Mar. pct      20.8% ; 23.4%  24.5% ; 76.6%
middle  Count         16 (8.0%)      79 (39.5%)     95 (47.5%)
        Mar. pct      30.2% ; 16.8%  53.7% ; 83.2%
Total   Count         53 (26.5%)     147 (73.5%)    200 (100.0%)
(1) Columns and rows percentages

The flextable package offers much more flexibility than we can cover in this workshop, especially when used in conjunction with other packages.

For more on flextable you can check the links below:

Using flextable

flextable cheat sheet

Function reference (manuals)

flextable gallery

DT

The R package DT provides an R interface to the JavaScript library DataTables. R data objects (matrices or data frames) can be displayed as tables on HTML pages, and DataTables adds filtering, pagination, sorting, and many other interactive features to the tables.

Key Features of DT Package

  • Interactivity: DT creates interactive tables with features like sorting and searching, ideal for web use and Shiny apps. flextable and kableExtra focus on static tables.

  • JavaScript Integration: DT leverages DataTables for advanced client-side features like inline editing and exporting, making it great for web applications.

  • Ease of Use for Web Applications: DT is well suited to web applications and is easy to use and implement.

The main function in this package is datatable(). It creates an HTML widget to display R data objects with DataTables.

library(DT); library(ggplot2) #diamonds comes from ggplot2
datatable(diamonds[1:200,])

If you are familiar with the DataTables JavaScript library, you may use the options argument to customize the table.

We can add a filter argument to datatable() to automatically generate column filters. By default, the filters are not shown since filter = "none". You can enable them with filter = "top" or filter = "bottom".

#Add filter
datatable(diamonds[1:200,], filter = 'top', options = list(
  pageLength = 5, autoWidth = TRUE
))
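
As a further sketch of the options argument, the snippet below sets the page length, the page-length menu, and an initial sort order, and rounds one column for display. It uses the built-in iris data as a stand-in for diamonds so that it runs without ggplot2; the name widget is hypothetical.

```r
library(DT)

# iris ships with R and stands in for the diamonds data
widget <- datatable(
  iris,
  rownames = FALSE,
  filter = "top",
  options = list(
    pageLength = 5,                 # rows shown per page
    lengthMenu = c(5, 10, 25),      # choices in the page-length menu
    order = list(list(0, "desc"))   # sort by the first column; DataTables counts from 0
  )
) |>
  # Display Sepal.Length with one decimal place
  formatRound(columns = "Sepal.Length", digits = 1)
widget
```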

For more examples and options for package DT you can check the link below:

DT: An R interface to the DataTables library

gt

The gt package is built around the distinction between data tables (e.g., tibbles and data frames) and presentation tables, including summary tables; it is meant for building the latter.

Advantages of the gt Package

  • Customization:

    gt provides extensive options for customizing table appearance, including fonts, colors, borders, and spacing. This allows for creating visually appealing and professionally formatted tables.

  • Easy to Use:

    The package has a user-friendly syntax that simplifies the creation of complex tables. It’s designed to be intuitive and easy to learn, making table creation straightforward.

  • Integration with RMarkdown:

    gt integrates well with RMarkdown, enabling you to include sophisticated tables in dynamic documents. It supports rendering in HTML and integrates seamlessly into RMarkdown reports.

  • Publication-Ready Tables:

    gt is designed for generating publication-quality tables that are clean and well-formatted. It’s ideal for academic papers, reports, and presentations where table aesthetics are important.

Here we run a simple example from the package reference page. The gt package is very similar to flextable, but it currently supports HTML, LaTeX, and RTF output, whereas flextable is compatible with Microsoft software like Word and PowerPoint.

# Modify the `airquality` dataset by adding the year
# of the measurements (1973) and limiting to 10 rows
# (mutate() and slice() come from dplyr)
library(dplyr)
library(gt)
airquality_m <- 
  airquality |>
  #add year 1973
  mutate(Year = 1973L) |>
  #select the first 10 rows
  slice(1:10)

# Create a display table from the modified
# `airquality` data
gt_tbl <- 
  gt(airquality_m)
#Print gt table
gt_tbl
Ozone Solar.R Wind Temp Month Day Year
41 190 7.4 67 5 1 1973
36 118 8.0 72 5 2 1973
12 149 12.6 74 5 3 1973
18 313 11.5 62 5 4 1973
NA NA 14.3 56 5 5 1973
28 NA 14.9 66 5 6 1973
23 299 8.6 65 5 7 1973
19 99 13.8 59 5 8 1973
8 19 20.1 61 5 9 1973
NA 194 8.6 69 5 10 1973
gt_tbl |>
  #Add title and subtitle 
  tab_header(
    title = "New York Air Quality Measurements",
    subtitle = "Daily measurements in New York City (May 1-10, 1973)"
  ) |>
  #Span columns 
  tab_spanner(
    label = "Time",
    columns = c(Year, Month, Day)
  ) |>
  tab_spanner(
    label = "Measurement",
    columns = c(Ozone, Solar.R, Wind, Temp)
  )
New York Air Quality Measurements
Daily measurements in New York City (May 1-10, 1973)
Measurement Time
Ozone Solar.R Wind Temp Year Month Day
41 190 7.4 67 1973 5 1
36 118 8.0 72 1973 5 2
12 149 12.6 74 1973 5 3
18 313 11.5 62 1973 5 4
NA NA 14.3 56 1973 5 5
28 NA 14.9 66 1973 5 6
23 299 8.6 65 1973 5 7
19 99 13.8 59 1973 5 8
8 19 20.1 61 1973 5 9
NA 194 8.6 69 1973 5 10
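
Besides headers and spanners, gt has functions for formatting cell values and adding notes. The following is a minimal sketch under the assumption of a reasonably recent gt version (sub_missing() is not available in older releases); the wording of the source note is made up for illustration.

```r
library(gt)

air_tbl <- head(airquality, 10) |>
  gt() |>
  # Show wind speed with one decimal place
  fmt_number(columns = Wind, decimals = 1) |>
  # Display missing values as a dash instead of NA
  sub_missing(missing_text = "-") |>
  # Add a note under the table
  tab_source_note(source_note = "Data: airquality, from the R datasets package.")
air_tbl
```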

The reference for package gt

gt package

gtExtras

The gtExtras package provides additional functions to assist with gt, especially if you want to include plots in your tables.

Overall, there are four families of functions in gtExtras:

  • Themes: 7 themes that style almost every element of a gt table, built off of data journalism-styled tables

  • Utilities: Helper functions for aligning/padding numbers, adding fontawesome icons, images, highlighting, dividers, styling by group, creating two tables or two column layouts, extracting ordered data from a gt table internals, or generating a random dataset.

  • Plotting: 12 plotting functions for inline sparklines, win-loss charts, distributions (density/histogram), percentiles, dot + bar, bar charts, confidence intervals, or summarizing an entire dataframe!

  • Colors: 3 functions, a palette for “Hulk” style scale (purple/green), coloring rows with good defaults from paletteer, or adding a “color box” along with the cell value
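
As a small sketch of the plotting family listed above, gt_plt_bar() draws an inline bar chart for one column. The example uses a few columns of the built-in mtcars data; the name bar_tbl is hypothetical.

```r
library(gt)
library(gtExtras)

bar_tbl <- head(mtcars[, c("mpg", "cyl", "hp")]) |>
  gt() |>
  # Add an inline bar chart for mpg and keep the numeric column next to it
  gt_plt_bar(column = mpg, keep_column = TRUE)
bar_tbl
```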

library(gtExtras)
gt_tbl %>% 
  #Use the NYT theme
  gt_theme_nytimes() %>% 
  #Change header title
  tab_header(title = "Table styled like the NY Times") %>% 
  #Hulk data_color
  #Trim provides a tighter range of purple/green
  gt_hulk_col_numeric(Ozone, trim = TRUE)
Table styled like the NY Times
Ozone Solar.R Wind Temp Month Day Year
41 190 7.4 67 5 1 1973
36 118 8.0 72 5 2 1973
12 149 12.6 74 5 3 1973