Output Tables in R

Goal of this workshop

Tables are the main way we present statistical results in papers, reports, and dissertations.
Journals expect tables that are clear, properly labeled, and reproducible.
Manually formatted tables are time-consuming to update and easy to get wrong.
In this workshop, we’ll focus on creating publication-ready tables directly from R, so formatting and numbers stay in sync with your analysis.
We will introduce packages such as kableExtra, flextable, gt, gtExtras, DT, sjPlot, and gtsummary as examples of reproducible table creation in R.

From copy-paste to reproducible tables

A common workflow is:

run a model in R
copy estimates and p-values into Word or PowerPoint
adjust the table by hand

This is slow and error-prone. Any change in the code or data requires editing the table again.

A code-based workflow (for example using R Markdown):

keeps code, results, and text connected
generates tables directly from fitted models and data
updates tables automatically when the analysis changes
makes it easier to reproduce and share the results later

What is a statistical table?

A statistical table is a structured arrangement of data in rows and columns that helps readers see patterns and compare results.

Key components:

Title – briefly describes what the table shows and in what context
Rows – groups, categories, or observations
Columns – variables or summary measures
Cells – the numerical values or estimates
Headings – labels for rows and columns
Footnotes – details such as sample size, model type, or abbreviations

The following table is an example of a well-formatted statistical table created using the flextable package in R. It includes a title, clear column headings, and footnotes for additional context.

Table 1. Annual Population and Growth Rate, 2020–2024
Year	Population (millions)	Growth Rate (%)
2020	3.98†	1.2
2021	4.02	1.0
2022	4.05	0.8
2023	4.10	1.2
2024	4.12	0.5
†Population estimates based on mid-year counts.
‡ Source: National Statistics Department (2024).

Basic tables and output in R

For simple tabular results in R, you can put values into a small data.frame and print it.
The print() function is the most common way to display output in the console.
The summary() function provides a quick overview of the main features of an object.
Both print() and summary() are generic functions. They behave differently for different R objects (for example, data frames, model objects, and factors).
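
As a small illustration of this dispatch, the sketch below calls summary() on a numeric vector and on a factor. The two vectors are made-up values used only for this example.

```r
# summary() is generic: the method that runs depends on the class of its input
scores <- c(34, 39, 42, 50)                                        # numeric vector
prog   <- factor(c("general", "vocation", "general", "academic"))  # factor

num_summary <- summary(scores)  # minimum, quartiles, mean, maximum
fac_summary <- summary(prog)    # counts per factor level

num_summary
fac_summary

# The returned objects have different structures as well
class(num_summary)
class(fac_summary)
```

The same call produces a six-number summary for the numeric vector and a count per level for the factor.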

Example: Summary statistics of `mtcars` dataset

# Load the mtcars dataset
data(mtcars)

# First 6 rows
head(mtcars)

##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

# Create a data frame with the mean and SD of mpg in mtcars
tab <- data.frame(
  Statistic = c("Mean", "SD"),
  Value     = c(mean(mtcars$mpg), sd(mtcars$mpg))
)

print(tab)      # print using print function (explicitly)

##   Statistic     Value
## 1      Mean 20.090625
## 2        SD  6.026948

# tab           # or print implicitly by typing the object's name

summary(mtcars[,1:4]) # summary of first 4 columns of mtcars

##       mpg             cyl             disp             hp       
##  Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0  
##  1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5  
##  Median :19.20   Median :6.000   Median :196.3   Median :123.0  
##  Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7  
##  3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0  
##  Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0

hsbdemo dataset

hsbdemo is a sample dataset of high school performance for 200 students.

The first step in any statistical analysis is to understand the data.

Note: The datasets used in this workshop are not real. They are only for demonstrating statistical analysis.

# Read the data
hsb <- read.csv("https://stats.idre.ucla.edu/stat/data/hsbdemo.csv")
# Variable names
names(hsb)

##  [1] "id"      "female"  "ses"     "schtyp"  "prog"    "read"    "write"  
##  [8] "math"    "science" "socst"   "honors"  "awards"  "cid"

# Structure of the data frame
str(hsb)

## 'data.frame':    200 obs. of  13 variables:
##  $ id     : int  45 108 15 67 153 51 164 133 2 53 ...
##  $ female : chr  "female" "male" "male" "male" ...
##  $ ses    : chr  "low" "middle" "high" "low" ...
##  $ schtyp : chr  "public" "public" "public" "public" ...
##  $ prog   : chr  "vocation" "general" "vocation" "vocation" ...
##  $ read   : int  34 34 39 37 39 42 31 50 39 34 ...
##  $ write  : int  35 33 39 37 31 36 36 31 41 37 ...
##  $ math   : int  41 41 44 42 40 42 46 40 33 46 ...
##  $ science: int  29 36 26 33 39 31 39 34 42 39 ...
##  $ socst  : int  26 36 42 32 51 39 46 31 41 31 ...
##  $ honors : chr  "not enrolled" "not enrolled" "not enrolled" "not enrolled" ...
##  $ awards : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ cid    : int  1 1 1 1 1 1 1 1 1 1 ...

# First 6 rows
head(hsb)

##    id female    ses schtyp     prog read write math science socst       honors
## 1  45 female    low public vocation   34    35   41      29    26 not enrolled
## 2 108   male middle public  general   34    33   41      36    36 not enrolled
## 3  15   male   high public vocation   39    39   44      26    42 not enrolled
## 4  67   male    low public vocation   37    37   42      33    32 not enrolled
## 5 153   male middle public vocation   39    31   40      39    51 not enrolled
## 6  51 female   high public  general   42    36   42      31    39 not enrolled
##   awards cid
## 1      0   1
## 2      0   1
## 3      0   1
## 4      0   1
## 5      0   1
## 6      0   1

Printing the first 6 rows with head() gives a simple tabular view of the first 6 observations.

We can use the summary() function to report summary statistics. For example, we can summarize the students' scores and the honors variable.

#Summary statistics
summary(hsb[c("read", "write", "math", "science", "socst", "honors")])

##       read           write            math          science     
##  Min.   :28.00   Min.   :31.00   Min.   :33.00   Min.   :26.00  
##  1st Qu.:44.00   1st Qu.:45.75   1st Qu.:45.00   1st Qu.:44.00  
##  Median :50.00   Median :54.00   Median :52.00   Median :53.00  
##  Mean   :52.23   Mean   :52.77   Mean   :52.65   Mean   :51.85  
##  3rd Qu.:60.00   3rd Qu.:60.00   3rd Qu.:59.00   3rd Qu.:58.00  
##  Max.   :76.00   Max.   :67.00   Max.   :75.00   Max.   :74.00  
##      socst          honors         
##  Min.   :26.00   Length:200        
##  1st Qu.:46.00   Class :character  
##  Median :52.00   Mode  :character  
##  Mean   :52.41                     
##  3rd Qu.:61.00                     
##  Max.   :71.00

Which variables look categorical?

Contingency tables with `table()` and `xtabs()`

The table() function in base R creates frequency tables that summarize categorical data.

Some variables in hsbdemo are categorical.
We convert them to factors before creating tables.

In our data, we want a cross-tabulation (contingency table) for the variables ses and honors.

# Convert categorical variables to factors
hsb <- within(hsb, {
  female <- factor(female)
  ses    <- factor(ses)
  schtyp <- factor(schtyp)
  prog   <- factor(prog)
  honors <- factor(honors)
})

# Two-way contingency table
tab1 <- table(hsb$ses, hsb$honors)
tab1

##         
##          enrolled not enrolled
##   high         26           32
##   low          11           36
##   middle       16           79

# Proportional table
prop.table(tab1)

##         
##          enrolled not enrolled
##   high      0.130        0.160
##   low       0.055        0.180
##   middle    0.080        0.395

# Proportional table by row
prop.table(tab1, margin = 1)

##         
##           enrolled not enrolled
##   high   0.4482759    0.5517241
##   low    0.2340426    0.7659574
##   middle 0.1684211    0.8315789
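
In the tables above, the rows appear as high, low, middle because factor() sorts levels alphabetically by default. If a substantive order is preferred, the levels argument can set it. The sketch below uses a short made-up vector rather than the hsb data.

```r
# A short, made-up character vector standing in for a variable like ses
ses_chr <- c("low", "middle", "high", "low", "middle", "middle")

# Default: levels are sorted alphabetically
ses_default <- factor(ses_chr)

# Explicit level order: low < middle < high in the printed table
ses_ordered <- factor(ses_chr, levels = c("low", "middle", "high"))

table(ses_default)
table(ses_ordered)
```

The counts are the same in both tables; only the order of the rows or columns changes.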

We can also use the xtabs() function from the stats package.

#Two-way contingency table using xtabs()
tab2 <- xtabs(~ ses + honors, data = hsb)
tab2

##         honors
## ses      enrolled not enrolled
##   high         26           32
##   low          11           36
##   middle       16           79

#Proportional table
prop.table(tab2)

##         honors
## ses      enrolled not enrolled
##   high      0.130        0.160
##   low       0.055        0.180
##   middle    0.080        0.395

#Proportional table by row
prop.table(tab2, margin = 1)

##         honors
## ses       enrolled not enrolled
##   high   0.4482759    0.5517241
##   low    0.2340426    0.7659574
##   middle 0.1684211    0.8315789

#Summary on a table object will perform a chi-squared test
summary(tab2)

## Call: xtabs(formula = ~ses + honors, data = hsb)
## Number of cases in table: 200 
## Number of factors: 2 
## Test for independence of all factors:
##  Chisq = 14.783, df = 2, p-value = 0.0006164

#Three-way cross-tabulation
tab3 <- xtabs(~ ses + honors + female, data = hsb)

tab3

## , , female = female
## 
##         honors
## ses      enrolled not enrolled
##   high         15           14
##   low          10           22
##   middle       10           38
## 
## , , female = male
## 
##         honors
## ses      enrolled not enrolled
##   high         11           18
##   low           1           14
##   middle        6           41

ftable(tab3)

##                     female female male
## ses    honors                         
## high   enrolled                15   11
##        not enrolled            14   18
## low    enrolled                10    1
##        not enrolled            22   14
## middle enrolled                10    6
##        not enrolled            38   41
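
Reported contingency tables usually include totals and percentages rather than raw proportions. The following is a minimal sketch using base R's addmargins() and prop.table(). It uses the built-in mtcars data (cyl by am) instead of hsb so that it runs without a download.

```r
# Two-way table from built-in data: number of cylinders by transmission type
tab <- table(cyl = mtcars$cyl, am = mtcars$am)

# Add row and column totals (labelled "Sum")
tab_totals <- addmargins(tab)
tab_totals

# Row percentages, rounded to one decimal place
row_pct <- round(100 * prop.table(tab, margin = 1), 1)
row_pct
```

The same two calls work on an xtabs() object, since it is also a table.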

Regression models

Regression models are often used to understand the relationship between a dependent variable and one or more independent variables.

In R, we use the summary() function to extract and report results from a regression model.

As an example, we use the hsb data to regress math score on read and write scores and prog.

# Run regression of math on read, write, and prog
m1 <- lm(math ~ read + write + prog, data = hsb)
lm.result <- summary(m1)
lm.result

## 
## Call:
## lm(formula = math ~ read + write + prog, data = hsb)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -19.257  -4.564  -0.211   4.271  17.527 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  19.20202    3.35561   5.722 3.91e-08 ***
## read          0.37186    0.05685   6.541 5.24e-10 ***
## write         0.29591    0.06149   4.812 2.98e-06 ***
## proggeneral  -2.87185    1.18968  -2.414  0.01670 *  
## progvocation -3.79862    1.23942  -3.065  0.00249 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.408 on 195 degrees of freedom
## Multiple R-squared:  0.5415, Adjusted R-squared:  0.5321 
## F-statistic: 57.57 on 4 and 195 DF,  p-value: < 2.2e-16

# Extracting coefficients table (it is a matrix)
lm.result$coefficients

##                Estimate Std. Error   t value     Pr(>|t|)
## (Intercept)  19.2020151 3.35561280  5.722357 3.911030e-08
## read          0.3718589 0.05684928  6.541138 5.240877e-10
## write         0.2959093 0.06149161  4.812190 2.984266e-06
## proggeneral  -2.8718518 1.18968055 -2.413969 1.670404e-02
## progvocation -3.7986171 1.23941526 -3.064846 2.486293e-03

# Adding confidence interval for coefficients
lm.table <- cbind(lm.result$coefficients, confint(m1))

# Changing the names of columns
colnames(lm.table)[c(5,6)] <- c("LL", "UL")

# Round numbers to 4 digits and print
round(lm.table, 4)

##              Estimate Std. Error t value Pr(>|t|)      LL      UL
## (Intercept)   19.2020     3.3556  5.7224   0.0000 12.5841 25.8200
## read           0.3719     0.0568  6.5411   0.0000  0.2597  0.4840
## write          0.2959     0.0615  4.8122   0.0000  0.1746  0.4172
## proggeneral   -2.8719     1.1897 -2.4140   0.0167 -5.2181 -0.5256
## progvocation  -3.7986     1.2394 -3.0648   0.0025 -6.2430 -1.3542
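
Rounding every column to four digits prints very small p-values as 0.0000. One possible alternative, sketched below, is to build a data frame and format the p-values separately with format.pval(). The model here is fitted to the built-in mtcars data so the example is self-contained; the column names and the 0.001 threshold are choices made for this illustration.

```r
# Fit a small model on built-in data
fit <- lm(mpg ~ wt + hp, data = mtcars)

coefs <- summary(fit)$coefficients  # numeric matrix of estimates
ci    <- confint(fit)               # 95% confidence limits

# Assemble a data frame so each column can be formatted on its own
reg_tab <- data.frame(
  Term     = rownames(coefs),
  Estimate = round(coefs[, "Estimate"], 3),
  SE       = round(coefs[, "Std. Error"], 3),
  LL       = round(ci[, 1], 3),
  UL       = round(ci[, 2], 3),
  # p-values below eps are reported as "<0.001" instead of 0
  p        = format.pval(coefs[, "Pr(>|t|)"], digits = 3, eps = 0.001),
  row.names = NULL
)

reg_tab
```

Because the result is a data frame, it can be passed to any of the table packages introduced later.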

Advanced Tables in R Using R Packages within R Markdown

Advantages of using R Markdown to create tables

Reproducibility
Embedding code directly in the document ensures that tables can be reproduced easily and reduces errors from manual copying.
Dynamic updates
Changes to the data or analysis are automatically reflected in the tables when the document is re-rendered.
Automation
R Markdown generates and formats tables automatically, avoiding repetitive tasks such as reformatting or retyping results.
Consistency
Tables maintain a consistent format and style throughout the document, which is especially useful for longer reports.
Customization and formatting
Table packages in R allow advanced customization and professional presentation, without relying on external tools for formatting.

In summary, using R Markdown to create tables supports automation, reproducibility, and consistency, while also providing more control over formatting than simple copy–paste from the console.

In the rest of the workshop, we introduce some of these packages through examples.

knitr::kable and kableExtra

The kable() function in the knitr package is a simple table generator for rectangular data (matrices and data frames).
It creates basic tables in formats such as HTML, LaTeX, and Pandoc pipe tables, and is often the starting point before adding styling with kableExtra.

Advantages of knitr::kable and kableExtra

Simplicity
knitr::kable provides a straightforward way to create clean tables with minimal code. It is easy to use for simple tables that do not require extensive customization.
Integration with R Markdown
kable() is designed to work seamlessly with R Markdown, making it easy to generate tables that fit well within dynamic documents.
Flexibility with kableExtra
When paired with kableExtra, kable becomes highly customizable. You can add features such as multi-row headers, colors, borders, alignment adjustments, column spanning, custom styling, and footnotes.
Themes
kableExtra offers several alternative HTML table themes beyond the default Bootstrap theme.

kable() has a number of arguments that can be used to customize the appearance of tables:

kable(x, format, digits = getOption("digits"), row.names = NA, col.names = NA, align, caption = NULL, label = NULL, format.args = list(), escape = TRUE, ...)

format is a character string. Possible values include "latex", "html", "pipe" (Pandoc’s pipe tables), and others.

If you only need one table format that is not the default format for a document, you can set the global R option knitr.table.format, for example:

options(knitr.table.format = "html")
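
To show how several of these arguments combine, here is a minimal sketch that sets digits, col.names, align, and caption in a single call. The three-state subset and the column labels are chosen for this illustration; the units follow the documentation of state.x77.

```r
library(knitr)

# Small data frame with the state name as a regular column
state3 <- data.frame(
  State = rownames(state.x77)[1:3],
  state.x77[1:3, c("Population", "Income", "Illiteracy")],
  row.names = NULL
)

tbl <- kable(
  state3,
  format    = "pipe",   # Pandoc pipe table
  digits    = 1,        # round numeric columns to 1 decimal place
  col.names = c("State", "Population (thousands)",
                "Income (per capita)", "Illiteracy (%)"),
  align     = "lrrr",   # left-align names, right-align numbers
  caption   = "Population, income, and illiteracy for three states"
)

tbl
```

The pipe format is used here because its output is readable in the console; changing format to "html" or "latex" keeps the other arguments unchanged.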

In the examples below, we use the dataset state.x77 to make a table of population, income, and literacy for the first seven states in state.x77.

state7 <- data.frame(state.x77)[1:7, ]

knitr::kable(state7, format = "html",
             caption = "Table 1. Population, Income, and Literacy for First Seven States")

Table 1. Population, Income, and Literacy for First Seven States
	Population	Income	Illiteracy	Life.Exp	Murder	HS.Grad	Frost	Area
Alabama	3615	3624	2.1	69.05	15.1	41.3	20	50708
Alaska	365	6315	1.5	69.31	11.3	66.7	152	566432
Arizona	2212	4530	1.8	70.55	7.8	58.1	15	113417
Arkansas	2110	3378	1.9	70.66	10.1	39.9	65	51945
California	21198	5114	1.1	71.71	10.3	62.6	20	156361
Colorado	2541	4884	0.7	72.06	6.8	63.9	166	103766
Connecticut	3100	5348	1.1	72.48	3.1	56.0	139	4862

With format = "pipe", kable() produces a Pandoc pipe table. How it is rendered depends on the R Markdown output format.

For example, since these slides use ioslides, if we do not specify format or if we use format = "pipe", the output table will look like this:

my.table <- knitr::kable(state7, 
             caption = "Table 1. Population, Income, and Literacy for First Seven States")
my.table

Table 1. Population, Income, and Literacy for First Seven States
	Population	Income	Illiteracy	Life.Exp	Murder	HS.Grad	Frost	Area
Alabama	3615	3624	2.1	69.05	15.1	41.3	20	50708
Alaska	365	6315	1.5	69.31	11.3	66.7	152	566432
Arizona	2212	4530	1.8	70.55	7.8	58.1	15	113417
Arkansas	2110	3378	1.9	70.66	10.1	39.9	65	51945
California	21198	5114	1.1	71.71	10.3	62.6	20	156361
Colorado	2541	4884	0.7	72.06	6.8	63.9	166	103766
Connecticut	3100	5348	1.1	72.48	3.1	56.0	139	4862

To learn more about knitr::kable() and its options, see the link below:

rmarkdown-cookbook, 10.1 The function knitr::kable()

kableExtra

The kableExtra package extends knitr::kable(). Its goal is to help you build common complex tables and manipulate table styles.
It imports the pipe %>% from magrittr (and also works with the base R pipe |>) and provides a set of functions that can be added in layers to a kable output, in a way similar to how layers are added in ggplot2.

The basic HTML output from kbl() is just a plain HTML table without any styling.

library(kableExtra)

# plain HTML
kbl(state7,
    caption = "Table 1. Population, Income, and Literacy for First Seven States")

Table 1. Population, Income, and Literacy for First Seven States
	Population	Income	Illiteracy	Life.Exp	Murder	HS.Grad	Frost	Area
Alabama	3615	3624	2.1	69.05	15.1	41.3	20	50708
Alaska	365	6315	1.5	69.31	11.3	66.7	152	566432
Arizona	2212	4530	1.8	70.55	7.8	58.1	15	113417
Arkansas	2110	3378	1.9	70.66	10.1	39.9	65	51945
California	21198	5114	1.1	71.71	10.3	62.6	20	156361
Colorado	2541	4884	0.7	72.06	6.8	63.9	166	103766
Connecticut	3100	5348	1.1	72.48	3.1	56.0	139	4862

Bootstrap theme

kable_styling() will automatically apply a Bootstrap theme to the table.

To see more options for this function, check the help file:

?kable_styling

state7 %>%
  kbl(caption = "Table 1. Population, Income, and Literacy for First Seven States") %>%
  # Bootstrap theme
  kable_styling()

Table 1. Population, Income, and Literacy for First Seven States
	Population	Income	Illiteracy	Life.Exp	Murder	HS.Grad	Frost	Area
Alabama	3615	3624	2.1	69.05	15.1	41.3	20	50708
Alaska	365	6315	1.5	69.31	11.3	66.7	152	566432
Arizona	2212	4530	1.8	70.55	7.8	58.1	15	113417
Arkansas	2110	3378	1.9	70.66	10.1	39.9	65	51945
California	21198	5114	1.1	71.71	10.3	62.6	20	156361
Colorado	2541	4884	0.7	72.06	6.8	63.9	166	103766
Connecticut	3100	5348	1.1	72.48	3.1	56.0	139	4862

Alternative themes

kableExtra also offers six alternative HTML table themes besides the default Bootstrap theme:
kable_paper, kable_classic, kable_classic_2, kable_minimal, kable_material, and kable_material_dark.

We can also use options in kable_styling() and theme functions to customize the output table.

Here are some examples:

state7 %>%
  kbl(caption = "Table 1. Population, Income, and Literacy for First Seven States") %>%
  # paper theme with hover and full_width = FALSE
  kable_paper("hover", full_width = FALSE)

Table 1. Population, Income, and Literacy for First Seven States
	Population	Income	Illiteracy	Life.Exp	Murder	HS.Grad	Frost	Area
Alabama	3615	3624	2.1	69.05	15.1	41.3	20	50708
Alaska	365	6315	1.5	69.31	11.3	66.7	152	566432
Arizona	2212	4530	1.8	70.55	7.8	58.1	15	113417
Arkansas	2110	3378	1.9	70.66	10.1	39.9	65	51945
California	21198	5114	1.1	71.71	10.3	62.6	20	156361
Colorado	2541	4884	0.7	72.06	6.8	63.9	166	103766
Connecticut	3100	5348	1.1	72.48	3.1	56.0	139	4862

Full width

state7 %>%
  kbl(caption = "Recreating booktabs-style table") %>%
  # classic theme and other options
  kable_classic(full_width = FALSE, html_font = "Cambria", position = "left")

Recreating booktabs-style table
	Population	Income	Illiteracy	Life.Exp	Murder	HS.Grad	Frost	Area
Alabama	3615	3624	2.1	69.05	15.1	41.3	20	50708
Alaska	365	6315	1.5	69.31	11.3	66.7	152	566432
Arizona	2212	4530	1.8	70.55	7.8	58.1	15	113417
Arkansas	2110	3378	1.9	70.66	10.1	39.9	65	51945
California	21198	5114	1.1	71.71	10.3	62.6	20	156361
Colorado	2541	4884	0.7	72.06	6.8	63.9	166	103766
Connecticut	3100	5348	1.1	72.48	3.1	56.0	139	4862

Striped rows

state7 %>%
  kbl(caption = "material theme with striped rows") %>%
  # material theme with striped rows
  kable_material(lightable_options = c("striped"))

material theme with striped rows
	Population	Income	Illiteracy	Life.Exp	Murder	HS.Grad	Frost	Area
Alabama	3615	3624	2.1	69.05	15.1	41.3	20	50708
Alaska	365	6315	1.5	69.31	11.3	66.7	152	566432
Arizona	2212	4530	1.8	70.55	7.8	58.1	15	113417
Arkansas	2110	3378	1.9	70.66	10.1	39.9	65	51945
California	21198	5114	1.1	71.71	10.3	62.6	20	156361
Colorado	2541	4884	0.7	72.06	6.8	63.9	166	103766
Connecticut	3100	5348	1.1	72.48	3.1	56.0	139	4862

Column / Row Specification

kbl(state7) %>%
  # paper theme 
  kable_paper(full_width = FALSE) %>%
  # make first column bold and add right border
  column_spec(1, bold = TRUE, border_right = TRUE) %>%
  # make column 9 wider with yellow background
  column_spec(9, width = "6em", background = "yellow")

	Population	Income	Illiteracy	Life.Exp	Murder	HS.Grad	Frost	Area
Alabama	3615	3624	2.1	69.05	15.1	41.3	20	50708
Alaska	365	6315	1.5	69.31	11.3	66.7	152	566432
Arizona	2212	4530	1.8	70.55	7.8	58.1	15	113417
Arkansas	2110	3378	1.9	70.66	10.1	39.9	65	51945
California	21198	5114	1.1	71.71	10.3	62.6	20	156361
Colorado	2541	4884	0.7	72.06	6.8	63.9	166	103766
Connecticut	3100	5348	1.1	72.48	3.1	56.0	139	4862

Conditional formatting

kbl(state7) %>%
  # paper theme 
  kable_paper(full_width = FALSE) %>%
  # conditional text color in column 2
  column_spec(2, color = spec_color(state7$Population, palette = c("black", "red"))) %>%
  # conditional background in column 4 with white text
  column_spec(4, color = "white",
              background = spec_color(state7$Illiteracy <= 1.5, palette = c("red", "green"))) %>%
  # rotate the header row (row 0) by -45 degrees
  row_spec(0, angle = -45)

	Population	Income	Illiteracy	Life.Exp	Murder	HS.Grad	Frost	Area
Alabama	3615	3624	2.1	69.05	15.1	41.3	20	50708
Alaska	365	6315	1.5	69.31	11.3	66.7	152	566432
Arizona	2212	4530	1.8	70.55	7.8	58.1	15	113417
Arkansas	2110	3378	1.9	70.66	10.1	39.9	65	51945
California	21198	5114	1.1	71.71	10.3	62.6	20	156361
Colorado	2541	4884	0.7	72.06	6.8	63.9	166	103766
Connecticut	3100	5348	1.1	72.48	3.1	56.0	139	4862
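
Grouped column headers and footnotes were listed earlier among the kableExtra features. The sketch below adds both with add_header_above() and footnote(). The group labels and the footnote text are invented for this illustration, and footnote() is called with the kableExtra:: prefix on the assumption that flextable, which also exports a footnote() function, may be loaded in the same session.

```r
library(kableExtra)

state7 <- data.frame(state.x77)[1:7, ]

tbl <- kbl(state7, format = "html",
           caption = "First seven states in state.x77") |>
  kable_classic(full_width = FALSE) |>
  # header spans: 1 row-name column + 4 + 4 data columns = 9 columns
  add_header_above(c(" " = 1, "Demographics" = 4, "Other measures" = 4)) |>
  # general footnote placed below the table
  kableExtra::footnote(general = "Data from the built-in state.x77 matrix.")

tbl
```

The numbers in add_header_above() must add up to the total number of columns, including the row-name column.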

One useful feature of kableExtra (available for HTML output) is scroll_box().
If you have a large table and want to include it in a website or HTML document without using a lot of space, adding a scroll box is a good option.

kbl(state.x77) %>%
  kable_paper() %>%
  # add scroll bar
  scroll_box(width = "400px", height = "200px")

	Population	Income	Illiteracy	Life Exp	Murder	HS Grad	Frost	Area
Alabama	3615	3624	2.1	69.05	15.1	41.3	20	50708
Alaska	365	6315	1.5	69.31	11.3	66.7	152	566432
Arizona	2212	4530	1.8	70.55	7.8	58.1	15	113417
Arkansas	2110	3378	1.9	70.66	10.1	39.9	65	51945
California	21198	5114	1.1	71.71	10.3	62.6	20	156361
Colorado	2541	4884	0.7	72.06	6.8	63.9	166	103766
Connecticut	3100	5348	1.1	72.48	3.1	56.0	139	4862
Delaware	579	4809	0.9	70.06	6.2	54.6	103	1982
Florida	8277	4815	1.3	70.66	10.7	52.6	11	54090
Georgia	4931	4091	2.0	68.54	13.9	40.6	60	58073
Hawaii	868	4963	1.9	73.60	6.2	61.9	0	6425
Idaho	813	4119	0.6	71.87	5.3	59.5	126	82677
Illinois	11197	5107	0.9	70.14	10.3	52.6	127	55748
Indiana	5313	4458	0.7	70.88	7.1	52.9	122	36097
Iowa	2861	4628	0.5	72.56	2.3	59.0	140	55941
Kansas	2280	4669	0.6	72.58	4.5	59.9	114	81787
Kentucky	3387	3712	1.6	70.10	10.6	38.5	95	39650
Louisiana	3806	3545	2.8	68.76	13.2	42.2	12	44930
Maine	1058	3694	0.7	70.39	2.7	54.7	161	30920
Maryland	4122	5299	0.9	70.22	8.5	52.3	101	9891
Massachusetts	5814	4755	1.1	71.83	3.3	58.5	103	7826
Michigan	9111	4751	0.9	70.63	11.1	52.8	125	56817
Minnesota	3921	4675	0.6	72.96	2.3	57.6	160	79289
Mississippi	2341	3098	2.4	68.09	12.5	41.0	50	47296
Missouri	4767	4254	0.8	70.69	9.3	48.8	108	68995
Montana	746	4347	0.6	70.56	5.0	59.2	155	145587
Nebraska	1544	4508	0.6	72.60	2.9	59.3	139	76483
Nevada	590	5149	0.5	69.03	11.5	65.2	188	109889
New Hampshire	812	4281	0.7	71.23	3.3	57.6	174	9027
New Jersey	7333	5237	1.1	70.93	5.2	52.5	115	7521
New Mexico	1144	3601	2.2	70.32	9.7	55.2	120	121412
New York	18076	4903	1.4	70.55	10.9	52.7	82	47831
North Carolina	5441	3875	1.8	69.21	11.1	38.5	80	48798
North Dakota	637	5087	0.8	72.78	1.4	50.3	186	69273
Ohio	10735	4561	0.8	70.82	7.4	53.2	124	40975
Oklahoma	2715	3983	1.1	71.42	6.4	51.6	82	68782
Oregon	2284	4660	0.6	72.13	4.2	60.0	44	96184
Pennsylvania	11860	4449	1.0	70.43	6.1	50.2	126	44966
Rhode Island	931	4558	1.3	71.90	2.4	46.4	127	1049
South Carolina	2816	3635	2.3	67.96	11.6	37.8	65	30225
South Dakota	681	4167	0.5	72.08	1.7	53.3	172	75955
Tennessee	4173	3821	1.7	70.11	11.0	41.8	70	41328
Texas	12237	4188	2.2	70.90	12.2	47.4	35	262134
Utah	1203	4022	0.6	72.90	4.5	67.3	137	82096
Vermont	472	3907	0.6	71.64	5.5	57.1	168	9267
Virginia	4981	4701	1.4	70.08	9.5	47.8	85	39780
Washington	3559	4864	0.6	71.72	4.3	63.5	32	66570
West Virginia	1799	3617	1.4	69.48	6.7	41.6	100	24070
Wisconsin	4589	4468	0.7	72.48	3.0	54.5	149	54464
Wyoming	376	4566	0.6	70.29	6.9	62.9	173	97203

To learn more about table styles and options in kable_styling(), see the link below:

Create Awesome HTML Table with knitr::kable and kableExtra

Using the `flextable` package

flextable is designed to create and format tables that can be easily exported to Word and PowerPoint documents. It allows users to build richly formatted tables with features such as text formatting, colors, borders, and alignment, making it useful for generating professional-looking tables in document reports.

Advantages of flextable compared to other packages

Extensive customization
flextable offers detailed control over the formatting of tables, including text alignment, fonts, colors, borders, and cell-level styling. This level of customization goes beyond what simpler packages like kable can offer.
Integration with Word and PowerPoint
A key feature of flextable is its integration with Microsoft Word and PowerPoint through the officer package. You can export formatted tables directly into these documents, which is ideal if you frequently work in Word or PowerPoint.
Conditional formatting
The package supports conditional formatting based on the values in the table, which is useful for highlighting key data points or making tables more informative visually.
Predefined themes
flextable provides built-in themes that give tables a consistent and professional appearance, while reducing the effort needed to style them.

The main function is flextable(), which takes a data.frame as an argument and returns a flextable object.

library(flextable)

# First 10 rows, dropping column 13 (cid)
ft <- flextable(hsb[1:10, -13])
ft |>
  # add header row
  add_header_row(
    colwidths = c(3, 2, 5, 2),
    values = c("Student", "School", "Grades", "Achievements")
  ) |>
  # apply vanilla theme
  theme_vanilla() |>
  # add footer
  add_footer_lines("This data is simulated and it is not real") |>
  color(part = "footer", color = "#666666") |>
  # set caption
  set_caption(caption = "First 10 rows of a sample of high school data") |>
  # align header to center
  align(align = "center", part = "header", i = 1)

First 10 rows of a sample of high school data
Student			School		Grades					Achievements
id	female	ses	schtyp	prog	read	write	math	science	socst	honors	awards
45	female	low	public	vocation	34	35	41	29	26	not enrolled	0
108	male	middle	public	general	34	33	41	36	36	not enrolled	0
15	male	high	public	vocation	39	39	44	26	42	not enrolled	0
67	male	low	public	vocation	37	37	42	33	32	not enrolled	0
153	male	middle	public	vocation	39	31	40	39	51	not enrolled	0
51	female	high	public	general	42	36	42	31	39	not enrolled	0
164	male	middle	public	vocation	31	36	46	39	46	not enrolled	0
133	male	middle	public	vocation	50	31	40	34	31	not enrolled	0
2	female	middle	public	vocation	39	41	33	42	41	not enrolled	0
53	male	middle	public	vocation	34	37	46	39	31	not enrolled	0
This data is simulated and it is not real
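
Conditional formatting and Word export, both mentioned among the advantages above, can be sketched as follows. The example uses a few rows of the built-in mtcars data; the threshold of 21 mpg, the highlight color, and the temporary output file are arbitrary choices for this illustration.

```r
library(flextable)

# Small data frame with the car model as a regular column
cars6 <- data.frame(
  model = rownames(mtcars)[1:6],
  mtcars[1:6, c("mpg", "hp", "wt")],
  row.names = NULL
)

ft_cars <- flextable(cars6) |>
  theme_booktabs() |>
  # highlight mpg cells above 21 (row selection via a formula)
  bg(i = ~ mpg > 21, j = "mpg", bg = "#FFF2CC") |>
  bold(i = ~ mpg > 21, j = "mpg") |>
  autofit()

# Write the table to a Word document (a temporary file here)
out_file <- tempfile(fileext = ".docx")
save_as_docx(ft_cars, path = out_file)
```

In a real project, path would point to a file in the project folder instead of a temporary file.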

The flextable package will not aggregate data for you, but it helps you present aggregated data. It also has some useful functions to generate descriptive statistics.

Cross-tabulation with `proc_freq()`

The function proc_freq() computes a contingency table and creates a flextable from the result. The goal is to reproduce the output of SAS PROC FREQ.

proc_freq(
  hsb, "ses", "honors",
  include.row_percent    = TRUE,
  include.column_percent = TRUE,
  include.table_percent  = TRUE
)

ses		honors
ses		enrolled	not enrolled	Total
high	Count	26 (13.0%)	32 (16.0%)	58 (29.0%)
high	Mar. pct (1)	49.1% ; 44.8%	21.8% ; 55.2%
low	Count	11 (5.5%)	36 (18.0%)	47 (23.5%)
low	Mar. pct	20.8% ; 23.4%	24.5% ; 76.6%
middle	Count	16 (8.0%)	79 (39.5%)	95 (47.5%)
middle	Mar. pct	30.2% ; 16.8%	53.7% ; 83.2%
Total	Count	53 (26.5%)	147 (73.5%)	200 (100.0%)
(1) Columns and rows percentages

There is much more flexibility in the flextable package, especially when used in conjunction with other packages, than we can cover in this workshop.

For more on flextable you can check the links below:

DT

The R package DT provides an interface to the JavaScript library DataTables. R data objects (matrices or data frames) can be displayed as tables on HTML pages, and DataTables provides interactive tables with filtering, pagination, sorting, and many other features.

Key features of the DT package

Interactivity
DT creates interactive tables with features such as sorting and searching, which is ideal for web use and Shiny apps. In contrast, flextable and kableExtra focus on static tables.
JavaScript integration
DT leverages the DataTables library for advanced client-side features such as filtering, pagination, and exporting, making it well suited for web applications.
Ease of use for web applications
DT is easy to use and implement when building HTML reports or Shiny apps.

The main function in this package is datatable(). It creates an HTML widget to display R data objects with DataTables.

library(DT)
library(ggplot2)  # provides the diamonds dataset

datatable(diamonds[1:200, ])

If you are familiar with the DataTables JavaScript table library, you can use the options argument to customize the table.

We can also add a filter argument to datatable() to automatically generate column filters. By default, the filters are not shown (filter = "none"). You can enable these filters with filter = "top" or filter = "bottom".

datatable(
  diamonds[1:200, ],
  filter  = "top",
  options = list(
    pageLength = 5,
    autoWidth  = TRUE
  )
)
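
Numeric columns in an interactive table often need display formatting. The following sketch, which assumes the built-in mtcars data and an arbitrary choice of columns, uses formatRound() to control the number of decimals shown without changing the underlying values.

```r
library(DT)

# Small data frame with the car model as a regular column
cars_tbl <- data.frame(
  model = rownames(mtcars),
  mtcars[, c("mpg", "wt", "qsec")],
  row.names = NULL
)

widget <- datatable(
  cars_tbl,
  rownames = FALSE,               # hide the row index column
  filter   = "top",               # column filters above the header
  options  = list(pageLength = 5)
) |>
  # show wt and qsec with one decimal place
  formatRound(columns = c("wt", "qsec"), digits = 1)

widget
```

Sorting and filtering still operate on the unrounded values, because the formatting is applied only when the cells are displayed.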

For more examples and options for the DT package, see:

DT: An R interface to the DataTables library

gt (Grammar of Tables)

The gt package is designed to take data frames and tibbles and turn them into presentation-ready tables, with tools for labels, footnotes, and formatting suitable for reports and publications.

Advantages of the gt package

Customization
gt provides extensive options for customizing table appearance, including fonts, colors, borders, and spacing. This makes it possible to create visually appealing and professionally formatted tables.
Easy to use
The package has a user-friendly syntax that simplifies the creation of complex tables. It is designed to be intuitive and relatively easy to learn.
Integration with R Markdown
gt integrates well with R Markdown, enabling you to include sophisticated tables in dynamic documents. It supports rendering to HTML and other formats commonly used in reports.
Publication-ready tables
gt is designed for generating clean, well-formatted tables suitable for academic papers, reports, and presentations where table aesthetics are important.

Here we run a simple example adapted from the package reference page.

The gt package is similar to flextable. It was built primarily for HTML output and also supports LaTeX and RTF (recent releases add Word output as well).
The flextable package is particularly compatible with Microsoft Word and PowerPoint through the officer package.

library(dplyr)
library(gt)

# Modify the `airquality` dataset by adding the year
# of the measurements (1973) and limiting to 10 rows
airquality_m <-
  airquality |>
  # add year 1973
  mutate(Year = 1973L) |>
  # select the first 10 rows
  slice(1:10)

# Create a display table using the modified `airquality` dataset
gt_tbl <- gt(airquality_m)

# Print basic gt table
gt_tbl

Ozone	Solar.R	Wind	Temp	Month	Day	Year
41	190	7.4	67	5	1	1973
36	118	8.0	72	5	2	1973
12	149	12.6	74	5	3	1973
18	313	11.5	62	5	4	1973
NA	NA	14.3	56	5	5	1973
28	NA	14.9	66	5	6	1973
23	299	8.6	65	5	7	1973
19	99	13.8	59	5	8	1973
8	19	20.1	61	5	9	1973
NA	194	8.6	69	5	10	1973
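To keep a gt table as a file rather than only printing it, gtsave() writes the table in a format chosen from the file extension. The sketch below is a hedged illustration: the file names are hypothetical and the files are written to a temporary directory.

```r
library(gt)

# Small example table built from the built-in airquality data
gt_small <- gt(head(airquality, 5))

# Hypothetical output location and file names
out_dir   <- tempdir()
html_file <- file.path(out_dir, "airquality.html")
rtf_file  <- file.path(out_dir, "airquality.rtf")

# The file extension determines the output format
gtsave(gt_small, filename = "airquality.html", path = out_dir)
gtsave(gt_small, filename = "airquality.rtf",  path = out_dir)
```

In a real project you would replace tempdir() with a folder inside the project so the saved tables are regenerated whenever the analysis is re-run.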

# Add title, subtitle, and column groups
gt_tbl |>
  tab_header(
    title = "New York Air Quality Measurements",
    subtitle = "Daily measurements in New York City (May 1–10, 1973)"
  ) |>
  tab_spanner(
    label   = "Time",
    columns = c(Year, Month, Day)
  ) |>
  tab_spanner(
    label   = "Measurement",
    columns = c(Ozone, Solar.R, Wind, Temp)
  )

New York Air Quality Measurements
Daily measurements in New York City (May 1–10, 1973)
Measurement				Time
Ozone	Solar.R	Wind	Temp	Year	Month	Day
41	190	7.4	67	1973	5	1
36	118	8.0	72	1973	5	2
12	149	12.6	74	1973	5	3
18	313	11.5	62	1973	5	4
NA	NA	14.3	56	1973	5	5
28	NA	14.9	66	1973	5	6
23	299	8.6	65	1973	5	7
19	99	13.8	59	1973	5	8
8	19	20.1	61	1973	5	9
NA	194	8.6	69	1973	5	10
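Beyond titles and spanners, gt also has functions for column labels, number formatting, missing values, and footnotes. The following is a sketch only; the label wording and the "n/a" placeholder are illustrative choices rather than part of the original example.

```r
library(dplyr)
library(gt)

# Same preparation as above: add the year and keep the first 10 rows
airquality_m <- airquality |>
  mutate(Year = 1973L) |>
  slice(1:10)

gt_fmt <- gt(airquality_m) |>
  # friendlier column labels
  cols_label(Solar.R = "Solar radiation", Temp = "Temperature") |>
  # one decimal place for every wind value
  fmt_number(columns = Wind, decimals = 1) |>
  # replace NA in all columns with a text placeholder
  sub_missing(missing_text = "n/a") |>
  # footnote attached to the Temperature column label
  tab_footnote(
    footnote  = "Degrees Fahrenheit.",
    locations = cells_column_labels(columns = Temp)
  ) |>
  tab_source_note(source_note = "Source: R's built-in airquality dataset.")

gt_fmt
```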

Reference for the gt package:

gt package

gtExtras

The gtExtras package provides additional functions to extend the gt package, especially when you want to include plots in tables or apply more advanced styling.

Overall, there are four families of functions in gtExtras:

Themes
Seven themes that style almost every element of a gt table, inspired by data journalism–styled tables.
Utilities
Helper functions for aligning and padding numbers, adding Font Awesome icons and images, highlighting, adding dividers, styling by group, creating two-table or two-column layouts, extracting ordered data from gt internals, and generating example datasets.
Plotting
Twelve plotting functions for inline sparklines, win–loss charts, distributions (density/histogram), percentiles, dot + bar plots, bar charts, confidence intervals, and summarizing an entire data frame.
Colors
Three functions for color scales, including a "Hulk" style (purple/green), coloring rows with default palettes from paletteer, and adding a "color box" next to cell values.

library(gtExtras)

gt_tbl |>
  # use NYT-style theme
  gt_theme_nytimes() |>
  # change header title
  tab_header(title = "Table styled like the New York Times") |>
  # apply Hulk color scale to Ozone column (trim gives a tighter range)
  gt_hulk_col_numeric(Ozone, trim = TRUE)

Table styled like the New York Times
Ozone	Solar.R	Wind	Temp	Month	Day	Year
41	190	7.4	67	5	1	1973
36	118	8.0	72	5	2	1973
12	149	12.6	74	5	3	1973
18	313	11.5	62	5	4	1973
NA	NA	14.3	56	5	5	1973
28	NA	14.9	66	5	6	1973
23	299	8.6	65	5	7	1973
19	99	13.8	59	5	8	1973
8	19	20.1	61	5	9	1973
NA	194	8.6	69	5	10	1973
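The plotting family can be illustrated with an inline sparkline. This sketch assumes a monthly summary of the full airquality data, which is not computed elsewhere in this workshop; gt_plt_sparkline() expects a list column that holds the values to draw for each row.

```r
library(dplyr)
library(gt)
library(gtExtras)

# One row per month, with the daily temperatures kept as a list column
monthly <- airquality |>
  group_by(Month) |>
  summarize(
    mean_temp  = mean(Temp),
    temp_trend = list(Temp),
    .groups    = "drop"
  )

spark_tbl <- gt(monthly) |>
  fmt_number(columns = mean_temp, decimals = 1) |>
  # draw the daily temperatures of each month as an inline sparkline
  gt_plt_sparkline(temp_trend) |>
  cols_label(
    mean_temp  = "Mean temperature",
    temp_trend = "Daily trend"
  )

spark_tbl
```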

For more options and features, see the gtExtras package documentation.

Summary tables

Several R packages can create publication-ready summary tables with minimal effort.
These packages provide built-in functions to generate standard summary table formats.

In this part of the workshop, we will look at:

table1
sjPlot
gtsummary

Advantages and Disadvantages of Summary Table Packages

Advantage: no need to write separate data-preparation code to summarize data or model results.
Disadvantage: less flexibility, and it can be harder to customize tables beyond the defaults.

"Table 1" in statistical analysis and the `table1` package

In journal articles, especially in epidemiology and health research, the first table ("Table 1") usually presents descriptive statistics of baseline characteristics for the study sample.
This table is often stratified by one or more grouping variables, such as treatment group or outcome status.

The table1 package in R simplifies the creation of such tables.

Key features of the table1 package

Descriptive statistics
Provides means, medians, standard deviations, and proportions for various variables.
Stratification
Allows grouping and stratifying by one or more categorical variables.
Customization
Offers options to customize labels, units, and layout to meet publication standards, but customization is not always straightforward.
Easy to use (with limits)
table1 is easy to use for default tables, but users may find more advanced customization challenging.
Converting to other table packages
The output of table1() can be converted (with some limitations) to data.frame, kableExtra, or flextable using as.data.frame(), t1kable(), and t1flex().

Example: basic `table1` table

The data used here are from the boot package (melanoma).
The data consist of measurements on patients with malignant melanoma.

The grouping variable is patient status at the end of the study:
1 = melanoma death, 2 = alive, 3 = non-melanoma death.

library(boot)
library(table1)

melanoma1 <- melanoma

# Change status to factor with labels
melanoma1$status <- factor(
  melanoma1$status,
  levels = c(2, 1, 3),
  labels = c("Alive",           # reference
             "Melanoma death",
             "Non-melanoma death")
)

# Change sex to factor and label
# (boot::melanoma codes sex as 0 = female, 1 = male)
melanoma1$sex <- factor(
  melanoma1$sex,
  levels = c(0, 1),
  labels = c("Female", "Male")
)

# Change ulcer to factor and label
melanoma1$ulcer <- factor(
  melanoma1$ulcer,
  labels = c("Absent", "Present")
)

# Basic Table 1
(table1.1 <- table1(~ sex + age + ulcer + thickness | status,
                    data = melanoma1))

	Alive (N=134)	Melanoma death (N=57)	Non-melanoma death (N=14)	Overall (N=205)
sex
Female	91 (67.9%)	28 (49.1%)	7 (50.0%)	126 (61.5%)
Male	43 (32.1%)	29 (50.9%)	7 (50.0%)	79 (38.5%)
age
Mean (SD)	50.0 (15.9)	55.1 (17.9)	65.3 (10.9)	52.5 (16.7)
Median [Min, Max]	52.0 [4.00, 84.0]	56.0 [14.0, 95.0]	65.0 [49.0, 86.0]	54.0 [4.00, 95.0]
ulcer
Absent	92 (68.7%)	16 (28.1%)	7 (50.0%)	115 (56.1%)
Present	42 (31.3%)	41 (71.9%)	7 (50.0%)	90 (43.9%)
thickness
Mean (SD)	2.24 (2.33)	4.31 (3.57)	3.72 (3.63)	2.92 (2.96)
Median [Min, Max]	1.36 [0.100, 12.9]	3.54 [0.320, 17.4]	2.26 [0.160, 12.6]	1.94 [0.100, 17.4]

Improving labels and adding units

We can improve the table by:

adding descriptive labels for variables
specifying units for continuous variables
adding a caption and a footnote
labeling the "Total" column and placing it on the left

melanoma2 <- melanoma1

# Label variables
label(melanoma2$sex)       <- "Sex"
label(melanoma2$age)       <- "Age"
label(melanoma2$ulcer)     <- "Ulceration"
# use asterisk for footnote
label(melanoma2$thickness) <- "Thickness *"

# Assign units
units(melanoma2$age)       <- "years"
units(melanoma2$thickness) <- "mm"

# Caption and footnote
caption  <- "Descriptive statistics of patient characteristics by status"
footnote <- "* Also known as Breslow thickness"

table1(
  ~ sex + age + ulcer + thickness | status,
  data     = melanoma2,
  overall  = c(left = "Total"),
  caption  = caption,
  footnote = footnote
)

Descriptive statistics of patient characteristics by status
	Total (N=205)	Alive (N=134)	Melanoma death (N=57)	Non-melanoma death (N=14)
Sex
Female	126 (61.5%)	91 (67.9%)	28 (49.1%)	7 (50.0%)
Male	79 (38.5%)	43 (32.1%)	29 (50.9%)	7 (50.0%)
Age (years)
Mean (SD)	52.5 (16.7)	50.0 (15.9)	55.1 (17.9)	65.3 (10.9)
Median [Min, Max]	54.0 [4.00, 95.0]	52.0 [4.00, 84.0]	56.0 [14.0, 95.0]	65.0 [49.0, 86.0]
Ulceration
Absent	115 (56.1%)	92 (68.7%)	16 (28.1%)	7 (50.0%)
Present	90 (43.9%)	42 (31.3%)	41 (71.9%)	7 (50.0%)
Thickness * (mm)
Mean (SD)	2.92 (2.96)	2.24 (2.33)	4.31 (3.57)	3.72 (3.63)
Median [Min, Max]	1.94 [0.100, 17.4]	1.36 [0.100, 12.9]	3.54 [0.320, 17.4]	2.26 [0.160, 12.6]
* Also known as Breslow thickness
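The statistics reported for continuous variables can also be changed. A minimal sketch, assuming the abbreviated render syntax described in the table1 vignette and a table1 version whose default statistics include Q1 and Q3; the choice of median with quartiles is illustrative.

```r
library(boot)
library(table1)

# Minimal data preparation so the block runs on its own
mel <- melanoma
mel$status <- factor(
  mel$status,
  levels = c(2, 1, 3),
  labels = c("Alive", "Melanoma death", "Non-melanoma death")
)

# Each element is one output row; "." reuses the pattern text as the row label
tab_iqr <- table1(
  ~ age + thickness | status,
  data              = mel,
  render.continuous = c(. = "Mean (SD)", . = "Median [Q1, Q3]")
)

tab_iqr
```

For full control, render.continuous also accepts a function that returns the formatted rows.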

Grouping strata under a common heading

Now we group the two death strata (Melanoma and Non-melanoma) under a common "Death" heading.

# Labels for variables and group header
labels <- list(
  variables = list(
    sex       = "Sex",
    age       = "Age (years)",
    ulcer     = "Ulceration",
    thickness = "Thickness* (mm)"
  ),
  groups = list("", "", "Death")
)

# Remove the word "death" from the levels, since it appears above
levels(melanoma2$status) <- c("Alive", "Melanoma", "Non-melanoma")

# Set up strata (columns) as a list of data frames
strata <- c(list(Total = melanoma2),
            split(melanoma2, melanoma2$status))

# New Table 1 with grouped columns
table1(
  strata,
  labels,
  groupspan = c(1, 1, 2),
  caption   = caption,
  footnote  = footnote
)

Descriptive statistics of patient characteristics by status
			Death
	Total (N=205)	Alive (N=134)	Melanoma (N=57)	Non-melanoma (N=14)
Sex
Female	126 (61.5%)	91 (67.9%)	28 (49.1%)	7 (50.0%)
Male	79 (38.5%)	43 (32.1%)	29 (50.9%)	7 (50.0%)
Age (years)
Mean (SD)	52.5 (16.7)	50.0 (15.9)	55.1 (17.9)	65.3 (10.9)
Median [Min, Max]	54.0 [4.00, 95.0]	52.0 [4.00, 84.0]	56.0 [14.0, 95.0]	65.0 [49.0, 86.0]
Ulceration
Absent	115 (56.1%)	92 (68.7%)	16 (28.1%)	7 (50.0%)
Present	90 (43.9%)	42 (31.3%)	41 (71.9%)	7 (50.0%)
Thickness* (mm)
Mean (SD)	2.92 (2.96)	2.24 (2.33)	4.31 (3.57)	3.72 (3.63)
Median [Min, Max]	1.94 [0.100, 17.4]	1.36 [0.100, 12.9]	3.54 [0.320, 17.4]	2.26 [0.160, 12.6]
* Also known as Breslow thickness

Customizing table1 in this way is powerful but not very straightforward and requires extra work.

Converting to `flextable` for further customization

One advantage of table1 is the ability to convert its output to other table formats, such as flextable, where formatting and layout may be easier to control.

library(flextable)

# Convert table1.1 to flextable
tab1.flex <- table1.1 |> t1flex()
tab1.flex

	Alive (N=134)	Melanoma death (N=57)	Non-melanoma death (N=14)	Overall (N=205)
sex
Female	91 (67.9%)	28 (49.1%)	7 (50.0%)	126 (61.5%)
Male	43 (32.1%)	29 (50.9%)	7 (50.0%)	79 (38.5%)
age
Mean (SD)	50.0 (15.9)	55.1 (17.9)	65.3 (10.9)	52.5 (16.7)
Median [Min, Max]	52.0 [4.00, 84.0]	56.0 [14.0, 95.0]	65.0 [49.0, 86.0]	54.0 [4.00, 95.0]
ulcer
Absent	92 (68.7%)	16 (28.1%)	7 (50.0%)	115 (56.1%)
Present	42 (31.3%)	41 (71.9%)	7 (50.0%)	90 (43.9%)
thickness
Mean (SD)	2.24 (2.33)	4.31 (3.57)	3.72 (3.63)	2.92 (2.96)
Median [Min, Max]	1.36 [0.100, 12.9]	3.54 [0.320, 17.4]	2.26 [0.160, 12.6]	1.94 [0.100, 17.4]

Now we modify tab1.flex directly with flextable functions:

tab1.flex |>
  # add header row
  add_header_row(
    values    = c("", "Death", ""),       # labels for top header row
    colwidths = c(2, 2, 1)                # columns spanned by each header
  ) |>
  # remove the border line under "Death"
  hline(part = "header", i = 1,
        border = officer::fp_border(width = 0)) |>
  # add line over Melanoma and Non-melanoma columns
  hline(part = "header", i = 1,
        border = officer::fp_border(width = 1.5), j = 3:4) |>
  # change labels in the first column
  compose(i = 1,  j = 1, as_paragraph(as_chunk("SEX"))) |>
  compose(i = 4,  j = 1, as_paragraph(as_chunk("AGE (years)"))) |>
  compose(i = 7,  j = 1, as_paragraph(as_chunk("Ulceration"))) |>
  compose(i = 10, j = 1, as_paragraph(as_chunk("Thickness (mm)"))) |>
  # add caption
  set_caption(caption = "Table 1: Descriptive statistics of patient characteristics by status") |>
  # add footnote
  footnote(
    i           = 10,
    j           = 1,
    ref_symbols = "a",
    value       = as_paragraph("Also known as Breslow thickness")
  ) |>
  # adjust font size in footer/footnote if needed
  fontsize(i = 1, j = 1, size = 9, part = "footer")

Table 1: Descriptive statistics of patient characteristics by status
		Death
	Alive (N=134)	Melanoma death (N=57)	Non-melanoma death (N=14)	Overall (N=205)
SEX
Female	91 (67.9%)	28 (49.1%)	7 (50.0%)	126 (61.5%)
Male	43 (32.1%)	29 (50.9%)	7 (50.0%)	79 (38.5%)
AGE (years)
Mean (SD)	50.0 (15.9)	55.1 (17.9)	65.3 (10.9)	52.5 (16.7)
Median [Min, Max]	52.0 [4.00, 84.0]	56.0 [14.0, 95.0]	65.0 [49.0, 86.0]	54.0 [4.00, 95.0]
Ulceration
Absent	92 (68.7%)	16 (28.1%)	7 (50.0%)	115 (56.1%)
Present	42 (31.3%)	41 (71.9%)	7 (50.0%)	90 (43.9%)
Thickness (mm)a
Mean (SD)	2.24 (2.33)	4.31 (3.57)	3.72 (3.63)	2.92 (2.96)
Median [Min, Max]	1.36 [0.100, 12.9]	3.54 [0.320, 17.4]	2.26 [0.160, 12.6]	1.94 [0.100, 17.4]
aAlso known as Breslow thickness

You may find that this approach uses more lines of code, but the steps are more explicit and can be easier to control.
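Once the table is a flextable object, it can be written straight to a Word file. The sketch below is a hedged illustration: the file name is hypothetical and the file is written to a temporary directory.

```r
library(boot)
library(table1)
library(flextable)

# Minimal data preparation so the block runs on its own
mel <- melanoma
mel$status <- factor(
  mel$status,
  levels = c(2, 1, 3),
  labels = c("Alive", "Melanoma death", "Non-melanoma death")
)

# Build the table, convert it to flextable, and fit column widths to content
ft <- table1(~ age + thickness | status, data = mel) |>
  t1flex() |>
  autofit()

# Hypothetical output file
docx_file <- file.path(tempdir(), "table1.docx")
save_as_docx(ft, path = docx_file)
```

Because the file is produced by code, re-running the script after a data change regenerates the Word table without manual editing.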

To learn more about the table1 package, see:

Using the table1 Package to Create HTML Tables of Descriptive Statistics

sjPlot

The sjPlot package is a collection of plotting and table-output functions for data visualization.

Results of many statistical analyses (commonly used in the social sciences) can be visualized using this package, including simple and cross-tabulated frequencies, linear models, generalized linear models, mixed-effects models, PCA and correlation matrices, cluster analyses, and more.

Key features of `sjPlot`

Cross-tabulation
tab_xtab() creates cross-tabulations with options for adding row and column percentages and association statistics.
Regression tables
tab_model() creates tables of regression models with detailed statistical summaries, including coefficients, standard errors, p-values, and confidence intervals. It supports various model types such as linear, logistic, and mixed-effects models.
Multiple models
You can combine results from multiple models into a single table for comparative analysis.

Cross-tabulation with `tab_xtab()`

library(sjPlot)

# Cross-tabulation of SES by sex for the hsb data
tab_xtab(
  var.row      = hsb$ses,
  var.col      = hsb$female,
  show.col.prc = TRUE
)

ses	female		Total
	female	male	
high	29 26.6 %	29 31.9 %	58 29 %
low	32 29.4 %	15 16.5 %	47 23.5 %
middle	48 44 %	47 51.6 %	95 47.5 %
Total	109 100 %	91 100 %	200 100 %
χ²=4.577 · df=2 · Cramer’s V=0.151 · p=0.101

tab_xtab(
  var.row      = hsb$ses,
  var.col      = hsb$female,
  show.row.prc = TRUE,
  statistics   = "phi"
)

ses	female		Total
	female	male	
high	29 50 %	29 50 %	58 100 %
low	32 68.1 %	15 31.9 %	47 100 %
middle	48 50.5 %	47 49.5 %	95 100 %
Total	109 54.5 %	91 45.5 %	200 100 %
χ²=4.577 · df=2 · φ=0.151 · p=0.101

Regression tables with `tab_model()`

library(MASS)   # for glm.nb

# Poisson model of awards on math, read, and SES
m.pois <- glm(awards ~ math + read + ses,
              family = poisson(),
              data   = hsb)

# Print using tab_model
tab_model(
  m.pois,
  dv.labels = "Poisson model"
)

	Poisson model
Predictors	Incidence Rate Ratios	CI	p
(Intercept)	0.03	0.01 – 0.07	<0.001
math	1.05	1.03 – 1.06	<0.001
read	1.03	1.01 – 1.04	<0.001
ses [low]	0.89	0.65 – 1.22	0.491
ses [middle]	0.78	0.61 – 1.00	0.049
Observations	200
R² Nagelkerke	0.629

# Poisson model with cluster-robust covariance matrix
tab_model(
  m.pois,
  vcov.fun  = "CL",
  vcov.args = list(type = "HC1", cluster = hsb$cid),
  dv.labels = "Poisson with cluster-robust covariance matrix"
)

	Poisson with cluster-robust covariance matrix
Predictors	Incidence Rate Ratios	CI	p
(Intercept)	0.03	0.01 – 0.07	<0.001
math	1.05	1.03 – 1.06	<0.001
read	1.03	1.01 – 1.04	<0.001
ses [low]	0.89	0.65 – 1.22	0.487
ses [middle]	0.78	0.61 – 1.00	0.035
Observations	200
R² Nagelkerke	0.629

# Negative binomial model of awards on math, read, and SES
m.nbin <- glm.nb(awards ~ math + read + ses, data = hsb)

# Print two models together in one table, adding AIC and deviance
tab_model(
  m.pois,
  m.nbin,
  vcov.fun  = "CL",
  vcov.args = list(type = "HC1", cluster = hsb$cid),
  show.dev  = TRUE,
  show.aic  = TRUE,
  dv.labels = c("Poisson", "Negative binomial")
)

	Poisson			Negative binomial
Predictors	Incidence Rate Ratios	CI	p	Incidence Rate Ratios	CI	p
(Intercept)	0.03	0.01 – 0.07	<0.001	0.03	0.01 – 0.06	<0.001
math	1.05	1.03 – 1.06	<0.001	1.05	1.03 – 1.07	<0.001
read	1.03	1.01 – 1.04	<0.001	1.03	1.01 – 1.05	<0.001
ses [low]	0.89	0.65 – 1.22	0.487	0.89	0.62 – 1.27	0.484
ses [middle]	0.78	0.61 – 1.00	0.035	0.79	0.60 – 1.04	0.040
Observations	200			200
R² Nagelkerke	0.629			0.595
Deviance	256.818			221.505
AIC	613.047			611.548
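Predictor labels and column headings in tab_model() can be set by hand. Because the hsb data are not bundled with R, this sketch uses the built-in warpbreaks dataset instead; the model and the label text are illustrative assumptions.

```r
library(sjPlot)

# Poisson model on a built-in dataset (stand-in for the hsb example)
m_breaks <- glm(breaks ~ wool + tension,
                family = poisson(),
                data   = warpbreaks)

tab_breaks <- tab_model(
  m_breaks,
  show.se     = TRUE,                      # add a standard-error column
  # labels follow the coefficient order: intercept, woolB, tensionM, tensionH
  pred.labels = c("Intercept", "Wool B",
                  "Tension: medium", "Tension: high"),
  dv.labels   = "Number of breaks",
  string.ci   = "95% CI"                   # heading of the CI column
)

tab_breaks
```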

For more details, see:

Summary of regression models as HTML table

gtsummary

The gtsummary package is designed to create summary tables for a variety of statistical analyses.
It focuses on making publication-ready tables that are easy to generate and aesthetically pleasing.

Key advantages

Ease of use
Minimal coding is required to generate complex, publication-quality tables.
Flexibility
Tables can be customized to suit the needs of different publications or audiences.
Integrated statistical reporting
Automatically includes relevant statistics such as p-values, confidence intervals, and effect sizes.
Exportability
Tables can be converted to gt or flextable objects for further customization, and exported to formats suitable for Word and PowerPoint.
Themes
It is possible to set themes in gtsummary. Themes control many aspects of how a table is printed (labels, style, formatting, etc.).

Descriptive summary tables

library(gtsummary)

# Descriptive summary table (similar to Table 1)
tab1_gt <- tbl_summary(
  melanoma1,
  include = -c(time, year),
  by      = status
)

tab1_gt

Characteristic	Alive N = 134¹	Melanoma death N = 57¹	Non-melanoma death N = 14¹
sex
Female	91 (68%)	28 (49%)	7 (50%)
Male	43 (32%)	29 (51%)	7 (50%)
age	52 (40, 62)	56 (44, 68)	65 (56, 72)
thickness	1.36 (0.81, 2.90)	3.54 (2.24, 4.84)	2.26 (1.29, 6.12)
ulcer
Absent	92 (69%)	16 (28%)	7 (50%)
Present	42 (31%)	41 (72%)	7 (50%)
¹ n (%); Median (Q1, Q3)

# Add p-values and modify headers/labels
tab1_gt |>
  add_p() |>
  modify_header(label = "**Variable**") |>
  bold_labels()

Variable	Alive N = 134¹	Melanoma death N = 57¹	Non-melanoma death N = 14¹	p-value²
sex				0.033
Female	91 (68%)	28 (49%)	7 (50%)
Male	43 (32%)	29 (51%)	7 (50%)
age	52 (40, 62)	56 (44, 68)	65 (56, 72)	0.001
thickness	1.36 (0.81, 2.90)	3.54 (2.24, 4.84)	2.26 (1.29, 6.12)	<0.001
ulcer				<0.001
Absent	92 (69%)	16 (28%)	7 (50%)
Present	42 (31%)	41 (72%)	7 (50%)
¹ n (%); Median (Q1, Q3)
² Pearson’s Chi-squared test; Kruskal-Wallis rank sum test

# Summary statistics for continuous variables as mean (SD),
# add overall column and confidence intervals
tbl_summary(
  melanoma1,
  include   = -c(time, year),
  by        = status,
  statistic = all_continuous() ~ "{mean} ({sd})"
) |>
  add_overall() |>
  add_ci(
    method  = all_categorical() ~ "wald",
    pattern = "{stat} ({ci})"
  ) |>
  modify_spanning_header(
    c("stat_2", "stat_3") ~ "**Death**"
  )

			Death
Characteristic	Overall N = 205 (95% CI)¹	Alive N = 134 (95% CI)¹	Melanoma death N = 57 (95% CI)¹	Non-melanoma death N = 14 (95% CI)¹
sex
Female	126 (61%) (55%, 68%)	91 (68%) (60%, 76%)	28 (49%) (35%, 63%)	7 (50%) (20%, 80%)
Male	79 (39%) (32%, 45%)	43 (32%) (24%, 40%)	29 (51%) (37%, 65%)	7 (50%) (20%, 80%)
age	52 (17) (50, 55)	50 (16) (47, 53)	55 (18) (50, 60)	65 (11) (59, 72)
thickness	2.92 (2.96) (2.5, 3.3)	2.24 (2.33) (1.8, 2.6)	4.31 (3.57) (3.4, 5.3)	3.72 (3.63) (1.6, 5.8)
ulcer
Absent	115 (56%) (49%, 63%)	92 (69%) (60%, 77%)	16 (28%) (16%, 41%)	7 (50%) (20%, 80%)
Present	90 (44%) (37%, 51%)	42 (31%) (23%, 40%)	41 (72%) (59%, 84%)	7 (50%) (20%, 80%)
Abbreviation: CI = Confidence Interval
¹ n (%); Mean (SD)

Cross tables of categorical variables

# Basic cross-tabulation with p-value
tbl_cross(
  row     = ses,
  col     = honors,
  percent = "row",
  data    = hsb
) |>
  add_p()

	honors
	enrolled	not enrolled	Total	p-value¹
ses				<0.001
high	26 (45%)	32 (55%)	58 (100%)
low	11 (23%)	36 (77%)	47 (100%)
middle	16 (17%)	79 (83%)	95 (100%)
Total	53 (27%)	147 (74%)	200 (100%)
¹ Pearson’s Chi-squared test

# Set compact theme
theme_gtsummary_compact()

## Setting theme "Compact"

tbl_cross(
  row     = ses,
  col     = honors,
  percent = "row",
  data    = hsb
) |>
  add_p()

	honors
	enrolled	not enrolled	Total	p-value¹
ses				<0.001
high	26 (45%)	32 (55%)	58 (100%)
low	11 (23%)	36 (77%)	47 (100%)
middle	16 (17%)	79 (83%)	95 (100%)
Total	53 (27%)	147 (74%)	200 (100%)
¹ Pearson’s Chi-squared test

Formatted table of regression model results

# Logistic regression model
m2 <- glm(
  honors ~ math + ses,
  family = binomial(link = "logit"),
  data   = hsb
)

# Reset to gtsummary default theme
reset_gtsummary_theme()

# Regression table with odds ratios
tab1.glm <- tbl_regression(m2, exponentiate = TRUE)
tab1.glm

Characteristic	OR	95% CI	p-value
math	0.84	0.79, 0.88	<0.001
ses
high	—	—
low	1.05	0.35, 3.10	>0.9
middle	3.44	1.43, 8.58	0.007
Abbreviations: CI = Confidence Interval, OR = Odds Ratio

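# Set theme to Journal of the American Medical Association (JAMA).
# Themes take effect when a table is built, so tab1.glm (created above)
# prints unchanged; re-run tbl_regression() to get the JAMA styling.
theme_gtsummary_journal(journal = "jama")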

## Setting theme "JAMA"

tab1.glm

Characteristic	OR	95% CI	p-value
math	0.84	0.79, 0.88	<0.001
ses
high	—	—
low	1.05	0.35, 3.10	>0.9
middle	3.44	1.43, 8.58	0.007
Abbreviations: CI = Confidence Interval, OR = Odds Ratio
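gtsummary themes are applied when a table is created, so a table built before the theme was set keeps its original styling. A minimal sketch of the usual order of operations, using the trial dataset bundled with gtsummary (an assumption made so the example runs without the hsb data):

```r
library(gtsummary)

m_resp <- glm(response ~ age + grade,
              family = binomial(),
              data   = trial)

# 1. Set the theme first
theme_gtsummary_journal(journal = "jama")

# 2. Build the table while the theme is active
tbl_jama <- tbl_regression(m_resp, exponentiate = TRUE)

# 3. Reset the theme so later tables use the defaults again
reset_gtsummary_theme()
tbl_default <- tbl_regression(m_resp, exponentiate = TRUE)

tbl_jama
```

Printing tbl_jama and tbl_default side by side shows the effect of the theme on the same model.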

# Convert table for further formatting with gt
tab1.glm |>
  add_global_p() |>          # add overall p-value for ses
  bold_p(t = 0.01) |>        # bold p-values < 0.01
  as_gt() |>                 # convert to gt table
  tab_source_note(           # add source note (Markdown interpreted)
    md("*This data is simulated*")
  )

Characteristic	OR	95% CI	p-value
math	0.84	0.79, 0.88	<0.001
ses			0.011
high	—	—
low	1.05	0.35, 3.10
middle	3.44	1.43, 8.58
Abbreviations: CI = Confidence Interval, OR = Odds Ratio
This data is simulated

Inline reporting with `inline_text()`

Reproducible reports are an important part of good practice.
We often need to report results from a table in the text of an R Markdown report.
inline_text() makes this kind of inline reporting simple.

The method inline_text.tbl_regression() has the following signature:

inline_text(
  x,
  variable,
  level        = NULL,
  pattern      = "{estimate} ({conf.level*100}% CI {conf.low}, {conf.high}; {p.value})",
  estimate_fun = x$inputs$estimate_fun,
  pvalue_fun   = label_style_pvalue(prepend_p = TRUE),
  ...
)

To report a result from a gtsummary table, call inline_text() inside inline R code, which is written between backticks as `r inline_text()`.

For example, to report the OR for math from the regression table, we can type:

For every one-unit increase in math score, we expect the odds of not being enrolled in the honors program to change, on average, by a factor of `r inline_text(tab1.glm, variable = math, pattern = "{estimate}; 95% CI ({conf.low}, {conf.high})")`, keeping ses constant.

In the report it will appear like this:

For every one-unit increase in math score, we expect the odds of not being enrolled in the honors program to change, on average, by a factor of 0.84; 95% CI (0.79, 0.88), keeping ses constant.
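inline_text() can also be tried in the console, where it simply returns a character string. The sketch below uses the trial dataset bundled with gtsummary (an assumption made so the example runs without the hsb data) and shows the level argument for a categorical predictor.

```r
library(gtsummary)

m_resp <- glm(response ~ age + grade,
              family = binomial(),
              data   = trial)
tbl_resp <- tbl_regression(m_resp, exponentiate = TRUE)

# Continuous predictor: custom pattern with estimate and confidence limits
or_age <- inline_text(
  tbl_resp,
  variable = age,
  pattern  = "{estimate}; 95% CI ({conf.low}, {conf.high})"
)

# Categorical predictor: pick one level, keep the default pattern
or_grade3 <- inline_text(tbl_resp, variable = grade, level = "III")

or_age
or_grade3
```

The same calls work inside inline R code in an R Markdown or Quarto document.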

Converting to other packages

The output of gtsummary tables can be converted to gt, kableExtra, or flextable objects.
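A hedged sketch of the conversion step, using the trial dataset bundled with gtsummary. The Word file name is hypothetical and the file is written to a temporary directory.

```r
library(gtsummary)

m_resp <- glm(response ~ age + grade,
              family = binomial(),
              data   = trial)
tbl_resp <- tbl_regression(m_resp, exponentiate = TRUE)

# Convert to a gt table (for HTML output, for example)
gt_obj <- as_gt(tbl_resp)

# Convert to a flextable (for Word or PowerPoint output)
ft_obj <- as_flex_table(tbl_resp)

# Hypothetical output file
docx_file <- file.path(tempdir(), "regression.docx")
flextable::save_as_docx(ft_obj, path = docx_file)
```

After conversion, the object can be modified with the functions of the target package, such as tab_source_note() for gt or fontsize() for flextable.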

The gtsummary website provides a summary of the Quarto and R Markdown output types and the print engines that support each of them.

To learn more about using gtsummary in R Markdown, see:

https://www.danieldsjoberg.com/gtsummary/articles/rmarkdown.html

gtsummary reference

Sjoberg DD, Whiting K, Curry M, Lavery JA, Larmarange J.
Reproducible summary tables with the gtsummary package. The R Journal 2021;13:570–80.
https://doi.org/10.32614/RJ-2021-053

Principles for effective statistical tables

1. Purpose and audience
Be clear why the table exists (compare, summarize, show trends, show relationships) and design it for the intended readers.
2. Clear structure
Arrange rows and columns in a logical order (time, region, alphabetical, or by size).
Use short, specific headings and explain any units (e.g., "in millions", "in %").
3. Consistent units and scale
Use the same units and number of decimal places within a table.
If different units are needed, label them clearly.
4. Simplicity and visual balance
Keep the layout simple and easy to scan.
Avoid overcrowding—use spacing, alignment, and borders to keep the table readable.
If needed, split an overloaded table into two smaller related tables.
5. Accuracy and comparability
Check that all numbers, totals, and derived values are correct.
Ensure data from different years, regions, or sources are truly comparable (same definitions, time periods, and methods).
6. Titles, notes, and sources
Use a precise title that states what, where, and when the data represent.
Add footnotes for special cases or symbols, and always include the data source.
7. Emphasize key figures
Use subtle formatting (e.g., bold, spacing, or grouping) to highlight important totals, averages, or results.
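Several of these principles can be applied in a few lines of code. The sketch below rebuilds the population table from the start of this workshop with gt; the use of gt here and the decision to bold the most recent year are illustrative assumptions.

```r
library(gt)

# Values taken from Table 1 at the start of the workshop
pop <- data.frame(
  year       = 2020:2024,
  population = c(3.98, 4.02, 4.05, 4.10, 4.12),
  growth     = c(1.2, 1.0, 0.8, 1.2, 0.5)
)

pop_tbl <- gt(pop) |>
  # precise title: what and when
  tab_header(title = "Annual Population and Growth Rate, 2020–2024") |>
  # short headings with units
  cols_label(
    year       = "Year",
    population = "Population (millions)",
    growth     = "Growth Rate (%)"
  ) |>
  # consistent decimal places within each column
  fmt_number(columns = population, decimals = 2) |>
  fmt_number(columns = growth, decimals = 1) |>
  # subtle emphasis on the most recent year
  tab_style(
    style     = cell_text(weight = "bold"),
    locations = cells_body(rows = year == 2024)
  ) |>
  # always state the source
  tab_source_note("Source: National Statistics Department (2024).")

pop_tbl
```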