How to use this seminar

This seminar aims to teach the user basic R Markdown syntax to make dynamic, reproducible reports.

First we will discuss what R Markdown is, how it is used, and how it works.

The rest of the seminar focuses on R Markdown syntax, specifically:

  • Markdown tags to format text
  • knitr options to format R code and output
  • YAML coding to control the output document type and its appearance

This seminar does not attempt to explain all of the R code used in the example reports.

Text that appears with this typeface and background is usually code syntax you can use when authoring your R Markdown files. Buttons and menus in RStudio will also appear formatted this way.

Text that appears blockquoted like this is a set of instructions to construct an R Markdown file. Click the Knit button after finishing all instructions within a block to view the results of your coding.

What is R Markdown?

“An authoring framework for data science” – R Markdown creators

R Markdown allows us to create reproducible documents that weave narrative text together with R code and the output it produces when executed.

For example, here is an R code block inserted into the R Markdown file that generates this slide show. Underneath the code is its output:

barplot(HairEyeColor[,,1],
        col=c("#4d4d4d", "#bf812d", "#f4a582", "#f6e8c3"),
        legend.text=TRUE, xlab="Eye Color", 
        args.legend=list(title="Hair Color"))

Barplot showing counts of eye colors for different hair colors

These documents are dynamically generated – update code or data, then re-compile the file and the output will be automatically updated in the resulting document.

Your documents can thus provide the most up-to-date content.

Installing R Markdown and working in RStudio

We highly recommend working with R Markdown in RStudio, which has many features that facilitate R Markdown file editing, including:

  • syntax highlighting: headers are colored blue, R code chunks have a gray background
  • compiling the R Markdown file with a single button push
  • ability to run code chunks individually without having to compile the whole file
  • RStudio will produce a preview of the document when compiling
  • RStudio includes the document converter Pandoc

Once R and RStudio are installed, you can install R Markdown with install.packages("rmarkdown") as usual.

Starting our first markdown file

We can open a new Markdown file template through the File menu in RStudio.

A. Choose File -> New File -> R Markdown...
B. Fill the Title field and Author fields with “Practice” and your name, respectively.
C. In the left menu, select Document, and for Default Output Format select option HTML (these are the defaults).
D. Click OK

R Markdown files typically use the extension .Rmd or .rmd

This new R Markdown file contains the basic elements of an R Markdown file:

  • YAML metadata header
  • text, some formatted by Markdown
  • R code chunks

Note: If you have not installed package rmarkdown and try to open a .rmd file through the File menu, RStudio may ask you to install rmarkdown immediately.

Elements of an R Markdown file - YAML header

The header atop the code file contains optional metadata (title:, author: etc.).

Additionally, YAML code can be specified to control:

  • output document format (HTML, LaTeX pdf, etc.)
  • overall appearance of the document
  • adding other files to add content or style the document (e.g. CSS)

“YAML Ain’t Markup Language” or “Yet Another Markup Language” is a human-readable language, used here to communicate with Pandoc.

Pandoc converts documents between formats and controls their overall appearance. Pandoc is installed with RStudio.

Elements of an R Markdown file - Markdown

In the body of the document, you’ll find text with special characters that have been highlighted blue:

  • ## R Markdown and ## Including plots: ## signifies that the text is to be treated as a section header (or a new slide for a slide show)
  • **Knit**: The ** signify that “Knit” is to appear in bold

## and ** are Markdown tags which format the text enclosed within them.

Markdown is a markup language, a system of code tags to format text.

Elements of an R Markdown file - R code chunks

Finally, we also see R code chunks, highlighted with gray backgrounds and enclosed within ```{r } and ```.

The R code chunks are actually processed by the package knitr, which is installed with rmarkdown.

When the R Markdown file is compiled and rendered, the output of the code chunk will be embedded in the document underneath the code.

rmarkdown (via knitr) provides a large array of options to control the appearance of both the R code and its output.

Compiling and rendering an R Markdown file

When you are ready to compile and render your .rmd file, click the Knit button (Ctrl+Shift+K) in RStudio.

The output document will be saved in the same directory as where the .rmd file is located.

RStudio will also provide a preview of the output document.

Progress and messages produced by rendering the .rmd file will be displayed in the R Markdown console, which appears in its own tab next to the regular Console.
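
The Knit button is a convenience wrapper around the function rmarkdown::render(). The sketch below shows the same step run from the R console. The file name practice.Rmd is only a placeholder for whatever you named your own file, and nothing is rendered if that file does not exist.

```r
# Hypothetical file name; replace it with the path to your own .rmd file.
rmd_file <- "practice.Rmd"

# render() knits the file and converts it to the output format named in the
# YAML header. It returns the path of the output document.
out <- if (file.exists(rmd_file)) rmarkdown::render(rmd_file) else NULL
print(out)
```

Unlike the Knit button, calling render() from the console runs the code in your current R session unless you start a fresh session yourself.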

Click the Knit button. Name the file whatever you want, but use the .rmd extension. Observe how elements of the .rmd file appear in the output.

How it all works

R Markdown is thus the unification of 3 frameworks:

  • YAML/pandoc, to allow a variety of output formats
  • Markdown, to format text
  • knitr, to process R code chunks

When we render the document, the following happens:

First, knitr converts all of the R code and output into text and Markdown tags, resulting in a Markdown file (.md) of just text, Markdown tags, and links to image files.

Screenshot showing an R Markdown file on the left and the resulting Markdown file on the right, demonstrating how the R Markdown content is converted into Markdown.
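
To see this first step in isolation, here is a minimal sketch that passes a tiny, made-up R Markdown document to knitr::knit() as text and prints the Markdown it returns. The three-line document is invented for illustration and is not one of the seminar files.

```r
library(knitr)

# Three backticks, built with strrep() so the example is easy to copy.
fence <- strrep("`", 3)

# A tiny R Markdown document, one element per line.
rmd_lines <- c(
  "The mean of 1, 2 and 3 is:",
  "",
  paste0(fence, "{r}"),
  "mean(c(1, 2, 3))",
  fence
)

# knit() runs the chunk and returns plain Markdown: the R code and its
# output are now ordinary fenced text blocks.
md <- knit(text = rmd_lines, quiet = TRUE)
cat(md)
```

The printed result contains no executable code, only text and Markdown tags, which is what pandoc receives in the second step.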

Then, pandoc converts the Markdown (.md) file into the desired final output format, such as an HTML web page, a LaTeX pdf document, or an ioslides slide-show presentation, etc.

Screenshot showing a Markdown file on the left and its rendered HTML output on the right, demonstrating how the Markdown content is formatted in HTML

Screenshot showing a Markdown file on the left and its rendered ioslides presentation output on the right, demonstrating how the Markdown content is formatted in ioslides presentation

Screenshot showing a Markdown file on the left and its rendered LaTeX PDF output on the right, demonstrating how the Markdown content is formatted in PDF

Learning R Markdown

Screenshot illustrating the workflow from an R Markdown file to multiple output documents, with icons for Rmd, knitr, Markdown, and pandoc showing the conversion process.

Thus, to use R Markdown proficiently, we must learn each of the 3 coding frameworks: Markdown, knitr and YAML (pandoc).

Fortunately, the coding is not difficult to learn, and you can find lots of free help online.

You can produce a wide variety of documents with just a little coding knowledge, but very fine control over documents will require that you learn more advanced coding (such as HTML or LaTeX).

R Markdown keeps it simple for the user by requiring a single format, the .rmd file, to produce this wide variety of output.

Markdown

First, what is a markup language and what is HTML?

A markup language is a system of tags (code) used to format documents. Tags are used to define sections, change the appearance of text, build tables, link images, and so on.

Hypertext Markup Language (HTML) is a markup language designed to be used for web pages.

HTML tags generally include both an opening and closing tag, and have this form <tag> and </tag>.

For example, enclosing text in <em> tags results in italicized text.

<em>hello</em> becomes hello when viewing the document in a web browser.

What is Markdown?

Markdown is a lightweight markup language, simple and easy to type.

Markdown was originally designed to be a shorthand for HTML. For example, we just learned that to italicize text in HTML, you enclose it in <em> </em>. In Markdown, you just enclose it in * *.

However, pandoc can convert Markdown to many different output formats besides just HTML.

R Markdown uses pandoc’s version of Markdown, which differs a bit from standard Markdown.

Because of its simplicity, Markdown is very easy to use, so let’s dive in!

In your currently open .rmd file, erase all content except for the YAML header.

Spacing and paragraphs

A single newline (carriage return) is treated as a space in Markdown.

First, try adding the following text broken by a single new-line (and no additional spaces) after the first period:

First.
Second.

In order to begin a new paragraph, insert a blank line (i.e. 2 newlines) before beginning the new paragraph. This will double-space paragraphs.

Now, try retyping the text with 2 newlines after the first period:

First.

Second.

Or, you can add 2 spaces before the newline to force a line break without a blank line in between, which single-spaces the paragraphs.

Finally, try retyping the text with 2 spaces and a single newline after the period:

First.  
Second.

Headers

Section headers are often displayed in larger, bolder fonts.

To format text as a header, place one to six # tags at the beginning of the header text. The number of # signs indicates the level of the header: a single # is the top level and is displayed largest, and each additional # produces a smaller header.

Put a space between the # tags and the header text.

Headers must be preceded by at least one blank line.

Add a level 1 header called “Big Header” and a level 3 header called “Small Header”

Italics, bold, strikethrough, code, super- and subscripts

Markdown provides simple tags to format text for emphasis as well as super- and subscripting:

  • *italics* produces italics
  • **bold** produces bold
  • ~~strikethrough~~ produces strikethrough
  • `code` produces code
  • text^super^ produces “text” followed by “super” as a superscript
  • text~sub~ produces “text” followed by “sub” as a subscript

Do not insert spaces in between formatting tags and the text.

Any character preceded by a backslash will be treated as a literal character and not as a code tag:

\*italics\* produces *italics*

Try recreating this formatted text with Markdown syntax:

We multiplied x by z2 to create the interaction variable x*z2

Bulleted Lists

Markdown bulleted lists are much easier to specify than HTML lists.

To create a list, precede each list item with * (or + or -) and a space:

* item1
* item2

To add sublists, indent 4 spaces:

* item1
    + sub1.1
    + sub1.2
* item2

Use numbers with periods as tags for numbered lists:

1. item1
2. item2
3. item3

Links: internal and external

External images

Embedding external images (not created by R code within the document) in a document uses syntax very similar to linking:

![Caption](image_location)

Note: Do not put the image path/name in quotes!

So, assuming there is a file named “densities.png” in a subdirectory named rmarkdown_images, located in the same directory as the .rmd file:

![Fig 1 Densities by diet](rmarkdown_images/densities.png)

Alternatively, you can use HTML code:

<img src="rmarkdown_images/densities.png" alt="Fig 1 Densities by diet" style="width: 50%;">

Fig 1 Densities by diet

TeX Math

Text enclosed by $ symbols will be treated as TeX Math, another set of markup tags used to format mathematical expressions.

For example, $mean(X) = \frac{\sum_{i=1}^n X_i}{n}$ will be rendered as \(mean(X) = \frac{\sum_{i=1}^n X_i}{n}\)

A TeX expression enclosed by two $ symbols on each side will be displayed as an equation, generally centered in the document. So, $$mean(X) = \frac{\sum_{i=1}^n X_i}{n}$$ produces \[mean(X) = \frac{\sum_{i=1}^n X_i}{n}\]

Although TeX Math is often associated with LaTeX documents, it can be used in any document type supported by R Markdown.

R code chunks

R Markdown, knitr, and R code chunks

By itself, R package knitr is used to weave text and R code output together into reports.

R Markdown builds upon knitr by allowing Markdown tags to format text and using Pandoc to convert between document formats.

knitr executes code chunks sequentially when the .rmd file is knit, so R objects created in a chunk are available to all subsequent chunks.

A large array of knitr options provides control over the appearance of R code, text output, and graphical output in the final document.

Please open the sleep_study.rmd file to practice using knitr code chunks. You may close the .rmd file we used to practice Markdown.

knitr R code chunk delimiters

R code chunks are delimited by ```{r chunk_label, options} at the beginning and ``` at the end. The chunk_label and options are indeed optional and are separated by commas (much more on this soon).

Three ways to add R code tags:

  • keyboard shortcut Ctrl + Alt + I (Cmd + Option + I on Macs)
  • the Insert button in RStudio, located on the same toolbar as the Knit button
  • type them manually

Add an R code chunk using the keyboard shortcut or the Insert button to the sleep_study.rmd file after the text “Here are the data:”

Our first output

Data sets stored as data.frame objects in R can be printed to the document by simply specifying the object name.

Inside our new code chunk, add the following code:
sleep
summary(sleep)

Let’s look closely at the output:

  • the R code itself is displayed with a colored (gray) background
  • Each line of output is demarcated with ##, double comment symbols, allowing the user to copy and paste all of the R-related content without interfering with execution of the R code.
  • Each line of R code is followed by its output

We can control all of this!

Code chunk options

Much of the power of rmarkdown via knitr lies in its wide array of options to control the appearance of R code output.

See the knitr documentation for a full list of chunk options.

To specify chunk options, after ```{r, specify a chunk label (name), a comma, and then a list of options separated by commas. This is known as the chunk header.

All of the chunk options must be specified on one line (no line breaks).

In chunk labels, avoid characters other than alphanumeric characters and -.

Change the first line of the R code chunk to ```{r mydata, echo=FALSE}.

Here mydata is the chunk label, and echo=FALSE is an option. Notice the use of commas to separate.

Common chunk options to control text output

As we saw, echo=FALSE suppresses printing of the R code. By default, echo is set to TRUE, but often we do not want our audience to see the underlying R code.

Here are some options to control our output (default of option specified in parentheses):

  • echo:(TRUE) whether to print the R code to the document. Can be set to a vector of numbers to print only specific lines of code.
  • eval:(TRUE) whether or not to evaluate (run) the R code chunk. Can be set to vector of numbers to evaluate only specific lines of the code, e.g. eval=1:3 evaluates only the first 3 lines of code.
  • include:(TRUE) whether to include the R code and output in the document. Differs from eval in that if include=FALSE, the R code is still evaluated, but nothing is printed to the document.
  • results:('markup') how to print results (note use of single quotes for setting values)
    • 'markup' prints the output with special formatting per the document type
    • 'hide' suppress printing of the output.
    • 'hold' prints all of the output of the entire code chunk together at the end.
    • 'asis' prints raw results without special formatting.
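
The effect of echo and eval can also be checked outside of RStudio. The sketch below knits the same one-line chunk three times with different headers and prints the Markdown that knitr produces. The helper render_chunk() is a hypothetical name written for this illustration and is not part of knitr.

```r
library(knitr)

fence <- strrep("`", 3)  # three backticks

# Hypothetical helper: knit a single chunk with the given header options
# and return the resulting Markdown as a character string.
render_chunk <- function(header) {
  chunk <- c(paste0(fence, "{r ", header, "}"), "mean(sleep$extra)", fence)
  knit(text = chunk, quiet = TRUE)
}

shown  <- render_chunk("echo=TRUE")   # code and output both appear
hidden <- render_chunk("echo=FALSE")  # output only
unrun  <- render_chunk("eval=FALSE")  # code only, never executed

cat(shown)
cat(hidden)
cat(unrun)
```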

Change echo=FALSE to eval=FALSE.

Change eval=FALSE to results='hold'.

Suppressing warnings and messages

Many R functions display warnings and messages to the user.

knitr will print warnings and messages to the document by default, but they may be distracting to the reader.

We can use the chunk options:

  • message:(TRUE) whether to print messages to the document
  • warning: (TRUE) whether to print warnings to the document

A. Insert a new code chunk after the text “First, log-transforming the outcome **extra** was suggested:”.
B. Inside the code chunk, specify these 2 lines of code:

sleep$logextra <- log(sleep$extra)
sleep$logextra

Notice that a warning was printed to the document.
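
The warning arises because extra contains negative values, for which the logarithm is undefined. A quick check in the console, using only base R and the built-in sleep data, shows how many values are affected:

```r
# Negative values become NaN under log(), which triggers the warning.
n_negative <- sum(sleep$extra < 0)

# A value of exactly zero becomes -Inf, which does not trigger a warning.
n_zero <- sum(sleep$extra == 0)

# suppressWarnings() is used here only to keep the console tidy.
logextra <- suppressWarnings(log(sleep$extra))

n_negative
n_zero
sum(is.nan(logextra))
```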

Add the chunk label log-transform and the option warning=FALSE to this second chunk, separated by commas.

Notice now that the warnings are printed to the R Markdown console.

NOTE: you may not want to suppress warnings and messages until you are sure everything is working correctly.

Global chunk options

If you know that you will need to set an option for multiple or all chunks, you can set them globally with a call to knitr::opts_chunk$set() in the first code chunk of the .rmd file.

A. Insert a new code chunk before the “# Purpose” header.
B. Give the chunk the label “setup” in the header.
C. Specify this code inside the chunk: knitr::opts_chunk$set(echo=FALSE).

The global option above sets echo=FALSE for all chunks, thus suppressing all R code.

If you’d like to see the R code in your document, delete this chunk or reset the option to echo=TRUE.
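
You can inspect the global settings from the console as well. This is a small sketch using knitr's opts_chunk object, which stores the chunk defaults. It sets echo, reads it back, and then restores knitr's defaults so that the current session is left unchanged.

```r
library(knitr)

# Change the global default, as the setup chunk does.
opts_chunk$set(echo = FALSE)

# Read the current global value of a single option.
echo_after_set <- opts_chunk$get("echo")
echo_after_set

# Put knitr's defaults back so later work in this session is unaffected.
opts_chunk$restore()
opts_chunk$get("echo")
```

Options written in an individual chunk header still override the global value for that chunk.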

Usually we don’t want to see this setup chunk in the report. Suppress its printing.

Formatted tables with knitr::kable()

The function kable() from the knitr package turns the results of R code into nicely formatted tables (rather than the default R output style).

The table input is usually a data.frame, a matrix, or a table and is the first argument to kable().

kable() includes arguments to control the number of digits printed, column names, column alignment, table caption, and other formatting options. See ?knitr::kable for details.

Look into the package kableExtra to get many more formatting options for kable tables. See the kableExtra documentation for examples.
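
As a minimal sketch of the kable() arguments mentioned above, the call below formats the first three rows of the sleep data. The column headings and the caption text are made up for illustration.

```r
library(knitr)

tab <- kable(
  head(sleep, 3),                 # first three rows of the built-in data
  digits    = 1,                  # round numeric columns to 1 decimal place
  align     = "c",                # center every column
  col.names = c("Extra sleep (hours)", "Drug", "Patient ID"),
  caption   = "First three rows of the sleep data"
)
tab
```

Run in the console, this prints the table as Markdown text. Inside a knitted document, the same call produces a formatted table.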

In the code chunk ‘mydata’, replace the first line of code sleep with knitr::kable(sleep, align='c').

Inline R code

We can also insert R code directly into text, which will be replaced by its output when rendered.

Enclose the inline R code with `r and `.

Inline R code itself will not be printed to the document.

Use Markdown tags to format the output.
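
Any R expression that returns a single value can be used inline, so it is often worth rounding or formatting the number first. The sketch below shows, in the console, a few expressions you might place between `r and ` when reporting the mean of extra.

```r
avg <- mean(sleep$extra)   # average of extra over all 20 rows

avg                        # the raw value
round(avg, 1)              # rounded to one decimal place
format(avg, nsmall = 2)    # a character string with at least two decimals
```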

A. Replace the text “XXX” at the very end of the .Rmd file with the inline R code `r mean(sleep$extra)`.
B. Use Markdown tags ** to bold the result.

R code chunks and Figures

tidyverse package for this section

For this section of the seminar, we will be using the package tidyverse, a diverse collection of packages with many tools for data analysis. Specifically we will be using the following packages within tidyverse:

  • ggplot2 for plots
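  • dplyr for data management
  • broom to make regression tables into printable data frames (broom is installed with tidyverse, but library(tidyverse) does not load it, so load it separately with library(broom))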

Please make sure you have tidyverse installed. You can check by running library(tidyverse) in the Console. If it errors, please run install.packages("tidyverse") now.

Please open the mileage.rmd file to practice syntax for controlling R graphics.

rmarkdown, knitr and R graphics

knitr, and thus rmarkdown, makes including and formatting graphics in documents quite easy.

Graphics produced by R code are placed immediately after the generating code chunk.

Knit the mileage.rmd file and observe the placement of the R code and graphs.

Arranging multiple figures produced by the same code chunk

Notice that the three plots produced by the final code chunk of the mileage.rmd file are interleaved with the individual ggplot() commands that produced them.

The knitr chunk option fig.show determines how to place multiple plots:

  • fig.show: (asis) how to arrange plots produced by the same chunk, taking on one of these settings:
    • asis places each plot immediately after the code that produced it, the default
    • hold places all plots at the very end of the code chunk
    • hide saves the plots to files, but does not place them in the document (requires the option fig.path be set to a valid path to save the plots)

Add the chunk option fig.show='hold' to the fourth chunk, mileage-graphs, of mileage.rmd. Leave this option at 'hold' for the remainder of the seminar.

Because the figures are large, knitr places them one after another.

Sizing and aligning figures

We can easily adjust the size of figures using the knitr chunk options:

  • fig.width:(7) width of figure in inches
  • fig.height:(5) height of figure in inches
  • fig.asp:(NULL) aspect ratio of plot, height/width

If only one of fig.width or fig.height is specified, the other is not adjusted, unless fig.asp is also specified (it is NULL by default), in which case the height is set to fig.width multiplied by fig.asp.
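
The relationship between these three options is simple arithmetic. The sketch below uses a hypothetical helper, figure_height(), written only for this illustration, to show the height that results when fig.asp is set.

```r
# Hypothetical helper: the height (in inches) implied by a width and an
# aspect ratio, where the aspect ratio is height / width.
figure_height <- function(fig_width, fig_asp) {
  fig_width * fig_asp
}

figure_height(3, 1)    # a 3-inch-wide square figure
figure_height(7, 0.5)  # a figure half as tall as it is wide
```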

Add the chunk option fig.width=3 to the code chunk mileage-graphs of mileage.rmd and observe what happens to the size and positioning of the graphs.

Now add the chunk option fig.asp=1 to this same chunk. Don’t forget commas!

How could you change the size of all figures in the document to this size?

We also have an option to adjust the alignment of figures in the document:

fig.align:('default') 'default' is no alignment adjustment, and other possible values are 'left', 'center', and 'right'.

Place the option fig.align='center' inside of knitr::opts_chunk$set() in the very first chunk.

Figure captions

Though no captions are shown by default, knitr makes adding a figure caption easy with this option:

  • fig.cap:(NULL) a character string specifying a figure caption, or NULL for no caption

Add the option fig.cap='Fig 1' to the chunk sample (following the header ## The sample of cars)

Now add the option fig.cap='Fig 2' to the chunk mileage-graphs (following the header ## Mileage graphs) and observe an interesting result

Saving figures (Optional)

By default, knitr will embed the images into the final document as base64 strings, creating a single file with all content including images (rather than saving the images externally and linking them into the document).

If you would also like to save the R-produced images to external files, use:

  • fig.path: ('figure/') prefix to use to name image files, with directory names allowed. For example, fig.path='my_figs/plot-' will create a new directory within the working directory called my_figs, in which each file will be named beginning with the prefix plot- and its corresponding chunk label.

R Notebooks

Where are my R objects?

You might have noticed that after we Knit a file, none of the R objects appear in the current R session.

Using the Knit button actually starts a new R session to render the document, where all the R code is executed and is then closed after rendering.

  • The working directory of the new session is automatically set to the same directory as the location of the .rmd file. You can change the working directory (of the new R session only) within the .rmd file with setwd() as usual, and this will not change the working directory of the current session.
  • Any R packages necessary to run the R code must be loaded in the .rmd document. These will not be loaded into the current session.

Rendering in a new session ensures that the document is reproducible (for instance on someone else’s computer), as it prevents any dependencies on objects (e.g. packages) in the current R session.

Still, it’s handy to be able to work with R objects as you construct your .rmd file, which R Notebooks allow.

R Notebooks

When you edit an R Markdown document within RStudio, it is edited as an R Notebook.

R Notebooks allow the user to execute each R code chunk interactively, which places the output immediately below the code chunk itself in the .rmd document.

R Notebooks are R Markdown files in every sense – they just provide an interactive mode for document editing.

You’ll know that RStudio is treating your .rmd file as an R Notebook if you see three buttons at the top right of each R code chunk.

Check for the buttons at the top right of your code chunks in the mileage.rmd file

  • the green triangle button on the right runs R code in the current chunk and places the output beneath the chunk
  • the gray triangle and green bar button in the middle runs the R code in all chunks before the current chunk, placing the output beneath each chunk
  • the cog button on the left will help you set up basic chunk options for R code output

Click the middle button (gray triangle and green bar) in the code chunk mileage-graphs of mileage.rmd.

Click the right button (green triangle) in the code chunk mileage-graphs of mileage.rmd.

YAML header

Purpose of the YAML header

In the YAML header we specify pandoc options that control the overall appearance of the output document.

YAML code tells pandoc which output document format (HTML webpage, LaTeX pdf, Word doc, etc.) to use and how to style it.

Different options are available for different output document formats.

See the R Markdown Reference Guide cheatsheet to see a table of options by output format (click on the Help menu in RStudio, then click Cheatsheets).

Specifying a YAML header

The YAML header is located at the top of the .rmd file and is enclosed between 2 lines of 3 dashes each (---).

YAML syntax generally is option_name: value. For example:

  • title: Analysis of Health Outcomes
  • output: html_document

The YAML header is actually optional, and if omitted completely, an HTML document will be produced.

We will cover many more YAML (pandoc) options as we discuss specific output formats.

YAML Indentation

Indentation is important when specifying options to style the output document.

  • output: should not be indented
    • The document type should appear on the next line indented 2 spaces (e.g. pdf_document)
      • sub-options (name and value) on new lines, indented 4 spaces

For example, below we add a table of contents that floats on the left of the document created by mileage.rmd. Notice the indentation and newlines in this specification:

---
title: "Mileage of American Cars"
output: 
  html_document:
    toc: TRUE
    toc_float: TRUE
---

Replace the YAML header of mileage.rmd with the header above. Make sure the indentation is copied faithfully.
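
To see why indentation matters, it can help to look at how the header is read. The sketch below parses the same header text with the yaml package (the package rmarkdown relies on to read headers, assumed here to be installed) and prints the resulting nested list. Each level of indentation becomes one level of nesting.

```r
library(yaml)

# The header from above, without the --- delimiters, one line per element.
header <- paste(
  c('title: "Mileage of American Cars"',
    'output:',
    '  html_document:',
    '    toc: TRUE',
    '    toc_float: TRUE'),
  collapse = "\n"
)

parsed <- yaml.load(header)
str(parsed)

# toc sits two levels below output because it is indented twice.
parsed$output$html_document$toc
```

If toc: were written without its indentation, it would be read as a top-level field rather than as an option of html_document.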

Parameters in the YAML heading

R Markdown allows specification of parameters in the YAML heading that can be passed to R code anywhere in the document.

Parameters provide an easy mechanism to generate different customized reports depending on the inputs. For example, we can reproduce a data analysis document for different age cohorts or specify different analyses for different years of data.

To declare parameters, include the params: field in the YAML header, and underneath add one parameter per line, each specified as param_name: value, and each indented by 2 spaces.

To access the parameter value in R code, use params$param_name, where param_name is the name of the parameter specified in the YAML header.

A. Change the eval option in the final R code chunk, subgroup-plot, to eval=TRUE.
B. Add the following parameter specification to the YAML header:
---
params:
  manufacturer: dodge
---

The first line is not indented, but the second line is indented 2 spaces.

Now try changing dodge to chevrolet, ford, jeep, lincoln, mercury, or pontiac and re-render the document.

Notice how the parameter is accessed in the final chunk with params$manufacturer.
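
Parameters can also be supplied at render time through the params argument of rmarkdown::render(), which overrides the values in the YAML header. The sketch below loops over a few manufacturers and writes one report for each. The output file names are made up for illustration, and nothing is rendered unless mileage.rmd is in the working directory.

```r
rmd_file <- "mileage.rmd"
manufacturers <- c("dodge", "ford", "jeep")

# Hypothetical output names, one HTML file per manufacturer.
output_files <- paste0("mileage_", manufacturers, ".html")

if (file.exists(rmd_file)) {
  for (i in seq_along(manufacturers)) {
    rmarkdown::render(
      rmd_file,
      params      = list(manufacturer = manufacturers[i]),  # overrides the YAML value
      output_file = output_files[i]
    )
  }
}
```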

Tour of Output Formats

Output Formats

R Markdown’s ability to produce a wide variety of document types using a single, unified coding framework is one of its biggest strengths.

The same .rmd file can produce an HTML document, a LaTeX .pdf document, a Word document, a PowerPoint slide show presentation, etc.

See the R Markdown documentation for a full list of available output formats. Formats are either documents or presentations.

Each output format has its own set of options that we can specify in the YAML header to control the document’s appearance.
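
The output format can also be chosen when rendering, without editing the YAML header, through the output_format argument of rmarkdown::render(). This is a sketch only: it assumes txhousing_sales.rmd is in the working directory and does nothing otherwise.

```r
rmd_file <- "txhousing_sales.rmd"

# Format names are the same ones used after output: in the YAML header.
formats <- c("html_document", "word_document")

outputs <- if (file.exists(rmd_file)) {
  # One output file is written per requested format.
  rmarkdown::render(rmd_file, output_format = formats)
} else {
  character(0)
}
outputs
```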

For this section, please open the txhousing_sales.rmd file.

HTML documents

Markdown’s original purpose was to simplify HTML coding, so HTML documents naturally have the widest array of options available in rmarkdown.

We have already seen the use of toc: TRUE to add a table of contents.

Some more useful suboptions for HTML documents are:

  • theme: theme_name, where theme_name is one of default, cerulean, journal, flatly, darkly, readable, spacelab, united, cosmo, lumen, paper, sandstone, simplex, and yeti. This adjusts the font family, font colors, and possibly the background colors.
  • highlight: style, where style is one of default, tango, pygments, kate, monochrome, espresso, zenburn, haddock, textmate, or null. This adjusts the style of the code syntax highlighting.
  • code_folding: hide will hide code from the reader unless the reader toggles a Code button that appears in its place. You can also specify code_folding: show to show the code by default, but allow it to be hidden by the user.

Try adding a theme: suboption to txhousing_sales.rmd when the output format is html_document. Try a few of the theme_name specifications above.

Now add a highlight: suboption, using any of the style specifications above. (Make sure echo=TRUE in knitr::opts_chunk$set() in the first chunk to observe the results.)

Finally, add code_folding: hide, and try toggling the Code buttons throughout, and the master Code button at the top of the document.

A full list of options for HTML documents can be found in R Markdown: The Definitive Guide.

Raw HTML and CSS

If you want to have finer control over the appearance of your HTML documents (including some of the presentation formats we will discuss later), you will probably need to learn some HTML.

You can insert HTML directly into the document and in most cases it will render as expected. For example, <font color="red">ERROR:</font> produces ERROR:.

CSS (Cascading Style Sheets) is a language used to style markup languages like HTML.

  • You can specify CSS code directly within a code chunk.
```{css, echo=FALSE}
  slides > slide {
    overflow-y: auto !important; 
  }
```
  • You can link a CSS file through the YAML header.
---
title: "My Presentation"
output:
  ioslides_presentation:
    css: "mystyle.css"
---
  • CSS code blocks that control the appearance of the document globally are often defined at the top of documents inside HTML <style> </style> tags.

Try adding the following CSS within HTML <style> tags to txhousing_sales.rmd immediately after the YAML header:
<style>
body {
background-color: AliceBlue;
font-family: Garamond, serif;
font-size: 20px;
}
</style>

Remove this style section when you feel you understand how it functions.

W3Schools is an excellent, free online resource for beginners to learn HTML and CSS.

Presentations

R Markdown supports several slide-show-style presentation output formats, including:

  • powerpoint_presentation
  • slidy_presentation, HTML
  • ioslides_presentation, HTML
  • beamer_presentation, LaTeX PDF

The HTML slideshows are opened and viewed in a browser just like any other HTML file, while a beamer_presentation is viewed in a PDF viewer (e.g. Adobe Acrobat, more on PDF files later).

HTML slideshows can be styled with raw HTML and CSS, while Beamer presentations can be styled with LaTeX.

We will not be covering Beamer presentations in this seminar.

Slideshows often look and behave better in the actual output file than in the RStudio previewer.

Markdown in slideshow presentations

Slideshows use section headers to initiate new slides. For example ## Purpose will initiate a new slide with the header “Purpose”.

First-level section headers (i.e. # Header) will become title slides, and should not have any accompanying text underneath.

    Including anything in a title slide besides the header itself can mangle all of the subsequent slides.

Second-level section headers (i.e. ## Header) will initiate new slides and may have additional content underneath.

You can also start a new slide without a header using ---, with a blank line before and after (this tag produces a horizontal line in non-presentations).

To have bulleted items appear on click (when advancing the slide) use >- instead of *.

PowerPoint presentation

PowerPoint presentations are a more recent addition to R Markdown.

PowerPoint slideshows have some limitations. For example, images and tables are always placed on their own new slides, and those slides can have no accompanying text other than the slide header and a caption.

Change the YAML header in txhousing_sales.rmd to this:
---
title: "Texas housing sales, 2000-2015"
output: 
  powerpoint_presentation:
    toc: true
params:
  spotlight: "Houston"
---

More about PowerPoint presentation

ioslides presentation

ioslides presentations are simple and easy to use, offering basic customization with less emphasis on technical customization.

The YAML metadata header controls the appearance and behavior of the slides.

This seminar is an ioslides_presentation.

Change the YAML header in txhousing_sales.rmd to this:
---
title: "Texas housing sales, 2000-2015"
output: 
  ioslides_presentation:
    widescreen: true
params:
  spotlight: "Houston"
---

More about ioslides presentation

Slidy presentations

Slidy presentations by default have simple styling, but are highly customizable.

In a Slidy presentation, the vertical size of slides is unlimited by default, so you can scroll down within a slide.

Slidy offers more extensive customization options compared to ioslides.

Change the YAML header in txhousing_sales.rmd to this:
---
title: "Texas housing sales, 2000-2015"
output: 
  slidy_presentation:
    font_adjustment: -1
    footer: Created in R
params:
  spotlight: "Houston"
---

A couple of options unique to slidy_presentations:

  • font_adjustment: -1 decreases font size, while font_adjustment: +1 increases font size
  • footer: text adds text to the footer of each slide

More about Slidy presentation

LaTeX PDF documents

Specifying output: pdf_document in the YAML header will produce a .pdf file formatted with LaTeX. Of course, there are suboptions available for pdf_documents.

Replace the entire YAML header in txhousing_sales.rmd with this header:
---
title: "Texas housing sales, 2000-2015"
output:
  pdf_document:
params:
  spotlight: "Houston"
---

The RStudio previewer for PDF documents is separate from the previewer for HTML documents.

Any raw HTML code in an R Markdown file that is destined for a pdf_document will be ignored. Similarly, any raw LaTeX (other than mathematical expressions) in a file destined for html_document or one of the HTML presentation formats will be ignored.

Technical Note: Outputting a pdf_document requires that you have some distribution of TeX installed on your computer (e.g. MiKTeX or TeX Live). You can install a small version of TeX on your computer directly through R by first running install.packages("tinytex") and then tinytex::install_tinytex(), which will install enough TeX to output an R Markdown pdf_document.

What is LaTeX? *

LaTeX (pronounced LAY-tech) is another document markup language, allowing the user to use tags to format plain text with very fine control. Compiling a LaTeX file into a readable PDF document requires that a TeX distribution (e.g. MiKTeX) be installed as well.

LaTeX is often used to produce scientific documents, as it is particularly well suited to produce beautiful mathematical equations.

LaTeX tags begin with a backslash, and usually have the syntax: \tag{value}{text}, where tag is the name of the markup tag, value is its assigned value, and text is the text to which the formatting will be applied.

Replace the heading towards the top, # Background, with # \color{red}{Background}
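A couple more examples of the tag syntax described above may help. This is only a sketch: it assumes the LaTeX color commands are available in the PDF output, as the \color example implies, and the sentences themselves are made up for illustration.

```latex
% \textcolor takes a value (the color) and the text to format
\textcolor{blue}{This sentence is printed in blue.}

% Tags that need no value take only the text
\textbf{This text is bold} and \emph{this text is emphasized}.
```

Lines beginning with % are LaTeX comments and do not appear in the output.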

Overleaf is a good place for new users of LaTeX to learn.

Useful options for LaTeX *

Some of the options for HTML documents that we have seen are also available for LaTeX PDF documents:

  • toc: true for a table of contents
  • highlight: style for syntax highlighting of R code (see a few slides back for available styles)
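As an illustration, the two options above could be combined in the YAML header as follows. The tango highlighting style is just one possible choice, picked here as an example.

```yaml
---
title: "Texas housing sales, 2000-2015"
output:
  pdf_document:
    toc: true
    highlight: tango
params:
  spotlight: "Houston"
---
```

Both options are indented under pdf_document because they are suboptions of the output format.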

Another useful option for novice LaTeX users is to switch to the xelatex engine, with this option:

  • latex_engine: xelatex, which allows the use of system fonts through this option (not available with the default engine, pdflatex):
    • mainfont: font-family, specifies that the font font-family be used throughout the document.

Somewhat confusingly, the mainfont option is actually a top-level option, passed directly to pandoc, so should not be indented.

Pandoc has many other top-level options (i.e. not indented) for LaTeX documents that can be specified in the YAML header of an R Markdown file. See here for more of these top-level options.

Set the YAML header to this specification (change Georgia to another font if it is not installed on your computer):
---
title: "Texas housing sales, 2000-2015"
output:
  pdf_document:
    latex_engine: xelatex
mainfont: Georgia
params:
  spotlight: "Houston"
---

Windows OS system fonts

Mac OS system fonts

Challenges of LaTeX documents

Remember that Markdown was designed for HTML, so formatting the document exactly as you want will usually be more difficult in a LaTeX PDF document.

You may find that images are placed where you didn’t intend, particularly if there are several consecutive images. This behavior can be difficult to control. It is recommended that you complete the content of the document before trying to fine tune placement of images.

Troubleshooting Advice

Pay attention to warnings and messages in the R Markdown Console, not the regular R console (when looking for rendering errors).

Name code chunks to help identify where code is breaking.
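A minimal sketch of named chunks is shown below. The chunk names load-data and sales-summary are hypothetical, and the code assumes the txhousing data come from the ggplot2 package.

````markdown
```{r load-data}
library(ggplot2)  # assumed source of the txhousing data set
```

```{r sales-summary}
summary(txhousing$sales)
```
````

When the file is knit, the chunk name is reported as each chunk is processed, so a failure can be traced to a named chunk rather than to an anonymous one.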

If your markdown tags at the beginning of a line don’t seem to be working, try adding a newline before the tags.
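For instance, a bulleted list that directly follows a line of text generally needs a blank line in between to be recognized as a list. In this sketch the city names are only placeholders:

```markdown
Cities with the most sales:

* Houston
* Dallas
```

Without the blank line after the first sentence, the bullets may be rendered as ordinary text on the same paragraph.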

Use Notebooks to isolate where the code is not working.

Accessibility

Provide the original source file.

HTML documents are usually more accessible than PDF.

Label all graphics with alt text, and make that text informative.
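One way to attach alt text to a plot produced by R code is the fig.alt chunk option, sketched below. This assumes a reasonably recent version of knitr that supports fig.alt; the chunk name and the alt text wording are only examples.

````markdown
```{r eye-color-barplot, fig.alt="Bar plot of counts of eye colors, with bars split by hair color."}
barplot(HairEyeColor[,,1], legend.text=TRUE, xlab="Eye Color")
```
````

The alt text should describe what the graphic shows, not merely name it.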

Use # to organize content by creating section headers, not to control size of the font.

Use LaTeX for inserting mathematical content.
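As a small illustration, math can be written inline between single dollar signs or on its own line between double dollar signs. The formula for the sample mean is used here only as an example:

```latex
The sample mean is written $\bar{x}$ and is defined as:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
```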

R Markdown resources

R Markdown website

R Markdown: The Definitive Guide The free, most comprehensive guide to R Markdown, written by its creators. Most of the content of this seminar can be found in this book.

R Markdown Cookbook Another free guide, which shares a primary author (Yihui Xie) with R Markdown: The Definitive Guide. It is more advanced, explaining the usage of R Markdown through short examples and providing solutions to common questions.

The R Markdown Cheatsheet and R Markdown Reference guide, accessible through the Help -> Cheatsheets menus, are handy references covering most of the basics.

Knitr Chunk Options Reference page has a full list of knitr chunk options to control R output.

Learn more about Cascading Style Sheets (CSS)

Concluding Remarks

R Markdown is easy enough to use that a little experience with each of the coding frameworks will give you sufficient flexibility to create reports in different formats with widely varying appearances.

You won’t find a report-generating system nearly as powerful and easy-to-use as R Markdown in any other statistical software.

Go see the R Markdown website or The Definitive Guide to see even more uses of R Markdown, including the generation of books, blog pages, and interactive documents.