How to use this seminar

This seminar aims to teach the user basic R Markdown syntax to make beautiful, reproducible reports.

First we will discuss what R Markdown is, how it is used, and how it works. The rest of the seminar focuses on R Markdown syntax, specifically:

  • Markdown tags to format text
  • knitr options to format R code and output
  • YAML coding to control the output document type and its appearance

This seminar does not attempt to explain all of the R code used in the example reports.

Text that appears with this typeface and background is usually code syntax you can use when authoring your R Markdown files. Buttons and menus in RStudio will also appear formatted this way.

Text that appears blockquoted like this is a set of instructions to alter an R Markdown file. Click the Knit button after finishing all instructions within a block to view the results of your modifications.

Go ahead and press the ‘k’ key to disable advancing with mouse click. This will make it easier to copy-and-paste code.

What is R Markdown?

“An authoring framework for data science” – R Markdown creators

R Markdown allows us to create reproducible documents that weave narrative text together with R code and the output it produces when executed.

For example, here is an R code block inserted into the R Markdown file that generates this slide show. Underneath the code is its output:

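# Opening line reconstructed; assumes the built-in HairEyeColor dataset (first sex level)
barplot(HairEyeColor[, , 1],
        col=c("#4d4d4d", "#bf812d", "#f4a582", "#f6e8c3"),
        legend.text=TRUE, xlab="Eye Color",
        args.legend=list(title="Hair Color"))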

These documents are dynamically generated – whenever we need to change code or data, we can simply update the R Markdown file and compile it, and the output will be automatically updated in the resulting document.

These documents can then be shared with an audience to provide the most up-to-date content.

Installing R Markdown and working in RStudio

We highly recommend working with R Markdown in RStudio, which has many features that facilitate R Markdown file editing, including:

  • syntax highlighting: headers are colored blue, R code chunks have a gray background
  • compiling the R Markdown file with a single button push
  • running code chunks individually without having to compile the whole file
  • a preview of the output document after compiling
  • a bundled copy of the document converter Pandoc

Once R and RStudio are installed, you can install R Markdown with install.packages("rmarkdown") as usual.
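A minimal sketch of the install step, assuming an internet connection and a configured CRAN mirror. The requireNamespace() guard is an addition here, used only to avoid reinstalling a package that is already present:

```r
# Install rmarkdown only if it is not already available.
if (!requireNamespace("rmarkdown", quietly = TRUE)) {
  install.packages("rmarkdown")
}

# Confirm which version is installed.
packageVersion("rmarkdown")
```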

Starting our first markdown file

We can open a new R Markdown file from a template through the File menu in RStudio.

A. Choose File -> New File -> R Markdown...
B. Fill in the Title and Author fields with “Practice” and your name, respectively.
C. In the left menu, select Document, and for Default Output Format select option HTML (these are the defaults).
D. Click OK

R Markdown files typically use the extension .Rmd or .rmd

A file initiated through this method will have a skeleton of the elements of an R Markdown file:

  • YAML header
  • Markdown
  • R code chunks
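To make the three elements concrete, here is a rough sketch that builds a small R Markdown file from R and prints it. The file contents are illustrative, not the exact RStudio template, and the variable names (fence, rmd_lines, rmd_file) are hypothetical:

```r
# Three backticks, built programmatically so they can mark a code chunk.
fence <- strrep("`", 3)

# Lines of a minimal R Markdown file (contents are illustrative).
rmd_lines <- c(
  "---",                      # YAML header starts
  "title: \"Practice\"",
  "output: html_document",
  "---",                      # YAML header ends
  "",
  "## R Markdown",            # Markdown: a section header
  "",
  "Click the **Knit** button to render this file.",
  "",
  paste0(fence, "{r}"),       # R code chunk starts
  "summary(cars)",
  fence                       # R code chunk ends
)

# Write the lines to a temporary .Rmd file and print them back.
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(rmd_lines, rmd_file)
cat(readLines(rmd_file), sep = "\n")
```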

Note: If you have not installed package rmarkdown and try to open a .rmd file through the File menu, RStudio may ask you to install rmarkdown immediately.

Elements of an R Markdown file - YAML header

At the top of our newly initiated R Markdown file, enclosed between two --- lines, we see the first of the essential elements of an R Markdown file, the YAML header.

YAML stands for “YAML Ain’t Markup Language” or “Yet Another Markup Language”, and is a human-readable language, which we use here to communicate with Pandoc.

Pandoc converts between document formats and controls their overall appearance. Pandoc is installed with RStudio.

The YAML header is also used to control:

  • which output format to use (HTML, LaTeX pdf, etc.)
  • overall appearance of the document
  • adding other files to add content or style the document

The YAML header may also contain the document’s metadata, information about the R Markdown file itself, such as title:, author:, and date:.
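As a small illustration of what the header holds, the sketch below writes a file with a YAML header and reads the header back with rmarkdown::yaml_front_matter(). The author name and date are placeholders, and the variable names are hypothetical:

```r
# Write a small file whose YAML header holds metadata and the output format.
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Practice\"",
  "author: \"Your Name\"",     # placeholder value
  "date: \"2024-01-01\"",      # placeholder value
  "output: html_document",
  "---",
  "",
  "Body text."
), rmd_file)

# Parse the header into a named list and inspect its fields.
header <- rmarkdown::yaml_front_matter(rmd_file)
names(header)
header$title
header$output
```

Reading the header this way only inspects it; changing the output format still means editing the output: field in the file itself.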

Elements of an R Markdown file - Markdown

Within the body of the document we find some examples of text with special characters that have been highlighted blue, including the following:

  • ## R Markdown and ## Including plots: The ## signifies that the text following is to be treated as a section header (or a new slide for a slide show)
  • **Knit**: The ** signify that “Knit” is to appear in bold

## and ** are Markdown tags: ## formats the text that follows it on the same line, while ** formats the text enclosed between a pair of them.

Markdown is a markup language, a system of code shortcuts to annotate and format plain text – once the .rmd file is compiled (rendered), the text will be formatted.

Elements of an R Markdown file - R code chunks

The final element of an R Markdown file is the R code chunk. Chunks are highlighted with gray backgrounds and enclosed within ```{r } and ```.

The R code chunks are actually processed by the package knitr, which is installed with rmarkdown.

When the R Markdown file is compiled and rendered, the output of the code chunk will be embedded in the document underneath the code.

rmarkdown (via knitr) provides a large array of options to control the appearance of both the R code and its output.
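As one example of these options, knitr lets us set defaults for all chunks with knitr::opts_chunk$set(), typically from a setup chunk near the top of the file. The particular values below are illustrative choices, not recommendations from this seminar:

```r
library(knitr)

# Set defaults that apply to every later chunk in the document.
opts_chunk$set(
  echo = FALSE,      # hide the R code, show only its output
  message = FALSE,   # suppress messages produced by the code
  fig.width = 6,     # figure width in inches
  fig.height = 4     # figure height in inches
)

# Inspect the current default for one option.
opts_chunk$get("echo")
```

The same options can be set for a single chunk in its header, for example {r, echo=FALSE}, which overrides the default for that chunk only.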

Compiling and rendering an R Markdown file

Once we are pleased with its contents, we can compile the R Markdown file and render it into its final output format in two ways:

  • Click the Knit button (Ctrl+Shift+K) in RStudio
  • Call rmarkdown::render("filename.rmd") in the R Console

The output document will be rendered and saved in the same directory where the .rmd file is located.

RStudio will also provide a preview of the output document.

Clicking on the Knit button simply calls render() on the current .rmd file.

Progress and messages produced by rendering the .rmd file will be displayed in the R Markdown console, which appears when you render.
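The console route can be tried end to end with the sketch below, which assumes rmarkdown and Pandoc are available (Pandoc ships with RStudio). The file name practice.Rmd and its contents are hypothetical:

```r
# Three backticks, built programmatically so they can mark a code chunk.
fence <- strrep("`", 3)

# Create a small .Rmd file in a temporary directory (contents are illustrative).
rmd_file <- file.path(tempdir(), "practice.Rmd")
writeLines(c(
  "---",
  "title: \"Practice\"",
  "output: html_document",
  "---",
  "",
  "## Summary",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
), rmd_file)

# Render the file; render() returns the path of the output document.
out_file <- rmarkdown::render(rmd_file, quiet = TRUE)
out_file
```

Setting quiet = TRUE hides the progress messages; leave it out to see them as they would appear when rendering from RStudio.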

Click the Knit button. Name the file whatever you want, but make sure to use the .rmd extension. Observe how elements of the .rmd file appear in the output.

How it all works

R Markdown is the unification of 3 frameworks:

  • Markdown, to format text
  • knitr, to process R code chunks
  • YAML/pandoc, to allow a variety of output formats

When we render the document, the following happens:

First, knitr converts all of the R code chunks, code and output, into text and Markdown tags, resulting in a Markdown file (.md) of just text and Markdown. Images are saved to files and are included/embedded in the output via links.

Then, pandoc converts the .md file into the desired final output format, such as an HTML web page, a LaTeX pdf document, or an ioslides slide-show presentation.
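The two stages can be run by hand to see the intermediate .md file. This is a simplified sketch, assuming knitr, rmarkdown, and Pandoc are available; render() does additional work (such as applying output-format settings) that is omitted here, and the file contents and variable names are hypothetical:

```r
# Three backticks, built programmatically so they can mark a code chunk.
fence <- strrep("`", 3)

# A small input file with one Markdown header and one R code chunk.
rmd_file  <- tempfile(fileext = ".Rmd")
md_file   <- sub("\\.Rmd$", ".md", rmd_file)
html_file <- sub("\\.Rmd$", ".html", rmd_file)
writeLines(c(
  "## Summary",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
), rmd_file)

# Stage 1: knitr runs the chunk and writes a plain Markdown file.
knitr::knit(rmd_file, output = md_file, quiet = TRUE)
cat(readLines(md_file), sep = "\n")

# Stage 2: Pandoc converts the Markdown file to a standalone HTML page.
rmarkdown::pandoc_convert(md_file, to = "html", output = html_file,
                          options = "--standalone")
file.exists(html_file)
```

Printing the .md file after the first stage shows the chunk's code and output already converted to text and Markdown, which is what Pandoc then receives.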