R Markdown Basics

UCLA OARC Statistical Methods and Data Analytics


How to use this seminar

This seminar aims to teach the user basic R Markdown syntax to make dynamic, reproducible reports.

First we will discuss what R Markdown is, how it is used, and how it works.

The rest of the seminar focuses on R Markdown syntax, specifically:

This seminar does not attempt to explain all of the R code used in the example reports.

Text that appears with this typeface and background is usually code syntax you can use when authoring your R Markdown files. Buttons and menus in RStudio will also appear formatted this way.

Text that appears blockquoted like this is a set of instructions to construct an R Markdown file. Click the Knit button after finishing all instructions within a block to view the results of your coding.

What is R Markdown?

“An authoring framework for data science” – R Markdown creators

R Markdown allows us to create reproducible documents that weave narrative text together with R code and the output it produces when executed.

For example, here is an R code block inserted into the R Markdown file that generates this slide show. Underneath the code is its output:

barplot(HairEyeColor[,,1],  # first line assumed: built-in HairEyeColor table, males
        col=c("#4d4d4d", "#bf812d", "#f4a582", "#f6e8c3"),
        legend.text=TRUE, xlab="Eye Color",
        args.legend=list(title="Hair Color"))

These documents are dynamically generated – update code or data, then re-compile the file and the output will be automatically updated in the resulting document.

Your documents can thus provide the most up-to-date content.

Installing R Markdown and working in RStudio

We highly recommend working with R Markdown in RStudio, which has many features that facilitate R Markdown file editing, including:

Once R and RStudio are installed, you can install R Markdown with install.packages("rmarkdown") as usual.

Starting our first markdown file

We can open a new Markdown file template through the File menu in RStudio.

A. Choose File -> New File -> R Markdown...
B. Fill the Title field and Author fields with “Practice” and your name, respectively.
C. In the left menu, select Document, and for Default Output Format select option HTML (these are the defaults).
D. Click OK

R Markdown files typically use the extension .Rmd or .rmd

This new R Markdown file contains the basic elements of an R Markdown file:

Note: If you have not installed package rmarkdown and try to open a .rmd file through the File menu, RStudio may ask you to install rmarkdown immediately.

Elements of an R Markdown file - YAML header

The header atop the code file contains optional metadata (title:, author: etc.).

Additionally, YAML code can be specified to control:

“YAML Ain’t Markup Language” or “Yet Another Markup Language” is a human-readable language, used here to communicate with Pandoc.

Pandoc converts documents between formats and controls their overall appearance. Pandoc is installed with RStudio.

Elements of an R Markdown file - Markdown

In the body of the document, you’ll find text with special characters that have been highlighted blue:

## and ** are Markdown tags which format the text enclosed within them.

Markdown is a markup language, a system of code tags to format text.

Elements of an R Markdown file - R code chunks

Finally, we also see R code chunks, highlighted with gray backgrounds and enclosed within ```{r } and ```.

The R code chunks are actually processed by the package knitr, which is installed with rmarkdown.

When the R Markdown file is compiled and rendered, the output of the code chunk will be embedded in the document underneath the code.

rmarkdown (via knitr) provides a large array of options to control the appearance of both the R code and its output.

Compiling and rendering an R Markdown file

When you are ready to compile and render your .rmd file, click the Knit button (Ctrl+Shift+K) in RStudio.

The output document will be saved in the same directory as the .rmd file.

RStudio will also provide a preview of the output document.

Progress and messages produced by rendering the .rmd file will be displayed in the R Markdown console, which appears in its own tab next to the regular Console.

Click the Knit button. Name the file whatever you want, but use the .rmd extension. Observe how elements of the .rmd file appear in the output.
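The same rendering can be started from the R console. The sketch below assumes pandoc is available to R (it is when working inside RStudio); the file name and contents are made up for illustration. rmarkdown::render() is roughly what the Knit button runs for us.

```r
library(rmarkdown)

# Three backticks, built here so the chunk delimiters below are easy to read
fence <- strrep("`", 3)

# Write a tiny R Markdown file to a temporary directory (hypothetical content)
rmd_file <- file.path(tempdir(), "practice.Rmd")
writeLines(c(
  "---",
  "title: \"Practice\"",
  "output: html_document",
  "---",
  "",
  "## A header",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
), rmd_file)

# Run the code chunks and convert the result to HTML
out_file <- render(rmd_file, quiet = TRUE)

# The output file is written next to the .Rmd file
basename(out_file)
file.exists(out_file)
```

Opening the resulting practice.html in a browser shows the header, the R code, and its output.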

How it all works

R Markdown is thus the unification of 3 frameworks: Markdown, knitr, and YAML (pandoc).

When we render the document, the following happens:

First, knitr converts all of the R code and output into text and Markdown tags, resulting in a Markdown file (.md) of just text, Markdown tags, and links to image files.

Then, pandoc converts the Markdown (.md) file into the desired final output format, such as an HTML web page, a LaTeX PDF document, or an ioslides slide-show presentation.
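The two steps can be run by hand to see what each one produces. This is a simplified sketch, not what rmarkdown does internally: render() also applies templates and output-format options. It assumes pandoc is available to R, and the file names are made up.

```r
library(knitr)
library(rmarkdown)

fence <- strrep("`", 3)  # three backticks for the chunk delimiters

# A minimal .Rmd file with one header and one code chunk (hypothetical content)
rmd_file <- file.path(tempdir(), "steps.Rmd")
writeLines(c(
  "# Two-step demo",
  "",
  paste0(fence, "{r}"),
  "mean(c(1, 2, 3))",
  fence
), rmd_file)

# Step 1: knitr runs the R code and writes a plain Markdown (.md) file
md_file <- knit(rmd_file, output = file.path(tempdir(), "steps.md"), quiet = TRUE)
cat(readLines(md_file), sep = "\n")

# Step 2: pandoc converts the Markdown file to the final format (HTML here)
html_file <- file.path(tempdir(), "steps.html")
pandoc_convert(md_file, to = "html", output = html_file)
file.exists(html_file)
```

The intermediate steps.md file contains only text and Markdown tags, with the R output already written in as text.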

Learning R Markdown

Thus, to use R Markdown proficiently, we must learn each of the 3 coding frameworks: Markdown, knitr and YAML (pandoc).

Fortunately, the coding is not difficult to learn, and you can find lots of free help online.

You can produce a wide variety of documents with just a little coding knowledge, but very fine control over documents will require that you learn more advanced coding (such as HTML or LaTeX).

R Markdown keeps it simple for the user by requiring a single format, the .rmd file, to produce this wide variety of output.


First, what is a markup language and what is HTML?

A markup language is a system of tags (code) used to format documents. Tags are used to define sections, change the appearance of text, build tables, link images, and so on.

Hypertext Markup Language (HTML) is a markup language designed to be used for web pages.

HTML tags generally include both an opening and closing tag, and have this form <tag> and </tag>.

For example, enclosing text in <em> tags results in italicized text.

<em>hello</em> is displayed as hello in italics when the document is viewed in a web browser.

What is Markdown?

Markdown is a lightweight markup language, simple and easy to type.

Markdown was originally designed to be a shorthand for HTML. For example, to italicize text in HTML we enclose it in <em> </em>; in Markdown, we just enclose it in * *.

However, pandoc can convert Markdown to many different output formats besides just HTML.
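To see the same Markdown turned into two different formats, we can call pandoc from R. The helper convert_md() below is a hypothetical name written for this illustration, and the sketch assumes pandoc is available to R.

```r
library(rmarkdown)

# Hypothetical helper: convert a Markdown string with pandoc, return the result
convert_md <- function(md, to) {
  infile  <- tempfile(fileext = ".md")
  outfile <- tempfile(fileext = ".txt")
  writeLines(md, infile)
  pandoc_convert(infile, to = to, from = "markdown", output = outfile)
  paste(readLines(outfile), collapse = "\n")
}

html  <- convert_md("*hello*", to = "html")   # HTML marks emphasis with <em>
latex <- convert_md("*hello*", to = "latex")  # LaTeX marks emphasis with \emph{}

cat(html, "\n")
cat(latex, "\n")
```

The Markdown source stays the same; only the target format passed to pandoc changes.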

R Markdown uses pandoc’s version of Markdown, which differs a bit from standard Markdown.

Because of its simplicity, Markdown is very easy to use, so let’s dive in!

In your currently open .rmd file, erase all content except for the YAML header.

Spacing and paragraphs

Newlines (carriage returns) are considered spaces in Markdown.

First, try adding the following text broken by a single new-line (and no additional spaces) after the first period:


In order to begin a new paragraph, insert a blank line (i.e. 2 newlines) before beginning the new paragraph. This will double-space paragraphs.

Now, try retyping the text with 2 newlines after the first period:



Or, you can add 2 spaces before the new-line to single-space the paragraphs.

Finally, try retyping the text with 2 spaces and a single newline after the period:
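The three spacing rules can be compared side by side by converting short strings to HTML. The sentences and the helper md_to_html() are made up for this sketch, which assumes pandoc is available to R.

```r
library(rmarkdown)

# Hypothetical helper: convert a Markdown string to an HTML fragment with pandoc
md_to_html <- function(md) {
  infile  <- tempfile(fileext = ".md")
  outfile <- tempfile(fileext = ".html")
  writeLines(md, infile)
  pandoc_convert(infile, to = "html", from = "markdown", output = outfile)
  paste(readLines(outfile), collapse = "\n")
}

one_newline  <- md_to_html("First sentence.\nSecond sentence.")
two_newlines <- md_to_html("First sentence.\n\nSecond sentence.")
two_spaces   <- md_to_html("First sentence.  \nSecond sentence.")

cat(one_newline, "\n")   # a single paragraph: the newline acts as a space
cat(two_newlines, "\n")  # two separate paragraphs
cat(two_spaces, "\n")    # one paragraph containing a line break
```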



Section headers are often displayed in larger, bolder fonts.

To format text as a header, place one to six # tags at the beginning of the header text. The number of # signs indicates the level of the header: a single # is level 1, the largest header, and each additional # produces a smaller one.

Put a space between the # tags and the header text.

Headers must be preceded by at least one blank line.

Add a level 1 header called “Big Header” and a level 3 header called “Small Header”

Bold, indent, underline, strikethrough

Markdown provides simple tags to format text for emphasis as well as super- and subscripting:

Do not insert spaces in between formatting tags and the text.

Any character preceded by a backslash will be treated as a literal character and not as a code tag:

\*italics\* produces *italics*

Try recreating this formatted text with Markdown syntax:

We multiplied x by z2 to create the interaction variable x*z2

Bulleted Lists

Markdown bulleted lists are much easier to specify than HTML lists.

To create a list, precede each list item with * (or + or -) and a space:

* item1
* item2

To add sublists, indent 4 spaces:

* item1
    + sub1.1
    + sub1.2
* item2

Use numbers with periods as tags for numbered lists:

1. item1
2. item2
3. item3

External images

Embedding external images (not created by R code within the document) in a document uses syntax very similar to linking:


Note: Do not put the image path/name in quotes!

So, assuming there is a file named “densities.png” in the same directory as the .rmd file:

![Fig 1 Densities by diet](densities.png) produces

Fig 1 Densities by diet

Tex Math

Text enclosed by $ symbols will be treated as TeX Math, another set of markup tags used to format mathematical expressions.

For example, $mean(X) = \frac{\sum_{i=1}^nX}{n}$ will be rendered as \(mean(X) = \frac{\sum_{i=1}^nX}{n}\)

A TeX expression enclosed by two $ symbols on each side will be displayed as an equation, generally centered in the document. So, $$mean(X) = \frac{\sum_{i=1}^n X}{n}$$ produces \[mean(X) = \frac{\sum_{i=1}^n X}{n}\]

Although TeX Math is often associated with LaTeX documents, it can be used in any document type supported by R Markdown.

R code chunks

R Markdown, knitr, and R code chunks

By itself, R package knitr is used to weave text and R code output together into reports.

R Markdown builds upon knitr by allowing Markdown tags to format text and using Pandoc to convert between document formats.

knitr executes code chunks sequentially when the .rmd file is knit, so R objects created in a chunk are available to all subsequent chunks.

A large array of knitr options provides control over the appearance of R code, text output, and graphical output in the final document.

Please open the sleep_study.rmd file to practice using knitr code chunks. You may close the .rmd file we used to practice Markdown.

knitr R code chunk delimiters

R code chunks are delimited by ```{r chunk_label, options} at the beginning and ``` at the end. The chunk_label and options are indeed optional and are separated by commas (much more on this soon).

Three ways to add R code tags:

Add an R code chunk using the keyboard shortcut or the Insert button to the sleep_study.rmd file after the text “Here are the data:”

Our first output

Data sets stored as data.frame objects in R can be printed to the document by simply specifying the object name.

Inside our new code chunk, add the following code:

Let’s look closely at the output:

We can control all of this!

Code chunk options

Much of the power of rmarkdown via knitr lies in its wide array of options to control the appearance of R code output.

See here for a full list of knitr chunk options.

To specify chunk options, after ```{r, specify a chunk label (name), a comma, and then a list of options separated by commas. This is known as the chunk header.

All of the chunk options must be specified on one line (no line breaks).

In chunk labels, avoid characters other than alphabetic characters and -.

Change the first line of the R code chunk to ```{r, mydata, echo=FALSE}.

Here mydata is the chunk label, and echo=FALSE is an option. Notice the use of commas to separate.

Common chunk options to control text output

As we saw, echo=FALSE suppresses printing of the R code. By default, echo is set to TRUE, but often we do not want our audience to see the underlying R code.

Here are some options to control our output (default of option specified in parentheses):

Change echo=FALSE to eval=FALSE.

Change eval=FALSE to results='hold'.
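The effect of echo and eval can also be checked outside of a full report. The sketch below knits a one-chunk document from a character vector and prints the resulting Markdown; knit_chunk() is a hypothetical helper and the chunk contents are made up.

```r
library(knitr)

fence <- strrep("`", 3)  # three backticks for the chunk delimiters

# Hypothetical helper: knit a single chunk with the given chunk options
knit_chunk <- function(options) {
  src <- c(paste0(fence, "{r ", options, "}"), "1 + 1", fence)
  knit(text = src, quiet = TRUE)
}

with_echo <- knit_chunk("echo=TRUE")   # code and output both appear
no_echo   <- knit_chunk("echo=FALSE")  # output only
no_eval   <- knit_chunk("eval=FALSE")  # code only; it is never run

cat(with_echo)
cat(no_echo)
cat(no_eval)
```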

Suppressing warnings and messages

Many R functions display warnings and messages to the user.

knitr will print warnings and messages to the document by default, but they may be distracting to the reader.

We can use the chunk options:

A. Insert a new code chunk after the text “First, log-transforming the outcome **extra** was suggested:”.
B. Inside the code chunk, specify these 2 lines of code:

sleep$logextra <- log(sleep$extra)

Notice that a warning was printed to the document.

Add the chunk label log-transform and the option warning=FALSE to this second chunk, separated by commas.

Notice now that the warnings are printed to the R Markdown console.

NOTE: you may not want to suppress warnings and messages until you are sure everything is working correctly.
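Here is a small sketch of what warning=FALSE changes, using the same log transformation of the built-in sleep data. The helper chunk_with() is a hypothetical name, and the warning text checked at the end assumes an English-language R session.

```r
library(knitr)

fence <- strrep("`", 3)  # three backticks for the chunk delimiters

# Hypothetical helper: knit the log-transform chunk with the given options
chunk_with <- function(options) {
  src <- c(paste0(fence, "{r ", options, "}"),
           "logextra <- log(sleep$extra)",
           fence)
  knit(text = src, quiet = TRUE)
}

shown  <- chunk_with("warning=TRUE")   # default: warning is written into the document
hidden <- chunk_with("warning=FALSE")  # warning goes to the console instead

grepl("NaNs produced", shown)
grepl("NaNs produced", hidden)
```

The warning is not lost with warning=FALSE; it is simply reported in the console rather than in the report.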

Global chunk options

If you know that you will need to set an option for multiple or all chunks, you can set them globally with a call to knitr::opts_chunk$set() in the first code chunk of the .rmd file.

A. Insert a new code chunk before the “# Purpose” header.
B. Give the chunk the label “setup” in the header.
C. Specify this code inside the chunk: knitr::opts_chunk$set(echo=FALSE).

The global option above sets echo=FALSE for all chunks, thus suppressing all R code.

If you’d like to see the R code in your document, delete this chunk or reset the option to echo=TRUE.

Usually we don’t want to see this setup chunk in the report. Suppress its printing.
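A sketch of a setup chunk in action, knitted from a character vector. The option include=FALSE is one way to keep the setup chunk itself out of the report; the second chunk and its code are made up for illustration.

```r
library(knitr)

fence <- strrep("`", 3)  # three backticks for the chunk delimiters

src <- c(
  # include=FALSE runs the setup chunk but prints neither its code nor output
  paste0(fence, "{r setup, include=FALSE}"),
  "knitr::opts_chunk$set(echo=FALSE)",
  fence,
  "",
  # This chunk inherits echo=FALSE from the global setting
  paste0(fence, "{r}"),
  "nrow(sleep)",
  fence
)

out <- knit(text = src, quiet = TRUE)
cat(out)
```

Only the output of the second chunk appears; neither chunk's code is printed.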

Formatted tables with knitr::kable()

The function kable() from the knitr package produces pretty, formatted tables produced by R code (rather than the default R output style).

The table input is usually a data.frame, a matrix, or a table and is the first argument to kable().

kable() includes arguments to control the number of digits printed, column names, column alignment, table caption, and other formatting options. See ?knitr::kable for details.

Look into the package kableExtra for many more formatting options for kable tables. See here for examples.

In the code chunk ‘mydata’, replace the first line of code sleep with knitr::kable(sleep, align='c').
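A few of these arguments used together, as a sketch. The column names and caption are made up for illustration; head() keeps the table short.

```r
library(knitr)

# First rows of the built-in sleep data: centered columns, one decimal place,
# friendlier column names, and a caption
tbl <- kable(
  head(sleep),
  align = "c",
  digits = 1,
  col.names = c("Extra sleep (hours)", "Group", "ID"),
  caption = "First six rows of the sleep data"
)
tbl
```

Inside a code chunk, the table is formatted to match the output document when the file is knit.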

Inline R code

We can also insert R code directly into text, which will be replaced by its output when rendered.

Enclose the inline R code with `r and `.

Inline R code itself will not be printed to the document.

Use Markdown tags to format the output.

A. Replace the text “XXX” at the very end of the .Rmd file with the inline R code `r mean(sleep$extra)`.
B. Use Markdown tags ** to bold the result.
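To preview what inline code turns into, we can knit a single line of text from the console. The sentence is made up; the data are the built-in sleep data.

```r
library(knitr)

# One line of Markdown with inline R code wrapped in ** to make the result bold
src <- "The mean increase in sleep was **`r mean(sleep$extra)`** hours."

out <- knit(text = src, quiet = TRUE)
out
```

The inline code is replaced by its value, and the surrounding ** tags are left for pandoc to turn into bold text.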

R code chunks and Figures

tidyverse package for this section

For this section of the seminar, we will be using the package tidyverse, a diverse collection of packages with many tools for data analysis. Specifically we will be using the following packages within tidyverse:

Please make sure you have tidyverse installed. You can check by issuing library(tidyverse) in the current environment. If it errors, please run install.packages("tidyverse") now.

Please open the mileage.rmd file to practice syntax for controlling R graphics.

rmarkdown, knitr and R graphics

knitr, and thus rmarkdown, makes including and formatting graphics in documents quite easy.

Graphics produced by R code are placed immediately after the generating code chunk.

Knit the mileage.rmd file and observe the placement of the R code and graphs.

Arranging multiple figures produced by the same code chunk

Notice that the three plots produced by the final code chunk of the mileage.rmd file are interleaved with the individual ggplot() commands that produced them.

The knitr chunk option fig.show determines how to place multiple plots:

Add the chunk option fig.show='hold' to the fourth chunk, mileage-graphs, of mileage.rmd. Leave this option at 'hold' for the remainder of the seminar.

Because the figures are large, knitr places them one after another.

Sizing and aligning figures

We can easily adjust the size of figures using the knitr chunk options:

If only one of fig.width or fig.height is specified, the other is not adjusted, unless fig.asp is also specified (it is NULL by default).

Add the chunk option fig.width=3 to the code chunk mileage-graphs of mileage.rmd and observe what happens to the size and positioning of the graphs.

Now add the chunk option fig.asp=1 to this same chunk. Don’t forget commas!

How could you change the size of all figures in the document to this size?

We also have an option to adjust the alignment of figures in the document:

fig.align:('default') 'default' is no alignment adjustment, and other possible values are 'left', 'center', and 'right'.

Place the option fig.align='center' inside of knitr::opts_chunk$set() in the very first chunk.
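Figure options can be set globally in the same way as any other chunk option. A sketch of what the setup chunk might contain, using the values from the exercises above: fig.asp is the height-to-width ratio, so a width of 3 inches with fig.asp=1 gives a height of 3 x 1 = 3 inches.

```r
library(knitr)

# Set figure size and alignment once for every chunk in the document
opts_chunk$set(fig.width = 3, fig.asp = 1, fig.align = "center")

# Confirm the values that later chunks will inherit
opts_chunk$get(c("fig.width", "fig.asp", "fig.align"))
```

Individual chunks can still override any of these in their own chunk header.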

Figure captions

Though no captions are shown by default, knitr makes adding a figure caption easy with this option:

Add the option fig.cap='Fig 1' to the chunk sample (following the header ## The sample of cars)

Now add the option fig.cap='Fig 2' to the chunk mileage-graphs (following the header ## Mileage graphs) and observe an interesting result

Saving figures (Optional)

By default, knitr will embed the images into the final document as base64 strings, creating a single file with all content including images (rather than saving the images externally and linking them into the document).

If you would also like to save the R-produced images to external files, use:

R Notebooks

Where are my R objects?

You might have noticed that after we Knit a file, none of the R objects appear in the current R session.

Clicking the Knit button actually starts a new R session in which all of the R code is executed; that session is closed once rendering finishes.

Rendering in a new session ensures that the document is reproducible (for instance on someone else’s computer), as it prevents any dependencies on objects (e.g. packages) in the current R session.

Still, it’s handy to be able to work with R objects as you construct your .rmd file, which R Notebooks allow.

R Notebooks

When editing an R Markdown document within RStudio, it will be edited as an R Notebook.

R Notebooks allow the user to execute each R code chunk interactively, which places the output immediately below the code chunk itself in the .rmd document.

R Notebooks are R Markdown files in every sense – they just provide an interactive mode for document editing.

You’ll know that RStudio is treating your .rmd file as an R Notebook if you see these buttons at the top right of each R code block:

Check for the buttons at the top right of your code chunks in the mileage.rmd file

Click the middle button (gray triangle and green bar) in the code chunk mileage-graphs of mileage.rmd.

Click the right button (green triangle) in the code chunk mileage-graphs of mileage.rmd.

YAML header

Purpose of the YAML header

In the YAML header we specify pandoc options that control the overall appearance of the output document.

YAML code tells pandoc which output document format (HTML webpage, LaTeX pdf, Word doc, etc.) to use and how to style it.

Different options are available for different output document formats.

See the R Markdown Reference Guide cheatsheet to see a table of options by output format (click on the Help menu in RStudio, then click Cheatsheets).

Specifying a YAML header

The YAML header is located at the top of the .rmd file and is enclosed in 2 sets of 3 dashes, ---

YAML syntax generally is option_name: value. For example:

The YAML header is actually optional, and if omitted completely, an HTML document will be produced.

We will cover many more YAML (pandoc) options as we discuss specific output formats.

YAML Indentation

Indentation is important when specifying options to style the output document.

For example, below we add a table of contents that floats on the left of the document created by mileage.rmd. Notice the indentation and newlines in this specification:

---
title: "Mileage of American Cars"
output:
  html_document:
    toc: TRUE
    toc_float: TRUE
---

Replace the YAML header of mileage.rmd with the header above. Make sure the indentation is copied faithfully.
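One way to check that a header is indented as intended is to parse it from R. In the sketch below the file is written to a temporary location, and the toc options are assumed to sit under output: and html_document:. rmarkdown::yaml_front_matter() reads the header into a nested list.

```r
library(rmarkdown)

# Write a small .Rmd file whose header nests the toc options (assumed layout)
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Mileage of American Cars\"",
  "output:",
  "  html_document:",
  "    toc: TRUE",
  "    toc_float: TRUE",
  "---",
  "",
  "Body text."
), rmd_file)

# Parse the YAML header into an R list
header <- yaml_front_matter(rmd_file)
str(header)

# Each level of indentation becomes one level of nesting
header$output$html_document$toc
```

If the indentation is wrong, the options end up at a different level of the list (or the header fails to parse), which is why pandoc then ignores or rejects them.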

Parameters in the YAML heading

R Markdown allows specification of parameters in the YAML heading that can be passed to R code anywhere in the document.

Parameters provide an easy mechanism to generate different customized reports depending on the inputs. For example, we can reproduce a data analysis document for different age cohorts or specify different analyses for different years of data.

To declare parameters, include the params: field in the YAML header, and underneath add one parameter per line, each specified as param_name: value, and each indented by 2 spaces.

To access the parameter value in R code, use params$param_name, where param_name is the name of the parameter specified in the YAML header.

A. Change the eval option in the final R code chunk, subgroup-plot, to eval=TRUE.
B. Add the following parameter specification to the YAML header:
params:
  manufacturer: dodge

The first line is not indented, but the second line is indented 2 spaces.

Now try changing dodge to chevrolet, ford, jeep, lincoln, mercury, or pontiac and re-render the document

Notice how the parameter is accessed in the final chunk with params$manufacturer
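Parameter values can also be overridden at render time, without editing the YAML header, through the params argument of rmarkdown::render(). A minimal sketch, assuming pandoc is available to R; the file contents and output file names are made up for illustration.

```r
library(rmarkdown)

# A small parameterized report (hypothetical content)
rmd_file <- file.path(tempdir(), "param_demo.Rmd")
writeLines(c(
  "---",
  "title: \"Parameter demo\"",
  "output: html_document",
  "params:",
  "  manufacturer: dodge",
  "---",
  "",
  "Selected manufacturer: `r params$manufacturer`"
), rmd_file)

# Render once with the default from the YAML header ...
out_default <- render(rmd_file, output_file = "report_dodge.html",
                      envir = new.env(), quiet = TRUE)

# ... and once with the value supplied in the call
out_ford <- render(rmd_file, output_file = "report_ford.html",
                   params = list(manufacturer = "ford"),
                   envir = new.env(), quiet = TRUE)

basename(c(out_default, out_ford))
```

Looping over a vector of values in this way produces one customized report per value from a single .Rmd file.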

Tour of Output Formats

Output Formats

R Markdown’s ability to produce a wide variety of document types using a single, unified coding framework is one of its biggest strengths.

The same .rmd file can produce an HTML document, a LaTeX .pdf document, a Word document, a PowerPoint slide show presentation, etc.

See here for a full list of available output formats. Formats are either documents or presentations.

Each output format has its own set of options that we can specify in the YAML header to control the document’s appearance.

For this section, please open the txhousing_sales.rmd file.

HTML documents

Markdown’s original purpose was to simplify HTML coding, so HTML documents naturally have the widest array of options available in rmarkdown.

We have already seen the use of toc: TRUE to add a table of contents.

Some more useful suboptions for HTML documents are:

Try adding a theme: suboption to txhousing_sales.rmd when the output format is html_document. Try a few of the theme_name specifications above.

Now add a highlight: suboption, using any of the style specifications above. (Make sure echo=TRUE in knitr::opts_chunk$set() in the first chunk to observe the results.)

Finally, add code_folding: hide, and try toggling the Code buttons throughout, and the master Code button at the top of the document.

A full list of options for HTML documents can be found at the R Markdown Definitive Guide

Raw HTML and CSS

If you want to have finer control over the appearance of your HTML documents (including some of the presentation formats we will discuss later), you will probably need to learn some HTML.

You can insert HTML directly into the document and in most cases it will render as expected. For example, <font color="red">ERROR:</font> produces ERROR:.

CSS (Cascading Style Sheets) is a language used to style markup languages like HTML. CSS code blocks that control the appearance of the document globally are often defined at the top of documents inside HTML <style> </style> tags.

Try adding the following CSS within HTML <style> tags to txhousing_sales.rmd immediately after the YAML header:
<style>
body {
background-color: AliceBlue;
font-family: Garamond, serif;
font-size: 20px;
}
</style>

Remove this style section when you feel you understand how it functions.

W3Schools is an excellent, free online resource for beginners to learn HTML and CSS.


R Markdown supports several slide-show-style presentation output formats, including:

The HTML slideshows are opened and viewed in a browser just like any other HTML file, while a beamer_presentation is viewed in a PDF viewer (e.g. Adobe Acrobat, more on PDF files later).

HTML slideshows can be styled with raw HTML and CSS, while Beamer presentations can be styled with LaTeX.

We will not be covering Beamer presentations in this seminar.

Slideshows often look and behave better in the actual output file than in the RStudio previewer.

Markdown in slideshow presentations

Slideshows use section headers to initiate new slides. For example