1,500 scientists lift the lid on reproducibility. Nature, 2016
This seminar aims to teach the user basic R Markdown syntax to make dynamic, reproducible reports.
First we will discuss what R Markdown is, how it is used, and how it works.
The rest of the seminar focuses on R Markdown syntax, specifically:
knitr options to format R code and output
This seminar does not attempt to explain all of the R code used in the example reports.
Text that appears with this typeface and background
is usually code syntax you can use when authoring your R Markdown files. Buttons and menus in RStudio will also appear formatted this way.
Text that appears blockquoted like this is a set of instructions to construct an R Markdown file. Click the
Knit
button after finishing all instructions within a block to view the results of your coding.
“An authoring framework for data science” – R Markdown creators
R Markdown allows us to create reproducible documents that weave narrative text together with R code and the output it produces when executed.
For example, here is an R code block inserted into the R Markdown file that generates this slide show. Underneath the code is its output:
barplot(HairEyeColor[,,1], col=c("#4d4d4d", "#bf812d", "#f4a582", "#f6e8c3"), legend.text=TRUE, xlab="Eye Color", args.legend=list(title="Hair Color"))
These documents are dynamically generated – update code or data, then re-compile the file and the output will be automatically updated in the resulting document.
Your documents can thus provide the most up-to-date content.
We highly recommend working with R Markdown in RStudio, which has many features that facilitate R Markdown file editing, including:
Once R and RStudio are installed, you can install R Markdown with install.packages("rmarkdown")
as usual.
We can open a new Markdown file template through the File
menu in RStudio.
A. Choose File -> New File -> R Markdown...
B. Fill the Title and Author fields with “Practice” and your name, respectively.
C. In the left menu, select Document, and for Default Output Format select option HTML (these are the defaults).
D. Click OK.
R Markdown files typically use the extension .Rmd
or .rmd
This new R Markdown file contains the basic elements of an R Markdown file:
Note: If you have not installed package rmarkdown
and try to open a .rmd
file through the File
menu, RStudio may ask you to install rmarkdown
immediately.
The header atop the code file contains optional metadata (title:
, author:
etc.).
Additionally, YAML code can be specified to control:
“YAML Ain’t Markup Language” or “Yet Another Markup Language” is a human-readable language, used here to communicate with Pandoc.
Pandoc converts documents between formats and controls their overall appearance. Pandoc is installed with RStudio.
In the body of the document, you’ll find text with special characters that have been highlighted blue:
## signifies that the text is to be treated as a section header (or a new slide for a slide show)
** signify that “Knit” is to appear in bold
## and ** are Markdown tags which format the text enclosed within them.
Markdown is a markup language, a system of code tags to format text.
Finally, we also see R code chunks, highlighted with gray backgrounds and enclosed within ```{r }
and ```
.
The R code chunks are actually processed by the package knitr
, which is installed with rmarkdown
.
When the R Markdown file is compiled and rendered, the output of the code chunk will be embedded in the document underneath the code.
rmarkdown
(via knitr
) provides a large array of options to control the appearance of both the R code and its output.
When you are ready to compile and render your .rmd
file, click the Knit
button (Ctrl+Shift+K) in RStudio.
The output document will be saved in the same directory as where the .rmd
file is located.
RStudio will also provide a preview of the output document.
Progress and messages produced by rendering the .rmd file will be displayed in the R Markdown console
, which appears in its own tab next to the regular Console
.
Click the
Knit
button. Name the file whatever you want, but use the .rmd extension. Observe how elements of the .rmd file appear in the output.
R Markdown is thus the unification of 3 frameworks:
Markdown, to format text
knitr, to process R code chunks
YAML (via pandoc), to control the overall appearance of the document
When we render the document, the following happens:
First, knitr
converts all of the R code and output into text and Markdown tags, resulting in a Markdown file (.md) of just text, Markdown tags, and links to image files.
Then, pandoc converts the Markdown (.md) file into the desired final output format, such as an HTML web page, a LaTeX PDF document, or an ioslides slide-show presentation.
Thus, to use R Markdown proficiently, we must learn each of the 3 coding frameworks: Markdown, knitr
and YAML (pandoc).
Fortunately, the coding is not difficult to learn, and you can find lots of free help online.
You can produce a wide variety of documents with just a little coding knowledge, but very fine control over documents will require that you learn more advanced coding (such as HTML or LaTeX).
R Markdown keeps it simple for the user by requiring a single format, the .rmd
file, to produce this wide variety of output.
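The first rendering step can be tried on its own from the R console. The following is a minimal sketch, assuming the knitr package is installed; the file name pipeline_demo.Rmd and its contents are made up for illustration. It runs only the knitr step and prints the intermediate Markdown that pandoc would convert in the second step.

```r
# Build a tiny R Markdown file in a temporary directory (hypothetical file name)
fence <- strrep("`", 3)   # the three-backtick chunk delimiter
rmd_file <- file.path(tempdir(), "pipeline_demo.Rmd")
writeLines(c(
  "## A header",
  "",
  paste0(fence, "{r}"),
  "1 + 1",
  fence
), rmd_file)

# Step 1 only: knitr runs the chunk and writes a plain Markdown (.md) file
md_file <- knitr::knit(rmd_file,
                       output = file.path(tempdir(), "pipeline_demo.md"),
                       quiet = TRUE)
md_lines <- readLines(md_file)
cat(md_lines, sep = "\n")

# Step 2 (not run here): pandoc converts the .md file to the final format,
# which is what rmarkdown::render(rmd_file) does after knitting
```

The printed Markdown contains the header, the code in a fenced block, and the output in a second fenced block, with no R left to execute.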
A markup language is a system of tags (code) used to format documents. Tags are used to define sections, change the appearance of text, build tables, link images, and so on.
Hypertext Markup Language (HTML) is a markup language designed to be used for web pages.
HTML tags generally include both an opening and closing tag, and have this form <tag>
and </tag>
.
For example, enclosing text in <em>
tags results in italicized text.
<em>hello</em>
becomes hello when viewing the document in a web browser.
Markdown is a lightweight markup language, simple and easy to type.
Markdown was originally designed to be a shorthand for HTML. For example, we just learned that to italicize text in HTML, you enclose it in <em> </em>. In Markdown, you just enclose it in * *.
However, pandoc can convert Markdown to many different output formats besides just HTML.
R Markdown uses pandoc’s version of Markdown, which differs a bit from standard Markdown.
Because of its simplicity, Markdown is very easy to use, so let’s dive in!
In your currently open .rmd file, erase all content except for the YAML header.
Newlines (carriage returns) are considered spaces in Markdown.
First, try adding the following text broken by a single new-line (and no additional spaces) after the first period:
First.
Second.
In order to begin a new paragraph, insert a blank line (i.e. 2 newlines) before beginning the new paragraph. This will double-space paragraphs.
Now, try retyping the text with 2 newlines after the first period:
First.

Second.
Or, you can add 2 spaces before the new-line to single-space the paragraphs.
Finally, try retyping the text with 2 spaces and a single newline after the period:
First.
Second.
Section headers are often displayed in larger, bolder fonts.
To format text as a header, place one to six #
tags at the beginning of the header text. The number of #
signs indicates the level of the header: a single # is the top level and is displayed largest, and each additional # gives a lower-level, smaller header.
Put a space between the #
tags and the header text.
Headers must be preceded by at least one blank line.
Add a level 1 header called “Big Header” and a level 3 header called “Small Header”
Markdown provides simple tags to format text for emphasis as well as super- and subscripting:
*italics* produces italics
**bold** produces bold
~~strikethrough~~ produces strikethrough
`code` produces code
text^super^ produces textsuper
text~sub~ produces textsub
Do not insert spaces in between formatting tags and the text.
Any character preceded by a backslash will be treated as a literal character and not as a code tag:
\*italics\*
produces *italics*
Try recreating this formatted text with Markdown syntax:
We multiplied x by z2 to create the interaction variable x*z2
Markdown bulleted lists are much easier to specify than HTML lists.
To create a list, precede each list item with *
(or +
or -
) and a space:
* item1
* item2
To add sublists, indent 4 spaces:
* item1
    + sub1.1
    + sub1.2
* item2
Use numbers with periods as tags for numbered lists:
1. item1
2. item2
3. item3
Square brackets, []
, convert text to a link to another part of the document or to an external webpage.
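To create a link:
Enclose the text to be displayed in [].
Immediately after the ], place the location of the destination inside of ().
For a link to an external webpage, put the URL inside of ().
For a link to another part of the document, first tag the destination (such as a header) with {#label}, where label is a name you supply. Then, for the link, after [] specify #label within ().
For example: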
[UCLA](https://www.ucla.edu/)
produces this link: UCLA
[References](#references)
would create a link to the place in the document where {#references}
is specified.
Create a link that displays the text “IDRE website” in bold and sends the user to https://stats.idre.ucla.edu.
Embedding external images (not created by R code within the document) in a document uses syntax very similar to linking:
![Caption](image_location)
Note: Do not put the image path/name in quotes!
So, assuming there is a file named “densities.png” in a folder named rmarkdown_images within the same directory as the .rmd file:
![Fig 1 Densities by diet](rmarkdown_images/densities.png)
Alternatively, you can use HTML code:
<img src="rmarkdown_images/densities.png" alt="Fig 1 Densities by diet" style="width: 50%;">
Text enclosed by $
symbols will be treated as TeX Math, another set of markup tags used to format mathematical expressions.
For example, $mean(X) = \frac{\sum_{i=1}^n X}{n}$
will be rendered as \(mean(X) = \frac{\sum_{i=1}^n X}{n}\)
A TeX expression enclosed by two $
symbols on each side will be displayed as an equation, generally centered in the document. So, $$mean(X) = \frac{\sum_{i=1}^n X}{n}$$
produces \[mean(X) = \frac{\sum_{i=1}^n X}{n}\]
Although TeX Math is often associated with LaTeX documents, it can be used in any document type supported by R Markdown.
knitr and R code chunks
By itself, the R package knitr is used to weave text and R code output together into reports.
R Markdown builds upon knitr
by allowing Markdown tags to format text and using Pandoc to convert between document formats.
knitr
executes code chunks sequentially when the .rmd
file is knit, so R objects created in a chunk are available to all subsequent chunks.
A large array of knitr
options provides control over the appearance of R code, text output, and graphical output in the final document.
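As a small illustration of sequential execution, the sketch below knits two chunks supplied as a character vector through knitr::knit(text = ...). The chunk labels and the object x are invented for this example; the second chunk uses the object created in the first.

```r
fence <- strrep("`", 3)   # the three-backtick chunk delimiter
rmd_text <- c(
  paste0(fence, "{r first}"),
  "x <- 10",
  fence,
  "",
  "Some narrative text between the chunks.",
  "",
  paste0(fence, "{r second}"),
  "x * 2",
  fence
)

# With text= supplied, knit() returns the resulting Markdown as one string;
# a fresh environment keeps x out of the current workspace
md <- knitr::knit(text = rmd_text, envir = new.env(), quiet = TRUE)
cat(md)
```

If the two chunks were swapped, the chunk using x would fail because x would not exist yet.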
Please open the sleep_study.rmd file to practice using knitr code chunks. You may close the .rmd file we used to practice Markdown.
knitr R code chunk delimiters
R code chunks are delimited by ```{r chunk_label, options}
at the beginning and ```
at the end. The chunk_label and options are optional and are separated by commas (much more on this soon).
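Three ways to add R code chunk delimiters:
Type the delimiters manually.
Use the keyboard shortcut Ctrl + Alt + I (Cmd + Option + I on Macs).
Use the Insert button, which is located near the Knit button in the editor toolbar.
Add an R code chunk using the keyboard shortcut or the Insert button to the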
sleep_study.rmd
file after the text “Here are the data:”
Data sets stored as data.frame
objects in R can be printed to the document by simply specifying the object name.
Inside our new code chunk, add the following code:
sleep
summary(sleep)
Let’s look closely at the output:
Output lines are preceded by ##, double comment symbols, allowing the user to copy and paste all of the R-related content without interfering with execution of the R code.
We can control all of this!
Much of the power of rmarkdown
via knitr
lies in its wide array of options to control the appearance of R code output.
See here for a full list of knitr
chunk options.
To specify chunk options, after ```{r
, specify a chunk label (name), a comma, and then a list of options separated by commas. This is known as the chunk header.
All of the chunk options must be specified on one line (no line breaks).
In chunk labels, avoid characters other than letters, digits, and -.
Change the first line of the R code chunk to
```{r mydata, echo=FALSE}
.
Here mydata
is the chunk label, and echo=FALSE
is an option. Notice the use of commas to separate.
As we saw, echo=FALSE
suppresses printing of the R code. By default, echo
is set to TRUE
, but often we do not want our audience to see the underlying R code.
Here are some options to control our output (default of option specified in parentheses):
echo: (TRUE) whether to print the R code to the document. Can be set to a vector of numbers to print only specific lines of code.
eval: (TRUE) whether or not to evaluate (run) the R code chunk. Can be set to a vector of numbers to evaluate only specific lines of the code, e.g. eval=1:3 evaluates only the first 3 lines of code.
include: (TRUE) whether to include the R code and output in the document. Differs from eval in that if include=FALSE, the R code is still evaluated, but nothing is printed to the document.
results: ('markup') how to print results (note use of single quotes for setting values)
'markup' prints the output with special formatting per the document type.
'hide' suppresses printing of the output.
'hold' prints all of the output of the entire code chunk together at the end.
'asis' prints raw results without special formatting.
Change echo=FALSE to eval=FALSE.
Change eval=FALSE to results='hold'.
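To see the effect of a chunk option without editing a file, a chunk can be knit from a character vector in the console. This is only a sketch, assuming knitr is installed; the chunk label hidden-code is made up. With echo=FALSE, the result appears in the Markdown output but the source line does not.

```r
fence <- strrep("`", 3)   # the three-backtick chunk delimiter
rmd_text <- c(
  paste0(fence, "{r hidden-code, echo=FALSE}"),
  "sum(1:4)",
  fence
)
md <- knitr::knit(text = rmd_text, envir = new.env(), quiet = TRUE)
cat(md)

# Check what made it into the output
has_result <- grepl("[1] 10", md, fixed = TRUE)    # the printed result
has_source <- grepl("sum(1:4)", md, fixed = TRUE)  # the R code itself
c(result = has_result, source = has_source)
```

Swapping echo=FALSE for eval=FALSE in the header would do the reverse: the code would be shown but not run.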
Many R functions display warnings and messages to the user.
knitr
will print warnings and messages to the document by default, but they may be distracting to the reader.
We can use the chunk options:
message: (TRUE) whether to print messages to the document
warning: (TRUE) whether to print warnings to the document
A. Insert a new code chunk after the text “First, log-transforming the outcome **extra** was suggested:”.
B. Inside the code chunk, specify these 2 lines of code:
sleep$logextra <- log(sleep$extra)
sleep$logextra
Notice that a warning was printed to the document.
Add the chunk label log-transform and the option warning=FALSE to this second chunk, separated by commas.
Notice now that the warnings are printed to the R Markdown console.
NOTE: you may not want to suppress warnings and messages until you are sure everything is working correctly.
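The warning itself comes from taking the log of negative values, which R returns as NaN. A quick console check along these lines, using the built-in sleep data set (the object names are illustrative), can confirm where it originates before it is suppressed in the report:

```r
data(sleep)

# Count the negative values of extra; log() returns NaN for each of them
n_negative <- sum(sleep$extra < 0)

# suppressWarnings() silences the 'NaNs produced' warning for this one call
logextra <- suppressWarnings(log(sleep$extra))
n_nan <- sum(is.nan(logextra))

c(negative = n_negative, nan = n_nan)
```

The two counts agree, so every NaN in the transformed variable traces back to a negative value of extra.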
If you know that you will need to set an option for multiple or all chunks, you can set them globally with a call to knitr::opts_chunk$set()
in the first code chunk of the .rmd file.
A. Insert a new code chunk before the “# Purpose” header.
B. Give the chunk the label “setup” in the header.
C. Specify this code inside the chunk: knitr::opts_chunk$set(echo=FALSE).
The global option above sets echo=FALSE
for all chunks, thus suppressing all R code.
If you’d like to see the R code in your document, delete this chunk or reset the option to echo=TRUE
.
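The same functions can be explored from the console. The sketch below, which assumes knitr is installed, reads the current default with knitr::opts_chunk$get(), changes it, and then restores it so that the session is left as it was found.

```r
# Remember the current setting so it can be restored afterwards
old_echo <- knitr::opts_chunk$get("echo")

# Set the option for all later chunks (this call normally lives in the setup chunk)
knitr::opts_chunk$set(echo = FALSE)
echo_after_set <- knitr::opts_chunk$get("echo")
echo_after_set

# Restore the previous value
knitr::opts_chunk$set(echo = old_echo)
```

An option given in an individual chunk header still overrides the global value for that chunk.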
Usually we don’t want to see this setup chunk in the report. Suppress its printing.
knitr::kable()
The function kable() from the knitr package turns tables produced by R code into cleanly formatted tables (rather than the default R output style).
The table input is usually a data.frame
, a matrix
, or a table
and is the first argument to kable()
.
kable() includes arguments to control the number of digits printed, column names, column alignment, table caption, and other formatting options. See ?knitr::kable for details.
Look into the package kableExtra to get many more formatting options for kable tables. See here for examples.
In the code chunk ‘mydata’, replace the first line of code, sleep, with knitr::kable(sleep, align='c').
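A few of the other kable() arguments can be combined in one call. The following sketch uses the built-in sleep data; the column labels and caption are my own wording for the three columns (extra, group, ID), not text taken from the seminar files.

```r
data(sleep)

tab <- knitr::kable(
  head(sleep),                  # first six rows only
  digits = 1,                   # round numeric columns to 1 decimal place
  align = "c",                  # center every column
  col.names = c("Extra sleep (hrs)", "Drug group", "Patient ID"),
  caption = "First six rows of the sleep data"
)
tab
```

Run in the console, this prints the table as Markdown text; inside a knitted document the same call is rendered as a formatted table.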
We can also insert R code directly into text, which will be replaced by its output when rendered.
Enclose the inline R code with `r
and `
.
Inline R code itself will not be printed to the document.
Use Markdown tags to format the output.
A. Replace the text “XXX” at the very end of the .Rmd file with the inline R code `r mean(sleep$extra)`.
B. Use the Markdown tags ** to bold the result.
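Inline code can also be tried from the console by knitting a single line of text. This is a sketch under the assumption that knitr is installed; the sentence and the use of round() are illustrative additions, not part of sleep_study.rmd.

```r
# An inline expression is written as `r expression` inside ordinary text;
# the ** tags around it bold whatever value is substituted in
line <- "The mean of extra is **`r round(mean(sleep$extra), 2)`**."

md <- knitr::knit(text = line, quiet = TRUE)
cat(md)
```

In the result, the R expression is gone and only its value remains between the ** tags. Wrapping the expression in round() is a common way to keep long decimals out of the sentence.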
tidyverse package for this section
For this section of the seminar, we will be using the package tidyverse, a diverse collection of packages with many tools for data analysis. Specifically we will be using the following packages within tidyverse:
ggplot2 for plots
dplyr for data management
broom to make regression tables into printable data frames
Please make sure you have tidyverse installed. You can check by issuing library(tidyverse) in the current environment. If it errors, please run install.packages("tidyverse") now.
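An alternative check that does not stop with an error is sketched below; the variable names are illustrative. Note that, to my knowledge, broom is installed along with tidyverse but is not attached by library(tidyverse), so it may need its own library(broom) call or the broom:: prefix.

```r
# Packages used in this section
needed <- c("ggplot2", "dplyr", "broom")

# requireNamespace() returns FALSE instead of erroring when a package is missing
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing: ", paste(missing_pkgs, collapse = ", "),
          "; install.packages(\"tidyverse\") should provide them")
} else {
  message("All packages for this section are installed")
}
```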
Please open the
mileage.rmd
file to practice syntax for controlling R graphics.
rmarkdown, knitr and R graphics
knitr, and thus rmarkdown, make including and formatting graphics in the documents quite easy.
Graphics produced by R code are placed immediately after the generating code chunk.
Knit the
mileage.rmd
file and observe the placement of the R code and graphs.
Notice that the three plots produced by the final code chunk of the mileage.rmd file are interleaved with the individual ggplot()
commands that produced them.
The knitr
chunk option fig.show
determines how to place multiple plots:
fig.show: ('asis') how to arrange plots produced by the same chunk, taking on one of these settings:
'asis' places each plot immediately after the code that produced it (the default)
'hold' places all plots at the very end of the code chunk
'hide' saves the plots to files, but does not place them in the document (requires the option fig.path be set to a valid path to save the plots)
Add the chunk option fig.show='hold' to the fourth chunk, mileage-graphs, of mileage.rmd. Leave this option at 'hold' for the remainder of the seminar.
Because the figures are large, knitr
places them one after another.
We can easily adjust the size of figures using the knitr
chunk options:
fig.width: (7) width of figure in inches
fig.height: (5) height of figure in inches
fig.asp: (NULL) aspect ratio of plot, height/width
If only one of fig.width or fig.height is specified, the other is not adjusted, unless fig.asp is also specified (it is NULL by default).
Add the chunk option fig.width=3 to the code chunk mileage-graphs of mileage.rmd and observe what happens to the size and positioning of the graphs.
Now add the chunk option fig.asp=1 to this same chunk. Don’t forget commas!
How could you change the size of all figures in the document to this size?
We also have an option to adjust the alignment of figures in the document:
fig.align
:('default'
) 'default'
is no alignment adjustment, and other possible values are 'left'
, 'center'
, and 'right'
.
Place the option fig.align='center' inside of knitr::opts_chunk$set() in the very first chunk.
Though no captions are shown by default, knitr
makes adding a figure caption easy with this option:
fig.cap: (NULL) a character string specifying a figure caption, or NULL for no caption
Add the option fig.cap='Fig 1' to the chunk sample (following the header ## The sample of cars).
Now add the option fig.cap='Fig 2' to the chunk mileage-graphs (following the header ## Mileage graphs) and observe an interesting result.
By default, knitr
will embed the images into the final document as base64 strings, creating a single file with all content including images (rather than saving the images externally and linking them into the document).
If you would also like to save the R-produced images to external files, use:
fig.path: ('figure/') prefix to use to name image files, with directory names allowed. For example, fig.path='my_figs/plot-' will create a new directory within the working directory called my_figs, in which each file will be named beginning with the prefix plot- and its corresponding chunk label.
You might have noticed that after we Knit a file, none of the R objects appear in the current R session.
Using the Knit button actually starts a new R session to render the document, where all the R code is executed; the session is then closed after rendering.
The working directory of the new R session is the directory of the .rmd file. You can change the working directory (of the new R session only) within the .rmd file with setwd() as usual, and this will not change the working directory of the current session.
Any packages and data needed must be loaded within the .rmd document. These will not be loaded into the current session.
Rendering in a new session ensures that the document is reproducible (for instance on someone else’s computer), as it prevents any dependencies on objects (e.g. packages) in the current R session.
Still, it’s handy to be able to work with R objects as you construct your .rmd
file, which R Notebooks allow.
When editing an R Markdown document within RStudio, it will be edited as an R Notebook.
R Notebooks allow the user to execute each R code chunk interactively, which places the output immediately below the code chunk itself in the .rmd
document.
R Notebooks are R Markdown files in every sense – they just provide an interactive mode for document editing.
You’ll know that RStudio is treating your .rmd
file as an R Notebook if you see these buttons at the top right of each R code block:
Check for the buttons at the top right of your code chunks in the
mileage.rmd
file
Click the middle button (gray triangle and green bar) in the code chunk mileage-graphs of mileage.rmd.
Click the right button (green triangle) in the code chunk mileage-graphs of mileage.rmd.
In the YAML header we specify pandoc options that control the overall appearance of the output document.
YAML code tells pandoc which output document format (HTML webpage, LaTeX pdf, Word doc, etc.) to use and how to style it.
Different options are available for different output document formats.
See the R Markdown Reference Guide
cheatsheet to see a table of options by output format (click on the Help
menu in RStudio, then click Cheatsheets
).
The YAML header is located at the top of the .rmd
file and is enclosed in 2 sets of 3 dashes, ---
YAML syntax generally is option_name: value
. For example:
title: Analysis of Health Outcomes
output: html_document
The YAML header is actually optional, and if omitted completely, an HTML document will be produced.
We will cover many more YAML (pandoc) options as we discuss specific output formats
Indentation is important when specifying options to style the output document.
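output: should not be indented
the output format (e.g. pdf_document) should be indented 2 spaces under output:
suboptions of the output format should be indented a further 2 spaces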
For example, below we add a table of contents that floats on the left of the document created by mileage.rmd
. Notice the indentation and newlines in this specification:
---
title: "Mileage of American Cars"
output:
  html_document:
    toc: TRUE
    toc_float: TRUE
---
Replace the YAML header of
mileage.rmd
with the header above. Make sure the indentation is copied faithfully.
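One way to check that the indentation means what you intend is to parse the header back into R. The sketch below assumes the rmarkdown package is installed and uses its yaml_front_matter() function; the temporary file name is made up for illustration.

```r
# Write the header to a temporary .Rmd file (hypothetical file name)
rmd_file <- file.path(tempdir(), "yaml_demo.Rmd")
writeLines(c(
  "---",
  "title: \"Mileage of American Cars\"",
  "output:",
  "  html_document:",
  "    toc: TRUE",
  "    toc_float: TRUE",
  "---",
  "",
  "Body text."
), rmd_file)

# Parse the YAML header into a nested R list
header <- rmarkdown::yaml_front_matter(rmd_file)
str(header)

# Suboptions sit under the output format, mirroring the indentation
header$output$html_document$toc
```

If toc: were accidentally left unindented, it would show up as a top-level element of the list rather than under html_document, which is a quick way to spot the mistake.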
R Markdown allows specification of parameters in the YAML heading that can be passed to R code anywhere in the document.
Parameters provide an easy mechanism to generate different customized reports depending on the inputs. For example, we can reproduce a data analysis document for different age cohorts or specify different analyses for different years of data.
To declare parameters, include the params:
field in the YAML header, and underneath add one parameter per line, each specified as param_name: value
, and each indented by 2 spaces.
To access the parameter value in R code, use params$param_name
, where param_name
is the name of the parameter specified in the YAML header.
A. Change the eval option in the final R code chunk, subgroup-plot, to eval=TRUE.
B. Add the following parameter specification to the YAML header, between the --- delimiters:
params:
  manufacturer: dodge
The first line is not indented, but the second line is indented 2 spaces.
Now try changing dodge to chevrolet, ford, jeep, lincoln, mercury, or pontiac and re-render the document.
Notice how the parameter is accessed in the final chunk with params$manufacturer
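Parameters can also be supplied from the console instead of editing the YAML header each time, using the params argument of rmarkdown::render(). The following is a sketch only: it assumes mileage.rmd is in the working directory and declares the manufacturer parameter, and the output file names are hypothetical. The loop is skipped if the file or the rmarkdown package is not available.

```r
# One report per manufacturer; output file names are made up for this example
manufacturers <- c("dodge", "ford", "jeep")
out_files <- paste0("mileage_", manufacturers, ".html")

if (file.exists("mileage.rmd") && requireNamespace("rmarkdown", quietly = TRUE)) {
  for (i in seq_along(manufacturers)) {
    rmarkdown::render(
      "mileage.rmd",
      params = list(manufacturer = manufacturers[i]),  # overrides the YAML default
      output_file = out_files[i]
    )
  }
}
```

Values passed through params replace the defaults declared in the header for that render only; the .rmd file itself is not changed.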
R Markdown’s ability to produce a wide variety of document types using a single, unified coding framework is one of its biggest strengths.
The same .rmd
file can produce an HTML document, a LaTeX .pdf document, a Word document, a PowerPoint slide show presentation, etc.
See here for a full list of available output formats. Formats are either documents or presentations.
Each output format has its own set of options that we can specify in the YAML header to control the document’s appearance.
For this section, please open the
txhousing_sales.rmd
file.
Markdown’s original purpose was to simplify HTML coding, so HTML documents naturally have the widest array of options available in rmarkdown
.
We have already seen the use of toc: TRUE
to add a table of contents.
Some more useful suboptions for HTML documents are:
theme: theme_name, where theme_name is one of default, cerulean, journal, flatly, darkly, readable, spacelab, united, cosmo, lumen, paper, sandstone, simplex, and yeti. This adjusts the font family, font colors, and possibly the background colors.
highlight: style, where style is one of default, tango, pygments, kate, monochrome, espresso, zenburn, haddock, textmate, or null. This adjusts the style of the code syntax highlighting.
code_folding: hide will hide code from the reader unless the reader toggles a Code button that appears in its place. You can also specify code_folding: show to show the code by default, but allow it to be hidden by the user.
Try adding a theme: suboption to txhousing_sales.rmd when the output format is html_document. Try a few of the theme_name specifications above.
Now add a highlight: suboption, using any of the style specifications above. (Make sure echo=TRUE in knitr::opts_chunk$set() in the first chunk to observe the results.)
Finally, add code_folding: hide, and try toggling the Code buttons throughout, and the master Code button at the top of the document.
A full list of options for HTML documents can be found at the R Markdown Definitive Guide
If you want to have finer control over the appearance of your HTML documents (including some of the presentation formats we will discuss later), you will probably need to learn some HTML.
You can insert HTML directly into the document and in most cases it will render as expected. For example, <font color="red">ERROR:</font>
produces ERROR:.
CSS (Cascading Style Sheets) is a language used to style markup languages like HTML.
CSS can be included in an R Markdown document in a css code chunk:
```{css, echo=FALSE}
slides > slide {
  overflow-y: auto !important;
}
```
in an external .css file named in the YAML header:
---
title: "My Presentation"
output:
  ioslides_presentation:
    css: "mystyle.css"
---
or within HTML <style> </style> tags.
Try adding the following CSS within HTML <style> tags to txhousing_sales.rmd immediately after the YAML header:
<style>
body {
  background-color: AliceBlue;
  font-family: Garamond, serif;
  font-size: 20px;
}
</style>
Remove this style section when you feel you understand how it functions.
W3Schools is an excellent, free online resource for beginners to learn HTML and CSS.
R Markdown supports several slide-show-style presentation output formats, including:
powerpoint_presentation
slidy_presentation, HTML
ioslides_presentation, HTML
beamer_presentation, LaTeX PDF
The HTML slideshows are opened and viewed in a browser just like any other HTML file, while a beamer_presentation is viewed in a PDF viewer (e.g. Adobe Acrobat, more on PDF files later).
HTML slideshows can be styled with raw HTML and CSS, while Beamer presentations can be styled with LaTeX.
We will not be covering Beamer presentations in this seminar.
Slideshows often look and behave better in the actual output file than in the RStudio previewer.
Slideshows use section headers to initiate new slides. For example ## Purpose
will initiate a new slide with the header “Purpose”.
First-level section headers (i.e. # Header
) will become title slides, and should not have any accompanying text underneath.
Including anything in a title slide besides the header itself can mangle all of the subsequent slides.
Second-level section headers (i.e. ## Header
) will initiate new slides and may have additional content underneath.
You can also start a new slide without a header using ---
, with a blank line before and after (this tag produces a horizontal line in non-presentations).
To have bulleted items appear on click (when advancing the slide) use >-
instead of *
.
PowerPoint presentations are a more recent addition to R Markdown.
PowerPoint slideshows have some limitations: for example, images and tables are always placed on new slides, and can have no accompanying text other than the slide header and a caption.
Change the YAML header in txhousing_sales.rmd to this:
---
title: "Texas housing sales, 2000-2015"
output:
  powerpoint_presentation:
    toc: true
params:
  spotlight: "Houston"
---
ioslides presentations are simple and easy to use, offering basic customization with less focus on technical customization.
They use the YAML metadata header to control the appearance and behavior of slides.
This seminar is an ioslides_presentation
.
Change the YAML header in txhousing_sales.rmd to this:
---
title: "Texas housing sales, 2000-2015"
output:
  ioslides_presentation:
    widescreen: true
params:
  spotlight: "Houston"
---
Slidy presentations by default have simple styling, but are highly customizable.
In Slidy presentations, the vertical size of slides is unlimited by default, so you can scroll down slides.
Slidy offers more extensive customization options compared to ioslides.
Change the YAML header in txhousing_sales.rmd to this:
---
title: "Texas housing sales, 2000-2015"
output:
  slidy_presentation:
    font_adjustment: -1
    footer: Created in R
params:
  spotlight: "Houston"
---
A couple of options unique to slidy_presentation:
font_adjustment: -1 decreases font size, while font_adjustment: +1 increases font size
footer: text adds text to the footer of each slide
Specifying output: pdf_document in the YAML header will produce a .pdf file formatted with LaTeX. Of course, there are suboptions available for pdf_documents.
Replace the entire YAML header in txhousing_sales.rmd with this header:
---
title: "Texas housing sales, 2000-2015"
output: pdf_document
params:
  spotlight: "Houston"
---
The RStudio previewer for PDF documents is separate from the previewer for HTML documents.
Any raw HTML code in an R Markdown file that is destined for a pdf_document
will be ignored. Similarly, any LaTeX in a file destined for html_document
or one of the HTML presentation formats will be ignored.
Technical Note: Outputting a pdf_document
requires that you have some distribution of TeX installed on your computer (e.g. MiKTeX or TeX Live). You can install a small version of TeX on your computer directly through R by first running install.packages("tinytex")
and then tinytex::install_tinytex()
, which will install enough TeX to output an R Markdown pdf_document
.
LaTeX (pronounced LAY-tech) is another document markup language, allowing the user to use tags to format plain text with very fine control. Compiling a LaTeX file into a readable PDF document requires that a TeX distribution (e.g. MiKTeX) be installed as well.
LaTeX is often used to produce scientific documents, as it is particularly well suited to produce beautiful mathematical equations.
LaTeX tags begin with a backslash, and usually have the syntax: \tag{value}{text}
, where tag
is the name of the markup tag, value
is its assigned value, and text
is the text to which the formatting will be applied.
Replace the heading towards the top, # Background, with # \color{red}{Background}
Overleaf is a good place for new users of LaTeX to learn.
Some of the options for HTML documents that we have seen are also available for LaTeX PDF documents:
toc: true for a table of contents
highlight: style for syntax highlighting of R code (see a few slides back for available styles)
Another useful option for novice LaTeX users is to switch to the xelatex engine, with this option:
latex_engine: xelatex, allows the use of system fonts through this option (not available for default engine pdflatex):
mainfont: font-family, specifies that the font font-family be used throughout the document.
Somewhat confusingly, the mainfont option is actually a top-level option, passed directly to pandoc, so should not be indented.
Pandoc has many other top-level options (i.e. not indented) for LaTeX documents that can be specified in the YAML header of an R Markdown file. See here for more of these top-level options.
Set the YAML header to this specification (change Georgia to another font if it is not installed on your computer):
---
title: "Texas housing sales, 2000-2015"
output:
  pdf_document:
    latex_engine: xelatex
mainfont: Georgia
params:
  spotlight: "Houston"
---
Remember that Markdown was designed for HTML, so formatting the document exactly as you want will usually be more difficult in a LaTeX PDF document.
You may find that images are placed where you didn’t intend, particularly if there are several consecutive images. This behavior can be difficult to control. It is recommended that you complete the content of the document before trying to fine tune placement of images.
Pay attention to warnings and messages in the R Markdown Console, not the regular R console (when looking for rendering errors).
Name code chunks to help identify where code is breaking.
If your markdown tags at the beginning of a line don’t seem to be working, try adding a newline before the tags.
Use Notebooks to isolate where the code is not working.
Provide the original source file.
HTML documents are usually more accessible than PDF.
Label all graphics with alt text tag and try to use informative text.
Use #
to organize content by creating section headers, not to control size of the font.
Use LaTeX for inserting mathematical content.
R Markdown: The Definitive Guide The free, most comprehensive guide to R Markdown, written by its creators. Most of the content of this seminar can be found in this book.
R Markdown Cookbook Another free guide, and shares a primary author with R Markdown: The Definitive Guide (Yihui Xie). More advanced, explaining the usage of R Markdown through short examples and providing solutions to common questions.
The R Markdown Cheatsheet and R Markdown Reference guide, accessible through the Help
-> Cheatsheets
menus, are handy references covering most of the basics.
Knitr Chunk Options Reference page has a full list of knitr
chunk options to control R output.
R Markdown is easy enough to use that a little experience with each of the coding frameworks will give you sufficient flexibility to create reports in different formats with widely varying appearances.
You won’t find a report-generating system nearly as powerful and easy-to-use as R Markdown in any other statistical software.
Go see the R Markdown website or The Definitive Guide to see even more uses of R Markdown, including the generation of books, blog pages, and interactive documents.