{"id":647,"date":"2016-12-01T21:49:36","date_gmt":"2016-12-01T21:49:36","guid":{"rendered":"http:\/\/tech.me.holycross.edu\/?p=647"},"modified":"2016-12-01T21:49:36","modified_gmt":"2016-12-01T21:49:36","slug":"rstudio-as-a-research-and-writing-platform","status":"publish","type":"post","link":"https:\/\/blogs.holycross.edu\/tech\/2016\/12\/01\/rstudio-as-a-research-and-writing-platform\/","title":{"rendered":"RStudio as a Research and Writing Platform"},"content":{"rendered":"<p>R (<a href=\"https:\/\/www.r-project.org\/\">r-project.org<\/a>) is a programming language and software platform for statistical computing and graphics, widely used in academia and industry (see <a href=\"http:\/\/tech.me.holycross.edu\/2015\/03\/17\/introduction-to-r\/\">Introduction to R<\/a>). <a href=\"https:\/\/www.rstudio.com\/\">RStudio<\/a> is an <a href=\"https:\/\/en.wikipedia.org\/wiki\/Integrated_development_environment\">integrated development environment<\/a> for R. RStudio makes R easier to use, and it also enables the creation and rendering of <a href=\"https:\/\/en.wikipedia.org\/wiki\/Plain_text\">plain-text<\/a> documents that contain embedded R code. With RStudio, you can encapsulate the code and data for your analysis within the text of your paper, fostering research transparency and replicability of results. 
An increasing number of scholarly journals are requiring that authors submit such replication materials as a condition of publication (see, for example, <a href=\"https:\/\/ajps.org\/2015\/03\/26\/the-ajps-replication-policy-innovations-and-revisions\/\">The AJPS Replication Policy: Innovations and Revisions<\/a>), and are providing guidelines for data archiving in support of reproducible research (e.g., <a href=\"http:\/\/biostatistics.oxfordjournals.org\/content\/10\/3\/405.full.pdf\">Reproducible research and Biostatistics<\/a> and <a href=\"http:\/\/isps.yale.edu\/news\/blog\/2013\/07\/the-role-of-data-repositories-in-reproducible-research\">The Role of Data Repositories in Reproducible Research<\/a>).<\/p>\n<p><!--more--><\/p>\n<p>RStudio can also be used to insert literature citations into your text and produce formatted bibliographies, using <a href=\"http:\/\/rmarkdown.rstudio.com\">R Markdown<\/a>, an R-flavored variant of the <a href=\"https:\/\/daringfireball.net\/projects\/markdown\/\">Markdown<\/a> language, and the <a href=\"http:\/\/www.bibtex.org\/\">BibTeX bibliographic system<\/a>. RStudio has also recently developed <a href=\"http:\/\/rmarkdown.rstudio.com\/r_notebooks.html\">R Notebooks<\/a>, which are R Markdown documents that provide a rich workflow for interactive data analysis. R Markdown documents and R Notebooks both can be rendered into publication-quality output in a variety of formats, including HTML, PDF, and Microsoft Word. All of these tools are free and will run on any computer platform.<\/p>\n<h4 id=\"toc_1\">Reproducible Research<\/h4>\n<p>In an <a href=\"https:\/\/channel9.msdn.com\/Events\/useR-international-R-User-conference\/useR2016\/Notebooks-with-R-Markdown\">18-minute video<\/a>, J.J. Allaire, Founder and CEO of RStudio, states:<\/p>\n<blockquote><p>Those who receive the results of modern data analysis have limited opportunity to verify the results by direct observation. 
Users of the analysis have no option but to trust the analysis, and by extension the software that produced it. This places an obligation on all creators of software to program in such a way that the computations can be understood and trusted.<\/p><\/blockquote>\n<p>This leads to the concept of <a href=\"https:\/\/www.coursera.org\/learn\/reproducible-research\">reproducible research<\/a>, &#8220;the idea that data analyses, and more generally, scientific claims, are published with their data and software code so that others may verify the findings and build upon them. The need for reproducibility is increasing dramatically as data analyses become more complex, involving larger datasets and more sophisticated computations. Reproducibility allows for people to focus on the actual content of a data analysis, rather than on superficial details reported in a written summary. In addition, reproducibility makes an analysis more useful to others because the data and code that actually conducted the analysis are available.&#8221;<\/p>\n<p>The author of <a href=\"https:\/\/www.r-bloggers.com\/what-is-reproducible-research\/\">What is reproducible research?<\/a> lists the following criteria:<\/p>\n<blockquote><p>A study can be truly reproducible when it satisfies at least the following three criteria.<\/p>\n<p>\u2013 All methods are fully reported.<\/p>\n<p>\u2013 All data and files used for the analysis are (publicly) available.<\/p>\n<p>\u2013 The process of analyzing raw data is well reported and preserved.<\/p><\/blockquote>\n<p>An excellent reference is <a href=\"https:\/\/www.crcpress.com\/Reproducible-Research-with-R-and-R-Studio-Second-Edition\/Gandrud\/p\/book\/9781498715379\">Reproducible Research with R and RStudio, Second Edition<\/a> by Christopher Gandrud. The author has freely provided this book <a href=\"https:\/\/github.com\/christophergandrud\/Rep-Res-Book\">in reproducible form<\/a>. 
Pre-compiled PDF versions can also be found in various internet locations, such as <a href=\"https:\/\/englianhu.files.wordpress.com\/2016\/01\/reproducible-research-with-r-and-studio-2nd-edition.pdf\">here<\/a>.<\/p>\n<p>This post will demonstrate the use of RStudio as a platform for the production of transparent, reproducible research. RStudio facilitates a form of the <a href=\"http:\/\/tech.me.holycross.edu\/2015\/11\/18\/theplaintextworkflow\/\">plain-text workflow<\/a> in which you can write, cite the literature and produce formatted bibliographies, perform statistical analyses, create graphics, and execute code in R and several other programming languages, all from one, plain-text document. Because the document contains only plain text, it is <a href=\"http:\/\/doycetesterman.com\/index.php\/2014\/12\/my-plain-text-workflow\/\">futureproof<\/a>, easily archived and shared, can be edited on any type of computing device, and is fully compatible with <a href=\"https:\/\/en.wikipedia.org\/wiki\/Version_control\">version control systems<\/a>.<\/p>\n<p><!--nextpage--><\/p>\n<h4 id=\"toc_2\">Software Installation<\/h4>\n<p>To use RStudio, you first have to install R. DO NOT install RStudio first. R has to be installed first, followed by RStudio.<\/p>\n<p>To install R, go to the <a href=\"http:\/\/www.r-project.org\/\">R Web site<\/a>. Under &#8220;Getting Started:&#8221;, click <a href=\"http:\/\/cran.r-project.org\/mirrors.html\">download R<\/a>. Choose a CRAN (<strong>C<\/strong>omprehensive <strong>R<\/strong>\u00a0<strong>A<\/strong>rchive <strong>N<\/strong>etwork) mirror site that is closest to you. 
This should bring you to the R download\u00a0page, the relevant part of which will look something like this:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/cran-e1480686858474.png\" alt=\"\" \/><\/p>\n<p>Under &#8220;Download and Install R,&#8221; download and run the installer (&#8220;Precompiled binary distribution&#8221;) for your particular computer platform. Follow the installation instructions and you should be off and running.\u00a0Extensive gory details can be found at <a href=\"http:\/\/cran.r-project.org\/doc\/manuals\/r-release\/R-admin.html\">R Installation and Administration<\/a>.<\/p>\n<p>After installing R, there should be an R icon somewhere on your computer system, or perhaps an entry in an Applications folder or start menu. When you run R, you will be brought to the R console, which looks like this on an iMac:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/r.png\" alt=\"\" \/><\/p>\n<p>After briefly studying the R console, terminate R and forget about it, because we will be using R from inside of RStudio. We will not dwell on the details of using R; for that, please see <a href=\"http:\/\/tech.me.holycross.edu\/2015\/03\/17\/introduction-to-r\/\">Introduction to R<\/a>. We will focus instead on RStudio, which is what you need to install next. You must have R installed before you can use RStudio, but once RStudio is installed you do not need to have R running, as RStudio contains its own instance of R.<\/p>\n<p>To install RStudio, go to <a href=\"https:\/\/www.rstudio.com\/products\/rstudio\/download\/\">the RStudio Desktop download site<\/a>, and download and run the installer for your particular computer platform. 
After installation, run RStudio, and you should see something like this:<\/p>\n<p><a href=\"http:\/\/live-hcblog.pantheonsite.io\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/rstudio2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-650\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/rstudio2-300x216.png\" alt=\"rstudio2\" width=\"300\" height=\"216\" srcset=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/rstudio2-300x216.png 300w, https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/rstudio2-1024x736.png 1024w, https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/rstudio2-768x552.png 768w, https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/rstudio2.png 1365w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>(Click to enlarge.)<\/p>\n<p>The above screenshot is from an iMac, but Windows and Linux users should find it comparable to RStudio running on their systems.<\/p>\n<p>For PDF output in RStudio, you also need to install a version of the <a href=\"https:\/\/en.wikipedia.org\/wiki\/TeX\">TeX typesetting system<\/a>. 
Specifically, you need to install either <a href=\"http:\/\/www.miktex.org\/\">MiKTeX<\/a> on Windows, <a href=\"http:\/\/www.tug.org\/mactex\/\">MacTeX<\/a> 2013+ on OS X\/macOS (best to download with Safari, and use the full version, not the smaller BasicTeX), or <a href=\"http:\/\/www.tug.org\/texlive\/\">TeX Live<\/a> 2013+ on Linux.<\/p>\n<p><!--nextpage--><\/p>\n<h4>Rendering Documents in RStudio<\/h4>\n<p>We will first examine RStudio as a platform for writing plain-text <a href=\"http:\/\/rmarkdown.rstudio.com\/\">R Markdown<\/a> documents, inserting bibliographic citations, producing formatted bibliographies, and rendering the Markdown document into publication-quality output.<\/p>\n<p>Markdown is a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Lightweight_markup_language\">lightweight markup language<\/a> with a <a href=\"https:\/\/daringfireball.net\/projects\/markdown\/syntax\">simple syntax<\/a> designed to streamline the process of formatting and rendering plain-text documents. (See <a href=\"http:\/\/tech.me.holycross.edu\/2015\/11\/18\/theplaintextworkflow\/\">The Plain Text Workflow<\/a> for a discussion of why you should be doing all of your writing in plain text instead of in a word processor.) While they can be rendered into many different publication-ready formats, including PDF, HTML, and Microsoft Word, Markdown files can stand on their own as human-readable text documents without being rendered. This is a big advantage for archiving and sharing, because no special software is needed to read Markdown and R Markdown files. Any plain-text editor\u00a0will do (see <a href=\"https:\/\/en.wikipedia.org\/wiki\/List_of_text_editors\">List of text editors<\/a>), and every computer already comes with one installed (TextEdit for Mac, Notepad for Windows, Emacs for Linux\/Unix, etc.). 
RStudio has developed <a href=\"http:\/\/rmarkdown.rstudio.com\/lesson-1.html\">R Markdown<\/a>, which preserves the syntax of the original Markdown but also allows the inclusion of blocks (&#8220;chunks&#8221;) of R code and code from several other programming, database, and scripting languages. R Markdown also has enhancements for tables, footnotes, citations, and other features of scholarly documents. As with the original Markdown, R Markdown is plain-text and human-readable, meaning that &#8220;<a href=\"https:\/\/druedin.com\/2015\/12\/25\/why-knitr-beats-sweave\/\">anyone who has never even heard of R Markdown can understand what is happening to some extent<\/a>.&#8221;<\/p>\n<p>Two handy PDF references for R Markdown are the <a href=\"https:\/\/www.rstudio.com\/wp-content\/uploads\/2016\/03\/rmarkdown-cheatsheet-2.0.pdf\">R Markdown Cheat Sheet<\/a> and the <a href=\"https:\/\/www.rstudio.com\/wp-content\/uploads\/2015\/03\/rmarkdown-reference.pdf\">R Markdown Reference Guide<\/a>. These two documents cover all that you will ever need to know about R Markdown syntax, options, and output formats. Here we will focus on the basic features that you will need to get started with RStudio and R Markdown.<\/p>\n<blockquote><p>As you work in RStudio, it&#8217;s possible that you will get messages about one or more R packages (for example, <code>rmarkdown<\/code>) not being installed. If this happens, just go to the Packages tab and install them. Click the Install button, search for the package name, be sure <code>Install dependencies<\/code> is checked, and install the missing packages. RStudio may also present a message saying that it wants to install required or updated packages. Say Yes.<\/p><\/blockquote>\n<p>RStudio provides templates for both R Markdown and R Notebooks. 
In RStudio, click the File menu, then select New File:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/newFile.png\" alt=\"\" \/><\/p>\n<p>Choose R Markdown, give the document a title, and a text editor window will then open containing the R Markdown template:<\/p>\n<p><a href=\"http:\/\/live-hcblog.pantheonsite.io\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/rstudio3.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-652\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/rstudio3-300x247.png\" alt=\"rstudio3\" width=\"300\" height=\"247\" srcset=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/rstudio3-300x247.png 300w, https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/rstudio3-1024x842.png 1024w, https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/rstudio3-768x632.png 768w, https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/rstudio3.png 1380w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>(Click to enlarge.)<\/p>\n<p>&nbsp;<\/p>\n<p>You should now have 4 panes open in RStudio. The content of the panes can be customized under Pane Layout in RStudio Preferences. What you see in the above screenshot is the default layout. The upper-left pane is a text editor containing the R Markdown document we just created. This editor can also be used to create R scripts and various other plain-text source files. Below the editor is a pane containing the R Console. This is exactly the same console that you would get if you ran R by itself, independently from RStudio. (The Console pane contains an instance of R running inside of RStudio. You can have multiple instances of RStudio running at once, each running its own separate instance of R.) 
You can enter R commands directly into the Console, and text output from R commands will also appear here. In the upper right pane are tabs for the Environment (containing variables and other data structures created during the session) and History (a list of all R commands entered during the session). In the lower-right pane are tabs for Files (your computer&#8217;s filesystem), Plots (graphics created by R), Packages (a tool for installing and updating <a href=\"http:\/\/www.statmethods.net\/interface\/packages.html\">R packages<\/a>), Help (the <a href=\"https:\/\/www.r-project.org\/help.html\">R help system<\/a>), and the Viewer (containing output from rendering R Markdown files and R Notebooks).<\/p>\n<p>Of course, there is a very nice <a href=\"https:\/\/www.rstudio.com\/wp-content\/uploads\/2016\/01\/rstudio-IDE-cheatsheet.pdf\">RStudio Cheat Sheet<\/a>. The RStudio Help menu has links to additional very nice cheat sheets:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/cheatsheets.png\" alt=\"\" \/><\/p>\n<p>You can also access a Markdown Quick Reference from the RStudio Help menu. This will open up in the RStudio Help tab. You can copy-paste Markdown syntax from the Quick Reference into your R Markdown document. (Sadly, RStudio does not have Markdown syntax built into its editor, at least at the time of this writing.)<\/p>\n<p>The first part of the as-yet-unnamed R Markdown file looks like this:<\/p>\n<div>\n<pre><code class=\"language-none\">---\ntitle: \"An R Markdown File\"\nauthor: \"RAL\"\ndate: \"10\/14\/2016\"\noutput: html_document\n---<\/code><\/pre>\n<\/div>\n<p>This is called the <a href=\"https:\/\/en.wikipedia.org\/wiki\/YAML\">YAML<\/a> header. YAML (<strong>Y<\/strong>AML <strong>A<\/strong>in&#8217;t <strong>M<\/strong>arkup <strong>L<\/strong>anguage) is <a href=\"http:\/\/yaml.org\/\">a human friendly data serialization standard for all programming languages<\/a>. 
In R Markdown, the YAML header (which is optional) is placed at the top of the document between lines that start with three dashes (<code>---<\/code>) and contains <a href=\"https:\/\/en.wikipedia.org\/wiki\/Metadata\">metadata<\/a> for the document (e.g., title, author, date) and other options that control how the document is rendered.<\/p>\n<p>Besides the YAML header, R Markdown documents contain either <a href=\"http:\/\/rmarkdown.rstudio.com\/lesson-8.html\">text<\/a> or <a href=\"http:\/\/rmarkdown.rstudio.com\/lesson-3.html\">code chunks<\/a>. Text is formatted using <a href=\"https:\/\/www.rstudio.com\/wp-content\/uploads\/2015\/03\/rmarkdown-reference.pdf\">R Markdown syntax<\/a>. R code chunks are made with three backticks immediately followed on the same line by an <code>r<\/code> in braces. End the chunk with three more backticks, on a separate line:<\/p>\n<div>\n<pre><code class=\"language-none\">```{r}\npaste(\"Hello\", \"World!\")\n\n``` <\/code><\/pre>\n<\/div>\n<p>The backtick symbol is located to the left of the numeral 1 on your keyboard. 
It is NOT an apostrophe (<code>'<\/code>).<\/p>\n<p>Additional chunk options can be placed inside the braces, for example:<\/p>\n<div>\n<pre><code class=\"language-none\">```{r, echo=FALSE}\nsummary(cars)\n```<\/code><\/pre>\n<\/div>\n<p>The chunk option <code>echo=FALSE<\/code> displays the output of a code chunk but hides the underlying R code.<\/p>\n<p>All of the R code for a chunk must be placed between the two lines of backticks.<\/p>\n<p>At the top of the editor window is a toolbar:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/toolbar.png\" alt=\"\" \/><\/p>\n<p>Drop down the menu next to the <code>Knit<\/code> icon, and you will see options for rendering the R Markdown document into publication-quality output:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/knit.png\" alt=\"\" \/><\/p>\n<p>This function is called <code>Knit<\/code> because it invokes an R package called <a href=\"http:\/\/yihui.name\/knitr\/\">knitr<\/a> that &#8220;knits&#8221; the Markdown-formatted text, relevant YAML content, and R code chunks together into a rendered document.<\/p>\n<p>Try knitting the document into HTML, PDF, and Word formats. You will first need to give the R Markdown file a name, if you haven&#8217;t already done so. The default R Markdown file name extension is <code>.Rmd<\/code>. The HTML rendering will appear in the RStudio Viewer pane. If it appears in a separate window, drop down the gear icon next to the <code>Knit<\/code> button and select <code>Preview in Viewer Pane<\/code>. PDF output will open in a separate PDF viewer window. When you render the document to Microsoft Word format, it will open up in Microsoft Word. 
HTML, PDF, and Word files having the same name as your R Markdown document will be saved in your working directory.<\/p>\n<p>Let&#8217;s look at an HTML rendering:<\/p>\n<p><a href=\"http:\/\/live-hcblog.pantheonsite.io\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/html.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-656\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/html-300x238.png\" alt=\"html\" width=\"300\" height=\"238\" \/><\/a><\/p>\n<p>(Click to enlarge).<\/p>\n<p>&nbsp;<\/p>\n<p>In the above screenshot I have maximized the Editor and Viewer panes so we can see them side-by-side. The title, author, and date contained in the YAML header appear in the rendered document, but the YAML statement <code>output: html_document<\/code> does not because it is a formatting command, not printable text. Regular text appears in the HTML as formatted by <a href=\"https:\/\/www.rstudio.com\/wp-content\/uploads\/2015\/03\/rmarkdown-reference.pdf\">R Markdown syntax<\/a>. Text and graphical output produced by R code chunks appears in the rendered document at the point where the code chunks were placed in the original R Markdown document. Display of the code itself can be turned off in the rendered document by use of the <code>echo=FALSE<\/code> parameter in the code chunk:<\/p>\n<div>\n<pre><code class=\"language-none\">```{r pressure, echo=FALSE}\nplot(pressure)\n```<\/code><\/pre>\n<\/div>\n<p>If you knit the R Markdown document to HTML, PDF, and Word formats, you will notice that the YAML header in the R Markdown document has been modified:<\/p>\n<div>\n<pre><code class=\"language-none\">---\ntitle: \"An R Markdown File\"\nauthor: \"RAL\"\ndate: \"10\/14\/2016\"\noutput:\n  word_document: default\n  pdf_document: default\n  html_document: default\n---<\/code><\/pre>\n<\/div>\n<p>RStudio inserts additional formatting parameters into the YAML header to control how the document is rendered. 
You can do this manually as well, simply by editing the YAML header directly, in the editor. For example, to control the size of R graphics in Microsoft Word, add this to the YAML header:<\/p>\n<div>\n<pre><code class=\"language-none\">output:\n  word_document:\n    fig_width: 3\n    fig_height: 3<\/code><\/pre>\n<\/div>\n<p>This will cause figures in the resulting Word file to be sized at 3 by 3 inches. You can also set YAML options interactively, by clicking the gear icon to the right of the <code>Knit<\/code> button, and choosing <code>Output Options<\/code>:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/options1.png\" alt=\"\" \/><\/p>\n<p>This will give you a dialog box from which you can choose the various output formats and set things like figure size, inclusion of figure captions, etc.:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/options2.png\" alt=\"\" \/><\/p>\n<p>When you make selections from the Output Options dialog, RStudio will make the appropriate changes to the YAML header in the R Markdown document. You can also edit those parameters directly in the YAML header, as mentioned earlier.<\/p>\n<p>See <a href=\"http:\/\/rmarkdown.rstudio.com\/formats.html\">RStudio Formats<\/a> for more formats and YAML options for rendering R Markdown documents.<\/p>\n<p><!--nextpage--><\/p>\n<h4 id=\"toc_4\">Writing and Citing in RStudio<\/h4>\n<p>For this example I am using a new R Markdown file, created using the RStudio template as demonstrated previously, but with everything deleted except the YAML header. I have titled the document &#8220;<a href=\"https:\/\/sora.unm.edu\/sites\/default\/files\/journals\/jfo\/v063n04\/p0411-p0419.pdf\">Our Friend the Catbird<\/a>&#8221; and have also deleted the YAML <code>author<\/code> and <code>date<\/code> fields to keep things simple. 
I have named the file <code>citations.Rmd<\/code>, but any file name will do as long as you use the <code>.Rmd<\/code> filename extension.<\/p>\n<p>Thus the YAML header currently looks like this:<\/p>\n<div>\n<pre><code class=\"language-none\">---\ntitle: \"Our Friend the Catbird\"\noutput: html_document\n---<\/code><\/pre>\n<\/div>\n<p>In RStudio, inserting literature citations and creating formatted bibliographies in R Markdown documents is facilitated by an R package called <a href=\"https:\/\/cran.r-project.org\/web\/packages\/citr\/index.html\">citr<\/a>. To install it, go to the Packages tab in RStudio, click the <code>Install<\/code> button, enter <code>citr<\/code> into the search bar, be sure that <code>Install dependencies<\/code> is checked, and then click <code>Install<\/code>:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/citr.png\" alt=\"\" \/><\/p>\n<p>If you then click the <code>Addins<\/code> button on the main RStudio toolbar, you should see an entry for <code>Insert citations<\/code>:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/addins.png\" alt=\"\" \/><\/p>\n<p>If you don&#8217;t see <code>Insert citations<\/code>, be sure that <code>citr<\/code> is checked in the list of installed packages in the <code>Packages<\/code> tab. 
You might have to restart RStudio after checking <code>citr<\/code> in the package list.<\/p>\n<p>Before we can use citr, we need a <a href=\"http:\/\/www.bibtex.org\/Format\/\">BibTeX bibliography file<\/a>.<\/p>\n<p>Here&#8217;s one: <a href=\"http:\/\/college.holycross.edu\/faculty\/rlent\/references.bib\">references.bib<\/a>.<\/p>\n<p>To follow along with this example, click the above link and download <code>references.bib<\/code> to your RStudio working directory, or some other location where you can find it.<\/p>\n<p>BibTeX files are plain text and can be created by the major reference manager applications, such as <a href=\"https:\/\/refworks.proquest.com\/researcher\/\">RefWorks<\/a>, <a href=\"http:\/\/endnote.com\/\">EndNote<\/a>, <a href=\"https:\/\/www.mendeley.com\/\">Mendeley<\/a>, and <a href=\"https:\/\/www.zotero.org\/\">Zotero<\/a>. Many other reference manager applications, e.g. <a href=\"http:\/\/www.jabref.org\/\">JabRef<\/a>, use BibTeX as their native format; see <a href=\"https:\/\/en.wikipedia.org\/wiki\/Comparison_of_reference_management_software\">Comparison of reference management software<\/a>. The <a href=\"https:\/\/en.wikipedia.org\/wiki\/BibTeX\">BibTeX<\/a> format is widely used and has been around for a long time.<\/p>\n<p>A BibTeX reference entry looks like this:<\/p>\n<div>\n<pre><code class=\"language-none\">@article{marsh_adaptations_1984,\n  title = {Adaptations of the {{Gray Catbird}} to Long Distance Migration: Flight Muscle Hypertrophy Associated with Elevated Body Mass},\n  volume = {57},\n  timestamp = {2016-03-12T15:29:37Z},\n  number = {1},\n  journaltitle = {Physiological Zoology},\n  author = {Marsh, R. 
L.},\n  date = {1984},\n  pages = {105--117}\n}<\/code><\/pre>\n<\/div>\n<p>All reference manager applications break bibliographic citations into a series of fields that can be reassembled into any citation style desired, such as that of the journal <em>Ecology<\/em>:<\/p>\n<div>\n<pre><code class=\"language-none\">Marsh, R. L. 1984. Adaptations of the Gray Catbird to long distance migration: Flight muscle hypertrophy associated with elevated body mass. Physiological Zoology 57:105\u2013117.<\/code><\/pre>\n<\/div>\n<p>The <code>references.bib<\/code> file used in this example was created by exporting references from Zotero using the <a href=\"https:\/\/www.zotero.org\/support\/dev\/translators\">BibTeX export translator<\/a>.<\/p>\n<p>The YAML header can contain the name of a BibTeX file to use for literature citations. Thus we add <code>bibliography: references.bib<\/code> to our YAML header so that it now looks something like this:<\/p>\n<div>\n<pre><code class=\"language-none\">---\ntitle: \"Our Friend the Catbird\"\noutput: html_document\nbibliography: references.bib\n---<\/code><\/pre>\n<\/div>\n<p>This assumes that <code>references.bib<\/code> is in the same folder as the R Markdown file. But you could also store <code>references.bib<\/code> in some other location, as long as you provide the complete path to the file, for example:<\/p>\n<div>\n<pre><code class=\"language-none\">bibliography: \/Users\/rlent\/Desktop\/references.bib<\/code><\/pre>\n<\/div>\n<p>You can also provide multiple BibTeX files in the YAML header by listing them like this:<\/p>\n<div>\n<pre><code class=\"language-none\">bibliography: [statistics.bib, graphics.bib]<\/code><\/pre>\n<\/div>\n<p>And so, with <code>bibliography: references.bib<\/code> added to the YAML header of <code>citations.Rmd<\/code>, and with the <code>references.bib<\/code> file residing in the same folder as <code>citations.Rmd<\/code>, click <code>Addins<\/code>, then select <code>Insert citations<\/code>. 
A dialog box should appear:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/citations1.png\" alt=\"\" \/><\/p>\n<p>Note that we get an acknowledgement that <code>references.bib<\/code> was found in the YAML header. An error message will appear in the search bar if there was a problem finding the bibliography file. If you click in the search bar, where it says <code>Search terms<\/code>, you should see a scrollable list of the references contained in <code>references.bib<\/code>:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/citations2.png\" alt=\"\" \/><\/p>\n<p>You can select one or more references from the list, by clicking on them, and the selected references will be added to a separate box above the scrollable list:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/citations4.png\" alt=\"\" \/><\/p>\n<p>When you are finished selecting references to cite, click in a blank area of the dialog box, and <code>citr<\/code> will build a citation marker to insert into your text:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/citations5.png\" alt=\"\" \/><\/p>\n<p>The citation marker consists of one or more BibTeX <em>citation keys<\/em>, each beginning with the <code>@<\/code> symbol, each separated by semicolons, with everything enclosed in square brackets. 
The <code>citr<\/code> tool takes care of this in-text formatting automatically; all you have to do is select the references you want to cite.<\/p>\n<p>Each reference in a BibTeX file contains a citation key at the beginning of the entry:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/bibtex.png\" alt=\"\" \/><\/p>\n<p>The citation key uniquely identifies each reference in the BibTeX file and is used as a place marker for in-text citations in the R Markdown file.<\/p>\n<p>By default, RStudio will use a <a href=\"http:\/\/www.chicagomanualofstyle.org\/tools_citationguide.html\">Chicago author-date format<\/a> for citations and references. To use another style, you need to specify a CSL (<a href=\"http:\/\/citationstyles.org\/\">Citation Style Language<\/a>) style file in a <code>csl<\/code> metadata field in your YAML header. CSL styles are plain-text and are written in <a href=\"https:\/\/en.wikipedia.org\/wiki\/XML\">XML<\/a>. You can select and download over 8300 CSL bibliographic styles from the <a href=\"https:\/\/www.zotero.org\/styles\">Zotero Style Repository<\/a>.<\/p>\n<p>Let&#8217;s first try citing using the default style. This means that our YAML header looks like this:<\/p>\n<div>\n<pre><code class=\"language-none\">---\ntitle: \"Our Friend the Catbird\"\noutput: html_document\nbibliography: references.bib\n---<\/code><\/pre>\n<\/div>\n<p>Here the YAML header only specifies the BibTeX file, and does not specify a particular CSL style file. 
Therefore, the default Chicago author-date style will be used.<\/p>\n<p>We type some text and leave our cursor at the point in the text where we want to insert a citation:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/citations6.png\" alt=\"\" \/><\/p>\n<p>We then click <code>Addins|Insert citations<\/code>, and search for a reference to cite:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/citations7.png\" alt=\"\" \/><\/p>\n<p>Be sure <code>In parentheses<\/code> is checked (this adds the square brackets), click <code>Insert citation<\/code>, and the citation marker will be inserted into your text at the current cursor position:<\/p>\n<div>\n<pre><code class=\"language-none\">The catbird is one of our most beloved songbirds [@bent_life_1948].<\/code><\/pre>\n<\/div>\n<p>Now, if we knit the R Markdown document to HTML, it looks like this:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/citations8.png\" alt=\"\" \/><\/p>\n<p>RStudio, via an application called <a href=\"http:\/\/pandoc.org\/\">pandoc<\/a> (automatically installed with RStudio; see also <a href=\"http:\/\/rmarkdown.rstudio.com\/authoring_pandoc_markdown.html\">Pandoc Markdown<\/a>), has changed the citation marker to the appropriate in-text citation style (in this case, the default Chicago author-date style) and has created a formatted bibliography, also in the Chicago style. (The <code>citr<\/code> package only inserts the citation marker; pandoc does the formatting.) 
The bibliography is always placed at the end of the document; I had already inserted a Markdown heading (<code>#### Literature Cited<\/code>) to create a <em>Literature Cited<\/em> section.<\/p>\n<p>If we want a specific bibliographic style other than the default Chicago style, we need to add a <code>csl<\/code> metadata entry to our YAML header:<\/p>\n<div>\n<pre><code class=\"language-none\">---\ntitle: \"Our Friend the Catbird\"\noutput: html_document\nbibliography: references.bib\ncsl: nature.csl\n---<\/code><\/pre>\n<\/div>\n<p>The <code>csl: nature.csl<\/code> YAML entry points to a CSL style file called <code>nature.csl<\/code> that was downloaded from the <a href=\"https:\/\/www.zotero.org\/styles\">Zotero Style Repository<\/a>. This is the style used in the journal <a href=\"http:\/\/www.nature.com\/nature\/index.html\">Nature<\/a>.<\/p>\n<p>(Here is <a href=\"https:\/\/www.zotero.org\/styles\/nature\">nature.csl<\/a> if you want to download it for this example.)<\/p>\n<p>Now, when we <code>knit<\/code> to HTML, we get:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/citations9.png\" alt=\"\" \/><\/p>\n<p>The rendered HTML has been reformatted to follow the bibliographic style specified by <code>nature.csl<\/code>. In-text citations are now numbered superscripts, and the bibliography is also numbered and organized in citation order instead of alphabetically by author&#8217;s last name. (This would be more obvious if we had more than one citation.)<\/p>\n<p>See <a href=\"http:\/\/rmarkdown.rstudio.com\/authoring_bibliographies_and_citations.html#bibliography_placement\">Bibliographies and Citations<\/a> for more information on citing and creating bibliographies in RStudio. 
For example, using an author-date style like Chicago, if you put a minus sign before the opening <code>@<\/code> of a citation key in the text, like this:<\/p>\n<div>\n<pre><code class=\"language-none\">[-@bent_life_1948]<\/code><\/pre>\n<\/div>\n<p>you can suppress the author&#8217;s name in the citation. Now we can write a sentence that reads:<\/p>\n<div>\n<pre><code class=\"language-none\">Arthur Bent (1948) said that the catbird is one of our most beloved songbirds.    <\/code><\/pre>\n<\/div>\n<p><a href=\"http:\/\/svmiller.com\/blog\/2016\/02\/svm-r-markdown-manuscript\/\">An R Markdown Template for Academic Manuscripts<\/a> provides more useful tips on YAML and R Markdown documents.<\/p>\n<p><!--nextpage--><\/p>\n<h4 id=\"toc_5\">R Notebooks<\/h4>\n<p>An <a href=\"http:\/\/rmarkdown.rstudio.com\/r_notebooks.html\">R Notebook<\/a> is an R Markdown document with a special execution mode for interactive data analysis. Any R Markdown document can be used as a notebook, and R Notebooks can be rendered into the same publication-quality document formats as regular R Markdown files. By default, RStudio enables notebook mode on all R Markdown documents, so you can interact with any R Markdown document as though it were a notebook.\u00a0R Notebooks, however, offer additional features that are not available in regular R Markdown documents.<\/p>\n<p>In RStudio, create a new R Notebook by clicking the File menu, then New File, and then select R Notebook. For this example we will save the notebook and give it a file name of <code>MyNotebook.Rmd<\/code>. 
Your editor pane should now look something like this:<\/p>\n<p><a href=\"http:\/\/live-hcblog.pantheonsite.io\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/notebook1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-670\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/notebook1-300x248.png\" alt=\"notebook1\" width=\"300\" height=\"248\" \/><\/a><\/p>\n<p>(Click to enlarge.)<\/p>\n<p>&nbsp;<\/p>\n<p>This is the RStudio template for an R Notebook. Kind of looks like regular R Markdown, doesn&#8217;t it? That&#8217;s because an R Notebook <em>is<\/em> an R Markdown document with code chunks that can be executed independently and interactively. Text and graphical output will appear immediately beneath the code chunk that produced it, in the editor window.<\/p>\n<p>You can execute your notebook code chunks interactively by using the controls that appear in each chunk:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/notebook3.png\" alt=\"\" \/><\/p>\n<p>The rightmost arrow will run the current chunk. To the left of the run arrow is a down-pointing arrow that will run all of the chunks above the current chunk. The gear icon allows you to modify options for how each code chunk behaves. 
These same controls are also available in regular R Markdown documents.<\/p>\n<p>More options for running code chunks can be found in the <code>Run<\/code> menu on the editor toolbar:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/run.png\" alt=\"\" \/><\/p>\n<p>A feature unique to R Notebooks is notebook Preview:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/notebook2.png\" alt=\"\" \/><\/p>\n<p>While an R Notebook preview looks similar to a rendered R Markdown document, the notebook preview does not automatically execute and knit all of your code chunks, which is what happens when you render an R Markdown document. A notebook preview simply shows you a rendered copy of the Markdown text in your R Notebook along with the most recent chunk output. This allows you to efficiently develop R code in an R Notebook by iterating back and forth between coding and output until the code chunk is completed, without having to render the entire document each time you want to look at the output of a single code chunk.<\/p>\n<p>Try previewing the R Notebook template <em>before<\/em> running the embedded code chunk. You&#8217;ll see just the text of the notebook in the Viewer pane. Then run the code chunk by clicking the Run arrow. A plot of the <code>cars<\/code> dataset (which comes with R) will appear immediately below the code chunk, not in the Viewer pane, but right in the editor window. In the upper right corner of the plot window are tools for clearing the output, expanding or collapsing it (without clearing it), and for opening the graphic in an external window. 
If you leave the plot displayed, and then preview the notebook, you will now see the graphic included with the rendered text.<\/p>\n<p>The YAML header in our R Notebook looks like this:<\/p>\n<div>\n<pre><code class=\"language-none\">---\ntitle: \"R Notebook\"\noutput: html_notebook\n---<\/code><\/pre>\n<\/div>\n<p>The <code>output: html_notebook<\/code> statement in the YAML header is what turns a regular R Markdown document into an R Notebook. So if you start out with an R Markdown document and then decide that you want to &#8220;upgrade&#8221; it to a notebook, just add <code>output: html_notebook<\/code> to the YAML header. This will turn your R Markdown document into an R Notebook and will also turn the <code>Knit<\/code> button into a <code>Preview<\/code> button. You will still have the option to knit the notebook completely into publication-quality output with all R text and graphical output. Just pull down the <code>Preview<\/code> button and you will see the knit options for HTML, PDF, and Word output.<\/p>\n<p>Now that you have previewed <code>MyNotebook.Rmd<\/code> with the chunk output displayed, note that there is a file named <code>MyNotebook.nb.html<\/code> in your working directory. As discussed <a href=\"http:\/\/rmarkdown.rstudio.com\/r_notebooks.html#saving_and_sharing\">here<\/a>, when a notebook <code>.Rmd<\/code> file is saved, the <code>output: html_notebook<\/code> statement in the YAML header causes an <code>.nb.html<\/code> file having the same name as the notebook to be saved as well (the <code>nb<\/code> stands for <strong>n<\/strong>ote<strong>b<\/strong>ook). This file is a self-contained HTML document that holds both a rendered copy of the notebook with all current chunk outputs (suitable for display on a website) and a copy of the notebook <code>.Rmd<\/code> source file itself.<\/p>\n<p>Open your RStudio <code>Files<\/code> pane, and click on <code>MyNotebook.nb.html<\/code> in the list of files. Choose to view the file in a web browser. 
It should look something like this:<\/p>\n<p><a href=\"http:\/\/live-hcblog.pantheonsite.io\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/notebook5.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-674\" src=\"https:\/\/blogs.holycross.edu\/tech\/wp-content\/uploads\/sites\/2\/2016\/12\/notebook5-300x220.png\" alt=\"notebook5\" width=\"300\" height=\"220\" \/><\/a><\/p>\n<p>(Click to enlarge.)<\/p>\n<p>&nbsp;<\/p>\n<p>In this web page, there is a button labeled <code>Hide<\/code> that you can use to show and hide the code that produced the plot. There is also a button labeled <code>Code<\/code> that lets you show and hide all of the code and also lets you download the original notebook <code>.Rmd<\/code> file.<\/p>\n<p>The <code>nb.html<\/code> files are an excellent way to archive and share R notebooks. Anyone with access to an <code>nb.html<\/code> file has a complete package of the rendered text of the notebook, all of the tabular and graphical output from the code chunks, and a copy of the original R Notebook <code>.Rmd<\/code> file. Because the <code>nb.html<\/code> file can be viewed in any web browser, a person does not have to have R or RStudio in order to view the notebook, the code, or the output. However, if one of your collaborators does have RStudio, they can open the <code>nb.html<\/code> file directly using the <code>File|Open File...<\/code> dialog of RStudio to resume work on the notebook with all output intact. This will extract the <code>.Rmd<\/code> file into a new RStudio editor tab, extract the chunk outputs from the <code>.nb.html<\/code> file, and place them appropriately in the editor.<\/p>\n<p>Only R Notebooks (which have at least one of the output formats in the YAML header listed as <code>html_notebook<\/code>) can produce a companion <code>.nb.html<\/code> file. 
Regular, non-notebook R Markdown files can have inline chunk output (the chunk output appears immediately below the chunk, in the editor) but they do not produce an <code>.nb.html<\/code> file.<\/p>\n<p><!--nextpage--><\/p>\n<h4 id=\"toc_6\">Reproducible Research Revisited<\/h4>\n<p>If we were creating a journal article back in the olden days (i.e., prior to 2011, the first public beta release of RStudio), we would start writing our manuscript in a word processor, say <a href=\"http:\/\/tech.me.holycross.edu\/2015\/11\/18\/theplaintextworkflow\/\">Microsoft Word<\/a>. If it was a <a href=\"https:\/\/www.r-bloggers.com\/simple-template-for-scientific-manuscripts-in-r-markdown\/\">science manuscript<\/a>, the text would contain Introduction, Methods, Results, Discussion, and Literature Cited or References sections. The data, say from a laboratory experiment or from field observations, might reside in an Excel spreadsheet, a database application, or preferably in one or more <a href=\"http:\/\/kbroman.org\/dataorg\/pages\/csv_files.html\">plain text files<\/a>. To produce data summaries, statistical analyses, and graphics, we would have to bring the data into a statistics package like <a href=\"http:\/\/www.ibm.com\/analytics\/us\/en\/technology\/spss\/\">SPSS<\/a>, <a href=\"http:\/\/www.sas.com\/\">SAS<\/a>, or one of <a href=\"https:\/\/en.wikipedia.org\/wiki\/List_of_statistical_packages\">many others<\/a>. Reduction and manipulation of the original data might continue in the statistical software. Tabular statistical output such as regression and ANOVA tables would need to be copy-pasted back into Word, where they might then be wrangled into a pretty table. Graphical output from statistical software or maybe a separate graphics package would need to be saved to a graphics file, then imported back into Word. 
If we needed a map of study sites, we might have to use a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Geographic_information_system\">geographic information system<\/a> to produce a map, requiring even more data files, and then export the map to an external graphics file so that it could be brought into Word. Bibliographic references might be stored in reference management software like Zotero or RefWorks, and via a Word <a href=\"https:\/\/www.zotero.org\/support\/word_processor_plugin_usage\">plugin<\/a>, we could produce our literature citations and formatted bibliographies. At some point, a final manuscript would be produced.<\/p>\n<p>And then the revisions would begin.<\/p>\n<p>The cut-and-paste approach to producing a scholarly manuscript is tedious, slow, and error-prone, to say the least. Moving data back and forth between applications makes it difficult to retrace the steps taken to produce a given result, even if careful notes are taken every step of the way. If a project involves multiple researchers, each working on different parts of the analysis, and each keeping their own set of notes, this process becomes even more complicated.<\/p>\n<p>Production of an analysis, publication-quality graphics, and a final manuscript can be greatly streamlined by keeping everything in one R Notebook. With R code chunks embedded in R Markdown text, you can fully document how you arrived at your results, while simultaneously producing the statistical output, graphics, and references for your paper. R Notebooks can be easily archived and shared among collaborators, using cloud storage technologies such as <a href=\"https:\/\/www.dropbox.com\">Dropbox<\/a> and <a href=\"https:\/\/www.google.com\/drive\/\">Google Drive<\/a>, or version control systems such as <a href=\"https:\/\/git-scm.com\/\">git<\/a>, and can be rendered into publication-quality documents in a variety of formats. 
And because everything is plain text, the R Markdown manuscript can be edited on any computing device that has a text editor, including smartphones and other mobile gadgetry.<\/p>\n<p>We illustrate this workflow with a small example, involving the analysis of the raw data file <a href=\"http:\/\/college.holycross.edu\/faculty\/rlent\/sites\/sites.csv\">sites.csv<\/a>. This is a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Comma-separated_values\">comma-separated values file<\/a> containing ecological data from 11 grassland sites in Massachusetts, New Hampshire, and Vermont. The companion <a href=\"https:\/\/en.wikipedia.org\/wiki\/Metadata\">metadata<\/a> file <a href=\"http:\/\/college.holycross.edu\/faculty\/rlent\/sites\/sites.metadata.txt\">sites.metadata.txt<\/a> describes the variables (columns) of <code>sites.csv<\/code>. The data for each site consist of measures of site vegetation structure, morphological measures on individuals of the butterfly species <a href=\"http:\/\/www.butterfliesandmoths.org\/species\/coenonympha-tullia\">Coenonympha tullia<\/a> (the Common Ringlet) inhabiting each site, and the geographic location of each site in both <a href=\"https:\/\/en.wikipedia.org\/wiki\/Universal_Transverse_Mercator_coordinate_system\">UTM<\/a> and <a href=\"https:\/\/en.wikipedia.org\/wiki\/Decimal_degrees\">decimal degree<\/a> coordinates. The aim of the study was to examine relationships between habitat structure and morphological variation in the butterfly populations at each site.<\/p>\n<p>You can view the complete example in <a href=\"http:\/\/college.holycross.edu\/faculty\/rlent\/sites\/sites.nb.html\">sites.nb.html<\/a>, which is an HTML notebook created in RStudio from the corresponding R Notebook file <a href=\"http:\/\/college.holycross.edu\/faculty\/rlent\/sites\/sites.Rmd\">sites.Rmd<\/a>. 
The HTML notebook contains the rendered text of a scientific paper originally written in R Markdown with embedded R code that creates all of the statistical analyses, tables, and graphics. The same <a href=\"http:\/\/college.holycross.edu\/faculty\/rlent\/sites\/sites.Rmd\">sites.Rmd<\/a> file was used to produce <a href=\"http:\/\/college.holycross.edu\/faculty\/rlent\/sites\/sites.pdf\">PDF<\/a> and <a href=\"http:\/\/college.holycross.edu\/faculty\/rlent\/sites\/sites.docx\">Microsoft Word<\/a> versions of the paper. The manuscript in <a href=\"http:\/\/college.holycross.edu\/faculty\/rlent\/sites\/sites.Rmd\">sites.Rmd<\/a> was written to be self-contained and self-documenting, essentially a &#8220;paper-within-a-paper.&#8221; It includes comments that document both the main text and the embedded R code. Also in <code>sites.nb.html<\/code> is a link from which you can download a copy of the complete R Markdown source file <code>sites.Rmd<\/code>. You can also download <a href=\"http:\/\/college.holycross.edu\/faculty\/rlent\/sites\/sites.zip\">sites.zip<\/a>, a zip archive file that contains all of the document files, data, metadata, R code, and other associated files (such as external images and bibliographic data) needed to completely replicate our analysis and document production.<\/p>\n<p>Recalling the 3 criteria for <a href=\"https:\/\/www.r-bloggers.com\/what-is-reproducible-research\/\">reproducible research<\/a>, the files comprising our example satisfy the requirement that <em>All data and files used for the analysis are publicly available<\/em>. The data file and its companion metadata file would be placed in a publicly accessible digital repository so that other workers desiring to replicate the analysis could get the data and know what they were working with. At a minimum, the metadata file needs to describe what the variables are and their units of measurement. 
The requirement that <em>All methods are fully reported<\/em> should be satisfied by the Methods section of the article. Because the article is written in R Markdown and contains embedded R code showing exactly how the data were analyzed, we also have satisfied the third requirement of reproducible research, that <em>The process of analyzing raw data is well reported and preserved<\/em>. The R Markdown manuscript with its embedded R code and accompanying data and metadata files, all residing in <a href=\"http:\/\/college.holycross.edu\/faculty\/rlent\/sites\/sites.zip\">sites.zip<\/a>, is a self-contained package of reproducible research.<\/p>\n<h4 id=\"toc_7\">Coda I: Python<\/h4>\n<p>We note briefly here that you can insert code chunks from other programming languages besides R into an R Markdown document. See <a href=\"http:\/\/rmarkdown.rstudio.com\/authoring_knitr_engines.html\">knitr Language Engines<\/a> for more details. There is an example of an R Notebook that includes both R and <a href=\"https:\/\/www.python.org\/\">Python<\/a> code <a href=\"http:\/\/college.holycross.edu\/faculty\/rlent\/basemap.nb.html\">here<\/a>.<\/p>\n<h4 id=\"toc_8\">Coda II: Inspirational Quotes About Data<\/h4>\n<p>&#8220;Data! Data! Data!&#8221; he cried impatiently. 
&#8220;I can&#8217;t make bricks without clay.&#8221; &#8212; Sherlock Holmes, <em>The Adventure of the Copper Beeches<\/em><\/p>\n<p>&#8220;War is 90% information.&#8221; &#8212; Napoleon Bonaparte<\/p>\n<p>&#8220;Everybody gets so much information all day long that they lose their common sense.&#8221; &#8212; Gertrude Stein<\/p>\n<p>&#8220;Statistics are no substitute for judgment.&#8221; &#8212; Henry Clay<\/p>\n<p>&#8220;Information is not knowledge.&#8221; &#8212; Albert Einstein<\/p>\n<p>&#8220;Facts are stubborn, but statistics are more pliable.&#8221; &#8212; Mark Twain<\/p>\n<div class=\"footnotes\">\n<p>&nbsp;<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>R (r-project.org) is a programming language and software platform for statistical computing and graphics, widely used in academia and industry (see Introduction to R). RStudio is an integrated development environment for R. RStudio makes R easier to use, and it also enables the creation and rendering of plain-text documents that contain embedded R code. 
With &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/blogs.holycross.edu\/tech\/2016\/12\/01\/rstudio-as-a-research-and-writing-platform\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;RStudio as a Research and Writing Platform&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2,3,7],"tags":[],"class_list":["post-647","post","type-post","status-publish","format-standard","hentry","category-data-analysis","category-data-visualization","category-writing-tools"],"_links":{"self":[{"href":"https:\/\/blogs.holycross.edu\/tech\/wp-json\/wp\/v2\/posts\/647","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.holycross.edu\/tech\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.holycross.edu\/tech\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.holycross.edu\/tech\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.holycross.edu\/tech\/wp-json\/wp\/v2\/comments?post=647"}],"version-history":[{"count":0,"href":"https:\/\/blogs.holycross.edu\/tech\/wp-json\/wp\/v2\/posts\/647\/revisions"}],"wp:attachment":[{"href":"https:\/\/blogs.holycross.edu\/tech\/wp-json\/wp\/v2\/media?parent=647"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.holycross.edu\/tech\/wp-json\/wp\/v2\/categories?post=647"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.holycross.edu\/tech\/wp-json\/wp\/v2\/tags?post=647"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}