Introduction

Data Carpentry’s aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time, and with less pain. The lessons below were designed for those interested in working with global facts data in R.

This is an introduction to R designed for participants with no programming experience. These lessons can be taught in one-half day. They start with some basic information about the RStudio interface & R syntax, move through how to find information and get help, read and write data files in multiple formats and inspect the results, the basics of data frames and how to manipulate, aggregate, and calculate summary statistics, and an introduction to plotting. We close with some discussion on base R and gotchas, including how to handle missing data and the benefits of factors.

Requirements

Data Carpentry’s teaching is hands-on, so participants are encouraged to use their own computers to ensure the proper setup of tools for an efficient workflow. These lessons assume no prior knowledge of the skills or tools, but working through this lesson requires working copies of the software described below. To most effectively use these materials, please make sure to install everything before working through this lesson.

We will be using this Etherpad for sharing notes.

You can also read thru a log of the workshop session R commands.

Data

Files for the lesson are available and can be downloaded manually here:

However, we will download them directly from R during the lessons as we need them.

Installation

Greetings! We are happy that you are going to join us for our half-day training on R. As you probably know, R (https://www.r-project.org/) is a free software environment and program language, gaining significant popularity, that is especially powerful for data exploration, visualization, and statistical analysis. To interact with R, we will use RStudio. NOTE that you must install both R and RStudio, and it is essential that you have these pre-installed so that we can start the workshop on time.

Mac OS X Install R by downloading and running this .pkg file from CRAN. Also, please install the RStudio Desktop IDE.

Windows Install R by downloading and running this .exe file from CRAN. Also, please install the RStudio Desktop IDE.

Linux You can download the binary files for your distribution from CRAN. Or you can use your package manager (e.g. for Debian/Ubuntu run sudo apt-get install r-base and for Fedora run sudo yum install R). Also, please install the RStudio Desktop IDE.

Success? After both installations, please launch RStudio. If you were successful with the installations, you should see a window similar to this:

(Please note that the version of R reported may be newer). If you are having any difficulties with the installations or your RStudio screen does not look like this one, please contact research@hbs.edu; or stop by the training room at 8:30 am the morning of our workshop.

Finally, we’d like you to install the ‘tidyverse’ suite of tools. Please enter the command install.packages('tidyverse') at the command prompt and press Enter. A number of messages will scroll by, and there will be a long minute or two pause where nothing appears to happen (but the installation is actually occurring). At last, the output parade should end with a message like The downloaded source/binary packages are in.... If not, again, please stop by the training room early.

Content Contributors:

  • Bob Freeman
  • Kate Hertweck
  • Andrew Marder
  • Susan McClatchey
  • Tracy Teal
  • Ryan Williams

Attribution

The lessons present here were adapted from the Data Carpentry Social Sciences GapMinder-R pilot workshop, licensed under the Creative Commons Attribution license (CC BY 4.0).*

NOTE: These materials and the files within are governed by the Creative Commons Attribution license (CC BY 4.0).