In this primer, you will explore the popularity of different names over time. To succeed, you will need to master some common tools for manipulating data with R:

  • tibbles and View(), which let you inspect raw data
  • select() and filter(), which let you extract rows and columns from a data frame
  • arrange(), which lets yuo reorder the rows in your data
  • %>%, which organizes your code into reader-friendly “pipes”
  • mutate(), group_by(), and summarize(), which help you use your data to compute new variables and summary statistics

These are some of the most useful R functions for data science, and the tutorials that follow will provide you everything you need to learn them.

In the tutorials, we’ll use a dataset named babynames, which comes in a package that is also named babynames. Within babynames, you will find information about almost every name given to children in the United States since 1880.

This tutorial introduces babynames as well as a new data structure that makes working with data in R easy: the tibble.

In addition to babynames, this tutorial uses the core tidyverse packages, including ggplot2, tibble, and dplyr. All of these packages have been pre-installed for your convenience. But they haven’t been pre-loaded—something you will soon learn more about!

Click the Next Topic button to begin.