Welcome

Data visualization is one of the most important tools for data science. It is also a great way to start learning R; when you visualize data, you get an immediate payoff that will keep you motivated as you push through the initial frustrations of learning a language.

This tutorial will teach you how to begin visualizing data with ggplot2, one of the most popular R packages for the task.

ggplot2 Package

ggplot2 is an R package by Hadley Wickham that implements a grammar of graphics based on Leland Wilkinson's The Grammar of Graphics. Instead of building a plot procedurally with base R's plot(), we specify the graph declaratively. This approach is especially useful when we want to produce good-looking plots quickly rather than control every detail. ggplot2 is also helpful when many plots that share the same coordinate system need to be drawn on one device (see also the lattice package).
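To make the procedural versus declarative contrast concrete, here is a minimal sketch that draws the same scatterplot both ways. It assumes the mpg data set that ships with ggplot2; the choice of the displ and hwy columns is only for illustration.

```r
library(ggplot2)

# Procedural, base R: call plot() on two vectors and set the labels by hand.
plot(mpg$displ, mpg$hwy, xlab = "displ", ylab = "hwy")

# Declarative, ggplot2: name the data, the variable-to-axis mappings, and the
# geom; ggplot2 works out the axes, labels, and drawing steps.
p <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

print(p)
```

Both calls show engine displacement against highway mileage. The ggplot2 version is stored in an object (p here), so further layers can be added to it later with +.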

ggplot2 Terminology

The tutorial focuses on three essential skills:

  1. How to create graphs with a reusable ggplot2 template
  2. How to add variables to a graph with aesthetics
  3. How to select the “type” of your graph with geoms

A graph is built from a data set by specifying aesthetics and geoms. Variables in the data set are mapped to visual properties (aesthetics) of geoms. Aesthetics include properties such as x and y position, size, and color. Geoms are the geometric objects, such as points, lines, or bars, that represent the data in the plot. Optional elements include facets, statistics, coordinates, and themes.
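The pieces described above can be sketched in a few lines. This is an illustrative example rather than the tutorial's own exercise: it assumes the mpg data set bundled with ggplot2, and the particular variables, facet, and theme are arbitrary choices.

```r
library(ggplot2)

# General shape of a ggplot2 call (placeholders in angle brackets):
#   ggplot(data = <DATA>) + <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))

p <- ggplot(data = mpg) +                        # the data set
  geom_point(                                    # geom: draw each row as a point
    mapping = aes(x = displ, y = hwy,            # aesthetics: position ...
                  color = class)                 # ... and color, from variables
  ) +
  facet_wrap(~ drv) +                            # optional: one panel per drive type
  theme_minimal()                                # optional: a different theme

print(p)
```

Swapping geom_point() for another geom, such as geom_smooth(), changes the type of graph while the data and the x and y mappings stay the same.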

Acknowledgments

These examples are excerpted from R for Data Science by Hadley Wickham and Garrett Grolemund, published by O’Reilly Media, Inc., 2016, ISBN: 9781491910399. This tutorial uses the core tidyverse packages (http://tidyverse.org/), including ggplot2, which have been pre-loaded for your convenience.