Data visualization is one of the most important tools for data science. It is also a great way to start learning R; when you visualize data, you get an immediate payoff that will keep you motivated as you push through the initial frustrations of learning a language.
This tutorial will teach you how to begin visualizing data with ggplot2, one of the most popular R packages for the task.
ggplot2 is an R package by Hadley Wickham that implements a Grammar of Graphics inspired by Wilkinson. Instead of calling the procedural base-R plot(), we specify a graph declaratively. This approach is especially useful when we want to produce good-looking plots quickly, without controlling each detail. ggplot2 also helps when many plots that share the same coordinate system need to be drawn on one device (see also the lattice package).
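To make the contrast between the procedural and the declarative style concrete, here is a minimal sketch. It assumes the mpg data set that ships with ggplot2 and its displ and hwy columns; the choice of data set and variables is ours, for illustration only.

```r
library(ggplot2)

# Procedural base-R: a single call draws the scatterplot immediately
# from two vectors passed by position.
plot(mpg$displ, mpg$hwy)

# Declarative ggplot2: describe the data, the mapping of variables to
# x and y, and the geom; the result is a plot object, not a drawing.
p <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point()

# Printing the object renders it on the current graphics device.
print(p)
```

Because p is an ordinary object, it can be stored, extended with further layers, and printed later, which is what makes quick iteration on a plot convenient.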
The tutorial focuses on three essential skills:
A graph can be built from a given data set by specifying geoms. Attributes in the data set are mapped to visual properties (aesthetics). Aesthetics can be properties like size, color, x and y. Geoms specify the shape of the data in plots. Optional elements include
These examples are excerpted from R for Data Science by Hadley Wickham and Garrett Grolemund, published by O’Reilly Media, Inc., 2016, ISBN: 9781491910399. This tutorial uses the core tidyverse packages (http://tidyverse.org/), including ggplot2, which have been pre-loaded for your convenience.
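The mapping of attributes to aesthetics, and the role of geoms, can be sketched as follows. This is an illustration of our own, assuming the mpg data set bundled with ggplot2 and its displ, hwy and class columns; the object names p_points and p_smooth are hypothetical.

```r
library(ggplot2)

# Map three attributes of mpg to three aesthetics: x, y and color.
# geom_point() then draws each row of the data as a point.
p_points <- ggplot(mpg, aes(x = displ, y = hwy, color = class)) +
  geom_point()

# Same data and the same x/y mapping, but a different geom:
# geom_smooth() draws a fitted curve instead of individual points.
p_smooth <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_smooth()

print(p_points)
print(p_smooth)
```

Changing the aes() call changes which attribute drives which visual property, while swapping the geom changes the shape used to draw the same data.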