Plotting multiple groups with facets in ggplot2

Sometimes we want to plot the relationship between a fixed set of variables in multiple subsets of the data, with the results appearing as panels in a larger figure. This is known as a facet plot, and it is a very useful feature of ggplot2. The faceting is defined by one or more categorical variables. Each panel corresponds to one value (or one combination of values) of the faceting variables.

setwd("~/Documents/Computing with Data/13_Facets/")
library(ggplot2)

Facet wrap

Here, a single categorical variable defines subsets of the data. The panels are laid out in a one-dimensional ribbon that can be wrapped onto multiple rows.

str(mpg)
## 'data.frame':    234 obs. of  11 variables:
##  $ manufacturer: Factor w/ 15 levels "audi","chevrolet",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ model       : Factor w/ 38 levels "4runner 4wd",..: 2 2 2 2 2 2 2 3 3 3 ...
##  $ displ       : num  1.8 1.8 2 2 2.8 2.8 3.1 1.8 1.8 2 ...
##  $ year        : int  1999 1999 2008 2008 1999 1999 2008 1999 1999 2008 ...
##  $ cyl         : int  4 4 4 4 6 6 6 4 4 4 ...
##  $ trans       : Factor w/ 10 levels "auto(av)","auto(l3)",..: 4 9 10 1 4 9 1 9 4 10 ...
##  $ drv         : Factor w/ 3 levels "4","f","r": 2 2 2 2 2 2 2 1 1 1 ...
##  $ cty         : int  18 21 20 21 16 18 18 18 16 20 ...
##  $ hwy         : int  29 29 31 30 26 26 27 26 25 28 ...
##  $ fl          : Factor w/ 5 levels "c","d","e","p",..: 4 4 4 4 4 4 4 4 4 4 ...
##  $ class       : Factor w/ 7 levels "2seater","compact",..: 2 2 2 2 2 2 2 2 2 2 ...
p <- ggplot(data = mpg, aes(x = displ, y = hwy)) + geom_point()
p + facet_wrap(~cyl)
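
Each distinct value of the faceting variable gets its own panel. As a quick check, the sketch below counts the distinct values of cyl and the number of observations that fall into each panel. The names cyl_values, n_panels and panel_counts are introduced here for illustration only.

```r
library(ggplot2)

# Distinct values of the faceting variable; each one gets its own panel
cyl_values <- sort(unique(mpg$cyl))
n_panels <- length(cyl_values)

# Number of observations that land in each panel
panel_counts <- table(mpg$cyl)
panel_counts
```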

We can control the layout with options to the facet_wrap function.

p + facet_wrap(~cyl, nrow = 1)

p + facet_wrap(~cyl, ncol = 1)
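
The fill order of the ribbon can also be changed. A minimal sketch, assuming the dir argument of facet_wrap is available in your ggplot2 version: with dir = "v" the panels fill column by column instead of row by row. The names p_layout and p_vert are illustrative.

```r
library(ggplot2)

p_layout <- ggplot(data = mpg, aes(x = displ, y = hwy)) + geom_point()

# Two columns, with panels filled down each column rather than across rows
p_vert <- p_layout + facet_wrap(~cyl, ncol = 2, dir = "v")
p_vert
```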

We can map another variable to an aesthetic, and all panels share a single legend.

p <- ggplot(data = mpg, aes(x = displ, y = hwy, color = drv)) + geom_point()
p + facet_wrap(~cyl)

Some subsets may have extreme values of a variable, which stretches the shared axes and makes the other facets hard to read.

p <- ggplot(data = diamonds, aes(x = price)) + geom_histogram(binwidth = 1000)
p + facet_wrap(~color)

We can get a better plot by letting the y axes vary freely.

p + facet_wrap(~color, scales = "free_y")

With free y axes the histograms look about the same size, but they are not the same in actual counts, because each panel now has its own y scale. The relative sizes of the bins within each panel are not so different, though.
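
If the goal is to compare the shapes of the distributions on a common y axis, one alternative (not used in the original example) is to plot densities instead of counts. This sketch assumes ggplot2 3.3.0 or later, where after_stat() is available; p_dens is an illustrative name.

```r
library(ggplot2)

# after_stat(density) rescales the histogram in each panel so that
# its bars integrate to 1, which makes the shapes directly comparable
p_dens <- ggplot(data = diamonds, aes(x = price)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 1000) +
  facet_wrap(~color)
p_dens
```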

Other scales options are "free_x", which frees only the x axes, and "free", which frees both axes. The default is "fixed".
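
As a brief sketch of these options, applied to the same diamonds histogram (p_hist, p_free_x and p_free are illustrative names):

```r
library(ggplot2)

p_hist <- ggplot(data = diamonds, aes(x = price)) +
  geom_histogram(binwidth = 1000)

# Each panel gets its own x axis; the y axis stays shared
p_free_x <- p_hist + facet_wrap(~color, scales = "free_x")

# Both axes vary freely from panel to panel
p_free <- p_hist + facet_wrap(~color, scales = "free")
p_free
```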

Decorating facets

Sometimes we may want to add features to a single facet. To do that, we give the extra layer a restricted subset of the data.

p <- ggplot(data = mpg, aes(x = displ, y = hwy)) + geom_point()
q <- p + facet_wrap(~cyl)
cycl6 <- subset(mpg, cyl == 6)
q + geom_point(data = cycl6, color = "red")
## Warning: invalid factor level, NAs generated
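
The same idea works for annotations. In the sketch below, a one-row data frame that includes the faceting variable places a text label in the cyl == 6 panel only. The label text and its position are hypothetical values chosen for illustration, as are the names note and q_note.

```r
library(ggplot2)

q <- ggplot(data = mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  facet_wrap(~cyl)

# One row, so the label is drawn once; cyl = 6 selects the panel.
# The position (displ, hwy) and the label text are illustrative.
note <- data.frame(cyl = 6, displ = 5.5, hwy = 40, label = "6 cylinders")
q_note <- q + geom_text(data = note, aes(label = label))
q_note
```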

Facet grid

Here, the panels are determined by the values of multiple variables and arranged in a grid, with one variable defining the rows and another defining the columns.

str(mtcars)
## 'data.frame':    32 obs. of  11 variables:
##  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num  160 160 108 258 360 ...
##  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec: num  16.5 17 18.6 19.4 17 ...
##  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
##  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...

Plot weight versus mpg for each combination of vs and carb.

p <- ggplot(data = mtcars, aes(mpg, wt)) + geom_point()
p + facet_grid(vs ~ carb)
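
facet_grid draws a panel for every combination of row and column values, including combinations with no observations. A quick cross-tabulation shows which panels will be empty (combo_counts and n_empty are illustrative names):

```r
# Count observations for every vs/carb combination
combo_counts <- with(mtcars, table(vs, carb))
combo_counts

# Number of combinations with no observations, i.e. empty panels
n_empty <- sum(combo_counts == 0)
n_empty
```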

You can also display the panels in other arrangements, even though they are defined by multiple variables. Put all the variables on one side of the formula and a dot on the other.

p + facet_grid(. ~ vs + carb)

p + facet_grid(vs + carb ~ .)
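
With several faceting variables, the strip labels show only the values, which can be hard to read. One option is the label_both labeller, which prints the variable name alongside each value. A minimal sketch (p_grid and p_labelled are illustrative names):

```r
library(ggplot2)

p_grid <- ggplot(data = mtcars, aes(mpg, wt)) + geom_point()

# Strip labels read "vs: 0", "carb: 4", and so on
p_labelled <- p_grid + facet_grid(vs ~ carb, labeller = label_both)
p_labelled
```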
