Biostatistical Computing, PHC 6068

R graphics ggplot2

Zhiguang Huo (Caleb)

Monday September 17, 2018



ggplot2 is based on the grammar of graphics, the idea that you can build every graph from the same few components: a data set, a set of geoms (visual marks that represent data points), and a coordinate system.

ggplot2 cheatsheet:

ggplot2 grammar
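
As a rough sketch of the grammar idea, the plot below is assembled from three explicit pieces: a data set, an aesthetic mapping drawn by a geom, and a coordinate system. It uses the `mpg` data that ships with ggplot2; the variable name `p` is only illustrative.

```r
library(ggplot2)

# data + aesthetic mapping: which columns go to x and y
p <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point() +        # geom: draw one point per row
  coord_cartesian()     # coordinate system (Cartesian is the default)

print(p)
```

Swapping `geom_point()` for another geom changes the visual marks while the data and mapping stay the same, which is the reuse of components the grammar describes.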

ggplot2 usage – qplot()



mpg data

## Classes 'tbl_df', 'tbl' and 'data.frame':    234 obs. of  11 variables:
##  $ manufacturer: chr  "audi" "audi" "audi" "audi" ...
##  $ model       : chr  "a4" "a4" "a4" "a4" ...
##  $ displ       : num  1.8 1.8 2 2 2.8 2.8 3.1 1.8 1.8 2 ...
##  $ year        : int  1999 1999 2008 2008 1999 1999 2008 1999 1999 2008 ...
##  $ cyl         : int  4 4 4 4 6 6 6 4 4 4 ...
##  $ trans       : chr  "auto(l5)" "manual(m5)" "manual(m6)" "auto(av)" ...
##  $ drv         : chr  "f" "f" "f" "f" ...
##  $ cty         : int  18 21 20 21 16 18 18 18 16 20 ...
##  $ hwy         : int  29 29 31 30 26 26 27 26 25 28 ...
##  $ fl          : chr  "p" "p" "p" "p" ...
##  $ class       : chr  "compact" "compact" "compact" "compact" ...
## # A tibble: 6 x 11
##   manufacturer model displ  year   cyl trans drv     cty   hwy fl    class
##   <chr>        <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
## 1 audi         a4      1.8  1999     4 auto… f        18    29 p     comp…
## 2 audi         a4      1.8  1999     4 manu… f        21    29 p     comp…
## 3 audi         a4      2    2008     4 manu… f        20    31 p     comp…
## 4 audi         a4      2    2008     4 auto… f        21    30 p     comp…
## 5 audi         a4      2.8  1999     6 auto… f        16    26 p     comp…
## 6 audi         a4      2.8  1999     6 manu… f        18    26 p     comp…
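
The slide shows only the console output. It looks like the result of `str()` and `head()` applied to `mpg`, so a likely (assumed) reconstruction of the calls is sketched below. The exact header text printed by `str()` can differ between package versions.

```r
library(ggplot2)   # the mpg data set ships with ggplot2

str(mpg)    # structure: 234 observations of 11 variables
head(mpg)   # first 6 rows
```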

basic scatter plot

qplot(x = displ, y = hwy, data = mpg) 

## aesthetic: displ, hwy. data: mpg

scatter plot with color

qplot(displ, hwy, colour = class, data = mpg) 

## aesthetic: displ, hwy, class; data: mpg
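
For comparison, here is a sketch of roughly the same plot written with `ggplot()` instead of `qplot()`. It also contrasts mapping colour to a variable (inside `aes()`) with setting one fixed colour (outside `aes()`). The names `p_mapped` and `p_fixed` are illustrative only.

```r
library(ggplot2)

# colour mapped to a variable: one colour per class, with a legend
p_mapped <- ggplot(mpg, aes(x = displ, y = hwy, colour = class)) +
  geom_point()

# colour set to a constant: every point is blue, no legend
p_fixed <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(colour = "blue")

print(p_mapped)
print(p_fixed)
```

In `qplot()`, a fixed colour has to be wrapped in `I()`, as in `colour = I("blue")`; otherwise the value is treated as a variable to map.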

scatter plot with shape

mpg_sub <- subset(mpg, class != "suv") ## the default shape palette handles at most 6 discrete values; mpg has 7 classes
qplot(x = displ, y = hwy, shape = class, data = mpg_sub) ## aesthetic: displ, hwy, class; data: mpg_sub
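
The reason for dropping one class can be checked directly: `mpg` has 7 classes, one more than the default shape palette covers. The sketch below counts the classes and shows one possible alternative that keeps all 7 by supplying shapes manually. The choice of shape codes `0:6` is arbitrary.

```r
library(ggplot2)

n_all <- length(unique(mpg$class))       # 7 classes in the full data
mpg_sub <- subset(mpg, class != "suv")
n_sub <- length(unique(mpg_sub$class))   # 6 classes after dropping "suv"

# alternative: keep every class and pick 7 shapes by hand
p_shape <- ggplot(mpg, aes(x = displ, y = hwy, shape = class)) +
  geom_point() +
  scale_shape_manual(values = 0:6)

print(p_shape)
```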

line plot with geom = “line”

qplot(x = displ, y = hwy, data = mpg, geom = "line") 

## aesthetic: displ, hwy
## data: mpg
## Geometries: line

path plot with geom = “path”

qplot(x = displ, y = hwy, data = mpg, geom = "path") 

## aesthetic: displ, hwy
## data: mpg
## Geometries: path
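
The difference between the two geoms is easier to see on a tiny example: a line connects observations in order of their x values, while a path connects them in the order they appear in the data. The sketch below uses a made-up three-row data frame (`toy` is a hypothetical name) and inspects the order in which each geom would draw the points.

```r
library(ggplot2)

# rows deliberately not sorted by x
toy <- data.frame(x = c(3, 1, 2), y = c(1, 2, 3))

p_line <- ggplot(toy, aes(x, y)) + geom_line()   # connects by increasing x
p_path <- ggplot(toy, aes(x, y)) + geom_path()   # connects in row order

line_x <- layer_data(p_line)$x   # drawing order used by geom_line
path_x <- layer_data(p_path)$x   # drawing order used by geom_path

print(line_x)
print(path_x)
```

With `mpg`, the rows are not sorted by `displ`, which is why the path plot zigzags across the panel.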

More about geom – geom = “boxplot”

qplot(x=class, y=displ, data = mpg, geom = "boxplot") 

## aesthetic: displ, class
## data: mpg
## Geometries: boxplot

More about geom – geom = “jitter”

qplot(x=class, y=displ, data = mpg, geom = "jitter") 

## aesthetic: displ, class
## data: mpg
## Geometries: jitter

More about geom – geom = c(“boxplot”, “jitter”)

qplot(x=class, y=displ, data = mpg, geom = c("boxplot", "jitter")) 

## aesthetic: displ, class
## data: mpg
## Geometries: boxplot, jitter
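
Passing two geoms corresponds to stacking two layers. A sketch of the layered form with `ggplot()` follows; the boxplot is drawn first and the jittered points on top. Hiding the boxplot outliers and the jitter width of 0.2 are illustrative choices, not part of the original example.

```r
library(ggplot2)

p_box <- ggplot(mpg, aes(x = class, y = displ)) +
  geom_boxplot(outlier.shape = NA) +      # layer 1: boxes, outliers hidden to avoid drawing them twice
  geom_jitter(width = 0.2, height = 0)    # layer 2: raw points, jittered horizontally only

print(p_box)
```

Layers are drawn in the order they are added, so reversing the two `geom_` calls would put the boxes on top of the points.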

More about geom – geom = “histogram”

qplot(x=displ, data = mpg, geom = "histogram") 
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## aesthetic: displ
## data: mpg
## Geometries: histogram
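
The `stat_bin()` message above suggests choosing a bin width explicitly. A minimal sketch, assuming a bin width of 0.5 litres is a sensible choice for `displ` (the value is only an example):

```r
library(ggplot2)

p_hist <- ggplot(mpg, aes(x = displ)) +
  geom_histogram(binwidth = 0.5)   # explicit bin width, so no stat_bin() message

print(p_hist)

# every observation lands in exactly one bin
hist_data <- layer_data(p_hist)
total_count <- sum(hist_data$count)
```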

More about geom – geom = “density”

qplot(x=displ, data = mpg, geom = "density") 

## aesthetic: displ
## data: mpg
## Geometries: density


More about geom – geom = “density” with facets

qplot(displ, data = mpg, geom = "density", facets = ~class) 

## aesthetic: displ
## data: mpg
## Geometries: density
## facets: ~class
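
A one-sided formula such as `~class` splits the plot into one panel per level of the variable. The sketch below writes roughly the same faceted density plot with `ggplot()` and `facet_wrap()`, then counts the resulting panels. The name `p_facet` is illustrative.

```r
library(ggplot2)

p_facet <- ggplot(mpg, aes(x = displ)) +
  geom_density() +
  facet_wrap(~ class)   # one panel per vehicle class

print(p_facet)

# number of panels equals the number of classes
n_panels <- length(unique(layer_data(p_facet)$PANEL))
```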