Scatterplots in ggplot2

This section introduces the layered syntax of ggplot2 for scatterplots. It also demonstrates how to add regression lines (compared with the base-R syntax shown in the introductory video).

Scatter plots with rectangular bins

The number of observations in each bin is counted and displayed using any of the functions below:

  • geom_bin2d() for adding a heatmap of 2d bin counts
  • stat_bin_2d() for counting the number of observations in rectangular bins
  • stat_summary_2d() for applying a summary function over 2D rectangular bins

The simplified formats of these functions are:

plot + geom_bin2d(...)
plot + stat_bin_2d(geom = "tile", bins = 30)
plot + stat_summary_2d(geom = "tile", bins = 30, fun = mean)

  • geom : geometrical object used to display the data
  • bins : number of bins in both the vertical and horizontal directions. The default value is 30
  • fun : summary function, applied to the z aesthetic within each bin

The diamonds data set from the ggplot2 package is used:

library(ggplot2)
head(diamonds)
##   carat       cut color clarity depth table price    x    y    z
## 1  0.23     Ideal     E     SI2  61.5    55   326 3.95 3.98 2.43
## 2  0.21   Premium     E     SI1  59.8    61   326 3.89 3.84 2.31
## 3  0.23      Good     E     VS1  56.9    65   327 4.05 4.07 2.31
## 4  0.29   Premium     I     VS2  62.4    58   334 4.20 4.23 2.63
## 5  0.31      Good     J     SI2  63.3    58   335 4.34 4.35 2.75
## 6  0.24 Very Good     J    VVS2  62.8    57   336 3.94 3.96 2.48
# Plot price against carat as a heatmap of 2D bin counts
p <- ggplot(diamonds, aes(carat, price))
p + geom_bin2d()
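
To see what geom_bin2d() computes behind the heatmap, the binned data can be pulled back out of the plot. The following is a minimal sketch, assuming ggplot2 is installed. It relies on layer_data(), the ggplot2 helper that returns the data computed for a layer. The names binned and total are only illustrative.

```r
library(ggplot2)

# Same plot as above: price against carat, binned into rectangles
p <- ggplot(diamonds, aes(carat, price)) + geom_bin2d()

# layer_data() returns the data frame computed for the first layer;
# each row corresponds to one non-empty bin
binned <- layer_data(p)
head(binned)

# The per-bin counts should add up to the number of rows in diamonds
total <- sum(binned$count)
total == nrow(diamonds)
```

Looking at the count column this way is a quick check that the fill colour in the heatmap really is the number of observations per rectangle.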



Change the number of bins:

# Use 10 bins along each axis instead of the default 30
p + geom_bin2d(bins = 10)
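
The effect of the bins argument can also be checked numerically. The sketch below, which assumes ggplot2 is installed, counts the non-empty tiles drawn for the default setting and for bins = 10. The names n_default and n_coarse are illustrative, not part of ggplot2.

```r
library(ggplot2)

p <- ggplot(diamonds, aes(carat, price))

# Number of non-empty tiles with the default bins = 30
n_default <- nrow(layer_data(p + geom_bin2d()))

# Number of non-empty tiles with a coarser grid
n_coarse <- nrow(layer_data(p + geom_bin2d(bins = 10)))

# Fewer bins give fewer, larger tiles
c(default = n_default, coarse = n_coarse)
```

Fewer bins smooth the picture but hide local structure, so it is worth trying a couple of values before settling on one.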



Or specify the width of the bins:

# Bins 1 unit wide in carat (x axis) and 1000 units tall in price (y axis)
p + geom_bin2d(binwidth = c(1, 1000))
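
The two stat functions listed at the start of this section can be used in the same way. The following is a minimal sketch, assuming ggplot2 is installed. Mapping depth to the z aesthetic is an arbitrary choice made for illustration, and p_count and p_mean are illustrative names.

```r
library(ggplot2)

# stat_bin_2d(): counts per rectangular bin, drawn as tiles by default
p_count <- ggplot(diamonds, aes(carat, price)) +
  stat_bin_2d(bins = 30)

# stat_summary_2d(): needs a z aesthetic; here the fill shows the
# mean depth of the diamonds falling in each bin
p_mean <- ggplot(diamonds, aes(carat, price, z = depth)) +
  stat_summary_2d(bins = 30, fun = mean)

p_count
p_mean
```

stat_bin_2d() answers "how many observations fall here?", while stat_summary_2d() answers "what is the typical value of a third variable here?", so the second form is the one to reach for when the fill should carry something other than a count.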