In this section you will learn the layered syntax of ggplot2 for scatter plots. The section also shows how regression lines can be added, and how this compares with the base-R syntax shown in the introductory video.
Scatter plots with rectangular bins
The number of observations in each bin is counted and displayed using any of the functions below:
- geom_bin2d() for adding a heatmap of 2d bin counts
- stat_bin_2d() for counting the number of observations in rectangular bins
- stat_summary_2d() for applying a summary function to the observations in each 2D rectangular bin
The simplified formats of these functions are:
plot + geom_bin2d(...)
plot + stat_bin_2d(geom = "tile", bins = 30)
plot + stat_summary_2d(geom = "tile", bins = 30, fun = mean)
- geom : geometrical object to display the data
- bins : Number of bins in both vertical and horizontal directions. The default value is 30
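- fun : function used to summarise the values in each bin (stat_summary_2d() only; it is applied to the variable mapped to the z aesthetic)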
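The two stat functions can be sketched as follows. This is a minimal illustration rather than a prescribed recipe: the object names p_count and p_mean are made up here, and using depth as the summarised variable is an arbitrary choice.

```r
library(ggplot2)

# Base plot: carat on the x axis, price on the y axis
# (the diamonds data set ships with ggplot2)
p <- ggplot(diamonds, aes(carat, price))

# stat_bin_2d() counts the observations in each rectangular bin;
# with its default geom it draws a heatmap of those counts
p_count <- p + stat_bin_2d(bins = 30)

# stat_summary_2d() needs a z aesthetic to summarise;
# here each bin is filled by the mean depth of its diamonds
p_mean <- p + stat_summary_2d(aes(z = depth), bins = 30, fun = mean)

p_count
p_mean
```

In the first plot the fill shows how many diamonds fall in each bin; in the second it shows the value returned by fun for each bin.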
The diamonds data set from the ggplot2 package is used:
library(ggplot2)
head(diamonds)
##   carat       cut color clarity depth table price    x    y    z
## 1  0.23     Ideal     E     SI2  61.5    55   326 3.95 3.98 2.43
## 2  0.21   Premium     E     SI1  59.8    61   326 3.89 3.84 2.31
## 3  0.23      Good     E     VS1  56.9    65   327 4.05 4.07 2.31
## 4  0.29   Premium     I     VS2  62.4    58   334 4.20 4.23 2.63
## 5  0.31      Good     J     SI2  63.3    58   335 4.34 4.35 2.75
## 6  0.24 Very Good     J    VVS2  62.8    57   336 3.94 3.96 2.48
# Plot
p <- ggplot(diamonds, aes(carat, price))
p + geom_bin2d()
Change the number of bins:
p + geom_bin2d(bins = 10)
Or specify the width of the bins, here 1 carat horizontally and 1000 price units vertically:
p + geom_bin2d(binwidth = c(1, 1000))
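To see the counts behind a binned plot, the data computed for a layer can be pulled out with layer_data(). The sketch below assumes the same carat and price mapping as above; bins_df is a name chosen only for this illustration.

```r
library(ggplot2)

p <- ggplot(diamonds, aes(carat, price))

# layer_data() returns the data frame computed for one layer of a plot;
# for geom_bin2d() this includes a count column with one row per drawn bin
bins_df <- layer_data(p + geom_bin2d(bins = 10))

# Look at the first few bins and their counts
head(bins_df[, c("x", "y", "count")])

# Compare the total of the bin counts with the number of rows in the data
sum(bins_df$count)
nrow(diamonds)
```

If every diamond lands in a bin, the two totals printed at the end should agree.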