Boxplots in ggplot2

In this section, you will learn the ggplot2 codes for producing boxplots. While the syntax and default appearance may differ, these plots aim to compare distributions and identify outliers. If you need, you can add a few lines of code to make the base-R and ggplot2 graphs look the same. The choice of which plotting system to use is yours now.

Introduction

This R tutorial describes how to create a box plot using R software and ggplot2 package.

The function geom_boxplot() is used. A simplified format is :

geom_boxplot(outlier.colour="black", outlier.shape=16,
             outlier.size=2, notch=FALSE)
  • outlier.colour, outlier.shape, outlier.size : The color, the shape and the size for outlying points
  • notch : logical value. If TRUE, make a notched box plot. The notch displays a confidence interval around the median which is normally based on the median +/- 1.58*IQR/sqrt(n). Notches are used to compare groups; if the notches of two boxes do not overlap, this is a strong evidence that the medians differ.



Prepare the data

ToothGrowth data sets are used :

# Convert the variable dose from a numeric to a factor variable
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
head(ToothGrowth)
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5

Make sure that the variable dose is converted as a factor variable using the above R script.



Source: STHDA, http://www.sthda.com/english/wiki/ggplot2-box-plot-quick-start-guide-r-software-and-data-visualization
Creative Commons License This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.