This section introduces base R graphics. Working through the material will familiarize you with the options and commands used for plotting. Start by calling a high-level function such as plot(), then incrementally add code to change the plot's appearance, using par() to fine-tune margins and similar settings. You will also learn about the R graphics devices used to save plots for publication (do not use RStudio's point-and-click interface to save plots); these device commands also apply to ggplot2 output.
High-level plotting commands
- The plot() function
- Displaying multivariate data
- Display graphics
- Arguments to high-level plotting functions
The plot() function
One of the most frequently used plotting functions in R is the plot() function. This is a generic function: the type of plot produced depends on the type or class of the first argument.
plot(x, y)
plot(xy)
If x and y are vectors, plot(x, y) produces a scatterplot of y against x. The same effect can be produced by supplying one argument (second form) as either a list containing two elements x and y or a two-column matrix.
plot(x)
If x is a time series, this produces a time-series plot. If x is a numeric vector, it produces a plot of the values in the vector against their index in the vector. If x is a complex vector, it produces a plot of imaginary versus real parts of the vector elements.
plot(f)
plot(f, y)
f is a factor object, y is a numeric vector. The first form generates a bar plot of f; the second form produces boxplots of y for each level of f.
plot(df)
plot(~ expr)
plot(y ~ expr)
df is a data frame, y is any object, expr is a list of object names separated by '+' (e.g., a + b + c). The first two forms produce distributional plots of the variables in a data frame (first form) or of a number of named objects (second form). The third form plots y against every object named in expr.
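The dispatch behavior described above can be tried out with a short sketch. The data and object names below (x, y, z, f, df) are made up for illustration; in an interactive session the last call may pause between its two plots.

```r
# Illustrative data; every object name here is an assumption for the example.
set.seed(1)
x  <- 1:20
y  <- x + rnorm(20)
f  <- factor(rep(c("a", "b"), each = 10))
df <- data.frame(x = x, y = y, z = y^2)

plot(x, y)                   # two numeric vectors: scatterplot of y against x
plot(cbind(x, y))            # two-column matrix: the same scatterplot
plot(list(x = x, y = y))     # list with elements x and y: the same scatterplot
plot(y)                      # one numeric vector: values against their index
plot(ts(y))                  # time series: time-series plot
plot(f)                      # factor: bar plot of the level counts
plot(f, y)                   # factor and numeric vector: boxplot of y per level
plot(df)                     # data frame: plots of its columns against each other
plot(y ~ x + z, data = df)   # formula: y against x, then y against z
```

Running the calls one at a time makes it easier to see which method each class of first argument selects.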
Displaying multivariate data
R provides two very useful functions for representing multivariate data. If X is a numeric matrix or data frame, the command
> pairs(X)
produces a pairwise scatterplot matrix of the variables defined by the columns of X; that is, every column of X is plotted against every other column of X, and the resulting n(n-1) plots (where n is the number of columns of X) are arranged in a matrix with plot scales constant over the rows and columns of the matrix.
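As a minimal sketch, the built-in iris data set (an assumption of this example, not part of the text above) has four numeric columns, so pairs() draws 4 × 3 = 12 off-diagonal scatterplots:

```r
# iris ships with R in the 'datasets' package; columns 1 to 4 are numeric.
X <- iris[, 1:4]
n <- ncol(X)   # number of variables

pairs(X)       # n * (n - 1) = 12 scatterplots, with variable names on the diagonal
```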
When three or four variables are involved a coplot may be more enlightening. If a and b are numeric vectors and c is a numeric vector or factor object (all of the same length), then the command
> coplot(a ~ b | c)
produces a number of scatterplots of a against b for given values of c. If c is a factor, this simply means that a is plotted against b for every level of c. When c is numeric, it is divided into a number of conditioning intervals, and for each interval a is plotted against b for values of c within the interval. The number and position of intervals can be controlled with the given.values= argument to coplot(); the function co.intervals() is useful for selecting intervals. You can also use two given variables with a command like
> coplot(a ~ b | c + d)
which produces scatterplots of a against b for every joint conditioning interval of c and d.
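The following sketch shows conditioning on one numeric variable, choosing the intervals explicitly, and conditioning on two variables. It assumes the built-in quakes data set (columns lat, long, depth, mag); the interval count and overlap are arbitrary illustrative choices.

```r
# quakes ships with R in the 'datasets' package.
# One conditioning variable: depth is cut into overlapping intervals.
coplot(lat ~ long | depth, data = quakes)

# Choose the intervals explicitly: four intervals with 10% overlap.
iv <- co.intervals(quakes$depth, number = 4, overlap = 0.1)
coplot(lat ~ long | depth, data = quakes, given.values = iv)

# Two conditioning variables: one panel per joint interval of depth and mag.
coplot(lat ~ long | depth + mag, data = quakes, number = c(3, 3))
```

co.intervals() returns a two-column matrix, one row per interval, which is the form given.values= expects for a single numeric conditioning variable.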
The coplot() and pairs() functions both take an argument panel= which can be used to customize the type of plot appearing in each panel. The default is points(), which produces a scatterplot, but by supplying some other low-level graphics function of two vectors x and y as the value of panel= you can produce any type of plot you wish. An example panel function useful for coplots is panel.smooth().
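A brief sketch of the panel= argument follows. The data sets (iris, quakes) and the helper panel_line are assumptions made for illustration; panel_line is a hypothetical name, not a function provided by R.

```r
# Built-in panel function: points plus a smooth curve in every panel.
pairs(iris[, 1:4], panel = panel.smooth)
coplot(lat ~ long | depth, data = quakes, panel = panel.smooth)

# A user-written panel function (hypothetical name): points plus a
# dashed least-squares line. It must accept two vectors x and y.
panel_line <- function(x, y, ...) {
  points(x, y, ...)
  abline(lm(y ~ x), lty = 2)
}
pairs(iris[, 1:4], panel = panel_line)
```

Passing ... through to points() lets arguments such as col= or pch= given to pairs() reach each panel.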
Display graphics
Other high-level graphics functions produce different types of plots. Some examples are:
qqnorm(x)
qqline(x)
qqplot(x, y)
Distribution-comparison plots. The first form plots the numeric vector x against the expected Normal order scores (a normal scores plot), and the second adds a straight line to such a plot by drawing a line through the distribution and data quartiles. The third form plots the quantiles of x against those of y to compare their respective distributions.
hist(x)
hist(x, nclass=n)
hist(x, breaks=b, …)
Produces a histogram of the numeric vector x. A sensible number of classes is usually chosen, but a recommendation can be given with the nclass= argument. Alternatively, the breakpoints can be specified exactly with the breaks= argument. If the probability=TRUE argument is given, the bars represent relative frequencies divided by bin width instead of counts.
dotchart(x, …)
Constructs a dotchart of the data in x. In a dotchart the y-axis gives a labeling of the data in x, and the x-axis gives its value. For example, it allows easy visual selection of all data entries with values lying in specified ranges.
image(x, y, z, …)
contour(x, y, z, …)
persp(x, y, z, …)
Plots of three variables. The image plot draws a grid of rectangles using different colors to represent the value of z, the contour plot draws contour lines to represent the value of z, and the persp plot draws a 3D surface.
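The functions listed above can be exercised with a short sketch. The simulated vectors and the built-in VADeaths and volcano data sets are assumptions chosen for illustration, as are the viewing angles passed to persp().

```r
set.seed(42)
x <- rnorm(200)   # roughly normal sample
y <- rexp(200)    # deliberately skewed sample

qqnorm(x)         # normal scores plot of x
qqline(x)         # reference line through the quartiles
qqplot(x, y)      # quantiles of x against quantiles of y

hist(x, nclass = 20)                     # suggest about 20 classes
h <- hist(x, breaks = pretty(range(x), n = 15),
          probability = TRUE)            # exact breakpoints, density scale

dotchart(VADeaths[, 1])                  # one labeled dot per age group

# volcano is a numeric matrix of heights on a regular grid.
gx <- seq_len(nrow(volcano))
gy <- seq_len(ncol(volcano))
image(gx, gy, volcano)                   # colored grid of rectangles
contour(gx, gy, volcano, add = TRUE)     # contour lines on top of the image
persp(gx, gy, volcano, theta = 30, phi = 30)   # 3D surface
```

With probability=TRUE the bar areas sum to one, which can be checked from the returned object as sum(h$density * diff(h$breaks)).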
Arguments to high-level plotting functions
There are a number of arguments that may be passed to high-level graphics functions, as follows:
add=TRUE
Forces the function to act as a low-level graphics function, superimposing the plot on the current plot (some functions only).
axes=FALSE
Suppresses generation of axes; useful for adding your own custom axes with the axis() function. The default, axes=TRUE, means include axes.
log="x"
log="y"
log="xy"
Causes the x, y or both axes to be logarithmic. This will work for many, but not all, types of plot.
type=
The type= argument controls the type of plot produced, as follows:
- type="p": Plot individual points (the default)
- type="l": Plot lines
- type="b": Plot points connected by lines (both)
- type="o": Plot points overlaid by lines
- type="h": Plot vertical lines from points to the zero axis (high-density)
- type="s", type="S": Step-function plots. In the first form, the top of the vertical defines the point; in the second, the bottom.
- type="n": No plotting at all. However, axes are still drawn (by default), and the coordinate system is set up according to the data. Ideal for creating plots with subsequent low-level graphics functions.
xlab=string
ylab=string
Axis labels for the x and y axes. Use these arguments to change the default labels, usually the names of the objects used in the call to the high-level plotting function.
main=string
Figure title, placed at the top of the plot in a large font.
sub=string
Sub-title, placed just below the x-axis in a smaller font.
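A small sketch combining several of these arguments is given below. The data and label strings are made up for illustration.

```r
x <- 1:10
y <- x^2

# Points joined by lines on a logarithmic y-axis, with custom labels and titles.
plot(x, y, type = "b", log = "y",
     xlab = "Index", ylab = "Value (log scale)",
     main = "Points connected by lines",
     sub  = 'type = "b", log = "y"')

# Set up the coordinate system only, then build the plot with low-level calls.
plot(x, y, type = "n", axes = FALSE, xlab = "x", ylab = "x squared")
axis(1, at = x)          # custom x-axis with a tick at every value
axis(2)                  # default y-axis
box()                    # frame around the plot region
lines(x, y, type = "s")  # step function
points(x, y)             # data points on top
```

The second plot illustrates why type="n" together with axes=FALSE is convenient: the scales come from the data, while everything visible is added explicitly.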
Source: R Core Team, https://cran.r-project.org/doc/manuals/r-release/R-intro.html#Graphics
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.