This section introduces ggplot2 graphics. You will see how different the syntax is from base R graphics. You can think of ggplot2 as creating graphs by combining layers with the "+" sign. The default gray background of a ggplot is less suitable for printed publications and can be replaced by adding a theme layer, for example, + theme_minimal().
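As a small illustration of the layer idea, the sketch below builds a scatter plot and then swaps the default gray theme for theme_minimal(). The data frame `toy` is synthetic and only borrows the gapminder column names; it is an assumption made so that the example runs on its own.

```r
library(ggplot2)

# Synthetic stand-in for the lesson data (hypothetical values, not real gapminder figures)
set.seed(1)
toy <- data.frame(gdpPercap = 10^runif(50, min = 2.5, max = 5))
toy$lifeExp <- 20 + 12 * log10(toy$gdpPercap) + rnorm(50, sd = 3)

# Layers are combined with "+": data and mapping first, then a geometry
p_default <- ggplot(data = toy, mapping = aes(x = gdpPercap, y = lifeExp)) +
  geom_point()

# Adding a theme layer replaces the default gray background
p_minimal <- p_default + theme_minimal()

print(p_minimal)
```

Because the theme is just another layer, the same plot object can be reused with a different theme without rebuilding the rest of the plot.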
Key Points
- Use ggplot2 to create plots.
- Think about graphics in layers: aesthetics, geometry, statistics, scale transformation, and grouping.
Transformations and statistics
ggplot2 also makes it easy to overlay statistical models on the data. To demonstrate, we'll go back to our first example:
ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp)) + geom_point()
Currently it's hard to see the relationship between the points due to some strong outliers in GDP per capita. We can change the scale of units on the x axis using the scale functions. These control the mapping between the data values and visual values of an aesthetic. We can also modify the transparency of the points using the alpha aesthetic, which is especially helpful when you have a large amount of data that is heavily clustered.
ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp)) + geom_point(alpha = 0.5) + scale_x_log10()
The scale_x_log10 function applied a log10 transformation to the x-axis scale of the plot, so that each factor of 10 is evenly spaced from left to right. For example, a GDP per capita of 1,000 is the same horizontal distance from 10,000 as 10,000 is from 100,000. This helps to visualize the spread of the data along the x-axis.
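The equal spacing can be checked with a few lines of base R arithmetic. This is only a sketch of the idea behind a log10 axis, not of how ggplot2 implements the scale internally.

```r
# Positions on a log10 axis are proportional to log10 of the value
gdp <- c(1000, 10000, 100000)
positions <- log10(gdp)

# Distance between neighbouring values on the transformed axis
gaps <- diff(positions)
print(gaps)

# On the untransformed axis the second gap is ten times the first
raw_gaps <- diff(gdp)
print(raw_gaps)
```

Both transformed gaps equal one unit, which is why 1,000, 10,000, and 100,000 appear evenly spaced on the plot.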
Tip Reminder: Setting an aesthetic to a value instead of a mapping
Notice that we used geom_point(alpha = 0.5). As the previous tip mentioned, using a setting outside of the aes() function will cause this value to be used for all points, which is what we want in this case. But just like any other aesthetic setting, alpha can also be mapped to a variable in the data. For example, we can give a different transparency to each continent with geom_point(mapping = aes(alpha = continent)).
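The difference between setting and mapping can be inspected directly. The sketch below uses a synthetic data frame `toy` with three made-up continent groups (an assumption for illustration, not the lesson data) and ggplot2's layer_data() helper to look at the alpha values each layer ends up with.

```r
library(ggplot2)

# Synthetic stand-in for the lesson data (hypothetical values)
set.seed(1)
toy <- data.frame(
  gdpPercap = 10^runif(60, min = 2.5, max = 5),
  continent = rep(c("Africa", "Americas", "Asia"), each = 20)
)
toy$lifeExp <- 20 + 12 * log10(toy$gdpPercap) + rnorm(60, sd = 3)

base <- ggplot(data = toy, mapping = aes(x = gdpPercap, y = lifeExp))

# Setting: alpha is a constant outside aes(), so every point gets 0.5
p_set <- base + geom_point(alpha = 0.5)

# Mapping: alpha is inside aes(), so it varies with the continent column
# (ggplot2 warns that alpha is not advised for a discrete variable)
p_map <- base + geom_point(mapping = aes(alpha = continent))

# layer_data() returns the values ggplot2 computed for a layer
set_alpha <- unique(layer_data(p_set, 1)$alpha)
map_alpha <- unique(suppressWarnings(layer_data(p_map, 1))$alpha)

print(set_alpha)
print(length(map_alpha))
```

The set version yields a single alpha value for all points, while the mapped version yields one alpha value per continent.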
We can fit a simple relationship to the data by adding another layer, geom_smooth:
ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp)) + geom_point(alpha = 0.5) + scale_x_log10() + geom_smooth(method="lm")
Output:
`geom_smooth()` using formula 'y ~ x'
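One detail worth knowing is that ggplot2 applies scale transformations before it computes statistics, so on a log10 x scale the line from geom_smooth(method = "lm") corresponds to a regression of lifeExp on log10(gdpPercap). The sketch below compares the plotted line with a model fitted by hand; the data frame `toy` is synthetic and assumed only for illustration.

```r
library(ggplot2)

# Synthetic stand-in for the lesson data (hypothetical values)
set.seed(1)
toy <- data.frame(gdpPercap = 10^runif(50, min = 2.5, max = 5))
toy$lifeExp <- 20 + 12 * log10(toy$gdpPercap) + rnorm(50, sd = 3)

p <- ggplot(data = toy, mapping = aes(x = gdpPercap, y = lifeExp)) +
  geom_point(alpha = 0.5) +
  scale_x_log10() +
  geom_smooth(method = "lm", formula = y ~ x)

# The same model fitted directly on the log10-transformed predictor
fit <- lm(lifeExp ~ log10(gdpPercap), data = toy)

# Layer 2 is the smooth; its x values are stored in log10 units
smooth <- layer_data(p, 2)
manual <- predict(fit, newdata = data.frame(gdpPercap = 10^smooth$x))

# Compare the plotted line with the manual predictions
same_line <- all.equal(smooth$y, unname(manual))
print(same_line)
```

Passing formula = y ~ x explicitly also silences the informational message shown in the output above.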
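We can make the line thicker by setting the size aesthetic in the geom_smooth layer (in ggplot2 3.4.0 and later, the preferred name for line thickness is linewidth, and size on a line triggers a deprecation warning):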
ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp)) + geom_point(alpha = 0.5) + scale_x_log10() + geom_smooth(method="lm", size=1.5)
Output:
`geom_smooth()` using formula 'y ~ x'
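If the same script has to run on both older and newer ggplot2 installations, one option is to pick the argument name based on the installed version. This is a hedged sketch, not part of the lesson itself: the data frame `toy` is synthetic, and the names `has_linewidth` and `smooth_layer` are made up for illustration.

```r
library(ggplot2)

# Synthetic stand-in for the lesson data (hypothetical values)
set.seed(1)
toy <- data.frame(gdpPercap = 10^runif(50, min = 2.5, max = 5))
toy$lifeExp <- 20 + 12 * log10(toy$gdpPercap) + rnorm(50, sd = 3)

# ggplot2 3.4.0 introduced linewidth for line thickness; older versions use size
has_linewidth <- utils::packageVersion("ggplot2") >= "3.4.0"

smooth_layer <- if (has_linewidth) {
  geom_smooth(method = "lm", formula = y ~ x, linewidth = 1.5)
} else {
  geom_smooth(method = "lm", formula = y ~ x, size = 1.5)
}

p <- ggplot(data = toy, mapping = aes(x = gdpPercap, y = lifeExp)) +
  geom_point(alpha = 0.5) +
  scale_x_log10() +
  smooth_layer

print(p)
```

Either branch sets the thickness as a constant outside aes(), so it applies to the whole fitted line.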
There are two ways an aesthetic can be specified. Here we set the size aesthetic by passing it as an argument to geom_smooth. Previously in the lesson we've used the aes function to define a mapping between data variables and their visual representation.