Introduction to using ggplot2 in R Language
ggplot2 is a popular R package that provides a flexible and elegant grammar of graphics for creating a wide range of graphics. It breaks plots down into fundamental components such as data, aesthetics, geometric objects, and statistical transformations. In this post, we will learn how to use ggplot2 in the R language.
There are three strategies for plotting in the R language:
- base graphics, using functions such as plot(), points(), and par()
- lattice graphics, which create nice graphics, although high-dimensional data graphics are not easy to create
- the ggplot2 package, an implementation of the “Grammar of Graphics”
The ggplot2 package is built on the principle of layering graphical elements, which makes it flexible and customizable.
To plot using ggplot2 in the R language, a data.frame object is required as input. One then defines plot layers that stack on top of each other; each layer has visual or text elements that are mapped to aesthetics (such as size, color, and opacity). A few such commands are enough to produce an informative graph.
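The layering idea can be sketched as follows. This is a minimal illustration, assuming the built-in mtcars data frame; the trend-line layer (geom_smooth() with a linear fit) is chosen only for the example and is not part of the original workflow.

```r
library(ggplot2)

# Start from a data frame and the default aesthetic mappings
p <- ggplot(mtcars, aes(x = disp, y = hp))

# First layer: points; size is mapped to a variable, opacity is a constant
p <- p + geom_point(aes(size = wt), alpha = 0.6)

# Second layer (illustrative assumption): a linear trend line drawn over the points
p <- p + geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Each `+` with a geom appended one layer to the plot object
n_layers <- length(p$layers)

# Display the plot
print(p)
```

Because every layer is added with `+`, a layer can be dropped or swapped without touching the rest of the plot definition.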
Before drawing high-quality, informative graphs, one needs to install the ggplot2 package with the command below. If ggplot2 is already installed, there is no need to reinstall it.
```r
install.packages("ggplot2")
```
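To avoid reinstalling a package that is already present, the installation can be made conditional. The following is one possible sketch using base R's requireNamespace(); it assumes a CRAN mirror is reachable when the installation branch runs.

```r
# Install ggplot2 only when it is not already available
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}

# Load the package for the current session
library(ggplot2)
```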
Scatter Plot using ggplot2 in R
Let us draw a scatter plot of the variables $hp$ (horsepower) and $disp$ (displacement) from the mtcars dataset.
```r
# load the ggplot2 library
library(ggplot2)

# mtcars is a built-in data set; it is passed to ggplot() directly,
# so no attach() call is needed
# specify the dataset and map the variables to the axes
p <- ggplot(mtcars, aes(x = disp, y = hp))

# add a plot layer with points
p <- p + geom_point()

# display the plot
print(p)
```
Note that geom, aesthetics, and facets are three important concepts in drawing graphs using ggplot2, where
- geom is the type of plot (the geometric object drawn, such as points or lines)
- aesthetics are the visual properties, such as shape, color, size, and alpha (opacity), to which variables are mapped
- facets are small multiples, each displaying a different subset of the data
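As a rough illustration of how the three concepts combine in a single plot, consider the sketch below. The choice of variables (am for color, cyl for the panels) is arbitrary and made only for this example.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = disp, y = hp)) +            # data and axis mappings
  geom_point(aes(color = factor(am)), alpha = 0.7) +    # geom (points) with aesthetics (color, alpha)
  facet_wrap(~ cyl)                                     # facets: one panel per cyl value

# Display the plot
print(p)
```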
When certain aesthetics are defined, an appropriate legend is chosen and displayed automatically.
```r
p <- ggplot(mtcars, aes(x = disp, y = hp))
p <- p + geom_point(aes(color = mpg))
p
```
Updating Graphs using aesthetics (color, size, and shape)
Graphs can be updated by assigning variables to the aesthetics color, size, and shape. For example:
```r
p <- ggplot(mtcars, aes(x = disp, y = hp))
p <- p + geom_point(aes(color = gear, size = wt))
p
```
Consider the following example. Here, the $gear$ variable is taken as a factor (grouping variable).
```r
p <- ggplot(mtcars, aes(x = disp, y = hp))
p <- p + geom_point(aes(color = as.factor(gear), size = wt))
p
```
Note that the behaviour of the aesthetics is predictable and customizable.
Aesthetic | Discrete Variable | Continuous Variable |
---|---|---|
color | Rainbow of colors | Gradient from dark blue to light blue (default) |
size | Discrete size steps | Linear mapping between point area and value |
shape | Different shapes for each group | Not supported (raises an error) |
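The behaviours in the table can be checked with a short sketch. The gradient end colors below are arbitrary choices that show how the default can be overridden, and the shape example is expected to fail when the plot is built, so it is wrapped in tryCatch().

```r
library(ggplot2)

base <- ggplot(mtcars, aes(x = disp, y = hp))

# Discrete variable: one distinct hue per level of the factor
p_discrete <- base + geom_point(aes(color = factor(cyl)))

# Continuous variable: a color gradient; the end colors here are arbitrary choices
p_continuous <- base +
  geom_point(aes(color = mpg)) +
  scale_color_gradient(low = "red", high = "blue")

# Continuous variable mapped to shape: building the plot is expected to fail,
# so the error is captured instead of stopping the script
p_shape <- base + geom_point(aes(shape = mpg))
shape_result <- tryCatch(
  ggplot_build(p_shape),
  error = function(e) e
)

# TRUE when the build raised an error
shape_failed <- inherits(shape_result, "error")
```

Converting the variable with as.factor() before mapping it to shape, as was done for gear above, avoids the error.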
Faceting in ggplot2
A small multiple (sometimes called faceting, trellis chart, lattice chart, panel chart, or grid chart) is a series or grid of small, similar graphics or charts for comparison purposes. Small multiples usually display different subsets of the data and are useful for exploring conditional relationships between variables, especially when the data set is large.
Let us examine different types of faceting. The following are some examples of splitting the scatter plot into facets:
```r
library(ggplot2)
library(gridExtra)

# create a basic scatter plot
p <- ggplot(mtcars, aes(x = disp, y = hp))
p <- p + geom_point()

# columns are cyl categories
p1 <- p + facet_grid(. ~ cyl)

# rows are cyl categories
p2 <- p + facet_grid(cyl ~ .)

# rows are carb categories
p3 <- p + facet_grid(carb ~ .)

# columns are am categories
p4 <- p + facet_grid(~ am)

# plot all four in one figure
grid.arrange(grobs = list(p1, p2, p3, p4), ncol = 2, top = "Facet Examples")
```
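Two further variations are worth sketching: a grid that uses both rows and columns, and a wrapped layout. The variable choices (am for rows, cyl for columns) and the two-column wrap are assumptions made only for this illustration.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = disp, y = hp)) + geom_point()

# rows are am categories, columns are cyl categories
p_grid <- p + facet_grid(am ~ cyl)

# one panel per cyl value, wrapped into a layout with two columns
p_wrap <- p + facet_wrap(~ cyl, ncol = 2)

# display both plots
print(p_grid)
print(p_wrap)
```

facet_grid() lays panels out on a fixed row-by-column grid, whereas facet_wrap() takes a one-dimensional sequence of panels and wraps it across rows, which tends to suit a single faceting variable with many levels.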