Graphical Representation in R

R offers many ways to represent qualitative and quantitative data graphically. This post covers only the histogram, bar plot, and box plot.

Histogram

To visualize the distribution of a single numeric variable, draw a histogram with the hist() function.

Let us use the data from the iris dataset.

attach(iris)
head(iris)
hist(Petal.Width)

We can enhance the histogram by passing additional arguments to the hist() function. For example,

hist(Petal.Width,
  xlab = "Petal Width",
  ylab = "Frequency",
  main = "Histogram of Petal Width from Iris Data set",
  breaks = 10,
  col = "dodgerblue",
  border = "orange")

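If these arguments are not provided, R chooses defaults for them. In particular, the number of bins is computed with Sturges' rule, and a single number passed to breaks is treated only as a suggestion, because R rounds the breakpoints to "pretty" values. See the YouTube tutorial for a graphical representation of the histogram.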
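To see which breakpoints R actually picks, the histogram can be computed without drawing it. The following is a minimal sketch using only base R; the object names h, h10, edges, and h_exact are chosen here for illustration and are not part of the original post.

```r
# Compute the histogram object without plotting it
h <- hist(iris$Petal.Width, plot = FALSE)
h$breaks   # bin boundaries R chose by default
h$counts   # number of observations falling in each bin

# A single number for breaks is only a suggestion to R
h10 <- hist(iris$Petal.Width, breaks = 10, plot = FALSE)
length(h10$breaks) - 1   # number of bins actually used

# A vector of breakpoints is used exactly as given
edges <- seq(0, 2.5, by = 0.5)
h_exact <- hist(iris$Petal.Width, breaks = edges, plot = FALSE)
h_exact$counts
```

Passing an explicit vector of breakpoints is the safer choice when the bin edges matter, for example when comparing histograms across datasets.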

Barplots

Bar plots are a good choice for visually inspecting a categorical variable (or a numeric variable with a small number of distinct values) or a rank variable. For example,

attach(mtcars)   # mtcars is a built-in dataset, not a package
barplot(table(cyl))
barplot(table(cyl),
  ylab = "Frequency",
  xlab = "Cylinders (4, 6, 8)",
  main = "Number of cylinders",
  col = "green",
  border = "blue")
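It can help to look at the frequency table that barplot() receives, since the bar heights are simply those counts. The sketch below also labels each bar with its count; it relies on barplot() returning the bar midpoints, and the names counts and mids are illustrative assumptions.

```r
# Frequency of each distinct value of cyl
counts <- table(mtcars$cyl)
counts

# barplot() returns the x-coordinates of the bar midpoints
mids <- barplot(counts,
  ylab = "Frequency",
  xlab = "Cylinders",
  ylim = c(0, max(counts) + 2))   # leave room for the labels

# Write each count just above its bar
text(x = mids, y = as.vector(counts), labels = as.vector(counts), pos = 3)
```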

Boxplots

Boxplots are based on the five-number summary (minimum, first quartile, median, third quartile, and maximum). They are used to visually assess the symmetry or skewness of the data and the existence of outliers.

boxplot(mpg)
boxplot(Petal.Width)
boxplot(Petal.Length)
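The numbers behind a boxplot can be obtained directly in base R. As a small sketch, fivenum() returns the five-number summary and boxplot.stats() returns the values used to draw the box and whiskers, along with any points flagged as outliers (by default, those more than 1.5 times the box length beyond the box). The name b is an illustrative choice.

```r
# Five-number summary: minimum, lower hinge, median, upper hinge, maximum
fivenum(mtcars$mpg)

# Statistics used to draw the boxplot
b <- boxplot.stats(mtcars$mpg)
b$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
b$out     # observations drawn as individual outlier points
```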

However, we often want to compare a numerical variable across the levels of a categorical variable. For example,

boxplot(mpg ~ cyl, data = mtcars)

R reads the formula mpg ~ cyl as: "plot the mpg variable against the cyl variable using the dataset mtcars." The symbol ~ is used to specify a formula in R.

boxplot(mpg ~ cyl, data = mtcars,
  xlab = "Cylinders",
  ylab = "Miles per Gallon",
  pch = 20,
  cex = 2,
  col = "pink",
  border = "black")
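
A formula is an ordinary R object, so it can be stored in a variable and reused. The sketch below, with the illustrative names f and grp, inspects the formula and shows how boxplot() splits mpg into one group per value of cyl; the same formula also works with other base R functions such as aggregate().

```r
# Store the formula and inspect it
f <- mpg ~ cyl
class(f)      # the class of a formula object
all.vars(f)   # variables referenced by the formula

# Compute the grouped boxplot statistics without drawing
grp <- boxplot(f, data = mtcars, plot = FALSE)
grp$names   # one group per distinct value of cyl
grp$n       # number of observations in each group

# The same formula gives the median mpg for each cyl group
aggregate(f, data = mtcars, FUN = median)
```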

See How to perform descriptive statistics