# Graphical Representation in R

R offers many ways to represent qualitative and quantitative data graphically. This post covers only the histogram, bar plot, and box plot.

## Histogram

To visualize the distribution of a single numeric variable, draw a histogram with the hist( ) function.

Let us use the data from the iris dataset.

attach(iris)
hist(Petal.Width)

We can enhance the histogram by setting some of the arguments of the hist( ) function. For example,

hist(Petal.Width,
     xlab = "Petal Width",
     ylab = "Frequency",
     main = "Histogram of Petal Width from Iris Data set",
     breaks = 10,
     col = "dodgerblue",
     border = "orange")

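If these arguments are not provided, R falls back on defaults; in particular, the number of breaks is chosen with Sturges' rule. A single number passed to breaks is only a suggestion, because R places the break points at "pretty" values near that count. See the YouTube tutorial for a graphical representation of the histogram.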
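To check which break points R actually chose, one option is to inspect the object that hist( ) returns. The following is a minimal sketch using only base R and the built-in iris data; the variable names (x, h_default, h_ten) are illustrative and not part of the original post.

```r
# Compute the histograms without drawing them (plot = FALSE)
x <- iris$Petal.Width

h_default <- hist(x, plot = FALSE)               # default: Sturges' rule
h_ten     <- hist(x, breaks = 10, plot = FALSE)  # 10 is treated as a suggestion

h_default$breaks       # bin boundaries chosen by R
h_default$counts       # number of observations in each bin
length(h_ten$counts)   # bins actually used, which may differ from 10
nclass.Sturges(x)      # number of classes suggested by Sturges' rule
```

Comparing length(h_ten$counts) with the requested value shows whether R followed the suggestion exactly or adjusted it.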

## Barplots

Bar plots are a good choice for visual inspection of a categorical variable (or a numeric variable with a finite number of values) or a rank variable. For example,

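attach(mtcars)   # mtcars is a built-in dataset, not a package, so library(mtcars) would fail
barplot(table(cyl))   # table() counts the cars in each cylinder group
barplot(table(cyl),
        ylab = "Frequency",
        xlab = "Cylinders (4, 6, 8)",
        main = "Number of cylinders",
        col = "green",
        border = "blue")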
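The heights of the bars come from the frequency table that table( ) builds. As a small sketch (base R only; the names counts and props are illustrative), the table can be inspected directly and converted to relative frequencies before plotting:

```r
# Frequency table of the number of cylinders in mtcars
counts <- table(mtcars$cyl)
counts

# Convert the counts to proportions that sum to 1
props <- prop.table(counts)
round(props, 3)

# Bar plot of relative frequencies instead of raw counts
barplot(props,
        ylab = "Relative frequency",
        xlab = "Cylinders (4, 6, 8)",
        main = "Share of cars by number of cylinders",
        col = "green",
        border = "blue")
```

Referring to the column as mtcars$cyl avoids relying on attach( ), which can be safer when several datasets are in use.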

## Boxplots

Boxplots are used to assess symmetry (a rough check of normality), skewness, and the existence of outliers in the data, based on the five-number summary (minimum, lower quartile, median, upper quartile, maximum).

boxplot(mtcars$mpg)     # miles per gallon from the mtcars dataset
boxplot(Petal.Width)    # from the attached iris dataset
boxplot(Petal.Length)
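The numbers behind a boxplot can be computed without drawing anything. The sketch below assumes base R only and uses fivenum( ) and boxplot.stats( ); the names x, five, and bs are illustrative.

```r
x <- mtcars$mpg

# Five-number summary: minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(x)
names(five) <- c("min", "lower hinge", "median", "upper hinge", "max")
five

# Statistics that boxplot() uses (default whisker coefficient is 1.5)
bs <- boxplot.stats(x)
bs$stats   # lower whisker end, lower hinge, median, upper hinge, upper whisker end
bs$out     # observations beyond the whiskers, drawn as individual points
```

The box spans the two hinges, the line inside it marks the median, and any values listed in bs$out appear as separate points in the plot.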


However, we often compare a numerical variable for different values of a categorical variable. For example,

boxplot(mpg ~ cyl, data = mtcars)

R reads the formula mpg ~ cyl as: “plot the mpg variable against the cyl variable using the dataset mtcars.” The symbol ~ is used to specify a formula in R.

boxplot(mpg ~ cyl, data = mtcars,
        xlab = "Cylinders",
        ylab = "Miles per Gallon",
        pch = 20,
        cex = 2,
        col = "pink",
        border = "black")
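To see how the formula splits the data, the grouped statistics can be retrieved without plotting. This is a minimal sketch using base R; the names grp and med are illustrative.

```r
# Compute the per-group boxplot statistics without drawing the plot
grp <- boxplot(mpg ~ cyl, data = mtcars, plot = FALSE)

grp$names   # one group per distinct value of cyl
grp$n       # number of cars in each group
grp$stats   # one column per group; row 3 holds the medians

# The same medians computed directly, grouping mpg by cyl
med <- tapply(mtcars$mpg, mtcars$cyl, median)
med
```

Each column of grp$stats corresponds to one box in the plot, so the formula mpg ~ cyl effectively means "summarize mpg separately for each value of cyl".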

See How to perform descriptive statistics