Recall that in R, a factor is a variable that partitions observations into groups. A single factor can be used to create a simple frequency table, while a pair of factors defines a two-way cross-classification (a contingency table). The table() function creates such frequency tables from one or more factors of equal length.
Frequency Table in R of a Categorical/Group/Factor Variable
We will use the “mtcars” dataset. For the variable $gear$, let us create a frequency table using the table() function, which counts how many times each gear code occurs in the data vector. For example,
attach(mtcars)
freq <- table(gear)
freq
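The freq object holds the frequency of each gear code in the sample. Note that the frequencies are ordered and labeled by the levels attribute of the factor. Because gear is stored as a numeric vector, table() converts it to a factor internally, so the levels are the distinct gear codes in increasing order.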
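To see how the levels attribute controls the order and labels of the table, here is a minimal sketch that builds an explicit factor from gear. The name gear_f, the reversed level order, and the text labels are illustrative choices and are not part of the mtcars data.

```r
# Build an explicit factor from the gear column of the built-in mtcars data.
# The level order and the labels below are illustrative assumptions.
gear_f <- factor(mtcars$gear,
                 levels = c(5, 4, 3),
                 labels = c("five", "four", "three"))

freq_f <- table(gear_f)
freq_f          # counts are listed in level order: five, four, three
names(freq_f)   # the labels are taken from levels(gear_f)
```

Changing the levels argument reorders the table without touching the underlying data.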
Frequency Distribution of a Continuous Variable
One can also create a frequency distribution table for a continuous variable. Suppose we want a frequency table of the $mpg$ variable from the mtcars data set. We first need to define the cut points (bins) that form the classes of the frequency table. For example,
table(cut(mpg, 10+5*(0:5)))
## Output
(10,15] (15,20] (20,25] (25,30] (30,35]
      6      12       8       2       4
The cut() function is used to split the continuous data vector into groups. The groups are defined by a sequence of break points created using 10+5*(0:5), that is,
10+5*(0:5)
## Output
10 15 20 25 30 35
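The cut() function assigns each observation of mpg to one of the intervals defined by breaks = 10+5*(0:5), and table() then counts the observations that fall in each interval. By default, each interval is open on the left and closed on the right, so a value of exactly 15 falls in (10,15]. The resulting frequency table is the one shown in the output above.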
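The two steps, binning and counting, can be separated to make clear what each function does. The following is a small sketch using the built-in mtcars data; the object names breaks, bins, and mpg_freq are chosen here for illustration only.

```r
# Break points for the classes: 10 15 20 25 30 35
breaks <- 10 + 5 * (0:5)

# cut() returns a factor with one interval label per observation;
# it does not count anything by itself
bins <- cut(mtcars$mpg, breaks = breaks)

levels(bins)    # the five interval labels
length(bins)    # same length as mtcars$mpg

# table() does the counting
mpg_freq <- table(bins)
mpg_freq
```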
Creating Graphs of a Frequency Table
For the frequency table created above, one can easily create different graphical representations, such as pie charts and bar plots of the frequency table. For example,
freq <- table(cut(mpg, 10+5*(0:5)))
pie(freq)
barplot(freq)
plot(freq)
# hist() expects the raw data rather than the table of counts
hist(mpg, breaks = 10+5*(0:5))
Note that when table() is given $k$ factor arguments, the result is a $k$-way array of frequencies.
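For the case of two factors, a minimal sketch of a two-way table is given below. It cross-classifies gear by cyl, another column of mtcars; the choice of cyl and the object name two_way are assumptions made for this example.

```r
# Cross-classify number of gears by number of cylinders
two_way <- table(gear = mtcars$gear, cyl = mtcars$cyl)
two_way

dim(two_way)               # one dimension per factor: 3 gear levels by 3 cyl levels
margin.table(two_way, 1)   # row totals reproduce the one-way table of gear
```

Each additional factor passed to table() adds one more dimension to the resulting array.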