R Language Reference Guide III: A Quick Guide

This post is the third part of the R Language Reference Guide. It covers subsetting vectors, lists, matrices, and data frames in the R language.

R Language A Quick Reference

The R Language Reference Guide helps you learn R programming through short descriptions of widely used commands. It lets beginners and intermediate users of the R programming language look up functions quickly. The reference is organized into groups of related commands. Let us start with the R Language Reference Guide – III.

This R Language Quick Reference contains R commands about subsetting in R, such as subsetting of vectors, matrices, lists, data frames, arrays, and factors. It also discusses setting the different properties related to R language data types.

Subsetting Vectors: Quick R Language Reference

The following are ways to subset or slice the values from a vector.

| R Command | Short Description |
| --- | --- |
| `x[1:5]` | Select elements of $x$ by index |
| `x[-(1:5)]` | Exclude elements of $x$ by index |
| `x[c(TRUE, FALSE)]` | Select the elements of $x$ at positions where the logical index is `TRUE` (the index is recycled along $x$) |
| `x[c("a", "b")]` | Select elements of $x$ by name |
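As a minimal sketch of the four forms listed for vectors, the following uses a small named vector. The data and the variable names are made up for illustration.

```r
# A small named vector (hypothetical example data)
x <- c(a = 10, b = 20, c = 30, d = 40, e = 50, f = 60)

first_five   <- x[1:5]             # elements at positions 1 to 5
without_five <- x[-(1:5)]          # everything except positions 1 to 5
every_other  <- x[c(TRUE, FALSE)]  # logical index is recycled: positions 1, 3, 5
by_name      <- x[c("a", "b")]     # elements named "a" and "b"

print(every_other)
```

Because the logical index `c(TRUE, FALSE)` is shorter than `x`, R recycles it, so every second element is kept.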

Subsetting Lists in R Language

The following methods are used to subset or slice a list in R Language.

| R Command | Short Description |
| --- | --- |
| `x[1:5]` | Extract a sublist of the list $x$ |
| `x[-(1:5)]` | Extract a sublist by excluding elements of list $x$ |
| `x[c(TRUE, FALSE)]` | Extract a sublist with logical subscripts |
| `x[c("a", "b")]` | Extract a sublist by name |
| `x[[2]]` | Extract an element of the list $x$ |
| `x[["a"]]` | Extract the element with the name "a" from list $x$ |
| `x$a` | Extract the element with the name "a" from list $x$ |
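The difference between single and double brackets is easiest to see on a small example. The sketch below uses a hypothetical three-element list: single brackets return a list, while `[[ ]]` and `$` return the element itself.

```r
# A small named list (hypothetical example data)
x <- list(a = 1:3, b = "text", c = TRUE)

sub_list    <- x[c("a", "b")]  # still a list, with two elements
elem_pos    <- x[[2]]          # the second element itself: "text"
elem_name   <- x[["a"]]        # the element named "a": 1 2 3
elem_dollar <- x$a             # same element, using the $ shortcut

print(class(sub_list))   # "list"
print(class(elem_name))  # "integer"
```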

Subsetting Matrices in R: A Quick Reference

To subset or extract certain elements from a matrix, use the commands described below.

| R Command | Short Description |
| --- | --- |
| `x[i, j]` | Extracts the elements of matrix $x$ specified by row $i$ and column $j$ |
| `x[i, j] = v` | Sets or resets the elements of matrix $x$ specified by row $i$ and column $j$ |
| `x[i, ]` | Extracts the $i$th row of matrix $x$ |
| `x[i, ] = v` | Sets or resets the $i$th row of matrix $x$ |
| `x[ , j]` | Extracts the $j$th column of matrix $x$ |
| `x[ , j] = v` | Sets or resets the $j$th column of matrix $x$ |
| `x[i]` | Subsets matrix $x$ as a vector |
| `x[i] = v` | Sets or resets the $i$th elements (treated as a vector operation) |
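The following sketch applies these forms to a small 2 x 3 matrix. The matrix and the variable names are illustrative only. Note that a single index such as `x[4]` treats the matrix as a vector stored column by column.

```r
# A 2 x 3 integer matrix, filled column by column (hypothetical example data)
x <- matrix(1:6, nrow = 2, ncol = 3)
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

elem   <- x[2, 3]  # element in row 2, column 3
row1   <- x[1, ]   # first row as a vector
col2   <- x[, 2]   # second column as a vector
fourth <- x[4]     # fourth element in column-major order

x[1, ]  <- c(10L, 30L, 50L)  # reset the first row
x[2, 2] <- 0L                # reset a single element
print(x)
```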

Subsetting a Data Frame in R Language

One can easily subset or slice a Data Frame in R.

| R Command | Short Description |
| --- | --- |
| `df[i, j]` | Matrix-style subsetting of a data frame, specified by row $i$ and column $j$ |
| `df[i, j] = v` | Sets or resets a subset of a data frame to $v$ |
| `subset(df, subset = i)` | Subset of the cases/observations of a data frame for which the logical condition $i$ is true |
| `subset(df, select = i)` | Subset of the variables/columns $i$ of a data frame |
| `subset(df, subset = i, select = j)` | Subset of the cases selected by $i$ and the variables selected by $j$ |
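A short sketch of these data frame commands follows. The data frame `df` and its columns `id`, `score`, and `group` are invented for the example. In `subset()`, the `subset` argument is a logical condition on the rows, and `select` names the columns to keep.

```r
# A small data frame (hypothetical example data)
df <- data.frame(
  id    = 1:5,
  score = c(55, 72, 90, 64, 81),
  group = c("a", "b", "a", "b", "a")
)

cell <- df[2, "score"]                      # row 2 of column "score"
high <- subset(df, subset = score > 70)     # rows where score exceeds 70
cols <- subset(df, select = c(id, score))   # only the id and score columns
both <- subset(df, subset = group == "a", select = c(id, score))

df[2, "score"] <- 75                        # reset a single cell
print(both)
```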

R Language: A Quick Reference – I

https://gmstat.com

R Language Quick Reference Guide II

The article is about the R Language Quick Reference Guide. This Quick Reference will help you to learn about creating vectors, matrices, data frames, lists, and factors. You will also learn about setting properties of different data types in R Language.

R Language Quick Reference Guide


The R Language Quick Reference Guide helps you learn R programming through short descriptions of widely used commands. It lets beginners and intermediate users of the R programming language look up functions quickly. This Quick Reference is organized into groups of related commands. Let us start with the R Language Quick Reference Guide – II.

This R Language Quick Reference Guide contains R commands about creating vectors, matrices, lists, data frames, arrays, and factors. It also discusses setting the different properties related to R language data types.

Creating Vectors in R Language

Creating vectors is a fundamental task in the R language. One can easily create a vector of numbers, character strings, complex numbers, or logical values, and can concatenate elements into a vector. The following are different commands for creating vectors in R.

| R Command | Short Description |
| --- | --- |
| `c(a1, a2, …, an)` | Concatenates all $n$ elements into a vector |
| `logical(n)` | Creates a logical vector of length $n$ (containing `FALSE`) |
| `numeric(n)` | Creates a numeric vector of length $n$ (containing zeros) |
| `character(n)` | Creates a character vector of length $n$ (containing empty strings) |
| `complex(n)` | Creates a complex vector of length $n$ (containing zeros) |
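The constructors in the table can be tried directly. The sketch below creates one vector of each kind with length 3. The variable names are arbitrary.

```r
v  <- c(1.5, 2, 3)   # concatenate three numbers into a vector
l  <- logical(3)     # FALSE FALSE FALSE
n  <- numeric(3)     # 0 0 0
ch <- character(3)   # "" "" ""
z  <- complex(3)     # 0+0i 0+0i 0+0i

print(list(l, n, ch, z))
```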

Creating Lists in R Language

Lists are important in R because a list can store elements of different types, including other lists. The vector() function can also be used to create a list of $k$ elements. The following are ways of creating lists in the R language.

| R Command | Short Description |
| --- | --- |
| `list(e1, e2, …, ek)` | Combines all $k$ elements as a list |
| `vector("list", k)` | Creates a list of length $k$ (the elements are all `NULL`) |
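A brief sketch of both ways of creating a list is given below, with made-up contents. In base R, the first argument of `vector()` is the mode and the second is the length, so an empty list of length 3 is written `vector("list", 3)`.

```r
# A list can mix types and can contain another list
mixed <- list(1:3, "text", list(TRUE, NULL))

# A list of length 3 whose elements are all NULL
empty <- vector("list", 3)

print(length(mixed))
print(empty)
```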

Creating Matrices in R Language

Two-dimensional data can be created using the matrix command in R.

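| R Command | Short Description |
| --- | --- |
| `matrix(x, nrow = r, ncol = c)` | Creates a matrix from $x$, filled column by column (column-major order, the default) |
| `matrix(x, nrow = r, ncol = c, byrow = TRUE)` | Creates a matrix from $x$, filled row by row (row-major order) |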
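The fill order is controlled by the `byrow` argument of `matrix()`. The sketch below builds the same six values into a 2 x 3 matrix both ways. The variable names are illustrative.

```r
x <- 1:6

by_col <- matrix(x, nrow = 2, ncol = 3)                # default: fill column by column
by_row <- matrix(x, nrow = 2, ncol = 3, byrow = TRUE)  # fill row by row

print(by_col)  # first row is 1 3 5
print(by_row)  # first row is 1 2 3
```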

Creating Factors in R Language

To represent categorical variables, R has the concept of factors. Every factor has a set of levels, and the levels may be ordered.

| R Command | Short Description |
| --- | --- |
| `factor(x)` | Creates a factor from the values of variable $x$ |
| `factor(x, levels = l)` | Creates a factor with the given level set $l$ from the values of the variable $x$ |
| `ordered(x)` | Creates an ordered factor from the values of the variable $x$ |
| `levels(x)` | Gives the levels of a factor or ordered factor |
| `levels(x) = v` | Sets or resets the levels of a factor or ordered factor |
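The following sketch shows these factor commands on a small character vector. The values and level names are invented for the example. By default, `factor()` sorts the levels, so an explicit `levels` argument is needed when a different order is wanted.

```r
x <- c("low", "high", "medium", "low")

f     <- factor(x)                                       # levels in sorted order
f_lv  <- factor(x, levels = c("low", "medium", "high"))  # levels in a chosen order
f_ord <- ordered(x, levels = c("low", "medium", "high")) # ordered factor

print(levels(f))            # "high" "low" "medium"
print(f_ord[1] < f_ord[2])  # comparisons work on ordered factors

levels(f_lv) <- c("L", "M", "H")  # rename the levels in place
print(f_lv)
```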

Creating a Data Frame in R Language

A data frame is a tabular data format used for statistical data analysis. The format of the data is like data entered in spreadsheets for data analysis.

| R Command | Short Description |
| --- | --- |
| `data.frame(n1 = x1, n2 = x2, …)` | Creates a data frame |
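As a small illustration, the sketch below builds a data frame from three vectors of equal length. The column names `name`, `age`, and `member` are hypothetical.

```r
# Each argument becomes one column; all columns have the same length
df <- data.frame(
  name   = c("A", "B", "C"),
  age    = c(21, 35, 28),
  member = c(TRUE, FALSE, TRUE)
)

str(df)  # compact summary of the columns and their types
```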

R Language Data Type Properties

Every data object has properties that can be queried or set. These properties can be used to find the length of a vector, the number of rows and columns of a matrix or data frame, and the names of the rows and columns of a matrix or data frame.

| R Command | Short Description |
| --- | --- |
| `length(x)` | Gives the number of elements in a variable $x$ |
| `mode(x)` | Gives the mode (data type) of the variable $x$ |
| `nrow(x)` | Gives the number of rows of a matrix, array, or data frame $x$ |
| `ncol(x)` | Gives the number of columns (variables) of a matrix, array, or data frame $x$ |
| `dim(x)` | Gives the dimensions (number of rows and columns) of a matrix, data frame, or array $x$ |
| `row(x)` | Matrix of row indices for matrix-like object $x$ |
| `col(x)` | Matrix of column indices for matrix-like object $x$ |
| `rownames(x)` | Gets the row names of the matrix-like object $x$ |
| `rownames(x) = v` | Sets the row names of the matrix-like object $x$ to $v$ |
| `colnames(x)` | Gets the column names of the matrix-like object $x$ |
| `colnames(x) = v` | Sets the column names of the matrix-like object $x$ to $v$ |
| `dimnames(x)` | Gets both the row and column names (of a matrix, array, or data frame) |
| `dimnames(x) = list(rn, cn)` | Sets both the row and column names |
| `names(x)` | Gives the names of $x$ |
| `names(x) = v` | Sets or resets the names of $x$ to $v$ |
| `names(x) = NULL` | Removes the names from $x$ |
| `row.names(df)` | Gives the observation names from a data frame |
| `row.names(df) = v` | Sets or resets the observation names of a data frame |
| `names(df)` | Gives the variable names from a data frame |
| `names(df) = v` | Sets or resets the variable names of a data frame |
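A few of these property functions are sketched below on a small matrix and a small vector. The objects and names are made up for illustration.

```r
x <- matrix(1:6, nrow = 2, ncol = 3)

print(length(x))  # 6 elements in total
print(mode(x))    # "numeric"
print(dim(x))     # 2 rows, 3 columns

# Set the row and column names, then read them back together
rownames(x) <- c("r1", "r2")
colnames(x) <- c("a", "b", "c")
print(dimnames(x))

# Names on a plain vector can be set and removed again
v <- c(1, 2, 3)
names(v) <- c("one", "two", "three")
names(v) <- NULL
```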


https://itfeature.com, https://gmstat.com

Factors in R (Categorical Data): Learning Made Easy

Factors are used to represent categorical data in the R language. Factors in R can be ordered or unordered. One can think of a factor as an integer vector in which each integer has a label. Modeling functions such as lm() and glm() treat factors specially. Factors are the data objects used for categorical data, and their values are stored as levels. The underlying values can be either strings or integers.

Using factors with labels is better than using integers because factors are self-describing: a variable that has the values "Male" and "Female" is easier to read than a variable that has the values 1 and 2.

Creating a Simple Factor in R

The following example creates a simple factor variable that has two levels.

# Simple factor with two levels
x <- factor(c("yes", "yes", "no", "yes", "no"))
# computes frequency of factors
table(x)
# strips out the class
unclass(x)

The order of the levels can be set using the levels argument to factor(). This can be important in linear modeling because the first level is used as the baseline level.

x <- factor(c("yes","yes","no","yes","no"), levels = c("yes","no"))
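To see the effect of the levels argument, the following sketch compares the default level order with an explicit one. The names `x_default` and `x_custom` are chosen only for this illustration.

```r
answers <- c("yes", "yes", "no", "yes", "no")

x_default <- factor(answers)                           # levels sorted: "no" first
x_custom  <- factor(answers, levels = c("yes", "no"))  # "yes" set as first level

print(levels(x_default))     # "no"  "yes"
print(levels(x_custom))      # "yes" "no"
print(as.integer(x_custom))  # integer codes follow the level order: 1 1 2 1 2
```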

Naming Factors in R

Factor levels can be given new names using the labels argument of factor(). The labels argument replaces the old values of the variable with new ones. For example,

# numeric labels: "yes" becomes 1 and "no" becomes 2
x <- factor(c("yes", "yes", "no", "yes", "no"), levels = c("yes", "no"), labels = c(1, 2))
# character labels
x <- factor(c("yes", "yes", "no", "yes", "no"), levels = c("yes", "no"), labels = c("Level-1", "Level-2"))

x <- factor(c("yes", "yes", "no", "yes", "no"), levels = c("yes", "no"), labels = c("group-1", "group-2"))

Suppose you have a factor variable with numerical values and you want to compute the mean. Calling mean() on the numeric vector returns the average value, but calling mean() on the factor returns NA with a warning message. To calculate the mean of the original numeric values of the factor f, you have to convert the factor back to numbers through its levels using the levels() function. For example,

# vector
v <- c(10,20,20,50,10,20,10,50,20)
# vector converted to factor
f <- factor(v)
# mean of the vector
mean(v)

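# mean of factor: returns NA with a warning
mean(f)
# convert the factor back to its original numeric values, then take the mean
mean(as.numeric(levels(f)[f]))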
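The conversion has to go through the levels because applying as.numeric() directly to a factor returns the internal integer codes, not the original values. A minimal sketch with the same data follows. The variable names `codes`, `values`, and `values2` are illustrative.

```r
v <- c(10, 20, 20, 50, 10, 20, 10, 50, 20)
f <- factor(v)

codes   <- as.numeric(f)                # internal codes 1, 2, 3, not the values
values  <- as.numeric(levels(f))[f]     # look up each code in the numeric levels
values2 <- as.numeric(as.character(f))  # alternative: go through the labels

print(codes)
print(values)
print(mean(values) == mean(v))
```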

Use of cut() Function in R

The cut() function in R can also be used to convert a numeric variable into a factor. The breaks argument describes how ranges of numbers are converted to factor values. If the breaks argument is set to a single number, the resulting factor is created by dividing the range of the variable into that number of equal-length intervals. However, if a vector of values is given to the breaks argument, the values in the vector are used as the breakpoints. The number of levels of the resulting factor will be one less than the number of values in the vector provided to the breaks argument. For example,

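# use the mpg column of the built-in mtcars data set
mpg <- mtcars$mpg
# three equal-length intervals
cut(mpg, breaks = 3)
# intervals defined by explicit breakpoints
factors <- cut(mpg, breaks = c(10, 18, 25, 30, 35))
table(factors)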

You will notice that the default label for factors produced by the cut() function in R contains the actual range of values that were used to divide the variable into factors.
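The relationship between the breakpoints and the resulting levels can be checked with a short sketch on the built-in mtcars data. The names `breaks`, `groups`, and `three` are chosen for this illustration.

```r
mpg    <- mtcars$mpg
breaks <- c(10, 18, 25, 30, 35)

groups <- cut(mpg, breaks = breaks)  # five breakpoints give four intervals
three  <- cut(mpg, breaks = 3)       # a single number gives that many intervals

print(nlevels(groups))  # one less than the number of breakpoints
print(levels(groups))   # default labels show the interval ranges
print(nlevels(three))
```

By default each interval is open on the left and closed on the right, which is why the labels have the form `(10,18]`.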

Learn about Data Frames in R

https://itfeature.com