Vector in R Language

A vector in R is an ordered collection of values of the same type, most often numbers. A vector can be thought of as a single column or a single row of a spreadsheet. Strictly speaking, though, an R vector has no row or column orientation: it is simply an ordered sequence, and each element can be referred to by its index.

In R programming, vectors are the most basic data structure. Whether you’re new to R or brushing up on concepts, understanding vectors is essential, because they are the building blocks for more complex structures such as matrices, lists, and data frames.

Key Characteristics of Vectors

  • Support Vectorized Operations: Arithmetic and logical operations can be applied element-wise without loops.
  • Homogeneous: All elements must be of the same data type (such as numeric, character, logical, etc.).
  • Indexed: Elements can be accessed using indices (starting at 1).
  • Dynamic: Vectors can grow or shrink in size.
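
The four characteristics listed above can be checked directly at the console. The following is a minimal sketch; the variable names v and mixed are chosen only for illustration.

```r
# Vectorized operations: no explicit loop is needed
v <- c(2, 4, 6)
v * 10          # 20 40 60
v > 3           # FALSE TRUE TRUE

# Homogeneous: mixing types coerces every element to one common type
mixed <- c(1, "a", TRUE)
class(mixed)    # "character"

# Indexed: the first element has index 1
v[1]            # 2

# Dynamic: a vector can grow or shrink
v <- c(v, 8)    # append an element, length becomes 4
v <- v[-1]      # drop the first element, length becomes 3
length(v)       # 3
```

Note that growing a vector with c() creates a new vector each time, so for long loops it is usually better to allocate the full length in advance.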

Types of Vectors in R Language

R supports several types of vectors based on the data they store:

(a) Numeric Vectors: Store real numbers (decimals or integers). For example: > c(1.5, 2.3, 4.0)

(b) Integer Vectors: Store whole numbers (explicitly defined with L). For example: > c(1L, 2L, 3L)

(c) Logical Vectors: Store TRUE, FALSE, or NA (missing value). For example: > c(TRUE, FALSE, NA)

(d) Character Vectors: Store text (strings). For example: > c("apple", "banana", "cherry")

(e) Complex Vectors: Store complex numbers. For example: > c(1+2i, 3+4i)
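
The storage type of each of these vectors can be inspected with typeof(). The sketch below reuses the example vectors from the list above; the variable names are illustrative only.

```r
num <- c(1.5, 2.3, 4.0)
int <- c(1L, 2L, 3L)
lgl <- c(TRUE, FALSE, NA)
chr <- c("apple", "banana", "cherry")
cpl <- c(1+2i, 3+4i)

typeof(num)   # "double"
typeof(int)   # "integer"
typeof(lgl)   # "logical"
typeof(chr)   # "character"
typeof(cpl)   # "complex"

# Both doubles and integers count as numeric
is.numeric(num)   # TRUE
is.numeric(int)   # TRUE
```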

Creating Vectors in R

One can create vectors in R Language using:

  • c() function
  • seq() function
  • : operator

# Creating a vector with the c() function

c(1, 4, 6, 7, 9)

c(1:5, 10)

A vector in R can also be created using the seq() function, which generates a regular sequence of numbers.

# Create a vector using seq() in R

seq(1, 10, by = 2)            # 1 3 5 7 9
seq(0, 50, length.out = 11)   # 0, 5, 10, ..., 50
seq(1, 50, length.out = 11)   # 11 equally spaced values from 1 to 50 (step 4.9)

A vector can also be created in R using the colon (:) operator. Some examples follow:

# Create vector in R using : operator

1:10

## Output
[1]  1  2  3  4  5  6  7  8  9 10

5:1

## Output
[1] 5 4 3 2 1

Creating Non-Integer Sequences in R

Sequences are not limited to integers: the step passed to seq() can be any real number.

# non-integer sequences
seq(0, 100*pi, by = pi)

Assigning Vector to Variable

One can assign a vector to a variable using the assignment operator (<-) or the equals sign (=). For example:

a <- 1:5
b <- seq(15, 3, length.out = 5)
ab <- a * b   # element-wise product; the name ab avoids reusing the name of the c() function
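
Multiplying two vectors works element by element, and a shorter vector is recycled to match a longer one. The sketch below illustrates both points; the names ab and shifted are assumptions introduced for this example.

```r
a <- 1:5
b <- seq(15, 3, length.out = 5)   # 15 12 9 6 3

# Element-wise product of two vectors of equal length
ab <- a * b
ab                                # 15 24 27 24 15

# Recycling: the single value 10 is reused for every element of a
a + 10                            # 11 12 13 14 15

# Recycling a length-2 vector along a length-6 vector
shifted <- 1:6 + c(10, 20)
shifted                           # 11 22 13 24 15 26
```

R issues a warning when the longer length is not a multiple of the shorter one, which is often a sign of a mistake.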

Performing Computation on Vectors

There are a lot of built-in functions that can be used to perform different computations on vectors. For example,

a <- 1:5

# compute the total of elements of a vector
sum(a)

## Output
15

# product of elements of a vector
prod(a)

## Output
120

# average of the vector
mean(a)

## Output
3

# standard deviation and variance of a vector
sd(a)

## Output 
1.581139

var(a)

## Output
2.5

Indexing and Slicing Vectors

One can extract the elements of a vector by using square brackets and the index of the component of the vector.

V <- seq(0, 100, by = 10)
V[] # gives all the elements of the vector

## Output
[1]   0  10  20  30  40  50  60  70  80  90 100

V[5] # 5th element of vector V

## Output
[1] 40

V[c(2, 4, 6, 8)] # 2nd, 4th, 6th, and 8th elements

## Output
[1] 10 30 50 70

V[-c(2, 4, 6, 8)] # elements except 2nd, 4th, 6th, and 8th element

## Output
[1]   0  20  40  60  80  90 100
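
Besides positive and negative positions, elements can be selected with a logical condition or by name. This is a short sketch of those two styles; the scores vector and its names are hypothetical and used only for illustration.

```r
V <- seq(0, 100, by = 10)

# Logical indexing: keep the elements for which the condition is TRUE
V[V > 50]          # 60 70 80 90 100
V[V %% 20 == 0]    # 0 20 40 60 80 100

# which() returns the positions that satisfy the condition
which(V > 50)      # 7 8 9 10 11

# Named elements can be selected by name
scores <- c(math = 90, stats = 85)
scores["stats"]

# First and last few elements
head(V, 3)         # 0 10 20
tail(V, 2)         # 90 100
```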

Updating Vector Elements

Specific elements of a vector can be updated by assigning new values to the selected indices.

V[c(2, 4)] <- c(500, 600) # the 2nd and 4th elements are updated to 500 and 600
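
Updates can also be driven by a condition, and assigning beyond the current length extends the vector. The following sketch assumes the same vector V as above; the cap at 100 and the position 13 are arbitrary values chosen for illustration.

```r
V <- seq(0, 100, by = 10)
V[c(2, 4)] <- c(500, 600)

# Conditional update: cap every value above 100 at 100
V[V > 100] <- 100
V                  # 0 100 20 100 40 50 60 70 80 90 100

# Assigning past the end extends the vector; the gap is filled with NA
V[13] <- 120
length(V)          # 13
V[12]              # NA
```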

https://itfeature.com

https://gmstat.com

Special Vector Values

The following are special vector values used in R Language.

| Special Value | Meaning       | Example      |
|---------------|---------------|--------------|
| NA            | Missing value | c(1, NA, 3)  |
| NaN           | Not a Number  | 0/0 → NaN    |
| Inf           | Infinity      | 1/0 → Inf    |
| NULL          | Empty object  | c() → NULL   |
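
Each special value has a matching test function, and most summary functions accept na.rm = TRUE to skip missing values. A minimal sketch, with x as an illustrative vector:

```r
x <- c(1, NA, 3)

is.na(x)               # FALSE TRUE FALSE
sum(x)                 # NA, because one element is missing
sum(x, na.rm = TRUE)   # 4
mean(x, na.rm = TRUE)  # 2

is.nan(0/0)            # TRUE
is.na(0/0)             # TRUE: NaN is also treated as missing
is.finite(1/0)         # FALSE
is.null(c())           # TRUE
length(NULL)           # 0
```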

Important Points About Vectors

The important points about vectors in R language are:

  • Data Types: Vectors can hold logical, integer, double, character, complex, or raw data.
  • Creation: Use the c() function to combine elements into a vector.
  • Accessing Elements: Use indexing (square brackets) to access individual elements.
  • Vector Operations: Perform arithmetic, logical, and comparison operations on vectors.
  • Vectorization: R excels at vectorized operations, making calculations efficient.

Learn How to Create User Defined Functions in R

Introduction to User Defined Functions in R

User-defined functions are easy to create in R. They let you write custom blocks of code that can be reused throughout an analysis. This article presents some useful examples of how to write user-defined functions in R, which can make code more efficient and more elegant.

Assigning Function to a Variable

Example 1: Create a simple function and assign it to a variable name, as with any other object.

f <- function(x, y = 0) {
  z <- x + y
  z
}

x <- rnorm(10)
f(x)          # y takes its default value 0
f(x, y = 2)   # adds 2 to every element of x

Regression Coefficients

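Example 2: Given an $n\times 1$ response vector $y$ and an $n\times p$ design matrix $X$, the least-squares estimate of the regression coefficients is $\hat{\beta} = (X'X)^{-1}X'y$, where $(X'X)^{-1}$ is the inverse (or, when $X'X$ is singular, a generalized inverse) of $X'X$.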

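Beta <- function(x, y) {
  X <- qr(x)       # QR decomposition of the design matrix
  qr.coef(X, y)    # least-squares coefficients
}

# Regress mpg on hp and wt; the column of 1s is the intercept
xmat <- with(mtcars, cbind(1, hp, wt))
yvar <- mtcars$mpg
regcoef <- Beta(xmat, yvar)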

The qr() function computes the QR decomposition of a matrix. The QR decomposition is used to solve the equation $Ax=b$ for a given matrix $A$ and vector $b$. It is very useful for computing regression coefficients and for applying the Newton-Raphson algorithm.
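
One way to check a QR-based coefficient function is to compare its result with the normal-equations formula and with lm(). The sketch below redefines Beta() so that it is self-contained; the names b_qr, b_ne, and b_lm are assumptions introduced for the comparison.

```r
Beta <- function(x, y) {
  qr.coef(qr(x), y)   # least-squares coefficients via the QR decomposition
}

X <- cbind(Intercept = 1, hp = mtcars$hp, wt = mtcars$wt)
y <- mtcars$mpg

b_qr <- Beta(X, y)

# Normal equations: solve (X'X) b = X'y
b_ne <- drop(solve(crossprod(X), crossprod(X, y)))

# Reference fit with the built-in linear model function
b_lm <- coef(lm(mpg ~ hp + wt, data = mtcars))

# The three approaches should agree up to numerical tolerance
all.equal(unname(b_qr), unname(b_ne))
all.equal(unname(b_qr), unname(b_lm))
```

The QR route avoids forming $X'X$ explicitly, which generally makes it numerically more stable than the normal equations when the columns of $X$ are nearly collinear.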


Removing all Objects from globalenv

Example 3: Create a function capable of removing all objects from the globalenv.

clear <- function(env = globalenv()) {
  obj <- ls(envir = env)         # names of all objects in the environment
  rm(list = obj, envir = env)    # remove them all
}

The clear() function removes all objects from the specified environment. However, when it is defined in that same environment, clear() also removes itself, so it must be redefined before it can be used again.

The clear() function can be improved so that it keeps itself when all other objects are deleted.

clear <- function(env = globalenv()) {
  objects <- objects(env)
  objects <- objects[objects != "clear"]   # keep clear() itself
  rm(list = objects, envir = env)
  invisible(NULL)
}
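
To try the improved function without wiping the real workspace, it can be pointed at a scratch environment. This is a sketch under that assumption; the environment name scratch and the objects a and b are hypothetical.

```r
clear <- function(env = globalenv()) {
  objs <- ls(envir = env)
  objs <- objs[objs != "clear"]    # keep clear() itself
  rm(list = objs, envir = env)
  invisible(NULL)
}

# Build a throwaway environment holding two objects and a copy of clear()
scratch <- new.env()
assign("a", 1, envir = scratch)
assign("b", "text", envir = scratch)
assign("clear", clear, envir = scratch)

ls(scratch)       # "a" "b" "clear"
clear(scratch)
ls(scratch)       # "clear"
```

Note that ls() does not list objects whose names start with a dot unless all.names = TRUE is given, so such hidden objects are left in place by this version.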

Computing Measure of Central Tendency

Example 4: Create a function that can compute some basic measures of central tendency.

center <- function(x, type) {
  switch(type,
    mean = mean(x),
    median = median(x),
    trimmed = mean(x, trim = 0.1))
}

Temp <- airquality$Temp
center(Temp, "mean")     # for calculation of mean
center(Temp, "median")   # for calculation of median
center(Temp, "trimmed")  # for calculation of trimmed mean

Note that the user-defined functions in R can incorporate conditional statements, loops, and other functionalities to perform more advanced tasks. They can also have default parameter values for added flexibility.
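
As an illustration of default parameter values, the center() function could be given a default type and a configurable trimming fraction. The version below is one possible variant, not the article's original function; the trim argument and the use of match.arg() for validation are assumptions added here.

```r
center <- function(x, type = c("mean", "median", "trimmed"), trim = 0.1) {
  type <- match.arg(type)   # defaults to "mean"; stops on an unknown type
  switch(type,
    mean    = mean(x),
    median  = median(x),
    trimmed = mean(x, trim = trim))
}

temp <- airquality$Temp

center(temp)                          # mean, the default type
center(temp, "median")                # 79
center(temp, "trimmed", trim = 0.2)   # 20% trimmed mean

# An unsupported type raises an error instead of silently returning NULL
result <- tryCatch(center(temp, "mode"), error = function(e) "unsupported type")
result                                # "unsupported type"
```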


Descriptive Summary in R

Introduction to Descriptive Summary in R

Statistics is the study of data: describing properties of data (descriptive statistics) and drawing conclusions about a population based on information in a sample (inferential statistics). In this article, we will discuss the computation of a descriptive summary in R (descriptive statistics in R programming).

Example: Twenty elementary school children were asked whether they live with both parents (B), father only (F), mother only (M), or someone else (S), and how many brothers they have. The responses of the children are as follows:

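| Case | Sex    | No. of Brothers | Case | Sex    | No. of Brothers |
|------|--------|-----------------|------|--------|-----------------|
| M    | Female | 3               | B    | Male   | 2               |
| B    | Female | 2               | F    | Male   | 1               |
| B    | Female | 3               | B    | Male   | 0               |
| M    | Female | 4               | M    | Male   | 0               |
| F    | Male   | 3               | M    | Male   | 3               |
| S    | Male   | 1               | B    | Female | 4               |
| B    | Male   | 2               | B    | Female | 3               |
| M    | Male   | 2               | F    | Male   | 2               |
| F    | Female | 4               | B    | Female | 1               |
| B    | Female | 3               | M    | Female | 2               |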


Suppose the following tasks are required. They are related to the descriptive summary in R.

  • Construct a frequency distribution table in R for the case (living arrangement) of the children.
  • Draw a bar chart and a pie chart of the frequency distribution for each category using R code.

Creating the Frequency Table in R

# Enter the data as a character vector
x <- c("M", "B", "B", "M", "F", "S", "B", "M", "F", "B", "B", "F", "B", "M", "M", "B", "B", "F", "B", "M")

# Create the frequency table using the table() function
tabx <- table(x)
tabx

# Output
x
B F M S 
9 4 6 1 
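
The counts can be turned into relative frequencies with prop.table(). The sketch below repeats the data vector so that it runs on its own; the names rel and modal are introduced only for this example.

```r
x <- c("M", "B", "B", "M", "F", "S", "B", "M", "F", "B",
       "B", "F", "B", "M", "M", "B", "B", "F", "B", "M")
tabx <- table(x)

# Relative frequencies (proportions that sum to 1)
rel <- prop.table(tabx)
rel

# The same as percentages: B 45, F 20, M 30, S 5
round(100 * rel, 1)

# Most frequent category
modal <- names(tabx)[which.max(tabx)]
modal                 # "B"

# Table sorted from most to least frequent
sort(tabx, decreasing = TRUE)
```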

Draw a Bar Chart and Pie Chart from the Frequency Table

# Draw the bar chart of the frequency table in green, with a main title, x label, and y label

barplot(tabx, xlab = "x", ylab = "Frequency", main = "Sample of Twenty elementary school children", col = "green")

# Draw the pie chart of the frequency table with a main title
pie(tabx, main = "Sample of Twenty elementary school children")

Descriptive Statistics for Air Quality Data

Consider the air quality data for computing numerical and graphical descriptive summary in R. The air quality data already exists in the R Datasets package.

# Select the temperature variable (column 4, Temp) only
Temperature <- airquality[, 4]
hist(Temperature)

hist(Temperature, main = "Maximum daily temperature at La Guardia Airport", xlab = "Temperature in degrees Fahrenheit", xlim = c(50, 100), col = "darkmagenta", freq = TRUE)

h <- hist(Temperature, ylim = c(0,40))
text(h$mids, h$counts, labels=h$counts, adj=c(0.5, -0.5))
[Figure: histogram of Temperature with the frequency labelled above each bar]

In the above histogram, the frequency of each bar is drawn at the top of each bar by using the text() function.

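Note that to change the number of classes or the class interval, pass a vector of break points built with seq(): dividing the range from the minimum to the maximum into $n$ classes requires $n+1$ break points, that is, length.out = n + 1. The example below uses length.out = 7, which gives 6 classes.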

hist(Temperature, breaks = seq(min(Temperature), max(Temperature), length.out = 7))
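
The relationship between the number of classes and the number of break points can be verified without drawing anything by calling hist() with plot = FALSE. In this sketch, n_classes and brks are illustrative names.

```r
Temperature <- airquality$Temp

n_classes <- 6
# n classes need n + 1 break points between the minimum and the maximum
brks <- seq(min(Temperature), max(Temperature), length.out = n_classes + 1)

# plot = FALSE returns the histogram object without plotting it
h <- hist(Temperature, breaks = brks, plot = FALSE)

length(h$breaks)   # 7 break points
length(h$counts)   # 6 classes
sum(h$counts)      # 153, every observation falls in some class
```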

Median for Ungrouped Data

Numeric descriptive statistics such as median, mean, mode, and other summary statistics can be computed.

median(Temperature)
## Output 79
mean(Temperature)
summary(Temperature)

A customized function for the computation of the median can also be created. For example:

arithmetic.median <- function(xx) {
  n <- length(xx)
  sorted <- sort(xx)
  if (n %% 2 == 0) {
    # even n: average of the two middle values
    (sorted[n / 2] + sorted[n / 2 + 1]) / 2
  } else {
    # odd n: the middle value
    sorted[ceiling(n / 2)]
  }
}
arithmetic.median(Temperature)
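
A quick way to gain confidence in a hand-written median is to compare it with the built-in median() on one odd-length and one even-length vector. The sketch below repeats the function definition so that it runs on its own; the test vectors odd and even are made up for illustration.

```r
arithmetic.median <- function(xx) {
  n <- length(xx)
  sorted <- sort(xx)
  if (n %% 2 == 0) {
    (sorted[n / 2] + sorted[n / 2 + 1]) / 2   # average of the two middle values
  } else {
    sorted[ceiling(n / 2)]                    # the middle value
  }
}

odd  <- c(7, 1, 5)
even <- c(7, 1, 5, 3)

arithmetic.median(odd)    # 5
arithmetic.median(even)   # 4

# Agreement with the built-in function on the air quality temperatures
arithmetic.median(airquality$Temp) == median(airquality$Temp)
```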

Computing Quartiles and IQR

The quantiles (quartiles, deciles, and percentiles) can be computed using the quantile() function in R. The interquartile range (IQR) can be computed using the IQR() function. Note that R is case-sensitive, so the function name must be written in capital letters.

y = airquality[, 4]  # temperature variable

quantile(y)

quantile(y, probs = c(0.25,0.5,0.75))
quantile(y, probs = c(0.30,0.50,0.70,0.90))

IQR(y)

One can create a custom function for the computation of Quartiles and IQR. For example,

quart <- function(x) {
  x <- sort(x)
  n <- length(x)
  m <- (n + 1) / 2
  if (floor(m) != m) {
    # even n: split the data into two equal halves
    l <- m - 1/2; u <- m + 1/2
  } else {
    # odd n: exclude the median from both halves
    l <- m - 1; u <- m + 1
  }
  c(Q1 = median(x[1:l]),
    Q3 = median(x[u:n]),
    IQR = median(x[u:n]) - median(x[1:l]))
}

quart(y)
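
The custom function computes quartiles as medians of the lower and upper halves, which is not the same rule that quantile() uses by default, so the two can give different answers on the same data. The small sketch below shows this on the vector 1:7, chosen only for illustration; the function is repeated so that the example is self-contained.

```r
quart <- function(x) {
  x <- sort(x)
  n <- length(x)
  m <- (n + 1) / 2
  if (floor(m) != m) {
    l <- m - 1/2; u <- m + 1/2    # even n: two equal halves
  } else {
    l <- m - 1; u <- m + 1        # odd n: median excluded from both halves
  }
  c(Q1 = median(x[1:l]),
    Q3 = median(x[u:n]),
    IQR = median(x[u:n]) - median(x[1:l]))
}

small <- 1:7

# Median-of-halves rule: Q1 = 2, Q3 = 6, IQR = 4
quart(small)

# Default quantile() rule (type 7): 2.5 and 5.5, so IQR = 3
quantile(small, probs = c(0.25, 0.75))
IQR(small)
```

The type argument of quantile() selects among several quantile definitions, so it is worth stating which rule was used when reporting quartiles.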

FAQs in R Language

  1. How can one perform descriptive statistics in R Language?
  2. Discuss the strategy for creating a frequency table in R.
  3. How can pie charts and bar charts be drawn in R Language? Discuss the commands and important arguments.
  4. What default function is used to compute the quartiles of a data set?
  5. You are interested in computing the median for grouped and ungrouped data in R. Write a customized R function.
  6. Create a user-defined function that can compute the quartiles and IQR of the input data set.
