Handling Missing Values in R: A Quick Guide

This article is about handling missing values in the R language.

Question: What are the differences between missing values in R and other Statistical Packages?

Answer: In R, missing values (NA) cannot be used in comparisons, as discussed in the previous post on missing values in R. In other statistical packages, a “missing value” is often represented by a code that is very high or very low in magnitude, such as 99 or -99. These coded values are treated as missing, yet they remain ordinary values that can be compared with other values.

In R, NA is used for all kinds of missing data, whereas other packages represent missing strings and missing numbers differently, for example, empty quotation marks for strings, and periods or very large or very small numbers for numeric data. Similarly, in R a non-NA value cannot be interpreted as missing, whereas in other packages particular ordinary values are designated as missing.
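The contrast can be sketched in a few lines. This is only an illustration: the sentinel code -99 and the object names coded and with_na are assumptions made for the example, not values taken from any particular package.

```r
# Hypothetical data in which -99 is used as a missing-value code
coded <- c(12, -99, 15)
coded < 0                     # FALSE TRUE FALSE: the code is compared like any number
mean(coded)                   # -24: the code silently distorts the summary

# The same data with R's NA
with_na <- c(12, NA, 15)
with_na < 0                   # FALSE NA FALSE: the comparison itself is missing
mean(with_na)                 # NA
mean(with_na, na.rm = TRUE)   # 13.5

# NA also marks missing strings; an empty string is not missing
is.na(c("a", NA, ""))         # FALSE TRUE FALSE
```

With a sentinel code the analyst has to remember to filter it out, while NA propagates through the calculation until it is handled explicitly.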

Handling Missing Values in R

Question: What are NA options in R?
Answer: In the previous post on missing values, I introduced the is.na() function as a tool for both finding and creating missing values. The is.na() function is one of several functions built around NA. Most of the other functions for missing values (NA) are possible settings of the na.action option, which modelling functions such as lm() use by default. The possible na.action settings within R are:

  • na.omit() and na.exclude(): These functions return the object with observations removed if they contain any missing (NA) values. The difference between na.omit() and na.exclude() can be seen in some prediction and residual functions, where na.exclude() pads the result with NA for the removed observations.
  • na.pass(): This function returns the object unchanged.
  • na.fail(): This function returns the object only if it contains no missing values; otherwise it signals an error.

To understand these NA options use the following lines of code.

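getOption("na.action")   # current default, usually "na.omit"

(m <- as.data.frame(matrix(c(1:5, NA), ncol = 2)))
na.omit(m)        # drops row 3, which contains the NA
na.exclude(m)     # drops the same row, but remembers it for residuals and predictions
try(na.fail(m))   # signals an error because m contains an NA
na.pass(m)        # returns m unchanged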
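
The difference between na.omit() and na.exclude() only shows up after a model is fitted. The following is a minimal sketch on a made-up data frame (the object names d, fit_omit, and fit_exclude are illustrative):

```r
# Hypothetical toy data with one missing response value
d <- data.frame(x = 1:5, y = c(2.1, 3.9, NA, 8.2, 9.8))

fit_omit    <- lm(y ~ x, data = d, na.action = na.omit)
fit_exclude <- lm(y ~ x, data = d, na.action = na.exclude)

length(residuals(fit_omit))      # 4: the incomplete row is simply dropped
length(residuals(fit_exclude))   # 5: an NA is padded in at the dropped row
residuals(fit_exclude)
```

Both fits use the same four complete rows, so the coefficients agree; only the shape of the residuals and fitted values differs.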

Note that it is wise to investigate the missing values in your data set and to read the help files of every function you intend to use for handling them. You should either be aware of, and comfortable with, the default treatment of missing values or specify the treatment you want for your analysis.
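
As an illustration of why the defaults matter, here is a short sketch with a made-up vector showing how a few common base R functions treat NA unless told otherwise:

```r
x <- c(1, 2, NA, 4)

mean(x)                     # NA: the default propagates the missing value
mean(x, na.rm = TRUE)       # mean of the three observed values
sum(x, na.rm = TRUE)        # 7
table(x)                    # NA is dropped from the counts by default
table(x, useNA = "ifany")   # NA is counted as its own category
```

Each of these functions documents its own NA handling, so the help page (for example ?mean or ?table) is the place to confirm the default.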


https://itfeature.com


Matrix in R Language (2015): Key Secrets

The matrix is an important data type in the R language, similar to the data frame. It has two dimensions, as its elements are arranged in rows and columns.

Matrix In R Language

Question: What is a matrix in R Language?
Answer: In R language matrices are two-dimensional arrays of elements all of which are of the same type, for example, numbers, character strings, or logical values.

Matrices may be constructed using the built-in function “matrix”, which reshapes its first argument into a matrix with the number of rows given as the second argument and the number of columns given as the third argument.

Creating a Matrix in R Language

Question: Give an example of how the matrix is constructed in R language.
Answer: A 3 by 3 matrix (3 rows and 3 columns) may be constructed in any of the following ways:

matrix(1:9, 3, 3)
matrix(c(1,2,3,4,5,6,7,8,9), 3, 3)
matrix(runif(9), 3,3)

The first two commands construct a matrix of 9 elements with 3 rows and 3 columns, consisting of the numbers from 1 to 9. The third command makes a matrix of 3 rows and 3 columns filled with random numbers from a uniform distribution.

Question: How are the matrix elements filled?
Answer: A matrix is filled by columns, unless the optional argument byrow is set to TRUE in the matrix command, for example

matrix(1:9, 3, 3, byrow = TRUE)
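
To see the two fill orders side by side, the following short sketch builds both matrices and compares their first rows (the names by_col and by_row are chosen for this example):

```r
by_col <- matrix(1:9, 3, 3)                 # default: filled column by column
by_row <- matrix(1:9, 3, 3, byrow = TRUE)   # filled row by row

by_col[1, ]                    # 1 4 7
by_row[1, ]                    # 1 2 3
identical(by_row, t(by_col))   # TRUE: one is the transpose of the other
```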

Question: Can the matrix be stored in R?
Answer: Yes, any matrix can be assigned to an object in R, for example

m <- matrix(1:9, 3, 3)
mymatrix <- matrix( rnorm(16), nrow=4 )

The matrices are stored in the objects “m” and “mymatrix”. The second command constructs a 4 by 4 matrix of 16 random numbers drawn from a normal distribution with mean 0 and variance 1.

Attributes of Matrix Object in R

Question: What is the use of the dim command in R?
Answer: The dim (dimension) is an attribute of the matrix in R language, which tells the number of rows and the number of columns of a matrix, for example,

dim(mymatrix)

This will print 4 4, meaning that the matrix has 4 rows and 4 columns.
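
Related base R functions report the same information piece by piece. A small sketch, recreating mymatrix so that the example runs on its own (the seed is arbitrary):

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
mymatrix <- matrix(rnorm(16), nrow = 4)

dim(mymatrix)      # 4 4
nrow(mymatrix)     # 4
ncol(mymatrix)     # 4
length(mymatrix)   # 16: the total number of elements

# Assigning to dim() reshapes the same 16 values
dim(mymatrix) <- c(2, 8)
dim(mymatrix)      # 2 8
```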

Question: Can we name rows of a matrix in R Language?
Answer: Yes, in R we can name the rows of a matrix as required, for example

rownames(mymatrix) <- c("x1", "x2", "x3", "x4")
mymatrix

Question: Can column names be changed or updated in R?
Answer: Yes, the procedure is the same as for changing the row names. For this purpose the colnames command is used, for example

colnames(mymatrix) <- c("A", "B", "C", "D")
mymatrix

Question: What is the purpose of the attributes command for the matrix in R Language?
Answer: The attributes function can be used to get information about the dimensions of the matrix and its dimnames (dimension names). For example,

attributes(mymatrix)

In summary, the primary function for creating a matrix in the R language is matrix(). Its main arguments are:

  • data: This is a vector containing the elements for the matrix.
  • nrow: The number of rows in the matrix.
  • ncol: The number of columns in the matrix.
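
A minimal sketch that uses these arguments by name. It also uses dimnames, a further argument of matrix() not listed above, to set row and column names at creation time; the labels are made up for the example.

```r
m <- matrix(data = 1:6, nrow = 2, ncol = 3,
            dimnames = list(c("r1", "r2"), c("A", "B", "C")))
m
m["r2", "B"]        # 4: elements can be selected by name
attributes(m)$dim   # 2 3
dimnames(m)         # list of row names and column names
```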

FAQs about Matrices in R

  1. How do you create a matrix in R?
  2. How are the elements of a matrix filled in R?
  3. How do you convert a data object to a matrix object in R?
  4. How can the different attributes of a matrix be checked in R?
  5. How can a matrix be stored in a variable?
  6. How can one name the rows and columns of a matrix in R?
  7. What is the difference between the dim and dimnames commands?
  8. How can one create a 3 by 3 matrix (3 rows and 3 columns) with elements drawn from a probability distribution?
  9. What is the primary purpose of the matrix() function in the R language?
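
For the question about converting a data object to a matrix, one possible approach is as.matrix(). The sketch below uses two small made-up data frames; it is a hint, not the only way to do the conversion.

```r
# Hypothetical data frame with two numeric columns
df <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5))
m  <- as.matrix(df)

is.matrix(m)   # TRUE
dim(m)         # 3 2
colnames(m)    # "a" "b"

# A matrix holds a single type, so mixed columns are coerced to character
m2 <- as.matrix(data.frame(a = 1:2, s = c("x", "y")))
is.character(m2)   # TRUE
```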


https://gmstat.com

Missing Values In R

This article is about missing values in the R language. It discusses how missing values are introduced in vectors or matrices and how the existence of missing observations can be checked in R.

Understanding Missing Values in R Language

Question: Can missing values be handled in R?
Answer: Yes, in R language one can handle missing observations. The way of dealing with missing values is different as compared to other statistical software such as SPSS, SAS, STATA, EVIEWS, etc.

Question: What is the representation of missing values in R Language?
Answer: Missing values appear as NA. Note that NA is neither a string nor a numeric value.

Question: Can the R user introduce missing value(s) in a matrix/vector?
Answer: Yes, an R user can create (introduce) missing values in a vector/matrix. For example,

x <- c(1,2,3,4,NA,6,7,8,9,10)
y <- c("a", "b", "c", NA, "NA")

Note that in the $y$ vector the fifth value, the string “NA”, is not missing.

How to Check Missing Values in a Vector/ Matrix

Question: How can one check whether there is a missing value in a vector/matrix?
Answer: To check which values in a matrix/vector are recognized as missing by R, use the is.na function. This function returns a logical vector (or matrix) of TRUE and FALSE values. TRUE indicates that the value at that index is missing, while FALSE indicates that it is not missing. For example

is.na(x)    # 5th will appear as TRUE while all others will be FALSE
is.na(y)    # 4th will be true while all others as FALSE

Note that “NA” in the second vector is not a missing value, therefore is.na will return FALSE for this value.
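
Building on is.na, missing values can also be counted, located, or dropped. A short sketch using the same vector $x$ (redefined here so that the example runs on its own):

```r
x <- c(1, 2, 3, 4, NA, 6, 7, 8, 9, 10)

anyNA(x)          # TRUE: at least one value is missing
sum(is.na(x))     # 1: the number of missing values
which(is.na(x))   # 5: the position of the missing value
x[!is.na(x)]      # the vector with the missing value dropped
```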


Question: Can missing values be used for comparisons?
Answer: No, missing values cannot be used in comparisons: a comparison that involves NA returns NA rather than TRUE or FALSE. NA is used for all kinds of missing data; vector $x$ is numeric and vector $y$ is a character object, yet both use the same NA. Likewise, non-NA values cannot be interpreted as missing values. Run the following commands to see this.

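x < 0                       # the 5th result is NA, not TRUE or FALSE
y == NA                     # every result is NA, so == cannot detect missing values
is.na(x) <- which(x == 7)   # set the element equal to 7 to NA
x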
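
The behaviour of comparisons with NA can be summarized in a small self-contained sketch (the vector is redefined here so that the example runs on its own):

```r
x <- c(1, 2, 3, 4, NA, 6, 7, 8, 9, 10)

NA == NA       # NA: even NA compared with itself is missing
x == NA        # NA for every element, so this never finds missing values
x > 5          # NA in the 5th position, TRUE or FALSE elsewhere
which(x > 5)   # 6 7 8 9 10: which() ignores the NA result
is.na(x)       # the correct test for missing values
```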

Question: Provide an example for introducing NA in the matrix.
Answer: The following commands create matrices containing NA. In the first matrix all of the elements are NA; in the second only some of them are.

matrix(NA, nrow = 3, ncol = 3)
matrix(c(NA,1,2,3,4,5,6,NA, NA), nrow = 3, ncol = 3)
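
The is.na function works on matrices as well. The sketch below stores the second matrix in an object named m (a name chosen for this example) and locates its missing cells:

```r
m <- matrix(c(NA, 1, 2, 3, 4, 5, 6, NA, NA), nrow = 3, ncol = 3)

is.na(m)                           # logical matrix of the same shape as m
sum(is.na(m))                      # 3 missing cells in total
colSums(is.na(m))                  # 1 0 2: missing cells per column
which(is.na(m), arr.ind = TRUE)    # row and column index of each missing cell
```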