Handling Missing Values in R

Question: What are the differences between missing values in R and other Statistical Packages?

Answer: Missing values (NA) cannot be used in comparisons, as discussed in the previous post on missing values in R. In other statistical packages, a "missing value" is often assigned a code that is very high or very low in magnitude, such as 99 or -99. These coded values are treated as missing, yet they can still be compared with other values, and other values can be compared with them. In R, NA is used for all kinds of missing data, whereas other packages represent missing strings and missing numbers differently, for example empty quotation marks for strings, and periods or very large or very small numbers for numeric values. Similarly, in R a non-NA value is never interpreted as missing, whereas in other packages system-missing values are designated by particular values.
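The contrast between a coded missing value and NA can be sketched in a few lines. The `scores` vector and the use of 99 as the missing code are assumptions made up for this illustration; they do not come from any particular package.

```r
# Hypothetical survey scores in which 99 was used as a missing-value code
scores <- c(12, 15, 99, 9, 99, 14)

# The coded value takes part in comparisons and arithmetic like any other number
sum(scores == 99)      # number of coded values
mean(scores)           # distorted by the 99s

# Recode the sentinel to NA so that R treats it as missing
scores_na <- replace(scores, scores == 99, NA)

scores_na == 99                # comparisons with NA yield NA, not TRUE or FALSE
mean(scores_na)                # NA, because some values are missing
mean(scores_na, na.rm = TRUE)  # mean of the observed values only
```

Recoding sentinel values to NA immediately after import is one common way to make sure they are not silently included in later calculations.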

Question: What are NA options in R?

Answer: In the previous post on missing values, I introduced the is.na() function as a tool for both finding and creating missing values. is.na() is one of several functions built around NA. Most of the other functions for missing values are possible settings of the na.action option. The possible na.action settings within R are:

    • na.omit() and na.exclude(): These functions return the object with every observation (row) removed that contains at least one missing (NA) value. The difference between na.omit() and na.exclude() shows up in some prediction and residual functions, where na.exclude() pads the result with NA so that it keeps the length of the original data.
    • na.pass(): This function returns the object unchanged.
    • na.fail(): This function returns the object only if it contains no missing values; otherwise it signals an error.

To understand these NA options use the following lines of code.

> getOption("na.action")
> (m <- as.data.frame(matrix(c(1:5, NA), ncol = 2)))
> na.omit(m)
> na.exclude(m)
> na.fail(m)    # stops with an error because m contains an NA
> na.pass(m)

Note that it is wise both to investigate the missing values in your data set and to read the help files of every function you intend to use for handling missing values. You should either be aware of and comfortable with the default treatment of missing values, or specify the treatment you want for your analysis.
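The difference between na.omit() and na.exclude() mentioned above is easiest to see in the residuals of a fitted model. The following is a minimal sketch; the data frame `d` and its values are made up for illustration.

```r
# Hypothetical data with one missing response (row 3)
d <- data.frame(x = 1:6, y = c(2.1, 3.9, NA, 8.2, 9.8, 12.1))

fit_omit    <- lm(y ~ x, data = d, na.action = na.omit)
fit_exclude <- lm(y ~ x, data = d, na.action = na.exclude)

length(residuals(fit_omit))     # 5: the incomplete row is dropped
length(residuals(fit_exclude))  # 6: padded with NA at the dropped row
residuals(fit_exclude)
```

Both fits use the same five complete rows, so the coefficients agree; only the shape of the residuals and fitted values differs, which matters when they are to be attached back to the original data frame.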

Missing Values in R

Question: Can missing values be handled in R?
Answer: Yes, missing values can be handled in R. The way R deals with missing values differs from other statistical software such as SPSS, SAS, Stata, EViews, etc.

Question: What is the representation of missing values in R Language?
Answer: Missing values or data appear as NA. Note that NA is neither a string nor a numeric value.

Question: Can the R user introduce missing value(s) in a matrix/vector?
Answer: Yes, an R user can create (introduce) missing values in a vector/matrix. For example,

> x <- c(1,2,3,4,NA,6,7,8,9,10)
> y <- c("a", "b", "c", NA, "NA")

Note that in vector y the fifth value, the string "NA", is not missing.

Question: How can one check whether there is a missing value in a vector/matrix?
Answer: To check which values in a matrix/vector are recognized as missing by R, use the is.na() function. This function returns a logical vector (or matrix) of TRUE and FALSE values. TRUE indicates that the value at that index is missing, while FALSE indicates that it is not. For example,

> is.na(x)    # 5th will appear as TRUE while all others will be FALSE
> is.na(y)    # 4th will be true while all others as FALSE

Note that “NA” in the second vector is not a missing value, therefore is.na will return FALSE for this value.
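Building on is.na(), a few base R idioms are commonly used to count, locate, and drop missing values. The sketch below repeats the vectors x and y from above so that it runs on its own; the small matrix `m` is an extra example introduced only for illustration.

```r
x <- c(1, 2, 3, 4, NA, 6, 7, 8, 9, 10)
y <- c("a", "b", "c", NA, "NA")

anyNA(x)          # TRUE if at least one value is missing
sum(is.na(x))     # number of missing values
which(is.na(y))   # positions of the missing values
x[!is.na(x)]      # the observed values only

# is.na() works element-wise on a matrix and returns a logical matrix
m <- matrix(c(1, NA, 3, 4), nrow = 2)
is.na(m)
colSums(is.na(m)) # missing values per column
```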

Missing Values in R

Question: Can missing values be used for comparisons?
Answer: No, missing values cannot be used in comparisons; a comparison involving NA returns NA rather than TRUE or FALSE. NA is used for all kinds of missing data: vector x is numeric and vector y is a character vector, yet both use NA, and non-NA values are never interpreted as missing. Run the following commands to see this.

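> x < 0                        # the comparison with the missing 5th value returns NA
> y == NA                      # NA for every element, so this cannot find missing values
> is.na(x) <- which(x == 7)    # set the element equal to 7 to missing
> x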
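Because comparisons propagate NA, subsetting by a condition needs some care. The following sketch, which redefines x so that it is self-contained, shows how the NA travels into a subset and two ways of keeping it out.

```r
x <- c(1, 2, 3, 4, NA, 6, 7, 8, 9, 10)

NA == NA              # NA: even two missing values cannot be compared
x[x > 5]              # the NA is carried into the result
x[which(x > 5)]       # which() drops positions where the comparison is NA
x[!is.na(x) & x > 5]  # same result, with the missing value excluded explicitly
```

Which of the two forms to prefer is largely a matter of taste; which() is shorter, while the explicit !is.na() condition makes the handling of missing values visible to the reader.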

Question: Provide an example of introducing NA in a matrix.
Answer: The first command below creates a matrix in which every element is NA; the second creates a matrix in which only some elements are NA.

> matrix(NA, nrow = 3, ncol = 3)
> matrix(c(NA,1,2,3,4,5,6,NA, NA), nrow = 3, ncol = 3)
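Missing values can also be introduced into a matrix that already exists. The sketch below shows three ways of doing so; the matrix `m` and the chosen positions are arbitrary and serve only as an illustration.

```r
m <- matrix(1:9, nrow = 3, ncol = 3)

m[2, 3] <- NA              # by row and column index
is.na(m) <- c(1, 5)        # by position, counted column-wise
m[which(m > 8)] <- NA      # by condition; which() skips the existing NAs
m
```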


R Packages

Question: What is an R Package?
Answer: An R package is a collection of objects that the R language can use. A package contains functions, data sets, and documentation (which explains how to use the package), and possibly other objects such as dynamically loaded libraries of already compiled code.

Question: How do I see which packages I have available?
Answer: To see which packages you have available, use the following command at the R prompt

> library()

Question: Which packages do I already have?
Answer: To see which packages are installed, use the installed.packages() command at the R prompt. The output is a matrix with one row per installed package; the second command below shows only its first five rows.

> installed.packages()
> installed.packages()[1:5,]
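Since installed.packages() returns a character matrix with one row per package, it can be inspected like any other matrix. The sketch below picks two of its columns and checks for one package; "stats" is used only because it ships with every R installation.

```r
pkgs <- installed.packages()

dim(pkgs)                              # one row per installed package
head(pkgs[, c("Package", "Version")])  # name and version only
"stats" %in% rownames(pkgs)            # is a given package installed?
```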

Question: How one can load a Package in R language?
Answer: The basic packages are already loaded when R starts. To load any other installed package, use the command

> library("package name")
> library("car")

where package name is the name of the package you want to load. In the example we used "car", which means that the car package will be loaded.
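library() stops with an error when the requested package is not installed. A minimal sketch of a more defensive pattern, using base R's requireNamespace() to check first, is given below; "car" is simply the package from the example above and may or may not be installed on a given machine.

```r
# "car" is the package from the example above; it may not be installed
if (requireNamespace("car", quietly = TRUE)) {
  library(car)
} else {
  message("Package 'car' is not installed.")
}

# Packages currently attached to the search path
search()
```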

Question: How one can see the documentation of a particular package?
Answer: To see the documentation of a particular package, use either of the following commands

> library(help = "package name")
> help(package = "package name")
> help(package = "car")
> library(help = "car")

For more information about getting help, see the post Getting Help in R Language.

Question: How do I see the help for a specific function?
Answer: To get help about a function in R, use one of the following commands

> help("function name")
> ?function_name
> ?Manova            # Manova() is in the car package, which must be loaded first
> help("Manova")

Question: What functions and datasets are available in a package?
Answer: To check which functions and datasets are in a package, use the help command at the R prompt. This displays the package information, including a list of its functions and datasets.

> help(package = "MASS")

Note that once a package is loaded, the help command can also be used with all available functions and datasets.
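The contents of a package can also be listed programmatically rather than read from the help page. The sketch below uses the stats and datasets packages only because they come with every R installation; any attached package name could be substituted.

```r
# Objects exported by an attached package (stats is attached by default)
head(ls("package:stats"))

# Datasets shipped with a package; the result holds a 'results' matrix
ds <- data(package = "datasets")
head(ds$results[, c("Item", "Title")])
```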

Question: How can one add or delete a package?
Answer: A package can be installed using the command

> install.packages("package name")

and a package can be removed (deleted) using the command

> remove.packages("package name")
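A common convenience is to install a package only when it is not already present. The helper below is a hypothetical sketch, not part of R itself: the name `install_if_missing` is made up here, and the install step assumes a configured CRAN mirror and network access.

```r
# Hypothetical helper: install a package only when it is not yet available
install_if_missing <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))   # already installed, nothing to do
  }
  install.packages(pkg)        # needs a CRAN mirror and network access
  invisible(TRUE)
}

install_if_missing("stats")    # ships with R, so nothing is installed
```

The function returns FALSE invisibly when nothing had to be installed and TRUE after an installation attempt, so a script can log what it did.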

FAQs about R

This post covers Frequently Asked Questions (FAQs) about R.

Question: Why is the R language named R?
Answer: The name of the R language is based on the first letter of the first names of its authors (Robert Gentleman and Ross Ihaka).

Question: What is the R Foundation?
Answer: The R Foundation is a non-profit organization working in the public interest, founded by the members of the R Core Team. The foundation provides support for the R project and other innovations in statistical computing, and provides a reference point for individuals, institutions, or commercial enterprises who want to support or interact with the R development community. The R Foundation also holds and administers the copyright of the R software and its documentation. For more information about the R Foundation, see https://www.R-project.org/foundation

Question: What is R-Forge?
Answer: R-Forge provides a central platform for the development of R packages, R-related software, etc. It is based on GForge and offers easy access to the best in SVN, daily built and checked R packages, mailing lists, bug tracking, message board or forum, website hosting, permanent file archival, full backups, and total web-based administration. For more information see

  • The R-Forge web page
  • Stefan Theußl and Achim Zeileis (2009), “Collaborative software development using R-Forge”, The R Journal, 1(1), 9-14.

Question: What mailing lists exist for R language?
Answer: There are four mailing lists devoted to the R language:

  • R-announce: A moderated mailing list for major announcements about the R development and the availability of new R code.
  • R-packages: A moderated mailing list for an announcement on the availability of new or further enhanced contributed packages.
  • R-help: The main R mailing list for discussion and problems and solutions using R, announcements about the development of R, and the availability of new R code. R-help is intended for people who want to use R to solve problems.
  • R-devel: A mailing list for questions and discussions about code development in R language.

Question: What documentation exists for R language?
Answer: Online documentation exists for most R functions and variables, and it can be displayed on screen by typing help(name) or ?name at the R prompt, where name is the name of the topic for which help is required. The R documentation can also be made available in PDF and HTML formats, and as hard copy via LaTeX. The up-to-date HTML version of the R documentation is always available for web browsers at http://stat.ethz.ch/R-manual. Many R books and manuals are also available as R documentation.

For how to get help in R, see the post Getting Help in R Language.
