R FAQs: Handling Missing values in R

Question: How do missing values in R differ from those in other statistical packages?

Answer: Missing values (NA) cannot be used in comparisons, as discussed in the previous post on missing values in R. In other statistical packages, a missing value is often assigned a code that is very high or very low in magnitude, such as 99 or -99. These coded values are treated as missing, yet they can still be compared with other values, and other values can be compared with them. In R, NA is used for all kinds of missing data, whereas other packages represent missing strings and missing numbers differently, for example empty quotation marks for strings, and periods or very large or very small numbers for numeric data. Similarly, in R a non-NA value cannot be interpreted as missing, whereas in other packages system-missing values are designated by particular ordinary values.
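
The contrast can be sketched in a few lines of R. This is only an illustration: the vector scores and the code -99 are hypothetical, standing in for data exported from a package that uses a numeric code for missing values.

```r
# NA propagates through comparisons instead of giving TRUE or FALSE
na_cmp <- NA > 5     # result is NA
na_eq  <- NA == NA   # result is NA, not TRUE

# Hypothetical scores in which -99 was used as a missing-value code
scores <- c(12, -99, 15, 20, -99)
sentinel_mean <- mean(scores)   # misleading: the codes are treated as real data

# Recode the sentinel to NA so that R treats it as missing
scores[scores == -99] <- NA
clean_mean <- mean(scores, na.rm = TRUE)   # mean of the three observed values
```

Recoding sentinel values to NA right after import is one common way to avoid the first, misleading mean.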

Question: What are NA options in R?

Answer: The previous post on missing values introduced the is.na() function as a tool for both finding and creating missing values. is.na() is one of several functions built around NA. Most of the other functions for missing values are possible settings of the na.action option. The possible na.action settings within R are:

  • na.omit() and na.exclude(): These functions return the object with observations removed if they contain any missing (NA) values. The difference between the two shows up in some prediction and residual functions: after na.exclude(), residuals and predictions are padded with NA so that they match the length of the original data.
  • na.pass(): This function returns the object unchanged.
  • na.fail(): This function returns the object only if it contains no missing values; otherwise it signals an error.

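To understand these NA options, use the following lines of code.

getOption("na.action")    # current default setting, usually "na.omit"
(m <- as.data.frame(matrix(c(1:5, NA), ncol = 2)))    # 3 x 2 data frame with one NA
na.omit(m)         # drops the row that contains NA
na.exclude(m)      # also drops the row, but records it for later padding
try(na.fail(m))    # signals an error because m contains NA
na.pass(m)         # returns m unchanged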
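
The difference between na.omit() and na.exclude() only becomes visible after a model is fitted. The following is a minimal sketch using lm() on a small made-up data frame d (the data and object names are assumptions for illustration only).

```r
# Hypothetical data with one missing response value
d <- data.frame(x = 1:6, y = c(2.1, 3.9, NA, 8.2, 9.8, 12.1))

fit_omit    <- lm(y ~ x, data = d, na.action = na.omit)
fit_exclude <- lm(y ~ x, data = d, na.action = na.exclude)

length(residuals(fit_omit))      # 5: the incomplete row is dropped
length(residuals(fit_exclude))   # 6: padded with NA to match the rows of d
residuals(fit_exclude)[3]        # NA at the position of the incomplete row
```

Both fits use the same five complete rows, so the estimated coefficients are identical; only the shape of the residuals and fitted values differs.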

Note that it is wise both to investigate the missing values in your data set and to consult the help files for every function you intend to use for handling missing values. You should either be aware of and comfortable with the default treatment of missing values, or specify the treatment you want for your analysis.

R FAQ missing values

Question: Can missing values be handled in R?
Answer: Yes, missing values can be handled in the R language. The way R deals with missing values differs from that of other statistical software such as SPSS, SAS, STATA, and EVIEWS.

Question: How are missing values represented in the R language?
Answer: In R, missing values appear as NA. Note that NA is neither a string nor a numeric value; it is a special marker for a missing value.

Question: Can an R user introduce missing values into a matrix or vector?
Answer: Yes, an R user can create (introduce) missing values in a vector or matrix. For example,

    x <- c(1, 2, 3, 4, NA, 6, 7, 8, 9, 10)
    y <- c("a", "b", "c", NA, "NA")

Note that in the vector y, the fifth value is the string “NA”, not a missing value.

Question: How can one check whether there are missing values in a vector or matrix?
Answer: To check which values in a matrix or vector are recognized as missing by R, use the is.na() function. It returns a logical vector (or matrix) of TRUE and FALSE values: TRUE indicates that the value at that position is missing, while FALSE indicates that it is not. For example:

> is.na(x)    # the fifth element is TRUE; all others are FALSE
> is.na(y)    # the fourth element is TRUE; all others are FALSE

Note that “NA” in second vector is not a missing value, therefore is.na will return FALSE for this value.
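
Building on is.na(), a short sketch of how missing values might be counted and located. The vectors are redefined here so that the snippet runs on its own.

```r
x <- c(1, 2, 3, 4, NA, 6, 7, 8, 9, 10)
y <- c("a", "b", "c", NA, "NA")

sum(is.na(x))      # number of missing values in x
which(is.na(x))    # positions of the missing values in x
anyNA(y)           # TRUE if y has at least one missing value
which(is.na(y))    # only position 4; the string "NA" is not missing
```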

Question: In the R language, can missing values be used in comparisons?
Answer: No, missing values in R cannot be used in comparisons. NA is used for all kinds of missing data: vector x is numeric and vector y is a character vector, yet both mark missing entries with NA. Non-NA values cannot be interpreted as missing values. Run the following commands to see this:

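x < 0       # the fifth element is NA; all others are FALSE
y == NA     # every element is NA, so == cannot be used to test for missingness
is.na(x) <- which(x == 7); x    # sets the element equal to 7 to NA, then prints x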
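
One practical consequence of NA in comparisons shows up when subsetting. The sketch below (object names are illustrative) shows that a logical index containing NA returns an NA element, and two ways that might be used to avoid it.

```r
x <- c(1, 2, 3, 4, NA, 6, 7, 8, 9, 10)

big_raw   <- x[x > 5]                # the NA from the comparison comes along
big_clean <- x[!is.na(x) & x > 5]    # explicitly exclude missing values
big_which <- x[which(x > 5)]         # which() keeps only the TRUE positions

length(big_raw)      # 6, including one NA
length(big_clean)    # 5
```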

Question: Can you give an example of introducing NA in a matrix?
Answer: The first command below creates a matrix in which every element is NA; the second creates a matrix in which only some elements are NA.

matrix(NA, nrow=3, ncol=3)
matrix(c(NA,1,2,3,4,5,6,NA, NA), nrow=3, ncol=3)
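
Once a matrix contains NA values, is.na() can be combined with other base functions to locate and count them. A minimal sketch, using the second matrix above stored under the assumed name m:

```r
m <- matrix(c(NA, 1, 2, 3, 4, 5, 6, NA, NA), nrow = 3, ncol = 3)

na_pos <- which(is.na(m), arr.ind = TRUE)   # row and column index of each NA
na_pos
colSums(is.na(m))    # number of missing values in each column
rowSums(is.na(m))    # number of missing values in each row
```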

R FAQS: R Packages

Question: What is an R Package?
Answer: An R package is a collection of objects that the R language can use. A package contains functions, data sets, and documentation (which explains how to use the package), and may contain other objects such as dynamically loaded libraries of already compiled code.

Question: How do I see which packages I have available?
Answer: To see which packages are available, use the following command at the R prompt:

> library()

Question: Which packages do I already have?
Answer: To see which packages are installed, use the installed.packages() command at the R prompt. The output lists the installed packages.

> installed.packages()
> installed.packages()[1:5,]
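
The matrix returned by installed.packages() can also be queried programmatically. The sketch below defines a small helper, is_installed(), which is a hypothetical name introduced here and not part of R.

```r
# Hypothetical helper: TRUE if a package with this name is installed
is_installed <- function(pkg) {
  pkg %in% rownames(installed.packages())
}

is_installed("stats")    # TRUE, since stats ships with every R installation

# Show only the name and version of the first few installed packages
head(installed.packages()[, c("Package", "Version")])
```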

Question: How can one load a package in the R language?
Answer: The basic packages are already loaded when R starts. To load any other installed package, use the command

> library("package name")
> library("car")

where package name is the name of the package you want to load. In the example, the “car” package is loaded.
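
library() stops with an error if the package is not installed. One way to guard against that, sketched here under the assumption that the package name is held in a variable pkg, is to check with requireNamespace() first.

```r
pkg <- "car"   # name of the package to load

if (requireNamespace(pkg, quietly = TRUE)) {
  # character.only = TRUE tells library() that pkg holds the name as a string
  library(pkg, character.only = TRUE)
  loaded <- TRUE
} else {
  message("Package '", pkg, "' is not installed; install it before loading.")
  loaded <- FALSE
}
```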

Question: How can one see the documentation of a particular package?
Answer: To see the documentation of a particular package, use the command

> library(help="package name")
> help(package="package name")
> help(package="car")
> library(help="car")

For more information about getting help, see the post: Getting Help in R Language

Question: How do I see the help for a specific function?
Answer: To get help about a function in R, use the command

> help("function name")
> ? function name
> ?Manova
> help("Manova")

Question: What functions and datasets are available in a package?
Answer: To check which functions and datasets are in a package, use the help command at the R prompt. It displays the package information, including a list of its functions and datasets.

> help(package="MASS")
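
The contents of a package can also be listed as ordinary R objects. The sketch below uses the stats and datasets packages, chosen here only because they ship with every R installation; the same calls should work for any attached package.

```r
library(stats)

# Names of all objects exported by an attached package
stats_objects <- ls("package:stats")
head(stats_objects)
"lm" %in% stats_objects    # TRUE

# Datasets provided by a package, as a matrix with one row per dataset
dataset_info <- data(package = "datasets")$results
head(dataset_info[, c("Item", "Title")])
```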

Note that once a package is loaded, the help command can also be used with all available functions and datasets.

Question: How can one add or delete a package?
Answer: A package can be installed using the command

> install.packages("package name")

and a package can be removed (deleted) using the command

> remove.packages("package name")

FAQs about R

Question: Why is the R language named R?
Answer: The name of the R language is partly based on the first letter of the first names of its two original authors (Robert Gentleman and Ross Ihaka), and partly a play on the name of the S language.

Question: What is the R Foundation?
Answer: The R Foundation is a non-profit organization working in the public interest, founded by the members of the R Core Team. The foundation provides support for the R project and other innovations in statistical computing, and provides a reference point for individuals, institutions, or commercial enterprises that want to support or interact with the R development community. The R Foundation also holds and administers the copyright of the R software and its documentation. For more information about the R Foundation, follow the link https://www.R-project.org/foundation

Question: What is R-Forge?
Answer: R-Forge provides a central platform for the development of R packages, R-related software, and similar projects. It is based on GForge and offers easy access to the best in SVN, daily built and checked R packages, mailing lists, bug tracking, message boards or forums, web-site hosting, permanent file archival, full backups, and total web-based administration. For more information see

  • The R-Forge web page
  • Stefan Theußl and Achim Zeileis (2009), “Collaborative software development using R-Forge”, The R Journal, 1(1), 9-14.

Question: What mailing lists exist for R language?
Answer: There are four mailing lists devoted to the R language:

  • R-announce: A moderated mailing list for major announcements about the R development and the availability of new R code.
  • R-packages: A moderated mailing list for announcements on the availability of new or further enhanced contributed packages.
  • R-help: The main R mailing list, for discussion of problems and solutions using R, and for announcements about the development of R and the availability of new R code. R-help is intended for people who want to use R to solve problems.
  • R-devel: A mailing list for questions and discussions about code development in R language.

Question: What documentation exists for R language?
Answer: Online documentation exists for most R functions and variables, and it can be displayed on screen by typing help(name) or ?name at the R prompt, where name is the name of the topic for which help is required. The R documentation is also available in PDF and HTML formats, and as hard copy via LaTeX. An up-to-date HTML version of the R documentation is always available for web browsers at http://stat.ethz.ch/R-manual. Many R books and manuals are also available as R documentation.
For more on getting help in R, see the post Getting Help in R Language.

 

R Basic FAQs

Question: How do I start (run) R on the Windows operating system?
Answer: In Microsoft Windows, the R installer creates a Start menu item and a desktop icon for R during installation. Double-click the R icon on the desktop, or select R from the Start menu, to run the program. In Windows 7, 8, or 10, you can also search for a term such as “R x64 3.2.1” (64-bit version) or “R i386 3.2.1” (32-bit version). The R GUI will launch.

Question: How can R be used as a calculator?
Answer: Starting R opens the console, where the user can type commands. To use R as a calculator, enter an arithmetic expression after the > prompt. For example

> 5 + 4
> sqrt(37)
> 2*4^2+17*4-3
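
For reference, here is a sketch of the same expressions with their results, plus two small additions that are not in the original examples: the effect of parentheses, and storing a result under an assumed variable name, area.

```r
5 + 4                   # 9
sqrt(37)                # about 6.08
2 * 4^2 + 17 * 4 - 3    # 97: ^ is evaluated before *, and * before + and -
(2 * 4)^2               # 64: parentheses change the order of evaluation

area <- pi * 3^2        # store a result in a variable for later use
round(area, 2)          # 28.27
```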

Question: How do I quit an R session?
Answer: In the R console, at the R command prompt, just type

> q()

Question: What is q()?
Answer: q() is a function that tells R to quit. When you type q() in the R console and press the Enter key, you are asked whether to save an image of the current workspace, not to save it, or to cancel. Note that typing only q (without parentheses) makes R print the definition of the function instead of quitting.

Question: What is workspace in R?
Answer: The workspace in R is an image that contains a record of the computations one has done, and it may contain some saved results.
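
A minimal sketch of saving and restoring part of a workspace. It uses save() and load() on a temporary file so that it can be run without touching any existing .RData file; save.image() does the same for the whole workspace. The object and file names are assumptions for illustration.

```r
result <- sqrt(37)    # an object created during the session
ls()                  # lists the objects currently in the workspace

ws_file <- tempfile(fileext = ".RData")
save(result, file = ws_file)   # save.image(ws_file) would save every object

rm(result)            # remove the object from the workspace
load(ws_file)         # restore it from the saved file
result
```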

Question: How can work be recorded in R?
Answer: Rather than saving the workspace, one can record all the commands entered in the R console. With the commands recorded, the R workspace can be reproduced by running them again. The easiest way is to enter the commands in R’s script editor, available from the File menu of the R GUI.

Question: What is the R script editor?
Answer: The R script editor is a place where one can enter commands. Commands can be executed by highlighting them and pressing CTRL+R (R for Run). At the end of an R session, one can save the final script as a permanent record of one’s work. A text editor such as Notepad can also be used for this purpose.
Note that in the R console a command is executed immediately after the Enter key is pressed, so commands are entered and run one at a time.
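
A saved script can be run again in one step with source(). The sketch below writes three commands to a temporary file, which stands in for a script saved from the editor, and then runs it; the file and variable names are hypothetical.

```r
# Write a few commands to a script file (a stand-in for a saved script)
script_file <- tempfile(fileext = ".R")
writeLines(c("a <- 5 + 4",
             "b <- sqrt(37)",
             "total <- a + b"),
           script_file)

# Run every command in the script, in order
source(script_file, local = TRUE)
total
```

Running the script recreates the objects a, b, and total, which is how a recorded script reproduces a workspace.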