read.table Function in R (2016): A Comprehensive Guide

This post explains how to import data using the read.table() function in R. You will also learn what a file path is and how to get and set the working directory in R language. The read.table() function is a powerful tool for importing tabular data, typically from text files, into the R environment. It converts the tabular data from a flat-file format into a more usable data structure called the data frame.

Question: How can I check my working directory so that I can import my data in R?
Answer: To find the working directory, use the getwd() command, that is

getwd()

Question: How can I change the working directory to my path?
Answer: Use the setwd() function with the path of the required folder, for example

setwd("d:/mydata")
setwd("C:/Users/XYZ/Documents")
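The two commands can be combined into a short round trip. The following is a minimal sketch, assuming that tempdir() stands in for a real folder such as "d:/mydata" so that the code runs on any machine; the object names old_wd, target, previous, and moved are made up for illustration.

```r
# Remember the current working directory so it can be restored later
old_wd <- getwd()

# tempdir() stands in for a real data folder (assumption for portability)
target <- tempdir()
stopifnot(dir.exists(target))   # setwd() fails if the folder does not exist

# setwd() invisibly returns the working directory that was in use before
previous <- setwd(target)

# normalizePath() resolves links and slashes before the paths are compared
moved <- identical(normalizePath(getwd()), normalizePath(target))

# Go back to where we started
setwd(old_wd)
```

Saving the old directory first makes it easy to undo the change at the end of a script.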

Import Data using read.table Function in R

Question: I have a data set stored in text format (ASCII) that contains rectangular data. How can I read this data in tabular form? I have already set my working directory.
Answer: As the data file is already in the working directory, pass the file name to the read.table() function to import the data.

mydata <- read.table("data.dat")
mydata <- read.table("data.txt")

Here mydata is a named object that holds the data from the file “data.dat” or “data.txt” in data frame format. By default, the variables in the data file are named V1, V2, …
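To try this without an existing data file, the sketch below first writes a small file and then reads it back. The file contents are made up for illustration, and the file is placed in tempdir() rather than in the working directory, which is an assumption made to keep the example self-contained.

```r
# Create a small whitespace-separated file in a temporary location
# (the file contents are invented for this illustration)
data_file <- file.path(tempdir(), "data.txt")
writeLines(c("1 2.5 a",
             "2 3.5 b",
             "3 4.5 c"), data_file)

# With default arguments the columns are named V1, V2, V3
mydata <- read.table(data_file)

names(mydata)   # column names assigned by read.table()
str(mydata)     # type of each column
```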

Question: How can this stored data be accessed?
Answer: To access a variable, write the data frame object name (“mydata”) followed by the $ sign and the name of the variable. A column can also be selected with square brackets, either by name or by position. That is,

mydata$V1
mydata$V2
mydata["V1"]
mydata[ , 1]
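These forms do not all return the same kind of object. The sketch below uses a small made-up data frame in place of one returned by read.table(); the double-bracket form [[ ]] is not used in the commands above and is added here only for comparison.

```r
# A small data frame standing in for one returned by read.table()
mydata <- data.frame(V1 = c(10, 20, 30), V2 = c("a", "b", "c"))

col_dollar <- mydata$V1        # plain vector
col_name   <- mydata["V1"]     # data frame with a single column
col_index  <- mydata[, 1]      # plain vector, selected by position
col_double <- mydata[["V1"]]   # plain vector, selected by name

class(col_dollar)
class(col_name)
```

Use the single-bracket form with a name when the result should remain a data frame.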

Question: My data file has variable names in the first row of the data file. In the previous question, the variable names were V1, V2, V3, … How can I get the actual names of the variables stored in the first row of the data.dat file?
Answer: Instead of reading a data file with default values of arguments, use

read.table("data.dat", header = TRUE)
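The effect of the header argument can be seen by reading the same file twice. This is a minimal sketch with an invented file (the variable names id, height, and weight are assumptions for illustration) written to tempdir().

```r
# Write a small file whose first row holds the variable names
data_file <- file.path(tempdir(), "data_header.dat")
writeLines(c("id height weight",
             "1 170 65",
             "2 182 80"), data_file)

# Without header = TRUE the first row is treated as data
no_header <- read.table(data_file)

# With header = TRUE the first row supplies the column names
with_header <- read.table(data_file, header = TRUE)

names(no_header)
names(with_header)
```

Note that without header = TRUE the names end up in the data, so every column is read as text instead of numbers.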

Question: I want to read a data file that is not stored in the working directory.
Answer: To access a data file that is not stored in the working directory, provide the complete path of the file, such as

read.table("d:/data.dat" , header = TRUE)
read.table("d:/Rdata/data.txt" , header = TRUE)
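A full path can also be built from its parts and checked before reading. The sketch below is one possible approach, assuming tempdir() as a stand-in for a folder such as "d:/Rdata"; the file contents are made up.

```r
# tempdir() stands in for a real folder (assumption for portability)
data_dir  <- tempdir()
data_file <- file.path(data_dir, "data.txt")   # joins folder and file name
writeLines(c("x y", "1 2", "3 4"), data_file)

# Check that the file exists before trying to read it
if (file.exists(data_file)) {
  mydata <- read.table(data_file, header = TRUE)
}
```

On Windows, write paths with forward slashes ("d:/Rdata/data.txt") or doubled backslashes ("d:\\Rdata\\data.txt"), because a single backslash starts an escape sequence in an R string.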

Note that read.table() is used to read data from external files that normally have a special form:

  • The first line of the file should have a name for each variable in the data frame. However, if the first row does not contain the names of the variables, then the header argument should be left at FALSE.
  • Each additional line of the file has as its first item a row label, followed by the values for each variable.
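A file in this special form can be read without setting any argument: when the first line has one fewer item than the remaining lines, read.table() treats it as the header and uses the first column as row labels. The file below is invented for illustration.

```r
# First line: variable names only; other lines: row label, then the values
data_file <- file.path(tempdir(), "houses.dat")
writeLines(c("Price Floor",
             "h1 52.00 111",
             "h2 54.75 128",
             "h3 57.50 101"), data_file)

houses <- read.table(data_file)

names(houses)      # taken from the first line
rownames(houses)   # taken from the first item of each line
```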

In R, it is strongly recommended that the variables of a data set be held in a data frame, and the read.table() function can be used for this purpose. For further details about the read.table() function, use

help(read.table)

Important Arguments of read.table Function:

  • file: (required argument) it is used to specify the path to the file one wants to read.
  • header: A logical value (TRUE or FALSE) indicating whether the first line of the file contains column names. The default value is set to FALSE.
  • sep: The separator that segregates values between columns. The default is set to white space. One can specify other delimiters like commas (“,”) or tabs (“\t”).
  • as.is: A vector of logical values or column indices specifying which character columns should be kept as character vectors instead of being converted to factors.
  • colClasses: A vector specifying the data type of each column. It ensures that the data is read in the intended format (e.g., numeric, character).
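The arguments listed above can be combined in one call. The following sketch reads a small comma-separated file; the file name people.csv and its contents are assumptions made for illustration.

```r
# A comma-separated file with identifiers that have leading zeros
csv_file <- file.path(tempdir(), "people.csv")
writeLines(c("id,name,score",
             "001,Ali,7.5",
             "002,Sara,9.0"), csv_file)

people <- read.table(csv_file,
                     header = TRUE,     # first line holds the column names
                     sep = ",",         # values are separated by commas
                     colClasses = c("character", "character", "numeric"))

people$id   # leading zeros survive because the column is read as character
```

Without colClasses, the id column would be read as a number and the leading zeros would be lost.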

https://gmstat.com

https://itfeature.com

Matrix in R Language (2015): Key Secrets

The matrix is an important data type in R language, similar to the data frame. It has two dimensions, as its elements are arranged in rows and columns.

Matrix In R Language

Question: What is a matrix in R Language?
Answer: In R language matrices are two-dimensional arrays of elements all of which are of the same type, for example, numbers, character strings, or logical values.

Matrices may be constructed using the built-in function “matrix”, which reshapes its first argument into a matrix having the number of rows given as the second argument and the number of columns given as the third argument.

Creating a Matrix in R Language

Question: Give an example of how the matrix is constructed in R language.
Answer: A 3 by 3 matrix (3 rows and 3 columns) may be constructed such as:

matrix(1:9, 3, 3)
matrix(c(1,2,3,4,5,6,7,8,9), 3, 3)
matrix(runif(9), 3,3)

The first two commands construct a matrix of 9 elements having 3 rows and 3 columns consisting of the numbers from 1 to 9. The third command makes a matrix of 3 rows and 3 columns with random numbers from a uniform distribution.

Question: How are the matrix elements filled?
Answer: A matrix is filled by columns unless the optional argument byrow is set to TRUE in the matrix command, for example

matrix(1:9, 3, 3, byrow = TRUE)
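The difference between the two filling orders is easiest to see side by side. In this short sketch the object names by_col and by_row are made up for illustration.

```r
by_col <- matrix(1:9, 3, 3)                 # default: filled column by column
by_row <- matrix(1:9, 3, 3, byrow = TRUE)   # filled row by row

by_col[1, ]   # first row is 1 4 7
by_row[1, ]   # first row is 1 2 3

# Filling by row gives the transpose of filling by column
identical(by_row, t(by_col))
```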

Question: Can the matrix be stored in R?
Answer: Any matrix can be stored in R such as

m <- matrix(1:9, 3, 3)
mymatrix <- matrix( rnorm(16), nrow=4 )

The matrices are stored in the objects “m” and “mymatrix”. The second command constructs a matrix with 4 rows from 16 random numbers drawn from a normal distribution having mean 0 and variance 1.
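Because rnorm() draws new random numbers on every call, the stored matrix differs from run to run. A minimal sketch of making it reproducible with set.seed() follows; the seed value 123 is arbitrary and the object name again is made up.

```r
set.seed(123)                              # fix the random number stream
mymatrix <- matrix(rnorm(16), nrow = 4)    # ncol is worked out as 16 / 4

set.seed(123)                              # same seed, same numbers
again <- matrix(rnorm(16), nrow = 4)

identical(mymatrix, again)
```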

Attributes of Matrix Object in R

Question: What is the use of the dim command in R?
Answer: The dim (dimension) is an attribute of the matrix in R language, which tells the number of rows and the number of columns of a matrix, for example,

dim(mymatrix)

This will result in output showing 4 4, meaning that the matrix has 4 rows and 4 columns.
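With a square matrix it is not obvious which number comes first. The sketch below uses a made-up 2 by 3 matrix to show that dim() reports rows first and then columns, alongside the related functions nrow() and ncol().

```r
m <- matrix(1:6, nrow = 2)   # ncol is worked out as 6 / 2 = 3

dim(m)    # rows first, then columns
nrow(m)   # number of rows only
ncol(m)   # number of columns only
```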

Question: Can we name rows of a matrix in R Language?
Answer: Yes, in R language we can name the rows of a matrix as required, such as

rownames(mymatrix) <- c("x1", "x2", "x3", "x4")
mymatrix

Question: Can column names be changed or updated in R?
Answer: The procedure is the same as changing the row names. For this purpose, the colnames command is used, for example

colnames(mymatrix) <- c("A", "B", "C", "D")
mymatrix
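Once rows and columns are named, elements can be selected by name instead of by position. The sketch below fills the matrix with 1:16 rather than random numbers, an assumption made so that the selected values are easy to check.

```r
mymatrix <- matrix(1:16, nrow = 4)
rownames(mymatrix) <- c("x1", "x2", "x3", "x4")
colnames(mymatrix) <- c("A", "B", "C", "D")

mymatrix["x2", "C"]   # single element: row x2, column C, which is 10
mymatrix[, "A"]       # whole column A, returned as a named vector
```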

Question: What is the purpose of the attributes command for the matrix in R Language?
Answer: The attributes function returns the dimensions of the matrix (dim) and its dimension names (dimnames). For example,

attributes(mymatrix)
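The result of attributes() is a list, so its parts can be picked out by name. This is a minimal sketch on a small made-up matrix; it sets the names through the dimnames argument of matrix(), which is an alternative to the rownames and colnames commands used above.

```r
mymatrix <- matrix(1:4, nrow = 2,
                   dimnames = list(c("x1", "x2"), c("A", "B")))

att <- attributes(mymatrix)

att$dim        # number of rows and columns
att$dimnames   # list holding the row names and the column names
```

This also shows the difference between the two attributes: dim holds the size of the matrix, while dimnames holds its labels.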

In summary, the primary function for creating a matrix in R language is matrix(). It takes a few arguments:

  • data: This is a vector containing the elements for the matrix.
  • nrow: The number of rows in the matrix.
  • ncol: The number of columns in the matrix.

FAQs about Matrices in R

  1. How to create a matrix in R?
  2. How are the elements of a matrix filled in R?
  3. How to convert a data object to a matrix object in R?
  4. How can the different attributes of a matrix be checked in R?
  5. How can matrices be stored in a variable?
  6. How can one name the rows and columns of a matrix in R?
  7. What is the difference between the dim and dimnames commands?
  8. How can one create a matrix of order 3 by 3 (3 rows and 3 columns) with elements from a probability distribution?
  9. What is the primary purpose of the matrix() function in R language?
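Question 3 is not covered by the commands above. One possible answer, sketched here with made-up objects, is the as.matrix() function for data frames and the matrix() function for plain vectors.

```r
# A data frame with numeric columns only
df <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5))

m1 <- as.matrix(df)         # numeric matrix; column names are kept

# A plain vector reshaped into a matrix
v  <- 1:6
m2 <- matrix(v, nrow = 2)   # 2 rows, so 3 columns

is.matrix(m1)
dim(m2)
```

If a data frame mixes numbers and text, as.matrix() returns a character matrix, because all elements of a matrix must be of the same type.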

Data Frame in R Language

Please load the required data set before running the commands given below. The examples in these R FAQs about data frames assume that the iris data set, which ships with R, is available. At the R prompt, write data(iris).

Naming/Renaming Columns in a Data Frame

Question: How do you name or rename a column in a data frame?
Answer: Suppose you want to rename the 3rd column of the data frame, then at the R prompt write

names(iris)[3] <- "new_name"

Suppose you want to rename the second and fourth columns of the data frame

names(iris)[c(2, 4)] <- c("A", "D")

Note that the names(iris) command can be used to find the name of each column in a data frame.
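Renaming by position breaks when the column order changes, so a column can also be looked up by its current name. The sketch below works on a copy called iris2 (a made-up name) so that the built-in iris data set stays unchanged for the later examples.

```r
data(iris)
iris2 <- iris                      # work on a copy, keep iris intact

# Rename by position
names(iris2)[3] <- "new_name"

# Rename by current name: find the matching column, then replace its name
names(iris2)[names(iris2) == "Species"] <- "species"

names(iris2)
```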

Question: How can you determine the column information of a data frame, such as the names, types, missing values, etc.?
Answer: Two built-in functions in R give information about the columns of a data frame.

str(iris)
summary(iris)
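Both functions print their results. When the information is needed as an object for further work, the following sketch is one possible alternative; the names col_types and na_per_col are made up for illustration.

```r
# Class of every column, returned as a named character vector
col_types <- sapply(iris, class)

# Number of missing values in every column
na_per_col <- colSums(is.na(iris))

col_types
na_per_col
```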

Exporting a Data Frame in R

Question: How can a data frame be exported in R so that it can be used in other statistical software?
Answer: Use the write.csv command to export the data in comma-separated format (CSV).

write.csv(iris, "iris.csv", row.names = FALSE)
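A quick way to check an export is to read the file back and compare it with the original. This is a minimal sketch that writes to tempdir() instead of the working directory, an assumption made so that no file is left behind.

```r
out_file <- file.path(tempdir(), "iris.csv")

# row.names = FALSE leaves out the column of row numbers
write.csv(iris, out_file, row.names = FALSE)

# Read the file back; stringsAsFactors = TRUE restores Species as a factor
iris_back <- read.csv(out_file, stringsAsFactors = TRUE)

dim(iris_back)
names(iris_back)
```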

Question: How can one select a particular row or column of a data frame?
Answer: The easiest way is to use the indexing notation [], with the rows given before the comma and the columns after it.

Suppose you want to select the first column only, then at the R prompt, write

iris[,1]

Suppose we want to select the first column and also want to put the content in a new vector, then

new <- iris[,1]

Suppose you want to select different columns, for example, columns 1, 3, and 5, then

newdata <- iris[, c(1, 3, 5)]

Suppose you want to select the first and third rows, then

iris[c(1, 3), ]
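The selections above can be collected in one short sketch. The drop = FALSE form and the selection by column name are not used in the commands above and are added here for comparison; the object names are made up.

```r
first_col <- iris[, 1]                  # numeric vector
first_df  <- iris[, 1, drop = FALSE]    # stays a one-column data frame
some_cols <- iris[, c(1, 3, 5)]         # columns 1, 3, and 5
some_rows <- iris[c(1, 3), ]            # rows 1 and 3, all columns
by_name   <- iris[, c("Sepal.Length", "Species")]   # columns chosen by name

names(some_cols)
rownames(some_rows)
```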

Dealing with Missing Values in a Data Frame

Question: How do you deal with missing values in a data frame?
Answer: In R language it is easy to deal with missing values. Suppose you want to import a file named “file.csv” that contains missing values represented by a “.” (period), then on the R prompt write

data <- read.csv("file.csv", na.strings = ".")

If missing values are represented as “NA” values (which is also the default), then write

dataset <- read.csv("file.csv", na.strings = "NA")

For a data frame that is already in R, such as the built-in data (here iris), the rows that contain missing values can be removed with

data <- na.omit(iris)
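The import step and the removal step can be tried together on a small file. In this sketch the contents of “file.csv” are invented, and the file is written to tempdir() so that the example is self-contained.

```r
# A small CSV file in which "." marks a missing value
csv_file <- file.path(tempdir(), "file.csv")
writeLines(c("x,y",
             "1,2",
             ".,4",
             "5,."), csv_file)

# Treat "." as missing while importing
dataset <- read.csv(csv_file, na.strings = ".")

n_missing <- colSums(is.na(dataset))   # missing values per column
complete  <- na.omit(dataset)          # keep only the rows without any NA

n_missing
complete
```

Without na.strings = ".", both columns would be read as text because of the periods.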
