
List in R Language

In R, a list is an object that consists of an ordered collection of objects known as its components. A list can hold any number of components, and the components may be of any mode (type). That is, one can put any kind of object (such as a vector, data frame, character object, matrix, or array) into a single list object. An example of a list is

> x <- list(c(1, 2, 3, 5), c("a", "b", "c", "d"), c(TRUE, TRUE, FALSE, TRUE, FALSE), matrix(1:9, nrow = 3))

This list contains four components: three vectors (numeric, character, and logical) and one matrix.

An object can also be converted to a list with the as.list() function. For a vector, the drawback is that each element of the vector becomes a separate component of the list. For example,

> as.list(1:10)
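
As a small sketch of the difference described above (the object names v, v_list, and one are chosen here only for illustration), compare as.list() with list() on the same vector:

```r
# A vector of ten integers
v <- 1:10

# as.list() splits the vector: every element becomes its own component
v_list <- as.list(v)
length(v_list)   # 10

# list() keeps the whole vector together as a single component
one <- list(v)
length(one)      # 1
```

If the vector should stay intact inside the list, list(v) is usually the form one wants.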

Extract components from a list

The [[ ]] operator (double square brackets) extracts a single component of a list. To extract the second component of the list x, one can write at the R prompt,

> x[[2]]

Using the [ ] operator returns a list rather than the component itself. The components of a list need not be of the same mode. The components are always numbered. If x1 is the name of a list with four components, the individual components may be referred to as x1[[1]], x1[[2]], x1[[3]], and x1[[4]].
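
The difference between the two operators can be checked directly. The following is a minimal sketch with a two-component list; the names part and comp are only illustrative:

```r
x <- list(c(1, 2, 3, 5), c("a", "b", "c", "d"))

part <- x[2]     # single brackets: a list of length one
comp <- x[[2]]   # double brackets: the character vector itself

is.list(part)       # TRUE
is.character(comp)  # TRUE
length(comp)        # 4
```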

If the components of a list are named, they can also be extracted by name. For example, a list with named components is

> x1 <- list(a = c(1, 2, 3, 5), b = c("a", "b", "c", "d"), c = c(TRUE, TRUE, FALSE, TRUE, FALSE), d = matrix(1:9, nrow = 3))

To extract the component a, one can write

> x1$a
> x1[["a"]]
> x1["a"]    #returns a list of length one that contains a

To extract more than one component, or an element inside a component, one can write

> x[c(1,2)]    #extract components one and two
> x[-1]        #extract all components except the 1st
> x[[c(1,2)]]  #extract 2nd element of component one
> x[[c(2,2)]]  #extract 2nd element of component two
> x[2:4]       #extract components 2 to 4 (as a list)
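
When [[ ]] is given a vector of indices, R indexes recursively: the first index picks a component and the next index picks an element inside it. The sketch below, which rebuilds the list x from the first example, shows that x[[c(1, 2)]] is the same as x[[1]][[2]], and that a range of components is taken with single brackets:

```r
x <- list(c(1, 2, 3, 5), c("a", "b", "c", "d"),
          c(TRUE, TRUE, FALSE, TRUE, FALSE), matrix(1:9, nrow = 3))

a <- x[[c(1, 2)]]   # recursive indexing: component 1, then element 2
b <- x[[1]][[2]]    # the same thing written step by step
identical(a, b)     # TRUE

x[[c(2, 2)]]        # "b"

sub <- x[2:4]       # single brackets keep a list with components 2 to 4
length(sub)         # 3
```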

Import Data using read.table function

Question: How can I check my working directory so that I can import my data into R?
Answer: To find the working directory, use the command getwd(), that is

> getwd()

Question: How can I change the working directory to my own path?
Answer: Use the function setwd(), that is

> setwd("d:/mydata")
> setwd("C:/Users/XYZ/Documents")

Question: I have a data set stored in text (ASCII) format that contains rectangular data. How can I read this data in tabular form? I have already set my working directory.
Answer: As the data file is already in the working directory, use one of the following commands

> mydata <- read.table("data.dat")
> mydata <- read.table("data.txt")

mydata is a named object that holds the data from the file "data.dat" or "data.txt" as a data frame. By default, the variables in the data file are named V1, V2, ….

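Question: How can this stored data be accessed?
Answer: To access a variable, write the name of the data frame object ("mydata") followed by the $ sign and the name of the variable. Indexing with [ ] also works; note that mydata["V1"] returns a one-column data frame, while the other forms return a vector. That is,

mydata$V1
mydata$V2
mydata["V1"]
mydata[,1]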
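
To try these commands without a real data file, one can write a small file first. The sketch below is only an illustration: the temporary file and its three rows are made up and stand in for "data.dat".

```r
# Create a small whitespace-separated file (hypothetical contents)
tmp <- tempfile(fileext = ".dat")
writeLines(c("1 2.5 a", "2 3.5 b", "3 4.5 c"), tmp)

# Read it with the default arguments: columns are named V1, V2, V3
mydata <- read.table(tmp)
names(mydata)

mydata$V1             # first variable as a vector
mydata[, 1]           # same values, selected by position
class(mydata["V1"])   # "data.frame": single brackets keep a data frame

unlink(tmp)           # remove the temporary file
```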

Question: My data file has variable names in its first row. In the previous question, the variable names were V1, V2, V3, … How can I get the actual variable names stored in the first row of the data.dat file?
Answer: Instead of reading the data file with the default argument values, use

> read.table("data.dat", header = TRUE)

Question: How can I read a data file that is not stored in the working directory?
Answer: To access a data file that is not stored in the working directory, provide the complete path of the file, such as

> read.table("d:/data.dat", header = TRUE)
> read.table("d:/Rdata/data.txt", header = TRUE)

Note that read.table() is used to read data from external files, which normally have a special form:

  • The first line of the file should have a name for each variable in the data frame. If the first row does not contain variable names, the header argument should be set to FALSE.
  • Each additional line of the file has as its first item a row label, followed by the values for each variable.
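
The file form described above can be sketched with two small temporary files. The file contents and the names d and d2 are made up for illustration. In the first file the header line has one field fewer than the data lines, so read.table() treats it as a header and uses the first column as row labels; in the second file there are no row labels, so header = TRUE is given explicitly.

```r
# File 1: header has one field fewer than the data lines (row labels present)
f1 <- tempfile(fileext = ".dat")
writeLines(c("height weight", "r1 150 52", "r2 160 61"), f1)
d <- read.table(f1)
names(d)       # "height" "weight"
rownames(d)    # "r1" "r2"

# File 2: no row labels, so the header must be requested explicitly
f2 <- tempfile(fileext = ".dat")
writeLines(c("height weight", "150 52", "160 61"), f2)
d2 <- read.table(f2, header = TRUE)
names(d2)      # "height" "weight"

unlink(c(f1, f2))
```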

In R, it is strongly suggested that variables be held in a data frame. The read.table() function can be used for this purpose. For further details about the read.table() function, use

help(read.table)

 

R FAQs about Matrix | Data Structure for Matrix in R

Question: What is a matrix in R?
Answer: In R, matrices are two-dimensional arrays of elements that are all of the same type, for example numbers, character strings, or logical values.

Matrices may be constructed using the built-in function matrix(), which reshapes its first argument into a matrix with the number of rows given by its second argument and the number of columns given by its third argument.

Question: Give an example of how a matrix is constructed in R?
Answer: A 3 by 3 matrix (3 rows and 3 columns) may be constructed as follows:

matrix(1:9, 3, 3)
matrix(c(1,2,3,4,5,6,7,8,9), 3, 3)
matrix(runif(9), 3, 3)

The first two commands construct a matrix with 3 rows and 3 columns containing the numbers from 1 to 9. The third command makes a matrix with 3 rows and 3 columns filled with random numbers from the uniform distribution on the interval from 0 to 1.

Question: How are the matrix elements filled?
Answer: A matrix is filled by columns unless the optional argument byrow is set to TRUE in the matrix() call, for example

matrix(1:9, 3, 3, byrow=TRUE)
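
The two fill orders can be compared side by side. In this small sketch the object names by_col and by_row are only illustrative:

```r
by_col <- matrix(1:9, 3, 3)                # default: filled column by column
by_row <- matrix(1:9, 3, 3, byrow = TRUE)  # filled row by row

by_col[1, ]   # 1 4 7
by_row[1, ]   # 1 2 3

# Filling by row gives the transpose of filling by column
all(by_row == t(by_col))   # TRUE
```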

Question: Can a matrix be stored in R?
Answer: Any matrix can be stored in an object, such as

m <- matrix(1:9, 3, 3)
mymatrix <- matrix(rnorm(16), nrow = 4)

The matrices are stored in the objects "m" and "mymatrix". The second command constructs a matrix with 4 rows from 16 random numbers drawn from the normal distribution with mean 0 and variance 1.

Question: What is the use of the dim command in R?
Answer: dim (dimension) is an attribute of a matrix in R that gives the number of rows and the number of columns of the matrix, for example,

dim(mymatrix)

This produces the output 4 4, meaning that the matrix has 4 rows and 4 columns.

Question: Can we name the rows of a matrix in R?
Answer: Yes, in R we can name the rows of a matrix according to one's requirements, such as

rownames(mymatrix) <- c("x1", "x2", "x3", "x4")
mymatrix

Question: Can column names be changed or updated in R?
Answer: The procedure is the same as for changing row names. For this purpose the colnames command is used, for example

colnames(mymatrix) <- c("A", "B", "C", "D")
mymatrix

Question: What is the purpose of the attributes command for a matrix in R?
Answer: The attributes function can be used to get information about the dimensions of a matrix and its dimnames (dimension names). For example,

attributes(mymatrix)
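
Putting the last few answers together, the following sketch rebuilds mymatrix, names its rows and columns, and then inspects the attributes. The call to set.seed() is an addition here so that the random numbers are reproducible; att is an illustrative name.

```r
set.seed(1)                                # assumption: fixed seed for reproducibility
mymatrix <- matrix(rnorm(16), nrow = 4)

rownames(mymatrix) <- c("x1", "x2", "x3", "x4")
colnames(mymatrix) <- c("A", "B", "C", "D")

dim(mymatrix)                 # 4 4

att <- attributes(mymatrix)   # a list holding dim and dimnames
att$dim
att$dimnames

# Once names are set, elements can be selected by name
mymatrix["x2", "C"]
```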

 

R FAQs about Data Frame

Please load the required data set before running the commands given below. As an example for these R FAQs about data frames, we use the iris data set, which is already available in R. At the R prompt, write data(iris).

Question: How to name or rename a column in a data frame?
Answer: Suppose you want to rename the 3rd column of the data frame; then at the R prompt write

> names(iris)[3] <- "new_name"

Suppose you want to rename the second and fourth columns of the data frame

> names(iris)[c(2,4)] <- c("A", "D")

Note that the names(iris) command returns the names of all columns in the data frame.
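
To experiment with renaming without changing iris itself, one can work on a copy. This is a small sketch; the name df is only illustrative:

```r
df <- iris                           # work on a copy, leave iris untouched

names(df)[3] <- "new_name"           # rename the third column
names(df)[c(2, 4)] <- c("A", "D")    # rename the second and fourth columns

names(df)
# "Sepal.Length" "A" "new_name" "D" "Species"
```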

Question: How can you determine column information for a data frame, such as names, types, and missing values?
Answer: There are two built-in functions in R that give information about the columns of a data frame.

> str(iris)
> summary(iris)

Question: How can a data frame be exported from R so that it can be used in other statistical software?
Answer: Use the write.csv command to export the data in comma-separated values (CSV) format.

> write.csv(iris, "iris.csv", row.names = FALSE)
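
A quick way to check an export is to read the file back. The sketch below writes to a temporary file instead of "iris.csv" so that nothing is left in the working directory; out and back are illustrative names.

```r
out <- tempfile(fileext = ".csv")          # temporary stand-in for "iris.csv"
write.csv(iris, out, row.names = FALSE)

back <- read.csv(out)                      # read the exported file again
dim(back)                                  # 150 5
names(back)                                # same column names as iris

unlink(out)
```

With row.names = FALSE the row numbers are not written, so the file has exactly one column per variable.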

Question: How can one select a particular row or column of a data frame?
Answer: The easiest way is to use the indexing notation [ ].

Suppose you want to select the first column only; then at the R prompt, write

> iris[,1]

Suppose you want to select the first column and store its contents in a new vector; then

> new <- iris[,1]

Suppose you want to select several columns, for example columns 1, 3, and 5; then

> newdata <- iris[, c(1, 3, 5)]

Suppose you want to select the first and third rows; then

> iris[c(1,3), ]
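
The kind of object returned depends on what is selected. In the sketch below the object names are illustrative; drop = FALSE is a standard argument of [ ] that is not used in the text above and is shown only as an option for keeping a data frame.

```r
first_col <- iris[, 1]                  # one column: a numeric vector
keep_df   <- iris[, 1, drop = FALSE]    # one column, kept as a data frame
newdata   <- iris[, c(1, 3, 5)]         # several columns: a data frame
rows      <- iris[c(1, 3), ]            # first and third rows, all columns

is.numeric(first_col)     # TRUE
is.data.frame(keep_df)    # TRUE
names(newdata)            # "Sepal.Length" "Petal.Length" "Species"
nrow(rows)                # 2
```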

Question: How to deal with missing values in a data frame?
Answer: In R, it is easy to deal with missing values. Suppose you want to import a file named "file.csv" that contains missing values represented by a "." (period); then at the R prompt write

> data <- read.csv("file.csv", na.strings = ".")

If missing values are represented as "NA", which is also the default for na.strings, then write

> dataset <- read.csv("file.csv", na.strings = "NA")

To drop the rows that contain missing values from a data frame, including a built-in one (here iris), use

> data <- na.omit(iris)
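
Since iris has no missing values, the effect is easier to see on a small made-up file. In this sketch the file contents and the names dat and clean are hypothetical:

```r
# A tiny CSV in which a period marks a missing value (made-up data)
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,10", "2,.", "3,30"), tmp)

dat <- read.csv(tmp, na.strings = ".")
is.na(dat$score)      # FALSE TRUE FALSE

clean <- na.omit(dat) # drops every row that contains an NA
nrow(clean)           # 2

unlink(tmp)
```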

 
