Reading Data in R
This section covers reading and writing data in R. The following functions read (import) data into R.
- read.table() and read.csv() for reading tabular data
- readLines() for reading lines of a text file
- source() for reading in R code files (inverse of dump)
- dget() for reading an R object back from its text representation (inverse of dput)
- load() for reading in saved workspaces.
Writing Data to files
The following functions write (export) data to files.
- write.table() and write.csv() export tabular data to delimited text files, including comma-separated (CSV) and tab-delimited files.
- writeLines() writes text lines to a text-mode connection.
- dump() takes a vector of names of R objects and produces text representations of the objects on a file (or connection). A dump file can usually be sourced into another R session.
- dput() writes an ASCII text representation of an R object to a file (or connection); dget() can later use that representation to recreate the object.
- save() writes an external representation of R objects to the specified file.
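The reading and writing functions listed above come in pairs. The following is a minimal sketch of each round trip, assuming a temporary directory is acceptable for the output; the file names and the objects df, alpha, and beta are made up for illustration.

```r
# Work in a temporary directory so no real files are touched.
tmp <- tempfile("io_demo_")
dir.create(tmp)

# Text lines: writeLines() / readLines()
txt_file <- file.path(tmp, "notes.txt")
writeLines(c("first line", "second line"), txt_file)
lines_back <- readLines(txt_file)

# A single object as R code: dput() / dget()
df <- data.frame(id = 1:3, label = c("a", "b", "c"))
dput_file <- file.path(tmp, "df.R")
dput(df, file = dput_file)
df_back <- dget(dput_file)

# Several named objects as R code: dump() / source()
alpha <- c(1.5, 2.5)
beta <- "hello"
dump_file <- file.path(tmp, "objects.R")
dump(c("alpha", "beta"), file = dump_file)
rm(alpha, beta)
source(dump_file, local = TRUE)  # recreates alpha and beta

# R's own saved-workspace format: save() / load()
rda_file <- file.path(tmp, "objects.RData")
save(df, alpha, file = rda_file)
rm(df, alpha)
load(rda_file)  # restores df and alpha under their original names
```

Note that dget() returns the object, so it can be assigned to any name, whereas source() and load() recreate the objects under the names they were written with.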
Reading data files with read.table()
The read.table() function is one of the most commonly used functions for reading data into R. It has a few important arguments.
- file, the name of a file, or a connection
- header, logical indicating if the file has a header line
- sep, a string indicating how the columns are separated
- colClasses, a character vector indicating the class of each column in the data set
- nrows, the maximum number of rows to read from the dataset
- comment.char, a character string indicating the comment character
- skip, the number of lines to skip from the beginning
- stringsAsFactors, a logical indicating whether character variables should be coded as factors (the default is FALSE since R 4.0.0)
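The arguments above can be combined in a single call. The sketch below first writes a small sample file so that it is self-contained; the file contents, the ";" separator, the "%" comment character, and the name scores are assumptions chosen for illustration.

```r
# Create a small sample file to read (contents are made up for this sketch).
path <- tempfile(fileext = ".txt")
writeLines(c(
  "exported by some tool",     # junk first line, discarded by skip = 1
  "id;name;score",             # header line
  "1;Ann;9.5",
  "% this line is a comment",  # ignored because comment.char = "%"
  "2;Bob;7.25",
  "3;Cat;8"                    # not read because nrows = 2
), path)

scores <- read.table(
  file = path,
  header = TRUE,
  sep = ";",
  colClasses = c("integer", "character", "numeric"),
  nrows = 2,
  comment.char = "%",
  skip = 1,
  stringsAsFactors = FALSE
)
str(scores)
```

Supplying colClasses and nrows is optional, but it spares R from having to guess the column types and the table size.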
read.table() and read.csv() Examples
data <- read.table("foo.txt")
data <- read.table("D:\\datafiles\\mydata.txt")
data <- read.csv("D:\\datafiles\\mydata.csv")
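With read.table(), R automatically skips lines that begin with a # and figures out how many rows there are (and how much memory needs to be allocated). R also figures out what type of variable is in each column of the table. Note that read.csv() sets comment.char = "" by default, so it does not treat # as a comment character unless asked to.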
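A small sketch of this default behaviour follows, assuming a made-up whitespace-separated file; the name measurements and the file contents are illustrative only.

```r
# Write a tiny sample file with a comment line at the top.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "# measurements (made-up sample data)",
  "id height passed",
  "1 1.62 TRUE",
  "2 1.80 FALSE"
), path)

# Only header = TRUE is supplied; everything else is left at its default.
measurements <- read.table(path, header = TRUE)

# Inspect the class R chose for each column.
col_types <- sapply(measurements, class)
print(col_types)
```

Here the comment line is dropped, and the three columns are read as integer, numeric, and logical respectively.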
Writing data files with write.table()
The following are a few important arguments of the write.table() function.
- x, the object to be written, typically a data frame
- file, the name of the file which the data are to be written to
- sep, the field separator string
- col.names, a logical value indicating whether the column names of x are to be written along with x, or a character vector of column names to be written
- row.names, a logical value indicating whether the row names of x are to be written along with x, or a character vector of row names to be written
- na, the string to use for missing values in the data
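As a minimal sketch of how these arguments fit together, the call below writes a tab-delimited file without row names and with a custom marker for missing values. The data frame results, the temporary file, and the string "missing" are assumptions made for illustration.

```r
results <- data.frame(
  name = c("Ann", "Bob"),
  score = c(9.5, NA)
)
out <- tempfile(fileext = ".tsv")

write.table(
  results,
  file = out,
  sep = "\t",          # tab-delimited output
  col.names = TRUE,    # write the column names as a header line
  row.names = FALSE,   # leave out the row names
  na = "missing"       # string written in place of NA
)

# Read the raw lines back to see exactly what was written.
written <- readLines(out)
print(written)
```

By default write.table() also puts double quotes around character values and column names; numeric values and the na string are written unquoted.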
write.table() and write.csv() Examples
x <- data.frame(a = 5, b = 10, c = pi)
write.table(x, file = "data.csv", sep = ",")
write.table(x, "c:\\mydata.txt", sep = "\t")
write.csv(x, file = "data.csv")
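The two ways of producing a comma-separated file above are not quite equivalent, because both write row names by default but only write.csv() adds a header entry for them. The sketch below compares the header lines, writing to temporary files (an assumption made here so that nothing in the working directory is overwritten).

```r
x <- data.frame(a = 5, b = 10, c = pi)
f_table <- tempfile(fileext = ".csv")
f_csv <- tempfile(fileext = ".csv")

write.table(x, file = f_table, sep = ",")
write.csv(x, file = f_csv)

# First line of each file: the header.
header_table <- readLines(f_table)[1]  # three fields: a, b, c
header_csv <- readLines(f_csv)[1]      # four fields: an empty one, then a, b, c
print(header_table)
print(header_csv)

# Reading the write.csv() output back, using column 1 as the row names.
back <- read.csv(f_csv, row.names = 1)
print(names(back))
```

If the row names carry no information, passing row.names = FALSE to either function avoids the extra column altogether.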