This post explains how to import data using the read.table() function in R. You will also learn what a file path is and how to get and set the working directory in R. The read.table() function imports tabular data, typically from text files, and converts it from a flat-file format into a data frame, R's standard structure for rectangular data.
Question: How can I check my working directory so that I can import my data in R?
Answer: To find the current working directory, use the getwd() command, that is
getwd()
Question: How can I change the working directory to my path?
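Answer: Use the setwd() function with the path of the target folder, for example
setwd("d:/mydata")
setwd("C:/Users/XYZ/Documents")
Write paths with forward slashes (or doubled backslashes), even on Windows, because a single backslash is an escape character in R strings.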
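The two commands above can be combined into a small sketch that changes the working directory and then restores it. Here tempdir() is only a stand-in for a real data folder (an assumption made so the code runs on any machine), and old_wd and same_dir are illustrative names.

```r
# Remember the current working directory so it can be restored later
old_wd <- getwd()

# tempdir() stands in for a real data folder such as "d:/mydata"
setwd(tempdir())

# normalizePath() resolves symbolic links and slash style before comparing
same_dir <- normalizePath(getwd()) == normalizePath(tempdir())
print(same_dir)

# Return to the original working directory
setwd(old_wd)
```

Restoring the original directory at the end keeps the rest of a script from silently reading files out of the wrong folder.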
Import Data using read.table Function in R
Question: I have a data set stored in text format (ASCII) that contains rectangular data. How can I read this data in tabular form? I have already set my working directory.
Answer: As the data file is already in the working directory, the file name alone is enough. Import the data with the read.table() function:
mydata <- read.table("data.dat")
mydata <- read.table("data.txt")
Here, mydata is an object that holds the data from the file “data.dat” or “data.txt” as a data frame. Because no variable names were supplied, the variables are named V1, V2, … by default.
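The default naming can be checked with a small self-contained sketch. The file contents below are made-up values written to a temporary file, so nothing is assumed about files on your machine.

```r
# Write a small whitespace-separated file without a header line (made-up values)
path <- tempfile(fileext = ".txt")
writeLines(c("1 2.5 a", "2 3.5 b", "3 4.5 c"), path)

# With default arguments, read.table() names the columns V1, V2, V3
mydata <- read.table(path)

print(names(mydata))
str(mydata)
```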
Question: How can this stored data be accessed?
Answer: To access a variable, write the data frame object name (“mydata”) followed by the $ sign and the name of the variable. Square brackets with a column name or a column position also work. That is,
mydata$V1
mydata$V2
mydata["V1"]
mydata[ , 1]
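The access forms above do not all return the same kind of object, which the following sketch illustrates. The data frame is built by hand with made-up values, and the names col_dollar, col_name, and col_index are illustrative only.

```r
# A small hand-made data frame standing in for imported data
mydata <- data.frame(V1 = c(10, 20, 30), V2 = c("a", "b", "c"))

col_dollar <- mydata$V1     # a plain numeric vector
col_name   <- mydata["V1"]  # a data frame with one column
col_index  <- mydata[, 1]   # a plain numeric vector, selected by position

print(class(col_dollar))
print(class(col_name))
print(class(col_index))
```

The $ and [ , 1] forms are convenient for calculations on a single variable, while the ["V1"] form keeps the data frame structure.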
Question: My data file has variable names in the first row of the data file. In the previous question, the variable names were V1, V2, V3, … How can I get the actual names of the variables stored in the first row of the data.dat file?
Answer: Instead of reading a data file with default values of arguments, use
read.table("data.dat", header = TRUE)
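As a rough illustration of the difference, the sketch below reads the same temporary file with and without header = TRUE. The variable names id, height, and group are made up for the example.

```r
# A file whose first row holds the variable names (made-up data)
path <- tempfile(fileext = ".dat")
writeLines(c("id height group", "1 150.5 a", "2 160.2 b"), path)

# Default arguments: the first row is treated as data, columns are V1, V2, V3
no_header <- read.table(path)

# header = TRUE: the first row supplies the column names
with_header <- read.table(path, header = TRUE)

print(names(no_header))
print(names(with_header))
```

Note that without header = TRUE the name row becomes a data row, so every column is read as text rather than as numbers.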
Question: I want to read a data file that is not stored in the working directory.
Answer: To read a data file that is not stored in the working directory, provide the complete path of the file, such as
read.table("d:/data.dat", header = TRUE)
read.table("d:/Rdata/data.txt", header = TRUE)
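A complete path can also be assembled with file.path() and checked before reading. In this sketch tempdir() stands in for a folder such as "d:/Rdata", and data_dir and data_file are hypothetical names used only for illustration.

```r
# tempdir() is a stand-in for a real folder outside the working directory
data_dir  <- tempdir()
data_file <- file.path(data_dir, "data.txt")

# Create a small example file so that the sketch is runnable (made-up values)
writeLines(c("x y", "1 2", "3 4"), data_file)

# Read the file only if it exists at the given path
if (file.exists(data_file)) {
  mydata <- read.table(data_file, header = TRUE)
  print(mydata)
}
```

Checking with file.exists() first gives a clearer failure point than the error read.table() raises for a missing file.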
Note that read.table() is designed to read data from external files that normally have a special form:
- The first line of the file should have a name for each variable in the data frame. However, if the first row does not contain variable names, the header argument should be FALSE, not TRUE.
- Each additional line of the file has as its first item a row label, followed by the values for each variable.
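A file in this special form can be read without any extra arguments, as the following sketch suggests. The variable names Price and Floor and the row labels r1 to r3 are made up for the example.

```r
# The header line has one fewer field than the data lines,
# because each data line starts with a row label (made-up data)
path <- tempfile(fileext = ".dat")
writeLines(c("Price Floor",
             "r1 52.00 111",
             "r2 54.75 128",
             "r3 57.50 101"), path)

# read.table() detects the header and uses the first item as the row label
houses <- read.table(path)

print(houses)
print(rownames(houses))
```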
In R, it is strongly recommended to hold the variables of a data set together in a data frame, and the read.table() function does this on import. For further details about the read.table() function, use
help(read.table)
Important Arguments of read.table Function:
- file: (required argument) The path to the file to be read.
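- header: A logical value (TRUE or FALSE) indicating whether the first line of the file contains column names. If header is not supplied, it is treated as TRUE only when the first line has one fewer field than the remaining lines; otherwise it is FALSE.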
- sep: The field separator that separates the values on each line. The default, sep = "", means white space (one or more spaces, tabs, or newlines). One can specify other delimiters such as a comma (",") or a tab ("\t").
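- as.is: A vector of logical values or column indices specifying which character columns should be kept as character vectors instead of being converted to factors. It does not stop columns from being read as numeric; to suppress all conversion, use colClasses = "character".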
- colClasses: A character vector giving the class of each column (e.g., "numeric", "character"). Useful for ensuring that each column is read in the intended format during import.
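The arguments listed above can be combined, as in the following sketch that reads a comma-separated file. The file contents and the column names id, name, and score are made up for illustration.

```r
# A comma-separated file with a header row (made-up data)
path <- tempfile(fileext = ".csv")
writeLines(c("id,name,score", "001,Ann,9.5", "002,Bob,7.25"), path)

# Default type conversion turns the id column into integers (leading zeros are lost)
by_default <- read.table(path, header = TRUE, sep = ",")

# colClasses keeps id as text and fixes the type of every column
mydata <- read.table(
  path,
  header = TRUE,
  sep = ",",
  colClasses = c("character", "character", "numeric")
)

str(by_default)
str(mydata)
```

Fixing the classes up front is one way to protect identifiers such as codes with leading zeros from being converted to numbers.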