This post covers R language questions that are commonly asked in interviews and in R-related examinations and tests.
R Language Questions
Question: What is a file in R? Answer: A script file written in R has the file extension .R. Because R is a programming language designed for statistical computing and graphics, an R file contains code that can be executed within the R software environment.
Question: What is the table in R? Answer: A table in R is an object of class “table”, for which an as.data.frame method is available. It is a data structure used to represent categorical data and frequency counts. A table provides a convenient way to summarize and organize data in tabular form, making it easier to analyze and interpret.
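As a minimal sketch of the frequency-count use described above, the following builds a table from a small made-up vector. The data and the object names colours, freq, and freq_df are illustrative assumptions, not part of the original answer.

```r
# Hypothetical categorical data, invented for illustration
colours <- c("red", "blue", "red", "green", "blue", "red")

# table() counts how often each category occurs
freq <- table(colours)
print(freq)

# A table can be converted to a data frame with as.data.frame()
freq_df <- as.data.frame(freq)
print(freq_df)
```

The data frame form has one row per category, which is often easier to pass on to other functions.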
Factor Variables in R
Question: What is the factor variable in R language? Answer: Factor variables are categorical variables that hold either string or numeric values. They are used in various types of graphics and, in particular, in statistical modeling, where the correct number of degrees of freedom is assigned to them.
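A short sketch of creating and inspecting a factor may help here. The treatment labels and the object names are hypothetical and chosen only for illustration.

```r
# Hypothetical treatment labels, invented for illustration
treatment <- c("low", "high", "medium", "low", "high")

# factor() stores the distinct categories as levels;
# the levels argument fixes their order
f <- factor(treatment, levels = c("low", "medium", "high"))

print(levels(f))    # the categories, in the order given above
print(nlevels(f))   # the number of categories
print(table(f))     # how often each level occurs
```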
Data Structure in R
Question: What is Data Structure in R? Answer: A data structure is a specialized format for organizing and storing data. General data structure types include the array, the file, the record, the table, the tree, and so on. R offers several data structures, each with its own characteristics and purposes. The common data structures in R are the vector, factor, matrix, array, data frame, and list.
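The data structures listed above can each be created in one line. The following is only a sketch with made-up values; the object names are assumptions for illustration.

```r
v   <- c(1.5, 2.5, 3.5)                      # vector: elements of one type
f   <- factor(c("a", "b", "a"))              # factor: categorical values
m   <- matrix(1:6, nrow = 2)                 # matrix: two dimensions, one type
a   <- array(1:24, dim = c(2, 3, 4))         # array: any number of dimensions
df  <- data.frame(id = 1:3, score = v)       # data frame: columns may differ in type
lst <- list(numbers = v, label = "example")  # list: elements of any type

# str() gives a compact description of any of these objects
str(df)
str(lst)
```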
scan() Function in R
Question: What is scan() in R? Answer: The scan() function reads data values into a vector or list from the console or a file. For example:
> z <- scan()
1: 12 5
3: 2
4:
Read 3 items
> z
[1] 12 5 2
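The console session above needs typed input. As a non-interactive sketch, the same values can be read from a character string (via the text argument of scan()) or from a file; the temporary file used here is an assumption made so the example is self-contained.

```r
# Read values from a string instead of the console
z <- scan(text = "12 5 2", quiet = TRUE)
print(z)

# Read the same values from a (temporary) file
tmp <- tempfile(fileext = ".txt")
writeLines(c("12 5", "2"), tmp)
z_file <- scan(tmp, quiet = TRUE)
print(z_file)
unlink(tmp)   # remove the temporary file
```

Setting quiet = TRUE only suppresses the "Read 3 items" message.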
readline() Function in R
Question: What is readline() in R? Answer: The readline() function reads a single line of input from the keyboard (the terminal, in interactive use) and returns it as a character string. It should not be confused with readLines(), which reads some or all text lines from a connection.
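A minimal sketch of both functions follows. The helper ask_age() and the two sample lines are hypothetical; note that readline() only waits for input in an interactive session and returns an empty string otherwise, so the helper is defined here but not called.

```r
# readline() returns what the user types as a character string,
# so numeric input has to be converted explicitly
ask_age <- function() {
  answer <- readline(prompt = "Enter your age: ")
  as.numeric(answer)
}

# readLines() reads lines from a connection rather than the keyboard;
# a text connection stands in for a file here
con <- textConnection(c("first line", "second line"))
first <- readLines(con, n = 1)
close(con)
print(first)
```

In an interactive session, age <- ask_age() would prompt for a value and store it as a number.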
This post is about how to import data using the read.table() function in R. You will also learn what a file path is and how to get and set the working directory in the R language. The read.table() function is a powerful tool for importing tabular data, typically from text files, into the R environment. It converts tabular data from a flat-file format into a more usable data structure called the data frame.
Question: How can I check my Working Directory so that I would be able to import my data in R? Answer: To find the working directory, the command getwd() can be used, that is
getwd()
Question: How can I change the working directory to my path? Answer: Use the setwd() function, passing the path of the desired directory as a character string.
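A minimal sketch of changing the working directory and restoring it afterwards is given below. The target directory is an assumption: tempdir() is used only so that the example runs anywhere, and you would replace it with your own path.

```r
old_dir <- getwd()      # remember the current working directory
new_dir <- tempdir()    # stand-in target; replace with your own path

setwd(new_dir)          # change the working directory
print(getwd())          # confirm the change

setwd(old_dir)          # switch back to the original directory
```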
The general form of a read.table() call with its most common arguments is:
data <- read.table(file,
                   header = FALSE,
                   sep = "",
                   dec = ".",
                   stringsAsFactors = FALSE)
Key Parameters of read.table in R
| Parameter | Description | Default | Common Values |
|---|---|---|---|
| file | File path/URL | none | "data.txt", "https://example.com/data.csv" |
| header | First row as column names | FALSE | TRUE/FALSE |
| sep | Field separator | "" (whitespace) | ",", "\t", ";" |
| dec | Decimal separator | "." | ",", "." |
| na.strings | Missing value codes | "NA" | "N/A", "", "999" |
| stringsAsFactors | Convert strings to factors | FALSE | TRUE/FALSE |
| colClasses | Specify column types | NA | "numeric", "character", "factor" |
| nrows | Number of rows to read | -1 (all) | 100, 1000 |
| skip | Lines to skip at start | 0 | 1, 5 |
Import Data using read.table Function in R
Question: I have a data set stored in text format (ASCII) that contains rectangular data. How can I read this data in tabular form? I have already set my working directory. Answer: As the data file is already in the working directory, pass its name to read.table(), for example mydata <- read.table("data.dat").
Here mydata is a named object that holds the data from the file “data.dat” or “data.txt” in data frame format. By default, the variables in the data file are named V1, V2, …
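The following sketch shows the default behaviour end to end. Because the original data file is not available, a small temporary file with invented values stands in for “data.dat”.

```r
# Create a small whitespace-separated file (invented values)
tmp <- tempfile(fileext = ".dat")
writeLines(c("1 2.5 a", "2 3.5 b", "3 4.5 c"), tmp)

# Read it with the default arguments of read.table()
mydata <- read.table(tmp)

print(names(mydata))   # default variable names
print(mydata)
unlink(tmp)            # remove the temporary file
```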
Question: How can this stored data be accessed? Answer: To access the stored data, write the data frame object name (“mydata”) followed by the $ sign and the name of the variable. A column can also be selected by name or by position using square brackets. That is,
mydata$V1
mydata$V2
mydata["V1"]
mydata[ , 1]
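The access forms above do not all return the same kind of object. The sketch below uses a small made-up data frame named mydata (an assumption, so that the example is self-contained) to show the difference.

```r
# Stand-in for the imported data (invented values)
mydata <- data.frame(V1 = c(10, 20, 30), V2 = c("a", "b", "c"))

col_vec <- mydata$V1      # a plain vector
col_df  <- mydata["V1"]   # a data frame with one column
col_pos <- mydata[, 1]    # a plain vector, selected by position

print(class(col_vec))
print(class(col_df))
print(identical(col_vec, col_pos))
```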
Question: My data file has variable names in the first row of the data file. In the previous question, the variable names were V1, V2, V3, … How can I get the actual names of the variables stored in the first row of the data.dat file? Answer: Instead of reading a data file with default values of arguments, use
read.table("data.dat", header = TRUE)
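To see the effect of the header argument, the sketch below reads the same file twice. The file contents and the variable names height and weight are invented for illustration.

```r
# A small file whose first row holds the variable names (invented values)
tmp <- tempfile(fileext = ".dat")
writeLines(c("height weight", "170 65", "182 80"), tmp)

no_header   <- read.table(tmp)                  # first row treated as data
with_header <- read.table(tmp, header = TRUE)   # first row used as names

print(names(no_header))
print(names(with_header))
unlink(tmp)
```

Without header = TRUE the names end up as a data row, and the columns are read as character instead of numeric.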
Question: I want to read a data file that is not stored in the working directory. Answer: To access a data file that is not stored in the working directory, provide the complete path of the file to read.table().
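As a sketch, a complete path can be built with file.path() and passed to read.table(). The directory here is an assumption: tempdir() stands in for whatever folder actually holds your file, and the file contents are invented.

```r
# Hypothetical location of the data file; replace with your own folder
data_dir  <- tempdir()
full_path <- file.path(data_dir, "data.dat")

# Create the file so that the example is self-contained
writeLines(c("x y", "1 2", "3 4"), full_path)

# Read the file through its complete path
mydata <- read.table(full_path, header = TRUE)
print(mydata)
unlink(full_path)
```

Using file.path() avoids hard-coding the path separator, which differs between operating systems.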
Note that read.table() is used to read the data from external files that normally have a special form:
The first line of the file should have a name for each variable in the data frame. However, if the first row does not contain variable names, then the header argument should be set to FALSE.
Each additional line of the file has as its first item a row label, followed by the values for each variable.
In R, it is strongly suggested that variables be held in a data frame. For this purpose, the read.table() function can be used. For further details about the read.table() function, use
help(read.table)
Important Arguments of read.table Function:
file: (required argument) The path to the file one wants to read.
header: A logical value (TRUE or FALSE) indicating whether the first line of the file contains column names. The default value is set to FALSE.
sep: The separator between the values of adjacent columns. The default is white space. One can specify other delimiters like commas (“,”) or tabs (“\t”).
as.is: A vector of logical values or column indices specifying which character columns should be kept as character rather than converted to factors.
colClasses: A vector specifying the data type for each column (e.g., numeric, character). This is useful for ensuring the data is read in the correct format during import.
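The arguments above can be combined in a single call. The sketch below reads comma-separated text supplied through the text argument of read.table() instead of a file; the data and column names are invented for illustration.

```r
# Invented comma-separated data with a header row
raw <- "id,name,score
1,Ann,3.5
2,Bob,4.0
3,Cy,2.5"

scores <- read.table(text = raw,
                     header = TRUE,     # first line holds column names
                     sep = ",",         # values are separated by commas
                     colClasses = c("integer", "character", "numeric"))

str(scores)   # check that each column has the requested type
```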
read.table vs Similar Functions
| Function | Best For | Speed | Package |
|---|---|---|---|
| read.table() | General text files | Slow | Base R |
| read.csv() | CSV files | Slow | Base R |
| fread() | Large files | Very Fast | data.table |
| read_delim() | Tidyverse workflow | Fast | readr |
| read_excel() | Excel files | Medium | readxl |
Best Practices when using read.table Function in R
Always specify column types (colClasses) for large files
Handle missing values explicitly with na.strings
Use faster alternatives (fread, readr) for files >100MB
Check encoding for international character sets
Validate imports with str(), summary(), and head()
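Two of the practices above, explicit missing-value codes and validating the import, can be sketched as follows. The semicolon-separated data, the "N/A" code, and the decimal comma are invented for illustration.

```r
# Invented data: semicolon separator, decimal comma, "N/A" for missing
raw <- "id;value
1;2,5
2;N/A
3;4,0"

dat <- read.table(text = raw, header = TRUE,
                  sep = ";", dec = ",",
                  na.strings = "N/A")

# Validate the import
str(dat)                      # column types
summary(dat)                  # ranges and NA counts
head(dat)                     # first rows
print(colSums(is.na(dat)))    # missing values per column
```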
Note that while read.table() is rarely the fastest option today, it remains the most flexible text file importer in base R. For modern workflows, consider data.table::fread() or readr::read_delim() for better performance, but understanding read.table() is essential for handling special cases and legacy code.
Please load the required data set before running the commands given below in these R FAQs about the data frame. As an example, we assume that the iris data set is already available in R. At the R prompt, write data(iris).
Naming/Renaming Columns in a Data Frame
Question: How do you name or rename a column in a data frame? Answer: Suppose you want to rename the 3rd column of the data frame, then at the R prompt write
names(iris)[3] <- "new_name"
Suppose you want to rename the second and fourth columns of the data frame
names(iris)[c(2, 4)] <- c("A", "D")
Note that the names(iris) command can be used to find the name of each column in a data frame.
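The renaming steps can be tried without altering the built-in data by working on a copy. The name iris2 is an assumption introduced for this sketch.

```r
iris2 <- iris                           # copy, so the built-in iris stays intact

names(iris2)[3] <- "new_name"           # rename the third column
names(iris2)[c(2, 4)] <- c("A", "D")    # rename the second and fourth columns

print(names(iris2))                     # inspect the new column names
```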
Question: How can you determine column information of a data frame, such as the names, types, and missing values? Answer: Two built-in functions in R give information about the columns of a data frame.
str(iris)
summary(iris)
Exporting a Data Frame in R
Question: How can a data frame be exported in R so that it can be used in other statistical software? Answer: Use the write.csv() command to export the data in comma-separated values (CSV) format.
write.csv(iris, "iris.csv", row.names = FALSE)
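One way to check an export is to read the file back in. The sketch below writes to a temporary file (an assumption, so that nothing is left in your working directory) and compares the result with the original.

```r
out_file <- tempfile(fileext = ".csv")

# Export without the row names column
write.csv(iris, out_file, row.names = FALSE)

# Read the file back and compare its shape and column names
check <- read.csv(out_file)
print(dim(check))
print(names(check))
unlink(out_file)   # remove the temporary file
```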
Question: How can one select a particular row or column of a data frame? Answer: The easiest way is to use the indexing notation []
Suppose you want to select the first column only, then at the R prompt, write
iris[,1]
Suppose we want to select the first column and also want to put the content in a new vector, then
new <- iris[,1]
Suppose you want to select different columns, for example, columns 1, 3, and 5, then
newdata <- iris[, c(1, 3, 5)]
Suppose you want to select the first and third rows, then
iris[c(1, 3), ]
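The selections above can be collected in one short sketch that also shows what kind of object each returns. The object names are assumptions; drop = FALSE is a standard indexing argument that keeps a single column as a data frame.

```r
first_col  <- iris[, 1]                 # a numeric vector
some_cols  <- iris[, c(1, 3, 5)]        # a data frame with three columns
some_rows  <- iris[c(1, 3), ]           # a data frame with two rows
one_col_df <- iris[, 1, drop = FALSE]   # a single column kept as a data frame

print(class(first_col))
print(dim(some_cols))
print(dim(some_rows))
print(class(one_col_df))
```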
Dealing with Missing Values in a Data Frame
Question: How do you deal with missing values in a data frame? Answer: In R, it is easy to deal with missing values. Suppose you want to import a file named “file.csv” that contains missing values represented by a “.” (period), then at the R prompt write
data <- read.csv("file.csv", na.strings = ".")
If missing values are represented as “NA”, then write
dataset <- read.csv("file.csv", na.strings = "NA")
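Since “file.csv” is not available here, the sketch below passes invented CSV text through the text argument of read.csv() to show the effect of na.strings. The object name dat is an assumption.

```r
# Invented CSV content in which "." marks a missing value
raw <- "x,y
1,.
2,3
.,5"

dat <- read.csv(text = raw, na.strings = ".")

print(dat)
print(colSums(is.na(dat)))   # number of missing values per column
```

Because the periods are converted to NA, both columns are read as numeric rather than character.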
For built-in data (here, iris), use