Learn everything about files in R, including .RData, CSV, Excel, and text files. Discover how to read, write, and restore R objects using `load()`, `save()`, `read.csv()`, and more. Explore best practices for file handling in R and compare different file formats for efficient data management. Perfect for R programmers, data analysts, and researchers working with datasets in R.
What is a File in the R Language?
In R, a file refers to data stored on a computer storage device. R script files have the extension .R and contain code that can be executed within the R software environment. R can also read data from files and write data to files, which makes files essential for importing external data, saving results, and sharing work.
Describe commonly used Files in R
For illustration purposes, I have categorized the commonly used files in R as code files, data files, and specialized data files.
Code Files:
- .R (R script files)
- .Rmd (R Markdown files)
Data Files:
- .csv (Comma Separated Values) – Most common for tabular data
- .txt (Plain text files)
- .xlsx or .xls (Excel files)
- .RData or .rda (R’s native binary format)
Specialized Data Formats:
- .json (for structured data)
- .xml (for hierarchical data)
- .sav (SPSS files)
- .dta (Stata files)
What are the best Practices for using Files in R?
- Use relative paths when possible for portability
- Check file existence before reading
- Close any connection you open (for example, a file or database connection) once reading or writing is finished
- Consider using the `here` package for more reliable file paths
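The practices above can be sketched in base R as follows. This is only an illustration: the folder name `data` and the file name `scores.csv` are hypothetical, and `tempdir()` stands in for a project directory so the example does not touch real files. With the `here` package installed, `here::here("data", "scores.csv")` would play a role similar to `file.path()` here.

```r
# Build the path from components instead of hard-coding separators.
# "data" and "scores.csv" are hypothetical names used only for illustration.
data_dir  <- file.path(tempdir(), "data")
data_file <- file.path(data_dir, "scores.csv")

# Create a small file so the example is self-contained.
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(id = 1:3, score = c(90, 85, 70)),
          data_file, row.names = FALSE)

# Check that the file exists before trying to read it.
if (file.exists(data_file)) {
  scores <- read.csv(data_file)
} else {
  stop("File not found: ", data_file)
}

# When a connection is opened explicitly, close it once reading is done.
con <- file(data_file, open = "r")
header_line <- readLines(con, n = 1)
close(con)
```

Functions such as `read.csv()` open and close their own connection when given a path, so the explicit `close()` only matters for connections you open yourself.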
What are .RData Files in R?
An `.RData` (or `.rda`) file is a binary file used by R to save multiple objects (variables, data frames, functions, etc.) in a compressed, space-efficient way. It is R’s native format for storing workspace data.
What are the Key Features of .RData Files?
The key features of `.RData` files in R are:
- Stores Multiple Objects
  - An `.RData` file can save several R objects (e.g., data frames, lists, models) in a single file.
  - Example: `save(df, model, list1, file = "mydata.RData")`
- Binary Format (Not Human-Readable)
  - Unlike `.csv` or `.txt`, `.RData` files are not plain text and cannot be read meaningfully in a text editor.
- Compressed by Default
  - Uses compression to reduce file size (especially useful for large datasets).
- Platform-Independent
  - Can be shared across different operating systems (Windows, macOS, Linux).
- Preserves Attributes
  - Keeps metadata (e.g., variable labels, factors, custom classes).
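A minimal sketch of two of these features, storing several objects in one file and preserving attributes, is given below. The object names, the `source` attribute, and the temporary file path are made up for illustration.

```r
# A data frame with a factor column and a custom attribute (illustrative data).
scores <- data.frame(
  id    = 1:3,
  group = factor(c("low", "high", "low"), levels = c("low", "high"))
)
attr(scores, "source") <- "illustrative data"
labels <- list(a = 1, b = "two")

# Save both objects into a single .RData file.
rdata_path <- tempfile(fileext = ".RData")
save(scores, labels, file = rdata_path)

# Load into a separate environment so nothing in the workspace is overwritten.
restored <- new.env()
load(rdata_path, envir = restored)

ls(restored)                        # names of the restored objects
levels(restored$scores$group)       # factor levels come back in the saved order
attr(restored$scores, "source")     # the custom attribute is still attached
```

Loading into `new.env()` is a design choice for this sketch: it makes it easy to inspect what a file contains before copying anything into the working environment.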
Which command is used for restoring an R object from a file?
In R, one can restore saved objects from a file using the `load()` function. The `load()` command loads all objects stored in the file into the current R environment. It works with `.RData` or `.rda` files (the binary files used by R). It does not work with `.csv`, `.txt`, `.xlsx`, or other non-R file formats.
Explain the use of the `load()` command with an example.
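The following example creates the objects `x`, `y`, and `z`, saves them in the file “my_work.RData”, and removes them from the workspace. After `load()` is called, all three objects appear in the R workspace again.

    x <- rnorm(10)
    y <- 1:20
    z <- "Level of Significance"
    save(x, y, z, file = "my_work.RData")  # write all three objects to one file
    rm(x, y, z)                            # remove them from the workspace
    load("my_work.RData")                  # x, y, and z are restored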
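Two details of `load()` are easy to miss: it returns the names of the restored objects (invisibly), and it replaces any existing object that has the same name as a saved one without warning. The sketch below illustrates both, using a temporary file rather than a real project file.

```r
x <- rnorm(10)
z <- "Level of Significance"

work_path <- tempfile(fileext = ".RData")
save(x, z, file = work_path)

# Change z after saving, so the workspace and the file now disagree.
z <- "changed after saving"

# load() returns the names of the objects it restored (invisibly).
loaded_names <- load(work_path)
print(loaded_names)

# The saved value has replaced the modified one, with no warning.
print(z)
```

If overwriting is a concern, passing a fresh environment via the `envir` argument of `load()` keeps the restored objects separate from the workspace.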
How many ways are there to read and write files in R?
There are dozens of ways to read and write files in R, and the best approach depends on the file type and size. The most common methods are grouped below by file format and by the packages they require:
Base R Functions
- Reading Files
  - `read.table()`: Generic function to read tabular data (e.g., `.txt`).
  - `read.csv()`: For comma-separated values (CSV) files.
  - `read.delim()`: For tab-delimited files (`.tsv` or `.txt`).
  - `scan()`: Low-level function to read raw data.
  - `load()`: Restores R objects from `.RData` or `.rda` files.
  - `readRDS()`: Reads a single R object from `.rds` files.
- Writing Files
  - `write.table()`: Writes data frames to text files.
  - `write.csv()`: Writes to CSV files.
  - `write.table()` with `sep = "\t"`: Writes tab-delimited files (base R has no `write.delim()` function).
  - `save()`: Saves multiple R objects to `.RData` or `.rda`.
  - `saveRDS()`: Saves a single R object to `.rds`.
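As a rough illustration of the base R functions listed above, the following sketch writes a small data frame to CSV, tab-delimited, and `.rds` files and reads each one back. The data frame and the temporary file paths are invented for the example.

```r
df <- data.frame(id = 1:3, name = c("a", "b", "c"), stringsAsFactors = FALSE)

csv_path <- tempfile(fileext = ".csv")
tsv_path <- tempfile(fileext = ".tsv")
rds_path <- tempfile(fileext = ".rds")

# CSV: row.names = FALSE avoids writing an extra column of row numbers.
write.csv(df, csv_path, row.names = FALSE)
df_csv <- read.csv(csv_path)

# Tab-delimited: write.table() with sep = "\t", read back with read.delim().
write.table(df, tsv_path, sep = "\t", row.names = FALSE, quote = FALSE)
df_tsv <- read.delim(tsv_path)

# Single R object: saveRDS() stores it, readRDS() returns it under any name.
saveRDS(df, rds_path)
df_rds <- readRDS(rds_path)

str(df_csv)
str(df_tsv)
identical(df_rds, df)
```

Unlike `load()`, `readRDS()` returns the object instead of creating it in the workspace, so the caller chooses the variable name.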
Using Packages
- Reading Files
Package | Function | File Type Supported |
---|---|---|
readr | `read_csv()` | CSV (faster than base `read.csv()`) |
readxl | `read_excel()` | Excel (`.xlsx`, `.xls`) |
data.table | `fread()` | CSV/TSV (fast import) |
haven | `read_spss()` | SPSS (`.sav`) |
haven | `read_stata()` | Stata (`.dta`) |
jsonlite | `fromJSON()` | JSON files |
xml2 | `read_xml()` | XML files |
- Writing Files
Package | Function | File Type Supported |
---|---|---|
readr | `write_csv()` | CSV (faster than base `write.csv()`) |
writexl | `write_xlsx()` | Excel (`.xlsx`) |
data.table | `fwrite()` | CSV/TSV (fast export) |
haven | `write_sav()` | SPSS (`.sav`) |
haven | `write_dta()` | Stata (`.dta`) |
jsonlite | `write_json()` (`toJSON()` only converts an object to JSON text) | JSON files |
xml2 | `write_xml()` | XML files |
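The sketch below shows how two of the packages in the tables might be used for a write and read round trip. It assumes nothing about which packages are installed: each call is guarded with `requireNamespace()`, and the CSV step falls back to base R if `readr` is missing. The data frame and the temporary paths are illustrative only.

```r
df <- data.frame(id = 1:3, score = c(90.5, 82, 77.25))

csv_path <- tempfile(fileext = ".csv")

if (requireNamespace("readr", quietly = TRUE)) {
  # readr: write_csv() and read_csv(); read_csv() returns a tibble.
  readr::write_csv(df, csv_path)
  df_back <- as.data.frame(suppressMessages(readr::read_csv(csv_path)))
} else {
  # Fallback to base R when readr is not installed.
  write.csv(df, csv_path, row.names = FALSE)
  df_back <- read.csv(csv_path)
}

df_json <- NULL
if (requireNamespace("jsonlite", quietly = TRUE)) {
  # jsonlite: write_json() writes a file; fromJSON() accepts a file path.
  json_path <- tempfile(fileext = ".json")
  jsonlite::write_json(df, json_path)
  df_json <- jsonlite::fromJSON(json_path)
}

str(df_back)
str(df_json)
```

Calling functions with the `package::function()` form, as done here, avoids attaching whole packages with `library()` and makes it clear where each function comes from.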
Specialized Methods
For Large Datasets
- `vroom()` (from the `vroom` package): High-speed reading of large CSV/TSV files.
- `arrow` (Apache Arrow): Efficient for big data (supports Parquet and Feather formats).
For Databases
- `DBI` + `RSQLite` / `RMySQL` / `odbc`: Read/write from SQL databases.
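A minimal sketch of the database route, assuming the `DBI` and `RSQLite` packages are installed (the code is skipped otherwise). It uses an in-memory SQLite database, and the table name `scores` and its contents are hypothetical.

```r
scores <- data.frame(id = 1:3, score = c(90, 85, 70))
high_scores <- NULL

if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  # Open a connection to a temporary in-memory database.
  con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

  # Write the data frame to a table, then read a subset back with SQL.
  DBI::dbWriteTable(con, "scores", scores)
  high_scores <- DBI::dbGetQuery(
    con, "SELECT id, score FROM scores WHERE score > 80"
  )

  # Close the connection once reading and writing are finished.
  DBI::dbDisconnect(con)
}

print(high_scores)
```

The same `DBI` calls are intended to work with other backends; only the driver passed to `dbConnect()` and its connection arguments would change.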
For Binary & Custom Formats
- `feather`: Fast binary storage (works well with Python).
- `qs`: A faster alternative to `saveRDS()` for large objects.