R Functions Explained

Learn key R functions such as sort(), search(), subset(), sample(), all(), and any() with practical examples. Discover how to check if an element exists in a vector and understand the differences between all() and any(). Perfect for R beginners!

Which function is used for sorting in the R Language?

Several functions in R can be used for sorting data. The most commonly used R functions for sorting are:

  • sort(): Sorts a vector in ascending or descending order. The general syntax is sort(x, decreasing = FALSE, na.last = NA).
  • order(): Returns the indices that would sort a vector (useful for sorting data frames). The general syntax is order(x, decreasing = FALSE, na.last = TRUE).
  • arrange(): Sorts a data frame (it requires the dplyr package). The general syntax is arrange(.data, …, .by_group = FALSE).
# sort() Function
vec <- c(3, 1, 4, 1, 5)
sort(vec)                		# Ascending (default): 1 1 3 4 5
sort(vec, decreasing = TRUE)  	# Descending: 5 4 3 1 1

# order() Function
df <- data.frame(name = c("Ali", "Usman", "Umar"), age = c(25, 20, 30))
df[order(df$age), ]  # Sort data frame by age (ascending)

# arrange() Function from dplyr package
library(dplyr)
df %>% arrange(age)               # Ascending
df %>% arrange(desc(age))         # Descending
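
The na.last argument in the syntax above decides what happens to missing values. The following is a minimal sketch using made-up vectors and a made-up data frame (scores and people are illustrative names, not part of the examples above). It also shows order() with two sort keys.

```r
# Illustrative data (assumed for this sketch)
scores <- c(3, NA, 1, 2)

sorted_default <- sort(scores)                  # NA is dropped: 1 2 3
sorted_keep_na <- sort(scores, na.last = TRUE)  # NA kept at the end: 1 2 3 NA
idx <- order(scores)                            # order() puts NA last: 3 4 1 2

# Sort a data frame by age, breaking ties by name
people <- data.frame(
  name = c("Ali", "Usman", "Umar", "Aziz"),
  age  = c(25, 20, 25, 20),
  stringsAsFactors = FALSE
)
by_age_name <- people[order(people$age, people$name), ]
by_age_name$name                                # "Aziz" "Usman" "Ali" "Umar"
```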

Why is the search() function used?

In R language, the search() function is used to display the current search path, that is, the ordered list of attached packages and environments in which R looks for objects (such as functions, datasets, and variables) when you reference them.

What does the search() function do?

  • Lists all attached packages and environments in the order R searches them.
  • Helps diagnose issues when multiple packages have functions with the same name (name conflicts).
  • Shows where R will look when you call a function or variable.
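
As a small sketch of the points above, the search path can be inspected directly. The exact output depends on which packages are attached in your session, so none is shown here. utils::find() is used as one way to see which attached entries define a given name.

```r
library(stats)                  # a base package, normally attached already

path <- search()                # character vector of attached environments
print(path)                     # starts with ".GlobalEnv", ends with "package:base"

# Which attached entries define an object called "sd"?
where_sd <- utils::find("sd")
print(where_sd)
```

If two attached packages define the same name, the one that appears earlier in search() is the one R uses.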

What is the use of subset() and sample() functions in R?

In R language, subset() and sample() are two useful functions for data manipulation and sampling:

  • subset(): is used to extract subsets of data frames or vectors based on some condition. The general syntax is subset(x, subset, select, …)
  • sample(): is used for random sampling from a dataset with or without replacement. The general syntax is: sample(x, size, replace = FALSE, prob = NULL).

The examples of subset() and sample() are described below:

# Example data frame
df <- data.frame(
  name = c("Ali", "Usman", "Aziz", "Daood"),
  age = c(25, 30, 22, 28),
  salary = c(50000, 60000, 45000, 70000)
)

# Filter rows where age > 25
subset(df, age > 25)

# Filter rows and select specific columns
subset(df, salary > 50000, select = c(name, salary))
# Randomly sample 3 numbers from 1 to 10 without replacement
sample(1:10, 3)

# Sample with replacement (possible duplicates)
sample(1:5, 10, replace = TRUE)

# Sample rows from a data frame
df[sample(nrow(df), 2), ]  # Picks 2 random rows
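
Because sample() is random, the results above change from run to run. A minimal sketch of making a draw reproducible with set.seed() follows (the seed value 123 is arbitrary). It also uses the prob argument from the syntax above.

```r
set.seed(123)                   # arbitrary seed chosen for illustration
first_draw <- sample(1:10, 3)

set.seed(123)                   # resetting the seed replays the same draw
second_draw <- sample(1:10, 3)

identical(first_draw, second_draw)   # TRUE

# Weighted sampling: "H" is four times as likely as "T"
coin <- sample(c("H", "T"), size = 5, replace = TRUE, prob = c(0.8, 0.2))
```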

What is the use of all() and any()?

In R language, the all() and any() functions are logical functions used to evaluate conditions across vectors or arrays.

  • all() function: checks if all elements of a logical vector are TRUE. It returns TRUE only if every element in the input is TRUE; otherwise, it returns FALSE. The general syntax is all(..., na.rm = FALSE)
  • any() function: checks if at least one element of a logical vector is TRUE. It returns TRUE if any element is TRUE and FALSE only if all are FALSE. The general syntax is any(..., na.rm = FALSE)

The examples of all() and any() functions are:

x <- c(TRUE, TRUE, FALSE)
all(x)  # FALSE (not all elements are TRUE)

y <- c(5, 10, 15)
all(y > 3)  # TRUE (all elements are greater than 3)
x <- c(TRUE, FALSE, FALSE)
any(x)  # TRUE (at least one element is TRUE)

y <- c(2, 4, 6)
any(y > 5)  # TRUE (6 is greater than 5)

Note that if NA is present and na.rm = FALSE, any() returns NA unless a TRUE value exists.
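
The NA behaviour described in the note can be checked with a few short calls. This is a small sketch with made-up logical vectors. all() follows the mirror-image rule: it returns NA unless a FALSE value exists.

```r
flags <- c(FALSE, NA)

any(flags)                        # NA: no TRUE found, and the NA is unknown
any(flags, na.rm = TRUE)          # FALSE
any(c(TRUE, NA))                  # TRUE: one TRUE is enough

all(c(TRUE, NA))                  # NA
all(c(FALSE, NA))                 # FALSE: one FALSE settles it
all(c(TRUE, NA), na.rm = TRUE)    # TRUE
```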

What are the key differences between all() and any()?

The key differences between all() and any() are:

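| Function | Returns TRUE when | Returns FALSE when |
|----------|-------------------|--------------------|
| all() | All elements are TRUE | At least one element is FALSE |
| any() | At least one element is TRUE | All elements are FALSE |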

What is the R command to check if element 15 is present in a vector $x$?

One can check if the element (say) 15 is present in a vector x using any of the following:

  • %in% Operator
  • any() with logical comparison
  • which() to find the position of 15
# %in%
x <- c(10, 15, 20, 25)
15 %in% x  # Returns TRUE
30 %in% x  # Returns FALSE

# any()
x <- c(5, 10, 15)
any(x == 15)  # TRUE
any(x == 99)  # FALSE

# which()
x <- c(10, 15, 20, 15)
which(x == 15)  # Returns c(2, 4)


How to Save Data in R

This post is about saving and reading data in the R language. Learn how to save and read data in R with this comprehensive guide. Discover methods like write.csv(), saveRDS(), and read.table(), understand keyboard input using readline() and scan(), and master file-based data loading for matrices and datasets. Perfect for beginners and intermediate R users!

How can you Save the Data in R Language?

There are many ways to save data in R. In a menu-driven interface such as R Commander, the easiest way is to click Data –> Active Data Set –> Export Active Data, and then click OK in the dialogue box that appears. The other ways to save data in R are:

Saving to CSV Files

# Base R package
write.csv(your_DataFrame, "path/to/file.csv", row.names = FALSE)

# readr (tidyverse) package
library(readr)
write_csv(your_DataFrame, "path/to/file.csv")

Saving to MS Excel Files

To save data to Excel files, the writexl or openxlsx package can be used

library(writexl)
write_xlsx(your_DataFrame, "path/to/file.xlsx")

Saving to R’s Native Formats

One or more objects can be saved to a single .RData file with save(), while saveRDS() stores exactly one object in an .rds file:

# .RData file
save(object1, object2, file = "path/to/data.RData")

# .rds file
saveRDS(your_DataFrame, "path/to/data.rds")
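
To show how the two native formats differ when reading the data back, here is a minimal round-trip sketch. It writes to temporary files, and scores_df and threshold are made-up objects for illustration. readRDS() returns the object so you can assign it to any name, while load() restores objects under their original names.

```r
scores_df <- data.frame(id = 1:3, score = c(90.5, 72, 88))  # illustrative data
threshold <- 75

rds_path   <- tempfile(fileext = ".rds")
rdata_path <- tempfile(fileext = ".RData")

# .rds: one object, restored under a name of your choice
saveRDS(scores_df, rds_path)
restored_df <- readRDS(rds_path)

# .RData: several objects, restored under their original names
save(scores_df, threshold, file = rdata_path)
rm(scores_df, threshold)
loaded_names <- load(rdata_path)  # returns the names of the restored objects
```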

Saving to Text Files

Data can be saved to text files using the following commands:

# Using Base R Package
write.table(your_DataFrame, "path/to/file.txt", sep = "\t", row.names = FALSE)

# Using the readr package
library(readr)
write_delim(your_DataFrame, "path/to/file.txt", delim = "\t")

Saving to JSON File Format

The data can be saved to a JSON file format using the jsonlite package.

library(jsonlite)
write_json(your_DataFrame, "path/to/file.json")

Saving Data to Databases

Data can also be written to SQL databases (such as SQLite or PostgreSQL). For example:

library(DBI)
library(RSQLite)

# Create a database connection
con <- dbConnect(RSQLite::SQLite(), "path/to/database.db")

# Write a data frame to the database
dbWriteTable(con, "table_name", your_DataFrame)

# Disconnect when done
dbDisconnect(con)

Saving Data to Other Statistical Software Formats

The haven package can be used to save data for SPSS, Stata, or SAS. For example:

library(haven)
write_sav(your_DataFrame, "path/to/file.sav")  # SPSS file format
write_dta(your_DataFrame, "path/to/file.dta")  # Stata file format

Some important points to note:

  • File Paths: Use absolute file paths, for example, D:/projects/data.csv, or relative paths such as data/file.csv.
  • Overwriting: By default, functions such as write.csv() overwrite existing files without warning. Add checks to avoid accidental loss, for example,
    if (!file.exists("file.csv")) {
      write.csv(your_DataFrame, "file.csv")
    }
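
The overwrite check above can be wrapped in a small helper. The function below, write_csv_safely(), is a hypothetical name for this sketch and not part of base R or any package. It refuses to overwrite unless asked to, and the demo writes to a temporary file.

```r
# write_csv_safely() is a hypothetical helper written for this example
write_csv_safely <- function(data, path, overwrite = FALSE) {
  if (file.exists(path) && !overwrite) {
    warning("File already exists, not overwriting: ", path)
    return(invisible(FALSE))
  }
  write.csv(data, path, row.names = FALSE)
  invisible(TRUE)
}

demo_path <- tempfile(fileext = ".csv")
first_write  <- write_csv_safely(data.frame(a = 1:2), demo_path)   # file is written
second_write <- suppressWarnings(
  write_csv_safely(data.frame(a = 3:4), demo_path)                  # skipped
)
```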

How to Read Data from the Keyboard?

To read data from the keyboard, one can use the following functions:

  • scan(): reads values typed directly at the console into a vector.
  • readline(): reads a single line of text typed at the keyboard and returns it as a character string.
  • print(): does not read input; it is used to display values (such as what was typed) on the screen.

Explain How to Read Data or a Matrix from a File?

  • read.table(): The read.table() function is usually used to read data from a text file. Its header argument defaults to FALSE, so the argument can be omitted when the file has no header row.
  • Use read.csv() (or read.table() with sep = ",") to read files exported from a spreadsheet in which columns are separated by commas instead of white space. For MS Excel files, use a package function such as read.xls() from the gdata package or read_excel() from the readxl package.
  • When you read a matrix using read.table(), the resulting object is a data frame, even when all the entries are numeric. The as.matrix() function can be used to convert it to a matrix, like this:
    as.matrix(x)
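
The following sketch walks through reading a matrix from a file, using a small temporary file created just for the example. Note that nrow and byrow are arguments of matrix() rather than as.matrix(), so they are useful when the values are read with scan() and reshaped.

```r
mat_path <- tempfile(fileext = ".txt")
writeLines(c("1 2 3", "4 5 6"), mat_path)   # two rows, no header row

x <- read.table(mat_path)                   # header = FALSE by default
class(x)                                    # "data.frame"

m <- as.matrix(x)                           # matrix with column names V1, V2, V3
dim(m)                                      # 2 3

# Alternative: read the raw values and shape them row by row
m2 <- matrix(scan(mat_path, quiet = TRUE), nrow = 2, byrow = TRUE)
```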

What is scan() Function in R?

The scan() function is used to read data into a vector or list from the console or a file.

z <- scan()
1: 12 5
3: 5
4:
Read 3 items

z
### Output
[1] 12  5  5

What is readline() Function in R?

The readline() function is used for inputting a line from the keyboard in the form of a string. (To read text lines from a file or another connection, use readLines() instead.) For example,

w <- readline()
xyz vw u

w

## Output

[1] "xyz vw u"
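
Since readline() always returns a character string, numeric input has to be converted before use. The sketch below uses a hypothetical helper, parse_numbers(), written for this example. It falls back to a fixed string when R is not running interactively, because readline() cannot wait for input in that case.

```r
# parse_numbers() is a hypothetical helper: split on white space, convert to numeric
parse_numbers <- function(line) {
  as.numeric(strsplit(trimws(line), "\\s+")[[1]])
}

if (interactive()) {
  typed <- readline("Enter numbers separated by spaces: ")
} else {
  typed <- "12 5 5"               # stand-in input when no keyboard is attached
}
values <- parse_numbers(typed)
sum(values)

# scan() can also parse a string directly through its text argument
from_text <- scan(text = "12 5 5", quiet = TRUE)
```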


Special Values in R Language

R is a powerful language for statistical computing and data analysis. While working with data, one may encounter several special values in R (such as NA, NULL, Inf, and NaN) that represent missing data, absent objects, or undefined or infinite results of mathematical operations. Understanding their differences and how to handle them correctly is crucial, because misunderstanding them can lead to bugs in your R code or incorrect analysis.

This guide about special values in R Language covers:

  • NA: Missing or undefined data.
  • NULL: Absence of a value or object.
  • Inf / -Inf: Infinity from calculations like division by zero.
  • NaN: “Not a Number” for undefined math operations.

Let us explore each special value with examples.

NA – Not Available (Missing Data)

NA represents missing or unavailable data in vectors, matrices, and data frames. The key properties of NA are:

  • Used in all data types (logical, numeric, character).
  • Functions like is.na() detect NA values.
x <- c(1, 2, NA, 4)
is.na(x)    # Returns FALSE FALSE TRUE FALSE

Note that operations involving NA usually result in NA unless explicitly handled with na.rm = TRUE. NA is not the same as "NA" (a character string). Also note that the type-specific NAs are NA_integer_, NA_real_, NA_complex_, and NA_character_.
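
The points in this note can be verified with a few calls. This is a brief sketch reusing the vector x from the example above.

```r
x <- c(1, 2, NA, 4)

sum(x)                    # NA: the missing value propagates
sum(x, na.rm = TRUE)      # 7
mean(x, na.rm = TRUE)     # 2.333333

NA == NA                  # NA, so test with is.na() rather than ==
is.na("NA")               # FALSE: "NA" is an ordinary character string

typeof(NA)                # "logical"
typeof(NA_integer_)       # "integer"
typeof(NA_character_)     # "character"
```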

NULL – Absence of a Value

NULL signifies an empty or undefined object and is often returned by functions that have no result to give. It is different from NA because NULL means the object does not exist, while NA means a value is missing. The key properties are:

  • NULL is a zero-length object, while NA is a placeholder of length one.
  • Cannot be an element of an atomic vector (c(1, NULL) drops it), although it can be stored in a list.
  • Many functions return NULL or a zero-length result when they operate on a NULL object.
  • Use is.null() to check for NULL.
y <- NULL
is.null(y) # Returns TRUE

Note that NULL should not be confused with NaN: NaN marks the result of an invalid numerical operation and is treated as missing by is.na() (is.na(NaN) returns TRUE), whereas NULL is no value at all.
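
The difference between NULL and NA shows up most clearly in lengths and in how vectors and lists treat them. A short sketch, with v and lst as illustrative names:

```r
length(NULL)              # 0
length(NA)                # 1

v <- c(1, NULL, 2)        # NULL is silently dropped from an atomic vector
length(v)                 # 2

lst <- list(1, NULL, 3)   # a list can hold NULL as an element
length(lst)               # 3
is.null(lst[[2]])         # TRUE

is.na(NULL)               # logical(0): a zero-length result, not TRUE or FALSE
```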


Inf and -Inf – Infinity

Inf and -Inf represent positive and negative infinity in R. They arise when a nonzero number is divided by zero or when a result overflows, that is, exceeds the largest finite representable value. The key properties are:

  • Often result from dividing a nonzero number by zero (0 / 0 gives NaN instead).
  • Can be used in comparisons (Inf > 1000 returns TRUE).
1 / 0            # Returns Inf
log(0)           # Returns -Inf
is.infinite(1/0) # TRUE

Note that infinite values can be checked with is.infinite(x). Adding Inf and -Inf (or computing Inf - Inf) results in NaN.
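
A few more cases, given here as a brief sketch, show overflow and the operations on infinities that produce NaN:

```r
-1 / 0                        # -Inf
2^1024                        # Inf: exceeds the largest finite double
.Machine$double.xmax          # the largest finite double on this machine

Inf > 1000                    # TRUE
Inf - Inf                     # NaN
Inf + (-Inf)                  # NaN
Inf * 0                       # NaN

is.finite(c(1, Inf, -Inf))    # TRUE FALSE FALSE
```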

NaN – Not a Number

NaN results from undefined mathematical operations, like 0/0. One can check for NaN values by using the is.nan() function, as in the following example:

0 / 0 # Returns NaN
is.nan(0 / 0) # TRUE
is.na(NaN) # TRUE (is.na() also treats NaN as missing)

Note that NaN is also different from NULL: NaN is a numeric value that marks an undefined result, whereas NULL means no value exists. NULL is commonly used for empty lists, missing function arguments, or when an object is undefined.
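
The relationship between is.na() and is.nan() is one-directional, which the following short sketch illustrates (vals and root are illustrative names):

```r
is.na(NaN)                            # TRUE
is.nan(NA)                            # FALSE: NA is missing, not "not a number"

root <- suppressWarnings(sqrt(-1))    # NaN (R also issues a warning)
is.nan(root)                          # TRUE

NaN == NaN                            # NA, so use is.nan() to test for NaN

vals <- c(1, NA, NaN)
is.na(vals)                           # FALSE TRUE TRUE
is.nan(vals)                          # FALSE FALSE TRUE
```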

FALSE and TRUE (Boolean Values)

TRUE and FALSE are the logical values used in conditions and expressions.

b <- TRUE
c <- FALSE
as.numeric(b)  # 1
as.numeric(c)  # 0

Note that logical values are coerced to integers in arithmetic (TRUE = 1, FALSE = 0). They are useful for indexing and conditional statements.

Comparison between NA, NULL, Inf, NaN

| Value | Meaning | Check Function |
|-------|---------|----------------|
| NA | Missing data | is.na() |
| NULL | Empty object | is.null() |
| Inf | Infinity | is.infinite() |
| NaN | Not a Number | is.nan() |

Common Pitfalls and Best Practices

  1. NA vs. NULL: Use NA for missing data in datasets; NULL for empty function returns.
  2. Math with Inf/NaN: Use is.finite() to filter valid numbers.
  3. Debugging Tip: Check for NA before calculations to avoid unexpected NA or NaN results.
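
As a sketch of the second tip, is.finite() returns FALSE for NA, NaN, Inf, and -Inf alike, so a single filter keeps only the usable numbers. The vector readings is made up for illustration.

```r
readings <- c(2, NA, Inf, 4, NaN, -Inf)   # illustrative data

is.finite(readings)                       # TRUE FALSE FALSE TRUE FALSE FALSE
valid <- readings[is.finite(readings)]    # 2 4
mean(valid)                               # 3
```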

Handling Special Values in R

To manage special values in R efficiently, use the following functions:

  • is.na(x): Check for NA values.
  • is.null(x): Check for NULL values.
  • is.infinite(x): Check for Inf or -Inf.
  • is.nan(x): Check for NaN.

Practical Tips When Using Special Values in R Language

The following are some important practical tips when making use of special values in R Language:

  1. Handling Missing Data (NA)
    • Use na.omit(x) to remove NA values, or x[complete.cases(x), ] to keep only the complete rows of a data frame x.
    • Use replace(x, is.na(x), value) to fill in missing values.
  2. Avoiding NaN Issues
    • Check for potential division by zero.
    • Use ifelse(is.nan(x), replacement, x) to handle NaN.
  3. Checking for Special Values in R
    • is.na(x), is.nan(x), is.infinite(x), and is.null(x) help identify special values.
  4. Using Default Values with NULL
    • Set default function arguments as NULL and use if (is.null(x)) to assign a fallback value.
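
The tips above can be put together in a short sketch. The vectors are made up, and scale_values() is a hypothetical function written only to illustrate the NULL-default pattern.

```r
x <- c(4, NA, 6)

as.numeric(na.omit(x))                       # 4 6
filled <- replace(x, is.na(x), 0)            # 4 0 6

ratio   <- c(0, 1) / c(0, 2)                 # NaN 0.5 (0 / 0 is undefined)
cleaned <- ifelse(is.nan(ratio), 0, ratio)   # 0.0 0.5

# scale_values() is hypothetical: centre defaults to NULL and
# falls back to the mean of the non-missing values
scale_values <- function(v, centre = NULL) {
  if (is.null(centre)) centre <- mean(v, na.rm = TRUE)
  v - centre
}
scale_values(x)                              # -1 NA 1
scale_values(x, centre = 4)                  # 0 NA 2
```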

Summary of Special Values in R Language

Understanding special values in R is essential for data analysis and statistical computing. Properly handling NA, NULL, Inf, and NaN ensures accurate calculations and prevents errors in your R scripts. By using built-in functions, one can effectively manage these special values in R and improve the workflow.
