Data frames are one of the most essential data structures in R. A data frame is a list with the class **data.frame**, and it is used to store tabular data. Data frames are essentially lists of vectors of equal length, where each vector represents a column and each element of a vector corresponds to a row.
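The list-of-columns structure can be checked directly. The following is a minimal sketch; the data frame `df` and its values are made up for illustration.

```r
# A small illustrative data frame (values are made up)
df <- data.frame(x = c(1, 2, 3), y = c("a", "b", "c"))

is.list(df)         # TRUE: a data frame is a list underneath
class(df)           # "data.frame"
length(df)          # 2: one list element per column
sapply(df, length)  # every column has the same length (3)
nrow(df)            # 3: one row per vector element
```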

Data frames in R are the workhorse of data analysis, providing a flexible and efficient way to store, manipulate, and analyze data.

### Restrictions on Data Frames in R

The following are restrictions on data frames in R:

- The **components** (columns or features) must be vectors (numeric, character, or logical), factors, numeric matrices, lists, or other data frames.
- **Matrices, lists, and data frames** provide as many variables to the new data frame as they have columns, elements, or variables, respectively.
- **Numeric vectors**, logical vectors, and factors are included as is. Character vectors were coerced to factors by default before R 4.0.0, with levels given by the unique values appearing in the vector; since R 4.0.0 they are kept as character unless `stringsAsFactors = TRUE` is set.
- **Vector structures** appearing as variables of the data frame must all have the same length, and matrix structures must all have the same number of rows.
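These restrictions can be illustrated with a short sketch. All object names (`m`, `df`, `chr_df`, `fac_df`, `res`) are hypothetical. Note that in R 4.0.0 and later the default is `stringsAsFactors = FALSE`, so character columns become factors only on request.

```r
# A numeric matrix contributes one variable per column
m <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("a", "b")))
df <- data.frame(id = c(10, 20, 30), m)
names(df)  # "id" "a" "b"

# Character vectors: converted to factors only when requested
# (this was the default before R 4.0.0)
chr_df <- data.frame(g = c("x", "y", "x"))
fac_df <- data.frame(g = c("x", "y", "x"), stringsAsFactors = TRUE)
class(chr_df$g)   # "character" in R >= 4.0.0
levels(fac_df$g)  # "x" "y": the unique values in the vector

# Columns whose lengths cannot be reconciled are rejected
res <- try(data.frame(a = 1:3, b = 1:2), silent = TRUE)
inherits(res, "try-error")  # TRUE
```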

A data frame may for many purposes be regarded as a matrix with columns possibly of differing modes and attributes. It may be displayed in matrix form, and its rows and columns are extracted using matrix indexing conventions.
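As a rough illustration of the matrix-like view, the sketch below uses a made-up data frame `df` and indexes it with the usual `[row, column]` conventions.

```r
# Illustrative data frame with columns of different modes
df <- data.frame(
  age    = c(20, 40, 33),
  weight = c(65, 70, 53),
  name   = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

dim(df)                            # 3 rows, 3 columns
df[2, ]                            # second row (still a data frame)
df[, "age"]                        # one column, returned as a plain vector
sub <- df[1:2, c("age", "weight")] # rows 1-2 of two columns
sapply(df, mode)                   # numeric and character columns side by side
```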

### Key Characteristics of Data Frames

- **Column-Based Operations:** R provides powerful functions and operators for performing operations on entire columns or subsets of columns, making data analysis and manipulation efficient.
- **Heterogeneous Data:** Data frames can store data of different data types within the same structure, making them versatile for handling various kinds of data.
- **Named Columns:** Each column in a data frame has a unique name, which is used to reference and access specific data within the frame.
- **Row-Based Indexing:** Data frames are indexed based on their rows, allowing you to easily extract or manipulate data based on row numbers.
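A brief sketch of these characteristics, using a hypothetical data frame `df` with made-up values:

```r
# Columns of three different types in one structure
df <- data.frame(
  name   = c("A", "B", "C"),
  score  = c(90, 75, 82),
  passed = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

col_types <- sapply(df, class)  # heterogeneous data: one class per column
names(df)                       # named columns: "name" "score" "passed"
avg_score <- mean(df$score)     # column-based operation on a whole column
picked <- df[c(1, 3), ]         # row-based indexing by row number
```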

### Creating Data Frames in R

Objects satisfying the restrictions placed on the columns (components) of a data frame may be used to form one using the function data.frame(). For example:

    BMI <- data.frame(
      age    = c(20, 40, 33, 45),
      weight = c(65, 70, 53, 69),
      height = c(62, 65, 55, 58)
    )

Note that a list whose components conform to the restrictions of a data frame may be coerced into a data frame using the function as.data.frame().
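For instance, a list of equal-length vectors can be coerced as follows. This is a minimal sketch; the names `measurements` and `BMI2` are illustrative only.

```r
# A plain list whose components all have the same length
measurements <- list(
  age    = c(20, 40, 33, 45),
  weight = c(65, 70, 53, 69)
)

# Coerce the list into a data frame
BMI2 <- as.data.frame(measurements)

is.data.frame(BMI2)  # TRUE
dim(BMI2)            # 4 rows, 2 columns
```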

### Other Ways of Creating a Data Frame

One can also read an entire data frame from an external file using the base R functions read.table() and read.csv(), or read_excel() from the readxl package and read_csv() from the readr package.
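As a small sketch of reading from a file, the example below writes a temporary CSV file and reads it back with base R's read.csv(). The file contents and the name `people` are made up for illustration.

```r
# Write a tiny CSV file to a temporary location (contents are made up)
path <- tempfile(fileext = ".csv")
writeLines(c("name,age", "Ali,25", "Usman,30"), path)

# Read the whole file into a data frame; the first line supplies column names
people <- read.csv(path)
str(people)

# Clean up the temporary file
unlink(path)
```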

### Accessing and Manipulating Data

- **Accessing Data:** Use column names or row indices to extract specific values or subsets of data.
- **Creating New Columns:** Calculate new columns based on existing ones using arithmetic operations, logical expressions, or functions.
- **Grouping and Summarizing:** Group data by specific columns and calculate summary statistics (e.g., mean, median, sum).
- **Sorting Data:** Arrange rows in ascending or descending order based on column values.
- **Filtering Data:** Select rows based on conditions using logical expressions and indexing.

    # Create a data frame manually
    data <- data.frame(
      Name = c("Ali", "Usman", "Hamza"),
      Age  = c(25, 30, 35),
      City = c("Multan", "Lahore", "Faisalabad")
    )

    # Accessing data
    print(data$Age)   # Displays the "Age" column
    print(data[2, ])  # Displays the second row

    # Creating a new column
    data$Age_Category <- ifelse(data$Age < 30, "Young", "Old")

    # Filtering data
    young_people <- data[data$Age < 30, ]

    # Sort data
    sorted_data <- data[order(data$Age), ]
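Grouping and summarizing, as well as sorting in descending order, can be sketched with base R along the following lines. The use of aggregate() here is one possible approach, not the only one, and the names `age_summary` and `sorted_desc` are illustrative.

```r
# Recreate the example data so this block runs on its own
data <- data.frame(
  Name = c("Ali", "Usman", "Hamza"),
  Age  = c(25, 30, 35),
  City = c("Multan", "Lahore", "Faisalabad")
)
data$Age_Category <- ifelse(data$Age < 30, "Young", "Old")

# Group by Age_Category and compute the mean age in each group
age_summary <- aggregate(Age ~ Age_Category, data = data, FUN = mean)
print(age_summary)

# Sort rows by Age in descending order
sorted_desc <- data[order(data$Age, decreasing = TRUE), ]
print(sorted_desc)
```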

https://itfeature.com, https://gmstat.com