R Frequently Asked Questions

Factors in R (Categorical Data)

Factors in the R language are used to represent categorical data. A factor can be ordered or unordered. One can think of a factor as an integer vector in which each integer has a label. Modeling functions such as lm() and glm() treat factors specially. A factor stores the distinct categories of a variable as its levels, and it can be created from either character or integer values.

Using factors with labels is better than using bare integers because factors are self-describing: a variable with the values "Male" and "Female" is easier to interpret than a variable with the values 1 and 2.
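The picture of a factor as a labeled integer vector can be checked by inspecting a factor's levels and its underlying codes. The following is a minimal sketch; the variables gender and size and their values are made up for illustration.

```r
# Hypothetical example data (not taken from the post)
gender <- factor(c("Male", "Female", "Female", "Male"))

levels(gender)      # the distinct labels, sorted: "Female" "Male"
as.integer(gender)  # the underlying integer codes: 2 1 1 2
nlevels(gender)     # number of levels: 2

# An ordered factor additionally records a ranking of its levels
size <- factor(c("small", "large", "medium", "small"),
               levels = c("small", "medium", "large"),
               ordered = TRUE)
is.ordered(size)    # TRUE
size < "large"      # TRUE FALSE TRUE TRUE
```

Comparisons such as size < "large" are only meaningful for ordered factors; for an unordered factor R returns NA with a warning.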

Creating a Simple Factor

Create a simple factor that has two levels:

# Simple factor with two levels
x <- factor(c("yes", "yes", "no", "yes", "no"))
# counts how often each level occurs
table(x)

# strips out the class, showing the underlying integer codes
# together with the levels attribute
unclass(x)

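By default, the levels of a factor built from character values are sorted alphabetically, so "no" would come before "yes". The order of the levels can be set explicitly using the levels argument to factor(). This can be important in linear modeling because the first level is used as the baseline level.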

x <- factor(c("yes","yes","no","yes","no"), levels = c("yes","no"))
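To see the effect of the levels argument, one can compare the level order with and without it. The sketch below also uses relevel(), a base R function that moves one level to the front of an unordered factor; the names answers, x_default, x_custom, and x_relevel are illustrative.

```r
answers <- c("yes", "yes", "no", "yes", "no")

x_default <- factor(answers)                           # levels sorted alphabetically
x_custom  <- factor(answers, levels = c("yes", "no"))  # explicit order

levels(x_default)   # "no"  "yes"
levels(x_custom)    # "yes" "no"

# relevel() changes the baseline of an existing factor
x_relevel <- relevel(x_default, ref = "yes")
levels(x_relevel)   # "yes" "no"

# The underlying integer codes follow the level order
as.integer(x_default)  # 2 2 1 2 1
as.integer(x_custom)   # 1 1 2 1 2
```

The data are the same in all three factors; only the mapping between labels and codes, and therefore the baseline level, differs.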

The levels of a factor can be renamed using the labels argument (the shortened spelling label also works, through R's partial argument matching). Each label replaces the corresponding level, in the order given by levels. For example,

x <- factor(c("yes", "yes", "no", "yes", "no"), levels = c("yes", "no"), labels = c(1, 2))

x <- factor(c("yes", "yes", "no", "yes", "no"), levels = c("yes", "no"), labels = c("Level-1", "Level-2"))

x <- factor(c("yes", "yes", "no", "yes", "no"), levels = c("yes", "no"), labels = c("group-1", "group-2"))
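The result of relabeling can be inspected with levels() and table(). This is a small sketch under the same data as above; the replacement names "agree" and "disagree" in the last step are hypothetical and only show that levels can also be renamed after the factor exists.

```r
# labels is the full name of the argument in factor()
x <- factor(c("yes", "yes", "no", "yes", "no"),
            levels = c("yes", "no"),
            labels = c("group-1", "group-2"))

levels(x)          # "group-1" "group-2"
as.character(x)    # "group-1" "group-1" "group-2" "group-1" "group-2"
table(x)           # group-1 occurs 3 times, group-2 occurs 2 times

# Levels of an existing factor can be renamed by assignment
levels(x) <- c("agree", "disagree")
as.character(x)    # "agree" "agree" "disagree" "agree" "disagree"
```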

Suppose you have a factor that was built from numeric values and you want to compute its mean. Calling mean() on the original numeric vector returns the average, but calling mean() on the factor returns NA with a warning, because a factor is not numeric. To calculate the mean of the original numeric values of the factor f, index its levels by the factor and convert the result with as.numeric(). For example,

# vector
v <- c(10, 20, 20, 50, 10, 20, 10, 50, 20)
# vector converted to factor
f <- factor(v)

# mean of the vector
mean(v)

# mean of the factor: returns NA with a warning
mean(f)

# mean of the original values, recovered from the levels
mean(as.numeric(levels(f)[f]))
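A related pitfall is calling as.numeric() directly on the factor, which returns the internal integer codes rather than the original values. The sketch below contrasts the two, using the same v and f; the names via_levels and via_character are illustrative.

```r
v <- c(10, 20, 20, 50, 10, 20, 10, 50, 20)
f <- factor(v)

levels(f)                       # "10" "20" "50"
as.numeric(f)                   # codes: 1 2 2 3 1 2 1 3 2
mean(as.numeric(f))             # mean of the codes, not of the data

# Two equivalent ways to recover the original values
via_levels    <- as.numeric(levels(f))[f]
via_character <- as.numeric(as.character(f))

identical(via_levels, v)        # TRUE
identical(via_character, v)     # TRUE
mean(via_levels)                # same value as mean(v)
```

Converting the levels first and then indexing, as in via_levels, converts each distinct value only once, which can matter for long factors with few levels.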

Use of cut( ) Function to Create a Factor Variable

The cut() function can also be used to convert a numeric variable into a factor. The breaks argument describes how ranges of numbers are converted to factor values. If breaks is set to a single number, the resulting factor is created by dividing the range of the variable into that number of equal-length intervals. If a vector of values is given instead, those values are used as the breakpoints, and the resulting factor has one level fewer than the number of values supplied. For example,

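# mpg column of the built-in mtcars data set
mpg <- mtcars$mpg
# three equal-length intervals
cut(mpg, breaks = 3)

# four intervals defined by five breakpoints
factors <- cut(mpg, breaks = c(10, 18, 25, 30, 35))

# number of cars in each interval
table(factors)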

You will notice that the default labels of the levels produced by the cut() function contain the actual ranges of values that were used to divide the variable into intervals.
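The default range labels can be replaced, and the side on which each interval is closed can be changed, through the labels and right arguments of cut(). The following is a rough sketch using the same breakpoints; the label names "low", "medium", "high", and "very high" are made up for illustration.

```r
# mtcars ships with base R; mpg ranges from 10.4 to 33.9
mpg <- mtcars$mpg

# Default: intervals are open on the left and closed on the right
default_bins <- cut(mpg, breaks = c(10, 18, 25, 30, 35))
levels(default_bins)   # "(10,18]" "(18,25]" "(25,30]" "(30,35]"

# Custom labels, one per interval
named_bins <- cut(mpg, breaks = c(10, 18, 25, 30, 35),
                  labels = c("low", "medium", "high", "very high"))
table(named_bins)

# right = FALSE closes the intervals on the left instead
left_closed <- cut(mpg, breaks = c(10, 18, 25, 30, 35), right = FALSE)
levels(left_closed)    # "[10,18)" "[18,25)" "[25,30)" "[30,35)"

# Values that fall outside the breakpoints become NA
sum(is.na(cut(mpg, breaks = c(15, 25))))
```

Because values outside the breakpoints silently become NA, it is worth checking that the breakpoints cover the full range of the variable.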
