Simulation in R for Sampling (2024)

Introduction to Simulation in R Language

This post is about simulation for sampling in the R programming language. It contains examples for generating samples and then performing basic calculations on the generated data.

Simulations are a powerful tool in R for exploring “what-if” scenarios without the need for real-world data. One can use R Language to simulate data from various probability distributions or even design customized functions for more complex simulations.
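
As a minimal sketch of both ideas, the code below draws from three built-in distributions and then defines a small custom simulation. The seed, the sample sizes, and the helper name roll_two_dice are illustrative assumptions, not part of the exercises that follow.

```r
set.seed(123)  # fix the seed so the draws are reproducible

# Draws from built-in probability distributions
x_norm  <- rnorm(5, mean = 50, sd = 10)      # normal
x_unif  <- runif(5, min = 0, max = 1)        # uniform on [0, 1]
x_binom <- rbinom(5, size = 10, prob = 0.5)  # binomial

# A customized simulation (hypothetical helper):
# roll two fair dice n_rolls times and record each sum
roll_two_dice <- function(n_rolls) {
  replicate(n_rolls, sum(sample(1:6, 2, replace = TRUE)))
}

dice_sums <- roll_two_dice(1000)
table(dice_sums)  # frequency of each possible sum
```

Calling set.seed() first means the same "random" values are produced on every run, which makes a simulation easier to check and share.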

Question 1: Simulate a coin toss 20 times.

sample(c("H", "T"), 20, replace = TRUE)

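Question 2: Write R commands to find the 95% confidence interval for the mean (unknown variance), based on a sample drawn from the following population:

yp <- c(111, 150, 121, 198, 112, 136, 114, 129, 117, 115, 186, 110, 121, 115, 114)
N  <- length(yp)       # population size
ys <- sample(yp, 5)    # simple random sample without replacement
n  <- length(ys)       # sample size
mys <- mean(ys)        # sample mean
vys <- var(ys)         # sample variance (the population variance is unknown)
vybar <- vys/n         # estimated variance of the sample mean
sdr <- sqrt(vybar)     # standard error of the sample mean
error <- qt(0.975, df = n - 1)*sdr   # t quantile, because the variance is estimated
ll <- mys - error      # lower limit
ul <- mys + error      # upper limit
c(ll, ul)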
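
As a cross-check on a t-based interval of this kind, the built-in t.test() function reports a confidence interval for the mean. The sketch below assumes a fixed seed and uses the illustrative names manual_ci and builtin_ci; neither interval applies a finite population correction.

```r
set.seed(1)  # assumed seed, only for reproducibility

yp <- c(111, 150, 121, 198, 112, 136, 114, 129, 117, 115, 186, 110, 121, 115, 114)
ys <- sample(yp, 5)
n  <- length(ys)

# Interval computed by hand from the sample mean and sample variance
half_width <- qt(0.975, df = n - 1) * sqrt(var(ys) / n)
manual_ci  <- c(mean(ys) - half_width, mean(ys) + half_width)

# Interval reported by t.test()
builtin_ci <- as.numeric(t.test(ys, conf.level = 0.95)$conf.int)

all.equal(manual_ci, builtin_ci)  # the two intervals should agree
```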

Sampling without Replacement and Histogram

Question 3: Given the population ye <- c(112, 114, 119, 125, 158, 117, 135, 141, 185, 128), simulate $k=100$ samples of size $n=3$ using Simple Random Sampling without Replacement (SRSWOR). Compute the mean of each sample and draw a histogram of the generated sample means.

k <- 100; n <- 3
ye <- c(112, 114, 119, 125, 158, 117, 135, 141, 185, 128)
m1 <- numeric(k)       # storage for the k sample means

for (i in 1:k) {
  s <- sample(ye, n)   # SRSWOR: sample() draws without replacement by default
  m1[i] <- mean(s)
}

m1
hist(m1)
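
The same simulation can also be written without an explicit loop. The following is a sketch using replicate(), with an assumed seed; it also compares the average of the simulated sample means with the population mean, which it should approximate.

```r
set.seed(2024)  # assumed seed, only for reproducibility

ye <- c(112, 114, 119, 125, 158, 117, 135, 141, 185, 128)
k <- 100
n <- 3

# replicate() evaluates the expression k times and collects the results
m1 <- replicate(k, mean(sample(ye, n)))

mean(m1)  # average of the simulated sample means
mean(ye)  # population mean, for comparison
```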

Question 4: Write R code to simulate a population of 500 values from a normal distribution with mean = 20 and standard deviation = 30. Select 5000 samples, each of size 50, using the systematic sampling technique, and estimate the mean of each sample. Find the mean and variance of the 5000 sample means.

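N <- 500; n <- 50
k <- N/n                        # sampling interval
m <- numeric(5000)              # storage for the sample means
pop <- rnorm(N, mean = 20, sd = 30)

for (i in 1:5000) {
  start <- sample(1:k, 1)       # random start between 1 and k
  s <- seq(start, N, by = k)    # every k-th unit from the start
  sys.sample <- pop[s]
  m[i] <- mean(sys.sample)
}

mean(m); var(m)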
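
One point worth noting: with N = 500 and n = 50, the interval is k = 10, so there are only 10 distinct systematic samples, and the 5000 draws repeat them. A minimal sketch (with an assumed seed and the illustrative name all_means) that enumerates all of them directly:

```r
set.seed(42)  # assumed seed, only for reproducibility

N <- 500
n <- 50
k <- N / n
pop <- rnorm(N, mean = 20, sd = 30)

# Mean of the systematic sample for every possible random start 1, ..., k
all_means <- sapply(1:k, function(start) mean(pop[seq(start, N, by = k)]))

length(all_means)  # one mean per possible start
mean(all_means)    # equals the population mean because N = n * k
mean(pop)
```

Because every population unit falls in exactly one of the k samples, the average of the k possible sample means reproduces the population mean.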

Question 5: Why do we use simulation for sampling?
Answer: A simulation study is useful for evaluating a sampling strategy. We generate a population that reflects a specific situation and then draw a sample of size $n$ from it $k$ times. The estimator is computed from each sample, and the variance of the $k$ estimates is used to examine the efficiency of the strategy.
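
The procedure described in this answer can be sketched as follows for the sample mean under SRSWOR. The population (normal with mean 50 and standard deviation 10), the sizes N, n, and k, and the seed are all assumptions chosen for illustration.

```r
set.seed(10)  # assumed seed, only for reproducibility

N <- 200    # population size (assumed)
n <- 20     # sample size (assumed)
k <- 5000   # number of simulated samples (assumed)

pop <- rnorm(N, mean = 50, sd = 10)  # generate the population

# Draw k samples without replacement and compute the estimator from each
est <- replicate(k, mean(sample(pop, n)))

empirical_var   <- var(est)                      # variance of the k estimates
theoretical_var <- (1 - n / N) * var(pop) / n    # SRSWOR variance of the mean

c(empirical = empirical_var, theoretical = theoretical_var)
```

The two values should be close; repeating the study with another sampling design and comparing the empirical variances is one way to compare the efficiency of the designs.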

Coin Toss Experiment in R

Question 6: Write R code to simulate a coin-tossing experiment.

# Define the number of coin tosses
n_tosses <- 100

# Simulate coin tosses (1 for heads, 0 for tails)
coin_tosses <- sample(c(0, 1), n_tosses, replace = TRUE)

# Calculate the proportion of heads
prop_heads <- mean(coin_tosses)

# Display results
cat("Number of Heads:", sum(coin_tosses), "\n")
cat("Proportion of Heads:", prop_heads, "\n")
# Plot the results
barplot(c(sum(coin_tosses), n_tosses - sum(coin_tosses)),
        names.arg = c("Heads", "Tails"),
        col = c("skyblue", "salmon"),
        main = "Coin Toss Simulation"
       )

One can adapt these examples for more complex statistical simulations or specific scenarios by modifying the simulation process and analyzing the results accordingly. Simulations are commonly used in various fields, such as statistics, finance, and operations research, to model and analyze uncertain or random processes.
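
As one example of such an adaptation, the coin-toss simulation can be modified for a biased coin through the prob argument of sample(). The probability of heads (0.7), the number of tosses, and the seed below are assumptions.

```r
set.seed(7)  # assumed seed, only for reproducibility

n_tosses <- 1000
p_heads  <- 0.7   # assumed probability of heads

# 1 for heads, 0 for tails; prob gives the weight of each outcome in order
biased_tosses <- sample(c(1, 0), n_tosses, replace = TRUE,
                        prob = c(p_heads, 1 - p_heads))

# Proportion of heads after each toss
running_prop <- cumsum(biased_tosses) / seq_along(biased_tosses)

mean(biased_tosses)   # overall proportion of heads
tail(running_prop, 1) # the running proportion ends at the same value
```

The running proportion should settle near p_heads as the number of tosses grows.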


R as a Calculator: A Comprehensive Guide

Launching R Console in Windows

In this article, we will learn how to use R as a calculator. Before doing so, one needs to launch the R software. On the Windows operating system, the R installer creates a desktop icon and a Start Menu item for R. Double-click the R icon to start the program; R opens the console, where you type R commands.

R Prompt

The greater-than sign (>) in the console is the prompt symbol. In this tutorial, we use the R language as a calculator by typing simple mathematical expressions at the prompt (>). Anything that can be computed on a pocket calculator can also be computed at the R prompt. After entering an expression at the prompt, press the Enter key to execute it.

Using R as a Calculator

Some examples of using R as a calculator are as follows:

1 + 2   #adds two or more numbers
1 - 2   #subtracts two or more numbers
1 * 2   #multiplies two or more numbers
1 / 2   #divides two or more numbers
1 %/% 2 #gives the integer part of the quotient
2 ^ 1   #gives exponentiation
31 %% 7 #gives the remainder after division

Apart from %/% and %%, these operators also work for complex numbers.
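
A short sketch of arithmetic on complex numbers; the values z1 and z2 are arbitrary examples.

```r
z1 <- 1 + 2i
z2 <- 3 - 1i

z1 + z2   # 4+1i
z1 - z2   # -2+3i
z1 * z2   # 5+5i
z1 / z2   # 0.1+0.7i
Mod(z1)   # modulus of z1, i.e. sqrt(5)

sqrt(as.complex(-4))  # 0+2i; a plain sqrt(-4) gives NaN with a warning
```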

Upon pressing the Enter key, the result of the expression will appear, prefixed by a number in square brackets:

1 + 2

# output
[1] 3

The [1] indicates that this line of output begins with the first element of the result.

Scientific Calculator Type Functions in R

One can also use R as an advanced scientific calculator. Many calculations available on scientific calculators are easily done in R, for example:

sqrt(5)      #Square root of a number
log(10)      #Natural log of a number
sin(45)      #Trigonometric function (sine); the argument is in radians
pi           #pi value 3.141593
exp(2)       #Antilog, e raised to a power
log10(5)     #Log of a number base 10
factorial(5) #Factorial of a number, e.g. 5!
abs(1/-2)    #Absolute value of a number
2*pi/360     #Number of radians in one Babylonian degree of a circle

Remember that R prints very large or very small numbers in scientific notation.
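
Because the trigonometric functions in R take their argument in radians, an angle in degrees has to be converted first. A minimal sketch, where deg2rad is a hypothetical helper and not a built-in function:

```r
# Hypothetical helper: convert degrees to radians
deg2rad <- function(deg) deg * pi / 180

sin(deg2rad(30))   # sine of 30 degrees, i.e. 0.5
sin(deg2rad(45))   # sine of 45 degrees, i.e. sqrt(2)/2
sin(45)            # sine of 45 radians, a different quantity
```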
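
How a number is displayed can be controlled with format(). The sketch below uses arbitrary example values; the exact digits shown depend on the current options, so no output is listed.

```r
big   <- 123456789012345678
small <- 0.00000000012

big     # printed by R in scientific notation
small   # also printed in scientific notation

# Request a specific notation explicitly
format(big, scientific = TRUE)
format(small, scientific = FALSE)
```

The scipen option (see ?options) shifts the point at which R switches between fixed and scientific notation when printing.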


Order of Precedence/ Operations

The R language uses parentheses to group operations and otherwise follows the usual rules for the order of operations. For example:

1 - 2/3   #It first computes 2/3 and then subtracts it from 1
(1-2)/3   #It first computes (1-2) and then divides it by 3
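
A few more cases where operator precedence can surprise, given as a small sketch with arbitrary numbers:

```r
-2^2        # -4: exponentiation is applied before the unary minus
(-2)^2      #  4
2^3^2       # 512: exponentiation groups from right to left, 2^(3^2)
(2^3)^2     # 64
1:3 * 2     # 2 4 6: the colon operator is applied before multiplication
1:(3 * 2)   # 1 2 3 4 5 6
```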

The R language recognizes certain goofs, such as trying to divide by zero or missing values in data.

1/0       # Division by zero; R returns infinity (Inf)
0/0       # Undefined; R returns Not a Number (NaN)
"one"/2   # A character string divided by a number; R reports an error

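These special values can also be detected in code rather than by eye. The sketch below uses an arbitrary vector x and catches the error from dividing a character string with tryCatch(); the name err_msg is illustrative.

```r
x <- c(1 / 0, 0 / 0, NA, 5)

is.finite(x)   # FALSE FALSE FALSE  TRUE
is.nan(x)      # FALSE  TRUE FALSE FALSE
is.na(x)       # FALSE  TRUE  TRUE FALSE (NaN also counts as missing)

# Capture the error message instead of stopping the script
err_msg <- tryCatch("one" / 2, error = function(e) conditionMessage(e))
err_msg
```
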
Further Reading: Computing Descriptive Statistics in R


Introduction to R Language

This post is an introduction to the R language. We discuss a short history of the R programming language, obtaining R, the installation path of the language, installing R, and the R console. Let us start with an introduction to the R language.

Introduction to R Language

R is an open-source (GPL) programming language for statistical computing and graphics, modeled on the S and S-PLUS languages. The S language was developed at AT&T Bell Laboratories, starting in the mid-1970s. Robert Gentleman and Ross Ihaka of the Department of Statistics at the University of Auckland began R as a research project in the early 1990s, and it was released under the GPL in 1995.

The R language is currently maintained by the R Core Team (an international team of volunteer developers). The R Project website is the main site for information about R. From it, one can obtain the software, accompanying packages, and many other sources of documentation (help files).


R provides a wide variety of statistical and graphical techniques such as linear and non-linear modeling, classical statistical tests, time-series analysis, classification, multivariate analysis, etc., as it is an integrated suite of software having facilities for data manipulation, calculation, and graphics display. It includes

  • Effective data handling and storage facilities
  • A suite of operators for calculations on arrays, particularly matrices
  • A large, coherent, integrated collection of intermediate tools for data analysis
  • Graphical facilities for data analysis
  • Conditions, loops, user-defined recursive functions, and input-output facilities
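
Two of the facilities listed above, matrix operators and user-defined recursive functions, can be illustrated with a small sketch. The matrix A and the function fact are arbitrary examples.

```r
# Matrix operators
A <- matrix(1:4, nrow = 2)   # filled column by column: rows (1, 3) and (2, 4)
I <- diag(2)                 # 2 x 2 identity matrix

A %*% I      # matrix product, equal to A
t(A)         # transpose
solve(A)     # inverse (A is invertible, its determinant is -2)

# A user-defined recursive function (illustrative)
fact <- function(n) {
  if (n <= 1) 1 else n * fact(n - 1)
}
fact(5)      # 120
```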

Obtaining R Software

R can be downloaded from the R Project site as ready-to-run files (binaries) for several operating systems such as Windows, Mac OS X, Linux, and Solaris. The source code for R is also available for download and can be compiled for other platforms. R simplifies many statistical computations: it is a very powerful statistical language with many statistical routines (programming code) developed by people from all over the world and freely available from the R Project website as “Packages”. The basic installation of R contains a powerful set of tools, including some basic packages required for data handling and data analysis.

Many users think of R as a statistical system, but it is better described as an environment within which statistical techniques are implemented. The R language can also be extended via packages.
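
A minimal sketch of working with packages from the console. The name base_pkgs is illustrative, and the commented-out package name is only a placeholder for whichever add-on package is needed.

```r
# Packages that ship with the basic installation
base_pkgs <- rownames(installed.packages(priority = "base"))
base_pkgs

# Attach an installed package to the current session
library(stats)

# Add-on packages are downloaded and installed from CRAN with
# install.packages(); this needs an internet connection, so it is
# left commented out here:
# install.packages("some_package")
```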

Installing R

For the Windows operating system, the binary version is available from http://cran.r-project.org/bin/windows/base/ as the file “R-4.4.1-win.exe”. R-4.4.1 (Race for Your Life), released on 15 June 2024 by Duncan Murdoch, is the latest version of R at the time of writing.

After downloading the binary file, double-click it to start a largely automatic installation of R; a customized installation option is also available. Follow the instructions during the installation procedure. Once the installation is complete, the R icon appears on your computer desktop.

The R Console

When R starts, you will see the R console window, where you type commands to get the required results. Commands are typed at the R console command prompt. You can edit previously typed commands using the left, right, up, and down arrow keys and the Home, End, Backspace, Insert, and Delete keys. The up and down arrow keys scroll through the command history. It is also possible to type commands in a file and then execute the file using the source() function in the R console.
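
A small sketch of running commands from a file with source(). To keep the example self-contained, the script is written to a temporary file first; in practice the file would be created in a text editor. The file contents and the name script_path are assumptions.

```r
# Write two R commands to a temporary script file
script_path <- tempfile(fileext = ".R")
writeLines(c("x <- c(2, 4, 6, 8)",
             "x_mean <- mean(x)"),
           script_path)

# Execute every command in the file
source(script_path)

x_mean   # 5, created by the sourced script
```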

Books on R Programming Language

The following books can be useful for learning the R and S languages.

  • “Practicing R for Statistical Computing” by Aslam, M., and Imdad Ullah, M. Springer, 2023.
  • “Psychologie statistique avec R” by Yvonnick Noel. Pratique R. Springer, 2013.
  • “Instant R: An introduction to R for Statistical Analysis” by Sarah Stowell. Jotunheim Publishing, 2012.
  • “Financial Risk Modeling and Portfolio Optimization with R” by Bernhard Pfaff. Wiley, Chichester, UK, 2012.
  • “An R Companion to Applied Regression” by John Fox and Sanford Weisberg. Sage Publications, Thousand Oaks, CA, USA, 2nd Edition, 2011.
  • “R Graphs Cookbook” by Hrishi Mittal. Packt Publishing, 2011.
  • “R in Action” by Rob Kabacoff. Manning, 2010.
  • “The Statistical Analysis with R Beginners Guide” by John M. Quick. Packt Publishing, 2010.
  • “Introducing Monte Carlo Methods with R” by Christian Robert and George Casella. Use R! Springer, 2010.
  • “R for SAS and SPSS users” by Robert A. Muenchen. Springer Series in Statistics and Computing. Springer, 2009.
