Simulation in R for Sampling (2024)

This post covers simulation for sampling in the R programming language. It gives basic examples of generating samples and then computing simple statistics from the generated data.

Question 1: Simulate a coin toss 20 times.

sample(c("H", "T"), 20, replace = TRUE)

Question 2: Write R commands to find the 95% confidence interval for the mean (unknown variance), based on a sample drawn from the following population:

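# Population values
yp <- c(111, 150, 121, 198, 112, 136, 114, 129, 117, 115, 186, 110, 121, 115, 114)
N  <- length(yp)

# Draw a simple random sample of 5 units without replacement
ys <- sample(yp, 5)
n  <- length(ys)
mys <- mean(ys)   # sample mean
vys <- var(ys)    # sample variance, estimating the unknown population variance

# Estimated variance of the sample mean, with finite population correction
vybar <- (1 - n/N) * vys / n
sdr <- sqrt(vybar)

# The variance is unknown, so use the t quantile with n - 1 degrees of freedom
error <- qt(0.975, df = n - 1) * sdr
ll <- mys - error
ul <- mys + error
c(lower = ll, upper = ul)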
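
As a rough check on an interval of this kind, one can repeat the sampling many times and count how often the interval covers the true population mean. The sketch below assumes a t-based interval with a finite population correction; the names `covers` and `reps`, the number of repetitions, and the seed are illustrative choices rather than part of the original exercise.

```r
set.seed(123)  # arbitrary seed, used only for reproducibility
yp <- c(111, 150, 121, 198, 112, 136, 114, 129, 117, 115, 186, 110, 121, 115, 114)
N <- length(yp); n <- 5
mu <- mean(yp)   # true mean, known here because the whole population is available
reps <- 2000     # number of repeated samples (assumed value)

covers <- replicate(reps, {
  ys <- sample(yp, n)                        # SRSWOR of size n
  se <- sqrt((1 - n / N) * var(ys) / n)      # estimated standard error of the mean
  half <- qt(0.975, df = n - 1) * se         # half-width of the interval
  (mean(ys) - half <= mu) && (mu <= mean(ys) + half)
})

mean(covers)  # estimated coverage of the nominal 95% interval
```

With a population this small and skewed, the estimated coverage need not equal 0.95 exactly.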

Sampling without Replacement and Histogram

Question 3: Given the population ye <- c(112, 114, 119, 125, 158, 117, 135, 141, 185, 128), simulate Simple Random Sampling without Replacement (SRSWOR) with $k=100$ samples, each of size $n=3$. Compute the mean of each sample and draw a histogram of the sample means.

k <- 100; n <- 3
ye <- c(112, 114, 119, 125, 158, 117, 135, 141, 185, 128)
m1 <- numeric(k)   # storage for the k sample means

for (i in 1:k) {
  s <- sample(ye, n)   # sample() draws without replacement by default
  m1[i] <- mean(s)
}

m1
hist(m1)
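
To judge whether the simulated means behave as expected, they can be compared with the population mean and with the textbook SRSWOR variance of the sample mean, $(1 - n/N)\,S^2/n$. The following is a minimal sketch under that assumption; the seed and the use of `replicate()` in place of the loop are choices made here for brevity.

```r
set.seed(1)  # arbitrary seed, used only for reproducibility
ye <- c(112, 114, 119, 125, 158, 117, 135, 141, 185, 128)
N <- length(ye); n <- 3; k <- 100

# k sample means, each from an SRSWOR of size n
m1 <- replicate(k, mean(sample(ye, n)))

# Simulated versus population mean
c(simulated = mean(m1), population = mean(ye))

# Simulated versus theoretical variance of the sample mean
# (var() uses the N - 1 divisor, which matches S^2 in the formula)
c(simulated = var(m1), theory = (1 - n / N) * var(ye) / n)
```

With only 100 samples the simulated values will differ somewhat from the theoretical ones; increasing `k` should narrow the gap.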

Question 4: Write R code to generate a population of 500 values from a normal distribution with mean 20 and standard deviation 30. Select 5000 samples, each of size 50, using systematic sampling, and compute the mean of each sample. Find the mean and variance of the 5000 sample means.

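N <- 500; n <- 50
k <- N / n           # sampling interval
m <- numeric(5000)   # storage for the sample means
pop <- rnorm(N, mean = 20, sd = 30)

for (i in 1:5000) {
  start <- sample(1:k, 1)      # random start between 1 and k
  s <- seq(start, N, by = k)   # every k-th unit from the random start
  sys.sample <- pop[s]
  m[i] <- mean(sys.sample)
}

mean(m); var(m)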
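
Because the random start can take only $k = N/n = 10$ values, the 5000 draws repeat the same 10 systematic samples many times. As a sketch, the 10 possible samples can be enumerated directly to get the exact mean and variance of the systematic sample mean for one generated population; the name `all_means` and the seed are illustrative.

```r
set.seed(42)  # arbitrary seed, used only for reproducibility
N <- 500; n <- 50; k <- N / n
pop <- rnorm(N, mean = 20, sd = 30)

# One systematic sample mean for each possible start 1, ..., k
all_means <- sapply(1:k, function(start) mean(pop[seq(start, N, by = k)]))

mean(all_means)                    # equals mean(pop) because N = n * k
mean((all_means - mean(pop))^2)    # exact sampling variance of the systematic mean
```

The simulated `var(m)` from a loop over random starts should approach the second value as the number of draws grows.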

Question 5: Why do we use simulation for sampling?
Answer: A simulation study is useful for evaluating a sampling strategy. We can generate populations that reflect specific situations. From the generated population, a sample of size $n$ is drawn $k$ times, and the estimator is computed from each sample. The variance of the $k$ estimates is then calculated to assess the efficiency of the strategy.
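
The idea can be sketched by comparing two strategies on the same generated population. The example below assumes a hypothetical population with a linear trend, a situation in which systematic sampling is expected to be more efficient than SRSWOR; the population model, `reps`, and the seed are assumptions made for illustration.

```r
set.seed(2024)  # arbitrary seed, used only for reproducibility
N <- 500; n <- 50; k <- N / n
reps <- 2000    # number of samples drawn per strategy (assumed value)

# Hypothetical population: linear trend plus normal noise
pop <- 1:N + rnorm(N, sd = 10)

# Strategy 1: simple random sampling without replacement
srs_means <- replicate(reps, mean(sample(pop, n)))

# Strategy 2: systematic sampling with a random start
sys_means <- replicate(reps, {
  start <- sample(1:k, 1)
  mean(pop[seq(start, N, by = k)])
})

# The strategy with the smaller variance of the estimates is the more efficient one
c(srswor = var(srs_means), systematic = var(sys_means))
```

Changing how `pop` is generated (for example, removing the trend) lets one examine how the comparison depends on the population structure.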

Coin Toss Experiment in R

Question 6: Write R code to simulate a coin-tossing experiment.

# Define the number of coin tosses
n_tosses <- 100

# Simulate coin tosses (1 for heads, 0 for tails)
coin_tosses <- sample(c(0, 1), n_tosses, replace = TRUE)

# Calculate the proportion of heads
prop_heads <- mean(coin_tosses)

# Display results
cat("Number of Heads:", sum(coin_tosses), "\n")
cat("Proportion of Heads:", prop_heads, "\n")
# Plot the results
barplot(c(sum(coin_tosses), n_tosses - sum(coin_tosses)),
        names.arg = c("Heads", "Tails"),
        col = c("skyblue", "salmon"),
        main = "Coin Toss Simulation"
       )
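
A further sketch tracks how the proportion of heads settles as the number of tosses grows. It assumes a fair coin coded as 1 for heads and 0 for tails; `running_prop`, the 1000 tosses, and the seed are illustrative choices.

```r
set.seed(10)  # arbitrary seed, used only for reproducibility
n_tosses <- 1000
coin_tosses <- sample(c(0, 1), n_tosses, replace = TRUE)

# Proportion of heads after each toss
running_prop <- cumsum(coin_tosses) / seq_along(coin_tosses)

plot(running_prop, type = "l", ylim = c(0, 1),
     xlab = "Number of tosses", ylab = "Proportion of heads",
     main = "Running Proportion of Heads")
abline(h = 0.5, lty = 2)  # reference line at the fair-coin probability
```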

One can adapt these examples for more complex statistical simulations or specific scenarios by modifying the simulation process and analyzing the results accordingly. Simulations are commonly used in various fields, such as statistics, finance, and operations research, to model and analyze uncertain or random processes.
