This post covers the basics of mean comparison tests in R: hypothesis tests for one sample, two independent samples, and dependent samples. We will also find p-values and critical-region values from a distribution such as the t-distribution, and perform both one-tailed and two-tailed hypothesis tests.
How to Perform One-Sample t-Test in R
A recent article in The Wall Street Journal reported that the 30-year mortgage rate is now less than 6%. A sample of eight small banks in the Midwest revealed the following 30-year rates (in percent):
4.8 | 5.3 | 6.5 | 4.8 | 6.1 | 5.8 | 6.2 | 5.6 |
At the 0.01 significance level (probability of type-I error), can we conclude that the 30-year mortgage rate for small banks is less than 6%?
Manual Calculations for One-Sample t-Test and Confidence Interval
The one-sample mean comparison test can be computed by hand from the sample mean, the sample standard deviation, and the sample size.
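For reference, the quantities behind the manual calculation can be written out as follows. This is the standard one-sample t-test setup for the mortgage-rate question, under the usual assumption that the sample comes from an approximately normal population; $\mu_0 = 6$ is the hypothesized mean (called `mu` in the code) and $\alpha = 0.01$.

$$H_0: \mu \ge 6 \quad \text{versus} \quad H_1: \mu < 6$$

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad df = n - 1$$

$$\bar{x} \pm t_{1 - \alpha/2,\; n-1} \, \frac{s}{\sqrt{n}}$$

The last expression is the two-sided confidence interval; with $\alpha = 0.01$ the quantile is $t_{0.995,\, 7}$, which is why the code calls `qt(0.995, df = df)`.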
# Manual way
X <- c(4.8, 5.3, 6.5, 4.8, 6.1, 5.8, 6.2, 5.6)
xbar <- mean(X)
s <- sd(X)
mu = 6
n = length(X)
df = n - 1
# Calculated t statistic
tcal = (xbar - mu) / (s / sqrt(n))
tcal
# 99% confidence interval for the mean
c(xbar - qt(0.995, df = df) * s / sqrt(n),
  xbar + qt(0.995, df = df) * s / sqrt(n))
Critical Values from t-Table
# Critical value for the left tail
qt(0.01, df = df, lower.tail = TRUE)
# Critical value for the right tail
qt(0.99, df = df, lower.tail = TRUE)
# Critical value for both tails (use the negative and the positive value)
qt(0.995, df = df)
Finding p-Values
# p-value (alternative is less)
pt(tcal, df = df)
# p-value (alternative is greater)
1 - pt(tcal, df = df)
# p-value (alternative is two-tailed, i.e. not equal to)
# -abs() keeps the result valid whether tcal is negative or positive
2 * pt(-abs(tcal), df = df)
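To answer the mortgage-rate question, the calculated statistic still has to be compared with the critical value, or the p-value with the significance level. The sketch below shows both decision rules for the left-tailed test; the names `mu0`, `alpha`, `t_crit`, `reject_by_critical`, and `reject_by_pvalue` are introduced here for illustration only.

```r
# Mortgage-rate sample from the example above
X <- c(4.8, 5.3, 6.5, 4.8, 6.1, 5.8, 6.2, 5.6)
mu0 <- 6        # hypothesized mean under H0 (illustrative name)
alpha <- 0.01   # significance level

n <- length(X)
df <- n - 1
tcal <- (mean(X) - mu0) / (sd(X) / sqrt(n))

# Critical-value approach: reject H0 if tcal falls below the left-tail cutoff
t_crit <- qt(alpha, df = df)
reject_by_critical <- tcal < t_crit

# p-value approach: reject H0 if the left-tail p-value is below alpha
p_less <- pt(tcal, df = df)
reject_by_pvalue <- p_less < alpha

c(tcal = tcal, t_crit = t_crit, p_value = p_less)
reject_by_critical
reject_by_pvalue
```

The two rules always give the same decision. For this sample the calculated t (about -1.62) does not fall below the critical value (about -3.00), so at the 0.01 level the data do not support the claim that the mean rate is below 6%.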
Performing One-Sample Confidence Interval and t-test Using Built-in Function
The one-sample mean comparison test can also be performed with the built-in t.test() function in R.
# Left-tail test
t.test(x = X, mu = 6, alternative = "less", conf.level = 0.99)
# Right-tail test
t.test(x = X, mu = 6, alternative = "greater", conf.level = 0.99)
# Two-tail test
t.test(x = X, mu = 6, alternative = "two.sided", conf.level = 0.99)
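The printed output of `t.test()` is convenient to read, but the individual numbers are often needed for further work. As a small sketch, the result can be stored in an object (here called `res`, a name chosen for illustration) and its components extracted by name:

```r
X <- c(4.8, 5.3, 6.5, 4.8, 6.1, 5.8, 6.2, 5.6)

# Store the result of the left-tailed test instead of only printing it
res <- t.test(x = X, mu = 6, alternative = "less", conf.level = 0.99)

res$statistic   # calculated t value
res$parameter   # degrees of freedom
res$p.value     # p-value of the test
res$conf.int    # confidence interval (one-sided for a one-tailed test)
res$estimate    # sample mean
```

Note that for a one-tailed alternative the reported confidence interval is one-sided, so it will not match the two-sided interval from the manual calculation; use `alternative = "two.sided"` to reproduce that interval.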
How to Perform Two-Sample t-Test in R
Suppose we have two samples stored in the vectors $X$ and $Y$, as shown in the R code. We want to compare the means of two groups of people with respect to (say) their wages in a certain week.
X = c(70, 82, 78, 70, 74, 82, 90)
Y = c(60, 80, 91, 89, 77, 69, 88, 82)
Manual Calculations for Two-Sample t-Test and Confidence Interval
The manual calculation of the two-sample t-test, assuming equal population variances (pooled standard deviation), is as follows.
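In symbols, the pooled-variance calculation used in the code can be summarized as below. This is the standard equal-variance two-sample t-test, with the null hypothesis that the difference between the two population means is 0; the subscripts $x$ and $y$ correspond to the code variables `nx`, `sx`, `ny`, and `sy`, and $s_p$ corresponds to `SP`.

$$s_p = \sqrt{\frac{(n_x - 1)\, s_x^2 + (n_y - 1)\, s_y^2}{n_x + n_y - 2}}$$

$$t = \frac{(\bar{x} - \bar{y}) - 0}{s_p \sqrt{\dfrac{1}{n_x} + \dfrac{1}{n_y}}}, \qquad df = n_x + n_y - 2$$

$$(\bar{x} - \bar{y}) \pm t_{0.975,\; df} \; s_p \sqrt{\frac{1}{n_x} + \frac{1}{n_y}}$$

The last expression is the 95% confidence interval for the difference in means, which is what the limits `LL` and `UL` compute.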
nx = length(X)
ny = length(Y)
xbar = mean(X)
sx = sd(X)
ybar = mean(Y)
sy = sd(Y)
df = nx + ny - 2
# Pooled standard deviation
SP = sqrt(((nx - 1) * sx^2 + (ny - 1) * sy^2) / df)
# Calculated t statistic (hypothesized difference is 0)
tcal = ((xbar - ybar) - 0) / (SP * sqrt(1/nx + 1/ny))
tcal
# 95% confidence interval for the difference in means
LL <- (xbar - ybar) - qt(0.975, df) * sqrt(SP^2 * (1/nx + 1/ny))
UL <- (xbar - ybar) + qt(0.975, df) * sqrt(SP^2 * (1/nx + 1/ny))
c(LL, UL)
Finding p-values
# p-value for the left-tailed alternative
pt(tcal, df)
# p-value for the two-tailed alternative
# -abs() keeps the result valid whether tcal is negative or positive
2 * pt(-abs(tcal), df)
# p-value for the right-tailed alternative
1 - pt(tcal, df)
Finding Critical Values from t-Table
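# Critical values, assuming a 0.05 significance level
# (this matches the 95% confidence interval computed above)
# Left tail
qt(0.05, df = df, lower.tail = TRUE)
# Right tail
qt(0.95, df = df, lower.tail = TRUE)
# Both tails (use the negative and the positive value)
qt(0.975, df = df)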
Performing Two-Sample Confidence Interval and t-test Using Built-in Function
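The two-sample mean comparison test can also be performed with the built-in t.test() function in R. Setting var.equal = TRUE requests the pooled-variance test used in the manual calculation.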
# Left-tail test
t.test(X, Y, alternative = "less", var.equal = TRUE)
# Right-tail test
t.test(X, Y, alternative = "greater", var.equal = TRUE)
# Two-tail test
t.test(X, Y, alternative = "two.sided", var.equal = TRUE)
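As a quick sanity check, the pooled-variance calculation done by hand should reproduce what `t.test()` reports when `var.equal = TRUE`. The following sketch recomputes the manual statistic and compares it with the built-in result; `p_two` and `res` are illustrative names.

```r
X <- c(70, 82, 78, 70, 74, 82, 90)
Y <- c(60, 80, 91, 89, 77, 69, 88, 82)
nx <- length(X)
ny <- length(Y)
df <- nx + ny - 2

# Manual pooled standard deviation, t statistic, and two-tailed p-value
SP <- sqrt(((nx - 1) * var(X) + (ny - 1) * var(Y)) / df)
tcal <- (mean(X) - mean(Y)) / (SP * sqrt(1 / nx + 1 / ny))
p_two <- 2 * pt(-abs(tcal), df)

# Built-in pooled-variance test
res <- t.test(X, Y, alternative = "two.sided", var.equal = TRUE)

# Each comparison should return TRUE
all.equal(tcal, unname(res$statistic))
all.equal(p_two, res$p.value)
all.equal(df, unname(res$parameter))
```

If `var.equal = TRUE` is omitted, `t.test()` defaults to the Welch test, which uses a different standard error and degrees of freedom, so the comparison would no longer hold.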
Note that if the $X$ and $Y$ values are stored in a data frame, the two-sample t-test can be performed using the formula symbol (~). Let’s first build the data frame from the vectors $X$ and $Y$.
# Stack both samples in one column and label each value with its group
data <- data.frame(values = c(X, Y), group = c(rep("A", nx), rep("B", ny)))
# Left-tail test
t.test(values ~ group, data = data, alternative = "less", var.equal = TRUE)
# Right-tail test
t.test(values ~ group, data = data, alternative = "greater", var.equal = TRUE)
# Two-tail test
t.test(values ~ group, data = data, alternative = "two.sided", var.equal = TRUE)
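With the formula interface, the direction of a one-tailed test depends on the order of the group levels: the difference is taken as the first level minus the second, and character labels are ordered alphabetically. The sketch below checks that the formula call with groups "A" and "B" matches the vector call `t.test(X, Y, ...)`; `res_formula` and `res_vectors` are illustrative names.

```r
X <- c(70, 82, 78, 70, 74, 82, 90)
Y <- c(60, 80, 91, 89, 77, 69, 88, 82)
data <- data.frame(values = c(X, Y),
                   group  = c(rep("A", length(X)), rep("B", length(Y))))

# Level order determines which group mean is subtracted from which
levels(factor(data$group))

res_formula <- t.test(values ~ group, data = data,
                      alternative = "less", var.equal = TRUE)
res_vectors <- t.test(X, Y, alternative = "less", var.equal = TRUE)

# Both comparisons should return TRUE
all.equal(unname(res_formula$statistic), unname(res_vectors$statistic))
all.equal(res_formula$p.value, res_vectors$p.value)
```

If the group labels sorted the other way (for example, if the $X$ values were labelled "B"), the sign of the statistic would flip and "less" and "greater" would swap meaning.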
To learn more about the probability distribution functions in R, see: Probability Distributions in R