# Mean Comparison Tests: Hypothesis Testing (One Sample and Two Sample)

This tutorial covers the basics of mean comparison tests in R: hypothesis testing for one sample, for two independent samples, and for dependent samples. It also shows how to find p-values and critical-region values from a distribution such as the t-distribution, and how to perform one-tailed and two-tailed hypothesis tests.

How to Perform One-Sample t-Test in R

A recent article in The Wall Street Journal reported that the 30-year mortgage rate is now less than 6%. A sample of eight small banks in the Midwest revealed the following 30-year rates (in percent): 4.8, 5.3, 6.5, 4.8, 6.1, 5.8, 6.2, 5.6.

At the 0.01 significance level (probability of type-I error), can we conclude that the 30-year mortgage rate for small banks is less than 6%?
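Stated formally, this question is a left-tailed test. The block below writes out the hypotheses and the one-sample t statistic. The symbols $\mu_0$, $\bar{x}$, $s$, $n$, and $\alpha$ are notation introduced here for the hypothesized mean, sample mean, sample standard deviation, sample size, and significance level.

$$
H_0: \mu \ge 6 \quad \text{versus} \quad H_1: \mu < 6, \qquad
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad \text{df} = n - 1
$$

With $\mu_0 = 6$ and $\alpha = 0.01$, the null hypothesis is rejected when $t$ falls below the lower-tail critical value of the t-distribution with $n - 1$ degrees of freedom.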

Manual Calculations for One-Sample t-Test and Confidence Interval

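# Manual calculation
X <- c(4.8, 5.3, 6.5, 4.8, 6.1, 5.8, 6.2, 5.6)
xbar <- mean(X)   # sample mean
s <- sd(X)        # sample standard deviation
mu <- 6           # hypothesized mean
n <- length(X)
df <- n - 1       # degrees of freedom
tcal <- (xbar - mu) / (s / sqrt(n))   # test statistic
tcal

# 99% two-sided confidence interval for the mean
c(xbar - qt(0.995, df = df) * s / sqrt(n), xbar + qt(0.995, df = df) * s / sqrt(n))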

Critical Values from t-Table

# Critical value for the left-tailed test (alpha = 0.01)
qt(0.01, df = df, lower.tail = TRUE)

# Critical value for the right-tailed test
qt(0.99, df = df, lower.tail = TRUE)

# Critical value for the two-tailed test (alpha/2 in each tail; use +/- this value)
qt(0.995, df = df)
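The critical value only becomes useful once it is compared with the calculated statistic. The following is a minimal sketch of that comparison for the left-tailed mortgage-rate question; the names `mu0`, `alpha`, `tcrit`, and `reject_h0` are chosen for this illustration and do not come from the original code.

```r
# Sketch: left-tailed decision rule using the critical value.
X <- c(4.8, 5.3, 6.5, 4.8, 6.1, 5.8, 6.2, 5.6)  # sample from the example
mu0 <- 6        # hypothesized mean (name chosen for this sketch)
alpha <- 0.01   # significance level from the problem statement

n <- length(X)
tcal <- (mean(X) - mu0) / (sd(X) / sqrt(n))   # test statistic
tcrit <- qt(alpha, df = n - 1)                # left-tail critical value

# Reject H0 only if the statistic falls in the left-tail critical region
reject_h0 <- tcal < tcrit
reject_h0
```

For this sample the statistic is about -1.62, which lies above the critical value of about -3.00, so the null hypothesis is not rejected at the 0.01 level.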

Finding p-Values

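# p-value (alternative is less)
pt(tcal, df = df)

# p-value (alternative is greater)
pt(tcal, df = df, lower.tail = FALSE)

# p-value (alternative is two-tailed, i.e., not equal to)
# -abs() keeps the result valid whether tcal is negative or positive
2 * pt(-abs(tcal), df = df)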
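The three p-value formulas can be gathered into one small function. The sketch below is an illustration only: `t_pvalue` is a hypothetical helper, not a base R function, and it uses `-abs()` in the two-sided case so that the result does not depend on the sign of the statistic.

```r
# Hypothetical helper (not part of base R): p-value of a t statistic
# for each form of the alternative hypothesis.
t_pvalue <- function(tstat, df, alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  switch(alternative,
    less      = pt(tstat, df),                      # lower-tail area
    greater   = pt(tstat, df, lower.tail = FALSE),  # upper-tail area
    two.sided = 2 * pt(-abs(tstat), df)             # both tails
  )
}

X <- c(4.8, 5.3, 6.5, 4.8, 6.1, 5.8, 6.2, 5.6)
df <- length(X) - 1
tcal <- (mean(X) - 6) / (sd(X) / sqrt(length(X)))

p_less <- t_pvalue(tcal, df, "less")
p_less < 0.01   # TRUE would mean rejecting H0 at the 0.01 level
```

Doubling the lower-tail area without `-abs()` gives a correct two-tailed p-value only when the statistic is negative; for a positive statistic it would exceed 1.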


Performing One-Sample Confidence Interval and t-test Using Built-in Function

# Left-tailed test
t.test(x = X, mu = 6, alternative = "less", conf.level = 0.99)

# Right-tailed test
t.test(x = X, mu = 6, alternative = "greater", conf.level = 0.99)

# Two-tailed test
t.test(x = X, mu = 6, alternative = "two.sided", conf.level = 0.99)
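`t.test()` returns an object of class `"htest"`, so the individual results can be pulled out by name rather than read off the printed summary. A short sketch, using the left-tailed call from this example (the name `res` is chosen here for illustration):

```r
X <- c(4.8, 5.3, 6.5, 4.8, 6.1, 5.8, 6.2, 5.6)
res <- t.test(x = X, mu = 6, alternative = "less", conf.level = 0.99)

res$statistic   # calculated t statistic
res$parameter   # degrees of freedom
res$p.value     # p-value for the chosen alternative
res$conf.int    # confidence interval (one-sided when alternative = "less")
res$estimate    # sample mean
```

A one-sided alternative produces a one-sided interval, so the two-sided 99% interval from the manual calculation corresponds to the `alternative = "two.sided"` call.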


How to Perform Two-Sample t-Test in R

Suppose we have two samples stored in the vectors $X$ and $Y$, as shown in the R code below. We want to compare the means of two groups of people with respect to (say) their wages in a certain week.

X <- c(70, 82, 78, 70, 74, 82, 90)
Y <- c(60, 80, 91, 89, 77, 69, 88, 82)

Manual Calculations for Two-Sample t-Test and Confidence Interval

nx <- length(X)
ny <- length(Y)
xbar <- mean(X)
sx <- sd(X)
ybar <- mean(Y)
sy <- sd(Y)
df <- nx + ny - 2
# Pooled standard deviation (square root of the pooled variance)
SP <- sqrt(((nx - 1) * sx^2 + (ny - 1) * sy^2) / df)
# Test statistic under H0: the difference between the two means is 0
tcal <- ((xbar - ybar) - 0) / (SP * sqrt(1 / nx + 1 / ny))
tcal

# 95% confidence interval for the difference between the means
LL <- (xbar - ybar) - qt(0.975, df) * sqrt(SP^2 * (1 / nx + 1 / ny))
UL <- (xbar - ybar) + qt(0.975, df) * sqrt(SP^2 * (1 / nx + 1 / ny))
c(LL, UL)
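The calculations above correspond to the pooled-variance formulas below. The symbols are notation introduced here to mirror the code: $s_p$ for `SP`, $s_x$ and $s_y$ for `sx` and `sy`, and $n_x$ and $n_y$ for `nx` and `ny`.

$$
s_p^2 = \frac{(n_x - 1)\, s_x^2 + (n_y - 1)\, s_y^2}{n_x + n_y - 2}, \qquad
t = \frac{(\bar{x} - \bar{y}) - 0}{s_p \sqrt{\dfrac{1}{n_x} + \dfrac{1}{n_y}}}
$$

$$
(\bar{x} - \bar{y}) \pm t_{0.975,\; n_x + n_y - 2} \; s_p \sqrt{\frac{1}{n_x} + \frac{1}{n_y}}
$$

These formulas assume that the two populations have equal variances, which is the same assumption made by `var.equal = TRUE` in `t.test()`.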

Finding p-values

# p-value for the left-tailed test (alternative is less)
pt(tcal, df)

# p-value for the two-tailed test
# -abs() keeps the result valid whether tcal is negative or positive
2 * pt(-abs(tcal), df)

# p-value for the right-tailed test (alternative is greater)
pt(tcal, df, lower.tail = FALSE)

Finding Critical Values from t-Table

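# Left-tailed test (alpha = 0.05, matching the 95% confidence interval above)
qt(0.05, df = df, lower.tail = TRUE)

# Right-tailed test
qt(0.95, df = df, lower.tail = TRUE)

# Two-tailed test (alpha/2 in each tail; use +/- this value)
qt(0.975, df = df)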

Performing Two-Sample Confidence Interval and t-test Using Built-in Function

# Left-tailed test
t.test(X, Y, alternative = "less", var.equal = TRUE)

# Right-tailed test
t.test(X, Y, alternative = "greater", var.equal = TRUE)

# Two-tailed test
t.test(X, Y, alternative = "two.sided", var.equal = TRUE)
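As a check, the manual two-sample results can be compared with what `t.test()` reports. The sketch below recomputes the pooled statistic, the 95% interval, and the two-tailed p-value, then compares each with the built-in output. The names `se`, `ci`, and `res` are chosen for this illustration.

```r
X <- c(70, 82, 78, 70, 74, 82, 90)
Y <- c(60, 80, 91, 89, 77, 69, 88, 82)
nx <- length(X)
ny <- length(Y)
df <- nx + ny - 2

# Manual pooled-variance calculation
SP <- sqrt(((nx - 1) * var(X) + (ny - 1) * var(Y)) / df)
se <- SP * sqrt(1 / nx + 1 / ny)          # standard error of the difference
tcal <- (mean(X) - mean(Y)) / se
ci <- (mean(X) - mean(Y)) + c(-1, 1) * qt(0.975, df) * se

# Built-in equal-variance test (default conf.level is 0.95)
res <- t.test(X, Y, alternative = "two.sided", var.equal = TRUE)

all.equal(unname(res$statistic), tcal)
all.equal(as.numeric(res$conf.int), ci)
all.equal(res$p.value, 2 * pt(-abs(tcal), df))
```

Each `all.equal()` call should return `TRUE` if the manual and built-in calculations agree.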

Note that if the values of $X$ and $Y$ are stored in a data frame, the two-sample t-test can be performed using the formula operator (~). First, build a data frame from the vectors X and Y.

data <- data.frame(values = c(X, Y), group = c(rep("A", nx), rep("B", ny)))

t.test(values ~ group, data = data, alternative = "less", var.equal = TRUE)
t.test(values ~ group, data = data, alternative = "greater", var.equal = TRUE)
t.test(values ~ group, data = data, alternative = "two.sided", var.equal = TRUE)
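The formula interface and the two-vector interface should give the same answer for the same data. The following sketch checks this for the two-tailed case; the names `res_formula` and `res_vectors` are chosen for this illustration.

```r
X <- c(70, 82, 78, 70, 74, 82, 90)
Y <- c(60, 80, 91, 89, 77, 69, 88, 82)
data <- data.frame(values = c(X, Y),
                   group = c(rep("A", length(X)), rep("B", length(Y))))

res_formula <- t.test(values ~ group, data = data, var.equal = TRUE)
res_vectors <- t.test(X, Y, var.equal = TRUE)

# Both calls test the same difference (group A minus group B)
all.equal(unname(res_formula$statistic), unname(res_vectors$statistic))
all.equal(res_formula$p.value, res_vectors$p.value)
```

With a formula, the direction of a one-sided alternative follows the order of the group levels (alphabetical by default), so here `alternative = "less"` tests whether the mean of group A is less than the mean of group B.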


For more on probability distribution functions in R, see the article Probability Distributions in R.