Statistical Power Analysis in R: A Comprehensive Guide

Introduction to Power Analysis

The post is about statistical power analysis in R. First, define the meaning of power in statistics. The power is the probability ($1-\beta$) of detecting an effect given that the effect is here. Power is the probability of correctly rejecting the null hypothesis when it is false.

Suppose, a simple study of a drug-A and a placebo. Let the drug be truly effective. The power is the probability of finding a difference between two groups (drug-A and placebo group). Imagine that a power of $1-\beta=0.8$ (having a power of 0.8 means that 80% of the time, there will be statistically significant differences between the drug-A and the placebo group, whereas there are 20% of the time, the statistically significant effect will not be obtained between two groups). Also, note that this study was conducted many times. Therefore, the probability of a Type-II error is $\beta=0.2$.

One-Sample Power

The following plot is for a one-sample one-tailed greater than t-test. In the graph below, let the null hypothesis $H_0:\mu = \mu_0$ be true, and the test statistic $t$ follows the null distribution indicated by the hashed area. Under the specific alternative hypothesis, $H_1:\mu = \mu_1$, the test statistic $t$ follows the distribution shown by solid area.

The $\alpha$ is the probability of making a type-I error (that is rejecting $H_0$ when it is true), and the “crit. Val” is the location of the $t_{crit}$ value associated with $H_0$ on the scale of the data. The rejection region is the area under $H_0$ at least as far as $crit. val.” is from $\mu_0$.

The test’s power ($1-\beta$) is the green area, the area under $H_1$ in the rejection region. A type-II error is made when $H_1$ is true, but we fail to reject $H_0$ in the red region.

Type-II Error and Power Analysis in R

#One Sample Power

x <- seq(-4, 4, length = 1000)
hx <- dnorm(x, mean = 0, sd = 1)

plot(x, hx, type = "n", xlim = c(-4, 8), ylim = c(0, 0.5),
     main = expression (paste("Type-II Error (", beta, ") and Power (", 1 - beta, ")")), 
     axes = FALSE)

# one-tailed shift
shift = qnorm (1 - 0.05, mean=0, sd = 1 )*1.7
xfit2 = x + shift
yfit2 = dnorm(xfit2, mean=shift, sd = 1 )

axis (1, at = c(-qnorm(0.05), 0, shift), labels = expression("crit. val.", mu[0], mu[1]))
axis(1, at = c(-4, 4 + shift), labels = expression(-infinity, infinity), 
     lwd = 1, lwd.tick = FALSE)

# The alternative hypothesis area 
# the red - underpowered area

lb <- min(xfit2)               # lower bound
ub <- round(qnorm(0.95), 2)    # upper bound
col1 = "#CC2222"

i <- xfit2 >= lb & xfit2 <= ub
polygon(c(lb, xfit2[i], ub), c(0, yfit2[i],0), col = col1)

# The green area where the power is
col2 = "#22CC22"
i <- xfit2 >= ub
polygon(c(ub, xfit2[i], max(xfit2)), c(0, yfit2[i], 0), col = col2)

# Outline the alternative hypothesis
lines(xfit2, yfit2, lwd = 2)

# Print null hypothesis area
col_null = "#AAAAAA"
polygon (c(min(x), x, max(x)), c(0, hx, 0), col = col_null,
         lwd = 2, density = c(10, 40), angle = -45, border = 0)

lines(x, hx, lwd = 2, lty = "dashed", col=col_null)

axis(1, at = (c(ub, max(xfit2))), labels = c("", expression(infinity)), col = col2,
     lwd = 1, lwd.tick = FALSE)

#Legend
legend("topright", inset = 0.015, title = "Color", 
       c("Null Hypothesis", "Type-II error", "Power"), fill = c(col_null, col1, col2), 
       angle = -45, density = c(20, 1000, 1000), horiz = FALSE)

abline(v=ub, lwd=2, col="#000088", lty = "dashed")
arrows(ub, 0.45, ub+1, 0.45, lwd=3, col="#008800")
arrows(ub, 0.45, ub-1, 0.45, lwd=3, col="#880000")