Introduction to Power Analysis
The post is about statistical power analysis in R. First, define the meaning of power in statistics. The power is the probability ($1-\beta$) of detecting an effect given that the effect is here. Power is the probability of correctly rejecting the null hypothesis when it is false.
Suppose, a simple study of a drug-A and a placebo. Let the drug be truly effective. The power is the probability of finding a difference between two groups (drug-A and placebo group). Imagine that a power of $1-\beta=0.8$ (having a power of 0.8 means that 80% of the time, there will be statistically significant differences between the drug-A and the placebo group, whereas there are 20% of the time, the statistically significant effect will not be obtained between two groups). Also, note that this study was conducted many times. Therefore, the probability of a Type-II error is $\beta=0.2$.
One-Sample Power
The following plot is for a one-sample one-tailed greater than t-test. In the graph below, let the null hypothesis $H_0:\mu = \mu_0$ be true, and the test statistic $t$ follows the null distribution indicated by the hashed area. Under the specific alternative hypothesis, $H_1:\mu = \mu_1$, the test statistic $t$ follows the distribution shown by solid area.
The $\alpha$ is the probability of making a type-I error (that is rejecting $H_0$ when it is true), and the “crit. Val” is the location of the $t_{crit}$ value associated with $H_0$ on the scale of the data. The rejection region is the area under $H_0$ at least as far as $crit. val.” is from $\mu_0$.
The test’s power ($1-\beta$) is the green area, the area under $H_1$ in the rejection region. A type-II error is made when $H_1$ is true, but we fail to reject $H_0$ in the red region.
Type-II Error and Power Analysis in R
#One Sample Power x <- seq(-4, 4, length = 1000) hx <- dnorm(x, mean = 0, sd = 1) plot(x, hx, type = "n", xlim = c(-4, 8), ylim = c(0, 0.5), main = expression (paste("Type-II Error (", beta, ") and Power (", 1 - beta, ")")), axes = FALSE) # one-tailed shift shift = qnorm (1 - 0.05, mean=0, sd = 1 )*1.7 xfit2 = x + shift yfit2 = dnorm(xfit2, mean=shift, sd = 1 ) axis (1, at = c(-qnorm(0.05), 0, shift), labels = expression("crit. val.", mu[0], mu[1])) axis(1, at = c(-4, 4 + shift), labels = expression(-infinity, infinity), lwd = 1, lwd.tick = FALSE) # The alternative hypothesis area # the red - underpowered area lb <- min(xfit2) # lower bound ub <- round(qnorm(0.95), 2) # upper bound col1 = "#CC2222" i <- xfit2 >= lb & xfit2 <= ub polygon(c(lb, xfit2[i], ub), c(0, yfit2[i],0), col = col1) # The green area where the power is col2 = "#22CC22" i <- xfit2 >= ub polygon(c(ub, xfit2[i], max(xfit2)), c(0, yfit2[i], 0), col = col2) # Outline the alternative hypothesis lines(xfit2, yfit2, lwd = 2) # Print null hypothesis area col_null = "#AAAAAA" polygon (c(min(x), x, max(x)), c(0, hx, 0), col = col_null, lwd = 2, density = c(10, 40), angle = -45, border = 0) lines(x, hx, lwd = 2, lty = "dashed", col=col_null) axis(1, at = (c(ub, max(xfit2))), labels = c("", expression(infinity)), col = col2, lwd = 1, lwd.tick = FALSE) #Legend legend("topright", inset = 0.015, title = "Color", c("Null Hypothesis", "Type-II error", "Power"), fill = c(col_null, col1, col2), angle = -45, density = c(20, 1000, 1000), horiz = FALSE) abline(v=ub, lwd=2, col="#000088", lty = "dashed") arrows(ub, 0.45, ub+1, 0.45, lwd=3, col="#008800") arrows(ub, 0.45, ub-1, 0.45, lwd=3, col="#880000")