Statistical Computing and Graphics in R

# Probability Distributions in R

We often make probabilistic statements when working with statistical probability distributions. For a given distribution, we typically want to know four things:

• The density (PDF) at a particular value,
• The distribution function (CDF) at a particular value, that is, the cumulative probability up to that value,
• The quantile value corresponding to a particular probability, and
• A random draw of values from a particular distribution.

R provides functions for computing the density, the distribution function, and the quantiles of a distribution, and for generating random draws from it.

Consider a random variable $X$ which is $N(\mu = 2, \sigma^2 = 16)$. We want to:

1) Calculate the value of the PDF at $x=3$ (that is, the height of the curve at $x=3$). The three calls below are equivalent:

dnorm(x = 3, mean = 2, sd = sqrt(16))
dnorm(x = 3, mean = 2, sd = 4)
dnorm(x = 3, 2, 4)

2) Calculate the value of the CDF at $x=3$ (that is, $P(X\le 3)$)

pnorm(q = 3, mean = 2, sd = 4)

3) Calculate the quantile for probability 0.975

qnorm(p = 0.975, mean = 2, sd = 4)

4) Generate a random sample of size $n = 10$

rnorm(n = 10, mean = 2, sd = 4)
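
As a quick consistency check on the four calls above, the following sketch confirms that dnorm() agrees with the closed-form normal density and that qnorm() inverts pnorm(). The variable names (`mu`, `sigma`, `manual_pdf`, and so on) and the seed value are illustrative choices, not part of the original example.

```r
mu <- 2
sigma <- 4  # standard deviation, since the variance is 16

# Normal density at x = 3 written out from the formula
manual_pdf <- exp(-(3 - mu)^2 / (2 * sigma^2)) / (sigma * sqrt(2 * pi))
pdf_matches <- isTRUE(all.equal(manual_pdf, dnorm(3, mean = mu, sd = sigma)))

# qnorm() is the inverse of pnorm(): feeding P(X <= 3) back returns 3
p <- pnorm(3, mean = mu, sd = sigma)
x_back <- qnorm(p, mean = mu, sd = sigma)

# Fixing the seed (123 is arbitrary) makes the random sample reproducible
set.seed(123)
draws <- rnorm(10, mean = mu, sd = sigma)
```

Because rnorm() is random, calling set.seed() first is the usual way to get the same sample on every run.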

There are many probability distributions available in the R Language. I will list only a few.

Observe that a prefix (d, p, q, or r) is added to the root name of each distribution; for the normal distribution this gives dnorm, pnorm, qnorm, and rnorm.
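
The same prefix convention carries over to other distribution families in base R. The sketch below uses the binomial, Poisson, uniform, and exponential families as examples; the choice of families, the parameter values, and the variable names are assumptions made here for illustration.

```r
# Binomial with 10 trials and success probability 0.5
d_binom <- dbinom(3, size = 10, prob = 0.5)    # P(X = 3)
p_binom <- pbinom(3, size = 10, prob = 0.5)    # P(X <= 3)
q_binom <- qbinom(0.5, size = 10, prob = 0.5)  # median
set.seed(1)                                    # arbitrary seed for reproducibility
r_binom <- rbinom(5, size = 10, prob = 0.5)    # five random draws

# Other families follow the same pattern
p_pois <- ppois(2, lambda = 4)                 # Poisson: P(X <= 2)
p_unif <- punif(0.3, min = 0, max = 1)         # Uniform(0, 1): P(X <= 0.3)
q_exp  <- qexp(0.5, rate = 1)                  # Exponential(1): median, equal to log(2)
```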

Drawing the Density function

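The density function dnorm() can be used to draw the curve of a normal distribution (the d-function of any other distribution works the same way). Let us compare two normal distributions, both with mean = 20, one with sd = 6 and the other with sd = 3.

For this purpose, we need $x$-axis values that cover the wider of the two distributions, such as $\mu \pm 3\sigma \Rightarrow 20 \pm 3\times 6$, that is, from 2 to 38. The code uses the slightly wider range from 0 to 40.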

xaxis <- seq(0, 40, 0.5)
y1 <- dnorm(xaxis, 20, 6)
y2 <- dnorm(xaxis, 20, 3)
plot(xaxis, y2, type = "l", main = "comparing two normal distributions", col = "blue")
lines(xaxis, y1, col = "red")
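
The plot draws the sd = 3 curve (y2) first because plot() sets the y-axis range from the data it receives, and the narrower distribution has the taller peak. The following sketch checks the two peak heights and the probability covered by the three-standard-deviation range; the variable names are illustrative.

```r
# Peak height of each curve, evaluated at the common mean of 20
peak_sd3 <- dnorm(20, mean = 20, sd = 3)
peak_sd6 <- dnorm(20, mean = 20, sd = 6)

# Halving the standard deviation doubles the height of the peak
peak_ratio <- peak_sd3 / peak_sd6

# Probability that the sd = 6 distribution falls within 20 +/- 3 * 6
area_3sd <- pnorm(38, mean = 20, sd = 6) - pnorm(2, mean = 20, sd = 6)
```

Had the sd = 6 curve been plotted first, the top of the sd = 3 curve would have been cut off unless `ylim` were set explicitly.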

Finding Probabilities

#Left Tailed Probability
pnorm(1.96)

#Area between two Z-scores
pnorm(1.96) - pnorm(-1.96)

Finding Right Tailed Probabilities

1 - pnorm(1.96)
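
A right-tail probability can also be requested directly through the `lower.tail` argument of pnorm(), or obtained from the symmetry of the standard normal distribution. A short sketch comparing the three routes (the variable names are illustrative):

```r
right_by_complement <- 1 - pnorm(1.96)                   # complement of the left tail
right_by_argument   <- pnorm(1.96, lower.tail = FALSE)   # upper tail requested directly
right_by_symmetry   <- pnorm(-1.96)                      # P(Z >= z) equals P(Z <= -z)

# Central area between -1.96 and 1.96, roughly 0.95
central_area <- 1 - 2 * right_by_argument
```

For values far in the tail, `lower.tail = FALSE` is generally the more accurate choice, because subtracting a number very close to 1 from 1 loses precision.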

Solving a Real Problem

Suppose you took a standardized test that has a mean of 500 and a standard deviation of 100, and you scored 720. You are interested in the approximate percentile of your score on this test.

To solve this problem, assuming the scores are approximately normally distributed, find the Z-score of 720 and then use pnorm() to find the percentile of your score.

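# scale() returns a 1 x 1 matrix holding (720 - 500) / 100 = 2.2
zscore <- scale(x = 720, center = 500, scale = 100)
# The four calls below all evaluate the standard normal CDF at 2.2
pnorm(2.2)
pnorm(zscore[1, 1])
pnorm(zscore[1])
pnorm(zscore[1, ])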
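
The standardization step can also be skipped by passing the mean and standard deviation to pnorm() directly. A minimal sketch, again assuming normally distributed scores (the variable names are illustrative):

```r
score <- 720

# Route 1: standardize by hand, then use the standard normal CDF
z <- (score - 500) / 100
pct_from_z <- pnorm(z)

# Route 2: let pnorm() standardize internally
pct_direct <- pnorm(score, mean = 500, sd = 100)

# Express the result as a percentile, about 98.6
percentile <- round(100 * pct_direct, 1)
```

Both routes give the same answer: a score of 720 sits at roughly the 98.6th percentile.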
