Statistical Models in R Language: Secrets

The R language provides an interlocking suite of facilities that make fitting statistical models very simple. The output printed for a fitted model is minimal, and further details are obtained by calling extractor functions.
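As a minimal sketch of what "extractor functions" means in practice, the following fits a simple regression on the built-in cars data set (chosen here only for illustration) and then asks for details with coef(), summary(), residuals(), and anova().

```r
# cars ships with R in the datasets package; it is used here only as an example
fit <- lm(dist ~ speed, data = cars)

fit                      # printing the fit shows only the call and the coefficients

coef(fit)                # estimated regression coefficients
summary(fit)             # standard errors, t-tests, R-squared, ...
head(residuals(fit))     # first few residuals
anova(fit)               # analysis of variance table
```

Each extractor returns an ordinary R object, so the results can be stored and reused rather than only read off the console.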

Defining Statistical Models in R Language

The template for a statistical model is a linear regression model with independent, homoscedastic errors, that is
$$y_i = \sum_{j=0}^p \beta_j x_{ij} + e_i, \quad e_i \sim NID(0, \sigma^2), \quad i=1,2,\dots, n$$

In matrix form, the statistical model can be written as

$$y=X\beta+e$$

where $y$ is the dependent (response) variable and $X$ is the model matrix or design matrix (matrix of regressors), whose columns $x_0, x_1, \cdots, x_p$ are the determining variables. Usually, $x_0$ is a column of ones defining an intercept term in the statistical model.
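The model matrix can be inspected directly with model.matrix(). The sketch below uses a small made-up data set (the numbers are assumptions for illustration only) to show the column of ones for the intercept, and checks that solving the normal equations with that matrix gives the same coefficients as lm().

```r
# Small illustrative data set (values are made up for this sketch)
x <- c(1.2, 2.5, 3.1, 4.8, 5.0)
y <- c(2.3, 4.1, 5.2, 8.9, 9.7)

# Model matrix for y ~ x: a column of ones "(Intercept)" and the column "x"
X <- model.matrix(y ~ x)
print(X)

# Least-squares estimate from the normal equations (X'X) beta = X'y
beta_hat <- solve(crossprod(X), crossprod(X, y))

# The same coefficients as reported by lm()
fit <- lm(y ~ x)
print(cbind(normal_equations = drop(beta_hat), lm = coef(fit)))
```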

Statistical Model Examples

Suppose $y, x, x_0, x_1, x_2, \cdots$ are numeric variables and $X$ is a matrix. The following examples specify statistical models in R.

  • y ~ x    or   y ~ 1 + x
    Both formulas specify the same simple linear regression model of $y$ on $x$. The first formula has an implicit intercept term and the second has an explicit one.
  • y ~ 0 + x  or  y ~ -1 + x  or  y ~ x - 1
    All of these specify the same simple linear regression model of $y$ on $x$ through the origin, that is, without an intercept term.
  • log(y) ~ x1 + x2
    Specifies a multiple regression of the transformed variable $\log(y)$ on $x_1$ and $x_2$, with an implicit intercept term.
  • y ~ poly(x, 2)  or  y ~ 1 + x + I(x^2)
    Both specify a polynomial regression model of $y$ on $x$ of degree 2. The first formula uses orthogonal polynomials and the second uses explicit powers as the basis.
  • y ~ X + poly(x, 2)
    Specifies a multiple regression of $y$ with a model matrix consisting of the matrix $X$ as well as polynomial terms in $x$ to degree 2.
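The formulas listed above can be tried out on simulated data. The sketch below is only an illustration: the data-generating values and the object names (fit_implicit, fit_origin, and so on) are assumptions made for this example.

```r
set.seed(1)                                      # arbitrary seed, for reproducibility only
x <- runif(50, 1, 10)
y <- exp(0.5 + 0.2 * x + rnorm(50, sd = 0.1))    # simulated, strictly positive response

# Implicit and explicit intercept give the same model
fit_implicit <- lm(y ~ x)
fit_explicit <- lm(y ~ 1 + x)
all.equal(coef(fit_implicit), coef(fit_explicit))

# Regression through the origin: no "(Intercept)" coefficient
fit_origin <- lm(y ~ 0 + x)
names(coef(fit_origin))

# Regression of a transformed response
fit_log <- lm(log(y) ~ x)
coef(fit_log)

# Degree-2 polynomial: orthogonal basis versus explicit powers
fit_poly <- lm(y ~ poly(x, 2))
fit_raw  <- lm(y ~ 1 + x + I(x^2))
all.equal(fitted(fit_poly), fitted(fit_raw))     # same fitted values, different coefficients
```

The two polynomial fits have different coefficients because their bases differ, but they describe the same fitted curve.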

Note that the operator ~ defines a model formula in R. The form of an ordinary linear regression model is $response \sim op_1\,\, term_1\,\, op_2\,\, term_2\,\, op_3\,\, term_3\,\, \cdots$,

where

  • The response is a vector or matrix defining the response (dependent) variable(s).
  • $op_i$ is an operator, either + or -, implying the inclusion or exclusion of a term in the model. The first operator is optional.
  • $term_i$ is either a vector or matrix expression, or 1, or a factor, or a formula expression consisting of factors, vectors, or matrices connected by formula operators.
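The effect of the + and - operators can be seen without fitting anything, by inspecting the formula itself. In this sketch the variable names y, x1, x2, and x3 are hypothetical and exist only inside the formula; update() and terms() are standard functions from the stats package.

```r
# Hypothetical variable names, used only to build formulas (no data needed)
f_full <- y ~ x1 + x2 + x3

# "-" excludes a term: drop x2 and the intercept from the full formula
f_reduced <- update(f_full, . ~ . - x2 - 1)
print(f_reduced)

tt <- terms(f_reduced)
attr(tt, "term.labels")   # terms that remain in the model
attr(tt, "intercept")     # 1 if an intercept is present, 0 otherwise
```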

FAQs about Statistical Models in R

  1. How are statistical models specified in the R language?
  2. How is linear regression performed in R using a formula?
  3. How can linear regression be performed without an intercept in R?
  4. How can polynomial regression be performed in R?
  5. Write about the ~ operator in R.
Statistical Models in R Language R FAQs https://rfaqs.com

https://gmstat.com
https://itfeature.com

How to View the Source Code of an R Method/Function?

This article is about viewing the source code of an R method or function. There are several ways to do so, depending on how the function is defined, and reading the source helps in understanding how the function works.

Source Code of R Method (Internal Functions)

If you want to see the source code of an R method or an internal function (a function from the base packages), just type the name of the function at the R prompt, for example:

rowMeans

Functions or Methods from the S3 Class System

For S3 classes, the methods function can be used to list the methods for a particular generic function or class.

methods(predict)

The note “Non-visible functions are asterisked” in the output means that the methods marked with an asterisk are not exported from their package’s namespace.

One can still view the source code of such a method via the ::: operator, such as

stats:::predict.lm

or by using the getAnywhere() function, such as

getAnywhere(predict.lm)

Note that the getAnywhere() function is useful because you don’t need to know which package the function or method comes from.
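As a further sketch, getS3method() (from the utils package) retrieves a method from its generic and class names, and the object returned by getAnywhere() records where the function was found. The variable names f1 and ga are assumptions for this example.

```r
# getS3method() looks up the method for a generic/class pair,
# whether or not the method is exported
f1 <- getS3method("predict", "lm")
is.function(f1)

# getAnywhere() searches the loaded namespaces and the search path
ga <- getAnywhere("predict.lm")
ga$name      # the name that was looked up
ga$where     # the places where an object of that name was found

# Arguments of the method, without printing the whole body
names(formals(f1))
```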

Functions or Methods from the S4 Class System

The S4 system is a newer method dispatch system and an alternative to the S3 system. The Matrix package is an example of a package that defines S4 functions.

library(Matrix)
chol2inv

The output already offers a lot of information: standardGeneric is an indicator of an S4 function. To list the S4 methods that are defined for it, use showMethods(), that is:

showMethods(chol2inv)

The getMethod() function can be used to see the source code of one of the methods, such as

getMethod("chol2inv", "diagonalMatrix")
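The same inspection steps can be tried on a self-contained S4 generic that does not depend on the installed version of Matrix. In the sketch below the generic area and the class Circle are hypothetical, defined only for illustration.

```r
library(methods)  # usually attached already; loaded here to be explicit

# Hypothetical generic and class, defined only for this illustration
setGeneric("area", function(shape) standardGeneric("area"))
setClass("Circle", representation(r = "numeric"))
setMethod("area", "Circle", function(shape) pi * shape@r^2)

area                           # printing shows standardGeneric: an S4 generic
isGeneric("area")              # confirms that area is an S4 generic
showMethods("area")            # lists the signatures that have a method
getMethod("area", "Circle")    # prints the source of that one method
existsMethod("area", "Circle") # checks for a method before asking for it
```

Checking with existsMethod() first avoids the error that getMethod() raises when no method is defined for the requested signature.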

View Source Code of Unexported Functions

Some functions are not exported from their namespace, for example .cbindts and .makeNamesTs, which are helper functions called by ts.union in the stats namespace. One can view the source code of such unexported functions using the ::: operator or the getAnywhere() function, for example:

stats:::.makeNamesTs
getAnywhere(.makeNamesTs)
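To find out which functions of a package are unexported in the first place, the namespace contents can be compared with the list of exports. This is a sketch under the assumption that the stats namespace is the one of interest; the variable names are illustrative.

```r
ns <- asNamespace("stats")

all_objects <- ls(ns, all.names = TRUE)     # everything defined in the namespace
exported    <- getNamespaceExports("stats") # what the package makes visible
unexported  <- setdiff(all_objects, exported)

# Keep only the functions among the unexported objects
unexported_funs <- Filter(function(nm) is.function(get(nm, envir = ns)), unexported)
head(unexported_funs)

# Fetch one of them by name, here simply the first in the list
obj <- getFromNamespace(unexported_funs[1], "stats")
is.function(obj)
```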


Greek Letters in R Plot Label and Title

Introduction to Greek Letters in R Plot

This post is about writing Greek letters in R plots, in their axis labels, and in their titles. There are two main ways to include Greek letters in R plot annotations (axis labels, title, legend):

  1. Using the expression() function
    This is the recommended approach as it provides more flexibility and control over the formatting of the Greek letters and mathematical expressions.
  2. Using raw Greek letter codes
    This method is less common and requires memorizing the character code for each Greek letter.
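The two approaches can be compared side by side. In the sketch below, "raw codes" is taken to mean Unicode escapes such as "\u03b2" for beta, which is an assumption about what the second method refers to; whether the character is drawn correctly depends on the graphics device and font.

```r
set.seed(42)                 # arbitrary seed so the sketch is reproducible
z <- rnorm(200)

# 1. plotmath: names such as alpha, beta, mu inside expression() are drawn as Greek letters
lab_expr <- expression(beta)

# 2. Unicode escape: "\u03b2" is the code point of the lowercase Greek letter beta
lab_code <- "\u03b2"

op <- par(mfrow = c(1, 2))   # two plots side by side
hist(z, main = lab_expr, xlab = "z")
hist(z, main = lab_code, xlab = "z")
par(op)                      # restore the previous layout
```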

Question: How can one include Greek letters (symbols) in R plot labels?
Answer: Greek letters or symbols can be included in the titles and labels of a graph using the expression() function. Some examples follow.

Note that in these examples random data is generated from a normal distribution. You can use your own data set to produce graphs that have symbols or Greek letters in their labels or titles.

Greek Letters in R Plot

The following are a few examples of writing Greek letters in R plot.

Example 1: Draw Histogram

mycoef <- rnorm(1000)
hist(mycoef, main = expression(beta) )

where beta inside expression() is drawn as the Greek letter $\beta$. The code produces a histogram similar to the following.

Figure: histogram of mycoef with the Greek letter β as its title.

Example 2:

# named sample_values to avoid masking the base function sample()
sample_values <- rnorm(n = 100, mean = 5, sd = 1)
hist(sample_values, main = expression(paste("sampled values, ", mu, "=5, ", sigma, "=1")))

where mu and sigma are drawn as the symbols $\mu$ and $\sigma$ respectively. The histogram will look like the following.

Figure: histogram of the sampled values with the title "sampled values, μ=5, σ=1".
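In Example 2 the values 5 and 1 are typed into the title by hand. One possible alternative, shown here only as a sketch, is bquote(), which substitutes the current values of R variables into a plotmath expression. The variable names m, s, vals, and ttl are assumptions made for this example.

```r
set.seed(123)                      # arbitrary seed
m <- 5                             # assumed mean, as in Example 2
s <- 1                             # assumed standard deviation, as in Example 2
vals <- rnorm(n = 100, mean = m, sd = s)

# .(m) and .(s) are replaced by the values of m and s;
# mu and sigma are left as symbols and drawn as Greek letters
ttl <- bquote(paste("sampled values, ", mu == .(m), ", ", sigma == .(s)))

hist(vals, main = ttl)
```

If m or s changes, the title changes with it, so the annotation cannot drift out of step with the simulated data.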

Example 3:

curve(dnorm, from= -3, to=3, n=1000, main="Normal Probability Density Function")

will produce a curve of the Normal probability density function over the range $-3$ to $3$.

Figure: curve of the standard Normal probability density function from -3 to 3.

Normal Density Function

To add the formula of the normal density function to the plot, use the text() function together with paste() inside expression(), that is

text(-2, 0.3, expression(f(x) == paste(frac(1, sqrt(2*pi* sigma^2 ) ), " ", e^{frac(-(x-mu)^2, 2*sigma^2)})), cex=1.2)

The updated curve of the Normal probability density function now carries the formula.

Figure: Normal probability density curve annotated with the density formula.

Example 4:

x <- dnorm( seq(-3, 3, 0.001))
plot(seq(-3, 3, 0.001), cumsum(x)/sum(x), 
           type="l", col="blue", xlab="x", 
           main="Normal Cumulative Distribution Function")

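Here the cumulative distribution function is approximated numerically by the normalized cumulative sum of the density values. The Normal cumulative distribution function will look like the following.

Figure: curve of the Normal cumulative distribution function from -3 to 3.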
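The cumulative-sum curve of Example 4 can be checked against pnorm(), the exact standard normal distribution function in R. The following is a sketch with illustrative variable names; the dashed red line is the exact curve.

```r
grid <- seq(-3, 3, 0.001)
dens <- dnorm(grid)

cdf_approx <- cumsum(dens) / sum(dens)   # numerical approximation used in Example 4
cdf_exact  <- pnorm(grid)                # exact standard normal CDF

max(abs(cdf_approx - cdf_exact))         # largest discrepancy over the grid

plot(grid, cdf_approx, type = "l", col = "blue", xlab = "x", ylab = "F(x)",
     main = "Normal Cumulative Distribution Function")
lines(grid, cdf_exact, col = "red", lty = 2)
legend("topleft", legend = c("cumsum approximation", "pnorm"),
       col = c("blue", "red"), lty = c(1, 2))
```

The approximation is forced to run from 0 to 1 on the interval from -3 to 3, so it differs slightly from pnorm() in the tails, where a small part of the probability lies outside the grid.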

To add the formula, use the text() function together with paste() inside expression(), that is

text(-1.5, 0.7, 
       expression(Phi(x) == paste(frac(1, sqrt(2*pi)), " ", 
       integral(e^(-t^2/2)*dt, -infinity, x))), cex = 1.2)

The curve of the Normal cumulative distribution function with its formula in the plot will look like this.

Figure: Normal cumulative distribution curve annotated with the distribution function formula.
