## lm Function in R

Many generic functions are available for computing regression coefficients, testing coefficients, and computing residuals or predicted values, among other tasks. Therefore, a good grasp of the `lm()` function is necessary. Suppose we have performed the regression analysis using the `lm()` function, as done in the previous lesson.

mod <- lm(mpg ~ hp, data = mtcars)

The object returned by the `lm()` function has class “lm”. Objects of class “lm” have mode “list”, that is, they are stored as lists.

class(mod)

The names of the components of an “lm” object can be queried via

names(mod)

All the components of an “lm” object can be accessed directly. For example,

mod$rank
mod$coef # or mod$coefficients
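As a small illustration of direct access, the sketch below refits the same model so that it runs on its own. It shows that `mod$coef` only works because `$` partially matches the component name `coefficients`, and that the extractor `coef()` returns the same vector.

```r
# Refit the model from the text so the example is self-contained
mod <- lm(mpg ~ hp, data = mtcars)

# An "lm" object is a list underneath, so `$` and `[[` both work
mode(mod)
mod[["rank"]]

# `mod$coef` relies on partial matching of the name "coefficients";
# the extractor coef() returns the same vector
identical(mod$coef, mod$coefficients)
identical(coef(mod), mod$coefficients)
```

Extractor functions such as `coef()` are usually the safer choice, since they do not depend on the internal layout of the object.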

The following is a list of some generic functions for a fitted “lm” model.

| Generic Function | Short Description |
|---|---|
| `print()` | prints or displays the results in the R Console |
| `summary()` | prints or displays regression coefficients, their standard errors, t-ratios, p-values, and significance |
| `coef()` | extracts regression coefficients |
| `residuals()` or `resid()` | extracts residuals of the fitted model |
| `fitted()` or `fitted.values()` | extracts fitted values |
| `anova()` | performs comparisons of nested models |
| `predict()` | computes predicted values for new data |
| `plot()` | draws diagnostic plots of the regression model |
| `confint()` | computes the confidence intervals for regression coefficients |
| `deviance()` | computes the residual sum of squares |
| `vcov()` | computes the estimated variance-covariance matrix |
| `logLik()` | computes the log-likelihood |
| `AIC()`, `BIC()` | compute information criteria |
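The sketch below exercises several of these generics on the model from the text. The second model `mod2`, with the extra regressor `wt` from `mtcars`, is an assumption introduced only to illustrate a nested comparison with `anova()`.

```r
# Refit the model from the text so the example is self-contained
mod <- lm(mpg ~ hp, data = mtcars)

coef(mod)          # intercept and slope
head(resid(mod))   # first few residuals
head(fitted(mod))  # first few fitted values

# For an "lm" fit, deviance() is the residual sum of squares
rss <- sum(resid(mod)^2)
all.equal(deviance(mod), rss)

# Standard errors are the square roots of the diagonal of vcov()
se <- sqrt(diag(vcov(mod)))
se

# Nested comparison: does adding `wt` (illustrative choice) improve the fit?
mod2 <- lm(mpg ~ hp + wt, data = mtcars)
anova(mod, mod2)

# Likelihood-based summaries
logLik(mod)
AIC(mod)
BIC(mod)
```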

It is better to save the object returned by the `summary()` function. The `summary()` function returns an object of class “summary.lm”, and its components can be queried via

sum_mod <- summary(mod)
names(sum_mod)
names(summary(mod))

The components of the `summary()` object can be obtained as

sum_mod$residuals
sum_mod$r.squared
sum_mod$adj.r.squared
sum_mod$df
sum_mod$sigma
sum_mod$fstatistic
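A saved summary object is also convenient for pulling out individual test results. The following sketch reads the coefficient table and recomputes the p-value of the overall F test from the `fstatistic` component; the names `coef_tab`, `f`, and `p_f` are illustrative only.

```r
# Refit the model from the text so the example is self-contained
mod <- lm(mpg ~ hp, data = mtcars)
sum_mod <- summary(mod)

# Coefficient table: estimate, standard error, t value, p-value
coef_tab <- sum_mod$coefficients
coef_tab

# p-value of the slope for hp
coef_tab["hp", "Pr(>|t|)"]

# Overall F test: fstatistic holds the value, numerator df, denominator df
f <- sum_mod$fstatistic
p_f <- pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)
unname(p_f)
```

With a single regressor, the overall F test and the t test of the slope are equivalent, so the two p-values should agree.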

The confidence interval for estimated coefficients can be computed as

confint(mod, level = 0.95)

Note that the `level` argument is optional when the confidence level is 95% (significance level of 5%), because 0.95 is its default value.
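To see what `confint()` computes for a linear model, the interval can be rebuilt by hand from the estimates, their standard errors, and a t critical value. This is a minimal sketch; `est`, `se`, `tcrit`, and `manual_ci` are illustrative names.

```r
# Refit the model from the text so the example is self-contained
mod <- lm(mpg ~ hp, data = mtcars)

est   <- coef(mod)                           # point estimates
se    <- sqrt(diag(vcov(mod)))               # standard errors
tcrit <- qt(0.975, df = df.residual(mod))    # two-sided 95% critical value

manual_ci <- cbind(lower = est - tcrit * se,
                   upper = est + tcrit * se)
manual_ci

# Should agree with confint() apart from the column names
all.equal(unname(manual_ci), unname(confint(mod)))
```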

The confidence interval for the mean response and the prediction interval for an individual response, for `hp` (regressor) equal to 200 and 160, can be computed as

predict(mod, newdata = data.frame(hp = c(200, 160)), interval = "confidence")
predict(mod, newdata = data.frame(hp = c(200, 160)), interval = "prediction")
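The two interval types share the same point prediction but differ in width, since a prediction interval also accounts for the error of a single new observation. A short sketch of the comparison, with `conf_int` and `pred_int` as illustrative names:

```r
# Refit the model from the text so the example is self-contained
mod <- lm(mpg ~ hp, data = mtcars)
new_hp <- data.frame(hp = c(200, 160))

conf_int <- predict(mod, newdata = new_hp, interval = "confidence")
pred_int <- predict(mod, newdata = new_hp, interval = "prediction")

# Same point predictions in both cases
all.equal(conf_int[, "fit"], pred_int[, "fit"])

# Interval widths: the prediction intervals are wider
cbind(conf_width = conf_int[, "upr"] - conf_int[, "lwr"],
      pred_width = pred_int[, "upr"] - pred_int[, "lwr"])
```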

The prediction intervals can be used for computing and visualizing prediction bands around the fitted line. For example,

x <- seq(50, 350, length.out = 32)
# the column in newdata must carry the regressor name used in the model (hp)
pred <- predict(mod, newdata = data.frame(hp = x), interval = "prediction")
plot(mpg ~ hp, data = mtcars)
lines(pred[, 1] ~ x, col = 1) # fitted values
lines(pred[, 2] ~ x, col = 2) # lower limit
lines(pred[, 3] ~ x, col = 2) # upper limit
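Confidence and prediction bands can also be drawn together on one plot. The following is a sketch only: the grid size of 100 points, the colours, and the line types are arbitrary choices, and `grid`, `conf_band`, and `pred_band` are illustrative names.

```r
# Refit the model from the text so the example is self-contained
mod <- lm(mpg ~ hp, data = mtcars)

x <- seq(50, 350, length.out = 100)
grid <- data.frame(hp = x)   # column name must match the regressor

conf_band <- predict(mod, newdata = grid, interval = "confidence")
pred_band <- predict(mod, newdata = grid, interval = "prediction")

plot(mpg ~ hp, data = mtcars)
lines(x, conf_band[, "fit"])                                  # fitted line
matlines(x, conf_band[, c("lwr", "upr")], lty = 2, col = 4)   # confidence band
matlines(x, pred_band[, c("lwr", "upr")], lty = 3, col = 2)   # prediction band
legend("topright", legend = c("fit", "confidence", "prediction"),
       lty = 1:3, col = c(1, 4, 2))
```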

For diagnostic plots, the `plot()` function can be used. By default, it provides four graphs:

- residuals vs fitted values
- QQ plot of standardized residuals
- scale-location plot of the square root of the absolute standardized residuals against fitted values
- standardized residuals vs leverage

To draw only the QQ plot, for example, use

plot(mod, which = 2)

The `which` argument selects which diagnostic plots are drawn. The four default plots listed above correspond to `which = c(1, 2, 3, 5)`.
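To view the four default diagnostic plots at once, the plotting region can be split into a 2 x 2 grid before calling `plot()`. A minimal sketch, where `op` is an illustrative name for the saved graphics settings:

```r
# Refit the model from the text so the example is self-contained
mod <- lm(mpg ~ hp, data = mtcars)

op <- par(mfrow = c(2, 2))  # 2 x 2 grid; keep the old settings
plot(mod)                   # the four default diagnostic plots
par(op)                     # restore the previous graphics settings
```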