Curvilinear Regression in R

In this post, we will learn about some basics of curvilinear regression in R.

Curvilinear regression analysis is used to determine whether a non-linear trend exists between $X$ and $Y$.

Adding more parameters to an equation never worsens the fit to the data used to estimate it. A quadratic or cubic equation will therefore always have an $R^2$ at least as high as the linear regression model. Similarly, a cubic equation will always have an $R^2$ at least as high as a quadratic one. A higher $R^2$ on its own does not show that the extra terms are justified.
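As a rough illustration of this point, the sketch below fits linear, quadratic, and cubic models to the Growth and Nutrient values used later in this post and collects their $R^2$ values. The object names (dat, fits, r2, adj_r2) are illustrative only. The adjusted $R^2$ is shown alongside because, unlike the plain $R^2$, it penalizes extra parameters.

```r
# Example data from this post
Growth   <- c(2, 9, 11, 12, 13, 14, 17, 19, 17, 18, 20)
Nutrient <- c(2, 4, 6, 8, 10, 16, 22, 28, 30, 36, 48)
dat <- data.frame(Growth, Nutrient)

# Fit nested polynomial models of degree 1, 2 and 3
fits <- lapply(1:3, function(d) lm(Growth ~ poly(Nutrient, d), data = dat))

# Extract R-squared and adjusted R-squared from each fit
r2     <- sapply(fits, function(m) summary(m)$r.squared)
adj_r2 <- sapply(fits, function(m) summary(m)$adj.r.squared)

# One row per polynomial degree
data.frame(degree = 1:3, r2 = round(r2, 4), adj_r2 = round(adj_r2, 4))
```

The r2 column cannot decrease as the degree goes up, whereas adj_r2 may go either way.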

The logarithmic relationship can be described as follows:
$$Y = m\,\log(x) + c$$
The polynomial relationship can be described as follows:
$$Y = m_1x + m_2x^2 + m_3x^3 + \cdots + m_nx^n + c$$

The logarithmic example is more akin to a simple regression, whereas the polynomial example is multiple regression. Logarithmic relationships are common in the natural world; you may encounter them in many circumstances. Drawing the relationships between response and predictor variables as a scatter plot is generally a good starting point.
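One way to see the simple-versus-multiple distinction is to look at the design matrix that R builds for each formula. The sketch below uses the Nutrient values from this post. The names log_X and poly_X are hypothetical, and a cubic is chosen only as an example degree.

```r
# Predictor values from this post
Nutrient <- c(2, 4, 6, 8, 10, 16, 22, 28, 30, 36, 48)

# Logarithmic model: a single transformed predictor
log_X <- model.matrix(~ log(Nutrient))

# Cubic polynomial model: one column per power of the predictor
poly_X <- model.matrix(~ Nutrient + I(Nutrient^2) + I(Nutrient^3))

colnames(log_X)   # intercept plus one predictor column
colnames(poly_X)  # intercept plus three predictor columns
```

In both cases the model is still linear in its coefficients, which is why lm() can fit it.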

Consider the following data, which are related in a curvilinear form:

Growth   Nutrient
2        2
9        4
11       6
12       8
13       10
14       16
17       22
19       28
17       30
18       36
20       48

Let us perform a curvilinear regression in the R language.

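library(ggplot2)

Growth <- c(2, 9, 11, 12, 13, 14, 17, 19, 17, 18, 20)
Nutrient <- c(2, 4, 6, 8, 10, 16, 22, 28, 30, 36, 48)

# Combine the two vectors into a data frame
data <- data.frame(Growth, Nutrient)

# Scatter plot with a smoothed trend line (loess by default for small samples)
ggplot(data, aes(Nutrient, Growth)) +
  geom_point() +
  stat_smooth()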

The scatter plot suggests that the relationship is logarithmic.

Let us carry out a linear regression using the lm() function, taking the $\log$ of the predictor variable rather than the variable itself.

# lm() needs a data frame (not a matrix) for its data argument
data <- data.frame(Growth, Nutrient)
mod <- lm(Growth ~ log(Nutrient), data = data)

summary(mod)

Call:
lm(formula = Growth ~ log(Nutrient), data = data)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.2274 -0.9039  0.5400  0.9344  1.3097 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)     0.6914     1.0596   0.652     0.53    
log(Nutrient)   5.1014     0.3858  13.223 3.36e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.229 on 9 degrees of freedom
Multiple R-squared:  0.951,     Adjusted R-squared:  0.9456 
F-statistic: 174.8 on 1 and 9 DF,  p-value: 3.356e-07
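The fitted equation is approximately $Y = 5.1014\,\log(x) + 0.6914$. As a possible next step, the sketch below compares this logarithmic fit with a plain straight-line fit and predicts Growth at a few new Nutrient values. The names dat, log_mod, lin_mod, new_dat, and pred are illustrative, and the new Nutrient values (5, 20, 40) are arbitrary choices within the observed range.

```r
# Example data from this post
Growth   <- c(2, 9, 11, 12, 13, 14, 17, 19, 17, 18, 20)
Nutrient <- c(2, 4, 6, 8, 10, 16, 22, 28, 30, 36, 48)
dat <- data.frame(Growth, Nutrient)

# Logarithmic model (as above) and a straight-line model for comparison
log_mod <- lm(Growth ~ log(Nutrient), data = dat)
lin_mod <- lm(Growth ~ Nutrient, data = dat)

# R-squared of each model
c(logarithmic = summary(log_mod)$r.squared,
  linear      = summary(lin_mod)$r.squared)

# Predicted Growth with 95% confidence intervals at new Nutrient values
new_dat <- data.frame(Nutrient = c(5, 20, 40))
pred <- predict(log_mod, newdata = new_dat, interval = "confidence")
cbind(new_dat, pred)
```

Because the log transformation is written inside the formula, predict() applies it to the new Nutrient values automatically.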
