Sometimes when working with linear regression we need to check for non-linearity in the data. One way to do this is to fit a polynomial model and check whether it fits the data better than a linear model. There are other reasons, such as theoretical, that indicate to fit a quadratic or higher order model because it is believed that the variables relationship is inherently polynomial in nature.
Let's fit a quadratic model for the mtcars
dataset. For a linear model see Linear regression on the mtcars dataset.
First we make a scatter plot of the variables mpg
(Miles/gallon), disp
(Displacement (cu.in.)), and wt
(Weight (1000 lbs)). The relationship among mpg
and disp
appears non-linear.
plot(mtcars[,c("mpg","disp","wt")])
A linear fit will show that disp
is not significant.
fit0 = lm(mpg ~ wt+disp, mtcars)
summary(fit0)
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
#(Intercept) 34.96055 2.16454 16.151 4.91e-16 ***
#wt -3.35082 1.16413 -2.878 0.00743 **
#disp -0.01773 0.00919 -1.929 0.06362 .
#---
#Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#Residual standard error: 2.917 on 29 degrees of freedom
#Multiple R-squared: 0.7809, Adjusted R-squared: 0.7658
Then, to get the result of a quadratic model, we added I(disp^2)
. The new model appears better when looking at R^2
and all variables are significant.
fit1 = lm(mpg ~ wt+disp+I(disp^2), mtcars)
summary(fit1)
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
#(Intercept) 41.4019837 2.4266906 17.061 2.5e-16 ***
#wt -3.4179165 0.9545642 -3.581 0.001278 **
#disp -0.0823950 0.0182460 -4.516 0.000104 ***
#I(disp^2) 0.0001277 0.0000328 3.892 0.000561 ***
#---
#Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#Residual standard error: 2.391 on 28 degrees of freedom
#Multiple R-squared: 0.8578, Adjusted R-squared: 0.8426
As we have three variables, the fitted model is a surface represented by:
mpg = 41.4020-3.4179*wt-0.0824*disp+0.0001277*disp^2
Another way to specify polynomial regression is using poly
with parameter raw=TRUE
, otherwise orthogonal polynomials will be considered (see the help(ploy)
for more information). We get the same result using:
summary(lm(mpg ~ wt+poly(disp, 2, raw=TRUE),mtcars))
Finally, what if we need to show a plot of the estimated surface? Well there are many options to make 3D
plots in R
. Here we use Fit3d
from p3d
package.
library(p3d)
Init3d(family="serif", cex = 1)
Plot3d(mpg ~ disp+wt, mtcars)
Axes3d()
Fit3d(fit1)