R Language Formula Create Linear, Quadratic and Second Order Interaction Terms

Example

y ~ . : Here . is interpreted as all variables except y in the data frame used in fitting the model. It is equivalent to the sum of the main effects of all predictor variables, for example y ~ var1 + var2 + var3+...+var15

y ~ . ^ 2 will give all linear (main effects) and second order interaction terms of the variables in the data frame. It is equivalent to y ~ var1 + var2 + ...+var15 + var1:var2 + var1:var3 + var1:var4...and so on
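The expansion of the dot can be inspected without fitting a model. The following is a minimal sketch, assuming a hypothetical data frame df with a response y and three predictors (the data are random and only the column names matter):

```r
# Hypothetical data frame: response y plus three predictors
set.seed(1)
df <- data.frame(y = rnorm(20), var1 = rnorm(20),
                 var2 = rnorm(20), var3 = rnorm(20))

# terms() expands the dot against the columns of df
main_terms <- attr(terms(y ~ ., data = df), "term.labels")
int_terms  <- attr(terms(y ~ .^2, data = df), "term.labels")

print(main_terms)  # main effects only
print(int_terms)   # main effects plus all pairwise interactions
```

With three predictors, y ~ .^2 yields three main effects and three pairwise interaction terms.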

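y ~ var1 + var2 + ...+var15 + I(var1^2) + I(var2^2) + I(var3^2)...+I(var15^2) : Here I(var^2) adds the square of a single variable in the data frame as a quadratic term. The I() wrapper is needed because ^ has a special meaning inside a formula (it denotes crossing, as in .^2 above), so var^2 without I() would not be read as arithmetic squaring.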
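To see why the I() wrapper matters, the term labels of a formula with and without it can be compared. This is a small sketch using a hypothetical single predictor x:

```r
# Without I(), ^ is the formula crossing operator, so x^2 collapses to x
plain   <- attr(terms(y ~ x + x^2), "term.labels")

# With I(), x^2 is evaluated arithmetically and kept as its own term
wrapped <- attr(terms(y ~ x + I(x^2)), "term.labels")

print(plain)
print(wrapped)
```

Only the wrapped version carries a separate quadratic term into the model.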

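y ~ poly(var1, degree = 2) + poly(var2, degree = 2)+...+poly(var15, degree = 2) fits the same model as the expression above.

poly(var1, degree = 2) spans the same space as var1 + I(var1^2). By default poly() uses orthogonal polynomials, so the individual coefficients differ; poly(var1, degree = 2, raw = TRUE) produces exactly the columns var1 and var1^2.

y ~ poly(var1, var2, var3, ....var15, degree = 2) is not equivalent to the expression above. When several variables are passed in one call, poly() returns all terms up to total degree 2, so it includes the pairwise products of the variables in addition to the linear and squared terms.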

To get cubic polynomials, use degree = 3 in poly().
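When several variables are passed to a single poly() call, the result also contains cross-product columns, not just one polynomial per variable. A quick check, assuming two hypothetical numeric vectors x1 and x2 and using raw = TRUE so the columns are plain powers and products:

```r
set.seed(7)
x1 <- rnorm(10)  # hypothetical predictors
x2 <- rnorm(10)

# One call with two variables: all terms of total degree <= 2
P <- poly(x1, x2, degree = 2, raw = TRUE)

# Column names encode the power of each variable, e.g. "1.1" is x1 * x2
print(colnames(P))
n_cols    <- ncol(P)
has_cross <- "1.1" %in% colnames(P)
```

Two variables at degree 2 give five columns: x1, x1^2, x2, x2^2 and the product x1 * x2.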

There is a caveat in using poly() versus I(var^2): after fitting the model, the two produce different coefficients, but the fitted values are equivalent, because they are different parameterizations of the same model. It is recommended to use I(var^2) over poly() to avoid the summary effect seen in poly(), where the coefficients reported by summary() refer to the orthogonal polynomial basis rather than to var and var^2 directly.
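The difference between the two parameterizations can be checked on simulated data. The sketch below assumes a hypothetical predictor x and a response y generated from a quadratic curve plus noise:

```r
set.seed(42)
x <- runif(50, 0, 10)                        # hypothetical predictor
y <- 1 + 2 * x - 0.3 * x^2 + rnorm(50)       # hypothetical response

fit_I    <- lm(y ~ x + I(x^2))                        # raw quadratic term
fit_poly <- lm(y ~ poly(x, degree = 2))               # orthogonal basis (default)
fit_raw  <- lm(y ~ poly(x, degree = 2, raw = TRUE))   # raw basis via poly()

# Same fitted values for both parameterizations
same_fit  <- isTRUE(all.equal(unname(fitted(fit_I)), unname(fitted(fit_poly))))
# Different coefficients for the orthogonal basis
same_coef <- isTRUE(all.equal(unname(coef(fit_I)), unname(coef(fit_poly))))
# raw = TRUE reproduces the I() coefficients
same_raw  <- isTRUE(all.equal(unname(coef(fit_I)), unname(coef(fit_raw))))

print(c(same_fit = same_fit, same_coef = same_coef, same_raw = same_raw))
```

The fitted values agree across all three models, while only the raw = TRUE version shares its coefficients with the I() formulation.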

In summary, to get linear, quadratic and second order interaction terms, you will have an expression like

y ~ .^2 + I(var1^2) + I(var2^2)+...I(var15^2)
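Typing out every squared term by hand becomes tedious with many variables. One possible way to build the same expression programmatically is sketched below, assuming hypothetical predictors var1 to var4 and a response column y; reformulate() pastes the term labels together into a formula:

```r
set.seed(3)
vars <- paste0("var", 1:4)  # hypothetical predictor names

# Hypothetical data frame with columns y, var1, ..., var4
df <- as.data.frame(matrix(rnorm(10 * 5), ncol = 5,
                           dimnames = list(NULL, c("y", vars))))

# "I(var1^2)", "I(var2^2)", ...
squares <- paste0("I(", vars, "^2)")

# y ~ .^2 + I(var1^2) + ... + I(var4^2)
f <- reformulate(c(".^2", squares), response = "y")
print(f)

# Expand the dot against df to list every term
labels <- attr(terms(f, data = df), "term.labels")
print(labels)
```

For four predictors this gives 4 linear, 4 quadratic and 6 interaction terms; the formula f can then be passed to a fitting function such as lm(f, data = df).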

Demo for four variables:

# Write both formulas directly; reformulate() expects term labels, not a full formula string
old <- y ~ x1 + x2 + x3 + x4
new <- y ~ .^2 + I(x1^2) + I(x2^2) + I(x3^2) + I(x4^2)
# update() substitutes the right-hand side of old for the dot in new
tmp <- update(old, new)
terms(tmp, simplify = TRUE)

# y ~ x1 + x2 + x3 + x4 + I(x1^2) + I(x2^2) + I(x3^2) + I(x4^2) + 
#   x1:x2 + x1:x3 + x1:x4 + x2:x3 + x2:x4 + x3:x4
# attr(,"variables")
# list(y, x1, x2, x3, x4, I(x1^2), I(x2^2), I(x3^2), I(x4^2))
# attr(,"factors")
#         x1 x2 x3 x4 I(x1^2) I(x2^2) I(x3^2) I(x4^2) x1:x2 x1:x3 x1:x4 x2:x3 x2:x4 x3:x4
# y        0  0  0  0       0       0       0       0     0     0     0     0     0     0
# x1       1  0  0  0       0       0       0       0     1     1     1     0     0     0
# x2       0  1  0  0       0       0       0       0     1     0     0     1     1     0
# x3       0  0  1  0       0       0       0       0     0     1     0     1     0     1
# x4       0  0  0  1       0       0       0       0     0     0     1     0     1     1
# I(x1^2)  0  0  0  0       1       0       0       0     0     0     0     0     0     0
# I(x2^2)  0  0  0  0       0       1       0       0     0     0     0     0     0     0
# I(x3^2)  0  0  0  0       0       0       1       0     0     0     0     0     0     0
# I(x4^2)  0  0  0  0       0       0       0       1     0     0     0     0     0     0
# attr(,"term.labels")
# [1] "x1"      "x2"      "x3"      "x4"      "I(x1^2)" "I(x2^2)" "I(x3^2)" "I(x4^2)"
# [9] "x1:x2"   "x1:x3"   "x1:x4"   "x2:x3"   "x2:x4"   "x3:x4"  
# attr(,"order")
# [1] 1 1 1 1 1 1 1 1 2 2 2 2 2 2
# attr(,"intercept")
# [1] 1
# attr(,"response")
# [1] 1
# attr(,".Environment")
# <environment: R_GlobalEnv>

