dplyr
uses Non-Standard Evaluation(NSE), which is why we normally can use the variable names without quotes. However, sometimes during the data pipeline, we need to get our variable names from other sources such as a Shiny selection box. In case of functions like select
, we can just use select_
to use a string variable to select
variable1 <- "Sepal.Length"
variable2 <- "Sepal.Width"
iris %>%
select_(variable1, variable2) %>%
head(n=5)
# Sepal.Length Sepal.Width
# 1 5.1 3.5
# 2 4.9 3.0
# 3 4.7 3.2
# 4 4.6 3.1
# 5 5.0 3.6
But if we want to use other features such as summarize or filter we need to use interp
function from lazyeval
package
variable1 <- "Sepal.Length"
variable2 <- "Sepal.Width"
variable3 <- "Species"
iris %>%
select_(variable1, variable2, variable3) %>%
group_by_(variable3) %>%
summarize_(mean1 = lazyeval::interp(~mean(var), var = as.name(variable1)), mean2 = lazyeval::interp(~mean(var), var = as.name(variable2)))
# Species mean1 mean2
# <fctr> <dbl> <dbl>
# 1 setosa 5.006 3.428
# 2 versicolor 5.936 2.770
# 3 virginica 6.588 2.974