Tidy Evaluation: Dynamic column manipulation using dplyr

When it comes to R, the tidyverse is my dream land. I often want to write a function that manipulates data frames, but can’t figure out a good way to pass field names to the function and run dplyr code on them.

I know there are some workarounds for this in base R, but I recently started learning about tidy evaluation. It’s a pretty simple framework in R that lets you wrap dplyr commands in your own functions.

The documentation for this framework is extensive and I suggest reading it. Here I am going to share examples that have worked for me.

Add a column to a dataframe

In this example, I create a function that takes a dataframe and a column name (as a string) and adds a lagged copy of that column to the dataframe.

suppressMessages(library(tidyverse))

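add_lag_field <- function(df, col) {
  # build the new field name, e.g. "Sepal.Width_lag"
  var_name <- paste0(col, "_lag")
  # sym() turns each string into a symbol and !! unquotes it inside mutate();
  # := is needed because the left-hand side is unquoted rather than a literal name
  df %>% mutate(!!sym(var_name) := lag(!!sym(col)))
}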

head(add_lag_field(iris, "Sepal.Width"))
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.Width_lag
## 1          5.1         3.5          1.4         0.2  setosa              NA
## 2          4.9         3.0          1.4         0.2  setosa             3.5
## 3          4.7         3.2          1.3         0.2  setosa             3.0
## 4          4.6         3.1          1.5         0.2  setosa             3.2
## 5          5.0         3.6          1.4         0.2  setosa             3.1
## 6          5.4         3.9          1.7         0.4  setosa             3.6
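
For comparison, here is a sketch of the same function written two other ways. I am assuming a recent setup here (dplyr 1.0 or later with rlang 0.4 or later), and the names add_lag_field_data and add_lag_field_embrace are my own, made up for this illustration. The first variant still takes the column name as a string but uses the .data pronoun and a glue-style name instead of !!sym(). The second takes the column unquoted and forwards it with the embrace operator.

```r
suppressMessages(library(dplyr))

# Variant 1 (hypothetical name): column passed as a string.
# .data[[col]] looks the column up by name; "{col}_lag" builds the new name.
add_lag_field_data <- function(df, col) {
  df %>% mutate("{col}_lag" := lag(.data[[col]]))
}

# Variant 2 (hypothetical name): column passed unquoted.
# {{ col }} forwards the caller's expression into mutate().
add_lag_field_embrace <- function(df, col) {
  df %>% mutate("{{ col }}_lag" := lag({{ col }}))
}

head(add_lag_field_data(iris, "Sepal.Width"))
head(add_lag_field_embrace(iris, Sepal.Width))
```

Both calls should give the same Sepal.Width_lag column as add_lag_field() above. Which one to pick mostly depends on whether the callers of your function have the column name as a string (say, from a config file or a loop over names) or want to type it bare, the way they would in dplyr itself.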