How to rowwise sum the variables that contain the same variable string pattern in R

This is another blog post so that I can keep a snippet of code for myself! And if you find it helpful too, all the better.


We will be completing rowwise computations, which are not the default in R. Therefore, we need to explicitly state that this is what we want to do.

Source: https://cmdlinetips.com/2021/06/row-wise-operations-in-r/

In this instance, we will be using c_across() to specify we want to sum across particular columns.

Specifically, all columns whose names contain the string pattern “totals_”:

library(dplyr)

df <- df %>%
  rowwise() %>%   # treat each row as its own group
  mutate(totals_sum = sum(c_across(contains("totals_")), na.rm = TRUE)) %>%
  ungroup()       # drop the rowwise grouping again

rowwise(): This function tells dplyr that the operations following it should be applied row by row, rather than to whole columns at once (which is the default behavior in dplyr).
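To see the difference rowwise() makes, here is a minimal sketch on a made-up tibble (the names toy, x, y and avg are invented for illustration and are not from my real data). The same mean() call gives a different answer depending on whether the data is rowwise.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration
toy <- tibble(x = c(1, 2, 3), y = c(10, 20, 30))

# Without rowwise(): mean() sees the whole x and y columns at once
col_wise <- toy %>%
  mutate(avg = mean(c(x, y)))

# With rowwise(): mean() sees one row's x and y at a time
row_wise <- toy %>%
  rowwise() %>%
  mutate(avg = mean(c(x, y))) %>%
  ungroup()

col_wise$avg  # 11 11 11
row_wise$avg  # 5.5 11.0 16.5
```

Without rowwise(), every row gets the grand mean of all six values; with it, each row gets its own mean.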

mutate(totals_sum = sum(c_across(contains("totals_")), na.rm = TRUE)):

Within the mutate() function, sum(c_across(contains("totals_"))) computes, for each row, the sum of all columns whose names contain the pattern “totals_”.
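One thing that may be worth watching: the new column name, totals_sum, itself contains “totals_”. So if the same snippet is run a second time on the result, contains("totals_") will also pick up the earlier sum. Below is a small sketch of that on made-up data; the helper add_totals_sum() and the toy tibble are hypothetical names, and the any_of() guard is just one possible way around it, not the only one.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration
toy <- tibble(totals_a = c(1, 2), totals_b = c(10, 20))

# Hypothetical helper wrapping the same rowwise sum pipeline
add_totals_sum <- function(d) {
  d %>%
    rowwise() %>%
    mutate(totals_sum = sum(c_across(contains("totals_")), na.rm = TRUE)) %>%
    ungroup()
}

once  <- add_totals_sum(toy)
twice <- add_totals_sum(once)  # the first totals_sum is now matched too

once$totals_sum   # 11 22
twice$totals_sum  # 22 44

# One possible guard: exclude the output column from the selection
safe <- once %>%
  rowwise() %>%
  mutate(totals_sum = sum(
    c_across(contains("totals_") & !any_of("totals_sum")),
    na.rm = TRUE
  )) %>%
  ungroup()

safe$totals_sum   # 11 22
```

Giving the output column a name that does not match the pattern would avoid the issue as well.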

The na.rm = TRUE argument is used to ignore NA values in the sum. c_across() is used to select columns within rowwise() context.
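Here is a rough sketch of what na.rm = TRUE changes, using a made-up tibble (df_toy and its column names are assumptions for illustration only). The column called other does not match the pattern, so it is left out of the sum.

```r
library(dplyr)

# Hypothetical toy data; column names are invented for illustration
df_toy <- tibble(
  id       = 1:3,
  totals_a = c(1, NA, 3),
  totals_b = c(10, 20, NA),
  other    = c(100, 200, 300)
)

# NA values are ignored in each row's sum
with_na_rm <- df_toy %>%
  rowwise() %>%
  mutate(totals_sum = sum(c_across(contains("totals_")), na.rm = TRUE)) %>%
  ungroup()

# Without na.rm, any NA in a row makes that row's sum NA
without_na_rm <- df_toy %>%
  rowwise() %>%
  mutate(totals_sum = sum(c_across(contains("totals_")))) %>%
  ungroup()

with_na_rm$totals_sum     # 11 20  3
without_na_rm$totals_sum  # 11 NA NA
```

Note that with na.rm = TRUE a row where every matched column is NA would sum to 0, which may or may not be what you want.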

ungroup(): This function is used to remove the rowwise grouping imposed by rowwise(), returning the dataframe to a standard tbl_df.

Usually I forget to ungroup. Oops. But it matters: a data frame that is still rowwise keeps being processed one row at a time, which is slower and changes what later dplyr verbs such as summarise() return.
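A quick way to see what forgetting ungroup() does, sketched on made-up data (toy and the variable names are hypothetical): the same summarise() call returns one row per input row while the data is still rowwise, and a single row once it has been ungrouped.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration
toy <- tibble(totals_a = c(1, 2, 3), totals_b = c(10, 20, 30))

still_rowwise <- toy %>% rowwise()

# The rowwise state is stored as a class on the data frame
is_rowwise <- inherits(still_rowwise, "rowwise_df")

# Still rowwise: summarise() is evaluated once per row
n_rows_rowwise <- still_rowwise %>%
  summarise(total = sum(totals_a)) %>%
  nrow()

# After ungroup(): summarise() collapses the whole column
n_rows_ungrouped <- still_rowwise %>%
  ungroup() %>%
  summarise(total = sum(totals_a)) %>%
  nrow()

n_rows_rowwise    # 3
n_rows_ungrouped  # 1
```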


Create a rowwise binary variable

data <- data %>%
  rowwise() %>%
  mutate(has_ruler = as.integer(any(c_across(starts_with("broad_cat_")) == "ruler"))) %>%
  ungroup()
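The idea here is that any() checks, row by row, whether at least one of the broad_cat_ columns equals "ruler", and as.integer() turns the TRUE/FALSE into 1/0. Below is a small sketch on made-up data (data_toy, its values, and the has_ruler_no_na column are assumptions for illustration). It also shows one behaviour to be aware of: when a row has no "ruler" but does have an NA, any() returns NA unless na.rm = TRUE is passed.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration
data_toy <- tibble(
  broad_cat_1 = c("ruler", "pen",   "pen"),
  broad_cat_2 = c("pen",   "ruler", NA),
  broad_cat_3 = c(NA,      "pen",   "pen")
)

data_toy <- data_toy %>%
  rowwise() %>%
  mutate(
    # NA when there is no "ruler" but at least one missing value
    has_ruler = as.integer(
      any(c_across(starts_with("broad_cat_")) == "ruler")
    ),
    # Hypothetical variant: missing values are ignored, so the result is 0 or 1
    has_ruler_no_na = as.integer(
      any(c_across(starts_with("broad_cat_")) == "ruler", na.rm = TRUE)
    )
  ) %>%
  ungroup()

data_toy$has_ruler        # 1  1 NA
data_toy$has_ruler_no_na  # 1  1  0
```

Whether NA or 0 is the right answer for a row with missing categories depends on the analysis, so treat the na.rm = TRUE variant as an option rather than a fix.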
