You can use the following methods to sum values across multiple columns of a data frame using dplyr:
Method 1: Sum Across All Columns
df %>% mutate(sum = rowSums(., na.rm=TRUE))
Method 2: Sum Across All Numeric Columns
df %>% mutate(sum = rowSums(across(where(is.numeric)), na.rm=TRUE))
Method 3: Sum Across Specific Columns
df %>% mutate(sum = rowSums(across(c(col1, col2))))
The following examples show how to use each method with a data frame that contains information about points scored by various basketball players during different games:
#create data frame
df <- data.frame(game1=c(22, 25, 29, 13, 22, 30),
                 game2=c(12, 10, 6, 6, 8, 11),
                 game3=c(NA, 15, 15, 18, 22, 13))

#view data frame
df

  game1 game2 game3
1    22    12    NA
2    25    10    15
3    29     6    15
4    13     6    18
5    22     8    22
6    30    11    13
Example 1: Sum Across All Columns
The following code shows how to calculate the sum of values across all columns in the data frame:
library(dplyr)
#sum values across all columns
df %>%
  mutate(total_points = rowSums(., na.rm=TRUE))

  game1 game2 game3 total_points
1    22    12    NA           34
2    25    10    15           50
3    29     6    15           50
4    13     6    18           37
5    22     8    22           52
6    30    11    13           54
Example 2: Sum Across All Numeric Columns
The following code shows how to calculate the sum of values across all numeric columns in the data frame:
library(dplyr)
#sum values across all numeric columns
df %>%
  mutate(total_points = rowSums(across(where(is.numeric)), na.rm=TRUE))

  game1 game2 game3 total_points
1    22    12    NA           34
2    25    10    15           50
3    29     6    15           50
4    13     6    18           37
5    22     8    22           52
6    30    11    13           54
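In this data frame every column is numeric, so Methods 1 and 2 return the same result. The difference only shows up when a non-numeric column is present. The sketch below assumes a hypothetical player column (not part of the original data) added to the same scores. Since rowSums() only accepts numeric input, Method 1 would be expected to fail on this data frame, while where(is.numeric) simply skips the character column:

```r
library(dplyr)

# same scores as above, plus a hypothetical character column for illustration
df_players <- data.frame(player=c("A", "B", "C", "D", "E", "F"),
                         game1=c(22, 25, 29, 13, 22, 30),
                         game2=c(12, 10, 6, 6, 8, 11),
                         game3=c(NA, 15, 15, 18, 22, 13))

# where(is.numeric) selects game1, game2 and game3 and skips player
result <- df_players %>%
  mutate(total_points = rowSums(across(where(is.numeric)), na.rm=TRUE))

# view the row totals
result$total_points
```

The totals should match the ones shown above, because the player column never reaches rowSums().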
Example 3: Sum Across Specific Columns
The following code shows how to calculate the sum of values across the game1 and game2 columns only:
library(dplyr)
#sum values across game1 and game2 only
df %>%
  mutate(first2_sum = rowSums(across(c(game1, game2))))

  game1 game2 game3 first2_sum
1    22    12    NA         34
2    25    10    15         35
3    29     6    15         35
4    13     6    18         19
5    22     8    22         30
6    30    11    13         41
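Example 3 omits the na.rm argument, which is harmless there because game1 and game2 contain no missing values. As a minimal sketch of what changes when a selected column does contain NA, the code below sums game2 and game3 with and without na.rm=TRUE. The column names sum_default and sum_na_rm are made up for this illustration:

```r
library(dplyr)

# same data frame as in the examples above
df <- data.frame(game1=c(22, 25, 29, 13, 22, 30),
                 game2=c(12, 10, 6, 6, 8, 11),
                 game3=c(NA, 15, 15, 18, 22, 13))

# sum game2 and game3, first with the default na.rm=FALSE, then with na.rm=TRUE
result <- df %>%
  mutate(sum_default = rowSums(across(c(game2, game3))),
         sum_na_rm = rowSums(across(c(game2, game3)), na.rm=TRUE))

# view the result
result
```

With the default setting, the first row's sum is NA because game3 is missing for that row. With na.rm=TRUE the missing value is dropped and the first row's sum is 12.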