Often you may want to sum only the rows in an R data frame that meet some criteria, much like a SUMIF in a spreadsheet. Fortunately, this is easy to do using the following basic syntax:
aggregate(col_to_sum ~ col_to_group_by, data=df, sum)
The following examples show how to use this syntax on the following data frame:
#create data frame
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4),
                 blocks=c(1, 2, 2, 1, 0, 4, 1))

#view data frame
df

  team pts rebs blocks
1    a   5    8      1
2    a   8    8      2
3    b  14    9      2
4    b  18    3      1
5    b   5    8      0
6    c   7    7      4
7    c   7    4      1
Example 1: Perform a SUMIF Function on One Column
The following code shows how to find the sum of points for each team:
aggregate(pts ~ team, data=df, sum)
team pts
1 a 13
2 b 37
3 c 14
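If only one group's total is needed, rather than a table for every team, logical indexing is a possible alternative to aggregate(). The following is a minimal sketch using the same data frame; the variable names sum_b and sum_high_rebs are only illustrative and are not part of the original example.

```r
# Same data frame as in the article
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4),
                 blocks=c(1, 2, 2, 1, 0, 4, 1))

# Sum of points for team 'b' only (logical indexing keeps the matching rows)
sum_b <- sum(df$pts[df$team == 'b'])
sum_b

# Sum of points for rows with at least 8 rebounds
sum_high_rebs <- sum(df$pts[df$rebs >= 8])
sum_high_rebs
```

The first result is 37, which matches the team b row in the aggregate() output above. The second is 32, showing that the condition can be on any column, not only the grouping column.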
Example 2: Perform a SUMIF Function on Multiple Columns
The following code shows how to find the sum of points and rebounds for each team:
aggregate(cbind(pts, rebs) ~ team, data=df, sum)
team pts rebs
1 a 13 16
2 b 37 20
3 c 14 11
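One thing to watch for when summing several columns at once: the formula interface of aggregate() uses na.action = na.omit by default, so a row with a missing value in any of the listed columns is dropped entirely before summing. The sketch below illustrates this on a hypothetical variant of the data (df_na, where the first rebs value is set to NA for illustration) and shows one common way to keep all rows.

```r
# Hypothetical variant of the data with one missing rebounds value
df_na <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                    pts=c(5, 8, 14, 18, 5, 7, 7),
                    rebs=c(NA, 8, 9, 3, 8, 7, 4))

# Default na.action = na.omit: row 1 is dropped, so its 5 points are lost too
default_sums <- aggregate(cbind(pts, rebs) ~ team, data=df_na, sum)
default_sums

# Keep all rows and let sum() skip NA values within each column
kept_sums <- aggregate(cbind(pts, rebs) ~ team, data=df_na, FUN=sum,
                       na.rm=TRUE, na.action=na.pass)
kept_sums
```

With the default, team a shows 8 points instead of 13. Passing na.action=na.pass together with na.rm=TRUE restores the 13 points while still ignoring the missing rebounds value.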
Example 3: Perform a SUMIF Function on All Columns
The following code shows how to find the sum of all columns in the data frame for each team:
aggregate(. ~ team, data=df, sum)
team pts rebs blocks
1 a 13 16 3
2 b 37 20 3
3 c 14 11 5
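Note: In an R formula, the period (.) represents all columns in the data frame that are not otherwise named in the formula. Here, that means every column except team.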
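To make the role of the period concrete, the sketch below compares the period form with a version that lists every non-grouping column explicitly. For this data frame the two calls should produce the same sums; the names all_cols and explicit are only illustrative.

```r
# Same data frame as in the article
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4),
                 blocks=c(1, 2, 2, 1, 0, 4, 1))

# Period on the left side: every column other than team
all_cols <- aggregate(. ~ team, data=df, sum)

# Equivalent call with the columns written out
explicit <- aggregate(cbind(pts, rebs, blocks) ~ team, data=df, sum)

all_cols
explicit
```

Because the period picks up every remaining column, a non-numeric column other than the grouping column would be passed to sum() as well, so it may be safer to list columns explicitly when the data frame mixes types.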
Additional Resources
How to Perform a COUNTIF Function in R
How to Sum Specific Columns in R
How to Sum Specific Rows in R