You can use the following basic syntax to find the sum of values by group in pandas:
df.groupby(['group1','group2'])['sum_col'].sum().reset_index()
The following examples show how to use this syntax in practice with the following pandas DataFrame:
import pandas as pd #create DataFrame df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'], 'position': ['G', 'G', 'F', 'C', 'G', 'F', 'F', 'C'], 'points': [25, 17, 14, 9, 12, 9, 6, 4], 'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]}) #view DataFrame df team position points rebounds 0 A G 25 11 1 A G 17 8 2 A F 14 10 3 A C 9 6 4 B G 12 6 5 B F 9 5 6 B F 6 9 7 B C 4 12
Example 1: Group by One Column, Sum One Column
The following code shows how to group by one column and sum the values in one column:
#group by team and sum the points
df.groupby(['team'])['points'].sum().reset_index()
team points
0 A 65
1 B 31
From the output we can see that:
- The players on team A scored a sum of 65 points.
- The players on team B scored a sum of 31 points.
Example 2: Group by Multiple Columns, Sum Multiple Columns
The following code shows how to group by multiple columns and sum multiple columns:
#group by team and position, sum points and rebounds
df.groupby(['team', 'position'])['points', 'rebounds'].sum().reset_index()
team position points rebounds
0 A C 9 6
1 A F 14 10
2 A G 42 19
3 B C 4 12
4 B F 15 14
5 B G 12 6
From the output we can see that:
- The players on team A in the ‘C’ position scored a sum of 9 points and 6 rebounds.
- The players on team A in the ‘F’ position scored a sum of 14 points and 10 rebounds.
- The players on team A in the ‘G’ position scored a sum of 42 points and 19 rebounds.
And so on.
Note that the reset_index() function prevents the grouping columns from becoming part of the index.
For example, here’s what the output looks like if we don’t use it:
#group by team and position, sum points and rebounds
df.groupby(['team', 'position'])['points', 'rebounds'].sum()
points rebounds
team position
A C 9 6
F 14 10
G 42 19
B C 4 12
F 15 14
G 12 6
Depending on how you’d like the results to appear, you may or may not choose to use the reset_index() function.
Additional Resources
The following tutorials explain how to perform other common grouping operations in pandas:
How to Count Observations by Group in Pandas
How to Find the Max Value by Group in Pandas
How to Calculate Quantiles by Group in Pandas