You can use the following basic syntax to count the number of unique values by group in a pandas DataFrame:
df.groupby('group_column')['count_column'].nunique()
The following examples show how to use this syntax with the following DataFrame:
import pandas as pd #create DataFrame df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'], 'position': ['G', 'G', 'G', 'F', 'F', 'G', 'G', 'F', 'F', 'F'], 'points': [5, 7, 7, 9, 12, 9, 9, 4, 7, 7], 'rebounds': [11, 8, 10, 6, 6, 5, 9, 12, 13, 15]}) #view DataFrame df team position points rebounds 0 A G 5 11 1 A G 7 8 2 A G 7 10 3 A F 9 6 4 A F 12 6 5 B G 9 5 6 B G 9 9 7 B F 4 12 8 B F 7 13 9 B F 7 15
Example 1: Group By One Column & Count Unique Values
The following code shows how to count the number of unique values in the ‘points’ column for each team:
#count number of unique values in 'points' column grouped by 'team' column
df.groupby('team')['points'].nunique()
team
A 4
B 3
Name: points, dtype: int64
From the output we can see:
- There are 4 unique ‘points’ values for team A.
- There are 3 unique ‘points’ values for team B.
Note that we can also use the unique() function to display each unique ‘points’ value by team:
#display unique values in 'points' column grouped by 'team'
df.groupby('team')['points'].unique()
team
A [5, 7, 9, 12]
B [9, 4, 7]
Name: points, dtype: object
Example 2: Group By Multiple Columns & Count Unique Values
The following code shows how to count the number of unique values in the ‘points’ column, grouped by team and position:
#count number of unique values in 'points' column grouped by 'team' and 'position'
df.groupby(['team', 'position'])['points'].nunique()
team position
A F 2
G 2
B F 2
G 1
Name: points, dtype: int64
From the output we can see:
- There are 2 unique ‘points’ values for players in position ‘F’ on team A.
- There are 2 unique ‘points’ values for players in position ‘G’ on team A.
- There are 2 unique ‘points’ values for players in position ‘F’ on team B.
- There is1 unique ‘points’ value for players in position ‘G’ on team B.
Once again, we can use the unique() function to display each unique ‘points’ value by team and position:
#display unique values in 'points' column grouped by 'team' and 'position'
df.groupby(['team', 'position'])['points'].unique()
team position
A F [9, 12]
G [5, 7]
B F [4, 7]
G [9]
Name: points, dtype: object
Additional Resources
The following tutorials explain how to perform other common operations in pandas:
Pandas: How to Find Unique Values in a Column
Pandas: How to Find Unique Values in Multiple Columns
Pandas: How to Count Occurrences of Specific Value in Column