Often you may want to modify or format the column names in a pandas pivot table in a specific way.
Fortunately this is easy to do using built-in functions in pandas.
The following example shows how to do so.
Example: Modify Column Names in Pandas Pivot Table
Suppose we have the following pandas DataFrame that contains information about various basketball players:
import pandas as pd #create DataFrame df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'], 'position': ['G', 'G', 'F', 'C', 'G', 'C', 'F', 'F'], 'points': [4, 4, 6, 8, 9, 5, 5, 12]}) #view DataFrame print(df) team position points 0 A G 4 1 A G 4 2 A F 6 3 A C 8 4 B G 9 5 B C 5 6 B F 5 7 B F 12
We can use the following code to create a pivot table in pandas that shows the mean value of points for each team and position in the DataFrame:
#create pivot table
piv = pd.pivot_table(df, values='points', index='team', columns='position')
#view pivot table
print(piv)
position C F G
team
A 8.0 6.0 4.0
B 5.0 8.5 9.0
Now suppose we would like to get rid of the word position in the pivot table and remove the extra team row from the pivot table.
We can use the following syntax to do so:
#format column names
piv.columns = ['_'.join(str(s).strip() for s in col if s) for col in piv.columns]
#reset index
piv.reset_index(inplace=True)
#view updated pivot table
print(piv)
team C F G
0 A 8.0 6.0 4.0
1 B 5.0 8.5 9.0
Notice that we were able to get rid of the word position in the pivot table and remove the extra team row from the pivot table.
Also note that this general solution will work for a pivot table with a MultiIndex as well.
Note: You can find the complete documentation for the pandas pivot_table() function here.
Additional Resources
The following tutorials explain how to perform other common operations in pandas:
Pandas: How to Create Pivot Table with Count of Values
Pandas: How to Replace NaN Values in Pivot Table with Zeros
Pandas: How to Convert Pivot Table to DataFrame