You can use either of the following methods to keep only certain columns in a pandas DataFrame:
Method 1: Specify Columns to Keep
#only keep columns 'col1' and 'col2'
df[['col1', 'col2']]
Method 2: Specify Columns to Drop
#drop columns 'col3' and 'col4'
df[df.columns[~df.columns.isin(['col3', 'col4'])]]
The examples below show how to use each method with this pandas DataFrame:
import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'points': [11, 7, 8, 10, 13, 13],
                   'assists': [5, 7, 7, 9, 12, 9],
                   'rebounds': [11, 8, 10, 6, 6, 5]})

#view DataFrame
df

  team  points  assists  rebounds
0    A      11        5        11
1    A       7        7         8
2    A       8        7        10
3    B      10        9         6
4    B      13       12         6
5    B      13        9         5
Method 1: Specify Columns to Keep
The following code shows how to define a new DataFrame that only keeps the “team” and “points” columns:
#create new DataFrame and only keep 'team' and 'points' columns
df2 = df[['team', 'points']]
#view new DataFrame
df2
  team  points
0    A      11
1    A       7
2    A       8
3    B      10
4    B      13
5    B      13
Notice that the resulting DataFrame only keeps the two columns that we specified.
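The same selection can be written in a few other ways. The sketch below is illustrative rather than part of the original example: it rebuilds the sample DataFrame and compares the double-bracket form with df.loc and df.filter. The names keep, kept_loc, kept_filter and kept_copy are made up for this sketch, and 'steals' is a hypothetical column that does not exist in the data.

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'points': [11, 7, 8, 10, 13, 13],
                   'assists': [5, 7, 7, 9, 12, 9],
                   'rebounds': [11, 8, 10, 6, 6, 5]})

keep = ['team', 'points']

# Label-based selection: all rows, only the listed columns
kept_loc = df.loc[:, keep]

# filter() keeps the listed labels and skips any that are missing
# ('steals' is a hypothetical column that is not in df)
kept_filter = df.filter(items=keep + ['steals'])

# An explicit copy signals that later edits are not meant to touch df
kept_copy = df[keep].copy()
kept_copy['points'] = kept_copy['points'] * 2

print(kept_loc.equals(df[keep]))
print(list(kept_filter.columns))
print(df['points'].tolist())
```

One practical difference: df[['team', 'steals']] raises a KeyError when a listed column is missing, while df.filter(items=...) quietly returns only the columns it finds. Which behavior is preferable depends on whether a missing column should be treated as an error.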
Method 2: Specify Columns to Drop
The following code shows how to define a new DataFrame that drops the “assists” and “rebounds” columns from the original DataFrame:
#create new DataFrame that drops 'assists' and 'rebounds'
df2 = df[df.columns[~df.columns.isin(['assists', 'rebounds'])]]
#view new DataFrame
df2
  team  points
0    A      11
1    A       7
2    A       8
3    B      10
4    B      13
5    B      13
Notice that the resulting DataFrame drops the “assists” and “rebounds” columns from the original DataFrame and keeps the remaining columns.
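To make the one-line expression easier to follow, the sketch below breaks it into steps and compares the result with DataFrame.drop, which is a common alternative rather than the method used above. The variable names (to_drop, in_drop_list, keep_mask, masked, dropped) are chosen only for this illustration.

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'points': [11, 7, 8, 10, 13, 13],
                   'assists': [5, 7, 7, 9, 12, 9],
                   'rebounds': [11, 8, 10, 6, 6, 5]})

to_drop = ['assists', 'rebounds']

# Step 1: boolean array, True where the column name is in the drop list
in_drop_list = df.columns.isin(to_drop)

# Step 2: ~ inverts the array, so True now marks the columns to keep
keep_mask = ~in_drop_list

# Step 3: pick the column labels with the mask, then select those columns
masked = df[df.columns[keep_mask]]

# Alternative: drop() returns a new DataFrame and leaves df unchanged
dropped = df.drop(columns=to_drop)

print(in_drop_list)
print(masked.equals(dropped))
print(list(df.columns))
```

The two approaches treat missing names differently. The isin mask ignores names in the drop list that are not columns of the DataFrame, whereas df.drop raises a KeyError unless errors='ignore' is passed.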