You can use the following syntax to create a new column in a pandas DataFrame using multiple if else conditions:
#define conditions
conditions = [
    (df['column1'] == 'A') & (df['column2'] < 20),
    (df['column1'] == 'A') & (df['column2'] >= 20),
    (df['column1'] == 'B') & (df['column2'] < 20),
    (df['column1'] == 'B') & (df['column2'] >= 20)
]

#define results
results = ['result1', 'result2', 'result3', 'result4']

#create new column based on conditions in column1 and column2
df['new_column'] = np.select(conditions, results)
This particular example creates a column called new_column whose values are based on the values in column1 and column2 in the DataFrame.
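It may help to see how np.select() resolves conditions before applying it to a DataFrame. The minimal sketch below uses a made-up array x and made-up labels (neither comes from the example in this tutorial) to illustrate that conditions are checked in order and the first one that is True determines the result. Elements that match no condition receive the value passed to the default argument.

```python
import numpy as np

# Hypothetical values, chosen only to illustrate evaluation order
x = np.array([5, 15, 25])

# The conditions overlap: 5 satisfies both x < 10 and x < 20
conditions = [x < 10, x < 20]
results = ['low', 'mid']

# The first matching condition wins; 25 matches none, so it gets the default
labels = np.select(conditions, results, default='high')

print(labels)
```

Because the first match wins, overlapping conditions should be listed from most specific to least specific.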
The following example shows how to use this syntax in practice.
Example: Create New Column Using Multiple If Else Conditions in Pandas
Suppose we have the following pandas DataFrame that contains information about various basketball players:
import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [15, 18, 22, 24, 12, 17, 20, 28]})

#view DataFrame
print(df)

  team  points
0    A      15
1    A      18
2    A      22
3    A      24
4    B      12
5    B      17
6    B      20
7    B      28
Now suppose we would like to create a new column called class that classifies each player into one of the following four groups:
- Bad_A if team is A and points < 20
- Good_A if team is A and points ≥ 20
- Bad_B if team is B and points < 20
- Good_B if team is B and points ≥ 20
We can use the following syntax to do so:
import numpy as np

#define conditions
conditions = [
    (df['team'] == 'A') & (df['points'] < 20),
    (df['team'] == 'A') & (df['points'] >= 20),
    (df['team'] == 'B') & (df['points'] < 20),
    (df['team'] == 'B') & (df['points'] >= 20)
]

#define results
results = ['Bad_A', 'Good_A', 'Bad_B', 'Good_B']

#create new column based on conditions in team and points
df['class'] = np.select(conditions, results)

#view updated DataFrame
print(df)

  team  points   class
0    A      15   Bad_A
1    A      18   Bad_A
2    A      22  Good_A
3    A      24  Good_A
4    B      12   Bad_B
5    B      17   Bad_B
6    B      20  Good_B
7    B      28  Good_B
The new column called class displays the classification of each player based on the values in the team and points columns.
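In the example above every row matches exactly one condition. When some rows match none, np.select() falls back to its default argument, which is 0 unless you set it. The sketch below assumes a smaller, hypothetical DataFrame that includes a team 'C' (not part of the original data) and passes a string default so the fallback has the same type as the other results. Mixing string results with the numeric default of 0 may raise a TypeError on recent NumPy versions, so an explicit string default is the safer choice.

```python
import numpy as np
import pandas as pd

# Hypothetical data: team 'C' is not covered by any condition below
df = pd.DataFrame({'team': ['A', 'A', 'B', 'C'],
                   'points': [15, 22, 20, 30]})

# Same conditions as in the example above
conditions = [
    (df['team'] == 'A') & (df['points'] < 20),
    (df['team'] == 'A') & (df['points'] >= 20),
    (df['team'] == 'B') & (df['points'] < 20),
    (df['team'] == 'B') & (df['points'] >= 20)
]

results = ['Bad_A', 'Good_A', 'Bad_B', 'Good_B']

# Rows that match no condition receive the label 'Other' (an illustrative name)
df['class'] = np.select(conditions, results, default='Other')

print(df)
```

Here the row for team C would be labeled Other, while the remaining rows are classified as before.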
Note: You can find the complete documentation for the NumPy select() function in the official NumPy reference for numpy.select.