
Pandas: Create New Column Using Multiple If Else Conditions

by Erma Khan

You can use the following syntax to create a new column in a pandas DataFrame using multiple if else conditions:

import numpy as np

#define conditions
conditions = [
    (df['column1'] == 'A') & (df['column2'] < 20),
    (df['column1'] == 'A') & (df['column2'] >= 20),
    (df['column1'] == 'B') & (df['column2'] < 20),
    (df['column1'] == 'B') & (df['column2'] >= 20)
]

#define results
results = ['result1', 'result2', 'result3', 'result4']

#create new column based on conditions in column1 and column2
df['new_column'] = np.select(conditions, results)

This particular example creates a column called new_column whose values are based on the values in column1 and column2 in the DataFrame.
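It helps to know how np.select() resolves the conditions: they are checked in order, and each element receives the result paired with the first condition that is True. Elements that match no condition receive the default value, which is 0 unless specified otherwise. The following is a minimal sketch of that behavior; the scores array and the numeric tier codes are made up for illustration only:

```python
import numpy as np

#hypothetical data, for illustration only
scores = np.array([5, 15, 25])

#the two conditions overlap: 25 satisfies both of them
conditions = [scores > 10, scores > 20]

#numeric tier codes paired with each condition
tiers = [1, 2]

#first matching condition wins; unmatched elements get the default of 0
result = np.select(conditions, tiers)

print(result)
#[0 1 1]
```

Because 25 satisfies scores > 10 first, it receives tier 1 rather than tier 2. Listing the stricter condition first would assign it tier 2, so overlapping conditions should be ordered from most specific to least specific.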

The following example shows how to use this syntax in practice.

Example: Create New Column Using Multiple If Else Conditions in Pandas

Suppose we have the following pandas DataFrame that contains information about various basketball players:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [15, 18, 22, 24, 12, 17, 20, 28]})

#view DataFrame
print(df)

  team  points
0    A      15
1    A      18
2    A      22
3    A      24
4    B      12
5    B      17
6    B      20
7    B      28

Now suppose we would like to create a new column called class that classifies each player into one of the following four groups:

  • Bad_A if team is A and points < 20
  • Good_A if team is A and points ≥ 20
  • Bad_B if team is B and points < 20
  • Good_B if team is B and points ≥ 20

We can use the following syntax to do so:

import numpy as np

#define conditions
conditions = [
    (df['team'] == 'A') & (df['points'] < 20),
    (df['team'] == 'A') & (df['points'] >= 20),
    (df['team'] == 'B') & (df['points'] < 20),
    (df['team'] == 'B') & (df['points'] >= 20)
]

#define results
results = ['Bad_A', 'Good_A', 'Bad_B', 'Good_B']

#create new column based on conditions in team and points
df['class'] = np.select(conditions, results)

#view updated DataFrame
print(df)

  team  points   class
0    A      15   Bad_A
1    A      18   Bad_A
2    A      22  Good_A
3    A      24  Good_A
4    B      12   Bad_B
5    B      17   Bad_B
6    B      20  Good_B
7    B      28  Good_B

The new column called class displays the classification of each player based on the values in the team and points columns.
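For comparison, the same four rules can be written as an ordinary if/elif chain and applied row by row. This is only a sketch of an alternative, not the approach used above: the classify() helper, the class_apply column name, and the 'Unknown' fallback label are hypothetical names chosen for illustration.

```python
import pandas as pd

#recreate the DataFrame from the example
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [15, 18, 22, 24, 12, 17, 20, 28]})

#hypothetical helper that mirrors the four rules as if/elif branches
def classify(row):
    if row['team'] == 'A' and row['points'] < 20:
        return 'Bad_A'
    elif row['team'] == 'A':
        return 'Good_A'
    elif row['team'] == 'B' and row['points'] < 20:
        return 'Bad_B'
    elif row['team'] == 'B':
        return 'Good_B'
    #fallback for any team other than A or B
    return 'Unknown'

#apply the helper to each row (axis=1 passes one row at a time)
df['class_apply'] = df.apply(classify, axis=1)

print(df)
```

The row-wise version reads like a typical if else statement, but it calls a Python function once per row, so np.select() is usually the faster choice on large DataFrames.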

Note: The complete documentation for the select() function is available in the official NumPy reference.
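The select() function also accepts a default argument that sets the value for rows matching none of the conditions. The following sketch assumes a smaller, made-up DataFrame that includes a team C, which no condition covers, and uses 'Other' as an illustrative fallback label. Passing a string default also keeps the fallback the same type as the string results; in some recent NumPy versions, leaving default at its numeric value of 0 alongside string results may raise a TypeError.

```python
import numpy as np
import pandas as pd

#hypothetical data: team C is not covered by any condition below
df = pd.DataFrame({'team': ['A', 'B', 'C'],
                   'points': [15, 20, 30]})

#define conditions
conditions = [
    (df['team'] == 'A') & (df['points'] < 20),
    (df['team'] == 'A') & (df['points'] >= 20),
    (df['team'] == 'B') & (df['points'] < 20),
    (df['team'] == 'B') & (df['points'] >= 20)
]

#define results
results = ['Bad_A', 'Good_A', 'Bad_B', 'Good_B']

#rows that match no condition receive the default label
df['class'] = np.select(conditions, results, default='Other')

print(df)
```

Here the first two rows are classified as Bad_A and Good_B, while the team C row falls through every condition and is labeled Other.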

Additional Resources

The following tutorials explain how to perform other common tasks in pandas:

Pandas: How to Create Boolean Column Based on Condition
Pandas: How to Count Values in Column with Condition
Pandas: How to Use Groupby and Count with Condition
