You can use the following methods to select columns in a pandas DataFrame by condition:
Method 1: Select Columns Where At Least One Row Meets Condition
#select columns where at least one row has a value greater than 2
df.loc[:, (df > 2).any()]
Method 2: Select Columns Where All Rows Meet Condition
#select columns where all rows have a value greater than 2
df.loc[:, (df > 2).all()]
Method 3: Select Columns Where At Least One Row Meets Multiple Conditions
#select columns where at least one row has a value between 10 and 15
df.loc[:, ((df>=10) & (df15)).any()]
The following examples show how to use each method with the following pandas DataFrame:
import pandas as pd #create DataFrame df = pd.DataFrame({'apples': [7, 3, 3, 4, 3], 'oranges': [2, 0, 2, 0, 1], 'bananas': [5, 0, 4, 0, 12]}, index=['Farm1', 'Farm2', 'Farm3', 'Farm4', 'Farm5']) #view DataFrame print(df) apples oranges bananas Farm1 7 2 5 Farm2 3 0 0 Farm3 3 2 4 Farm4 4 0 0 Farm5 3 1 12
Example 1: Select Columns Where At Least One Row Meets Condition
We can use the following code to select the columns in the DataFrame where at least one row in the column has a value greater than 2:
#select columns where at least one row has a value greater than 2
df.loc[:, (df > 2).any()]
apples bananas
Farm1 7 5
Farm2 3 0
Farm3 3 4
Farm4 0 0
Farm5 3 12
Notice that the apples and bananas columns are returned because both of these columns have at least one row with a value greater than 2.
Example 2: Select Columns Where All Rows Meet Condition
We can use the following code to select the columns in the DataFrame where every row in the column has a value greater than 2:
#select columns where every row has a value greater than 2
df.loc[:, (df > 2).all()]
apples
Farm1 7
Farm2 3
Farm3 3
Farm4 4
Farm5 3
Notice that only the apples column is returned because it is the only column where every row in the column has a value greater than 2.
Example 3: Select Columns Where At Least One Row Meets Multiple Conditions
We can use the following code to select the columns in the DataFrame where at least one row in the column has a value between 10 and 15:
#select columns where every row has a value greater than 2
df.loc[:, ((df>=10) & (df15)).any()]
bananas
Farm1 5
Farm2 0
Farm3 4
Farm4 0
Farm5 12
Notice that only the bananas column is returned because it is the only column where at least one row in the column has a value between 10 and 15.
Additional Resources
The following tutorials explain how to perform other common operations in pandas:
How to Select Columns by Name in Pandas
How to Select Columns by Index in Pandas
How to Select Columns Containing a Specific String in Pandas