You can use the following syntax to drop rows that contain a certain string in a data frame in R:
df[!grepl('string', df$column),]
This tutorial provides several examples of how to use this syntax in practice with the following data frame in R:
#create data frame
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'C'),
                 conference=c('East', 'East', 'East', 'West', 'West', 'East'),
                 points=c(11, 8, 10, 6, 6, 5))

#view data frame
df

  team conference points
1    A       East     11
2    A       East      8
3    A       East     10
4    B       West      6
5    B       West      6
6    C       East      5
Example 1: Drop Rows that Contain a Specific String
The following code shows how to drop all rows in the data frame that contain ‘A’ in the team column:
df[!grepl('A', df$team),]
  team conference points
4    B       West      6
5    B       West      6
6    C       East      5
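To see why this works, it can help to look at the intermediate result. The sketch below rebuilds the tutorial's data frame and stores the output of grepl() in a variable; the names has_a and kept are illustrative only and are not part of the original syntax.

```r
# Same data frame as in this tutorial
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'C'),
                 conference=c('East', 'East', 'East', 'West', 'West', 'East'),
                 points=c(11, 8, 10, 6, 6, 5))

# grepl() returns one TRUE/FALSE value per row:
# TRUE where the team column contains 'A'
has_a <- grepl('A', df$team)
has_a
# [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE

# Negating the vector with ! keeps only the rows that do NOT match
kept <- df[!has_a, ]
nrow(kept)
# [1] 3
```

The trailing comma inside the square brackets tells R to filter rows and return every column.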
Or we could drop all rows in the data frame that contain ‘West’ in the conference column:
df[!grepl('West', df$conference),]
  team conference points
1    A       East     11
2    A       East      8
3    A       East     10
6    C       East      5
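Note that grepl() matches substrings (and treats the pattern as a regular expression), so it drops any row whose value merely contains the string. The sketch below uses a small hypothetical data frame, df2, which is not part of this tutorial's data, to contrast a substring match with an exact match and a case-insensitive match.

```r
# Hypothetical data (not from the tutorial): team names that overlap
df2 <- data.frame(team=c('A', 'AB', 'B', 'a'),
                  points=c(11, 8, 6, 5))

# Substring match: drops 'A' and 'AB', keeps 'B' and 'a'
substring_drop <- df2[!grepl('A', df2$team), ]
substring_drop$team

# Exact match: drops only the row where team is exactly 'A'
exact_drop <- df2[df2$team != 'A', ]
exact_drop$team

# Case-insensitive substring match: also drops 'a', keeps only 'B'
nocase_drop <- df2[!grepl('A', df2$team, ignore.case=TRUE), ]
nocase_drop$team
```

Which variant is appropriate depends on whether partial matches should count; in the tutorial's data the team names do not overlap, so the substring and exact approaches would drop the same rows.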
Example 2: Drop Rows that Contain a String in a List
The following code shows how to drop all rows in the data frame that contain ‘A’ or ‘B’ in the team column:
df[!grepl('A|B', df$team),]
  team conference points
6    C       East      5
We could also define a vector of strings and then remove all rows in the data frame that contain any of the strings in the vector in the team column:
#define vector of strings
remove <- c('A', 'B')

#remove rows that contain any string in the vector in the team column
df[!grepl(paste(remove, collapse='|'), df$team),]

  team conference points
6    C       East      5
Notice that both methods lead to the same result.
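If the values in the vector should be matched exactly rather than as substrings, one possible alternative is the base R %in% operator. This is a sketch of a different technique from the grepl() approach above, shown side by side on the same data; the names by_pattern and by_exact are illustrative.

```r
# Same data frame as in this tutorial
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'C'),
                 conference=c('East', 'East', 'East', 'West', 'West', 'East'),
                 points=c(11, 8, 10, 6, 6, 5))

remove <- c('A', 'B')

# Pattern approach from the tutorial: collapse the vector into 'A|B'
by_pattern <- df[!grepl(paste(remove, collapse='|'), df$team), ]

# Exact-match alternative: drop rows whose team equals any value in the vector
by_exact <- df[!(df$team %in% remove), ]

# With this data, both approaches leave only the row for team C
by_pattern$team
by_exact$team
```

The %in% version avoids regular expressions entirely, so strings containing characters such as '.' or '|' would not need escaping.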