Often you may want to remove rows with all or some NAs (missing values) in a data frame in R.
This tutorial explains how to remove these rows using base R and the tidyr package. Each example below uses the following data frame:
#create data frame with some missing values
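df <- data.frame(points=c(12, NA, 19, 22, 32),
                 assists=c(4, NA, 3, NA, 5),
                 rebounds=c(5, NA, 7, 12, NA))

#view data frame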
df
  points assists rebounds
1     12       4        5
2     NA      NA       NA
3     19       3        7
4     22      NA       12
5     32       5       NA
Remove NAs Using Base R
The following code shows how to use complete.cases() to remove all rows in a data frame that have a missing value in any column:
#remove all rows with a missing value in any column
df[complete.cases(df), ]
  points assists rebounds
1     12       4        5
3     19       3        7
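To see why this works, it can help to look at what complete.cases() returns on its own. The sketch below rebuilds the example data frame from the values shown earlier so that it runs by itself; the names keep and df_complete are illustrative only.

```r
# Rebuild the example data frame so this snippet runs on its own
df <- data.frame(points = c(12, NA, 19, 22, 32),
                 assists = c(4, NA, 3, NA, 5),
                 rebounds = c(5, NA, 7, 12, NA))

# complete.cases() returns one logical value per row:
# TRUE when the row has no NA, FALSE when it has at least one
keep <- complete.cases(df)
keep
# [1]  TRUE FALSE  TRUE FALSE FALSE

# Using that logical vector as a row index keeps only the complete rows
df_complete <- df[keep, ]
df_complete
```

Only rows 1 and 3 are TRUE, which is why they are the only rows left after subsetting.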
The following code shows how to use complete.cases() to remove all rows in a data frame that have a missing value in specific columns:
#remove all rows with a missing value in the third column
df[complete.cases(df[ , 3]), ]
  points assists rebounds
1     12       4        5
3     19       3        7
4     22      NA       12

#remove all rows with a missing value in either the first or third column
df[complete.cases(df[ , c(1,3)]), ]
  points assists rebounds
1     12       4        5
3     19       3        7
4     22      NA       12
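The introduction also mentions rows in which all values are missing, a case the complete.cases() examples do not cover. One possible approach in base R, shown here as a minimal sketch rather than the only way to do it, is to count the NAs in each row with rowSums(is.na(df)) and drop the rows where that count equals the number of columns. The names all_na and df_some are illustrative only.

```r
# Rebuild the example data frame so this snippet runs on its own
df <- data.frame(points = c(12, NA, 19, 22, 32),
                 assists = c(4, NA, 3, NA, 5),
                 rebounds = c(5, NA, 7, 12, NA))

# Flag rows where the number of NAs equals the number of columns,
# i.e. rows in which every value is missing
all_na <- rowSums(is.na(df)) == ncol(df)

# Keep every row that has at least one non-missing value
df_some <- df[!all_na, ]
df_some
```

With the example data, only row 2 is dropped, because it is the only row with an NA in every column. Rows 4 and 5 are kept even though each has one missing value.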
Remove NAs Using tidyr
The following code shows how to use drop_na() from the tidyr package to remove all rows in a data frame that have a missing value in any column:
#load tidyr package
library(tidyr)
#remove all rows with a missing value in any column
df %>% drop_na()
  points assists rebounds
1     12       4        5
3     19       3        7
The following code shows how to use drop_na() from the tidyr package to remove all rows in a data frame that have a missing value in specific columns:
#load tidyr package
library(tidyr)
#remove all rows with a missing value in the third column
df %>% drop_na(rebounds)
  points assists rebounds
1     12       4        5
3     19       3        7
4     22      NA       12
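drop_na() also accepts more than one column name, which mirrors the base R example that checks the first and third columns. The sketch below assumes tidyr is installed and rebuilds the example data frame so that it runs on its own; it calls drop_na() directly instead of using the pipe, and the name df_clean is illustrative only.

```r
# Load tidyr (assumed to be installed)
library(tidyr)

# Rebuild the example data frame so this snippet runs on its own
df <- data.frame(points = c(12, NA, 19, 22, 32),
                 assists = c(4, NA, 3, NA, 5),
                 rebounds = c(5, NA, 7, 12, NA))

# Remove rows with a missing value in either points or rebounds;
# NAs in assists are ignored because that column is not listed
df_clean <- drop_na(df, points, rebounds)
df_clean
```

The row with points equal to 22 is kept even though its assists value is NA, since only the listed columns are checked.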