You can use one of the following two methods to remove columns from a data frame in R that contain NA values:
Method 1: Use Base R
df[ , colSums(is.na(df))==0]
Method 2: Use dplyr
library(dplyr)

df %>% select_if(~ !any(is.na(.)))
Both methods produce the same result.
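One edge case is worth knowing about: when only a single column survives the filter, base R simplifies the result of `df[ , ...]` to a plain vector unless `drop=FALSE` is supplied, whereas dplyr returns a data frame. The sketch below illustrates this with a small hypothetical data frame (`df_small` is not part of the original example):

```r
# hypothetical data frame in which only one column is free of NA values
df_small <- data.frame(points=c(99, NA, 88),
                       assists=c(33, 28, 31))

# logical mask marking the columns that contain no NA values
keep <- colSums(is.na(df_small)) == 0

# with a single surviving column, the result is simplified to a vector
as_vector <- df_small[ , keep]
is.data.frame(as_vector)

# drop=FALSE preserves the data frame structure
as_df <- df_small[ , keep, drop=FALSE]
is.data.frame(as_df)
```

If the result is always expected to be a data frame, adding `drop=FALSE` to the base R version is a cheap safeguard.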
The examples below show how to use each method in practice with this data frame:
#create data frame
df <- data.frame(team=c('A', 'B', 'C', 'D', 'E'),
                 points=c(99, NA, NA, 88, 95),
                 assists=c(33, 28, 31, 39, 34),
                 rebounds=c(30, 28, 24, 24, NA))

#view data frame
df

  team points assists rebounds
1    A     99      33       30
2    B     NA      28       28
3    C     NA      31       24
4    D     88      39       24
5    E     95      34       NA
Example 1: Remove Columns with NA Values Using Base R
The following code shows how to remove columns with NA values using functions from base R:
#define new data frame
new_df <- df[ , colSums(is.na(df))==0]

#view new data frame
new_df

  team assists
1    A      33
2    B      28
3    C      31
4    D      39
5    E      34
Notice that the two columns with NA values (points and rebounds) have both been removed from the data frame.
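To see why this works, it can help to evaluate the expression piece by piece. The sketch below rebuilds the example data frame and inspects each intermediate step; the names `na_counts` and `keep` are introduced here purely for illustration:

```r
# rebuild the example data frame
df <- data.frame(team=c('A', 'B', 'C', 'D', 'E'),
                 points=c(99, NA, NA, 88, 95),
                 assists=c(33, 28, 31, 39, 34),
                 rebounds=c(30, 28, 24, 24, NA))

# is.na(df) returns a logical matrix; colSums() counts the TRUE values per column
na_counts <- colSums(is.na(df))
na_counts

# columns with a count of zero contain no NA values
keep <- na_counts == 0
keep

# names of the columns that will be retained
names(df)[keep]
```

Here `na_counts` holds 0, 2, 0 and 1 for team, points, assists and rebounds respectively, so only team and assists pass the `== 0` test.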
Example 2: Remove Columns with NA Values Using dplyr
The following code shows how to remove columns with NA values using functions from the dplyr package:
library(dplyr)
#define new data frame
new_df <- df %>% select_if(~ !any(is.na(.)))
#view new data frame
new_df
  team assists
1    A      33
2    B      28
3    C      31
4    D      39
5    E      34
Once again, the two columns with NA values (points and rebounds) have both been removed from the data frame.
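In dplyr 1.0.0 and later, `select_if()` is marked as superseded in favor of `select()` combined with `where()`. The following is a minimal sketch of the equivalent call, assuming a dplyr version of at least 1.0.0 is installed; `anyNA()` is a base R shortcut for `any(is.na(x))`:

```r
library(dplyr)

# rebuild the example data frame
df <- data.frame(team=c('A', 'B', 'C', 'D', 'E'),
                 points=c(99, NA, NA, 88, 95),
                 assists=c(33, 28, 31, 39, 34),
                 rebounds=c(30, 28, 24, 24, NA))

# keep only the columns for which the predicate returns TRUE
new_df <- df %>% select(where(~ !anyNA(.x)))

# view column names of the new data frame
names(new_df)
```

This should retain the same two columns (team and assists) as the `select_if()` version, which continues to work but is no longer the recommended interface.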
Additional Resources
The following tutorials explain how to perform other common tasks in R:
How to Add a Column to a Data Frame in R
How to Rename Data Frame Columns in R
How to Sort a Data Frame by Column in R