How to Remove Columns with NA Values in R

by Erma Khan

You can use either of the following methods to remove columns that contain NA values from a data frame in R:

Method 1: Use Base R

df[ , colSums(is.na(df))==0]

Method 2: Use dplyr

library(dplyr)

df %>% select_if(~ !any(is.na(.)))

Both methods produce the same result.

The examples below show how to use each method in practice with the following data frame:

#create data frame
df <- data.frame(team=c('A', 'B', 'C', 'D', 'E'),
                 points=c(99, NA, NA, 88, 95),
                 assists=c(33, 28, 31, 39, 34),
                 rebounds=c(30, 28, 24, 24, NA))

#view data frame
df

  team points assists rebounds
1    A     99      33       30
2    B     NA      28       28
3    C     NA      31       24
4    D     88      39       24
5    E     95      34       NA
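
To see why both methods single out the points and rebounds columns, it can help to count the missing values in each column first. The sketch below rebuilds the same data frame so that it runs on its own; the names na_counts and keep are illustrative and not part of the original example.

```r
#recreate the data frame shown above
df <- data.frame(team=c('A', 'B', 'C', 'D', 'E'),
                 points=c(99, NA, NA, 88, 95),
                 assists=c(33, 28, 31, 39, 34),
                 rebounds=c(30, 28, 24, 24, NA))

#count NA values in each column (na_counts is an illustrative name)
na_counts <- colSums(is.na(df))
na_counts

#    team   points  assists rebounds 
#       0        2        0        1 

#logical mask: TRUE for columns that contain no NA values
keep <- na_counts == 0
keep
```

The keep vector is TRUE only for team and assists, which is the same mask that Method 1 places inside the square brackets.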

Example 1: Remove Columns with NA Values Using Base R

The following code shows how to remove columns with NA values using functions from base R:

#define new data frame
new_df <- df[ , colSums(is.na(df))==0]

#view new data frame
new_df

  team assists
1    A      33
2    B      28
3    C      31
4    D      39
5    E      34

Notice that the two columns with NA values (points and rebounds) have both been removed from the data frame.
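
One caveat with the base R approach: if only a single column is free of NA values, the default subsetting returns a plain vector instead of a data frame. A minimal sketch of this behavior and of the drop=FALSE argument that avoids it, using a small hypothetical data frame (df_one, as_vector and as_df are illustrative names):

```r
#hypothetical data frame in which only one column has no NA values
df_one <- data.frame(points=c(99, NA, 88),
                     assists=c(33, 28, 31))

#default subsetting simplifies the single remaining column to a vector
as_vector <- df_one[ , colSums(is.na(df_one))==0]
is.data.frame(as_vector)

#drop=FALSE keeps the result as a data frame
as_df <- df_one[ , colSums(is.na(df_one))==0, drop=FALSE]
is.data.frame(as_df)
```

The first check returns FALSE and the second returns TRUE, so adding drop=FALSE is a reasonable default when the number of remaining columns is not known in advance.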

Example 2: Remove Columns with NA Values Using dplyr

The following code shows how to remove columns with NA values using functions from the dplyr package:

library(dplyr)

#define new data frame
new_df <- df %>% select_if(~ !any(is.na(.)))

#view new data frame
new_df

  team assists
1    A      33
2    B      28
3    C      31
4    D      39
5    E      34

Once again, the two columns with NA values (points and rebounds) have both been removed from the data frame.
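
In dplyr 1.0.0 and later, select_if() is marked as superseded in favor of select() combined with where(). A minimal sketch of the equivalent call, assuming a recent dplyr version is installed; new_df2 is an illustrative name and the data frame is rebuilt so the block runs on its own:

```r
library(dplyr)

#recreate the data frame used in this tutorial
df <- data.frame(team=c('A', 'B', 'C', 'D', 'E'),
                 points=c(99, NA, NA, 88, 95),
                 assists=c(33, 28, 31, 39, 34),
                 rebounds=c(30, 28, 24, 24, NA))

#keep only the columns for which no value is NA
new_df2 <- df %>% select(where(~ !any(is.na(.x))))

#view the remaining column names
names(new_df2)
```

This should keep the same team and assists columns as the select_if() version.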

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Add a Column to a Data Frame in R
How to Rename Data Frame Columns in R
How to Sort a Data Frame by Column in R
