dplyr: How to Change Factor Levels Using mutate()

by Erma Khan

You can use the following basic syntax to change the levels of a factor variable with the mutate() and recode() functions from dplyr:

library(dplyr)

df <- df %>% mutate(team=recode(team,
                                'H' = 'Hawks',
                                'M' = 'Mavs',
                                'C' = 'Cavs'))

This particular syntax makes the following changes to the team variable in the data frame:

  • ‘H’ becomes ‘Hawks’
  • ‘M’ becomes ‘Mavs’
  • ‘C’ becomes ‘Cavs’

The following example shows how to use this syntax in practice.

Example: Change Factor Levels Using mutate()

Suppose we have the following data frame in R that contains information about various basketball players:

#create data frame
df <- data.frame(team=factor(c('H', 'H', 'M', 'M', 'C', 'C')),
                 points=c(22, 35, 19, 15, 29, 23))

#view data frame
df

  team points
1    H     22
2    H     35
3    M     19
4    M     15
5    C     29
6    C     23

We can use the following syntax with the mutate() function from the dplyr package to change the levels of the team variable:

library(dplyr)

#change factor levels of team variable
df <- df %>% mutate(team=recode(team,
                                'H' = 'Hawks',
                                'M' = 'Mavs',
                                'C' = 'Cavs'))

#view updated data frame
df

   team points
1 Hawks     22
2 Hawks     35
3  Mavs     19
4  Mavs     15
5  Cavs     29
6  Cavs     23

Using this syntax, we were able to make the following changes to the team variable in the data frame:

  • ‘H’ becomes ‘Hawks’
  • ‘M’ becomes ‘Mavs’
  • ‘C’ becomes ‘Cavs’

We can verify that the factor levels have been changed by using the levels() function:

#display factor levels of team variable
levels(df$team)

[1] "Cavs"  "Hawks" "Mavs" 
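As a rough sketch of the same check, the following self-contained block compares the factor levels before and after recoding. The names df_new, levels_before, and levels_after are introduced here for illustration only and are not part of the original example:

```r
library(dplyr)

#recreate the example data frame so this block runs on its own
df <- data.frame(team=factor(c('H', 'H', 'M', 'M', 'C', 'C')),
                 points=c(22, 35, 19, 15, 29, 23))

#store the original levels (factor() sorts them alphabetically)
levels_before <- levels(df$team)

#recode into a new data frame (df_new is a hypothetical name)
df_new <- df %>% mutate(team=recode(team,
                                    'H' = 'Hawks',
                                    'M' = 'Mavs',
                                    'C' = 'Cavs'))

#store the new levels
levels_after <- levels(df_new$team)

#show each old level next to its new label
data.frame(old=levels_before, new=levels_after)

#confirm that team is still a factor
is.factor(df_new$team)
```

Here the levels keep their original positions, so the new labels are listed in the order of the old levels (C, H, M) rather than in the order they were written inside recode().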

Also note that you can choose to change just one factor level instead of all of them.

For example, starting again from the original data frame, we can use the following syntax to change only ‘H’ to ‘Hawks’ and leave the other factor levels unchanged:

library(dplyr)

#change one factor level of team variable
df <- df %>% mutate(team=recode(team, 'H' = 'Hawks'))

#view updated data frame
df

   team points
1 Hawks     22
2 Hawks     35
3     M     19
4     M     15
5     C     29
6     C     23

Notice that ‘H’ has been changed to ‘Hawks’ but the other two factor levels remained unchanged.
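To confirm this, a minimal sketch along the same lines checks the levels after recoding a single value. It rebuilds the original data frame first, since recode() can only match 'H' while that level still exists. The name df_one is hypothetical and used only here:

```r
library(dplyr)

#recreate the original data frame
df <- data.frame(team=factor(c('H', 'H', 'M', 'M', 'C', 'C')),
                 points=c(22, 35, 19, 15, 29, 23))

#change only the 'H' level (df_one is a hypothetical name)
df_one <- df %>% mutate(team=recode(team, 'H' = 'Hawks'))

#display factor levels of team variable
levels(df_one$team)
```

The levels should come back as C, Hawks, and M, which shows that only the level named in recode() was relabeled.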

Additional Resources

The following tutorials explain how to perform other common tasks in dplyr:

How to Remove Rows Using dplyr
How to Select Columns by Index Using dplyr
How to Filter Rows that Contain a Certain String Using dplyr
