
How to Use the ntile() Function in dplyr (With Examples)

by Erma Khan

You can use the ntile() function from the dplyr package in R to break up an input vector into n buckets.

This function uses the following basic syntax:

ntile(x, n)

where:

  • x: Input vector
  • n: Number of buckets

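Note: If the length of x is not an exact multiple of n, the bucket sizes will differ by up to one, with the larger buckets coming first in current versions of dplyr.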
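As a minimal sketch of the note above, the following code counts how many values land in each bucket when 11 values are split into 5 buckets. The variable names buckets and bucket_sizes are illustrative and are not part of dplyr.

```r
library(dplyr)

#assign the integers 1 through 11 to 5 buckets
buckets <- ntile(1:11, 5)

#count how many values landed in each bucket
bucket_sizes <- as.integer(table(buckets))
bucket_sizes

#the largest and smallest bucket sizes differ by at most one
max(bucket_sizes) - min(bucket_sizes)
```

Since 11 is not divisible by 5, one bucket holds three values and the other four buckets hold two values each.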

The following examples show how to use this function in practice.

Example 1: Use ntile() with a Vector

The following code shows how to use the ntile() function to break up a vector with 11 elements into 5 different buckets:

library(dplyr)

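#create vector (the text below confirms only 1, 3, 4, 22, and 23; the middle values are assumed)
x <- c(1, 3, 4, 6, 7, 8, 10, 13, 19, 22, 23)

#break up vector into 5 buckets
ntile(x, 5)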

 [1] 1 1 1 2 2 3 3 4 4 5 5

From the output we can see that each element from the original vector has been placed into one of five buckets.

The smallest values are assigned to bucket 1 while the largest values are assigned to bucket 5.

For example:

  • The smallest values of 1, 3, and 4 are assigned to bucket 1.
  • The largest values of 22 and 23 are assigned to bucket 5.
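Buckets are based on the rank of each value rather than its position in the vector, so the input does not need to be sorted. The sketch below illustrates this with a small made-up vector (y), and also uses a vector of identical values (z) to show that tied values can end up in different buckets, since ntile() still has to fill each bucket evenly.

```r
library(dplyr)

#unsorted vector (made-up values for illustration)
y <- c(23, 1, 19, 3)

#the two smallest values go to bucket 1, the two largest to bucket 2
unsorted_buckets <- ntile(y, 2)
unsorted_buckets

#identical values are still split evenly across the buckets
z <- c(10, 10, 10, 10)
tied_buckets <- ntile(z, 2)
tied_buckets
```

Here the values 1 and 3 fall into bucket 1 and the values 19 and 23 fall into bucket 2, regardless of where they appear in y.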

Example 2: Use ntile() with a Data Frame

Suppose we have the following data frame in R that shows the points scored by various basketball players:

#create data frame
df <- data.frame(player=LETTERS[1:9],
                 points=c(12, 19, 7, 22, 24, 28, 30, 19, 15))

#view data frame
df

  player points
1      A     12
2      B     19
3      C      7
4      D     22
5      E     24
6      F     28
7      G     30
8      H     19
9      I     15

The following code shows how to use the ntile() function to create a new column in the data frame that assigns each player to one of three buckets based on points scored:

library(dplyr)

#create new column that assigns players into buckets based on points
df$bucket <- ntile(df$points, 3)

#view updated data frame
df

  player points bucket
1      A     12      1
2      B     19      2
3      C      7      1
4      D     22      2
5      E     24      3
6      F     28      3
7      G     30      3
8      H     19      2
9      I     15      1

The new bucket column assigns a value between 1 and 3 to each player.

The players with the lowest points receive a value of 1 and the players with the highest points receive a value of 3.
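The same bucket column can also be created inside a dplyr pipeline with mutate(), which makes it easy to follow up with a summary of each bucket. This is only a sketch of one possible approach; the names df_bucketed, bucket_summary, players, and avg_points are illustrative choices.

```r
library(dplyr)

#recreate the data frame from the example above
df <- data.frame(player=LETTERS[1:9],
                 points=c(12, 19, 7, 22, 24, 28, 30, 19, 15))

#add the bucket column inside a pipeline
df_bucketed <- df %>%
  mutate(bucket = ntile(points, 3))

#count players and compute mean points in each bucket
bucket_summary <- df_bucketed %>%
  group_by(bucket) %>%
  summarise(players = n(), avg_points = mean(points))

bucket_summary
```

With nine players and three buckets, each bucket contains exactly three players.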

Additional Resources

The following tutorials explain how to use other common functions in R:

How to Use the across() Function in dplyr
How to Use the relocate() Function in dplyr
How to Use the slice() Function in dplyr
