
How to Reshape Data Between Wide and Long Format in R

by Erma Khan

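A data frame in R can be stored in either a wide or a long format. In a wide format, each subject has a single row and repeated measurements sit in separate columns; in a long format, each measurement gets its own row.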

Depending on your goal, you may want the data frame to be in one of these specific formats.

The easiest way to reshape data between these formats is to use the following two functions from the tidyr package in R:

  • pivot_longer(): Reshapes a data frame from wide to long format.
  • pivot_wider(): Reshapes a data frame from long to wide format.

The following examples show how to use each function in practice.

Example 1: Reshape Data from Wide to Long

Suppose we have the following data frame in R that is currently in a wide format:

#create data frame
df <- data.frame(player=c('A', 'B', 'C', 'D'),
                 year1=c(12, 15, 19, 19),
                 year2=c(22, 29, 18, 12))

#view data frame
df

  player year1 year2
1      A    12    22
2      B    15    29
3      C    19    18
4      D    19    12

We can use the pivot_longer() function to pivot this data frame into a long format:

library(tidyr)

#pivot the data frame into a long format
df %>% pivot_longer(cols=c('year1', 'year2'),
                    names_to='year',
                    values_to='points')

# A tibble: 8 x 3
  player year  points
  <chr>  <chr>  <dbl>
1 A      year1     12
2 A      year2     22
3 B      year1     15
4 B      year2     29
5 C      year1     19
6 C      year2     18
7 D      year1     19
8 D      year2     12

Notice that the column names year1 and year2 are now used as values in a new column called “year” and the values from these original columns are placed into one new column called “points.”

The final result is a long data frame.
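In the long result above, the year column holds the strings year1 and year2. If a numeric year is more convenient, one option is to strip the prefix while pivoting. The following is a minimal sketch that assumes a reasonably recent version of tidyr (1.1.0 or later) so that the names_prefix and names_transform arguments are available; the name df_long is only used for illustration:

```r
library(tidyr)

#recreate the wide data frame from this example
df <- data.frame(player=c('A', 'B', 'C', 'D'),
                 year1=c(12, 15, 19, 19),
                 year2=c(22, 29, 18, 12))

#pivot to long format, dropping the 'year' prefix and
#converting the remaining text to an integer
df_long <- df %>% pivot_longer(cols=c('year1', 'year2'),
                               names_to='year',
                               names_prefix='year',
                               names_transform=list(year=as.integer),
                               values_to='points')

#view the result
df_long
```

With this approach the year column should contain the integers 1 and 2 rather than text, which can make later sorting or filtering by year simpler.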

Note: The complete documentation for the pivot_longer() function is available on the tidyr package website.

Example 2: Reshape Data from Long to Wide

Suppose we have the following data frame in R that is currently in a long format:

#create data frame
df <- data.frame(player=rep(c('A', 'B'), each=4),
                 year=rep(c(1, 1, 2, 2), times=2),
                 stat=rep(c('points', 'assists'), times=4),
                 amount=c(14, 6, 18, 7, 22, 9, 38, 4))

#view data frame
df

  player year    stat amount
1      A    1  points     14
2      A    1 assists      6
3      A    2  points     18
4      A    2 assists      7
5      B    1  points     22
6      B    1 assists      9
7      B    2  points     38
8      B    2 assists      4

We can use the pivot_wider() function to pivot this data frame into a wide format:

library(tidyr)

#pivot the data frame into a wide format
df %>% pivot_wider(names_from = stat, values_from = amount)

# A tibble: 4 x 4
  player  year points assists
  <chr>  <dbl>  <dbl>   <dbl>
1 A          1     14       6
2 A          2     18       7
3 B          1     22       9
4 B          2     38       4

Notice that the values from the stat column are now used as column names and the values from the amount column are used as cell values in these new columns.

The final result is a wide data frame.
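In the example above every player has both a points and an assists row for each year. When a combination is missing from the long data, pivot_wider() leaves an NA in the corresponding cell by default. The sketch below uses a small hypothetical data frame (df_gap, not part of the original example) and the values_fill argument to fill such gaps with 0 instead; it assumes tidyr 1.1.0 or later, where values_fill accepts a single value:

```r
library(tidyr)

#hypothetical long data frame where player B has no assists row
df_gap <- data.frame(player=c('A', 'A', 'B'),
                     stat=c('points', 'assists', 'points'),
                     amount=c(14, 6, 22))

#pivot to wide format, filling missing combinations with 0 instead of NA
df_wide <- df_gap %>% pivot_wider(names_from=stat,
                                  values_from=amount,
                                  values_fill=0)

#view the result
df_wide
```

Whether 0 is an appropriate fill value depends on the data; for counts it is often reasonable, while for measurements it may be better to keep the NA.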

Note: The complete documentation for the pivot_wider() function is available on the tidyr package website.
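For data like that in Example 2, the two functions reverse each other. As a rough check, the following sketch pivots the long data frame to wide format and then back again; the names df_wide and df_back are only illustrative:

```r
library(tidyr)

#recreate the long data frame from Example 2
df <- data.frame(player=rep(c('A', 'B'), each=4),
                 year=rep(c(1, 1, 2, 2), times=2),
                 stat=rep(c('points', 'assists'), times=4),
                 amount=c(14, 6, 18, 7, 22, 9, 38, 4))

#pivot to wide format
df_wide <- df %>% pivot_wider(names_from=stat, values_from=amount)

#pivot back to long format using the two new columns
df_back <- df_wide %>% pivot_longer(cols=c('points', 'assists'),
                                    names_to='stat',
                                    values_to='amount')

#compare the values in the round-tripped data with the original
all(df_back$stat == df$stat)
all(df_back$amount == df$amount)
```

The row order happens to match here because the original rows are already grouped by player and year with points listed before assists; with differently ordered data, the round trip would return the same rows but possibly in a different order.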

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Count Unique Values by Group in R
How to Count Non-NA Values in R
How to Create Relative Frequency Tables in R
