How to 'Pivot Wider' when you have only character values

By Synnøve Yndestad in R

January 27, 2022

Reshaping your data by pivoting from long to wide, or wide to long is used frequently when wrangling your data.
The pivot_longer() and pivot_wider() function from the tidyr package does this job excellent in most cases. I have however had some issues when reshaping data containing only characters.

This is my solution to this issue.

# Generate a long data frame
Subject <- c("Ken", "Ken", "Ken", "Abby","Abby", "Nick", "Nick","Nick","Nick","Rose")
Likes <- c("Cake", "Fish", "Mushroom", "Cake", "Pie", "Cake", "Mushroom", "Fish", "Pie", "Cake")
df <- as.data.frame(cbind(Subject, Likes))

df
##    Subject    Likes
## 1      Ken     Cake
## 2      Ken     Fish
## 3      Ken Mushroom
## 4     Abby     Cake
## 5     Abby      Pie
## 6     Nick     Cake
## 7     Nick Mushroom
## 8     Nick     Fish
## 9     Nick      Pie
## 10    Rose     Cake

Now we have a long data frame with types of food different people like.

I want to “trick” pivot_longer() into pivoting my data even if I lack a value column. To do that, add a temporary value column n = 1

# Add a value column in order to use pivot_wider()
df$n <- 1
df
##    Subject    Likes n
## 1      Ken     Cake 1
## 2      Ken     Fish 1
## 3      Ken Mushroom 1
## 4     Abby     Cake 1
## 5     Abby      Pie 1
## 6     Nick     Cake 1
## 7     Nick Mushroom 1
## 8     Nick     Fish 1
## 9     Nick      Pie 1
## 10    Rose     Cake 1

Then we can pivot wider easily.

library(tidyverse)
df.w <- pivot_wider(df, id_cols = Subject,    # Specify id-column to keep
                        names_from = Likes,   # Column characters that go to column names
                        values_from = n)      # Value 

df.w
## # A tibble: 4 × 5
##   Subject  Cake  Fish Mushroom   Pie
##   <chr>   <dbl> <dbl>    <dbl> <dbl>
## 1 Ken         1     1        1    NA
## 2 Abby        1    NA       NA     1
## 3 Nick        1     1        1     1
## 4 Rose        1    NA       NA    NA

Now we see that everybody likes cake, but this is not the output we wanted. Time to replace the temporary “1” values with characters.

We have the hard way:
Rename the value = 1 for each column individually

# The hard way, not applicable to other datasets
df.w$Cake <- str_replace(df.w$Cake, "1", "Cake")
df.w$Fish <- str_replace(df.w$Fish, "1", "Fish")
df.w$Mushroom <- str_replace(df.w$Mushroom, "1", "Mushroom")
df.w$Pie <- str_replace(df.w$Pie, "1", "Pie")

And the easy way:
A more generic way to do this is to make a function that will iterate over each column and replace the value with the respective column name.

# Search for "1", replace with column names
for (i in 1:length(df.w)) {
  df.w[[i]] <- str_replace(df.w[[i]], "1", colnames(df.w)[i])
}

df.w
## # A tibble: 4 × 5
##   Subject Cake  Fish  Mushroom Pie  
##   <chr>   <chr> <chr> <chr>    <chr>
## 1 Ken     Cake  Fish  Mushroom <NA> 
## 2 Abby    Cake  <NA>  <NA>     Pie  
## 3 Nick    Cake  Fish  Mushroom Pie  
## 4 Rose    Cake  <NA>  <NA>     <NA>

Neat!

Does this also works backwards? Yes it does!

Go from Wide to Long:

# Search for column names, replace with "1"
for (i in 1:length(df.w)) {
  df.w[[i]] <- str_replace(df.w[[i]], colnames(df.w)[i], "1")   
}

# Pivot longer
df.l = df.w %>% pivot_longer(!Subject,                   # Select all but Subject columns to pivot
                             names_to = "Likes",         # Create the "Likes" column
                             values_to = "n",            # The value column
                             values_drop_na = TRUE) %>%  # Remember to drop the NA rows
                select(-n)                               # Remove the "n" column

df.l 
## # A tibble: 10 × 2
##    Subject Likes   
##    <chr>   <chr>   
##  1 Ken     Cake    
##  2 Ken     Fish    
##  3 Ken     Mushroom
##  4 Abby    Cake    
##  5 Abby    Pie     
##  6 Nick    Cake    
##  7 Nick    Fish    
##  8 Nick    Mushroom
##  9 Nick    Pie     
## 10 Rose    Cake

Now we are back where we started, a long data frame containing only character values.

Posted on:
January 27, 2022
Length:
3 minute read, 635 words
Categories:
R
Tags:
tidyverse pivot_wider() Categoricals
See Also:
Read and merge multiple files by folder
Plotting bar charts in R, geom_bar vs geom_col
For loop for Multiple Trend in Proportions