1.0 | 2.0 | 3.0 |
---|---|---|
loud | complaint | problems |
pain | stress | confused |
dull | pain | stress |
this is my data set and I would like to reorganize the rows so that if there is a word that appears in each column it is transferred to a corresponding row. For example
1.0 | 2.0 | 3.0 |
---|---|---|
loud | NA | NA |
pain | pain | NA |
dull | NA | NA |
NA | complaint | NA |
NA | stress | stress |
NA | NA | confused |
NA | NA | problems |
etc., so that every word gets its own row and if the word in that row appears in the column it is matched up with the other columns
I have been looking how to do this but cannot find good code. One idea i was given was create a list of all appearing words and then try to match them to each column but still havent found code for this.
4 Answers
4
Reset to default
Highest score (default)
Trending (recent votes count more)
Date modified (newest first)
Date created (oldest first)
We may use match
in base R
– get the unique
elements from the unlist
ed data as a vector
, loop over the columns, get the index of the matching elements, and replace
the index with the matched elements and convert to a data.frame after taking care of the lengths
v1 <- unique(unlist(df1))
lst1 <- lapply(df1, (x)
{i1 <- match(x, v1)
replace(rep(NA, max(i1)), i1, v1[i1])
})
list2DF(lapply(lst1, `length<-`, max(lengths(lst1))))
1.0 2.0 3.0
1 loud <NA> <NA>
2 pain pain <NA>
3 dull <NA> <NA>
4 <NA> complaint <NA>
5 <NA> stress stress
6 <NA> <NA> problems
7 <NA> <NA> confused
data
df1 <- structure(list(`1.0` = c("loud", "pain", "dull"), `2.0` = c("complaint",
"stress", "pain"), `3.0` = c("problems", "confused", "stress"
)), class = "data.frame", row.names = c(NA, -3L))
1
-
1
The second lapply in
list2DF
isn't necessary; you can just use thenrow
argument:list2DF(lst1, nrow = max(lengths(lst1))
– rps122721 hours ago
Here is a tidyverse
version.
suppressMessages(library(tidyverse))
x = tibble(`1.0` = c("loud", "pain", "dull"),
`2.0` = c("complaint", "stress", "pain"),
`3.0` = c("problems", "confused", "stress"))
x %>%
gather("version", "value") %>%
mutate(id = value) %>%
spread(version, value) %>%
select(-id)
#> # A tibble: 7 x 3
#> `1.0` `2.0` `3.0`
#> <chr> <chr> <chr>
#> 1 <NA> complaint <NA>
#> 2 <NA> <NA> confused
#> 3 dull <NA> <NA>
#> 4 loud <NA> <NA>
#> 5 pain pain <NA>
#> 6 <NA> <NA> problems
#> 7 <NA> stress stress
Created on 2023-04-11 by the reprex package (v2.0.0)
If you need to arrange the rows in the order of occurrence you can change the second statement to
mutate(id = fct_inorder(value)) %>%
NOTE: The functions gather
and spread
are superseded by pivot_longer
and pivot_wider
functions. In my opinion, the older are a bit easier to use and are good enough for this case. The new functions are much more powerful.
Fast + efficient data.table solution:
x <- data.table(`1.0` = c("loud", "pain", "dull"),
`2.0` = c("complaint", "stress", "pain"),
`3.0` = c("problems", "confused", "stress"))
dcast(unique(melt(x, measure.vars = names(x))), value ~ variable)
value 1.0 2.0 3.0
1: complaint <NA> complaint <NA>
2: confused <NA> <NA> confused
3: dull dull <NA> <NA>
4: loud loud <NA> <NA>
5: pain pain pain <NA>
6: problems <NA> <NA> problems
7: stress <NA> stress stress
Here is a base R option using stack
+ reshape
reshape(
transform(stack(df), v = values),
direction = "wide",
idvar = "values",
timevar = "ind"
)[-1]
which gives
v.x1 v.x2 v.x3
1 loud <NA> <NA>
2 pain pain <NA>
3 dull <NA> <NA>
4 <NA> complaint <NA>
5 <NA> stress stress
7 <NA> <NA> problems
8 <NA> <NA> confused
Data
> dput(df)
structure(list(x1 = c("loud", "pain", "dull"), x2 = c("complaint",
"stress", "pain"), x3 = c("problems", "confused", "stress")), class = "data.frame", row.names = c(NA,
-3L))
Your Answer
Michael Allen is a new contributor. Be nice, and check out our Code of Conduct.
Sign up or log in
Post as a guest
Required, but never shown
Post as a guest
Required, but never shown
By clicking “Post Your Answer”, you agree to our terms of service, privacy policy and cookie policy
Not the answer you're looking for? Browse other questions tagged
or ask your own question.
or ask your own question.
|