unname coder's blog

Posts

Showing posts with the label dplyr

Using the dplyr library in R to “print” the name of the non-NA columns

- May 11, 2021

Here is my data frame:

    a <- data.frame(id = c(rep("A", 2), rep("B", 2)),
                    x = c(rep(2, 2), rep(3, 2)),
                    p.ABC = c(1, NA, 1, 1),
                    p.DEF = c(NA, 1, NA, NA),
                    p.TAR = c(1, NA, 1, 1),
                    p.REP = c(NA, 1, 1, NA),
                    p.FAR = c(NA, NA, 1, 1))

I want to create a new character column (using mutate() in the dplyr library in R) that lists, for each row, the names of the columns that have a non-NA value (here the non-NA value is always 1). However, it should only search among the columns that start with "p.", sort the names alphabetically, and then concatenate them using "_" as a separator. You can find below the desired result, under the column cal...
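A minimal sketch of one way this could be approached, not necessarily the answer given in the original post. The output column name non_na_cols is a placeholder chosen here (the real name is cut off in the excerpt), and keeping the "p." prefix in the result is an assumption, since the desired output is not visible.

```r
library(dplyr)

a <- data.frame(id = c(rep("A", 2), rep("B", 2)),
                x = c(rep(2, 2), rep(3, 2)),
                p.ABC = c(1, NA, 1, 1),
                p.DEF = c(NA, 1, NA, NA),
                p.TAR = c(1, NA, 1, 1),
                p.REP = c(NA, 1, 1, NA),
                p.FAR = c(NA, NA, 1, 1))

# Columns whose names start with "p.", sorted alphabetically up front
p_cols <- sort(grep("^p\\.", names(a), value = TRUE))

result <- a %>%
  mutate(
    # non_na_cols is a hypothetical column name.
    # is.na() on the selected columns gives a logical matrix; each row of
    # its negation marks which of p_cols hold a value in that row.
    non_na_cols = apply(
      !is.na(across(all_of(p_cols))), 1,
      function(keep) paste(p_cols[keep], collapse = "_")
    )
  )

result$non_na_cols
```

Sorting the column names once before selecting them means each row's names come out already ordered, so no per-row sort is needed.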

Extract rows where value appears in any of multiple columns

- April 30, 2021

Let's say I have two data.frames:

    name_df = read.table(text = "player_name
    a
    b
    c
    d
    e
    f
    g", header = T)

    game_df = read.table(text = "game_id winner_name loser_name
    1 a b
    2 b a
    3 a c
    4 a d
    5 b c
    6 c d
    7 d e
    8 e f
    9 f a
    10 g f
    11 g a
    12 f e
    13 a d", header = T)

name_df contains a unique list of all the winner_name or loser_name values in game_df. I want to create a new data.frame that has, for each person in name_df, a row whenever that name (e.g. a) appears in either the winner_name or loser_name column. So I essentially want to merge game_df with name_df, but the key column (name) can appear in either winner_name or loser_name. So, for just a and b the final output would look something like: final_df = read.table(text = ...
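As a rough sketch under stated assumptions, one option is to join name_df to game_df twice, once per key column, and stack the results. The exact layout of the desired final_df is cut off in the excerpt, so the output shape here (one row per player per game, with both name columns kept) is a guess rather than the poster's target. The data is rebuilt with data.frame() instead of read.table() only to keep the block compact.

```r
library(dplyr)

name_df <- data.frame(
  player_name = c("a", "b", "c", "d", "e", "f", "g"),
  stringsAsFactors = FALSE
)

game_df <- data.frame(
  game_id = 1:13,
  winner_name = c("a", "b", "a", "a", "b", "c", "d", "e", "f", "g", "g", "f", "a"),
  loser_name  = c("b", "a", "c", "d", "c", "d", "e", "f", "a", "f", "a", "e", "d"),
  stringsAsFactors = FALSE
)

# Match each player against winner_name, then against loser_name.
# keep = TRUE retains both key columns so every row still shows the full game.
final_df <- bind_rows(
  inner_join(name_df, game_df, by = c("player_name" = "winner_name"), keep = TRUE),
  inner_join(name_df, game_df, by = c("player_name" = "loser_name"), keep = TRUE)
) %>%
  arrange(player_name, game_id)

# Rows for players a and b only
filter(final_df, player_name %in% c("a", "b"))
```

Because every game has exactly one winner and one loser, each game shows up twice in final_df, once under each participant.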

Problem using rowwise() to count the number of NA's in each row of a dataframe

- April 30, 2021

I'm having trouble using rowwise() to count the number of NAs in each row. My minimal example:

    df <- data.frame(Q1 = c(rep(1, 1), rep(NA, 9)),
                     Q2 = c(rep(2, 2), rep(NA, 8)),
                     Q3 = c(rep(3, 3), rep(NA, 7)))
    df
       Q1 Q2 Q3
    1   1  2  3
    2  NA  2  3
    3  NA NA  3
    4  NA NA NA
    5  NA NA NA
    6  NA NA NA
    7  NA NA NA
    8  NA NA NA
    9  NA NA NA
    10 NA NA NA

I would like to create a new column that counts the number of NAs in each row. I can do this very simply by writing

    df$Count_NA <- rowSums(is.na(df))
    df
       Q1 Q2 Q3 Count_NA
    1   1  2  3        0
    2  NA  2  3        1
    3  NA NA  3        2
    4  NA NA NA        3
    5  NA NA NA        3
    6  NA NA NA        3
    7  NA NA NA        3
    8  NA NA NA        3
    9  NA NA NA        3
    10 NA NA NA        3

B...
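For comparison, here is a minimal sketch of how the same count might be written with rowwise(). It assumes dplyr 1.0.0 or later, where c_across() is available; the excerpt is cut off before the poster's own attempt, so this is not a correction of their code.

```r
library(dplyr)

df <- data.frame(Q1 = c(rep(1, 1), rep(NA, 9)),
                 Q2 = c(rep(2, 2), rep(NA, 8)),
                 Q3 = c(rep(3, 3), rep(NA, 7)))

counted <- df %>%
  rowwise() %>%
  # c_across() gathers the current row's values into one vector,
  # so is.na() and sum() operate on that single row
  mutate(Count_NA = sum(is.na(c_across(everything())))) %>%
  ungroup()

counted$Count_NA
```

The rowSums(is.na(df)) version from the question is vectorised and will usually be faster; rowwise() mainly pays off when the per-row computation has no vectorised equivalent.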

Canonical tidyverse method to update some values of a vector from a look-up table

- April 30, 2021

I frequently need to recode some (not all!) values in a data frame column based on a look-up table. I'm not satisfied with the ways I know of to solve the problem. I'd like to be able to do it in a clear, stable, and efficient way. Before I write my own function, I want to make sure I'm not duplicating something standard that's already out there.

    ## Toy example
    data = data.frame(
      id = 1:7,
      x = c("A", "A", "B", "C", "D", "AA", ".")
    )
    lookup = data.frame(
      old = c("A", "D", "."),
      new = c("a", "d", "!")
    )

    ## desired result
    #   id  x
    # 1  1  a
    # 2  2  a
    # 3  3  B
    # 4  4  C
    # 5  5  d
    # 6  6 AA
    # 7  7  !

I c...
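One commonly used pattern, sketched here as an illustration and not as the canonical answer the post is asking about, is to left-join the look-up table and fall back to the original value where no match exists. The name recoded is a placeholder introduced for this example.

```r
library(dplyr)

data <- data.frame(
  id = 1:7,
  x = c("A", "A", "B", "C", "D", "AA", "."),
  stringsAsFactors = FALSE
)
lookup <- data.frame(
  old = c("A", "D", "."),
  new = c("a", "d", "!"),
  stringsAsFactors = FALSE
)

recoded <- data %>%
  # Rows without a match in lookup get NA in the new column
  left_join(lookup, by = c("x" = "old")) %>%
  # Prefer the looked-up value; keep the original x where new is NA
  mutate(x = coalesce(new, x)) %>%
  select(-new)

recoded
```

This relies on old being unique in lookup; duplicated keys would multiply rows in the join. Setting stringsAsFactors = FALSE keeps x and new as character vectors so that coalesce() sees matching types on R versions before 4.0 as well.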