How to count the number of matched strings in R, when the string pattern to match is a column from another dataframe?

4

I have got two extremely large dataframes, the first data frame consists of a column body, which is a list of comments and the second one consists of names. I want to count how many elements in body contain each element of names. Here's a small reproducible dataset (the original dataset has about 2000 names, where each name is a name of the car):

df1 <- tibble(body = c("The Tesla Roadster has a range of 620 miles",
                       "ferrari needs to make an electric car",
                       "How much does a tesla cost?",
                       "When is the new Mercedes releasing?",
                       "Can't wait to get my hands on the new Tesla"))

df2 <- tibble(names = c("FORD", "TESLA", "MERCEDES", "FERRARI", "JAGUAR", "HYUNDAI"))

As mentioned above, I am trying to count the number of times each of the values in names occur in body and then preferably, I want to add it as a column in df2. I have tried it in the following way:

counter = c()
for (i in df2$names) {
  counter[i] = sum(ifelse(str_detect(df1$body, i),1, 0))
}

While this method works, it takes extremely long amounts of time and returns a vector where the names are attributes of the counter values, I then unstack it and join the dataframe to df2 using names as keys. This is the only method that works, apart from that I have tried using str_count but with my current proficiency in R, the code was absolutely horrible and got me nowhere.

Is there a more efficient way of finding the matching strings? I have tried to find similar questions on stack but to no avail!

Many Thanks in advance :)

Share
Improve this question
New contributor
krp007 is a new contributor to this site. Take care in asking for clarification, commenting, and answering. Check out our Code of Conduct.
0

Comments

Popular posts from this blog

Meaning of `{}` for return expression

Get current scroll position of ScrollView in React Native

React Native - Image Cache