parse_by_speaker.Rd
This is a function used by `parse_cr`. It takes a vector of regular expressions (`speaker_list`) and a file path (`file`). It reads the text file with the `all_text` function, cleans the speaker list with the `escape_specials` function, and builds a regular expression of speaker names at which to split the text. Finally, it splits the text by speaker and returns a vector of speeches.
parse_by_speaker(speaker_list, file)
A path to a txt file to be parsed
A character vector of texts parsed by speaker
This is a helper function for the main user-facing function `cr_parse()`.
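The split-and-label steps described above can be illustrated on an in-memory string. This is only a minimal sketch, not the package's implementation: the sample text and speaker names are made up, the speaker list is assumed to be a single semicolon-delimited string, and the package helpers `all_text` and `escape_specials` are skipped (the names are escaped by hand instead).

```r
library(stringr)

# Hypothetical sample text standing in for the contents of `file`
text <- "Proceedings begin. Mr. SMITH. I rise today. Ms. JONES. I yield back."

# Hypothetical speaker list; assumed here to be one semicolon-delimited
# string whose regex special characters are already escaped
speaker_list <- "Mr\\. SMITH;Ms\\. JONES"

# Turn the ";" separators into "|" to build a regex alternation
speaker_pattern <- str_replace_all(speaker_list, ";", "|")

# Split the text at every speaker match; the first piece is whatever
# precedes the first speaker
pieces <- unlist(str_split(text, speaker_pattern))

# Collect the matched speaker names in the order they appear
extracted <- unlist(str_extract_all(text, speaker_pattern))

# Label the leading piece "header" and pair every label with its text
labels <- str_c(c("header", extracted), " :::")
speech <- paste(labels, pieces)

print(speech)
```

Because `paste()` is vectorized, this sketch yields a plain character vector with one element per speaker turn. The function shown in the example below pairs labels and text with `map2()`, which returns a list, so callers who need a character vector may have to `unlist()` its result.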
## The function is currently defined as
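function (speaker_list, file)
{
    # Clean the speaker list before using it as a regular expression
    speaker_list %<>% escape_specials()
    # Read the file and prefix the text with ":::"
    text <- all_text(file)
    text <- str_c(":::", text)
    # Turn ";" separators into "|" to form a regex alternation
    speaker_pattern <- speaker_list %>% str_replace_all(";", "|")
    # Split the text at each speaker match
    t <- text %>% str_split(speaker_pattern) %>% unlist()
    # Collect the matched speaker names, in order of appearance
    extracted <- text %>% str_extract_all(speaker_pattern) %>% unlist()
    # Label the leading chunk "header", then pair each label with its text
    s <- c("header", extracted) %>% str_c(" :::")
    speech <- map2(.x = s, .y = t, .f = paste)
    return(speech)
}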