Extract new dataframe column from list-column matching pattern

Creates a new column in your dataframe from the subset of list-column values that follow a certain pattern. This is useful when you consistently apply labels with a set structure to a repository, e.g. key-value pairs like "priority:high", "priority:medium", and "priority:low", or other structures like "engagement-team", "teaching-team", etc. This function can create a new variable (e.g. "priority", "team") holding the values encoded within the labels.

listcol_extract(data, col_name, regex, new_col_name = NULL, keep_regex = FALSE)

Arguments

data	Dataframe containing a list column (e.g. an issues dataframe)
col_name	Character string containing column name of list column (e.g. `labels_name` or `assignees_login`)
regex	Character string of regular expression to identify list items of interest (e.g. `"^priority:"`, `"^(bug|feature)$"`)
new_col_name	Optional name of new column. Otherwise `regex` is used, stripped of any leading or trailing punctuation
keep_regex	Optional logical denoting whether to keep regex part of matched item in value. Defaults to `FALSE`
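
To make the `new_col_name` and `keep_regex` descriptions concrete, the base R sketch below mimics the described behaviour on a single label. It is an illustration under assumptions, not the package's implementation: `default_col_name` and `extract_value` are hypothetical helpers, and the exact stripping rules used internally may differ.

```r
# Hypothetical helper: derive a column name by stripping leading and
# trailing punctuation from the regex, as the `new_col_name` default describes.
default_col_name <- function(regex) {
  gsub("^[[:punct:]]+|[[:punct:]]+$", "", regex)
}

# Hypothetical helper: return the matched label either whole
# (keep_regex = TRUE) or with the part matching the regex removed.
extract_value <- function(label, regex, keep_regex = FALSE) {
  if (keep_regex) label else sub(regex, "", label)
}

default_col_name("^priority:")
extract_value("priority:high", "^priority:")
extract_value("priority:high", "^priority:", keep_regex = TRUE)
```

Under these assumptions, the pattern `"^priority:"` would give a column called `priority` whose value for the label "priority:high" is "high", or the full label when `keep_regex = TRUE`.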

Value

Dataframe with new column taking values extracted from list column

Details

This function works only if each observation contains at most one instance of a given pattern. When multiple labels in the same observation match the pattern, one of them is returned at random.
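
Because of this restriction, it can help to check for ambiguous rows before extracting. The following is a minimal base R sketch on toy data; the dataframe contents and the names `issues_df`, `number`, and `n_matches` are assumptions made for illustration, not output of this package.

```r
# Toy data standing in for an issues dataframe with a list column.
issues_df <- data.frame(number = 1:3)
issues_df$labels_name <- list(
  c("bug", "priority:high"),
  c("priority:low", "priority:medium"),  # two matches: ambiguous
  character(0)                           # no labels at all
)

# Count how many labels in each row match the pattern.
n_matches <- vapply(
  issues_df$labels_name,
  function(labels) sum(grepl("^priority:", labels)),
  integer(1)
)

# Issue numbers whose extracted value would be ambiguous.
issues_df$number[n_matches > 1]
```

Rows flagged this way can be cleaned up or handled separately before calling `listcol_extract()`.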

Examples

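if (FALSE) {
# `repo` is assumed to be an existing repository reference
issues <- get_issues(repo)
issues_df <- parse_issues(issues)
# extract labels ending in "-team" (e.g. "engagement-team") into a new column
listcol_extract(issues_df, "labels_name", "-team$")
}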