Creates a new column in your dataframe from the subset of list-column values that follow a certain pattern. For example, this is useful if you always apply labels with a set structure to a repository, e.g. key-value pairs like "priority:high", "priority:medium", and "priority:low", or other structures like "engagement-team", "teaching-team", etc. This function can create a new variable (e.g. "priority", "team") holding the values encoded within the labels.

listcol_extract(data, col_name, regex, new_col_name = NULL, keep_regex = FALSE)

Arguments

data

Dataframe containing a list column (e.g. an issues dataframe)

col_name

Character string containing column name of list column (e.g. labels_name or assignees_login)

regex

Character string of regular expression to identify list items of interest (e.g. "^priority:", "^(bug|feature)$")

new_col_name

Optional name of the new column. If NULL, the regex is used as the name, stripped of any leading or trailing punctuation

keep_regex

Optional logical denoting whether to keep regex part of matched item in value. Defaults to FALSE

Value

Dataframe with new column taking values extracted from list column
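
To make the arguments concrete, here is a minimal base R sketch of the behavior described above. It is not the package's implementation: sketch_listcol_extract is a hypothetical helper written for illustration, the toy issues_df is made up, and taking the first match is an assumption of this sketch only.

```r
# Hypothetical re-implementation for illustration only; not the package's code.
sketch_listcol_extract <- function(data, col_name, regex,
                                   new_col_name = NULL, keep_regex = FALSE) {
  # Default column name: the regex stripped of leading/trailing punctuation
  if (is.null(new_col_name)) {
    new_col_name <- gsub("^[[:punct:]]+|[[:punct:]]+$", "", regex)
  }
  values <- vapply(data[[col_name]], function(items) {
    hits <- grep(regex, items, value = TRUE)
    if (length(hits) == 0) return(NA_character_)
    hit <- hits[1]  # assumption of this sketch: take the first match
    # Either keep the full item or drop the part matched by the regex
    if (keep_regex) hit else sub(regex, "", hit)
  }, character(1))
  data[[new_col_name]] <- values
  data
}

# Toy dataframe with a list column of labels (made-up data)
issues_df <- data.frame(number = 1:3)
issues_df$labels_name <- list(
  c("priority:high", "engagement-team"),
  c("teaching-team"),
  character(0)
)

team_df <- sketch_listcol_extract(issues_df, "labels_name", "-team$")
team_df$team

priority_df <- sketch_listcol_extract(issues_df, "labels_name", "^priority:",
                                      keep_regex = TRUE)
priority_df$priority
```

In this sketch, rows with no matching item get NA, and the default column names come out as "team" and "priority" because the punctuation around the regex is stripped.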

Details

This function works only if each observation contains at most one instance of a given pattern. When multiple labels match the same pattern, one is returned at random.
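
Because of this restriction, it can help to check for ambiguous rows before extracting. The following is a rough base R sketch, assuming a toy issues_df with a labels_name list column; the variable names n_matches and ambiguous are illustrative and not part of the package.

```r
# Toy dataframe with a list column of labels (made-up data)
issues_df <- data.frame(number = 1:2)
issues_df$labels_name <- list(
  c("priority:high", "priority:low", "bug"),
  c("priority:medium")
)

# Count how many items in each row match the pattern of interest
n_matches <- vapply(issues_df$labels_name,
                    function(items) sum(grepl("^priority:", items)),
                    integer(1))

# Issue numbers with more than one match, where the extracted value
# would not be well defined
ambiguous <- issues_df$number[n_matches > 1]
ambiguous
```

Rows flagged this way can be relabeled or handled separately before calling listcol_extract.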

Examples

if (FALSE) {
  issues <- get_issues(repo)
  issues_df <- parse_issues(issues)
  listcol_extract(issues_df, "labels_name", "-team$")
}