edsnlp.pipes.ner.frailty.utils
make_assign_regex
Format a list of regexes into a single regex usable as an 'assign' parameter.
This function merges multiple regexes into a single 'OR' matching group that can be used as the 'assign' parameter of a contextual matcher's pattern.
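The merging idea can be illustrated with a minimal sketch using only the standard `re` module. The helper name `merge_regex_sketch` is hypothetical and is not the library's function; the use of one outer capturing group is also an assumption, since the exact shape of the group produced by make_assign_regex is not stated here.

```python
import re


def merge_regex_sketch(patterns):
    """Hypothetical helper: join several regexes into one 'OR' group."""
    # Each alternative gets its own non-capturing group so that a
    # top-level '|' inside one pattern does not leak into its neighbours.
    alternatives = "|".join(f"(?:{p})" for p in patterns)
    # Assumption: a single outer capturing group holds whichever
    # alternative matched.
    return f"({alternatives})"


merged = merge_regex_sketch([r"severe", r"mild|moderate", r"no\s+sign"])
compiled = re.compile(merged)

match = compiled.search("a mild case")
captured = match.group(1) if match else None
```

Wrapping each alternative separately is a defensive choice: it keeps the merged pattern correct even when one of the inputs already contains an alternation.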
make_status_assign
Create common 'assign' dictionaries.
The priority argument indicates whether the assign dict should take priority over the initial regex when determining the severity status.
make_include_dict_from_list
Merge several dictionaries into one suitable for an 'include'.
The typical use case is when multiple 'assigns' are possible and a match should be returned only when at least one of them is matched, i.e. the 'include' acts as an OR statement. If the 'include' were instead a list of dictionaries similar to the 'assign' list, it would act as an AND statement.
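The OR versus AND distinction described above can be sketched in plain Python. Everything here is illustrative: the helper `merge_into_single_include_sketch`, the dictionary keys "name", "regex" and "window", and the two checking functions are assumptions made for the example, not the library's actual signature or matching logic.

```python
import re


def merge_into_single_include_sketch(assign_dicts, window):
    """Hypothetical helper: OR the regexes of several dicts into one dict."""
    merged = "|".join(f"(?:{d['regex']})" for d in assign_dicts)
    # Assumption: an 'include' entry is described by a regex and a window.
    return {"regex": merged, "window": window}


assigns = [
    {"name": "severe", "regex": r"(severe)"},
    {"name": "mild", "regex": r"(mild)"},
]
include = merge_into_single_include_sketch(assigns, window=5)


def or_semantics(text):
    # One merged dict: a single alternative matching is enough.
    return re.search(include["regex"], text) is not None


def and_semantics(text):
    # A list of separate dicts: every regex must match.
    return all(re.search(d["regex"], text) for d in assigns)
```

With the text "a mild loss of autonomy", `or_semantics` accepts the match while `and_semantics` rejects it, which is the behavior difference the merged dictionary is meant to provide.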
normalize_space_characters
Normalize space characters in a regex.
This function keeps regex definitions human-readable while still handling spaces correctly. It is needed because, during development, some edge cases required the domain matchers to set the 'ignore_space_tokens' argument of their ContextualMatcher to False.
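One plausible way to picture this normalization is a sketch that rewrites each literal space in a readable pattern as a whitespace class. The helper `normalize_spaces_sketch` and the choice of `\s+` as the replacement are assumptions for illustration; the replacement actually used by normalize_space_characters is not stated here.

```python
import re


def normalize_spaces_sketch(regex, replacement=r"\s+"):
    """Hypothetical helper: swap each literal space for a whitespace class."""
    # Naive on purpose: spaces inside character classes such as [ -]
    # would be rewritten too, so this only suits simple patterns.
    return regex.replace(" ", replacement)


readable = r"loss of autonomy"
normalized = normalize_spaces_sketch(readable)

text = "loss  of\nautonomy"
matches_readable = re.search(readable, text) is not None
matches_normalized = re.search(normalized, text) is not None
```

The readable pattern fails on text containing a double space or a line break, whereas the normalized one still matches, which is the kind of robustness needed once space tokens are no longer ignored by the matcher.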