edsnlp.pipes.core.contextual_matcher.contextual_matcher

ContextualMatcher

Bases: BaseNERComponent

Allows additional matching in the surrounding context of the main match group, for qualification/filtering.

Parameters

PARAMETER DESCRIPTION
nlp

spaCy Language object.

TYPE: PipelineProtocol DEFAULT: None

name

The name of the pipe

TYPE: Optional[str] DEFAULT: None

patterns
The patterns to match
PARAMETER DESCRIPTION
span_getter

A span getter to pick the assigned spans from already extracted entities in the doc.

TYPE: Optional[SpanGetterArg]

regex

A single Regex or a list of Regexes

TYPE: ListOrStr

regex_attr

An attribute that overrides the given attr when matching with regexes.

TYPE: Optional[str]

regex_flags

Regex flags

TYPE: Union[RegexFlag, int]

terms

A single term or a list of terms (for exact matches)

TYPE: ListOrStr

exclude
One or more exclusion patterns
PARAMETER DESCRIPTION
regex

A single Regex or a list of Regexes

TYPE: ListOrStr

regex_attr

An attribute that overrides the given attr when matching with regexes.

TYPE: Optional[str]

regex_flags

Regex flags

TYPE: RegexFlag

span_getter

A span getter to pick the assigned spans from already extracted entities.

TYPE: Optional[SpanGetterArg]

window

Context window to search for patterns around the anchor. Defaults to "sent" (i.e. the sentence of the anchor span).

TYPE: Optional[ContextWindow]

TYPE: AsList[SingleExcludeModel]

include
One or more inclusion patterns
PARAMETER DESCRIPTION
regex

A single Regex or a list of Regexes

TYPE: ListOrStr

regex_attr

An attribute that overrides the given attr when matching with regexes.

TYPE: Optional[str]

regex_flags

Regex flags

TYPE: RegexFlag

span_getter

A span getter to pick the assigned spans from already extracted entities.

TYPE: Optional[SpanGetterArg]

window

Context window to search for patterns around the anchor. Defaults to "sent" (i.e. the sentence of the anchor span).

TYPE: Optional[ContextWindow]

TYPE: AsList[SingleIncludeModel]

assign
One or more assignment patterns
PARAMETER DESCRIPTION
span_getter

A span getter to pick the assigned spans from already extracted entities in the doc.

TYPE: Optional[SpanGetterArg]

regex

A single Regex or a list of Regexes

TYPE: ListOrStr

regex_attr

An attribute that overrides the given attr when matching with regexes.

TYPE: Optional[str]

regex_flags

Regex flags

TYPE: RegexFlag

window

Context window to search for patterns around the anchor. Defaults to "sent" (i.e. the sentence of the anchor span).

TYPE: Optional[ContextWindow]

replace_entity

If set to True, the match from the corresponding assign key will be used as the entity, instead of the main match.

TYPE: Optional[bool]

reduce_mode

Set how multiple assign matches are handled. See the documentation of the reduce_mode parameter

TYPE: Optional[Flags]

required

If set to True, the assign key must match for the extraction to be kept. If it does not match, the extraction is discarded.

TYPE: Optional[bool]

TYPE: AsList[SingleAssignModel]

source

A label describing the pattern

TYPE: str

TYPE: FullConfig

assign_as_span

Whether to store any extractions defined via the assign key as Spans rather than as strings.

TYPE: bool DEFAULT: False

attr

Attribute to match on, e.g. TEXT, NORM, etc.

TYPE: str DEFAULT: NORM

ignore_excluded

Whether to skip excluded tokens during matching.

TYPE: bool DEFAULT: False

ignore_space_tokens

Whether to skip space tokens during matching.

TYPE: bool DEFAULT: False

alignment_mode

Overwrite alignment mode.

TYPE: str DEFAULT: expand

regex_flags

Regex flags to use when matching, filtering and assigning (see the flags documented in Python's re module).

TYPE: Union[RegexFlag, int] DEFAULT: 0

include_assigned

Whether to include any assign matches in the final entity.

TYPE: bool DEFAULT: False

label_name

Deprecated, use label instead. The label to assign to the matched entities

TYPE: Optional[str] DEFAULT: None

label

The label to assign to the matched entities

TYPE: str DEFAULT: None

span_setter

How to set matches on the doc

TYPE: SpanSetterArg DEFAULT: {'ents': True}
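To illustrate how the nested options of the patterns parameter fit together, here is a minimal sketch of one pattern written as a plain Python dictionary. All regexes, the source name and the label are invented for the example. The name key inside assign is an assumption: it is not listed in the table above, but since ent._.assigned is described as a dictionary, each assignment presumably needs a key. The commented line at the end shows how such a configuration might be passed to the pipe, assuming it is registered as eds.contextual_matcher.

```python
import re

# Illustrative configuration only: all regexes and names below are invented.
patterns = dict(
    source="Main Cancer",            # label describing the pattern
    regex=[r"cancer", r"tumeur"],    # anchor expressions (main match)
    regex_attr="NORM",               # match these regexes on NORM
    exclude=[
        # Drop the match if a benign mention occurs in the same sentence.
        dict(regex=r"b[eé]nin", window="sent"),
    ],
    include=[
        # Keep the match only if an anatomical site is mentioned nearby.
        dict(regex=[r"poumon", r"sein"], window="sent"),
    ],
    assign=[
        dict(
            name="stage",                  # assumed key (see the note above)
            regex=r"stade (IV|I{1,3})",    # the capture group is the value
            window="sent",
            replace_entity=False,
            required=False,
        ),
    ],
)


def iter_regexes(config):
    """Yield every regex string found in one pattern configuration."""
    for key in ("exclude", "include", "assign"):
        for entry in config.get(key, []):
            rx = entry["regex"]
            yield from ([rx] if isinstance(rx, str) else rx)
    yield from config["regex"]


# Sanity check: every regex in the configuration compiles.
compiled = [re.compile(rx) for rx in iter_regexes(patterns)]

# Hypothetical usage, assuming an `nlp` pipeline that sets sentence boundaries:
# nlp.add_pipe("eds.contextual_matcher", config=dict(patterns=patterns, label="cancer"))
```

Compiling the regexes up front is only a convenience for catching typos before the pipeline is built; it is not something the component requires.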

set_extensions

Define the extensions used by the component

filter_one

Filters an extracted entity using the exclusion and inclusion filters of the configuration.

Parameters

PARAMETER DESCRIPTION
span

Span to filter

TYPE: Span

RETURNS DESCRIPTION
Optional[Span]

None if the span was filtered out, otherwise the span
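The filtering step can be pictured with a small standalone sketch that works on plain strings instead of spaCy spans. It is not the component's implementation: the function name filter_context is hypothetical, and the semantics are an assumption (any exclusion hit discards the match, and every inclusion pattern must be found in the context).

```python
import re
from typing import Iterable, Optional


def filter_context(
    match: str,
    context: str,
    include: Iterable[str] = (),
    exclude: Iterable[str] = (),
) -> Optional[str]:
    """Return `match` if it survives the filters, else None.

    Assumed semantics (not taken from the library's source): any exclusion
    hit discards the match, and every inclusion pattern must be found.
    """
    if any(re.search(rx, context) for rx in exclude):
        return None
    if not all(re.search(rx, context) for rx in include):
        return None
    return match


sentence = "cancer du poumon suspect, nodule bénin par ailleurs"

# Kept: the inclusion pattern is found in the sentence.
kept = filter_context("cancer", sentence, include=[r"poumon"])

# Dropped: the exclusion pattern is found in the sentence.
dropped = filter_context("cancer", sentence, exclude=[r"b[eé]nin"])
```

Returning None for a rejected match mirrors the Optional[Span] return type documented above.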

assign_one

Get additional information in the context of each entity. This function will populate two custom attributes:

  • ent._.source
  • ent._.assigned, a dictionary with all retrieved information

Parameters

PARAMETER DESCRIPTION
span

Span to enrich

TYPE: Span

RETURNS DESCRIPTION
List[Span]

Spans with additional information
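As a rough picture of what the assignment step collects, the sketch below gathers capture groups from a context string into a dictionary, in the spirit of ent._.assigned. It is a simplification on plain strings, not the component's code: assign_context is a hypothetical helper, the name key is an assumption, and every match is kept in a list rather than being reduced with reduce_mode.

```python
import re
from typing import Dict, List, Optional


def assign_context(context: str, assign: List[dict]) -> Optional[Dict[str, List[str]]]:
    """Collect the first capture group of every assign regex found in `context`.

    Returns None when an entry flagged `required` finds nothing, mirroring
    the documented behaviour of discarding the extraction.
    """
    assigned: Dict[str, List[str]] = {}
    for entry in assign:
        values = [m.group(1) for m in re.finditer(entry["regex"], context)]
        if not values and entry.get("required", False):
            return None
        if values:
            assigned[entry["name"]] = values
    return assigned


sentence = "cancer du poumon de stade II"
rules = [
    dict(name="stage", regex=r"stade (IV|I{1,3})"),
    dict(name="grade", regex=r"grade (\d)", required=True),
]

# Only the optional "stage" rule: the stage is extracted.
with_stage = assign_context(sentence, rules[:1])

# Adding a required rule that does not match discards the extraction.
discarded = assign_context(sentence, rules)
```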

process_one

Processes one span, applying both the filters and the assignments

Parameters

PARAMETER DESCRIPTION
span

Span object

TYPE: Span

pattern

TYPE: SingleConfig

YIELDS DESCRIPTION
span

Filtered spans, with optional assignments
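Going by the signatures documented on this page (a filter returning an optional span, an assignment step returning a list of spans, and a generator on top), the control flow can be sketched as follows. Dictionaries stand in for spaCy spans, the callables are toy replacements for the component's methods, and running the filter before the assignment is an assumption.

```python
from typing import Callable, Iterator, List, Optional

Match = dict  # stand-in for a spaCy Span in this sketch


def process_one(
    match: Match,
    filter_one: Callable[[Match], Optional[Match]],
    assign_one: Callable[[Match], List[Match]],
) -> Iterator[Match]:
    """Yield the enriched matches, or nothing if the filter rejects the input."""
    kept = filter_one(match)
    if kept is None:
        return
    yield from assign_one(kept)


# Toy callables standing in for the component's methods.
def reject_benign(m: Match) -> Optional[Match]:
    return None if "benign" in m["context"] else m


def add_stage(m: Match) -> List[Match]:
    return [{**m, "assigned": {"stage": "II"}}]


good = {"text": "cancer", "context": "stage II lung cancer"}
bad = {"text": "cancer", "context": "benign lesion, no cancer"}

results = list(process_one(good, reject_benign, add_stage))
nothing = list(process_one(bad, reject_benign, add_stage))
```

Writing the step as a generator means a rejected span simply produces no output, so callers can chain the results of many spans without checking for None.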

process

Process the document, looking for named entities.

Parameters

PARAMETER DESCRIPTION
doc

spaCy Doc object

TYPE: Doc

RETURNS DESCRIPTION
List[Span]

List of detected spans.

__call__

Adds spans to document.

Parameters

PARAMETER DESCRIPTION
doc

spaCy Doc object

TYPE: Doc

RETURNS DESCRIPTION
doc

spaCy Doc object, annotated for extracted terms.

TYPE: Doc