edsnlp.pipes.core.normalizer.factory
create_component
Normalisation pipeline. Modifies the `NORM` attribute, acting on five dimensions:

- lowercase: using the default `NORM`
- accents: deterministic and fixed-length normalisation of accents.
- quotes: deterministic and fixed-length normalisation of quotation marks.
- spaces: "removal" of space tokens (via the `tag_` attribute).
- pollution: "removal" of pollutions (via the `tag_` attribute).
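To make "deterministic and fixed-length" concrete, here is a minimal standard-library sketch. It is not edsnlp's implementation: the character tables and the `normalise` helper are hypothetical, chosen only to show that each accented character or quotation mark maps to exactly one replacement character, so character offsets in the normalised text still line up with the original.

```python
# Illustrative sketch only: these tables are assumptions, not edsnlp's own.
# Each source character maps to exactly one target character (fixed length).
ACCENTS = str.maketrans("àâäçéèêëîïôöùûüÿ", "aaaceeeeiioouuuy")
QUOTES = str.maketrans(
    {"«": '"', "»": '"', "“": '"', "”": '"', "‘": "'", "’": "'"}
)


def normalise(text: str) -> str:
    """Lowercase, then replace accents and quotes one character at a time."""
    return text.lower().translate(ACCENTS).translate(QUOTES)


original = "Le patient a été « hospitalisé »"
normalised = normalise(original)
print(normalised)
```

Because the mapping is one-to-one, a match found at a given offset in the normalised string points at the same span in the original text.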
Parameters
| PARAMETER | DESCRIPTION |
|---|---|
| nlp | The pipeline object. |
| name | The component name. |
| lowercase | Whether to remove case. |
| accents | |
| quotes | |
| spaces | |
| pollution | Optional |
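The parameters above could be passed as a configuration dictionary when adding the component to a pipeline. The sketch below rests on several assumptions not stated on this page: that the component is registered as `eds.normalizer`, that `edsnlp.blank("eds")` creates an empty pipeline, and that each option accepts a plain boolean. `NORMALIZER_CONFIG` and `build_pipeline` are hypothetical names used for illustration.

```python
# Hypothetical configuration: one entry per normalisation dimension.
# Assumes each option accepts a boolean; check the library's documentation.
NORMALIZER_CONFIG = dict(
    lowercase=True,
    accents=True,
    quotes=True,
    spaces=True,
    pollution=False,
)


def build_pipeline():
    """Build a pipeline with the normaliser (requires edsnlp to be installed)."""
    import edsnlp  # imported lazily so the config above can be inspected alone

    nlp = edsnlp.blank("eds")  # assumed entry point for an empty pipeline
    nlp.add_pipe("eds.normalizer", config=NORMALIZER_CONFIG)  # assumed name
    return nlp
```

Calling `build_pipeline()` and running the result on a text should leave each token's `norm_` attribute holding the normalised form, provided the assumptions above hold.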