edsnlp.processing.deprecated_pipe

slugify [source]

Slugify a chained attribute name

Parameters

PARAMETER DESCRIPTION
chained_attr

The string to slugify (dots are replaced by underscores)

TYPE: str

RETURNS DESCRIPTION
str

The slugified string
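A minimal sketch of the documented behavior. This re-implements the dot-to-underscore replacement for illustration only; the function name and signature mirror the docs above, but the body is an assumption, not the library's actual implementation.

```python
# Illustrative re-implementation: slugify a chained attribute name
# by replacing every dot with an underscore.
def slugify(chained_attr: str) -> str:
    return chained_attr.replace(".", "_")

print(slugify("note._.negation"))
```

This is useful, for example, to turn a spaCy extension path such as `note._.negation` into a valid dataframe column name.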

pipe [source]

Helper to process a pandas, koalas or spark dataframe. This function is deprecated. Prefer using the following instead:

import edsnlp

docs = edsnlp.data.from_***(
    df,
    converter='omop',
    doc_attributes=context,
)
docs = docs.map_pipeline(nlp)
res = edsnlp.data.to_***(
    docs,
    converter='ents',  # or custom extractor
    span_getter="ents",
    span_attributes=span_attributes,
    **kwargs
)

You can also call this function to get a migration suggestion.

Parameters

PARAMETER DESCRIPTION
df

The dataframe to process, can be a pandas, spark or koalas dataframe

TYPE: Union[pandas.DataFrame, pyspark.sql.DataFrame, databricks.koalas.DataFrame]

nlp

The pipeline to use

TYPE: PipelineProtocol

n_jobs

Number of CPU workers to use

TYPE: int DEFAULT: -2

context

List of context attributes to keep

TYPE: List[str] DEFAULT: []

results_extractor

Function to extract results from the pipeline. Defaults to one row per entity.

TYPE: Optional[Callable[[Doc], List[Dict[str, Any]]]] DEFAULT: None

additional_spans

Additional span groups to keep; defaults to ents (doc.ents)

TYPE: SpanGetterArg DEFAULT: []

extensions

Span extensions to export as columns. Can be a list of extension names, a dict mapping extension names to types, or a single string.

TYPE: ExtensionSchema DEFAULT: []

dtypes

Spark schema to use for the output dataframe. This is only used if the input dataframe is a spark dataframe.

TYPE: Any DEFAULT: None

kwargs

Additional keyword arguments to pass to the edsnlp.data.to_* function

DEFAULT: {}

RETURNS DESCRIPTION
Union[pandas.DataFrame, pyspark.sql.DataFrame, databricks.koalas.DataFrame]

The processed dataframe
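To make the "one row per entity" default concrete, here is an illustrative sketch of the output shape using plain pandas with hypothetical data (no pipeline is run; the column names follow the `note_id` / `lexical_variant` convention assumed here).

```python
# Sketch of the default extraction shape: a single note containing two
# matched entities produces two rows in the resulting dataframe.
import pandas as pd

rows = [
    {"note_id": 1, "lexical_variant": "chest pain", "label": "symptom"},
    {"note_id": 1, "lexical_variant": "fever", "label": "symptom"},
]
res = pd.DataFrame(rows)
print(res.shape)  # (2, 3)
```

A custom `results_extractor` replaces this default: it receives each processed `Doc` and returns the list of row dicts to emit for that document.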