edsnlp.pipes.trainable.embeddings.text_cnn.text_cnn

TextCnnEncoder

Bases: WordContextualizerComponent

The eds.text_cnn component is a simple 1D convolutional network that contextualizes the word embeddings computed by the embedding component passed as an argument.

To stay memory efficient on batches of variable-length sequences, this module packs the sequences together, while making sure that convolution windows never mix tokens from different docs.
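The packing idea can be illustrated without any deep learning library. The sketch below is a simplified illustration, not edsnlp's actual implementation: the helper names `pack_with_gaps` and `sliding_sum` are hypothetical, one number stands in for each word embedding, and a plain window sum stands in for a learned convolution. Docs are concatenated into one flat sequence, separated by enough zeros that no window centred on a real token can reach a token from another doc.

```python
from typing import List, Tuple


def pack_with_gaps(
    docs: List[List[float]], max_kernel: int
) -> Tuple[List[float], List[Tuple[int, int]]]:
    """Concatenate docs into one flat list, with zero gaps between docs.

    A window of width k centred on a token reaches at most k // 2
    positions to either side, so a gap of max_kernel // 2 zeros is enough.
    """
    gap = max_kernel // 2
    flat: List[float] = []
    spans: List[Tuple[int, int]] = []
    for doc in docs:
        if flat:
            flat.extend([0.0] * gap)
        start = len(flat)
        flat.extend(doc)
        spans.append((start, len(flat)))
    return flat, spans


def sliding_sum(seq: List[float], kernel: int) -> List[float]:
    """Stand-in for a 1D convolution: sum over a window around each position.

    Positions outside the sequence count as zero ("same"-style padding).
    """
    left = (kernel - 1) // 2
    right = kernel - 1 - left
    out = []
    for i in range(len(seq)):
        lo = max(0, i - left)
        hi = min(len(seq), i + right + 1)
        out.append(sum(seq[lo:hi]))
    return out


docs = [[1.0, 2.0, 3.0], [10.0, 20.0]]

# Packed: one flat pass over all docs, then slice each doc back out.
flat, spans = pack_with_gaps(docs, max_kernel=3)
packed = [sliding_sum(flat, kernel=3)[start:end] for start, end in spans]

# Reference: process each doc on its own.
separate = [sliding_sum(doc, kernel=3) for doc in docs]

# Naive concatenation without gaps leaks the second doc into the first.
naive = sliding_sum(docs[0] + docs[1], kernel=3)[: len(docs[0])]
```

Here `packed` matches `separate`, while the last value of `naive` differs because its window overlaps the first token of the second doc.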

Parameters

PARAMETER DESCRIPTION
nlp

The pipeline object

TYPE: PipelineProtocol

name

The name of the component

TYPE: str

embedding

Embedding module to apply to the input

TYPE: TorchComponent[WordEmbeddingBatchOutput, BatchInput]

output_size

Size of the output embeddings. Defaults to the input_size.

TYPE: Optional[int] DEFAULT: None

out_channels

Number of channels

TYPE: int DEFAULT: None

kernel_sizes

Window size of each kernel

TYPE: Sequence[int] DEFAULT: (3, 4, 5)

activation

Activation function to use

TYPE: str DEFAULT: relu

residual

Whether to use residual connections

TYPE: bool DEFAULT: True

normalize

Whether to normalize before or after the residual connection

TYPE: Literal['pre', 'post', 'none'] DEFAULT: pre
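One way to read the normalize options, taken literally from the description above, is sketched below in plain Python. This is an interpretation, not code from edsnlp: `layer_norm` and `combine` are hypothetical helpers, `x` stands for the input embedding of one word and `y` for the convolution output for that word, and the layer normalization has no learned scale or bias.

```python
import math
from typing import List


def layer_norm(values: List[float], eps: float = 1e-5) -> List[float]:
    """Normalize a vector to zero mean and (roughly) unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def combine(x: List[float], y: List[float], normalize: str = "pre") -> List[float]:
    """Add the residual input x to the transformed vector y.

    Assumed semantics (one reading of the parameter description):
    - "pre":  normalize y, then add x
    - "post": add x and y, then normalize the sum
    - "none": add x and y without normalization
    """
    if normalize == "pre":
        return [a + b for a, b in zip(x, layer_norm(y))]
    if normalize == "post":
        return layer_norm([a + b for a, b in zip(x, y)])
    if normalize == "none":
        return [a + b for a, b in zip(x, y)]
    raise ValueError(f"Unknown normalize mode: {normalize!r}")


x = [1.0, 2.0, 3.0, 4.0]
y = [0.5, -0.5, 1.5, -1.5]
outputs = {mode: combine(x, y, mode) for mode in ("pre", "post", "none")}
```

Under this reading, "post" keeps the output at a fixed scale regardless of the input, while "pre" leaves the residual path untouched so the input embedding passes through unchanged.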

forward

Encode embeddings with a 1D convolutional network.

Parameters

PARAMETER DESCRIPTION
batch
  • embeddings: embeddings of shape (batch_size, seq_len, input_size)
  • mask: mask of shape (batch_size, seq_len)

TYPE: BatchInput

RETURNS DESCRIPTION
WordEmbeddingBatchOutput
  • embeddings: encoded embeddings of shape (batch_size, seq_len, output_size), where output_size defaults to input_size
  • mask: (same) mask of shape (batch_size, seq_len)
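As a reading aid for the shapes above, the following sketch builds a padded batch and its mask from variable-length docs using nested lists in place of tensors. It is illustrative only: `pad_batch` is a hypothetical helper, and the convention that the mask is True for real tokens and False for padding is an assumption.

```python
from typing import List, Tuple

Vector = List[float]


def pad_batch(
    docs: List[List[Vector]], pad_value: float = 0.0
) -> Tuple[List[List[Vector]], List[List[bool]]]:
    """Pad docs to a common length and return (embeddings, mask).

    embeddings has shape (batch_size, seq_len, input_size);
    mask has shape (batch_size, seq_len).
    """
    seq_len = max(len(doc) for doc in docs)
    input_size = len(docs[0][0])
    embeddings, mask = [], []
    for doc in docs:
        n_pad = seq_len - len(doc)
        padding = [[pad_value] * input_size for _ in range(n_pad)]
        embeddings.append(doc + padding)
        mask.append([True] * len(doc) + [False] * n_pad)
    return embeddings, mask


# Two docs with 3 and 1 words, each word embedded in 2 dimensions.
docs = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[0.7, 0.8]],
]
embeddings, mask = pad_batch(docs)
shape = (len(embeddings), len(embeddings[0]), len(embeddings[0][0]))
```

With these inputs, `shape` is `(2, 3, 2)` and the second row of `mask` marks only its first position as a real token.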