
Chunker

Package containing Chunker modules.

Modules:

Name Description
BaseChunker

An abstract base class for chunkers.

Authors

Devita (devita1@gdplabs.id)

Reviewers

Timotius Nugroho Chandra (timotius.n.chandra@gdplabs.id)

BaseChunker

Bases: ABC

An abstract base class for chunkers.

This class defines the interface for segmenting, or chunking, elements based on contextual information. Subclasses are expected to implement the 'chunk' method, which performs the actual chunking.
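A minimal sketch of the subclassing contract described above. The `BaseChunker` defined here is a stand-in that only mirrors the documented signature (it is not the package's source), and `PassThroughChunker` is a hypothetical subclass invented for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any


class BaseChunker(ABC):
    """Stand-in mirroring the documented interface (assumption, not the real source)."""

    @abstractmethod
    def chunk(self, elements: Any, **kwargs: Any) -> Any:
        """Chunk a document."""


class PassThroughChunker(BaseChunker):
    """Hypothetical subclass: treats every element as its own chunk."""

    def chunk(self, elements: Any, **kwargs: Any) -> Any:
        # Return a new list so the caller's input is not aliased.
        return list(elements)


# The abstract base cannot be instantiated because 'chunk' is abstract.
try:
    BaseChunker()
    instantiated = True
except TypeError:
    instantiated = False

chunks = PassThroughChunker().chunk([{"text": "a"}, {"text": "b"}])
```

Because `chunk` is declared with `abstractmethod`, Python refuses to instantiate any subclass that does not override it, so a missing implementation fails at construction time instead of at call time.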

Methods:

Name Description
chunk

Abstract method to chunk a document.

chunk(elements, **kwargs) abstractmethod

Chunk a document.

This method is abstract and must be implemented in subclasses. It defines the process of chunking information from elements.

Parameters:

Name Type Description Default
elements Any

The information to be chunked, ideally formatted as List[Dict].

required
**kwargs Any

Additional keyword arguments for customization.

{}

Returns:

Name Type Description
Any Any

The chunked information, ideally formatted as List[Dict]. Each dictionary in the list should follow the structure of the 'Element' model to ensure consistency and ease of use across the Document Processing Orchestrator.
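To illustrate the List[Dict] input and output shape together with a `**kwargs` option, here is a hedged sketch. The `BaseChunker` below is again a stand-in for the documented interface, and `MergingChunker`, the `max_chars` keyword, and the `text` and `element_count` keys are all hypothetical. The actual fields of the 'Element' model are not described in this section, so adapt the dictionary keys to the real model.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class BaseChunker(ABC):
    """Stand-in mirroring the documented interface (assumption, not the real source)."""

    @abstractmethod
    def chunk(self, elements: Any, **kwargs: Any) -> Any:
        """Chunk a document."""


class MergingChunker(BaseChunker):
    """Hypothetical chunker that merges consecutive elements up to a size limit."""

    def chunk(self, elements: List[Dict[str, Any]], **kwargs: Any) -> List[Dict[str, Any]]:
        # 'max_chars' is an illustrative option passed through **kwargs.
        max_chars = kwargs.get("max_chars", 100)
        chunks: List[Dict[str, Any]] = []
        buffer: List[str] = []
        size = 0  # total characters buffered (joining spaces are not counted)

        for element in elements:
            text = element.get("text", "")  # 'text' key is an assumption
            # Flush the buffer when adding this element would exceed the limit.
            if buffer and size + len(text) > max_chars:
                chunks.append({"text": " ".join(buffer), "element_count": len(buffer)})
                buffer, size = [], 0
            buffer.append(text)
            size += len(text)

        # Emit whatever remains after the last element.
        if buffer:
            chunks.append({"text": " ".join(buffer), "element_count": len(buffer)})
        return chunks


elements = [{"text": "Intro."}, {"text": "Details follow."}, {"text": "End."}]
result = MergingChunker().chunk(elements, max_chars=20)
```

With `max_chars=20`, the first element is flushed on its own and the last two are merged, so `result` holds two dictionaries. An element longer than the limit is still emitted as a single chunk, since this sketch never splits an element.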