Base data generator

Defines an abstract base class to generate data.

The end goal is to index documents into a Vector DB.

BaseDataGenerator

Bases: ABC

Base class for data generators.

generate(elements, **kwargs) abstractmethod

Generates data for a list of chunks.

Parameters:

Name | Type | Description | Default
elements | Any | The elements used to generate data / metadata. Ideally formatted as List[Dict]. | required
**kwargs | Any | Additional keyword arguments for customization. | {}

Returns:

Name | Type | Description
Any | Any | The generated data, ideally formatted as List[Dict]. Each dictionary in the list should preferably follow the structure of the 'Element' model, to ensure consistency and ease of use across the Document Processing Orchestrator.
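
Example

The following is a minimal sketch of how a concrete generator might implement the interface described above. The import path of BaseDataGenerator is not given here, so the sketch defines a local stand-in with the same signature. The subclass WordCountGenerator, the `text_key` keyword argument, and the `text` / `metadata` dictionary keys are hypothetical names chosen for illustration; the actual fields of the 'Element' model may differ.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class BaseDataGenerator(ABC):
    """Local stand-in mirroring the documented interface (assumption)."""

    @abstractmethod
    def generate(self, elements: Any, **kwargs: Any) -> Any:
        """Generates data for a list of chunks."""


class WordCountGenerator(BaseDataGenerator):
    """Hypothetical subclass: adds a word count to each element's metadata."""

    def generate(self, elements: Any, **kwargs: Any) -> Any:
        # 'text_key' is an illustrative option, not part of the documented API.
        text_key = kwargs.get("text_key", "text")
        enriched: List[Dict[str, Any]] = []
        for element in elements:
            item = dict(element)  # shallow copy so the input is not mutated
            metadata = dict(item.get("metadata", {}))
            metadata["word_count"] = len(str(item.get(text_key, "")).split())
            item["metadata"] = metadata
            enriched.append(item)
        return enriched


# Assumed element layout: a dict with 'text' and 'metadata' keys.
chunks = [{"text": "Index documents into a vector DB", "metadata": {"page": 1}}]
result = WordCountGenerator().generate(chunks)
print(result)
```

Because generate is declared as an abstract method, instantiating BaseDataGenerator directly raises a TypeError; only subclasses that override generate can be created. Returning new dictionaries rather than modifying the input keeps the original chunks reusable by other generators.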