Base data generator

Defines an abstract base class to generate data.

The end goal is to index documents into a Vector DB.

Authors

Timotius Nugroho Chandra (timotius.n.chandra@gdplabs.id)

Reviewers

Henry Wicaksono (henry.wicaksono@gdplabs.id)
Kevin Susanto (kevin.susanto@gdplabs.id)

References

[1] Golf - GLAIR Integration (https://docs.google.com/drawings/d/1o3-7loAj9fbJGqxHndIEiYTEABtGFRw58U1BSR4gioE)

BaseDataGenerator

Bases: ABC

Base class for data generators.

generate(elements, **kwargs) abstractmethod

Generates data for a list of chunks.

Parameters:

Name Type Description Default
elements Any

The elements used to generate data or metadata, ideally formatted as List[Dict].

required
**kwargs Any

Additional keyword arguments for customization.

{}

Returns:

Name Type Description
Any Any

The generated data, ideally formatted as List[Dict]. Each dictionary in the list should preferably follow the structure of the 'Element' model, to ensure consistency and ease of use across the Document Processing Orchestrator.
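
The following is a minimal sketch of how a concrete generator might implement this interface. It is not taken from the library: `BaseDataGenerator` is redefined locally as a stand-in so the example runs on its own, and `WordCountGenerator`, the `text_key` keyword argument, and the `"text"` / `"metadata"` dictionary keys are hypothetical names chosen for illustration. The actual fields of the 'Element' model may differ.

```python
from abc import ABC, abstractmethod
from typing import Any


class BaseDataGenerator(ABC):
    """Local stand-in mirroring the documented interface (not the library's code)."""

    @abstractmethod
    def generate(self, elements: Any, **kwargs: Any) -> Any:
        """Generates data for a list of chunks."""


class WordCountGenerator(BaseDataGenerator):
    """Hypothetical subclass that adds a word count to each element's metadata."""

    def generate(self, elements: Any, **kwargs: Any) -> Any:
        # Assumption: each element is a dict with a text field and an
        # optional "metadata" dict. These keys are illustrative only.
        text_key = kwargs.get("text_key", "text")
        results = []
        for element in elements:
            enriched = dict(element)  # shallow copy so the input is not mutated
            metadata = dict(enriched.get("metadata", {}))
            metadata["word_count"] = len(str(enriched.get(text_key, "")).split())
            enriched["metadata"] = metadata
            results.append(enriched)
        return results


elements = [{"text": "Vector databases store embeddings.", "metadata": {}}]
generated = WordCountGenerator().generate(elements)
```

Returning new dictionaries instead of modifying the input keeps the generator free of side effects, which can make it easier to chain several generators before indexing. Because `generate` is abstract, instantiating `BaseDataGenerator` directly raises a `TypeError`.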