Base data generator
Defines an abstract base class to generate data.
The end goal is to index documents into a Vector DB.
Reviewers
Henry Wicaksono (henry.wicaksono@gdplabs.id) Kevin Susanto (kevin.susanto@gdplabs.id)
References
[1] Golf - GLAIR Integration (https://docs.google.com/drawings/d/1o3-7loAj9fbJGqxHndIEiYTEABtGFRw58U1BSR4gioE)
BaseDataGenerator
Bases: ABC
Base class for data generators.
generate(elements, **kwargs)
abstractmethod
Generates data for a list of chunks.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
elements | Any | The elements to be used for generating data / metadata. Ideally formatted as List[Dict]. | required |
**kwargs | Any | Additional keyword arguments for customization. | {} |
Returns:
Name | Type | Description |
---|---|---|
Any | Any | The generated data, ideally formatted as List[Dict]. Each dictionary in the list should follow the structure of the 'Element' model to ensure consistency and ease of use across the Document Processing Orchestrator. |
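The following is a minimal sketch of how a concrete generator might implement `generate`. It is not taken from the library: the `BaseDataGenerator` class below is a local stand-in that only mirrors the documented signature (the real import path is not shown on this page), `WordCountGenerator` and the `text_key` keyword argument are hypothetical, and the `text` / `metadata` keys are assumed for illustration rather than taken from the actual 'Element' model.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class BaseDataGenerator(ABC):
    """Local stand-in mirroring the documented interface (assumption)."""

    @abstractmethod
    def generate(self, elements: Any, **kwargs: Any) -> Any:
        """Generate data for the given elements."""


class WordCountGenerator(BaseDataGenerator):
    """Hypothetical generator that adds a word count to each element's metadata."""

    def generate(
        self, elements: List[Dict[str, Any]], **kwargs: Any
    ) -> List[Dict[str, Any]]:
        # "text_key" is an illustrative option, not a documented parameter.
        text_key = kwargs.get("text_key", "text")
        results = []
        for element in elements:
            # Copy the element and its metadata so the input is not mutated.
            enriched = dict(element)
            metadata = dict(enriched.get("metadata", {}))
            metadata["word_count"] = len(str(enriched.get(text_key, "")).split())
            enriched["metadata"] = metadata
            results.append(enriched)
        return results


# Assumed element shape: a dict with "text" and "metadata" keys.
elements = [{"text": "Vector databases store embeddings.", "metadata": {"page": 1}}]
generated = WordCountGenerator().generate(elements)
```

Because `generate` is marked as an abstract method, `BaseDataGenerator` itself cannot be instantiated; only subclasses that override `generate` can. Returning a new list of dictionaries, rather than modifying the input in place, keeps the original elements available to other steps that may still need them.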