Schema
Schema package for gllm_training.
This module re-exports all schema components from various modules for backward compatibility.
Reviewer
- Muhammad Afif Al Hawari (muhammad.a.a.hawari@gdplabs.id)
References
NONE
ColumnMappingConfig
Bases: BaseModel
Defines the structure for column mapping configuration.
Attributes:
| Name | Type | Description |
|---|---|---|
| `input_columns` | `Dict[str, str]` | Mapping of template variables to DataFrame column names. |
| `output_columns` | `List[str]` | List of column names for output. |
| `label_columns` | `Optional[str]` | Column name for labels. |
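To illustrate how a mapping of template variables to column names might be applied, the sketch below fills a prompt template from a single row of data. It uses plain dictionaries rather than the actual `ColumnMappingConfig` class, and the `render_prompt` helper, the template text, and the column names are all hypothetical.

```python
# Hypothetical illustration; none of these names come from gllm_training.
# input_columns maps template variable -> DataFrame column name.
input_columns = {"question": "user_query", "context": "retrieved_passage"}
output_columns = ["answer"]


def render_prompt(template: str, row: dict[str, str], mapping: dict[str, str]) -> str:
    """Fill each template variable with the value of its mapped column."""
    values = {variable: row[column] for variable, column in mapping.items()}
    return template.format(**values)


# One row of a dataset, represented as a plain dict keyed by column name.
row = {
    "user_query": "What is LoRA?",
    "retrieved_passage": "LoRA adds low-rank adapters to frozen weights.",
    "answer": "A parameter-efficient fine-tuning method.",
}
template = "Context: {context}\nQuestion: {question}"

prompt = render_prompt(template, row, input_columns)
targets = [row[column] for column in output_columns]
```

Keeping the mapping separate from the template means the same prompt could be reused with datasets whose columns are named differently, which is presumably the motivation for this configuration.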
ExperimentConfig
Bases: BaseModel
Defines the configuration for a single fine-tuning experiment.
Attributes:
| Name | Type | Description |
|---|---|---|
| `experiment_id` | `str` | The ID of the experiment. |
| `hyperparameters` | `Hyperparameters` | The hyperparameters for the experiment. |
| `hyperparameters_id` | `str` | The ID of the hyperparameters. |
| `topic` | `str` | The topic of the experiment. |
| `model_name` | `str` | The name of the model. |
| `framework` | `FinetuningLibraryTypes` | The fine-tuning framework to use. |
| `multimodal` | `bool` | Whether the model is multimodal. |
| `datasets_path` | `str \| None` | The path to the datasets directory. |
| `train_filename` | `str \| None` | CSV filename for training data. |
| `validation_filename` | `str \| None` | CSV filename for validation data. |
| `prompt_filename` | `str \| None` | CSV filename for prompt templates. |
| `spreadsheet_id` | `str \| None` | The ID of the Google Sheets spreadsheet. |
| `google_sheets_client_email` | `str \| None` | Google Sheets client email. |
| `google_sheets_private_key` | `str \| None` | Google Sheets private key. |
| `google_scopes` | `list[str] \| None` | Google scopes for service account. |
| `google_token_uri` | `str \| None` | Google token URI for service account. |
| `train_sheet` | `str` | The name of the training sheet. |
| `validation_sheet` | `str \| None` | The name of the validation sheet. |
| `prompt_sheet` | `str` | The name of the prompt sheet. |
| `prompt_name` | `str \| None` | The name of the prompt. |
| `dataset_text_field` | `str` | The text field in the dataset. |
| `column_mapping_config` | `dict[str, Any] \| None` | Custom configuration for column mappings. |
| `save_processed_dataset` | `bool` | Whether to save the processed dataset. |
| `output_processed_dir` | `str` | The output directory for the processed dataset. |
| `model_path` | `str \| None` | The path to the fine-tuned adapter or full model. |
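The attributes above describe two possible data sources: local CSV files (`datasets_path` with the `*_filename` fields) and Google Sheets (`spreadsheet_id` with the `*_sheet` fields). The sketch below shows one way such a choice could be resolved. It is an assumption, not the library's logic: `DataSourceFields` is a stand-in dataclass holding a few of the documented fields, `describe_training_source` is a hypothetical helper, and both the CSV-first precedence and the `"train"` default are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataSourceFields:
    """Hypothetical stand-in holding a few of the ExperimentConfig fields."""

    datasets_path: Optional[str] = None
    train_filename: Optional[str] = None
    spreadsheet_id: Optional[str] = None
    train_sheet: str = "train"  # assumed default; the source documents none


def describe_training_source(cfg: DataSourceFields) -> str:
    """Label where training data would be read from (assumed: CSV takes precedence)."""
    if cfg.datasets_path and cfg.train_filename:
        return f"csv:{cfg.datasets_path.rstrip('/')}/{cfg.train_filename}"
    if cfg.spreadsheet_id:
        return f"sheets:{cfg.spreadsheet_id}/{cfg.train_sheet}"
    raise ValueError("No training data source configured")


local = DataSourceFields(datasets_path="data/", train_filename="train.csv")
remote = DataSourceFields(spreadsheet_id="SHEET_ID_PLACEHOLDER")
```

Raising an error when neither source is set surfaces a misconfigured experiment early instead of failing later during data loading.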
FinetunedHyperparameters
Bases: BaseModel
Defines the hyperparameters for a fine-tuning experiment.
Attributes:
| Name | Type | Description |
|---|---|---|
| `hyperparameters_id` | `str` | The ID of the hyperparameters. |
| `model_settings` | `ModelConfig` | The model configuration. |
| `lora_config` | `LoraConfig` | The LoRA configuration. |
| `training_config` | `TrainingConfig` | The training configuration. |
| `storage_config` | `StorageConfig` | The storage configuration. |
FrameworkType
Bases: StrEnum
Defines valid fine-tuning frameworks.
GoogleSheetsAuthentication
Bases: BaseModel
Defines valid authentication parameters for Google Sheets API interaction.
Attributes:
| Name | Type | Description |
|---|---|---|
| `spreadsheet_id` | `str` | The ID of the Google Sheets spreadsheet. |
| `client_email` | `str` | The client email for the Google Sheets API. |
| `private_key` | `str` | The private key for the Google Sheets API. |
| `google_token_uri` | `str` | The Google token URI for the Google Sheets API. |
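These fields correspond to entries found in a Google service-account key file (`client_email`, `private_key`, `token_uri`). As a minimal sketch, assuming the values are later handed to a Google auth client, the code below assembles them into a dictionary of that shape. `SheetsAuth` is a stand-in dataclass, not the real `GoogleSheetsAuthentication` class, `to_service_account_info` is a hypothetical helper, and all values are placeholders.

```python
from dataclasses import dataclass


@dataclass
class SheetsAuth:
    """Stand-in mirroring the documented fields; not the real class."""

    spreadsheet_id: str
    client_email: str
    private_key: str
    google_token_uri: str


def to_service_account_info(auth: SheetsAuth) -> dict[str, str]:
    """Build a service-account style dict from the fields (hypothetical helper)."""
    return {
        "type": "service_account",
        "client_email": auth.client_email,
        # Keys read from environment variables often carry literal "\n"
        # sequences; convert them back into real newlines.
        "private_key": auth.private_key.replace("\\n", "\n"),
        "token_uri": auth.google_token_uri,
    }


auth = SheetsAuth(
    spreadsheet_id="SHEET_ID_PLACEHOLDER",
    client_email="trainer@example-project.iam.gserviceaccount.com",
    private_key="-----BEGIN PRIVATE KEY-----\\nPLACEHOLDER\\n-----END PRIVATE KEY-----\\n",
    google_token_uri="https://oauth2.googleapis.com/token",
)
info = to_service_account_info(auth)
```

The `spreadsheet_id` is deliberately left out of the dictionary: it identifies the document to read, not the identity used to read it.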
StorageConfig
Bases: BaseModel
Defines the storage configuration for model uploads.
Attributes:
| Name | Type | Description |
|---|---|---|
| `bucket_name` | `str` | The name of the bucket. |
| `upload_to_cloud` | `bool` | Whether to upload the model to the cloud. |
| `object_prefix` | `str` | The object prefix in the bucket. |
| `endpoint_url` | `str \| None` | The complete URL to use for the storage client. |
| `access_key_id` | `str \| None` | The access key ID for the cloud storage account. |
| `secret_access_key` | `str \| None` | The secret access key for the cloud storage account. |
| `region` | `str \| None` | The region to use when creating the storage client. |
| `provider` | `str` | The cloud provider. |
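The field names resemble the settings of an S3-compatible client, although the source does not say which providers are supported. Under that assumption, the sketch below maps the optional fields onto the keyword names accepted by `boto3.client("s3", ...)` and builds an object key from the prefix. `StorageFields` is a stand-in dataclass, `s3_client_kwargs` and `object_key` are hypothetical helpers, and the `"s3"` provider value and all other values are placeholders.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class StorageFields:
    """Stand-in mirroring the documented StorageConfig fields; not the real class."""

    bucket_name: str
    upload_to_cloud: bool
    object_prefix: str
    provider: str
    endpoint_url: Optional[str] = None
    access_key_id: Optional[str] = None
    secret_access_key: Optional[str] = None
    region: Optional[str] = None


def s3_client_kwargs(cfg: StorageFields) -> dict[str, Any]:
    """Map fields to boto3 S3 client keyword names, skipping unset values."""
    candidates = {
        "endpoint_url": cfg.endpoint_url,
        "aws_access_key_id": cfg.access_key_id,
        "aws_secret_access_key": cfg.secret_access_key,
        "region_name": cfg.region,
    }
    return {name: value for name, value in candidates.items() if value is not None}


def object_key(cfg: StorageFields, filename: str) -> str:
    """Join the object prefix and a file name into a bucket key."""
    return f"{cfg.object_prefix.strip('/')}/{filename}"


cfg = StorageFields(
    bucket_name="example-bucket",
    upload_to_cloud=True,
    object_prefix="adapters/exp-001/",
    provider="s3",  # hypothetical value; valid providers are not documented here
    region="us-east-1",
)
kwargs = s3_client_kwargs(cfg)
key = object_key(cfg, "adapter_model.safetensors")
```

Omitting unset values lets the client fall back to its own defaults, such as credentials taken from the environment, instead of receiving explicit `None` arguments.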