SFT Trainer
SFT trainer package that provides functionality for supervised fine-tuning (SFT) of large language models.
Reviewers
- Muhammad Afif Al Hawari (muhammad.a.a.hawari@gdplabs.id)
BaseTrainer
Bases: ABC
Base class for model trainers.
This class defines the common interface that all trainers must implement. Each concrete trainer should provide implementation for training models and handling the training lifecycle from initialization to model saving.
The interface is designed to be consistent across different trainer types, making it easy to switch between implementations, add new ones, and maintain the factory pattern architecture.
save_model(model_path=None, results=None, **kwargs)
abstractmethod
Saves model artifacts, handling both trained and existing models.
This method provides a unified interface for:
1. Saving a newly trained model (when `results` is provided)
2. Uploading an existing model (when `model_path` is provided)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model_path` | `str` | Path to existing model artifacts. | `None` |
| `results` | `dict` | Training results for newly trained models. | `None` |
| `**kwargs` | `Any` | Configuration parameters for model saving. | `{}` |
train(**kwargs)
abstractmethod
Train the model using the prepared data and configuration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `**kwargs` | `Any` | Arbitrary keyword arguments for evaluation parameters. | `{}` |
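The interface above can be sketched with Python's `abc` module. This is a minimal illustration, not the package's source: the `BaseTrainer` stand-in only mirrors the documented method names and signatures, and `EchoTrainer` with its return values is hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class BaseTrainer(ABC):
    """Stand-in mirroring the documented interface (not the package's source)."""

    @abstractmethod
    def train(self, **kwargs: Any) -> Any:
        """Train the model."""

    @abstractmethod
    def save_model(
        self,
        model_path: Optional[str] = None,
        results: Optional[dict] = None,
        **kwargs: Any,
    ) -> Any:
        """Save a trained model or upload an existing one."""


class EchoTrainer(BaseTrainer):
    """Hypothetical concrete trainer used only for illustration."""

    def train(self, **kwargs: Any) -> dict:
        # A real trainer would run a training loop here.
        return {"train_loss": 0.0, "kwargs": kwargs}

    def save_model(self, model_path=None, results=None, **kwargs):
        # Return the supplied path, or a made-up output location.
        return model_path if model_path is not None else "outputs/echo-model"


trainer = EchoTrainer()
results = trainer.train(epochs=1)
saved_to = trainer.save_model(results=results)
```

Because both methods are abstract, instantiating `BaseTrainer` directly raises `TypeError`; only subclasses that implement `train` and `save_model` can be created.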
SFTComponents(experiment_args, hyperparameters)
Initializes and holds the core components required for SFT fine-tuning.
This class orchestrates the setup of the model, tokenizer, datasets, and the trainer instance based on the provided configuration arguments. It acts as a container for all the artifacts needed to run the training process.
Initializes a new instance of the SFTComponents class.
This constructor handles data loading, model and tokenizer preparation, prompt processing, and dataset creation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `experiment_args` | `ExperimentConfig` | The configuration for the experiment setup. | required |
| `hyperparameters` | `FinetunedHyperparameters` | The configuration for the hyperparameters. | required |
Raises:
| Type | Description |
|---|---|
| `DataSourceNotProvidedError` | If no valid data source is specified in |
| `PromptTemplateError` | If there's an issue with loading or processing prompts. |
| `DatasetProcessorError` | If dataset preprocessing fails. |
| `ValueError` | If the specified fine-tuning framework is not recognized or if trainer preparation fails. |
| `RuntimeError` | If any critical component initialization fails. |
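A caller may want to tell configuration problems apart from runtime failures when component setup raises. The sketch below assumes the three package-specific exceptions derive directly from `Exception`, which this reference does not state. The stand-in classes, the `describe_failure` helper, and the hint strings are all hypothetical.

```python
class DataSourceNotProvidedError(Exception):
    """Stand-in for the package's exception of the same name."""


class PromptTemplateError(Exception):
    """Stand-in for the package's exception of the same name."""


class DatasetProcessorError(Exception):
    """Stand-in for the package's exception of the same name."""


def describe_failure(exc: Exception) -> str:
    """Map a setup exception to a short hint (hypothetical helper)."""
    if isinstance(exc, DataSourceNotProvidedError):
        return "configure a data source"
    if isinstance(exc, (PromptTemplateError, DatasetProcessorError)):
        return "check prompts and dataset columns"
    if isinstance(exc, ValueError):
        return "check the framework name and trainer settings"
    # RuntimeError and anything else: a component failed to initialize.
    return "component initialization failed"


try:
    # Simulated failure; a real call would construct the components here.
    raise DataSourceNotProvidedError("no data source configured")
except Exception as exc:
    hint = describe_failure(exc)
```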
SFTTrainer(model_name, datasets_path=None, train_filename=None, validation_filename=None, prompt_filename=None, spreadsheet_id=None, train_sheet=None, validation_sheet=None, prompt_sheet=None, prompt_name=None, experiment_id=None, hyperparameters=None, storage_config=None, resume_from_checkpoint=None, topic=None, framework=None, multimodal=None, column_mapping_config=None, save_processed_dataset=True, output_processed_dir=None, dataset_text_field=None)
Bases: BaseTrainer
SFT Trainer for fine-tuning models using various frameworks.
This trainer orchestrates the fine-tuning process, from data preparation to model training and saving. It uses a set of configuration objects to customize the training behavior.
Initialize the SFT Trainer.
This method sets up the environment, initializes the configuration objects, and checks for hardware requirements.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model_name` | `str` | Model to fine-tune, for example "Qwen/Qwen3-1.7b". | required |
| `datasets_path` | `str`, *optional* | Path to datasets. Default is None. | `None` |
| `train_filename` | `str`, *optional* | CSV filename for training data. Default is "training_data.csv". | `None` |
| `validation_filename` | `str`, *optional* | CSV filename for validation data. Default is "validation_data.csv". | `None` |
| `prompt_filename` | `str`, *optional* | CSV filename for prompt templates. Default is "prompt_data.csv". | `None` |
| `spreadsheet_id` | `str`, *optional* | Google Sheets spreadsheet ID. Default is None. | `None` |
| `train_sheet` | `str`, *optional* | Name of the training sheet. Default is "training_data". | `None` |
| `validation_sheet` | `str`, *optional* | Name of the validation sheet. Default is "validation_data". | `None` |
| `prompt_sheet` | `str`, *optional* | Name of the prompt sheet. Default is "prompt_data". | `None` |
| `prompt_name` | `str`, *optional* | Name of the prompt template. Default is "prompt_default". | `None` |
| `experiment_id` | `str`, *optional* | ID of the experiment. Default is "sft_experiment_1". | `None` |
| `hyperparameters` | `FinetunedHyperparameters`, *optional* | Configuration for training hyperparameters. Default is a new instance of FinetunedHyperparameters. | `None` |
| `storage_config` | `StorageConfig`, *optional* | Configuration for model storage. Default is a new instance of StorageConfig. | `None` |
| `resume_from_checkpoint` | `bool`, *optional* | Whether to resume from checkpoint. Default is taken from hyperparameters.resume_from_checkpoint. | `None` |
| `topic` | `str`, *optional* | Topic of the experiment. Default is "SLM_FINETUNING". | `None` |
| `framework` | `str`, *optional* | Fine-tuning framework to use. Default is "unsloth". | `None` |
| `multimodal` | `bool`, *optional* | Whether the model is multimodal. Default is False. | `None` |
| `column_mapping_config` | `ColumnMappingConfig`, *optional* | Configuration for column mappings. Default is a new instance of ColumnMappingConfig. | `None` |
| `save_processed_dataset` | `bool`, *optional* | Whether to save the processed dataset. Default is True. | `True` |
| `output_processed_dir` | `str`, *optional* | Output directory for the processed dataset. Default is "data/dataset". | `None` |
| `dataset_text_field` | `str`, *optional* | Name of the dataset field that contains the text to fine-tune on. Default is "prompt". | `None` |
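The signature lists `None` for most defaults while the descriptions name concrete values. One plausible reading, which this reference does not confirm, is that `None` falls back to the described value. The sketch below illustrates that pattern only; `DOCUMENTED_DEFAULTS` and `resolve_options` are hypothetical names, and the values are copied from the descriptions above.

```python
from typing import Any

# Effective defaults quoted from the parameter descriptions above.
DOCUMENTED_DEFAULTS = {
    "train_filename": "training_data.csv",
    "validation_filename": "validation_data.csv",
    "prompt_filename": "prompt_data.csv",
    "train_sheet": "training_data",
    "validation_sheet": "validation_data",
    "prompt_sheet": "prompt_data",
    "prompt_name": "prompt_default",
    "experiment_id": "sft_experiment_1",
    "topic": "SLM_FINETUNING",
    "framework": "unsloth",
    "multimodal": False,
    "output_processed_dir": "data/dataset",
    "dataset_text_field": "prompt",
}


def resolve_options(**overrides: Any) -> dict:
    """Return the documented defaults, replaced by any non-None override."""
    unknown = set(overrides) - set(DOCUMENTED_DEFAULTS)
    if unknown:
        raise TypeError(f"unexpected option(s): {sorted(unknown)}")
    resolved = dict(DOCUMENTED_DEFAULTS)
    # Assumption: passing None means "use the documented default".
    resolved.update({k: v for k, v in overrides.items() if v is not None})
    return resolved


options = resolve_options(framework=None, experiment_id="my_run")
```

Here `options["framework"]` keeps the described default because `None` was passed, while `options["experiment_id"]` takes the explicit value.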
save_model(results=None, model_path=None)
Saves model artifacts, handling both trained and existing models.
This method provides a unified interface for:
1. Saving a newly trained model (when `results` is provided)
2. Uploading an existing model (when `model_path` is provided)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `results` | `dict` | Training results for newly trained models. | `None` |
| `model_path` | `str` | Path to the model directory. | `None` |
Returns:
| Type | Description |
|---|---|
| `str` | Path where the model was saved or uploaded. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If neither model_path nor results is provided, or if configuration is invalid. |
| `RuntimeError` | If the save operation fails. |
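The two documented modes and the `ValueError` case can be sketched as a small dispatch function. This is an illustration under assumptions, not the package's logic: `choose_save_mode` and its return labels are hypothetical, and the reference does not say which argument wins when both are given (the sketch prefers `results`).

```python
from typing import Optional


def choose_save_mode(
    results: Optional[dict] = None,
    model_path: Optional[str] = None,
) -> str:
    """Pick a save mode from the arguments (hypothetical helper)."""
    if results is not None:
        # Mode 1: a model was just trained, so save its artifacts.
        return "save_trained"
    if model_path is not None:
        # Mode 2: upload artifacts that already exist on disk.
        return "upload_existing"
    # Documented behaviour: neither argument provided is an error.
    raise ValueError("Either results or model_path must be provided.")


mode_after_training = choose_save_mode(results={"train_loss": 0.1})
mode_for_existing = choose_save_mode(model_path="models/my-model")
```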
train()
Runs the complete model fine-tuning and evaluation process.
This method orchestrates the entire workflow, initializing components, running the training loop, and saving the results.
Returns:
| Type | Description |
|---|---|
| `dict[str, Any]` | A dictionary containing training metrics and results, such as training loss, evaluation loss, and runtime. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If required training parameters are missing or invalid. |
| `RuntimeError` | If components fail to initialize, the model or trainer is not set up correctly, or if the training process itself fails for any reason. |
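A caller can wrap `train()` to separate the two documented failure types from a successful run. The sketch below uses a hypothetical `StubTrainer` in place of the real class. The metric key names are assumptions, since the reference describes the kinds of metrics but not their keys.

```python
from typing import Any


class StubTrainer:
    """Hypothetical stand-in exposing the documented train() method."""

    def __init__(self, fail: bool = False) -> None:
        self.fail = fail

    def train(self) -> dict[str, Any]:
        if self.fail:
            raise RuntimeError("training failed")
        # Key names are assumptions; the reference does not list them.
        return {"train_loss": 0.42, "eval_loss": 0.47, "train_runtime": 12.5}


def run_training(trainer: Any) -> tuple[str, dict]:
    """Call train() and translate documented errors into a status."""
    try:
        return "ok", trainer.train()
    except ValueError as exc:
        # Documented for missing or invalid training parameters.
        return "invalid_config", {"error": str(exc)}
    except RuntimeError as exc:
        # Documented for initialization, setup, or training failures.
        return "failed", {"error": str(exc)}


status, metrics = run_training(StubTrainer())
failed_status, failure = run_training(StubTrainer(fail=True))
```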
SFTValidator(components=None)
Validator for SFT training components and paths.
This class provides validation methods for models, paths, and other components required for Supervised Fine-Tuning.
Initialize the validator with SFT components.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `components` | `Optional[Any]` | SFT components to validate. | `None` |
update_components(components)
Update the components reference.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `components` | `Any` | New SFT components to validate. | required |
validate_components_initialized()
Ensure required components are initialized.
Raises:
| Type | Description |
|---|---|
| `RuntimeError` | If components are not properly initialized. |
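The constructor, `update_components`, and `validate_components_initialized` fit together as shown in the sketch below. It is a minimal stand-in, not the package's implementation: `ComponentsValidator` is a hypothetical name, and the reference does not define "properly initialized", so the sketch only checks for `None`.

```python
from typing import Any, Optional


class ComponentsValidator:
    """Hypothetical stand-in for the documented validator."""

    def __init__(self, components: Optional[Any] = None) -> None:
        self.components = components

    def update_components(self, components: Any) -> None:
        # Replace the reference so later checks see the new components.
        self.components = components

    def validate_components_initialized(self) -> None:
        # Assumption: "not initialized" simply means no components were set.
        if self.components is None:
            raise RuntimeError("SFT components are not initialized.")


validator = ComponentsValidator()
try:
    validator.validate_components_initialized()
    initially_valid = True
except RuntimeError:
    initially_valid = False

# Any non-None object counts as initialized in this sketch.
validator.update_components({"model": object(), "tokenizer": object()})
validator.validate_components_initialized()
```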
validate_model_for_saving()
Validate that model is ready for saving.
Raises:
| Type | Description |
|---|---|
| `RuntimeError` | If model is not ready for saving. |
validate_model_path(model_path)
staticmethod
Validate that the model path exists and is accessible.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model_path` | `str` | Path to the model directory. | required |
Raises:
| Type | Description |
|---|---|
| `FileNotFoundError` | If the model path doesn't exist. |
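A path check of this kind can be written with `pathlib`. The sketch below is a standalone function rather than the package's static method, and it assumes the path must be an existing directory, since the parameter is described as a model directory. The error message is made up for illustration.

```python
from pathlib import Path


def validate_model_path(model_path: str) -> None:
    """Raise FileNotFoundError unless model_path is an existing directory."""
    # Assumption: a regular file at this path is also rejected.
    if not Path(model_path).is_dir():
        raise FileNotFoundError(f"Model path does not exist: {model_path}")


# The current directory always exists, so this call passes silently.
validate_model_path(".")

try:
    validate_model_path("no-such-dir-for-sft-example/model")
    missing_detected = False
except FileNotFoundError:
    missing_detected = True
```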