SFT Trainer

SFT trainer package for fine-tuning language models.

This package provides functionality for supervised fine-tuning of large language models.

Authors
  • Alfan Dinda Rahmawan (alfan.d.rahmawan@gdplabs.id)
Reviewers
  • Muhammad Afif Al Hawari (muhammad.a.a.hawari@gdplabs.id)
References

NONE

BaseTrainer

Bases: ABC

Base class for model trainers.

This class defines the common interface that all trainers must implement. Each concrete trainer implements model training and handles the training lifecycle, from initialization to model saving.

The interface is designed to be consistent across different trainer types, making it easy to switch between implementations, add new ones, and maintain the factory pattern architecture.

save_model(model_path=None, results=None, **kwargs) abstractmethod

Saves model artifacts, handling both trained and existing models.

This method provides a unified interface for:

1. Saving a newly trained model (when results is provided)
2. Uploading an existing model (when model_path is provided)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `model_path` | `str` | Path to existing model artifacts. | `None` |
| `results` | `dict` | Training results for newly trained models. | `None` |
| `**kwargs` | `Any` | Configuration parameters for model saving. | `{}` |

train(**kwargs) abstractmethod

Train the model using the prepared data and configuration.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `**kwargs` | `Any` | Arbitrary keyword arguments for training parameters. | `{}` |
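
The interface above can be illustrated with a minimal sketch. The `BaseTrainer` below is a stand-in that only mirrors the documented method signatures; it is not the package's class, and `EchoTrainer` and the `outputs/echo-model` path are hypothetical names used for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class BaseTrainer(ABC):
    """Stand-in mirroring the documented interface (not the package's class)."""

    @abstractmethod
    def train(self, **kwargs: Any) -> Any:
        """Train the model."""

    @abstractmethod
    def save_model(
        self,
        model_path: Optional[str] = None,
        results: Optional[dict] = None,
        **kwargs: Any,
    ) -> Any:
        """Save a newly trained model or upload an existing one."""


class EchoTrainer(BaseTrainer):
    """Hypothetical trainer that records its inputs instead of training."""

    def train(self, **kwargs: Any) -> dict:
        # A real trainer would run its training loop here.
        return {"status": "trained", "params": kwargs}

    def save_model(self, model_path=None, results=None, **kwargs):
        # Return the existing path if given, else a hypothetical output path.
        return model_path or "outputs/echo-model"


trainer = EchoTrainer()
results = trainer.train(epochs=1)
saved_to = trainer.save_model(results=results)
```

Because both methods are abstract, a subclass that omits either one cannot be instantiated, which is what keeps trainer implementations interchangeable.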

SFTComponents(experiment_args, hyperparameters)

Initializes and holds the core components required for supervised fine-tuning (SFT).

This class orchestrates the setup of the model, tokenizer, datasets, and the trainer instance based on the provided configuration arguments. It acts as a container for all the artifacts needed to run the training process.

Initializes a new instance of the SFTComponents class.

This constructor handles data loading, model and tokenizer preparation, prompt processing, and dataset creation.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `experiment_args` | `ExperimentConfig` | The configuration for the experiment setup. | required |
| `hyperparameters` | `FinetunedHyperparameters` | The configuration for the hyperparameters. | required |

Raises:

| Type | Description |
| --- | --- |
| `DataSourceNotProvidedError` | If no valid data source is specified in experiment_args. |
| `PromptTemplateError` | If there's an issue with loading or processing prompts. |
| `DatasetProcessorError` | If dataset preprocessing fails. |
| `ValueError` | If the specified fine-tuning framework is not recognized or if trainer preparation fails. |
| `RuntimeError` | If any critical component initialization fails. |

SFTTrainer(model_name, datasets_path=None, train_filename=None, validation_filename=None, prompt_filename=None, spreadsheet_id=None, train_sheet=None, validation_sheet=None, prompt_sheet=None, prompt_name=None, experiment_id=None, hyperparameters=None, storage_config=None, resume_from_checkpoint=None, topic=None, framework=None, multimodal=None, column_mapping_config=None, save_processed_dataset=True, output_processed_dir=None, dataset_text_field=None)

Bases: BaseTrainer

SFT Trainer for fine-tuning models using various frameworks.

This trainer orchestrates the fine-tuning process, from data preparation to model training and saving. It uses a set of configuration objects to customize the training behavior.

Initialize the SFT Trainer.

This method sets up the environment, initializes the configuration objects, and checks for hardware requirements.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `model_name` | `str` | Model to fine-tune, for example "Qwen/Qwen3-1.7b". | required |
| `datasets_path` | `str`, *optional* | Path to datasets. Default is None. | `None` |
| `train_filename` | `str`, *optional* | CSV filename for training data. Default is "training_data.csv". | `None` |
| `validation_filename` | `str`, *optional* | CSV filename for validation data. Default is "validation_data.csv". | `None` |
| `prompt_filename` | `str`, *optional* | CSV filename for prompt templates. Default is "prompt_data.csv". | `None` |
| `spreadsheet_id` | `str`, *optional* | Google Sheets spreadsheet ID. Default is None. | `None` |
| `train_sheet` | `str`, *optional* | Name of the training sheet. Default is "training_data". | `None` |
| `validation_sheet` | `str`, *optional* | Name of the validation sheet. Default is "validation_data". | `None` |
| `prompt_sheet` | `str`, *optional* | Name of the prompt sheet. Default is "prompt_data". | `None` |
| `prompt_name` | `str`, *optional* | Name of the prompt template. Default is "prompt_default". | `None` |
| `experiment_id` | `str`, *optional* | ID of the experiment. Default is "sft_experiment_1". | `None` |
| `hyperparameters` | `FinetunedHyperparameters`, *optional* | Configuration for training hyperparameters. Default is a new instance of FinetunedHyperparameters. | `None` |
| `storage_config` | `StorageConfig`, *optional* | Configuration for model storage. Default is a new instance of StorageConfig. | `None` |
| `resume_from_checkpoint` | `bool`, *optional* | Whether to resume from checkpoint. Default is taken from hyperparameters.resume_from_checkpoint. | `None` |
| `topic` | `str`, *optional* | Topic of the experiment. Default is "SLM_FINETUNING". | `None` |
| `framework` | `str`, *optional* | Fine-tuning framework to use. Default is "unsloth". | `None` |
| `multimodal` | `bool`, *optional* | Whether the model is multimodal. Default is False. | `None` |
| `column_mapping_config` | `ColumnMappingConfig`, *optional* | Configuration for column mappings. Default is a new instance of ColumnMappingConfig. | `None` |
| `save_processed_dataset` | `bool`, *optional* | Whether to save the processed dataset. Default is True. | `True` |
| `output_processed_dir` | `str`, *optional* | Output directory for the processed dataset. Default is "data/dataset". | `None` |
| `dataset_text_field` | `str`, *optional* | Name of the dataset field that contains the text used for fine-tuning. Default is "prompt". | `None` |
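
As a rough illustration of how the parameters above group together, the sketch below collects keyword arguments for either a local CSV source or a Google Sheets source. `build_trainer_kwargs` is a hypothetical helper, not part of the package, and treating the two sources as alternatives is an assumption. The filenames and sheet names are the defaults listed in the table, and the `"data"` path is a placeholder.

```python
from typing import Any, Dict, Optional


def build_trainer_kwargs(
    model_name: str,
    datasets_path: Optional[str] = None,
    spreadsheet_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: collect SFTTrainer keyword arguments for one data source."""
    kwargs: Dict[str, Any] = {"model_name": model_name, "framework": "unsloth"}
    if datasets_path is not None:
        # Local CSV files located under datasets_path.
        kwargs.update(
            datasets_path=datasets_path,
            train_filename="training_data.csv",
            validation_filename="validation_data.csv",
            prompt_filename="prompt_data.csv",
        )
    elif spreadsheet_id is not None:
        # Google Sheets source, one sheet per dataset.
        kwargs.update(
            spreadsheet_id=spreadsheet_id,
            train_sheet="training_data",
            validation_sheet="validation_data",
            prompt_sheet="prompt_data",
        )
    else:
        # Assumption: at least one data source must be chosen.
        raise ValueError("Provide datasets_path or spreadsheet_id.")
    return kwargs


csv_kwargs = build_trainer_kwargs("Qwen/Qwen3-1.7b", datasets_path="data")
# With the package installed, these would be passed on as:
# trainer = SFTTrainer(**csv_kwargs)
```

Building the arguments as a dictionary first makes it easy to log or inspect the configuration before a potentially long training run starts.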

save_model(results=None, model_path=None)

Saves model artifacts, handling both trained and existing models.

This method provides a unified interface for:

1. Saving a newly trained model (when results is provided)
2. Uploading an existing model (when model_path is provided)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `results` | `dict` | Training results for newly trained models. | `None` |
| `model_path` | `str` | Path to the model directory. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `str` | Path where the model was saved or uploaded. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If neither model_path nor results is provided, or if configuration is invalid. |
| `RuntimeError` | If the save operation fails. |
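
The two modes and the documented `ValueError` can be sketched as a small dispatch function. `resolve_save_mode` and its return labels are hypothetical and only illustrate the branching. Preferring `results` when both arguments are given is an assumption, since the documentation does not cover that case.

```python
from typing import Optional


def resolve_save_mode(
    results: Optional[dict] = None,
    model_path: Optional[str] = None,
) -> str:
    """Hypothetical helper: decide which documented save path applies."""
    if results is not None:
        # Mode 1: a training run just finished, so save its artifacts.
        return "save_trained"
    if model_path is not None:
        # Mode 2: no new results, so upload the existing model directory.
        return "upload_existing"
    # Mirrors the documented ValueError when neither argument is provided.
    raise ValueError("Either results or model_path must be provided.")


mode_after_training = resolve_save_mode(results={"train_loss": 0.5})
mode_for_existing = resolve_save_mode(model_path="models/my-model")
```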

train()

Runs the complete model fine-tuning and evaluation process.

This method orchestrates the entire workflow, initializing components, running the training loop, and saving the results.

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | A dictionary containing training metrics and results, such as training loss, evaluation loss, and runtime. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If required training parameters are missing or invalid. |
| `RuntimeError` | If components fail to initialize, the model or trainer is not set up correctly, or the training process itself fails. |

SFTValidator(components=None)

Validator for SFT training components and paths.

This class provides validation methods for models, paths, and other components required for Supervised Fine-Tuning.

Initialize the validator with SFT components.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `components` | `Optional[Any]` | SFT components to validate. | `None` |

update_components(components)

Update the components reference.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `components` | `Any` | New SFT components to validate. | required |

validate_components_initialized()

Ensure required components are initialized.

Raises:

| Type | Description |
| --- | --- |
| `RuntimeError` | If components are not properly initialized. |

validate_model_for_saving()

Validate that model is ready for saving.

Raises:

| Type | Description |
| --- | --- |
| `RuntimeError` | If model is not ready for saving. |

validate_model_path(model_path) staticmethod

Validate that the model path exists and is accessible.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `model_path` | `str` | Path to the model directory. | required |

Raises:

| Type | Description |
| --- | --- |
| `FileNotFoundError` | If the model path doesn't exist. |
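
The validator's documented behaviour can be approximated as follows. `ValidatorSketch` is a stand-in, not the package's `SFTValidator`, and the concrete checks are assumptions: "initialized" is taken to mean a non-None components object, and a valid model path is taken to mean an existing directory.

```python
import os
import tempfile
from typing import Any, Optional


class ValidatorSketch:
    """Stand-in for the documented validator; the real checks may differ."""

    def __init__(self, components: Optional[Any] = None) -> None:
        self.components = components

    def update_components(self, components: Any) -> None:
        # Replace the reference so later checks see the new components.
        self.components = components

    def validate_components_initialized(self) -> None:
        # Assumption: "initialized" means a non-None components object.
        if self.components is None:
            raise RuntimeError("SFT components are not initialized.")

    @staticmethod
    def validate_model_path(model_path: str) -> None:
        # Assumption: the path must be an existing directory.
        if not os.path.isdir(model_path):
            raise FileNotFoundError(f"Model path not found: {model_path}")


validator = ValidatorSketch()
with tempfile.TemporaryDirectory() as model_dir:
    # Passes silently because the temporary directory exists.
    ValidatorSketch.validate_model_path(model_dir)
```

Since `validate_model_path` is a static method, it can be called without constructing a validator or any components, which is convenient when uploading an existing model.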