# Fine-tuning libraries
Base interfaces for fine-tuning framework libraries.
This module defines the protocols and base classes for implementing fine-tuning libraries that can be used with the gllm-training library.
## BaseFineTuningLibrary

Bases: `ABC`

Base class for fine-tuning library implementations that extends `Component`.
### framework_name: str

*property*

Return the name of the framework.

Returns:

| Type | Description |
|---|---|
| `str` | Framework name (e.g., `"unsloth"`, `"transformers"`). |
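A concrete implementation typically hardcodes this value. A minimal sketch, assuming a hypothetical subclass name and import path (the abstract methods are omitted for brevity):

```python
from gllm_training import BaseFineTuningLibrary  # import path is an assumption

class TransformersFineTuningLibrary(BaseFineTuningLibrary):
    """Hypothetical subclass backed by Hugging Face transformers."""

    @property
    def framework_name(self) -> str:
        # Identifies the backing framework to the rest of gllm-training.
        return "transformers"
```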
### load_model(experiment_args, hyperparameters, **kwargs)

*abstractmethod*

Load a model based on the provided configuration.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `experiment_args` | `ExperimentConfig` | Configuration for the experiment. | *required* |
| `hyperparameters` | `FinetunedHyperparameters` | Configuration for the hyperparameters. | *required* |
| `**kwargs` | `Any` | Additional framework-specific arguments. | `{}` |

Returns:

| Type | Description |
|---|---|
| | A tuple containing the loaded model and tokenizer. |
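As an illustration, a transformers-backed implementation might delegate to `AutoModelForCausalLM.from_pretrained` and `AutoTokenizer.from_pretrained`. This is a sketch under assumptions: `experiment_args.model_name` is a hypothetical field, since the real `ExperimentConfig` and `FinetunedHyperparameters` schemas are not shown in this reference.

```python
from typing import Any, Tuple

from transformers import AutoModelForCausalLM, AutoTokenizer

def load_model(self, experiment_args, hyperparameters, **kwargs: Any) -> Tuple[Any, Any]:
    # `experiment_args.model_name` is a hypothetical field; adapt to the
    # actual ExperimentConfig schema.
    model = AutoModelForCausalLM.from_pretrained(
        experiment_args.model_name,
        **kwargs,  # framework-specific options, e.g. torch_dtype or device_map
    )
    tokenizer = AutoTokenizer.from_pretrained(experiment_args.model_name)
    # Causal LM checkpoints often ship without a pad token; reuse EOS so
    # batched training can pad sequences.
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    return model, tokenizer
```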
### save_model(model, path, **kwargs)

*abstractmethod*

Save the model to the specified path.

The associated tokenizer should also be saved, often to the same path.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model` | | The model instance to save. | *required* |
| `path` | `str` | The directory path to save the model to. | *required* |
| `**kwargs` | `Any` | Additional framework-specific arguments for saving. | `{}` |
### save_tokenizer(tokenizer, path, **kwargs)

*abstractmethod*

Save the tokenizer to the specified path.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `tokenizer` | | The tokenizer instance to save. | *required* |
| `path` | `str` | The directory path to save the tokenizer to. | *required* |
| `**kwargs` | `Any` | Additional framework-specific arguments. | `{}` |
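Both save methods usually reduce to the backing framework's own persistence call. A minimal sketch for a transformers-backed implementation, where both model and tokenizer expose `save_pretrained`:

```python
from typing import Any

def save_model(self, model, path: str, **kwargs: Any) -> None:
    # Writes the config and weight files (e.g. model.safetensors) into `path`.
    model.save_pretrained(path, **kwargs)

def save_tokenizer(self, tokenizer, path: str, **kwargs: Any) -> None:
    # Writes tokenizer_config.json, vocab/merges files, etc. into `path`.
    tokenizer.save_pretrained(path, **kwargs)
```

Saving both artifacts to the same directory keeps them reloadable with a single `from_pretrained` path.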
### setup_training(model, tokenizer, hyperparameters, train_dataset, eval_dataset=None, dataset_text_field='prompt', **kwargs)

*abstractmethod*

Set up a trainer for the model based on the provided configuration.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model` | | The model instance to train. | *required* |
| `tokenizer` | | The tokenizer instance to use with the model and data. | *required* |
| `hyperparameters` | `FinetunedHyperparameters` | Configuration for the hyperparameters. | *required* |
| `train_dataset` | `Dataset` | The training dataset. | *required* |
| `eval_dataset` | `Dataset` | The evaluation dataset (optional). | `None` |
| `dataset_text_field` | `str` | Field in the dataset containing the text data. | `'prompt'` |
| `**kwargs` | `Any` | Additional framework-specific arguments. | `{}` |

Returns:

| Type | Description |
|---|---|
| | The initialized trainer instance. |
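As one possible implementation, a TRL-based library could wrap `SFTTrainer`. Note that the TRL API has shifted across releases (`dataset_text_field` moved into `SFTConfig`, and the `tokenizer` argument was renamed `processing_class`), and the `FinetunedHyperparameters` field names below are assumptions, not the library's real schema:

```python
from typing import Any, Optional

from datasets import Dataset
from trl import SFTConfig, SFTTrainer

def setup_training(self, model, tokenizer, hyperparameters,
                   train_dataset: Dataset, eval_dataset: Optional[Dataset] = None,
                   dataset_text_field: str = "prompt", **kwargs: Any):
    # Hypothetical hyperparameter fields; map from the actual
    # FinetunedHyperparameters schema as appropriate.
    args = SFTConfig(
        output_dir=kwargs.pop("output_dir", "./checkpoints"),
        num_train_epochs=hyperparameters.num_epochs,
        learning_rate=hyperparameters.learning_rate,
        per_device_train_batch_size=hyperparameters.batch_size,
        dataset_text_field=dataset_text_field,
    )
    return SFTTrainer(
        model=model,
        args=args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        processing_class=tokenizer,  # `tokenizer=` in older TRL releases
        **kwargs,
    )
```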
## FineTuningLibraryProtocol

Bases: `Protocol[ModelType_co, TrainerType_co, TokenizerType]`

Protocol that all fine-tuning frameworks must implement.
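Because this is a `typing.Protocol`, framework libraries satisfy it structurally, with no inheritance required. Orchestration code can therefore be typed against the protocol and stay framework-agnostic. A sketch (the `run_experiment` helper and import path are illustrative, not part of the library):

```python
from gllm_training import FineTuningLibraryProtocol  # import path is an assumption

def run_experiment(library: FineTuningLibraryProtocol,
                   experiment_args, hyperparameters, train_dataset):
    """Illustrative driver that relies only on the protocol's surface."""
    model, tokenizer = library.load_model(experiment_args, hyperparameters)
    trainer = library.setup_training(model, tokenizer, hyperparameters, train_dataset)
    trainer.train()  # assumes the trainer exposes train(), as HF-style trainers do
    return model, tokenizer
```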
### framework_name: str

*property*

Return the name of the framework.

Returns:

| Type | Description |
|---|---|
| `str` | Framework name (e.g., `"unsloth"`, `"transformers"`). |
### load_model(experiment_args, hyperparameters, **kwargs)

Load a model based on the provided configuration.

The associated tokenizer should also be loaded/initialized internally and be retrievable via `get_tokenizer()`.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `experiment_args` | `ExperimentConfig` | Configuration for the experiment. | *required* |
| `hyperparameters` | `FinetunedHyperparameters` | Configuration for the hyperparameters. | *required* |
| `**kwargs` | `Any` | Additional framework-specific arguments. | `{}` |

Returns:

| Type | Description |
|---|---|
| `Tuple[ModelType_co, TokenizerType]` | A tuple containing the loaded model and tokenizer. |
### save_model(model, path, **kwargs)

Save the model to the specified path.

### save_tokenizer(tokenizer, path, **kwargs)

Save the tokenizer to the specified path.

### setup_training(model, tokenizer, hyperparameters, train_dataset, eval_dataset=None, dataset_text_field='prompt', **kwargs)

Set up a trainer for the model based on the provided configuration.
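Putting the pieces together, a typical lifecycle against either the base class or the protocol looks like the following. `UnslothFineTuningLibrary` is a placeholder name, and the `...` stand in for configuration and dataset construction that depends on your setup:

```python
library = UnslothFineTuningLibrary()  # placeholder concrete implementation
assert library.framework_name == "unsloth"

experiment_args = ...   # build an ExperimentConfig
hyperparameters = ...   # build a FinetunedHyperparameters
train_dataset = ...     # a datasets.Dataset with a "prompt" text field

model, tokenizer = library.load_model(experiment_args, hyperparameters)
trainer = library.setup_training(
    model,
    tokenizer,
    hyperparameters,
    train_dataset,
    dataset_text_field="prompt",
)
trainer.train()

# Persist both artifacts to the same directory so they reload together.
library.save_model(model, "outputs/run-001")
library.save_tokenizer(tokenizer, "outputs/run-001")
```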