Experiment tracker
Base class for all experiment trackers.
BaseExperimentTracker(project_name, **kwargs)
Bases: ABC
Base class for all experiment trackers.
This class defines the core interface for experiment tracking across different backends. It provides methods for logging individual results and batch results using the observability pattern.
Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `project_name` | `str` | The name of the project. |
Initialize the experiment tracker.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `project_name` | `str` | The name of the project. | *required* |
| `**kwargs` | `Any` | Additional configuration parameters. | `{}` |
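Because the class is abstract, a backend must subclass it and implement the abstract logging methods. The sketch below is a minimal, hedged illustration using an in-memory list as the "backend"; the import paths are hypothetical, and `EvaluationOutput` / `MetricInput` are treated as opaque objects since their structure is not defined on this page.

```python
# Minimal sketch, not a reference implementation.
from typing import Any, Dict, List, Optional

from my_project.tracking import BaseExperimentTracker      # hypothetical import path
from my_project.types import EvaluationOutput, MetricInput  # hypothetical import path


class InMemoryExperimentTracker(BaseExperimentTracker):
    """Stores logged results in a plain list instead of a real backend."""

    def __init__(self, project_name: str, **kwargs: Any):
        super().__init__(project_name, **kwargs)
        self._records: List[Dict[str, Any]] = []

    def log(self, evaluation_result: EvaluationOutput, dataset_name: str,
            data: MetricInput, run_id: Optional[str] = None,
            metadata: Optional[Dict[str, Any]] = None, **kwargs: Any):
        # Synchronous logging: append one record.
        self._records.append({
            "run_id": run_id,
            "dataset": dataset_name,
            "result": evaluation_result,
            "data": data,
            "metadata": metadata or {},
        })

    async def alog(self, evaluation_result: EvaluationOutput, dataset_name: str,
                   data: MetricInput, run_id: Optional[str] = None,
                   metadata: Optional[Dict[str, Any]] = None, **kwargs: Any):
        # Asynchronous variant: delegate to the synchronous path.
        self.log(evaluation_result, dataset_name, data, run_id, metadata, **kwargs)

    async def log_batch(self, evaluation_results: List[EvaluationOutput],
                        dataset_name: str, data: List[MetricInput],
                        run_id: Optional[str] = None,
                        metadata: Optional[Dict[str, Any]] = None, **kwargs: Any):
        # Batch logging: one record per (result, input) pair.
        for result, item in zip(evaluation_results, data):
            self.log(result, dataset_name, item, run_id, metadata, **kwargs)
```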
alog(evaluation_result, dataset_name, data, run_id=None, metadata=None, **kwargs)
abstractmethod
async
Log a single evaluation result (asynchronous).
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `evaluation_result` | `EvaluationOutput` | The evaluation result to log. | *required* |
| `dataset_name` | `str` | Name of the dataset being evaluated. | *required* |
| `data` | `MetricInput` | The input data that was evaluated. | *required* |
| `run_id` | `Optional[str]` | ID of the experiment run. Can be auto-generated if None. | `None` |
| `metadata` | `Optional[Dict[str, Any]]` | Additional metadata to log. | `None` |
| `**kwargs` | `Any` | Additional configuration parameters. | `{}` |
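A hedged usage sketch: `alog` is awaited from asynchronous evaluation code. The tracker, result, and input objects below are placeholders for instances produced elsewhere; the dataset name and metadata are purely illustrative.

```python
async def record_result(tracker, evaluation_result, metric_input):
    # tracker: any concrete BaseExperimentTracker subclass; evaluation_result /
    # metric_input: existing EvaluationOutput / MetricInput instances (placeholders).
    await tracker.alog(
        evaluation_result=evaluation_result,
        dataset_name="qa-dev-set",          # illustrative dataset name
        data=metric_input,
        run_id=None,                        # may be auto-generated by the backend
        metadata={"model": "baseline-v1"},  # illustrative metadata
    )
```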
get_experiment_history(**kwargs)
Get all experiment runs.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `**kwargs` | `Any` | Additional configuration parameters. | `{}` |

Returns:

| Type | Description |
| --- | --- |
| `List[Dict[str, Any]]` | List of experiment runs. |
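A brief usage sketch, assuming a concrete tracker instance; the keys inside each run dictionary are backend-specific and not specified here.

```python
def show_history(tracker):
    # tracker: any concrete BaseExperimentTracker subclass (placeholder).
    runs = tracker.get_experiment_history()
    print(f"{len(runs)} runs recorded")
    for run in runs:
        # Each run is a Dict[str, Any]; its keys depend on the backend.
        print(run)
```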
get_run_results(run_id, **kwargs)
Get detailed results for a specific run.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `run_id` | `str` | ID of the experiment run. | *required* |
| `**kwargs` | `Any` | Additional configuration parameters. | `{}` |

Returns:

| Type | Description |
| --- | --- |
| `list[dict[str, Any]]` | Detailed run results. |
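A hedged sketch of fetching one run's results; the run ID would normally come from `get_experiment_history()` or from the value used when logging, and the row keys are backend-specific.

```python
def show_run(tracker, run_id):
    # tracker: any concrete BaseExperimentTracker subclass (placeholder).
    for row in tracker.get_run_results(run_id):
        print(row)  # each row is a dict[str, Any]; keys depend on the backend
```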
log(evaluation_result, dataset_name, data, run_id=None, metadata=None, **kwargs)
abstractmethod
Log a single evaluation result (synchronous).
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `evaluation_result` | `EvaluationOutput` | The evaluation result to log. | *required* |
| `dataset_name` | `str` | Name of the dataset being evaluated. | *required* |
| `data` | `MetricInput` | The input data that was evaluated. | *required* |
| `run_id` | `Optional[str]` | ID of the experiment run. Can be auto-generated if None. | `None` |
| `metadata` | `Optional[Dict[str, Any]]` | Additional metadata to log. | `None` |
| `**kwargs` | `Any` | Additional configuration parameters. | `{}` |
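A hedged sketch of the synchronous path; the result and input objects are placeholders for instances produced by your evaluation pipeline, and the dataset name, run ID, and metadata are illustrative values.

```python
def record_one(tracker, evaluation_result, metric_input):
    # evaluation_result / metric_input: existing EvaluationOutput / MetricInput
    # instances from your evaluation pipeline (placeholders here).
    tracker.log(
        evaluation_result=evaluation_result,
        dataset_name="qa-dev-set",   # illustrative dataset name
        data=metric_input,
        run_id="run-001",            # reuse a run ID to group related results
        metadata={"split": "dev"},   # illustrative metadata
    )
```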
log_batch(evaluation_results, dataset_name, data, run_id=None, metadata=None, **kwargs)
abstractmethod
async
Log a batch of evaluation results (asynchronous).
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `evaluation_results` | `List[EvaluationOutput]` | The evaluation results to log. | *required* |
| `dataset_name` | `str` | Name of the dataset being evaluated. | *required* |
| `data` | `List[MetricInput]` | List of input data that was evaluated. | *required* |
| `run_id` | `Optional[str]` | ID of the experiment run. Can be auto-generated if None. | `None` |
| `metadata` | `Optional[Dict[str, Any]]` | Additional metadata to log. | `None` |
| `**kwargs` | `Any` | Additional configuration parameters. | `{}` |
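A hedged batch-logging sketch; the result and input lists are assumed to come from your pipeline and to be aligned index-by-index, and the dataset name and metadata are illustrative.

```python
async def record_batch(tracker, results, inputs):
    # results: List[EvaluationOutput], inputs: List[MetricInput];
    # both produced elsewhere and aligned item-by-item (placeholders here).
    await tracker.log_batch(
        evaluation_results=results,
        dataset_name="qa-dev-set",              # illustrative dataset name
        data=inputs,
        metadata={"batch_size": len(results)},  # illustrative metadata
    )
```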
log_context(**kwargs)
async
Log a context.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `**kwargs` | `Any` | Additional configuration parameters. | `{}` |
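Since `log_context` accepts only keyword arguments, which keys a given backend understands is not specified on this page; the keyword arguments in the sketch below are purely illustrative.

```python
async def record_context(tracker):
    # tracker: any concrete BaseExperimentTracker subclass (placeholder).
    # The keyword arguments here are illustrative, not a documented schema.
    await tracker.log_context(model_name="baseline-v1", temperature=0.2)
```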