
Experiment tracker

Base class for all experiment trackers.

Authors

Apri Dwi Rachmadi (apri.d.rachmadi@gdplabs.id)

References

NONE

BaseExperimentTracker(project_name, **kwargs)

Bases: ABC

Base class for all experiment trackers.

This class defines the core interface for experiment tracking across different backends. It provides methods for logging individual results and batch results using the observability pattern.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| project_name | str | The name of the project. |

Initialize the experiment tracker.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| project_name | str | The name of the project. | required |
| **kwargs | Any | Additional configuration parameters. | {} |
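Because BaseExperimentTracker is an ABC, a backend supplies its own implementations of the abstract methods (log, alog, log_batch). Below is a minimal sketch of a subclass; the import path and the in-memory storage are assumptions for illustration, not part of the documented API.

```python
from typing import Any, Dict, List, Optional

# Hypothetical import path; adjust to wherever BaseExperimentTracker lives in your package.
from experiment_tracker import BaseExperimentTracker


class InMemoryExperimentTracker(BaseExperimentTracker):
    """Toy tracker that keeps results in a dict, for illustration only."""

    def __init__(self, project_name: str, **kwargs: Any) -> None:
        super().__init__(project_name, **kwargs)
        self._runs: Dict[str, List[Dict[str, Any]]] = {}

    def log(self, evaluation_result, dataset_name, data, run_id=None, metadata=None, **kwargs):
        # Store one result under its run ID (or a placeholder when none is given).
        run_id = run_id or "default-run"
        self._runs.setdefault(run_id, []).append(
            {"dataset": dataset_name, "result": evaluation_result, "data": data, "metadata": metadata}
        )

    async def alog(self, evaluation_result, dataset_name, data, run_id=None, metadata=None, **kwargs):
        # Delegate to the synchronous path; a real backend would await an async client here.
        self.log(evaluation_result, dataset_name, data, run_id=run_id, metadata=metadata, **kwargs)

    async def log_batch(self, evaluation_results, dataset_name, data, run_id=None, metadata=None, **kwargs):
        # Log each (result, input) pair; the two lists are expected to be parallel.
        for result, item in zip(evaluation_results, data):
            self.log(result, dataset_name, item, run_id=run_id, metadata=metadata, **kwargs)
```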

alog(evaluation_result, dataset_name, data, run_id=None, metadata=None, **kwargs) abstractmethod async

Log a single evaluation result (asynchronous).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| evaluation_result | EvaluationOutput | The evaluation result to log. | required |
| dataset_name | str | Name of the dataset being evaluated. | required |
| data | MetricInput | The input data that was evaluated. | required |
| run_id | Optional[str] | ID of the experiment run. Can be auto-generated if None. | None |
| metadata | Optional[Dict[str, Any]] | Additional metadata to log. | None |
| **kwargs | Any | Additional configuration parameters. | {} |
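A sketch of calling alog from async code, assuming a concrete tracker instance and an EvaluationOutput / MetricInput pair already produced by the evaluation pipeline; the dataset name and metadata values are illustrative.

```python
async def record_result(tracker, evaluation_result, metric_input) -> None:
    """Log one evaluation result for a dataset."""
    await tracker.alog(
        evaluation_result,
        dataset_name="qa-benchmark-v1",  # hypothetical dataset name
        data=metric_input,
        run_id=None,                     # None lets the backend auto-generate a run ID
        metadata={"model": "my-model"},  # optional extra metadata
    )

# Run from an event loop, e.g. asyncio.run(record_result(tracker, evaluation_result, metric_input))
```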

get_experiment_history(**kwargs)

Get all experiment runs.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| **kwargs | Any | Additional configuration parameters. | {} |

Returns:

| Type | Description |
| --- | --- |
| List[Dict[str, Any]] | List of experiment runs. |
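For example, listing the recorded runs from a concrete tracker; the interface only guarantees a List[Dict[str, Any]], so the per-run keys are backend-specific.

```python
runs = tracker.get_experiment_history()
for run in runs:
    # The exact keys depend on the backend; print the whole dict to inspect them.
    print(run)
```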

get_run_results(run_id, **kwargs)

Get detailed results for a specific run.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| run_id | str | ID of the experiment run. | required |
| **kwargs | Any | Additional configuration parameters. | {} |

Returns:

| Type | Description |
| --- | --- |
| list[dict[str, Any]] | Detailed run results. |
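A usage sketch, assuming the run ID is already known (for example, taken from get_experiment_history); the ID shown is hypothetical.

```python
results = tracker.get_run_results("run-2024-01-01")  # hypothetical run ID
print(f"{len(results)} results logged in this run")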

log(evaluation_result, dataset_name, data, run_id=None, metadata=None, **kwargs) abstractmethod

Log a single evaluation result (synchronous).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| evaluation_result | EvaluationOutput | The evaluation result to log. | required |
| dataset_name | str | Name of the dataset being evaluated. | required |
| data | MetricInput | The input data that was evaluated. | required |
| run_id | Optional[str] | ID of the experiment run. Can be auto-generated if None. | None |
| metadata | Optional[Dict[str, Any]] | Additional metadata to log. | None |
| **kwargs | Any | Additional configuration parameters. | {} |
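This is the synchronous counterpart of alog. A sketch with the same assumed objects as above (evaluation_result, metric_input, and the dataset name are illustrative):

```python
tracker.log(
    evaluation_result,
    dataset_name="qa-benchmark-v1",    # hypothetical dataset name
    data=metric_input,
    metadata={"stage": "smoke-test"},  # optional; run_id is omitted so the backend may generate one
)
```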

log_batch(evaluation_results, dataset_name, data, run_id=None, metadata=None, **kwargs) abstractmethod async

Log a batch of evaluation results (asynchronous).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| evaluation_results | List[EvaluationOutput] | The evaluation results to log. | required |
| dataset_name | str | Name of the dataset being evaluated. | required |
| data | List[MetricInput] | List of input data that was evaluated. | required |
| run_id | Optional[str] | ID of the experiment run. Can be auto-generated if None. | None |
| metadata | Optional[Dict[str, Any]] | Additional metadata to log. | None |
| **kwargs | Any | Additional configuration parameters. | {} |
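A sketch of batch logging from async code, assuming evaluation_results and metric_inputs are parallel lists produced by the evaluation step; names and the dataset label are illustrative.

```python
await tracker.log_batch(
    evaluation_results,              # List[EvaluationOutput] from the evaluation step
    dataset_name="qa-benchmark-v1",  # hypothetical dataset name
    data=metric_inputs,              # List[MetricInput], in the same order as the results
    run_id=None,                     # let the backend assign a run ID
)
```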

log_context(**kwargs) async

Log a context.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| **kwargs | Any | Additional configuration parameters. | {} |
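log_context is a coroutine that takes only keyword arguments; which keys a backend understands is implementation-specific, so the values below are purely illustrative.

```python
await tracker.log_context(
    environment="staging",  # hypothetical key
    git_commit="abc1234",   # hypothetical key
)
```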