Evaluator

Base class for all evaluators.

Authors

Surya Mahadi (made.r.s.mahadi@gdplabs.id)

BaseEvaluator(name, batch_status_check_interval=DefaultValues.BATCH_STATUS_CHECK_INTERVAL, batch_max_iterations=DefaultValues.BATCH_MAX_ITERATIONS)

Bases: ABC

Base class for all evaluators.

Attributes:

Name Type Description
name str

The name of the evaluator.

required_fields set[str]

The required fields for the evaluator.

input_type type | None

The type of the input data.

Initialize the evaluator.

Parameters:

Name Type Description Default
name str

The name of the evaluator.

required
batch_status_check_interval float

Time between batch status checks in seconds. Defaults to 30.0.

BATCH_STATUS_CHECK_INTERVAL
batch_max_iterations int

Maximum number of status check iterations before timeout. Defaults to 120 (60 minutes with default interval).

BATCH_MAX_ITERATIONS

Raises:

Type Description
ValueError

If batch_status_check_interval or batch_max_iterations is not positive.
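
The constructor contract above (positive values only, and a total wait of interval times iterations) can be sketched as follows. This is a minimal stand-in, not the library's implementation: `SketchEvaluator`, the module-level constants, and the `batch_timeout_seconds` property are hypothetical names, and the two default values are taken from the "Defaults to" notes in the parameter descriptions.

```python
from abc import ABC

# Assumed values, copied from the parameter descriptions above.
BATCH_STATUS_CHECK_INTERVAL = 30.0
BATCH_MAX_ITERATIONS = 120


class SketchEvaluator(ABC):
    """Hypothetical stand-in that mirrors the documented constructor."""

    def __init__(
        self,
        name,
        batch_status_check_interval=BATCH_STATUS_CHECK_INTERVAL,
        batch_max_iterations=BATCH_MAX_ITERATIONS,
    ):
        # Both batch settings must be strictly positive.
        if batch_status_check_interval <= 0:
            raise ValueError("batch_status_check_interval must be positive")
        if batch_max_iterations <= 0:
            raise ValueError("batch_max_iterations must be positive")
        self.name = name
        self.batch_status_check_interval = batch_status_check_interval
        self.batch_max_iterations = batch_max_iterations

    @property
    def batch_timeout_seconds(self):
        # Hypothetical helper: longest time spent polling before giving up.
        return self.batch_status_check_interval * self.batch_max_iterations


default_timeout = SketchEvaluator("demo").batch_timeout_seconds
```

With the defaults, 120 checks at 30.0 seconds each give 3600 seconds, which matches the 60 minutes stated above.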

aggregate_required_fields(metrics, mode='any') staticmethod

Aggregate required fields from multiple metrics.

Parameters:

Name Type Description Default
metrics Iterable[BaseMetric]

The metrics to aggregate from.

required
mode str

The aggregation mode. Defaults to "any". Options:
- "union": All fields required by any metric.
- "intersection": Only fields required by all metrics.
- "any": Empty set (no validation).

'any'

Returns:

Type Description
set[str]

The aggregated required fields.

Raises:

Type Description
ValueError

If mode is not one of the supported options.
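
The three modes can be illustrated with a standalone function. This is a sketch under stated assumptions, not the library's code: `FakeMetric` is a hypothetical stand-in for BaseMetric, it is assumed that each metric exposes a `required_fields` set, and the field names are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMetric:
    """Hypothetical stand-in for BaseMetric."""

    name: str
    required_fields: set = field(default_factory=set)


def aggregate_required_fields(metrics, mode="any"):
    # Materialize first so a one-shot iterable is consumed only once.
    field_sets = [set(m.required_fields) for m in metrics]
    if mode == "any":
        # No validation: nothing is required at the evaluator level.
        return set()
    if mode == "union":
        return set().union(*field_sets)
    if mode == "intersection":
        # An empty metric list has no common fields.
        return set.intersection(*field_sets) if field_sets else set()
    raise ValueError(f"Unsupported mode: {mode!r}")


metrics = [
    FakeMetric("a", {"query", "generated_response"}),
    FakeMetric("b", {"query", "expected_response"}),
]
union_fields = aggregate_required_fields(metrics, mode="union")
common_fields = aggregate_required_fields(metrics, mode="intersection")
```

Here `union_fields` holds all three field names, while `common_fields` holds only `query`.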

can_evaluate_any(metrics, data) staticmethod

Check if any of the metrics can evaluate the given data.

Parameters:

Name Type Description Default
metrics Iterable[BaseMetric]

The metrics to check.

required
data MetricInput

The data to validate against.

required

Returns:

Type Description
bool

True if any metric can evaluate the data, False otherwise.
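
A rough sketch of the check follows. The page does not say how an individual metric decides whether it can evaluate the data, so the per-metric `can_evaluate` method and its rule (every required field present and not None) are assumptions made for illustration, as are `FakeMetric` and the field names.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMetric:
    """Hypothetical stand-in for BaseMetric."""

    name: str
    required_fields: set = field(default_factory=set)

    def can_evaluate(self, data):
        # Assumed rule: every required field is present and not None.
        return all(data.get(f) is not None for f in self.required_fields)


def can_evaluate_any(metrics, data):
    # True as soon as one metric accepts the data.
    return any(m.can_evaluate(data) for m in metrics)


metrics = [
    FakeMetric("needs_reference", {"query", "expected_response"}),
    FakeMetric("reference_free", {"query", "generated_response"}),
]
row = {"query": "What is 2 + 2?", "generated_response": "4"}
usable = can_evaluate_any(metrics, row)
```

In this example only the second metric has all of its fields, which is enough for `usable` to be True.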

ensure_list_of_dicts(data, key) staticmethod

Ensure that a field in the data is a list of dictionaries.

Parameters:

Name Type Description Default
data MetricInput

The data to validate.

required
key str

The key to check.

required

Raises:

Type Description
ValueError

If the field is not a list or contains non-dictionary elements.
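
The documented behavior can be approximated as below. This sketch assumes MetricInput behaves like a dict; the key name `retrieved_chunks` and the error messages are illustrative only.

```python
def ensure_list_of_dicts(data, key):
    value = data.get(key)
    # A missing key yields None, which fails the list check as well.
    if not isinstance(value, list):
        raise ValueError(f"'{key}' must be a list")
    if not all(isinstance(item, dict) for item in value):
        raise ValueError(f"'{key}' must contain only dictionaries")


# Passes silently: a list whose elements are all dicts.
ensure_list_of_dicts({"retrieved_chunks": [{"text": "a"}, {"text": "b"}]}, "retrieved_chunks")
```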

ensure_non_empty_list(data, key) staticmethod

Ensure that a field in the data is a non-empty list.

Parameters:

Name Type Description Default
data MetricInput

The data to validate.

required
key str

The key to check.

required

Raises:

Type Description
ValueError

If the field is not a list or is empty.
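
A comparable sketch for the non-empty check, again assuming MetricInput behaves like a dict. The helper `is_valid` and the key name `tool_calls` are hypothetical and only show how a caller might turn the ValueError into a boolean.

```python
def ensure_non_empty_list(data, key):
    value = data.get(key)
    if not isinstance(value, list):
        raise ValueError(f"'{key}' must be a list")
    if not value:
        raise ValueError(f"'{key}' must not be empty")


def is_valid(data, key):
    # Hypothetical convenience wrapper around the raising validator.
    try:
        ensure_non_empty_list(data, key)
    except ValueError:
        return False
    return True


results = [
    is_valid({"tool_calls": [{"name": "search"}]}, "tool_calls"),
    is_valid({"tool_calls": []}, "tool_calls"),
    is_valid({"tool_calls": "search"}, "tool_calls"),
]
```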

evaluate(data) async

Evaluate the data (single item or batch).

Parameters:

Name Type Description Default
data MetricInput | list[MetricInput]

The data to be evaluated. Can be a single item or a list for batch processing.

required

Returns:

Type Description
EvaluationOutput | list[EvaluationOutput]

The evaluation output, including global_explanation. A list is returned when the input is a list.
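
The single-or-batch contract can be sketched with plain asyncio. This is only an illustration of the input and output shapes: `evaluate_one` is a hypothetical per-item coroutine, plain dicts stand in for MetricInput and EvaluationOutput, and the batch status polling governed by the constructor parameters is not modeled.

```python
import asyncio


async def evaluate_one(item):
    # Hypothetical per-item evaluation returning a dict-shaped output.
    return {"global_explanation": f"evaluated fields: {sorted(item)}"}


async def evaluate(data):
    if isinstance(data, list):
        # Batch input: evaluate concurrently and keep the input order.
        return list(await asyncio.gather(*(evaluate_one(item) for item in data)))
    # Single input: return a single output.
    return await evaluate_one(data)


single = asyncio.run(evaluate({"query": "q", "generated_response": "a"}))
batch = asyncio.run(evaluate([{"query": "q1"}, {"query": "q2"}]))
```

`single` is one dict, while `batch` is a list with one output per input item.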

get_input_fields() classmethod

Return declared input field names if input_type is provided; otherwise None.

Returns:

Type Description
list[str] | None

The declared input field names, or None if input_type is not set.

get_input_spec() classmethod

Return structured spec for input fields if input_type is provided; otherwise None.

Returns:

Type Description
list[dict[str, Any]] | None

The structured input spec, or None if input_type is not set.
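
One way the two class methods could behave is sketched below. The page does not state what kind of type input_type is or what keys the spec dictionaries carry, so this example assumes a TypedDict and a spec made of `name` and `type` entries. `QAInput`, `SketchEvaluator`, and `QAEvaluator` are hypothetical names.

```python
from typing import Optional, TypedDict, get_type_hints


class QAInput(TypedDict):
    """Hypothetical input type with two declared fields."""

    query: str
    generated_response: str


class SketchEvaluator:
    """Hypothetical stand-in for BaseEvaluator."""

    input_type: Optional[type] = None

    @classmethod
    def get_input_fields(cls):
        if cls.input_type is None:
            return None
        # Field names in declaration order.
        return list(get_type_hints(cls.input_type))

    @classmethod
    def get_input_spec(cls):
        if cls.input_type is None:
            return None
        hints = get_type_hints(cls.input_type)
        # Assumed spec shape: one dict per field with its name and type name.
        return [{"name": n, "type": getattr(t, "__name__", str(t))} for n, t in hints.items()]


class QAEvaluator(SketchEvaluator):
    input_type = QAInput


fields = QAEvaluator.get_input_fields()
spec = QAEvaluator.get_input_spec()
```

The base class, which declares no input_type, returns None from both methods.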