Evaluator
Base class for all evaluators.
BaseEvaluator(name, batch_status_check_interval=DefaultValues.BATCH_STATUS_CHECK_INTERVAL, batch_max_iterations=DefaultValues.BATCH_MAX_ITERATIONS)
Bases: ABC
Base class for all evaluators.
Attributes:
| Name | Type | Description |
|---|---|---|
| `name` | `str` | The name of the evaluator. |
| `required_fields` | `set[str]` | The required fields for the evaluator. |
| `input_type` | `type \| None` | The type of the input data. |
Initialize the evaluator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | The name of the evaluator. | required |
| `batch_status_check_interval` | `float` | Time between batch status checks, in seconds. Defaults to 30.0. | `DefaultValues.BATCH_STATUS_CHECK_INTERVAL` |
| `batch_max_iterations` | `int` | Maximum number of status check iterations before timeout. Defaults to 120 (60 minutes with the default interval). | `DefaultValues.BATCH_MAX_ITERATIONS` |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If `batch_status_check_interval` or `batch_max_iterations` is not positive. |
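The relationship between the two batch parameters can be checked with a little arithmetic. The sketch below is not the library's code: `validate_batch_settings` is a hypothetical helper that mirrors the documented behaviour (both values must be positive) and computes the resulting timeout, assuming the timeout is simply the interval multiplied by the iteration limit.

```python
def validate_batch_settings(interval: float = 30.0, max_iterations: int = 120) -> float:
    """Hypothetical helper: validate polling settings and return the timeout in seconds."""
    # Mirrors the documented ValueError: both values must be positive.
    if interval <= 0 or max_iterations <= 0:
        raise ValueError(
            "batch_status_check_interval and batch_max_iterations must be positive"
        )
    # Assumed relationship: total wait = interval * number of status checks.
    return interval * max_iterations


timeout_seconds = validate_batch_settings()
print(timeout_seconds / 60)  # 60.0 minutes with the documented defaults
```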
aggregate_required_fields(metrics, mode='any')
staticmethod
Aggregate required fields from multiple metrics.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `metrics` | `Iterable[BaseMetric]` | The metrics to aggregate from. | required |
| `mode` | `str` | The aggregation mode: `"union"` (all fields required by any metric), `"intersection"` (only fields required by all metrics), or `"any"` (empty set, no validation). Defaults to `"any"`. | `'any'` |
Returns:
| Type | Description |
|---|---|
| `set[str]` | The aggregated required fields. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If `mode` is not one of the supported options. |
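The three modes map onto plain set operations. The following is a minimal sketch of those semantics, not the library's implementation: `aggregate_fields` is a hypothetical stand-in that takes the required-field sets directly rather than `BaseMetric` objects.

```python
from typing import Iterable


def aggregate_fields(field_sets: Iterable[set[str]], mode: str = "any") -> set[str]:
    """Hypothetical stand-in illustrating the documented aggregation modes."""
    sets = list(field_sets)
    if mode == "any":
        # No validation: nothing is required.
        return set()
    if mode == "union":
        # A field is required if any metric requires it.
        return set().union(*sets)
    if mode == "intersection":
        # A field is required only if every metric requires it.
        return set.intersection(*sets) if sets else set()
    raise ValueError(f"Unsupported mode: {mode!r}")


fields_a = {"query", "response"}
fields_b = {"response", "context"}

print(sorted(aggregate_fields([fields_a, fields_b], "union")))         # ['context', 'query', 'response']
print(sorted(aggregate_fields([fields_a, fields_b], "intersection")))  # ['response']
print(aggregate_fields([fields_a, fields_b]))                          # set()
```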
can_evaluate_any(metrics, data)
staticmethod
Check if any of the metrics can evaluate the given data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `metrics` | `Iterable[BaseMetric]` | The metrics to check. | required |
| `data` | `MetricInput` | The data to validate against. | required |
Returns:
| Type | Description |
|---|---|
| `bool` | `True` if any metric can evaluate the data, `False` otherwise. |
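The check is an "any" over the metrics. The sketch below illustrates that shape under a loud assumption: `StubMetric` and its `can_evaluate` method are hypothetical, and the rule that a metric can evaluate data when all of its required fields are present is a guess, since the page does not say how `BaseMetric` decides this.

```python
from dataclasses import dataclass, field


@dataclass
class StubMetric:
    """Hypothetical stand-in for BaseMetric; the real interface may differ."""
    required_fields: set[str] = field(default_factory=set)

    def can_evaluate(self, data: dict) -> bool:
        # Assumption: a metric can run when every required field is present.
        return self.required_fields.issubset(data)


def can_evaluate_any(metrics, data: dict) -> bool:
    """Return True as soon as one metric accepts the data."""
    return any(metric.can_evaluate(data) for metric in metrics)


metrics = [StubMetric({"query", "context"}), StubMetric({"query", "response"})]
print(can_evaluate_any(metrics, {"query": "q", "response": "r"}))  # True
print(can_evaluate_any(metrics, {"query": "q"}))                   # False
```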
ensure_list_of_dicts(data, key)
staticmethod
Ensure that a field in the data is a list of dictionaries.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `MetricInput` | The data to validate. | required |
| `key` | `str` | The key to check. | required |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If the field is not a list or contains non-dictionary elements. |
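A minimal sketch of the validation described above, assuming `MetricInput` behaves like a plain mapping. `check_list_of_dicts` is a hypothetical name used to keep the sketch separate from the real method, and the error messages are illustrative.

```python
def check_list_of_dicts(data: dict, key: str) -> None:
    """Hypothetical sketch: raise ValueError unless data[key] is a list of dicts."""
    value = data.get(key)
    if not isinstance(value, list):
        raise ValueError(f"'{key}' must be a list, got {type(value).__name__}")
    for index, item in enumerate(value):
        if not isinstance(item, dict):
            raise ValueError(f"'{key}[{index}]' must be a dict, got {type(item).__name__}")


# Passes silently: every element is a dictionary.
check_list_of_dicts({"tool_calls": [{"name": "search"}, {"name": "lookup"}]}, "tool_calls")

# Fails: the second element is a string.
try:
    check_list_of_dicts({"tool_calls": [{"name": "search"}, "lookup"]}, "tool_calls")
except ValueError as error:
    print(error)  # 'tool_calls[1]' must be a dict, got str
```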
ensure_non_empty_list(data, key)
staticmethod
Ensure that a field in the data is a non-empty list.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `MetricInput` | The data to validate. | required |
| `key` | `str` | The key to check. | required |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If the field is not a list or is empty. |
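Validators like this are typically called at the top of an evaluation step so that bad input fails early. The sketch below shows that pattern with a hypothetical `check_non_empty_list` helper and a hypothetical `score_contexts` caller; it again assumes `MetricInput` behaves like a mapping.

```python
def check_non_empty_list(data: dict, key: str) -> None:
    """Hypothetical sketch: raise ValueError unless data[key] is a non-empty list."""
    value = data.get(key)
    if not isinstance(value, list) or not value:
        raise ValueError(f"'{key}' must be a non-empty list")


def score_contexts(data: dict) -> int:
    """Hypothetical caller: validate first, then do the real work."""
    check_non_empty_list(data, "retrieved_context")
    return len(data["retrieved_context"])


print(score_contexts({"retrieved_context": ["doc 1", "doc 2"]}))  # 2
```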
evaluate(data)
async
Evaluate the data (single item or batch).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `MetricInput \| list[MetricInput]` | The data to be evaluated. Can be a single item or a list for batch processing. | required |
Returns:
| Type | Description |
|---|---|
| `EvaluationOutput \| list[EvaluationOutput]` | The evaluation output, including `global_explanation`. Returns a list if the input is a list. |
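The single-versus-batch contract can be illustrated with a small coroutine that dispatches on the input type. This is a sketch under stated assumptions, not the library's code: `evaluate_one` and the dictionary it returns are hypothetical, and running items concurrently with `asyncio.gather` is one possible strategy; the page does not say how the real evaluator processes a batch.

```python
import asyncio


async def evaluate_one(item: dict) -> dict:
    """Hypothetical per-item evaluation returning a stub result."""
    return {"input": item, "global_explanation": "stub explanation"}


async def evaluate(data):
    """Return one result for a single item, or a list of results for a list."""
    if isinstance(data, list):
        # Assumption: batch items are independent and can run concurrently.
        return list(await asyncio.gather(*(evaluate_one(item) for item in data)))
    return await evaluate_one(data)


single = asyncio.run(evaluate({"query": "What is 2 + 2?"}))
batch = asyncio.run(evaluate([{"query": "a"}, {"query": "b"}]))

print(type(single).__name__)  # dict
print(len(batch))             # 2
```

Because the method is a coroutine, callers either `await` it from async code or drive it with `asyncio.run` from synchronous code, as above.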
get_input_fields()
classmethod
Return declared input field names if input_type is provided; otherwise None.
Returns:
| Type | Description |
|---|---|
| `list[str] \| None` | The input fields. |
get_input_spec()
classmethod
Return structured spec for input fields if input_type is provided; otherwise None.
Returns:
| Type | Description |
|---|---|
| `list[dict[str, Any]] \| None` | The input spec. |
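Both `get_input_fields()` and `get_input_spec()` derive their result from `input_type` and return `None` when it is not set. The sketch below shows one way such introspection could work, assuming `input_type` is a `TypedDict`. The `QAInput` class, the helper names, and the `"name"`/`"type"` keys in the spec are all hypothetical; the page does not describe the actual spec layout.

```python
from typing import Any, Optional, TypedDict, get_type_hints


class QAInput(TypedDict):
    """Hypothetical input type used only for illustration."""
    query: str
    expected_response: str


def input_fields(input_type: Optional[type]) -> Optional[list[str]]:
    """Return declared field names, or None when no input type is set."""
    if input_type is None:
        return None
    return list(get_type_hints(input_type))


def input_spec(input_type: Optional[type]) -> Optional[list[dict[str, Any]]]:
    """Return one dict per field (assumed layout), or None when no input type is set."""
    if input_type is None:
        return None
    return [
        {"name": name, "type": getattr(hint, "__name__", str(hint))}
        for name, hint in get_type_hints(input_type).items()
    ]


print(input_fields(QAInput))   # ['query', 'expected_response']
print(input_spec(QAInput)[0])  # {'name': 'query', 'type': 'str'}
print(input_fields(None))      # None
```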