
Evaluate

Evaluate Module.

This module provides a convenience function for evaluating a model.

`evaluate(data, inference_fn, evaluators, experiment_tracker=None, batch_size=10, allow_batch_evaluation=False, summary_evaluators=None, **kwargs)` (async)

Evaluate the model.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `data` | `str \| BaseDataset` | The data to evaluate. | *required* |
| `inference_fn` | `Callable` | The inference function to use. | *required* |
| `evaluators` | `list[BaseEvaluator \| BaseMetric]` | The evaluators to use. | *required* |
| `experiment_tracker` | `BaseExperimentTracker \| None` | The experiment tracker to use. | `None` |
| `batch_size` | `int` | The batch size to use for evaluation (runner-level chunking for memory management). | `10` |
| `allow_batch_evaluation` | `bool` | Enable batch-processing mode for LLM API calls. When `True`, the runner passes entire chunks to evaluators for batch processing. | `False` |
| `summary_evaluators` | `list[SummaryEvaluatorCallable] \| None` | Custom summary evaluators that compute batch-level statistics. Each callable receives `(evaluation_results, data)` and returns a dict of summary metrics. | `None` |
| `**kwargs` | `Any` | Additional configuration parameters. | `{}` |
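The `summary_evaluators` contract described above (each callable receives `(evaluation_results, data)` and returns a dict of summary metrics) could be satisfied by a function like the following sketch. The name `mean_score` and the assumption that each evaluation result is a dict with a `"score"` key are illustrative, not part of the library:

```python
# Illustrative summary evaluator. The (evaluation_results, data) -> dict
# contract comes from the parameter description above; the result shape
# (dicts carrying a "score" key) is an assumption for this sketch.
def mean_score(evaluation_results, data):
    # Reduce per-example scores to batch-level statistics.
    scores = [result["score"] for result in evaluation_results]
    return {
        "mean_score": sum(scores) / len(scores),
        "num_examples": len(scores),
    }
```

A callable like this would then be passed as `summary_evaluators=[mean_score]`.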

Returns:

| Name | Type | Description |
|------|------|-------------|
| `EvaluationResult` | `EvaluationResult` | Structured result containing evaluation results and experiment URLs/paths. |
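As a rough mental model of the batching behavior described above (runner-level chunking via `batch_size`, with `allow_batch_evaluation` switching between per-item and whole-chunk evaluator calls), here is a self-contained sketch. It is an illustrative stand-in, not the library's implementation, and it assumes evaluators are plain callables over lists of outputs:

```python
import asyncio

async def evaluate_sketch(data, inference_fn, evaluators,
                          batch_size=10, allow_batch_evaluation=False):
    """Illustrative stand-in for the evaluate() runner loop (assumed shape)."""
    results = []
    # Runner-level chunking: walk the dataset batch_size items at a time
    # to bound memory use.
    for start in range(0, len(data), batch_size):
        chunk = data[start:start + batch_size]
        outputs = [inference_fn(item) for item in chunk]
        for evaluator in evaluators:
            if allow_batch_evaluation:
                # Batch mode: the evaluator receives the entire chunk at once.
                results.append(evaluator(outputs))
            else:
                # Per-item mode: one evaluator call per example.
                results.extend(evaluator([out]) for out in outputs)
    return results
```

With `allow_batch_evaluation=True` each evaluator is invoked once per chunk; with the default `False` it is invoked once per example, which matches the per-item behavior one would expect from non-batched LLM API calls.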