Dataset

Dataset init file.

`BaseDataset(version=None, hash=None, name=None, description=None, schema=None, additional_metadata=None)`

Bases: ABC, Iterable

Base class for all datasets.

Attributes:

Name	Type	Description
`dataset`	`list[MetricInput]`	The dataset to evaluate.
`version`	`str \| None`	The version of the dataset.
`hash`	`str \| None`	The hash of the dataset.
`name`	`str \| None`	The name of the dataset.
`description`	`str \| None`	The description of the dataset.
`schema`	`type[BaseModel] \| None`	The schema of the dataset.
`additional_metadata`	`dict[str, Any] \| None`	Additional metadata of the dataset.

Initialize the dataset.

Parameters:

Name	Type	Description	Default
`version`	`str \| None`	The version of the dataset. Defaults to None.	`None`
`hash`	`str \| None`	The hash of the dataset. Defaults to None.	`None`
`name`	`str \| None`	The name of the dataset. Defaults to None.	`None`
`description`	`str \| None`	The description of the dataset. Defaults to None.	`None`
`schema`	`type[BaseModel] \| None`	The schema of the dataset. Defaults to None.	`None`
`additional_metadata`	`dict[str, Any] \| None`	Additional metadata of the dataset. Defaults to None.	`None`

`__getitem__(index)`

Get the item at the given index.

Parameters:

Name	Type	Description	Default
`index`	`int`	The index of the item to get.	required

Returns:

Type	Description
`MetricInput \| list[MetricInput]`	The item at the given index, or a list of items if the index is a list.

`__iter__()`

Iterate over the dataset.

Returns:

Type	Description
`Iterator[MetricInput]`	An iterator over the dataset.

`__len__()`

Get the length of the dataset.

Returns:

Type	Description
`int`	The length of the dataset.

`filter(filter_fn)`

Filter the dataset.

Parameters:

Name	Type	Description	Default
`filter_fn`	`Callable[[MetricInput], bool]`	The filter function.	required

`load()` `abstractmethod`

Load the dataset.

Returns:

Type	Description
`list[MetricInput]`	The loaded dataset.

Raises:

Type	Description
`NotImplementedError`	If the load method is not implemented.

`map(map_fn)`

Map the dataset.

Parameters:

Name	Type	Description	Default
`map_fn`	`Callable[[MetricInput], MetricInput]`	The map function.	required
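To make the documented callable signatures concrete, here is a minimal sketch of a `filter_fn` and a `map_fn`. It assumes `MetricInput` behaves like a plain dictionary and applies the functions with a list comprehension instead of calling the library, because this page does not state whether `filter` and `map` return a new dataset or modify it in place. The function names and the sample rows are hypothetical.

```python
from typing import Any, Dict, List

# Assumption: a plain dict stands in for the library's MetricInput type.
Row = Dict[str, Any]


def has_expected_response(row: Row) -> bool:
    """Example filter_fn: keep only rows with a non-empty expected response."""
    return bool(row.get("expected_response"))


def normalize_query(row: Row) -> Row:
    """Example map_fn: return a copy of the row with a cleaned-up query."""
    return {**row, "query": row["query"].strip().lower()}


rows: List[Row] = [
    {"query": "  What is RAG? ", "expected_response": "Retrieval-augmented generation."},
    {"query": "Unlabeled question", "expected_response": ""},
]

# Filter first, then map, without touching the original rows.
kept = [normalize_query(row) for row in rows if has_expected_response(row)]
```

Returning a new dictionary from the map function keeps the input rows unchanged, which makes the transformation safe to repeat.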

`sample(n=3)`

Sample n items from the dataset.

Parameters:

Name	Type	Description	Default
`n`	`int`	The number of items to sample.	`3`

Returns:

Type	Description
`list[MetricInput]`	The sampled items.

`shuffle()`

Shuffle the dataset.
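This page does not say how `sample` and `shuffle` pick or order items, so the following is only a sketch of the general technique using Python's standard `random` module on a plain list. The row contents and the fixed seed are assumptions chosen to make the example reproducible.

```python
import random

# Assumption: plain dicts stand in for MetricInput rows.
rows = [{"query": f"q{i}"} for i in range(10)]

# A dedicated generator with a fixed seed keeps runs reproducible.
rng = random.Random(42)

# Sample without replacement, never asking for more items than exist.
picked = rng.sample(rows, k=min(3, len(rows)))

# Shuffle a copy so the original order stays available.
shuffled = rows[:]
rng.shuffle(shuffled)
```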

`validate()` `abstractmethod`

Validate the dataset.

Raises:

Type	Description
`NotImplementedError`	If the validate method is not implemented.
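The contract described above (two abstract methods plus the container protocol) can be illustrated with a self-contained stand-in. This is not the library's implementation: `SketchDataset`, `InMemoryDataset`, and the `Row` alias are hypothetical names, and the list-index behavior is inferred only from the `__getitem__` return description.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterator, List, Union

# Assumption: a plain dict stands in for MetricInput.
Row = Dict[str, Any]


class SketchDataset(ABC):
    """Hypothetical stand-in mirroring the documented BaseDataset surface."""

    def __init__(self, rows: List[Row]) -> None:
        self.dataset = list(rows)
        self.validate()

    @abstractmethod
    def load(self) -> List[Row]:
        """Return the rows of the dataset."""

    @abstractmethod
    def validate(self) -> None:
        """Raise an error if the rows are not usable."""

    def __len__(self) -> int:
        return len(self.dataset)

    def __iter__(self) -> Iterator[Row]:
        return iter(self.dataset)

    def __getitem__(self, index: Union[int, List[int]]) -> Union[Row, List[Row]]:
        # A list of indices yields a list of rows; an int yields one row.
        if isinstance(index, list):
            return [self.dataset[i] for i in index]
        return self.dataset[index]


class InMemoryDataset(SketchDataset):
    """Concrete subclass that supplies both abstract methods."""

    def load(self) -> List[Row]:
        return self.dataset

    def validate(self) -> None:
        if not all(isinstance(row, dict) for row in self.dataset):
            raise ValueError("every row must be a dict")


ds = InMemoryDataset([{"query": "a"}, {"query": "b"}, {"query": "c"}])
```

Because `load` and `validate` are abstract, instantiating `SketchDataset` directly fails with a `TypeError`; only subclasses that define both can be created.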

`DatasetRegistry()`

Registry for dataset configurations.

Initialize the dataset registry.

Attributes:

Name	Type	Description
`_configs`	`Dict[DatasetType, DatasetConfig]`	The dictionary of dataset configurations.
`_prefix_map`	`Dict[str, DatasetType]`	The dictionary of dataset prefixes.
`_extension_map`	`Dict[str, DatasetType]`	The dictionary of dataset extensions.

`detect_type(dataset)`

Detect the dataset type from the dataset string.

Parameters:

Name	Type	Description	Default
`dataset`	`str`	The dataset string to detect.	required

Returns:

Type	Description
`Optional[DatasetType]`	The detected dataset type, or None if not supported.
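The registry keeps a prefix map and an extension map, which suggests a two-step lookup. The sketch below shows one plausible way such detection could work; the enum members, the `hf/` prefix, the file extensions, and the order of the checks are all assumptions, not the library's actual configuration.

```python
from enum import Enum
from pathlib import PurePosixPath
from typing import Dict, Optional


class SketchDatasetType(Enum):
    """Hypothetical dataset types; the real DatasetType members may differ."""

    HUGGINGFACE = "huggingface"
    CSV = "csv"
    JSONL = "jsonl"


# Assumed mappings, for illustration only.
PREFIX_MAP: Dict[str, SketchDatasetType] = {"hf/": SketchDatasetType.HUGGINGFACE}
EXTENSION_MAP: Dict[str, SketchDatasetType] = {
    ".csv": SketchDatasetType.CSV,
    ".jsonl": SketchDatasetType.JSONL,
}


def detect_type_sketch(dataset: str) -> Optional[SketchDatasetType]:
    """Check known prefixes first, then fall back to the file extension."""
    for prefix, dataset_type in PREFIX_MAP.items():
        if dataset.startswith(prefix):
            return dataset_type
    suffix = PurePosixPath(dataset).suffix.lower()
    return EXTENSION_MAP.get(suffix)
```

Returning `None` for unknown strings matches the documented `Optional[DatasetType]` return type and leaves the decision about unsupported inputs to the caller.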

`get_config(dataset_type)`

Get configuration for a dataset type.

Parameters:

Name	Type	Description	Default
`dataset_type`	`DatasetType`	The dataset type to get the configuration for.	required

Returns:

Type	Description
`Optional[DatasetConfig]`	The configuration for the dataset type, or None if not found.

`register_config(config)`

Register a dataset configuration.

Parameters:

Name	Type	Description	Default
`config`	`DatasetConfig`	The dataset configuration to register.	required

`DictDataset(dataset, name=None)`

Bases: BaseDataset

Dict-Based Dataset.

A BaseDataset subclass that stores the dataset as a list of dictionaries.

Attributes:

Name	Type	Description
`dataset`	`list[dict]`	The dataset to evaluate.

Initialize the DictDataset class.

Parameters:

Name	Type	Description	Default
`dataset`	`Dataset`	The dataset to use for the evaluation.	required
`name`	`str \| None`	The name of the dataset.	`None`

`from_csv(path, name=None, **kwargs)` `classmethod`

Load a dataset from a CSV file.

Parameters:

Name	Type	Description	Default
`path`	`str`	The path to the CSV file.	required
`name`	`str \| None`	The name of the dataset.	`None`
`**kwargs`	`Any`	Additional arguments to pass to pandas read_csv.	`{}`

Returns:

Type	Description
`DictDataset`	The loaded dataset.

`from_jsonl(path, **kwargs)` `classmethod`

Load a dataset from a JSONL file.

Parameters:

Name	Type	Description	Default
`path`	`str`	The path to the JSONL file.	required
`**kwargs`	`Any`	Additional arguments to pass to the constructor.	`{}`

Returns:

Type	Description
`DictDataset`	The loaded dataset.
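Both loaders end up with a list of dictionaries, one per record. The sketch below shows that row shape using only the standard library; it is not how the library reads files (the CSV loader is documented as forwarding keyword arguments to pandas `read_csv`). The helper names and sample data are hypothetical.

```python
import csv
import io
import json
from typing import Any, Dict, List, TextIO

# Assumption: a plain dict stands in for MetricInput.
Row = Dict[str, Any]


def rows_from_csv(handle: TextIO) -> List[Row]:
    """Read one dict per CSV record, keyed by the header row."""
    return [dict(record) for record in csv.DictReader(handle)]


def rows_from_jsonl(handle: TextIO) -> List[Row]:
    """Read one dict per non-blank line of a JSONL stream."""
    return [json.loads(line) for line in handle if line.strip()]


csv_text = "query,expected_response\nWhat is 2+2?,4\n"
jsonl_text = (
    '{"query": "What is 2+2?", "expected_response": "4"}\n'
    "\n"
    '{"query": "Capital of France?", "expected_response": "Paris"}\n'
)

csv_rows = rows_from_csv(io.StringIO(csv_text))
jsonl_rows = rows_from_jsonl(io.StringIO(jsonl_text))
```

Note that the standard `csv` module returns every value as a string, whereas JSONL preserves numbers, booleans, and nested structures.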

`load()`

Load the dataset.

Returns:

Type	Description
`list[MetricInput]`	The loaded dataset.

`validate()`

Validate the dataset.

Raises:

Type	Description
`ValueError`	If the dataset is not a list of MetricInput.

`HuggingFaceDataset(dataset)`

Bases: BaseDataset

Hugging Face dataset class for the evaluator.

Attributes:

Name	Type	Description
`dataset`	`list[MetricInput]`	The dataset to use for the evaluation.

Initialize the HuggingFaceDataset class.

Parameters:

Name	Type	Description	Default
`dataset`	`Dataset`	The dataset to use for the evaluation.	required

`from_hub(path_or_name, split, **kwargs)` `staticmethod`

Create a HuggingFaceDataset from a dataset on the Hugging Face Hub.

Parameters:

Name	Type	Description	Default
`path_or_name`	`str`	The path or name of the dataset.	required
`split`	`str`	The split of the dataset.	required
`**kwargs`	`Any`	Additional arguments to pass to the load function.	`{}`

Returns:

Type	Description
`HuggingFaceDataset`	The created dataset.

`from_list(dataset)` `staticmethod`

Create a HuggingFaceDataset from a list of MetricInput.

Parameters:

Name	Type	Description	Default
`dataset`	`list[MetricInput]`	The dataset to create.	required

Returns:

Type	Description
`HuggingFaceDataset`	The created dataset.

`load()`

Load the dataset.

Returns:

Type	Description
`list[MetricInput]`	The loaded dataset.

`validate()`

Validate the dataset.

Raises:

Type	Description
`ValueError`	If the dataset is not a list of MetricInput.

`LangfuseDataset(dataset, langfuse_client, dataset_name=None, expected_output_key='expected_response', mapping=None)`

Bases: BaseDataset

Langfuse dataset class for the evaluator.

Attributes:

Name	Type	Description
`dataset`	`list[MetricInput]`	The dataset to use for the evaluation.
`langfuse_client`	`Langfuse`	The Langfuse client instance.
`dataset_name`	`str`	The name of the dataset in Langfuse.
`expected_output_key`	`str \| None`	The key for expected output. Defaults to "expected_response".
`mapping`	`dict[str, Any] \| None`	Optional mapping for field keys. Defaults to None.

Initialize the LangfuseDataset class.

Parameters:

Name	Type	Description	Default
`dataset`	`List[MetricInput]`	The dataset to use for the evaluation.	required
`langfuse_client`	`Langfuse`	The Langfuse client instance.	required
`dataset_name`	`Optional[str]`	The name of the dataset in Langfuse.	`None`
`expected_output_key`	`str \| None`	The key for expected output. Defaults to "expected_response".	`'expected_response'`
`mapping`	`dict[str, Any] \| None`	Optional mapping for field keys. Defaults to None.	`None`

`convert_to_standard_dataset(expected_output_key=None, mapping=None)`

Convert the Langfuse dataset into the standard list of MetricInput items.

Parameters:

Name	Type	Description	Default
`expected_output_key`	`str \| None`	The key for expected output. Defaults to None.	`None`
`mapping`	`dict[str, Any] \| None`	Optional mapping for field keys. Defaults to None.	`None`

Returns:

Type	Description
`List[MetricInput]`	The converted dataset.
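The conversion can be pictured as flattening each stored item into one row and renaming fields along the way. The sketch below is an illustration under several assumptions: items are dictionaries with `input` and `expected_output` fields, `mapping` maps source field names to standard field names, and a non-dictionary input is stored under `query`. The function name is hypothetical, and the library may interpret `mapping` differently.

```python
from typing import Any, Dict, List, Optional

# Assumption: a plain dict stands in for MetricInput.
Row = Dict[str, Any]


def to_standard_rows(
    items: List[Dict[str, Any]],
    expected_output_key: str = "expected_response",
    mapping: Optional[Dict[str, str]] = None,
) -> List[Row]:
    """Flatten item dicts into rows, renaming input fields via mapping."""
    mapping = mapping or {}
    rows: List[Row] = []
    for item in items:
        source = item["input"]
        if not isinstance(source, dict):
            # Assumed fallback: a bare input value becomes the query.
            source = {"query": source}
        # Rename each source key if the mapping mentions it.
        row = {mapping.get(key, key): value for key, value in source.items()}
        row[expected_output_key] = item.get("expected_output")
        rows.append(row)
    return rows


converted = to_standard_rows(
    [{"input": {"question": "Hi?"}, "expected_output": "Hello"}],
    mapping={"question": "query"},
)
```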

`from_csv(path, langfuse_client, dataset_name=None, dataset_description='', metadata=None, is_append=False, **kwargs)` `staticmethod`

Create a LangfuseDataset from a CSV file.

Parameters:

Name	Type	Description	Default
`path`	`str`	The path to the CSV file.	required
`langfuse_client`	`Langfuse`	The Langfuse client instance.	required
`dataset_name`	`str \| None`	The name to register this dataset under in Langfuse. If None, defaults to the CSV filename without extension. Defaults to None.	`None`
`dataset_description`	`str`	The description of the dataset. Defaults to an empty string.	`''`
`metadata`	`dict \| None`	Optional metadata for the dataset. Defaults to None.	`None`
`is_append`	`bool`	If True, append items to existing dataset. If False, only create if dataset doesn't exist.	`False`
`**kwargs`	`Any`	Additional arguments to pass to the constructor.	`{}`

Returns:

Type	Description
`LangfuseDataset`	The created dataset.

`from_dict(dataset, langfuse_client, dataset_name, dataset_description='', mapping=None, metadata=None, is_append=False)` `staticmethod`

Create a LangfuseDataset from a list of MetricInput.

Parameters:

Name	Type	Description	Default
`dataset`	`List[MetricInput]`	The dataset to create.	required
`langfuse_client`	`Langfuse`	The Langfuse client instance.	required
`dataset_name`	`str`	The name of the dataset in Langfuse.	required
`dataset_description`	`str`	The description of the dataset. Defaults to an empty string.	`''`
`mapping`	`dict[str, Any] \| None`	Optional mapping for field keys. Defaults to None.	`None`
`metadata`	`dict \| None`	Optional metadata for the dataset. Defaults to None.	`None`
`is_append`	`bool`	If True, append items to existing dataset. If False, only create if dataset doesn't exist.	`False`

Returns:

Type	Description
`LangfuseDataset`	The created dataset.
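The `is_append` flag describes three outcomes: create the dataset when it is missing, append when it exists and the flag is set, and leave it untouched otherwise. The sketch below models that reading with an in-memory dictionary in place of a Langfuse project; `create_or_append` and the returned status strings are hypothetical and only illustrate the documented description.

```python
from typing import Any, Dict, List

# Assumption: a plain dict stands in for MetricInput.
Row = Dict[str, Any]


def create_or_append(
    store: Dict[str, List[Row]],
    name: str,
    rows: List[Row],
    is_append: bool = False,
) -> str:
    """Apply the create/append rule to an in-memory stand-in for remote storage."""
    if name not in store:
        store[name] = list(rows)
        return "created"
    if is_append:
        store[name].extend(rows)
        return "appended"
    # Dataset exists and is_append is False: do not modify it.
    return "unchanged"


remote: Dict[str, List[Row]] = {}
first = create_or_append(remote, "qa-eval", [{"query": "q1"}])
second = create_or_append(remote, "qa-eval", [{"query": "q2"}])
third = create_or_append(remote, "qa-eval", [{"query": "q2"}], is_append=True)
```

Under this reading, rerunning an import with the default `is_append=False` is safe to repeat, while `is_append=True` adds the items again on every run.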

`from_gsheets(sheet_id, worksheet_name, client_email, private_key, langfuse_client, dataset_name=None, dataset_description='', mapping=None, metadata=None, is_append=False)` `async` `staticmethod`

Create a LangfuseDataset from Google Sheets.

Parameters:

Name	Type	Description	Default
`sheet_id`	`str`	The ID of the Google Sheet.	required
`worksheet_name`	`str`	The name of the worksheet within the Google Sheet.	required
`client_email`	`str`	The client email for Google Sheets API.	required
`private_key`	`str`	Base64-encoded private key for Google Sheets API.	required
`langfuse_client`	`Langfuse`	The Langfuse client instance.	required
`dataset_name`	`str \| None`	The name of the dataset in Langfuse.	`None`
`dataset_description`	`str`	The description of the dataset. Defaults to an empty string.	`''`
`mapping`	`dict[str, Any] \| None`	Optional mapping for field keys. Defaults to None.	`None`
`metadata`	`dict \| None`	Optional metadata for the dataset. Defaults to None.	`None`
`is_append`	`bool`	If True, append items to existing dataset. If False, only create if dataset doesn't exist.	`False`

Returns:

Type	Description
`LangfuseDataset`	The created dataset.

`from_jsonl(path, langfuse_client, dataset_name=None, dataset_description='', metadata=None, is_append=False, **kwargs)` `staticmethod`

Create a LangfuseDataset from a JSONL file.

Parameters:

Name	Type	Description	Default
`path`	`str`	The path to the JSONL file.	required
`langfuse_client`	`Langfuse`	The Langfuse client instance.	required
`dataset_name`	`str \| None`	The name of the dataset in Langfuse.	`None`
`dataset_description`	`str`	The description of the dataset. Defaults to an empty string.	`''`
`metadata`	`dict \| None`	Optional metadata for the dataset. Defaults to None.	`None`
`is_append`	`bool`	If True, append items to existing dataset. If False, only create if dataset doesn't exist.	`False`
`**kwargs`	`Any`	Additional arguments to pass to the constructor.	`{}`

Returns:

Type	Description
`LangfuseDataset`	The created dataset.

`from_langfuse(langfuse_client, dataset_name, mapping=None)` `staticmethod`

Load a dataset from Langfuse.

Parameters:

Name	Type	Description	Default
`langfuse_client`	`Langfuse`	The Langfuse client instance.	required
`dataset_name`	`str`	The name of the dataset in Langfuse.	required
`mapping`	`dict[str, Any] \| None`	Optional mapping for field keys. Defaults to None.	`None`

Returns:

Type	Description
`LangfuseDataset`	The loaded dataset.

Raises:

Type	Description
`ValueError`	If the dataset is not found or has no data.

`load()`

Load the dataset.

Returns:

Type	Description
`List[MetricInput]`	The loaded dataset with proper Langfuse structure.

`validate()`

Validate the dataset.

Raises:

Type	Description
`ValueError`	If the dataset is not a list of MetricInput or if required fields are missing.

`SpreadsheetDataset(dataset)`

Bases: BaseDataset

Spreadsheet dataset class for the evaluator.

Attributes:

Name	Type	Description
`dataset`	`list[MetricInput]`	The dataset to use for the evaluation.

Initialize the SpreadsheetDataset class.

Parameters:

Name	Type	Description	Default
`dataset`	`Dataset`	The dataset to use for the evaluation.	required

`from_gsheets(sheet_id, worksheet_name, client_email, private_key)` `async` `staticmethod`

Load the dataset from Google Sheets.

Parameters:

Name	Type	Description	Default
`sheet_id`	`str`	The ID of the Google Sheet.	required
`worksheet_name`	`str`	The name of the worksheet within the Google Sheet.	required
`client_email`	`str`	The client email for Google Sheets API.	required
`private_key`	`str`	Base64-encoded private key for Google Sheets API.	required

Returns:

Type	Description
`SpreadsheetDataset`	The loaded dataset.
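Two details of `from_gsheets` are easy to miss: the private key is passed Base64-encoded, and the method is a coroutine that must be awaited. The sketch below shows both with the standard library only. `fake_from_gsheets` is a hypothetical stand-in that makes no network calls, and the key, sheet ID, and email are placeholders.

```python
import asyncio
import base64
from typing import Any, Dict

# Placeholder text only; a real service-account key would go here.
pem = "-----BEGIN PRIVATE KEY-----\nplaceholder-not-a-real-key\n-----END PRIVATE KEY-----\n"

# Encode the PEM text as Base64 before passing it as private_key.
encoded_key = base64.b64encode(pem.encode("utf-8")).decode("ascii")


async def fake_from_gsheets(
    sheet_id: str, worksheet_name: str, client_email: str, private_key: str
) -> Dict[str, Any]:
    """Hypothetical stand-in: decodes the key and echoes the arguments."""
    decoded = base64.b64decode(private_key).decode("utf-8")
    await asyncio.sleep(0)  # stands in for the real network round trip
    return {
        "sheet_id": sheet_id,
        "worksheet_name": worksheet_name,
        "client_email": client_email,
        "key_is_pem": decoded.startswith("-----BEGIN"),
    }


# From synchronous code, drive the coroutine with asyncio.run.
result = asyncio.run(
    fake_from_gsheets("sheet-id-placeholder", "Sheet1", "bot@example.com", encoded_key)
)
```

Inside an already running event loop (for example, within another coroutine), the call would be awaited directly instead of wrapped in `asyncio.run`.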

`load()`

Load the dataset.

Returns:

Type	Description
`list[MetricInput]`	The loaded dataset.

`validate()`

Validate the dataset.

Raises:

Type	Description
`ValueError`	If the dataset is not a list of MetricInput.

`load_simple_agent_dataset()`

Load the simple agent dataset from the local CSV file.

The dataset contains agent interaction data with trajectories, questions, and responses, suitable for both agent trajectory evaluation and generation quality evaluation.

Returns:

Type	Description
`DictDataset`	The loaded simple agent dataset containing MetricInput dictionaries with keys: 'query', 'generated_response', 'expected_response', 'agent_trajectory', 'expected_agent_trajectory'.

Raises:

Type	Description
`FileNotFoundError`	If the CSV file doesn't exist.
`ValueError`	If the CSV file is empty or malformed.

`load_simple_qa_dataset()`

Load the simple QA dataset from the local CSV file.

The dataset contains question-answer pairs with generated responses and contexts, suitable for RAG evaluation and testing.

Returns:

Type	Description
`DictDataset`	The loaded simple QA dataset containing MetricInput dictionaries with keys: 'query', 'generated_response', 'expected_output', 'retrieved_context'.

Raises:

Type	Description
`FileNotFoundError`	If the CSV file doesn't exist.
`ValueError`	If the CSV file is empty or malformed.
