Dataset
Dataset module.
DictDataset(dataset)
Bases: BaseDataset
Dict-Based Dataset.
This class is a subclass of the BaseDataset class. It is used to store a dataset in a dictionary format.
Attributes:
Name | Type | Description |
---|---|---|
dataset | list[dict] | The dataset to evaluate. |
Initialize the DictDataset class.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | Dataset | The dataset to use for the evaluation. | required |
from_csv(path, **kwargs)
classmethod
Load a dataset from a CSV file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | The path to the CSV file. | required |
**kwargs | Any | Additional arguments to pass to pandas read_csv. | {} |
Returns:
Name | Type | Description |
---|---|---|
DictDataset | DictDataset | The loaded dataset. |
from_jsonl(path, **kwargs)
classmethod
Load a dataset from a JSONL file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | The path to the JSONL file. | required |
**kwargs | Any | Additional arguments to pass to the constructor. | {} |
Returns:
Name | Type | Description |
---|---|---|
DictDataset | DictDataset | The loaded dataset. |
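
To illustrate the file format that from_jsonl reads, the sketch below parses JSONL text (one JSON object per line) into a list of dictionaries using only the standard library. The helper name read_jsonl_records is hypothetical and the sample rows are made up; this is not the library's implementation, only a picture of the input shape.

```python
import json
import tempfile
from pathlib import Path


def read_jsonl_records(path: str) -> list[dict]:
    """Parse a JSONL file into a list of dicts, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank or trailing empty lines
                records.append(json.loads(line))
    return records


# Write a tiny two-row sample file and read it back.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "sample.jsonl"
    sample_path.write_text(
        '{"query": "What is 2 + 2?", "expected_output": "4"}\n'
        '{"query": "Capital of France?", "expected_output": "Paris"}\n',
        encoding="utf-8",
    )
    records = read_jsonl_records(str(sample_path))

print(len(records))
```

A list shaped like records is the kind of list[dict] value that the dataset attribute above describes.
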
load()
Load the dataset.
Returns:
Type | Description |
---|---|
list[MetricInput] | The loaded dataset. |
validate()
Validate the dataset.
Raises:
Type | Description |
---|---|
ValueError | If the dataset is not a list of MetricInput. |
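
The check described above can be pictured with a minimal sketch. It assumes that a MetricInput is a plain dictionary, which the reference does not state here; the function name validate_records is hypothetical, and the real validate() may check more than this.

```python
def validate_records(dataset: object) -> None:
    """Raise ValueError unless `dataset` is a list whose items are all dicts."""
    if not isinstance(dataset, list):
        raise ValueError(f"Expected a list, got {type(dataset).__name__}.")
    for index, item in enumerate(dataset):
        if not isinstance(item, dict):
            raise ValueError(
                f"Item {index} is {type(item).__name__}, expected dict."
            )


# A well-formed dataset passes silently.
validate_records([{"query": "q", "generated_response": "a"}])

# A malformed dataset raises ValueError.
try:
    validate_records("not a list")
except ValueError as error:
    message = str(error)

print(message)
```
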
HuggingFaceDataset(dataset)
Bases: BaseDataset
Hugging Face dataset class for the evaluator.
Attributes:
Name | Type | Description |
---|---|---|
dataset | list[MetricInput] | The dataset to use for the evaluation. |
Initialize the HuggingFaceDataset class.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | Dataset | The dataset to use for the evaluation. | required |
from_hub(path_or_name, split, **kwargs)
staticmethod
Create a HuggingFaceDataset from a Hugging Face dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path_or_name | str | The path or name of the dataset. | required |
split | str | The split of the dataset. | required |
**kwargs | Any | Additional arguments to pass to the load function. | {} |
Returns:
Name | Type | Description |
---|---|---|
HuggingFaceDataset | HuggingFaceDataset | The created dataset. |
from_list(dataset)
staticmethod
Create a HuggingFaceDataset from a list of MetricInput.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | list[MetricInput] | The dataset to create. | required |
Returns:
Name | Type | Description |
---|---|---|
HuggingFaceDataset | HuggingFaceDataset | The created dataset. |
load()
Load the dataset.
Returns:
Type | Description |
---|---|
list[MetricInput] | The loaded dataset. |
validate()
Validate the dataset.
Raises:
Type | Description |
---|---|
ValueError | If the dataset is not a list of MetricInput. |
LangfuseDataset(dataset, langfuse_client, dataset_name=None, expected_output_key='expected_response', mapping=None)
Bases: BaseDataset
Langfuse dataset class for the evaluator.
Attributes:
Name | Type | Description |
---|---|---|
dataset | list[MetricInput] | The dataset to use for the evaluation. |
langfuse_client | Langfuse | The Langfuse client instance. |
dataset_name | str | The name of the dataset in Langfuse. |
expected_output_key | str | None | The key for expected output. Defaults to "expected_response". |
mapping | dict[str, Any] | None | Optional mapping for field keys. Defaults to None. |
Initialize the LangfuseDataset class.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | List[MetricInput] | The dataset to use for the evaluation. | required |
langfuse_client | Langfuse | The Langfuse client instance. | required |
dataset_name | Optional[str] | The name of the dataset in Langfuse. | None |
expected_output_key | str | None | The key for expected output. Defaults to "expected_response". | 'expected_response' |
mapping | dict[str, Any] | None | Optional mapping for field keys. Defaults to None. | None |
convert_to_standard_dataset(expected_output_key=None, mapping=None)
Convert the dataset to standard data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
expected_output_key | str | None | The key for expected output. Defaults to None. | None |
mapping | dict[str, Any] | None | Optional mapping for field keys. Defaults to None. | None |
Returns:
Type | Description |
---|---|
List[MetricInput] | The converted dataset. |
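
The reference does not spell out how mapping and expected_output_key are applied, so the following is only one plausible reading, sketched with the standard library. It assumes that mapping has the form {standard_field: source_key} and that the raw item's "expected_output" value is stored under expected_output_key. The function remap_item and both assumptions are hypothetical, not the library's documented behavior.

```python
from __future__ import annotations

from typing import Any


def remap_item(
    raw_item: dict[str, Any],
    mapping: dict[str, str] | None = None,
    expected_output_key: str = "expected_response",
) -> dict[str, Any]:
    """Rename keys of a raw item into a standard record (assumed semantics)."""
    mapping = mapping or {}
    # Assumption: mapping is {standard_field: source_key}.
    record = {
        field: raw_item[source]
        for field, source in mapping.items()
        if source in raw_item
    }
    # Assumption: the raw "expected_output" value is stored under
    # expected_output_key in the converted record.
    if "expected_output" in raw_item:
        record[expected_output_key] = raw_item["expected_output"]
    return record


record = remap_item(
    {"question": "Q?", "expected_output": "A"},
    mapping={"query": "question"},
)
print(record)
```
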
from_csv(path, langfuse_client, dataset_name=None, dataset_description='', **kwargs)
staticmethod
Create a LangfuseDataset from a CSV file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | The path to the CSV file. | required |
langfuse_client | Langfuse | The Langfuse client instance. | required |
dataset_name | str | The name to register this dataset under in Langfuse. If None, defaults to the CSV filename without extension. | None |
dataset_description | str | The description of the dataset. Defaults to an empty string. | '' |
**kwargs | Any | Additional arguments to pass to the constructor. | {} |
Returns:
Name | Type | Description |
---|---|---|
LangfuseDataset | LangfuseDataset | The created dataset. |
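
The dataset_name default described above (the CSV filename without its extension) corresponds to what pathlib calls the stem of a path. The short sketch below shows that rule in isolation; default_dataset_name is a hypothetical helper, and whether the library uses pathlib internally is an assumption.

```python
from __future__ import annotations

from pathlib import Path


def default_dataset_name(path: str, dataset_name: str | None = None) -> str:
    """Return dataset_name, or the file name without its extension if None."""
    return dataset_name if dataset_name is not None else Path(path).stem


# No explicit name: fall back to the file stem.
derived_name = default_dataset_name("data/simple_qa.csv")
# Explicit name: used as given.
explicit_name = default_dataset_name("data/simple_qa.csv", "my-eval-set")

print(derived_name, explicit_name)
```

Note that Path.stem strips only the last suffix, so a file named archive.tar.gz would yield archive.tar under this sketch.
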
from_dict(dataset, langfuse_client, dataset_name, dataset_description='', mapping=None, is_append=False)
staticmethod
Create a LangfuseDataset from a list of MetricInput.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | List[MetricInput] | The dataset to create. | required |
langfuse_client | Langfuse | The Langfuse client instance. | required |
dataset_name | str | The name of the dataset in Langfuse. | required |
dataset_description | str | The description of the dataset. | '' |
mapping | dict[str, Any] | None | Optional mapping for field keys. Defaults to None. | None |
is_append | bool | If True, append items to existing dataset. If False, only create if dataset doesn't exist. | False |
Returns:
Name | Type | Description |
---|---|---|
LangfuseDataset | LangfuseDataset | The created dataset. |
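
The is_append rule can be pictured with an in-memory stand-in for the remote store. The sketch below uses a plain dictionary keyed by dataset name instead of a Langfuse client; upsert_items is a hypothetical helper, and the choice to leave an existing dataset untouched when is_append is False is an assumption drawn from the parameter description above.

```python
def upsert_items(
    store: dict[str, list[dict]],
    name: str,
    items: list[dict],
    is_append: bool = False,
) -> list[dict]:
    """Create the dataset if missing; append to it only when is_append is True."""
    if name not in store:
        store[name] = list(items)  # dataset does not exist yet: create it
    elif is_append:
        store[name].extend(items)  # dataset exists: append the new items
    # Otherwise the dataset exists and is_append is False: leave it unchanged
    # (assumed behavior).
    return store[name]


store: dict[str, list[dict]] = {}
upsert_items(store, "qa", [{"query": "first"}])
upsert_items(store, "qa", [{"query": "ignored"}])  # no-op, already exists
size_without_append = len(store["qa"])
upsert_items(store, "qa", [{"query": "second"}], is_append=True)
size_with_append = len(store["qa"])

print(size_without_append, size_with_append)
```
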
from_gsheets(sheet_id, worksheet_name, client_email, private_key, langfuse_client, dataset_name=None, dataset_description='', mapping=None)
async
staticmethod
Create a LangfuseDataset from Google Sheets.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sheet_id | str | The ID of the Google Sheet. | required |
worksheet_name | str | The name of the worksheet within the Google Sheet. | required |
client_email | str | The client email for Google Sheets API. | required |
private_key | str | Base64-encoded private key for Google Sheets API. | required |
langfuse_client | Langfuse | The Langfuse client instance. | required |
dataset_name | str | The name of the dataset in Langfuse. | None |
dataset_description | str | The description of the dataset. | '' |
mapping | dict[str, Any] | None | Optional mapping for field keys. Defaults to None. | None |
Returns:
Name | Type | Description |
---|---|---|
LangfuseDataset | LangfuseDataset | The created dataset. |
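
Two details of from_gsheets are easy to miss: it is async, so it must be awaited, and private_key is expected to be Base64-encoded. The sketch below shows both with the standard library. The coroutine fake_from_gsheets is a stand-in that contacts no service, the key text is a placeholder, and the assumption that the encoded value is a PEM string is not confirmed by the reference.

```python
import asyncio
import base64


def encode_private_key(pem_text: str) -> str:
    """Base64-encode a private key string for passing as `private_key`."""
    return base64.b64encode(pem_text.encode("utf-8")).decode("ascii")


async def fake_from_gsheets(
    sheet_id: str, worksheet_name: str, client_email: str, private_key: str
) -> dict:
    """Stand-in for an async factory: decodes the key and returns a summary."""
    decoded = base64.b64decode(private_key).decode("utf-8")
    await asyncio.sleep(0)  # placeholder for the real network call
    return {
        "sheet_id": sheet_id,
        "worksheet_name": worksheet_name,
        "key_is_pem": decoded.startswith("-----BEGIN"),
    }


# Placeholder key material, not a real credential.
placeholder_pem = (
    "-----BEGIN PRIVATE KEY-----\nnot-a-real-key\n-----END PRIVATE KEY-----\n"
)
encoded_key = encode_private_key(placeholder_pem)

# From synchronous code, drive the coroutine with asyncio.run.
result = asyncio.run(
    fake_from_gsheets("sheet-id", "Sheet1", "svc@example.com", encoded_key)
)
print(result["key_is_pem"])
```

Inside an already running event loop, the call would be awaited directly instead of being wrapped in asyncio.run.
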
from_jsonl(path, langfuse_client, dataset_name=None, dataset_description='', **kwargs)
staticmethod
Create a LangfuseDataset from a JSONL file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | str | The path to the JSONL file. | required |
langfuse_client | Langfuse | The Langfuse client instance. | required |
dataset_name | str | The name of the dataset in Langfuse. | None |
dataset_description | str | The description of the dataset. | '' |
**kwargs | Any | Additional arguments to pass to the constructor. | {} |
Returns:
Name | Type | Description |
---|---|---|
LangfuseDataset | LangfuseDataset | The created dataset. |
from_langfuse(langfuse_client, dataset_name, mapping=None)
staticmethod
Load a dataset from Langfuse.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
langfuse_client | Langfuse | The Langfuse client instance. | required |
dataset_name | str | The name of the dataset in Langfuse. | required |
mapping | dict[str, Any] | None | Optional mapping for field keys. Defaults to None. | None |
Returns:
Name | Type | Description |
---|---|---|
LangfuseDataset | LangfuseDataset | The loaded dataset. |
Raises:
Type | Description |
---|---|
ValueError | If the dataset is not found or has no data. |
load()
Load the dataset.
Returns:
Type | Description |
---|---|
List[MetricInput] | The loaded dataset with proper Langfuse structure. |
validate()
Validate the dataset.
Raises:
Type | Description |
---|---|
ValueError | If the dataset is not a list of MetricInput or if required fields are missing. |
SpreadsheetDataset(dataset)
Bases: BaseDataset
Spreadsheet dataset class for the evaluator.
Attributes:
Name | Type | Description |
---|---|---|
dataset | list[MetricInput] | The dataset to use for the evaluation. |
Initialize the SpreadsheetDataset class.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | Dataset | The dataset to use for the evaluation. | required |
from_gsheets(sheet_id, worksheet_name, client_email, private_key)
async
staticmethod
Load the dataset from Google Sheets.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sheet_id | str | The ID of the Google Sheet. | required |
worksheet_name | str | The name of the worksheet within the Google Sheet. | required |
client_email | str | The client email for Google Sheets API. | required |
private_key | str | Base64-encoded private key for Google Sheets API. | required |
Returns:
Name | Type | Description |
---|---|---|
SpreadsheetDataset | SpreadsheetDataset | The loaded dataset. |
load()
Load the dataset.
Returns:
Type | Description |
---|---|
list[MetricInput] | The loaded dataset. |
validate()
Validate the dataset.
Raises:
Type | Description |
---|---|
ValueError | If the dataset is not a list of MetricInput. |
load_simple_qa_dataset()
Load the simple QA dataset from the local CSV file.
The dataset contains question-answer pairs with generated responses and contexts, suitable for RAG evaluation and testing.
Returns:
Name | Type | Description |
---|---|---|
DictDataset | DictDataset | The loaded simple QA dataset containing MetricInput dictionaries with keys: 'query', 'generated_response', 'expected_output', 'retrieved_context'. |
Raises:
Type | Description |
---|---|
FileNotFoundError | If the CSV file doesn't exist. |
ValueError | If the CSV file is empty or malformed. |
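
The error contract above can be sketched with the standard library's csv module. The loader load_qa_csv is hypothetical and reads a temporary file rather than the bundled CSV; treating a missing column as "malformed" is an assumption, and the real function may use pandas and different checks.

```python
import csv
import tempfile
from pathlib import Path

REQUIRED_KEYS = {"query", "generated_response", "expected_output", "retrieved_context"}


def load_qa_csv(path: str) -> list[dict]:
    """Load QA rows, raising FileNotFoundError or ValueError as documented."""
    file_path = Path(path)
    if not file_path.exists():
        raise FileNotFoundError(f"CSV file not found: {path}")
    with file_path.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))
    if not rows:
        raise ValueError("CSV file is empty.")
    # Assumption: a file lacking any of the four keys counts as malformed.
    missing = REQUIRED_KEYS - set(rows[0])
    if missing:
        raise ValueError(f"CSV file is missing columns: {sorted(missing)}")
    return rows


with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "qa.csv"
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=sorted(REQUIRED_KEYS))
        writer.writeheader()
        writer.writerow(
            {
                "query": "What is 2 + 2?",
                "generated_response": "4",
                "expected_output": "4",
                "retrieved_context": "Basic arithmetic.",
            }
        )
    rows = load_qa_csv(str(csv_path))

    # A path that does not exist triggers FileNotFoundError.
    try:
        load_qa_csv(str(Path(tmp) / "missing.csv"))
    except FileNotFoundError:
        missing_file_raised = True
    else:
        missing_file_raised = False

print(len(rows), missing_file_raised)
```
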