Langfuse dataset

Langfuse Dataset Class.

This class is a wrapper around the Langfuse dataset class. It supports loading a dataset from Langfuse and adding data to Langfuse.

Authors

Christina Alexandra (christina.alexandra@gdplabs.id)

LangfuseDataset(dataset, langfuse_client, dataset_name=None, expected_output_key='expected_response', mapping=None)

Bases: BaseDataset

Langfuse dataset class for the evaluator.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `dataset` | `list[MetricInput]` | The dataset to use for the evaluation. |
| `langfuse_client` | `Langfuse` | The Langfuse client instance. |
| `dataset_name` | `str` | The name of the dataset in Langfuse. |
| `expected_output_key` | `str \| None` | The key for expected output. Defaults to "expected_response". |
| `mapping` | `dict[str, Any] \| None` | Optional mapping for field keys. Defaults to None. |

Initialize the LangfuseDataset class.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `dataset` | `List[MetricInput]` | The dataset to use for the evaluation. | required |
| `langfuse_client` | `Langfuse` | The Langfuse client instance. | required |
| `dataset_name` | `Optional[str]` | The name of the dataset in Langfuse. | `None` |
| `expected_output_key` | `str \| None` | The key for expected output. Defaults to "expected_response". | `'expected_response'` |
| `mapping` | `dict[str, Any] \| None` | Optional mapping for field keys. Defaults to None. | `None` |

convert_to_standard_dataset(expected_output_key=None, mapping=None)

Convert the dataset to the standard dataset format.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `expected_output_key` | `str \| None` | The key for expected output. Defaults to None. | `None` |
| `mapping` | `dict[str, Any] \| None` | Optional mapping for field keys. Defaults to None. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `List[MetricInput]` | The converted dataset. |
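The reference does not describe the shape of `mapping` or of the converted records, so the following is only a minimal sketch of one plausible reading. It assumes each source item carries an `input` dict and an `expected_output` value, and that `mapping` maps target field names to source field names. The helper `to_standard_record` and the field names `question` and `query` are hypothetical and are not part of the library.

```python
from __future__ import annotations

from typing import Any


def to_standard_record(
    item: dict[str, Any],
    expected_output_key: str = "expected_response",
    mapping: dict[str, str] | None = None,
) -> dict[str, Any]:
    """Flatten one Langfuse-style item into a flat evaluation record.

    Assumption: `item` has "input" (a dict) and "expected_output" keys, and
    `mapping` maps target field names to source field names in item["input"].
    """
    source = dict(item.get("input") or {})
    if mapping:
        # Keep only the mapped fields, renamed to their target names.
        record = {target: source[src] for target, src in mapping.items() if src in source}
    else:
        record = source
    if item.get("expected_output") is not None:
        # Store the expected output under the configured key.
        record[expected_output_key] = item["expected_output"]
    return record


item = {"input": {"question": "What is 2 + 2?"}, "expected_output": "4"}
record = to_standard_record(item, mapping={"query": "question"})
print(record)
```

The point of the sketch is the division of labor between the two arguments: `mapping` renames input fields, while `expected_output_key` decides where the expected output lands.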

from_csv(path, langfuse_client, dataset_name=None, dataset_description='', **kwargs) staticmethod

Create a LangfuseDataset from a CSV file.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `path` | `str` | The path to the CSV file. | required |
| `langfuse_client` | `Langfuse` | The Langfuse client instance. | required |
| `dataset_name` | `str` | The name to register this dataset under in Langfuse. If None, defaults to the CSV filename without extension. | `None` |
| `dataset_description` | `str` | The description of the dataset. If None, defaults to an empty string. | `''` |
| `**kwargs` | `Any` | Additional arguments to pass to the constructor. | `{}` |

Returns:

| Type | Description |
| --- | --- |
| `LangfuseDataset` | The created dataset. |
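The default for `dataset_name` described above (the CSV filename without its extension) can be illustrated with the standard library. This is a sketch of the naming rule only, not the library's code, and `default_dataset_name` is a hypothetical helper.

```python
from __future__ import annotations

from pathlib import Path


def default_dataset_name(path: str, dataset_name: str | None = None) -> str:
    """Return the explicit name, or the file name without its extension."""
    if dataset_name is not None:
        return dataset_name
    # Path.stem drops the directory part and the final suffix only.
    return Path(path).stem


print(default_dataset_name("data/qa_eval.csv"))
print(default_dataset_name("data/qa_eval.csv", "custom_name"))
```

Note that `Path.stem` removes only the last suffix, so a file named `archive.tar.gz` would yield `archive.tar` under this sketch.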

from_dict(dataset, langfuse_client, dataset_name, dataset_description='', mapping=None, is_append=False) staticmethod

Create a LangfuseDataset from a list of MetricInput.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `dataset` | `List[MetricInput]` | The dataset to create. | required |
| `langfuse_client` | `Langfuse` | The Langfuse client instance. | required |
| `dataset_name` | `str` | The name of the dataset in Langfuse. | required |
| `dataset_description` | `str` | The description of the dataset. | `''` |
| `mapping` | `dict[str, Any] \| None` | Optional mapping for field keys. Defaults to None. | `None` |
| `is_append` | `bool` | If True, append the items to the existing dataset. If False, create the dataset only if it does not already exist. | `False` |

Returns:

| Type | Description |
| --- | --- |
| `LangfuseDataset` | The created dataset. |
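The `is_append` behavior can be pictured with an in-memory stand-in for the remote dataset service. This is a sketch of one reading of the description above, not the library's implementation. `FakeDatasetStore`, its `upsert` method, and the `query` field are hypothetical names introduced for illustration.

```python
from __future__ import annotations

from typing import Any


class FakeDatasetStore:
    """In-memory stand-in for a remote dataset service (hypothetical)."""

    def __init__(self) -> None:
        self.datasets: dict[str, list[dict[str, Any]]] = {}

    def upsert(self, name: str, items: list[dict[str, Any]], is_append: bool = False) -> int:
        """Write items according to the is_append rule; return how many were written."""
        if name not in self.datasets:
            # The dataset does not exist yet: create it in either mode.
            self.datasets[name] = list(items)
            return len(items)
        if is_append:
            # The dataset exists and appending was requested.
            self.datasets[name].extend(items)
            return len(items)
        # The dataset exists and is_append is False: leave it untouched.
        return 0


store = FakeDatasetStore()
rows = [{"query": "q1", "expected_response": "a1"}]
first = store.upsert("demo", rows)                   # creates the dataset
second = store.upsert("demo", rows)                  # no-op, dataset exists
third = store.upsert("demo", rows, is_append=True)   # appends one item
print(first, second, third, len(store.datasets["demo"]))
```

Under this reading, calling the method twice with `is_append=False` is safe to repeat, while `is_append=True` will duplicate items if the same rows are sent again.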

from_gsheets(sheet_id, worksheet_name, client_email, private_key, langfuse_client, dataset_name=None, dataset_description='', mapping=None) async staticmethod

Create a LangfuseDataset from Google Sheets.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `sheet_id` | `str` | The ID of the Google Sheet. | required |
| `worksheet_name` | `str` | The name of the worksheet within the Google Sheet. | required |
| `client_email` | `str` | The client email for Google Sheets API. | required |
| `private_key` | `str` | Base64-encoded private key for Google Sheets API. | required |
| `langfuse_client` | `Langfuse` | The Langfuse client instance. | required |
| `dataset_name` | `str` | The name of the dataset in Langfuse. | `None` |
| `dataset_description` | `str` | The description of the dataset. | `''` |
| `mapping` | `dict[str, Any] \| None` | Optional mapping for field keys. Defaults to None. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `LangfuseDataset` | The created dataset. |
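Two details of this method are easy to miss: `private_key` is expected in Base64 form, and the method is marked `async`, so it has to be awaited. The sketch below shows both using only the standard library. It assumes the whole PEM text is Base64-encoded as UTF-8, which the reference does not confirm. `load_sheet_stub` is a hypothetical stand-in for an async loader and the key is a placeholder, not a real credential.

```python
import asyncio
import base64


def encode_private_key(pem: str) -> str:
    """Base64-encode a PEM string into a single-line ASCII value."""
    return base64.b64encode(pem.encode("utf-8")).decode("ascii")


def decode_private_key(encoded: str) -> str:
    """Reverse encode_private_key."""
    return base64.b64decode(encoded).decode("utf-8")


async def load_sheet_stub(private_key: str) -> str:
    """Hypothetical stand-in for an async loader; it only decodes the key."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return decode_private_key(private_key)


PLACEHOLDER_PEM = "-----BEGIN PRIVATE KEY-----\nnot-a-real-key\n-----END PRIVATE KEY-----\n"
encoded = encode_private_key(PLACEHOLDER_PEM)

# An async function must be awaited, or driven with asyncio.run from sync code.
decoded = asyncio.run(load_sheet_stub(encoded))
print(decoded == PLACEHOLDER_PEM)
```

Encoding the key this way keeps the multi-line PEM on one line, which is convenient for environment variables.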

from_jsonl(path, langfuse_client, dataset_name=None, dataset_description='', **kwargs) staticmethod

Create a LangfuseDataset from a JSONL file.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `path` | `str` | The path to the JSONL file. | required |
| `langfuse_client` | `Langfuse` | The Langfuse client instance. | required |
| `dataset_name` | `str` | The name of the dataset in Langfuse. | `None` |
| `dataset_description` | `str` | The description of the dataset. | `''` |
| `**kwargs` | `Any` | Additional arguments to pass to the constructor. | `{}` |

Returns:

| Type | Description |
| --- | --- |
| `LangfuseDataset` | The created dataset. |
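For readers unfamiliar with the JSONL format, the sketch below shows what such a file looks like and how it parses into a list of records: one JSON object per line. It is an illustration with the standard library, not the library's loader. `read_jsonl` is a hypothetical helper, and the `query` field is an assumed example field (`expected_response` is the default expected-output key named above).

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path
from typing import Any


def read_jsonl(path: str) -> list[dict[str, Any]]:
    """Parse one JSON object per non-empty line."""
    rows: list[dict[str, Any]] = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                rows.append(json.loads(line))
    return rows


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "qa_eval.jsonl"
    sample.write_text(
        '{"query": "q1", "expected_response": "a1"}\n'
        "\n"
        '{"query": "q2", "expected_response": "a2"}\n',
        encoding="utf-8",
    )
    rows = read_jsonl(str(sample))

print(len(rows), rows[1]["query"])
```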

from_langfuse(langfuse_client, dataset_name, mapping=None) staticmethod

Load a dataset from Langfuse.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `langfuse_client` | `Langfuse` | The Langfuse client instance. | required |
| `dataset_name` | `str` | The name of the dataset in Langfuse. | required |
| `mapping` | `dict[str, Any] \| None` | Optional mapping for field keys. Defaults to None. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `LangfuseDataset` | The loaded dataset. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If the dataset is not found or has no data. |
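Because a missing or empty dataset surfaces as a `ValueError`, callers may want to catch it explicitly. The sketch below shows that pattern against an in-memory stub rather than the real client. `StubClient`, `fetch_items`, `load_or_raise`, and `load_with_fallback` are hypothetical names, and the error messages are illustrative rather than the library's own.

```python
from __future__ import annotations

from typing import Any


class StubClient:
    """Hypothetical in-memory stand-in for a Langfuse client."""

    def __init__(self, datasets: dict[str, list[dict[str, Any]]]) -> None:
        self._datasets = datasets

    def fetch_items(self, name: str) -> list[dict[str, Any]] | None:
        """Return the dataset's items, or None if the dataset is unknown."""
        return self._datasets.get(name)


def load_or_raise(client: StubClient, dataset_name: str) -> list[dict[str, Any]]:
    """Mirror the documented contract: ValueError if not found or empty."""
    items = client.fetch_items(dataset_name)
    if items is None:
        raise ValueError(f"Dataset '{dataset_name}' not found.")
    if not items:
        raise ValueError(f"Dataset '{dataset_name}' has no data.")
    return items


def load_with_fallback(
    client: StubClient, dataset_name: str, fallback: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Return the dataset's items, or the fallback if loading fails."""
    try:
        return load_or_raise(client, dataset_name)
    except ValueError:
        return fallback


client = StubClient({"filled": [{"query": "q1"}], "empty": []})
print(load_with_fallback(client, "filled", []))
print(load_with_fallback(client, "missing", []))
```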

load()

Load the dataset.

Returns:

| Type | Description |
| --- | --- |
| `List[MetricInput]` | The loaded dataset with proper Langfuse structure. |

validate()

Validate the dataset.

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If the dataset is not a list of MetricInput or if required fields are missing. |
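The two failure conditions above (wrong container type, missing required fields) can be sketched as a plain check. The reference does not list the required fields or the concrete type of `MetricInput`, so the sketch assumes dict-like records and uses `query` and `expected_response` as hypothetical required fields. `validate_records` is not part of the library.

```python
from __future__ import annotations

from typing import Any

# Hypothetical required fields; the real set is defined by the library.
REQUIRED_FIELDS = ("query", "expected_response")


def validate_records(records: Any, required_fields: tuple[str, ...] = REQUIRED_FIELDS) -> None:
    """Raise ValueError if records is not a list of dicts with the required fields."""
    if not isinstance(records, list):
        raise ValueError("Dataset must be a list of records.")
    for index, record in enumerate(records):
        if not isinstance(record, dict):
            raise ValueError(f"Record {index} is not a dict.")
        missing = [field for field in required_fields if field not in record]
        if missing:
            raise ValueError(f"Record {index} is missing required fields: {missing}")


validate_records([{"query": "q1", "expected_response": "a1"}])  # passes silently

try:
    validate_records([{"query": "q1"}])
except ValueError as error:
    message = str(error)
    print(message)
```

Reporting the record index and the missing field names in the message makes a failed validation easier to trace back to the offending row.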