
Preset

Presets for RAG pipelines.

LM(language_model_id='openai/gpt-4o-mini', language_model_credentials=None, system_instruction='')

Bases: BasePipelinePreset

A pipeline preset to perform a simple response generation task.

This class functions both as a preset configuration and as an executable pipeline.

Attributes:

    language_model_id (str | ModelId):
        The model ID. Either a ModelId instance or a string in one of the
        following formats:
        1. azure-openai provider: azure-openai/azure-endpoint:azure-deployment
        2. openai-compatible provider: openai-compatible/base-url:model-name
        3. langchain provider: langchain/<package>.<class>:model-name
        4. Other providers: provider/model-name

    language_model_credentials (str | dict[str, Any] | None):
        The credentials for the language model. One of:
        1. An API key.
        2. A path to a credentials JSON file (currently only supported for Google Vertex AI).
        3. A dictionary of credentials (currently only supported for LangChain).

    system_instruction (str):
        The system instruction for the language model.

Example
lm = LM(language_model_id="openai/gpt-4o-mini", language_model_credentials=OPENAI_API_KEY)
lm("Name 10 animals that starts with the letter 'A'")

Initialize a new preset.

Parameters:

    language_model_id (str | ModelId):
        The model ID. Either a ModelId instance or a string in one of the
        following formats:
        1. azure-openai provider: azure-openai/azure-endpoint:azure-deployment
        2. openai-compatible provider: openai-compatible/base-url:model-name
        3. langchain provider: langchain/<package>.<class>:model-name
        4. Other providers: provider/model-name
        Defaults to "openai/gpt-4o-mini".

    language_model_credentials (str | dict[str, Any] | None):
        The credentials for the language model. One of:
        1. An API key.
        2. A path to a credentials JSON file (currently only supported for Google Vertex AI).
        3. A dictionary of credentials (currently only supported for LangChain).
        Defaults to None.

    system_instruction (str):
        The system instruction for the language model. Defaults to an empty string.
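
For illustration, the sketch below shows how the documented model-id string formats map to constructor calls. The endpoint, deployment, class, and key values are placeholders, not working credentials.

# Sketch of the supported model-id string formats; all endpoint, deployment,
# and key values below are placeholders.
lm_openai = LM(
    language_model_id="openai/gpt-4o-mini",
    language_model_credentials="sk-placeholder",
)
lm_azure = LM(
    language_model_id="azure-openai/my-endpoint:my-deployment",
    language_model_credentials="azure-placeholder-key",
)
lm_langchain = LM(
    language_model_id="langchain/langchain_openai.ChatOpenAI:gpt-4o-mini",
    language_model_credentials={"api_key": "sk-placeholder"},
)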

build()

Build the pipeline.

Build a pipeline that performs a simple response generation task.

Returns:

    Pipeline: The built pipeline.
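
A minimal usage sketch, assuming an OPENAI_API_KEY variable holding a valid key; build() only constructs the Pipeline, it does not run it.

lm = LM(language_model_id="openai/gpt-4o-mini", language_model_credentials=OPENAI_API_KEY)
pipeline = lm.build()  # construct the underlying Pipeline without invoking it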

build_config(query, attachments=None, config=None)

Build the runtime configuration for the pipeline.

Parameters:

    query (str, required):
        The query to pass to the language model.

    attachments (list[Any] | None):
        The attachments to pass to the language model. Defaults to None.

    config (dict[str, Any] | None):
        The configuration to pass to the language model. Defaults to None.

Returns:

    dict[str, Any]: The runtime configuration for the pipeline.
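
A minimal sketch of building a runtime configuration; the exact keys of the returned dict depend on the pipeline, so this only shows the call shape.

runtime_config = lm.build_config(
    query="Name 10 animals that start with the letter 'A'",
    attachments=None,  # no attachments for this query
    config=None,       # fall back to the preset's defaults
)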

build_initial_state(query, attachments=None, config=None)

Build the initial state for the pipeline.

Parameters:

    query (str, required):
        The query to pass to the language model.

    attachments (list[Any] | None):
        The attachments to pass to the language model. Defaults to None.

    config (dict[str, Any] | None):
        The configuration to pass to the language model. Defaults to None.

Returns:

    dict[str, Any]: The initial state for the pipeline.
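
A minimal sketch along the same lines; the returned dict seeds the pipeline state for a single query.

initial_state = lm.build_initial_state(
    query="Name 10 animals that start with the letter 'A'",
)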

SimpleRAG(language_model_id='openai/gpt-4o-mini', language_model_credentials=None, system_instruction='', embedding_model_id='openai/text-embedding-3-small', embedding_model_credentials=None, data_store_type='chroma', data_store_index='default', data_store_host=None, data_store_port=None, data_store_config=None)

Bases: BasePipelinePreset

A simple RAG pipeline preset.

This preset implements a basic RAG pipeline with the following steps:

1. Retrieve relevant chunks using BasicRetriever.
2. Repack the chunks into a context.
3. Bundle the context for response synthesis.
4. Generate a response using StuffResponseSynthesizer.

Attributes:

    language_model_id (str | ModelId):
        The model ID. Either a ModelId instance or a string in one of the
        following formats:
        1. azure-openai provider: azure-openai/azure-endpoint:azure-deployment
        2. openai-compatible provider: openai-compatible/base-url:model-name
        3. langchain provider: langchain/<package>.<class>:model-name
        4. Other providers: provider/model-name

    language_model_credentials (str | dict[str, Any] | None):
        The credentials for the language model. One of:
        1. An API key.
        2. A path to a credentials JSON file (currently only supported for Google Vertex AI).
        3. A dictionary of credentials (currently only supported for LangChain).

    system_instruction (str):
        The system instruction for the language model.

    embedding_model_id (str):
        The embedding model to use.

    embedding_model_credentials (str | None):
        The credentials for the embedding model.

    data_store_type (str):
        The type of data store to use.

    data_store_index (str):
        The index name for the data store.

    data_store_host (str | None):
        The host for the data store.

    data_store_port (int | None):
        The port for the data store.

    data_store_config (dict | None):
        The configuration for the data store.

Example:

rag = SimpleRAG(
    language_model_id="openai/gpt-4o-mini",
    language_model_credentials="test-api-key",
    embedding_model_id="openai/text-embedding-3-small",
    embedding_model_credentials="test-embedding-api-key",
    data_store_type="chroma",
    data_store_index="default",
    data_store_config={"client_type": "persistent", "persist_directory": "./chroma_db"},
)

rag("What is the capital of France?")

Initialize a new SimpleRAG preset.

Parameters:

    language_model_id (str | ModelId):
        The model ID. Either a ModelId instance or a string in one of the
        following formats:
        1. azure-openai provider: azure-openai/azure-endpoint:azure-deployment
        2. openai-compatible provider: openai-compatible/base-url:model-name
        3. langchain provider: langchain/<package>.<class>:model-name
        4. Other providers: provider/model-name
        Defaults to "openai/gpt-4o-mini".

    language_model_credentials (str | dict[str, Any] | None):
        The credentials for the language model. One of:
        1. An API key.
        2. A path to a credentials JSON file (currently only supported for Google Vertex AI).
        3. A dictionary of credentials (currently only supported for LangChain).
        Defaults to None.

    system_instruction (str):
        The system instruction for the language model. Defaults to an empty string.

    embedding_model_id (str):
        The embedding model to use. Defaults to "openai/text-embedding-3-small".

    embedding_model_credentials (str | None):
        The credentials for the embedding model. Defaults to None.

    data_store_type (str):
        The type of data store to use. Defaults to "chroma".

    data_store_index (str):
        The index name for the data store. Defaults to "default".

    data_store_host (str | None):
        The host for the data store. Defaults to None.

    data_store_port (int | None):
        The port for the data store. Defaults to None.

    data_store_config (dict | None):
        The configuration for the data store. Defaults to None, in which case a
        default configuration for the selected data store type is used.
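
As a complement to the persistent-directory example above, the sketch below connects the preset to a data store over the network via data_store_host and data_store_port. The host and port values are placeholders for wherever your Chroma server runs.

rag_remote = SimpleRAG(
    language_model_id="openai/gpt-4o-mini",
    language_model_credentials="test-api-key",
    embedding_model_id="openai/text-embedding-3-small",
    embedding_model_credentials="test-embedding-api-key",
    data_store_type="chroma",
    data_store_index="default",
    data_store_host="localhost",  # placeholder host
    data_store_port=8000,         # placeholder port
)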

build()

Build the pipeline.

Returns:

    Pipeline: The built pipeline.
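
A minimal sketch, reusing the rag preset from the example above; build() returns the Pipeline that the preset would otherwise run when called directly.

pipeline = rag.build()  # retrieve -> repack -> bundle -> synthesize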

build_config(query, attachments=None, config=None)

Build the runtime configuration for the pipeline.

Parameters:

    query (str, required):
        The input query.

    attachments (list[Any] | None):
        The attachments. Defaults to None.

    config (dict[str, Any] | None):
        The configuration. Defaults to None.

Returns:

    dict[str, Any]: The runtime configuration for the pipeline.
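
A minimal sketch of the call; passing config=None keeps the preset's defaults, and the keys of the returned dict are pipeline-specific.

runtime_config = rag.build_config(query="What is the capital of France?", config=None)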

build_initial_state(query, attachments=None, config=None)

Build the initial state for the pipeline.

Parameters:

    query (str, required):
        The input query.

    attachments (list[Any] | None):
        The attachments. Defaults to None.

    config (dict[str, Any] | None):
        The configuration. Defaults to None.

Returns:

    dict[str, Any]: The initial state for the pipeline.
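
A minimal sketch of seeding the pipeline state for a query without attachments.

initial_state = rag.build_initial_state(query="What is the capital of France?")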