
Preset

Presets for RAG pipelines.

LM(language_model_id='openai/gpt-4o-mini', language_model_credentials=None, system_instruction='')

Bases: BasePipelinePreset

A pipeline preset to perform a simple response generation task.

This class functions both as a preset configuration and as an executable pipeline.

Attributes:

    language_model_id (str | ModelId):
        The model ID. Either a ModelId instance or a string in one of the
        following formats:
        1. azure-openai provider: azure-openai/azure-endpoint:azure-deployment
        2. openai-compatible provider: openai-compatible/base-url:model-name
        3. langchain provider: langchain/<package>.<class>:model-name
        4. Other providers: provider/model-name

    language_model_credentials (str | dict[str, Any] | None):
        The credentials for the language model. One of:
        1. An API key.
        2. A path to a credentials JSON file (currently only supported for Google Vertex AI).
        3. A dictionary of credentials (currently only supported for LangChain).

    system_instruction (str):
        The system instruction for the language model.

Example
lm = LM(language_model_id="openai/gpt-4o-mini", language_model_credentials=OPENAI_API_KEY)
lm("Name 10 animals that starts with the letter 'A'")

Initialize a new preset.

Parameters:

    language_model_id (str | ModelId):
        The model ID. Either a ModelId instance or a string in one of the
        following formats:
        1. azure-openai provider: azure-openai/azure-endpoint:azure-deployment
        2. openai-compatible provider: openai-compatible/base-url:model-name
        3. langchain provider: langchain/<package>.<class>:model-name
        4. Other providers: provider/model-name
        Defaults to "openai/gpt-4o-mini".

    language_model_credentials (str | dict[str, Any] | None):
        The credentials for the language model. One of:
        1. An API key.
        2. A path to a credentials JSON file (currently only supported for Google Vertex AI).
        3. A dictionary of credentials (currently only supported for LangChain).
        Defaults to None.

    system_instruction (str):
        The system instruction for the language model. Defaults to an empty string.
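
For illustration, the sketch below shows how the documented model-id string formats map to constructor calls. The endpoint, deployment, class, and key values are placeholders, not working credentials.

# Sketch of the supported model-id string formats; all endpoint, deployment,
# and key values below are placeholders.
lm_openai = LM(
    language_model_id="openai/gpt-4o-mini",
    language_model_credentials="sk-placeholder",
)
lm_azure = LM(
    language_model_id="azure-openai/my-endpoint:my-deployment",
    language_model_credentials="azure-placeholder-key",
)
lm_langchain = LM(
    language_model_id="langchain/langchain_openai.ChatOpenAI:gpt-4o-mini",
    language_model_credentials={"api_key": "sk-placeholder"},
)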

build()

Build the pipeline.

Build a pipeline that performs a simple response generation task.

Returns:

    Pipeline: The built pipeline.
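
A minimal usage sketch, assuming an OPENAI_API_KEY variable holding a valid key; build() only constructs the Pipeline, it does not run it.

lm = LM(language_model_id="openai/gpt-4o-mini", language_model_credentials=OPENAI_API_KEY)
pipeline = lm.build()  # construct the underlying Pipeline without invoking it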

build_config(query, attachments=None, config=None)

Build the runtime configuration for the pipeline.

Parameters:

    query (str, required):
        The query to pass to the language model.

    attachments (list[Any] | None):
        The attachments to pass to the language model. Defaults to None.

    config (dict[str, Any] | None):
        The configuration to pass to the language model. Defaults to None.

Returns:

    dict[str, Any]: The runtime configuration for the pipeline.
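
A minimal sketch of building a runtime configuration; the exact keys of the returned dict depend on the pipeline, so this only shows the call shape.

runtime_config = lm.build_config(
    query="Name 10 animals that start with the letter 'A'",
    attachments=None,  # no attachments for this query
    config=None,       # fall back to the preset's defaults
)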

build_initial_state(query, attachments=None, config=None)

Build the initial state for the pipeline.

Parameters:

    query (str, required):
        The query to pass to the language model.

    attachments (list[Any] | None):
        The attachments to pass to the language model. Defaults to None.

    config (dict[str, Any] | None):
        The configuration to pass to the language model. Defaults to None.

Returns:

    dict[str, Any]: The initial state for the pipeline.
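
A minimal sketch along the same lines; the returned dict seeds the pipeline state for a single query.

initial_state = lm.build_initial_state(
    query="Name 10 animals that start with the letter 'A'",
)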

SimpleRAG(language_model_id='openai/gpt-4o-mini', language_model_credentials=None, system_instruction='', embedding_model_id='openai/text-embedding-3-small', embedding_model_credentials=None, data_store_type='chroma', data_store_index='default', data_store_host=None, data_store_port=None, data_store_config=None)

Bases: BasePipelinePreset

A simple RAG pipeline preset.

This preset implements a basic RAG pipeline with the following steps:

1. Retrieve relevant chunks using BasicRetriever.
2. Repack the chunks into a context.
3. Bundle the context for response synthesis.
4. Generate a response using StuffResponseSynthesizer.

Attributes:

    language_model_id (str | ModelId):
        The model ID. Either a ModelId instance or a string in one of the
        following formats:
        1. azure-openai provider: azure-openai/azure-endpoint:azure-deployment
        2. openai-compatible provider: openai-compatible/base-url:model-name
        3. langchain provider: langchain/<package>.<class>:model-name
        4. Other providers: provider/model-name

    language_model_credentials (str | dict[str, Any] | None):
        The credentials for the language model. One of:
        1. An API key.
        2. A path to a credentials JSON file (currently only supported for Google Vertex AI).
        3. A dictionary of credentials (currently only supported for LangChain).

    system_instruction (str):
        The system instruction for the language model.

    embedding_model_id (str):
        The embedding model to use.

    embedding_model_credentials (str | None):
        The credentials for the embedding model.

    data_store_type (str):
        The type of data store to use.

    data_store_index (str):
        The index name for the data store.

    data_store_host (str | None):
        The host for the data store.

    data_store_port (int | None):
        The port for the data store.

    data_store_config (dict | None):
        The configuration for the data store.

Example:

rag = SimpleRAG(
    language_model_id="openai/gpt-4o-mini",
    language_model_credentials="test-api-key",
    embedding_model_id="openai/text-embedding-3-small",
    embedding_model_credentials="test-embedding-api-key",
    data_store_type="chroma",
    data_store_index="default",
    data_store_config={"client_type": "persistent", "persist_directory": "./chroma_db"},
)

rag("What is the capital of France?")

Initialize a new SimpleRAG preset.

Parameters:

    language_model_id (str | ModelId):
        The model ID. Either a ModelId instance or a string in one of the
        following formats:
        1. azure-openai provider: azure-openai/azure-endpoint:azure-deployment
        2. openai-compatible provider: openai-compatible/base-url:model-name
        3. langchain provider: langchain/<package>.<class>:model-name
        4. Other providers: provider/model-name
        Defaults to "openai/gpt-4o-mini".

    language_model_credentials (str | dict[str, Any] | None):
        The credentials for the language model. One of:
        1. An API key.
        2. A path to a credentials JSON file (currently only supported for Google Vertex AI).
        3. A dictionary of credentials (currently only supported for LangChain).
        Defaults to None.

    system_instruction (str):
        The system instruction for the language model. Defaults to an empty string.

    embedding_model_id (str):
        The embedding model to use. Defaults to "openai/text-embedding-3-small".

    embedding_model_credentials (str | None):
        The credentials for the embedding model. Defaults to None.

    data_store_type (str):
        The type of data store to use. Defaults to "chroma".

    data_store_index (str):
        The index name for the data store. Defaults to "default".

    data_store_host (str | None):
        The host for the data store. Defaults to None.

    data_store_port (int | None):
        The port for the data store. Defaults to None.

    data_store_config (dict | None):
        The configuration for the data store. Defaults to None, in which case a
        default configuration for the selected data store type is used.
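
As a complement to the persistent-directory example above, the sketch below connects the preset to a data store over the network via data_store_host and data_store_port. The host and port values are placeholders for wherever your Chroma server runs.

rag_remote = SimpleRAG(
    language_model_id="openai/gpt-4o-mini",
    language_model_credentials="test-api-key",
    embedding_model_id="openai/text-embedding-3-small",
    embedding_model_credentials="test-embedding-api-key",
    data_store_type="chroma",
    data_store_index="default",
    data_store_host="localhost",  # placeholder host
    data_store_port=8000,         # placeholder port
)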

build()

Build the pipeline.

Returns:

    Pipeline: The built pipeline.
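
A minimal sketch, reusing the rag preset from the example above; build() returns the Pipeline that the preset would otherwise run when called directly.

pipeline = rag.build()  # retrieve -> repack -> bundle -> synthesize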

build_config(query, attachments=None, config=None)

Build the runtime configuration for the pipeline.

Parameters:

    query (str, required):
        The input query.

    attachments (list[Any] | None):
        The attachments. Defaults to None.

    config (dict[str, Any] | None):
        The configuration. Defaults to None.

Returns:

    dict[str, Any]: The runtime configuration for the pipeline.
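
A minimal sketch of the call; passing config=None keeps the preset's defaults, and the keys of the returned dict are pipeline-specific.

runtime_config = rag.build_config(query="What is the capital of France?", config=None)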

build_initial_state(query, attachments=None, config=None)

Build the initial state for the pipeline.

Parameters:

    query (str, required):
        The input query.

    attachments (list[Any] | None):
        The attachments. Defaults to None.

    config (dict[str, Any] | None):
        The configuration. Defaults to None.

Returns:

    dict[str, Any]: The initial state for the pipeline.
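
A minimal sketch of seeding the pipeline state for a query without attachments.

initial_state = rag.build_initial_state(query="What is the capital of France?")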