# Response Synthesizer

Modules concerning the response synthesizers used in Gen AI applications.
## ResponseSynthesizer(strategy, streamable=True)

Bases: `Component`

A response synthesizer that uses a strategy to synthesize the response.
Attributes:

| Name | Type | Description |
|---|---|---|
| `strategy` | `BaseSynthesisStrategy` | The strategy used to synthesize the response. |
| `streamable` | `bool` | A flag to indicate whether the synthesized response will be streamed if an event emitter is provided. |
| `preset` | `PresetFactory` | Factory for creating `ResponseSynthesizer` instances with preset configurations. |
The ResponseSynthesizer class provides a unified interface for synthesizing the response
using different strategies:
### Stuff strategy
This strategy utilizes a language model to synthesize a response from the provided inputs. It employs the "stuff" technique: all provided chunks are placed together into a single prompt, which is then passed to the language model to synthesize a response in one call.
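Conceptually, the stuff technique amounts to joining every chunk into one context block inside a single prompt. A minimal sketch of that idea (the prompt wording and helper name are illustrative, not the library's actual template):

```python
def build_stuff_prompt(query: str, chunks: list[str]) -> str:
    """Illustrative sketch: pack all chunks into one prompt."""
    # Join every chunk into a single context block.
    context = "\n\n".join(chunks)
    return f"Context:\n{context}\n\nQuestion: {query}"
```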
Usage example:
```python
lm_request_processor = build_lm_request_processor(...)
synthesizer = ResponseSynthesizer.stuff(lm_request_processor=lm_request_processor)
response = await synthesizer.synthesize(query=query, chunks=chunks)
```
The stuff strategy can also be instantiated using a preset prompt template via the preset factory.
This simplifies usage, as only the model ID is required:
```python
synthesizer = ResponseSynthesizer.preset.stuff(model_id="openai/gpt-4.1-nano")
response = await synthesizer.synthesize(query=query, chunks=chunks)
```
### Map-reduce strategy
This strategy implements a two-phase approach for processing large amounts of content:

1. Map phase: Iteratively map each batch of chunks into extracted contexts until either the maximum number of iterations is reached or the number of chunks is less than the batch size.
2. Reduce phase: Reduce all extracted contexts into a final answer.

This approach is useful when dealing with large amounts of content that need to be processed efficiently.
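The two-phase control flow can be sketched with plain functions, where `map_fn` and `reduce_fn` stand in for the language model calls. This is an illustration of the batching logic only, not the library's implementation:

```python
def map_reduce_flow(chunks, batch_size, max_iterations, map_fn, reduce_fn):
    # Map phase: repeatedly condense batches of chunks until the
    # iteration budget is spent or too few chunks remain for a batch.
    for _ in range(max_iterations):
        if len(chunks) < batch_size:
            break
        chunks = [
            map_fn(chunks[i : i + batch_size])
            for i in range(0, len(chunks), batch_size)
        ]
    # Reduce phase: combine all extracted contexts into a final answer.
    return reduce_fn(chunks)
```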
Usage example:
```python
map_processor = build_lm_request_processor(...)
reduce_processor = build_lm_request_processor(...)
synthesizer = ResponseSynthesizer.map_reduce(
    map_lm_request_processor=map_processor,
    reduce_lm_request_processor=reduce_processor,
)
response = await synthesizer.synthesize(query=query, chunks=chunks)
```
The map-reduce strategy can also be instantiated using preset prompt templates via the preset factory:
```python
synthesizer = ResponseSynthesizer.preset.map_reduce(
    map_model_id="openai/gpt-4.1-nano",
    reduce_model_id="openai/gpt-4.1-nano",
)
response = await synthesizer.synthesize(query=query, chunks=chunks)
```
### Refine strategy
This strategy utilizes a language model to iteratively refine a response based on multiple contexts. It processes contexts in batches, where each iteration refines the previous answer with new context. This approach is useful when large amounts of context must be processed incrementally.
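The iterative refinement loop can be sketched as follows, with `refine_fn` standing in for the language model call. This illustrates the batching logic only, not the library's implementation:

```python
def refine_flow(query, chunks, batch_size, refine_fn):
    # Each iteration refines the previous draft answer with the next
    # batch of context chunks; the first iteration starts from None.
    answer = None
    for i in range(0, len(chunks), batch_size):
        answer = refine_fn(query, answer, chunks[i : i + batch_size])
    return answer
```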
Usage example:
```python
lm_request_processor = build_lm_request_processor(...)
synthesizer = ResponseSynthesizer.refine(lm_request_processor=lm_request_processor, batch_size=2)
response = await synthesizer.synthesize(query=query, chunks=chunks)
```
The refine strategy can also be instantiated using a preset prompt template via the preset factory.
This simplifies usage, as only the model ID is required:
```python
synthesizer = ResponseSynthesizer.preset.refine(model_id="openai/gpt-4.1-nano", batch_size=2)
response = await synthesizer.synthesize(query=query, chunks=chunks)
```
### Static list strategy
This strategy generates a response by formatting a list of context items. It can be used when a response should be presented as a simple list without requiring language model processing. The response format is customizable by providing a function that formats the list of contexts as a response.
Usage example:
```python
synthesizer = ResponseSynthesizer.static_list()
response = await synthesizer.synthesize(chunks=chunks)
```
Initializes a new instance of the ResponseSynthesizer class.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `strategy` | `BaseSynthesisStrategy` | The strategy used to synthesize the response. | *required* |
| `streamable` | `bool` | A flag to indicate whether the synthesized response will be streamed if an event emitter is provided. Defaults to True. | `True` |
### map_reduce(map_lm_request_processor, reduce_lm_request_processor, chunks_repacker=None, extractor_func=None, batch_size=1, max_iterations=1, streamable=True)

*classmethod*
Creates a response synthesizer with the map-reduce strategy.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `map_lm_request_processor` | `LMRequestProcessor` | The request processor for the map phase. | *required* |
| `reduce_lm_request_processor` | `LMRequestProcessor` | The request processor for the reduce phase. | *required* |
| `chunks_repacker` | `Repacker \| None` | The repacker used to repack chunks during the reduce phase. Defaults to None, in which case a repacker with mode "context" is used. | `None` |
| `extractor_func` | `Callable[[LMOutput], Any] \| None` | A function to extract the language model output. Defaults to None, in which case the default extractor function is used, which extracts the `response` attribute from the language model output. | `None` |
| `batch_size` | `int` | The number of chunks to include in each map step. Defaults to 1. | `1` |
| `max_iterations` | `int` | The maximum number of map iterations to perform. Defaults to 1. | `1` |
| `streamable` | `bool` | A flag to indicate whether the synthesized response will be streamed if an event emitter is provided. Defaults to True. | `True` |
Returns:

| Name | Type | Description |
|---|---|---|
| `ResponseSynthesizer` | `ResponseSynthesizer` | A response synthesizer with the map-reduce strategy. |
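The `extractor_func` parameter accepts a callable that pulls the desired value out of an `LMOutput`. A hypothetical example, assuming `LMOutput` exposes a `response` attribute as described; the dataclass here is a stand-in for the library's type, and the whitespace trimming is an arbitrary customization:

```python
from dataclasses import dataclass


@dataclass
class LMOutput:  # stand-in for the library's LMOutput type
    response: str


def extract_stripped(output: LMOutput) -> str:
    # Mirror the described default behavior (return the response
    # attribute), additionally trimming surrounding whitespace.
    return output.response.strip()
```

A custom extractor like this could then be passed as `extractor_func=extract_stripped`.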
### refine(lm_request_processor, batch_size=1, extractor_func=None, stream_drafts=False, streamable=True)

*classmethod*
Creates a response synthesizer with the refine strategy.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `lm_request_processor` | `LMRequestProcessor` | The request processor used to handle the response generation. | *required* |
| `batch_size` | `int` | The number of chunks to include in each step. Defaults to 1. | `1` |
| `extractor_func` | `Callable[[LMOutput], Any] \| None` | A function to extract the language model output. Defaults to None, in which case the default extractor function is used, which extracts the `response` attribute from the language model output. | `None` |
| `stream_drafts` | `bool` | Whether to stream the drafts of the response. If False, only the final response will be streamed. Defaults to False. | `False` |
| `streamable` | `bool` | A flag to indicate whether the synthesized response will be streamed if an event emitter is provided. Defaults to True. | `True` |
Returns:

| Name | Type | Description |
|---|---|---|
| `ResponseSynthesizer` | `ResponseSynthesizer` | A response synthesizer with the refine strategy. |
### static_list(format_response_func=None, streamable=True)

*classmethod*
Creates a response synthesizer with the static list strategy.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `format_response_func` | `Callable[[list[str]], str] \| None` | A function that formats a list of contexts as a response. Defaults to None, in which case the default formatter function will be used. | `None` |
| `streamable` | `bool` | A flag to indicate whether the synthesized response will be streamed if an event emitter is provided. Defaults to True. | `True` |
Returns:

| Name | Type | Description |
|---|---|---|
| `ResponseSynthesizer` | `ResponseSynthesizer` | A response synthesizer with the static list strategy. |
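As an illustration of the kind of callable `format_response_func` accepts, the formatter below renders the contexts as a Markdown bullet list. The bullet formatting is an arbitrary choice for this sketch, not the library's default:

```python
def format_as_bullets(context_list: list[str]) -> str:
    # Render each context item as a Markdown bullet point.
    return "\n".join(f"- {item}" for item in context_list)
```

Such a function could then be supplied via `ResponseSynthesizer.static_list(format_response_func=format_as_bullets)`.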
### stuff(lm_request_processor, chunks_repacker=None, extractor_func=None, streamable=True)

*classmethod*
Creates a response synthesizer with the stuff strategy.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `lm_request_processor` | `LMRequestProcessor` | The request processor used to handle the response generation. | *required* |
| `chunks_repacker` | `Repacker \| None` | The repacker used to repack the chunks into a context string. Defaults to None, in which case a repacker with mode "context" is used. | `None` |
| `extractor_func` | `Callable[[LMOutput], Any] \| None` | A function to extract the language model output. Defaults to None, in which case the default extractor function is used, which extracts the `response` attribute from the language model output. | `None` |
| `streamable` | `bool` | A flag to indicate whether the synthesized response will be streamed if an event emitter is provided. Defaults to True. | `True` |
Returns:

| Name | Type | Description |
|---|---|---|
| `ResponseSynthesizer` | `ResponseSynthesizer` | A response synthesizer with the stuff strategy. |
### synthesize(query=None, chunks=None, history=None, extra_contents=None, hyperparameters=None, event_emitter=None, **kwargs)

*async*
Synthesizes a response using the assigned strategy.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str \| None` | The input query used to synthesize the response. Defaults to None. | `None` |
| `chunks` | `list[Chunk] \| None` | The list of chunks to be used as context. Defaults to None. | `None` |
| `history` | `list[Message] \| None` | The conversation history to be considered in generating the response. Defaults to None. | `None` |
| `extra_contents` | `list[MessageContent] \| None` | A list of extra contents to be included when generating the response. Defaults to None. | `None` |
| `hyperparameters` | `dict[str, Any] \| None` | The hyperparameters to be passed to the language model. Defaults to None. | `None` |
| `context_list` | `list[str] \| None` | The list of contexts to be included in the response. Defaults to None. | `None` |
| `event_emitter` | `EventEmitter \| None` | The event emitter for handling events during response synthesis. Defaults to None. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments that will be passed to the strategy. | `{}` |
Returns:

| Name | Type | Description |
|---|---|---|
| `Any` | `Any` | The synthesized response. |