Context Manipulator
Utility modules used to manipulate LLM context.
LLMLinguaCompressor(model_name='NousResearch/Llama-2-7b-hf', device_map='cuda', rate=0.5, target_token=-1, use_sentence_level_filter=False, use_context_level_filter=True, use_token_level_filter=True, rank_method='longllmlingua')
Bases: BaseCompressor
LLMLinguaCompressor is a wrapper for LongLLMLingua's PromptCompressor.
This class provides a simplified interface for using LongLLMLingua's compression capabilities within the GLLM series of libraries, with a focus on the 'longllmlingua' ranking method.
Initialize the LLMLinguaCompressor.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_name | str | The name of the language model to be used. | 'NousResearch/Llama-2-7b-hf' |
| device_map | str | The device to load the model onto, e.g., "cuda" for GPU. | 'cuda' |
| rate | float | The default compression rate to be used. | 0.5 |
| target_token | int | The default target token count; -1 means no specific target. | -1 |
| use_sentence_level_filter | bool | Whether to use sentence-level filtering. | False |
| use_context_level_filter | bool | Whether to use context-level filtering. | True |
| use_token_level_filter | bool | Whether to use token-level filtering. | True |
| rank_method | str | The ranking method to use; "longllmlingua" is recommended. | 'longllmlingua' |
compress(context, query, instruction=None, options=None)
async
Compress the given context based on the query and optional instruction.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| context | str | The context to be compressed. | required |
| query | str | The query related to the context. | required |
| instruction | str \| None | An optional instruction to be considered during compression. | None |
| options | dict[str, Any] \| None | Additional options for fine-tuning the compression process. Supported keys: rate, target_token, use_sentence_level_filter, use_context_level_filter, use_token_level_filter, rank_method. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | The compressed context string. |
Raises:
| Type | Description |
|---|---|
| ValueError | If the compression process fails. |
LMBasedRelevanceFilter(lm_request_processor, batch_size=DEFAULT_BATCH_SIZE, on_failure_keep_all=True, metadata=None, chunk_format=DEFAULT_CHUNK_TEMPLATE)
Bases: BaseRelevanceFilter, UsesLM
Relevance filter that uses an LM to determine chunk relevance.
This filter processes chunks in batches, sending them to an LM for relevance determination. It handles potential LM processing failures with a simple strategy controlled by the 'on_failure_keep_all' parameter.
The LM is expected to return a specific output format for each chunk, indicating its relevance to the given query.
The expected LM output format is:
```
{
    "results": [
        {
            "explanation": str,
            "is_relevant": bool
        },
        ...
    ]
}
```
The number of items in "results" should match the number of input chunks.
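A minimal sketch of how a response in this format can be applied to a batch of chunks (the helper is hypothetical and operates on plain strings rather than Chunk objects):

```python
def apply_relevance_results(chunks: list[str], response: dict) -> list[str]:
    """Keep chunks whose corresponding result entry is marked relevant."""
    results = response["results"]
    if len(results) != len(chunks):
        # The docs require one result per input chunk.
        raise ValueError("result count must match the number of input chunks")
    return [c for c, r in zip(chunks, results) if r["is_relevant"]]

response = {
    "results": [
        {"explanation": "mentions the query topic", "is_relevant": True},
        {"explanation": "unrelated boilerplate", "is_relevant": False},
    ]
}
print(apply_relevance_results(["chunk A", "chunk B"], response))  # ['chunk A']
```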
Attributes:
| Name | Type | Description |
|---|---|---|
| lm_request_processor | LMRequestProcessor | The LM request processor used for LM calls. |
| batch_size | int | The number of chunks to process in each LM call. |
| on_failure_keep_all | bool | If True, keep all chunks when LM processing fails. If False, discard all chunks from the failed batch. |
| metadata | list[str] \| None | List of metadata fields to include. If None, no metadata is included. |
| chunk_format | str \| Callable[[Chunk], str] | Either a format string or a callable for custom chunk formatting. If using a format string, use {content} for chunk content, {metadata} for an auto-formatted metadata block, or reference metadata fields directly with {field_name}. |
Initialize the LMBasedRelevanceFilter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| lm_request_processor | LMRequestProcessor | The LM request processor to use for LM calls. | required |
| batch_size | int | The number of chunks to process in each LM call. | DEFAULT_BATCH_SIZE |
| on_failure_keep_all | bool | If True, keep all chunks when LM processing fails. If False, discard all chunks from the failed batch. | True |
| metadata | list[str] \| None | List of metadata fields to include. If None, no metadata is included. | None |
| chunk_format | str \| Callable[[Chunk], str] | Either a format string or a callable for custom chunk formatting. If using a format string, use {content} for chunk content, {metadata} for an auto-formatted metadata block, or reference metadata fields directly with {field_name}. | DEFAULT_CHUNK_TEMPLATE |
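The two chunk_format styles can be illustrated with plain str.format (a sketch using a dict in place of a Chunk; the actual default template and metadata rendering may differ):

```python
# A stand-in for a Chunk with one metadata field; illustrative only.
chunk = {"content": "GPUs accelerate matmuls.", "source": "notes.md"}

# Style 1: a format string referencing a metadata field directly.
template = "[{source}] {content}"
print(template.format(**chunk))  # [notes.md] GPUs accelerate matmuls.

# Style 2: a callable that builds the string itself.
def fmt(c) -> str:
    return f"{c['content']} (from {c['source']})"

print(fmt(chunk))  # GPUs accelerate matmuls. (from notes.md)
```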
filter(chunks, query)
async
Filter the given chunks based on their relevance to the query using an LM.
This method processes chunks in batches, sending each batch to the LM for relevance determination. If LM processing fails for a batch, the behavior is determined by the 'on_failure_keep_all' attribute.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| chunks | list[Chunk] | The list of chunks to filter. | required |
| query | str | The query to compare chunks against. | required |
Returns:
| Type | Description |
|---|---|
| list[Chunk] | A list of chunks deemed relevant by the LM. |
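The batching and failure handling described above can be sketched as follows (assuming simple list slicing and a boolean-per-chunk LM result; the library's actual batching may differ):

```python
def batched(items: list, batch_size: int) -> list[list]:
    """Split items into consecutive batches of at most batch_size."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

def filter_with_fallback(batch: list[str], lm_call, on_failure_keep_all: bool) -> list[str]:
    """Run one LM call on a batch; on failure, keep or drop the whole batch."""
    try:
        keep_flags = lm_call(batch)  # one bool per chunk in the batch
        return [c for c, keep in zip(batch, keep_flags) if keep]
    except Exception:
        return list(batch) if on_failure_keep_all else []

print(batched(["a", "b", "c", "d", "e"], 2))  # [['a', 'b'], ['c', 'd'], ['e']]
```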
MetadataContextEnricher(metadata_fields, position=MetadataPosition.PREFIX, separator='\n---\n', field_template='- {field}: {value}', skip_empty=True, binary_handling=BinaryHandlingStrategy.BASE64)
Bases: BaseContextEnricher
A metadata context enricher that adds metadata to the chunk content.
This enricher formats metadata fields into a string and appends it to the chunk content based on the specified position (prefix or suffix).
Attributes:
| Name | Type | Description |
|---|---|---|
| metadata_fields | list[str] | List of metadata fields to include in the enriched content. |
| position | MetadataPosition | Position of the metadata in the content. Valid values are defined in the MetadataPosition enum: PREFIX places the metadata block before the content; SUFFIX places it after. |
| separator | str | Separator between the metadata and the content. |
| field_template | str | Template for formatting each metadata field. |
| skip_empty | bool | Whether to skip fields with empty values. |
| binary_handling | BinaryHandlingStrategy | Strategy for handling binary data. Valid values are defined in the BinaryHandlingStrategy enum: BASE64 converts binary data to base64 (default); HEX converts it to hexadecimal; NONE omits it from the metadata block. |
Initialize the metadata context enricher.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| metadata_fields | list[str] | List of metadata field names to include. | required |
| position | MetadataPosition | Where to place the metadata block. Valid values are defined in the MetadataPosition enum: "prefix" places the metadata block before the content; "suffix" places it after. | PREFIX |
| separator | str | String to separate metadata from content. | '\n---\n' |
| field_template | str | Template for formatting each metadata field. Available fields: {field} (field name) and {value} (field value). | '- {field}: {value}' |
| skip_empty | bool | Whether to skip empty fields. | True |
| binary_handling | BinaryHandlingStrategy | Strategy for handling binary data. Valid values are defined in the BinaryHandlingStrategy enum: "base64" converts binary data to base64 (default); "hex" converts it to hexadecimal; "none" omits it from the metadata block. | BASE64 |
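A minimal sketch of the enrichment behavior described above, using plain strings and dicts in place of Chunk objects (the defaults mirror the documented ones; this is not the library's actual implementation):

```python
import base64

def enrich(content: str, metadata: dict, position: str = "prefix",
           separator: str = "\n---\n", field_template: str = "- {field}: {value}",
           skip_empty: bool = True, binary_handling: str = "base64") -> str:
    """Format selected metadata and attach it before or after the content."""
    lines = []
    for field, value in metadata.items():
        if skip_empty and not value:
            continue  # drop empty fields when skip_empty is True
        if isinstance(value, bytes):
            if binary_handling == "base64":
                value = base64.b64encode(value).decode()
            elif binary_handling == "hex":
                value = value.hex()
            else:  # "none": omit binary fields entirely
                continue
        lines.append(field_template.format(field=field, value=value))
    if not lines:
        return content
    block = "\n".join(lines)
    return block + separator + content if position == "prefix" else content + separator + block

print(enrich("Body text.", {"title": "Doc", "empty": ""}))
# - title: Doc
# ---
# Body text.
```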
Repacker(method='forward', mode='chunk', delimiter='\n\n', size_func=lambda chunk: len(chunk.content), size_limit=None)
Bases: Component
A class for repacking chunks using various strategies.
Attributes:
| Name | Type | Description |
|---|---|---|
| method | RepackMethod | The method used for repacking. |
| mode | RepackerMode | The mode of operation (chunk or context). |
| delimiter | str | The delimiter used in context mode. |
| size_func | Callable[[Chunk], int] | Function used to measure the size of chunks. |
| size_limit | int \| None | The maximum allowed total size for the repacked chunks. |
Initialize the Repacker instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| method | str | The method used for repacking. | 'forward' |
| mode | str | The mode of operation (chunk or context). | 'chunk' |
| delimiter | str | The delimiter used in context mode. | '\n\n' |
| size_func | Callable[[Chunk], int] | Function used to measure the size of a single chunk. Defaults to the length of the chunk content. | lambda chunk: len(chunk.content) |
| size_limit | int \| None | The maximum allowed total size for the repacked chunks. Note: the size limit accounts only for the chunks, not the delimiter in context mode. | None |
repack(chunks)
async
Repack the given chunks using the configured method, mode, size function, and size limit.
SimilarityBasedRelevanceFilter(em_invoker, threshold=0.5)
Bases: BaseRelevanceFilter
Relevance filter that uses semantic similarity to determine chunk relevance.
Attributes:
| Name | Type | Description |
|---|---|---|
| em_invoker | BaseEMInvoker | The embedding model invoker to use for vectorization. |
| threshold | float | The similarity threshold for relevance (0 to 1). Defaults to 0.5. |
Initialize the SimilarityBasedRelevanceFilter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| em_invoker | BaseEMInvoker | The embedding model invoker to use for vectorization. | required |
| threshold | float | The similarity threshold for relevance (0 to 1). | 0.5 |
Raises:
| Type | Description |
|---|---|
| ValueError | If the threshold is not between 0 and 1. |
filter(chunks, query)
async
Filter the given chunks based on their semantic similarity to the query.
This method calculates the similarity between the query and each text chunk. For now, non-text chunks are excluded from processing and similarity calculation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| chunks | list[Chunk] | The list of chunks to filter. | required |
| query | str | The query to compare chunks against. | required |
Returns:
| Type | Description |
|---|---|
| list[Chunk] | A list of relevant text chunks. Non-text chunks are not included in the result. |
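The thresholding step can be sketched with plain cosine similarity over embedding vectors (illustrative only; the real filter obtains vectors via em_invoker and the similarity measure is an assumption):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def filter_by_similarity(query_vec, chunk_vecs, chunks, threshold=0.5):
    """Keep chunks whose embedding is at least `threshold` similar to the query."""
    return [c for c, v in zip(chunks, chunk_vecs)
            if cosine_similarity(query_vec, v) >= threshold]

query_vec = [1.0, 0.0]
chunk_vecs = [[1.0, 0.0], [0.0, 1.0]]  # similarities: 1.0 and 0.0
print(filter_by_similarity(query_vec, chunk_vecs, ["on-topic", "off-topic"]))
# ['on-topic']
```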