Context Manipulator
Utility modules used to manipulate LLM context.
LLMLinguaCompressor(model_name='NousResearch/Llama-2-7b-hf', device_map='cuda', rate=0.5, target_token=-1, use_sentence_level_filter=False, use_context_level_filter=True, use_token_level_filter=True, rank_method='longllmlingua')
Bases: BaseCompressor
LLMLinguaCompressor is a wrapper for LongLLMLingua's PromptCompressor.
This class provides a simplified interface for using LongLLMLingua's compression capabilities within the GLLM series of libraries, with a focus on the 'longllmlingua' ranking method.
Initialize the LLMLinguaCompressor.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_name | str | The name of the language model to be used. | 'NousResearch/Llama-2-7b-hf' |
| device_map | str | The device to load the model onto, e.g., "cuda" for GPU. | 'cuda' |
| rate | float | The default compression rate to be used. | 0.5 |
| target_token | int | The default target token count; -1 means no specific target. | -1 |
| use_sentence_level_filter | bool | Whether to use sentence-level filtering. | False |
| use_context_level_filter | bool | Whether to use context-level filtering. | True |
| use_token_level_filter | bool | Whether to use token-level filtering. | True |
| rank_method | str | The ranking method to use; "longllmlingua" is recommended. | 'longllmlingua' |
compress(context, query, instruction=None, options=None)
async
Compress the given context based on the query and optional instruction.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| context | str | The context to be compressed. | required |
| query | str | The query related to the context. | required |
| instruction | str \| None | An optional instruction to be considered during compression. | None |
| options | dict[str, Any] \| None | Additional options for fine-tuning the compression process. Supported keys: rate, target_token, use_sentence_level_filter, use_context_level_filter, use_token_level_filter, rank_method. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | The compressed context string. |
Raises:
| Type | Description |
|---|---|
| ValueError | If the compression process fails. |
LMBasedRelevanceFilter(lm_request_processor, batch_size=DEFAULT_BATCH_SIZE, on_failure_keep_all=True, metadata=None, chunk_format=DEFAULT_CHUNK_TEMPLATE)
Bases: BaseRelevanceFilter, UsesLM
Relevance filter that uses an LM to determine chunk relevance.
This filter processes chunks in batches, sending them to an LM for relevance determination. It handles potential LM processing failures with a simple strategy controlled by the 'on_failure_keep_all' parameter.
The LM is expected to return a specific output format for each chunk, indicating its relevance to the given query.
The expected LM output format is:
```
{
    "results": [
        {
            "explanation": str,
            "is_relevant": bool
        },
        ...
    ]
}
```
The number of items in "results" should match the number of input chunks.
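A minimal sketch of how a response in this format can be applied to a batch of chunks (the helper is hypothetical and operates on plain strings rather than Chunk objects):

```python
def apply_relevance_results(chunks: list[str], response: dict) -> list[str]:
    """Keep chunks whose corresponding result entry is marked relevant."""
    results = response["results"]
    if len(results) != len(chunks):
        # The docs require one result per input chunk.
        raise ValueError("result count must match the number of input chunks")
    return [c for c, r in zip(chunks, results) if r["is_relevant"]]

response = {
    "results": [
        {"explanation": "mentions the query topic", "is_relevant": True},
        {"explanation": "unrelated boilerplate", "is_relevant": False},
    ]
}
print(apply_relevance_results(["chunk A", "chunk B"], response))  # ['chunk A']
```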
Attributes:
| Name | Type | Description |
|---|---|---|
| lm_request_processor | LMRequestProcessor | The LM request processor used for LM calls. |
| batch_size | int | The number of chunks to process in each LM call. |
| on_failure_keep_all | bool | If True, keep all chunks when LM processing fails. If False, discard all chunks from the failed batch. |
| metadata | list[str] \| None | List of metadata fields to include. If None, no metadata is included. |
| chunk_format | str \| Callable[[Chunk], str] | Either a format string or a callable for custom chunk formatting. If using a format string, use {content} for chunk content, {metadata} for an auto-formatted metadata block, or reference metadata fields directly with {field_name}. |
Initialize the LMBasedRelevanceFilter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| lm_request_processor | LMRequestProcessor | The LM request processor to use for LM calls. | required |
| batch_size | int | The number of chunks to process in each LM call. | DEFAULT_BATCH_SIZE |
| on_failure_keep_all | bool | If True, keep all chunks when LM processing fails. If False, discard all chunks from the failed batch. | True |
| metadata | list[str] \| None | List of metadata fields to include. If None, no metadata is included. | None |
| chunk_format | str \| Callable[[Chunk], str] | Either a format string or a callable for custom chunk formatting. If using a format string, use {content} for chunk content, {metadata} for an auto-formatted metadata block, or reference metadata fields directly with {field_name}. | DEFAULT_CHUNK_TEMPLATE |
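The two chunk_format styles can be illustrated with plain str.format (a sketch using a dict in place of a Chunk; the actual default template and metadata rendering may differ):

```python
# A stand-in for a Chunk with one metadata field; illustrative only.
chunk = {"content": "GPUs accelerate matmuls.", "source": "notes.md"}

# Style 1: a format string referencing a metadata field directly.
template = "[{source}] {content}"
print(template.format(**chunk))  # [notes.md] GPUs accelerate matmuls.

# Style 2: a callable that builds the string itself.
def fmt(c) -> str:
    return f"{c['content']} (from {c['source']})"

print(fmt(chunk))  # GPUs accelerate matmuls. (from notes.md)
```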
filter(chunks, query)
async
Filter the given chunks based on their relevance to the query using an LM.
This method processes chunks in batches, sending each batch to the LM for relevance determination. If LM processing fails for a batch, the behavior is determined by the 'on_failure_keep_all' attribute.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| chunks | list[Chunk] | The list of chunks to filter. | required |
| query | str | The query to compare chunks against. | required |
Returns:
| Type | Description |
|---|---|
| list[Chunk] | A list of chunks deemed relevant by the LM. |
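The batching and failure handling described above can be sketched as follows (assuming simple list slicing and a boolean-per-chunk LM result; the library's actual batching may differ):

```python
def batched(items: list, batch_size: int) -> list[list]:
    """Split items into consecutive batches of at most batch_size."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

def filter_with_fallback(batch: list[str], lm_call, on_failure_keep_all: bool) -> list[str]:
    """Run one LM call on a batch; on failure, keep or drop the whole batch."""
    try:
        keep_flags = lm_call(batch)  # one bool per chunk in the batch
        return [c for c, keep in zip(batch, keep_flags) if keep]
    except Exception:
        return list(batch) if on_failure_keep_all else []

print(batched(["a", "b", "c", "d", "e"], 2))  # [['a', 'b'], ['c', 'd'], ['e']]
```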
MetadataContextEnricher(metadata_fields, position=MetadataPosition.PREFIX, separator='\n---\n', field_template='- {field}: {value}', skip_empty=True, binary_handling=BinaryHandlingStrategy.BASE64)
Bases: BaseContextEnricher
A metadata context enricher that adds metadata to the chunk content.
This enricher formats metadata fields into a string and appends it to the chunk content based on the specified position (prefix or suffix).
Attributes:
| Name | Type | Description |
|---|---|---|
| metadata_fields | list[str] | List of metadata fields to include in the enriched content. |
| position | MetadataPosition | Position of the metadata in the content. Valid values are defined in the MetadataPosition enum: PREFIX places the metadata block before the content; SUFFIX places it after. |
| separator | str | Separator between the metadata and the content. |
| field_template | str | Template for formatting each metadata field. |
| skip_empty | bool | Whether to skip fields with empty values. |
| binary_handling | BinaryHandlingStrategy | Strategy for handling binary data. Valid values are defined in the BinaryHandlingStrategy enum: BASE64 converts binary data to base64 (default); HEX converts it to hexadecimal; NONE omits it from the metadata block. |
Initialize the metadata context enricher.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| metadata_fields | list[str] | List of metadata field names to include. | required |
| position | MetadataPosition | Where to place the metadata block. Valid values are defined in the MetadataPosition enum: "prefix" places the metadata block before the content; "suffix" places it after. | PREFIX |
| separator | str | String to separate metadata from content. | '\n---\n' |
| field_template | str | Template for formatting each metadata field. Available fields: {field} (field name) and {value} (field value). | '- {field}: {value}' |
| skip_empty | bool | Whether to skip empty fields. | True |
| binary_handling | BinaryHandlingStrategy | Strategy for handling binary data. Valid values are defined in the BinaryHandlingStrategy enum: "base64" converts binary data to base64 (default); "hex" converts it to hexadecimal; "none" omits it from the metadata block. | BASE64 |
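A minimal sketch of the enrichment behavior described above, using plain strings and dicts in place of Chunk objects (the defaults mirror the documented ones; this is not the library's actual implementation):

```python
import base64

def enrich(content: str, metadata: dict, position: str = "prefix",
           separator: str = "\n---\n", field_template: str = "- {field}: {value}",
           skip_empty: bool = True, binary_handling: str = "base64") -> str:
    """Format selected metadata and attach it before or after the content."""
    lines = []
    for field, value in metadata.items():
        if skip_empty and not value:
            continue  # drop empty fields when skip_empty is True
        if isinstance(value, bytes):
            if binary_handling == "base64":
                value = base64.b64encode(value).decode()
            elif binary_handling == "hex":
                value = value.hex()
            else:  # "none": omit binary fields entirely
                continue
        lines.append(field_template.format(field=field, value=value))
    if not lines:
        return content
    block = "\n".join(lines)
    return block + separator + content if position == "prefix" else content + separator + block

print(enrich("Body text.", {"title": "Doc", "empty": ""}))
# - title: Doc
# ---
# Body text.
```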
Repacker(method='forward', mode='chunk', delimiter='\n\n', size_func=lambda chunk: len(chunk.content), size_limit=None)
Bases: Component
A class for repacking chunks using various strategies.
Attributes:
| Name | Type | Description |
|---|---|---|
| method | RepackMethod | The method used for repacking. |
| mode | RepackerMode | The mode of operation (chunk or context). |
| delimiter | str | The delimiter used in context mode. |
| size_func | Callable[[Chunk], int] | Function used to measure the size of chunks. |
| size_limit | int \| None | The maximum allowed total size for the repacked chunks. |
Initialize the Repacker instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| method | str | The method used for repacking. | 'forward' |
| mode | str | The mode of operation (chunk or context). | 'chunk' |
| delimiter | str | The delimiter used in context mode. | '\n\n' |
| size_func | Callable[[Chunk], int] | Function used to measure the size of a single chunk. Defaults to the length of the chunk content. | lambda chunk: len(chunk.content) |
| size_limit | int \| None | The maximum allowed total size for the repacked chunks. Note: the size limit accounts only for the chunks, not the delimiter in context mode. | None |
repack(chunks)
async
Repack the given chunks using the configured method, mode, size function, and size limit.
SimilarityBasedRelevanceFilter(em_invoker, threshold=0.5)
Bases: BaseRelevanceFilter
Relevance filter that uses semantic similarity to determine chunk relevance.
Attributes:
| Name | Type | Description |
|---|---|---|
| em_invoker | BaseEMInvoker | The embedding model invoker to use for vectorization. |
| threshold | float | The similarity threshold for relevance (0 to 1). Defaults to 0.5. |
Initialize the SimilarityBasedRelevanceFilter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| em_invoker | BaseEMInvoker | The embedding model invoker to use for vectorization. | required |
| threshold | float | The similarity threshold for relevance (0 to 1). | 0.5 |
Raises:
| Type | Description |
|---|---|
| ValueError | If the threshold is not between 0 and 1. |
filter(chunks, query)
async
Filter the given chunks based on their semantic similarity to the query.
This method calculates the similarity between the query and each text chunk. For now, non-text chunks are excluded from processing and similarity calculation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| chunks | list[Chunk] | The list of chunks to filter. | required |
| query | str | The query to compare chunks against. | required |
Returns:
| Type | Description |
|---|---|
| list[Chunk] | A list of relevant text chunks. Non-text chunks are not included in the result. |
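The thresholding step can be sketched with plain cosine similarity over embedding vectors (illustrative only; the real filter obtains vectors via em_invoker and the similarity measure is an assumption):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def filter_by_similarity(query_vec, chunk_vecs, chunks, threshold=0.5):
    """Keep chunks whose embedding is at least `threshold` similar to the query."""
    return [c for c, v in zip(chunks, chunk_vecs)
            if cosine_similarity(query_vec, v) >= threshold]

query_vec = [1.0, 0.0]
chunk_vecs = [[1.0, 0.0], [0.0, 1.0]]  # similarities: 1.0 and 0.0
print(filter_by_similarity(query_vec, chunk_vecs, ["on-topic", "off-topic"]))
# ['on-topic']
```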