Compressor
Modules used to compress prompt components.
LLMLinguaCompressor(model_name='NousResearch/Llama-2-7b-hf', device_map='cuda', rate=0.5, target_token=-1, use_sentence_level_filter=False, use_context_level_filter=True, use_token_level_filter=True, rank_method='longllmlingua')
Bases: BaseCompressor
LLMLinguaCompressor is a wrapper for LongLLMLingua's PromptCompressor.
This class provides a simplified interface for using LongLLMLingua's compression capabilities within the GLLM series of libraries, with a focus on the 'longllmlingua' ranking method.
Initialize the LLMLinguaCompressor.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_name | str | The name of the language model to be used. Defaults to "NousResearch/Llama-2-7b-hf". | 'NousResearch/Llama-2-7b-hf' |
| device_map | str | The device to load the model onto, e.g., "cuda" for GPU. Defaults to "cuda". | 'cuda' |
| rate | float | The default compression rate to be used. Defaults to 0.5. | 0.5 |
| target_token | int | The default target token count. Defaults to -1 (no specific target). | -1 |
| use_sentence_level_filter | bool | Whether to use sentence-level filtering. Defaults to False. | False |
| use_context_level_filter | bool | Whether to use context-level filtering. Defaults to True. | True |
| use_token_level_filter | bool | Whether to use token-level filtering. Defaults to True. | True |
| rank_method | str | The ranking method to use. Recommended is "longllmlingua". Defaults to "longllmlingua". | 'longllmlingua' |
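Since `device_map` defaults to "cuda", it can help to choose the value at runtime on machines that may not have a GPU. The following is a minimal sketch only: `pick_device_map` is a hypothetical helper, not part of the library, and it assumes the wrapper also accepts "cpu", which this page does not confirm. The import path of `LLMLinguaCompressor` is not given here, so the constructor call is left commented out.

```python
import importlib.util


def pick_device_map() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu".

    Hypothetical helper for illustration; not part of the library.
    """
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed, so there is no GPU backend to query
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


# Keyword arguments mirroring the documented constructor signature.
compressor_kwargs = {
    "model_name": "NousResearch/Llama-2-7b-hf",
    "device_map": pick_device_map(),  # assumption: "cpu" is an accepted value
    "rate": 0.5,
    "rank_method": "longllmlingua",
}

# compressor = LLMLinguaCompressor(**compressor_kwargs)  # import path depends on your GLLM package
```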
compress(context, query, instruction=None, options=None)
async
Compress the given context based on the query and optional instruction.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| context | str | The context to be compressed. | required |
| query | str | The query related to the context. | required |
| instruction | str \| None | An optional instruction to be considered during compression. Defaults to None. | None |
| options | dict[str, Any] \| None | Additional options for fine-tuning the compression process. Supported keys: rate, target_token, use_sentence_level_filter, use_context_level_filter, use_token_level_filter, rank_method. Defaults to None. | None |
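The supported `options` keys match the constructor parameters, which suggests that per-call options override the instance defaults. The sketch below illustrates that reading; it is an assumption about the behavior, not the library's implementation. `resolve_options` and `SUPPORTED_OPTION_KEYS` are hypothetical names, and rejecting unknown keys with `ValueError` is a choice made for this example (the real class may ignore them instead).

```python
from typing import Any, Dict, Optional

# Keys this page lists as supported in `options`.
SUPPORTED_OPTION_KEYS = frozenset({
    "rate",
    "target_token",
    "use_sentence_level_filter",
    "use_context_level_filter",
    "use_token_level_filter",
    "rank_method",
})


def resolve_options(
    defaults: Dict[str, Any], options: Optional[Dict[str, Any]]
) -> Dict[str, Any]:
    """Overlay per-call options on constructor defaults (hypothetical helper)."""
    options = options or {}
    unknown = set(options) - SUPPORTED_OPTION_KEYS
    if unknown:
        # Assumption: unknown keys are treated as an error in this sketch.
        raise ValueError(f"Unsupported option keys: {sorted(unknown)}")
    # Later entries win, so per-call values replace the defaults.
    return {**defaults, **options}


# Defaults taken from the documented constructor signature.
defaults = {
    "rate": 0.5,
    "target_token": -1,
    "use_sentence_level_filter": False,
    "use_context_level_filter": True,
    "use_token_level_filter": True,
    "rank_method": "longllmlingua",
}

resolved = resolve_options(defaults, {"rate": 0.3, "target_token": 200})
```

Under this reading, a call that passes `options={"rate": 0.3, "target_token": 200}` would use those two values and keep the instance defaults for everything else.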
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | The compressed context string. |
Raises:
| Type | Description |
|---|---|
| ValueError | If the compression process fails. |
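Because `compress` is a coroutine that may raise `ValueError`, callers need to await it and decide what to do on failure. The sketch below shows one possible pattern, falling back to the uncompressed context. `StubCompressor` is a stand-in that only copies the documented `compress` signature so the example runs without loading a model; its word-truncation logic is invented for illustration and is not how LongLLMLingua compresses text. `compress_or_fallback` is likewise a hypothetical helper.

```python
import asyncio
from typing import Any, Dict, Optional


class StubCompressor:
    """Stand-in with the documented `compress` signature; NOT the real class.

    It keeps the leading `rate` fraction of words so the example runs anywhere.
    """

    def __init__(self, rate: float = 0.5) -> None:
        self.rate = rate

    async def compress(
        self,
        context: str,
        query: str,
        instruction: Optional[str] = None,
        options: Optional[Dict[str, Any]] = None,
    ) -> str:
        rate = (options or {}).get("rate", self.rate)
        if not 0 < rate <= 1:
            raise ValueError(f"rate must be in (0, 1], got {rate}")
        words = context.split()
        keep = max(1, int(len(words) * rate))
        return " ".join(words[:keep])


async def compress_or_fallback(
    compressor: Any, context: str, query: str, **kwargs: Any
) -> str:
    """Return the compressed context, or the original context if compression fails."""
    try:
        return await compressor.compress(context, query, **kwargs)
    except ValueError:
        return context  # fall back to the uncompressed text


context = "Paris is the capital of France. It lies on the Seine."
query = "What is the capital of France?"
result = asyncio.run(compress_or_fallback(StubCompressor(), context, query))
```

With the real `LLMLinguaCompressor`, the same `compress_or_fallback` wrapper should apply unchanged, since it relies only on the documented signature and the documented `ValueError`.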