Utils
Defines utilities for gllm_inference.
get_basic_auth_headers(username, password)
Generates the headers required for Basic Authentication.
This method creates a header dictionary for the Basic Authentication scheme. It encodes the provided username and password in Base64 and places the result in the HTTP 'Authorization' header.
Returns:
Type | Description |
---|---|
`dict[str, str] \| None` | A dictionary containing the 'Authorization' header with Base64 encoded credentials. |
Returns |
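The encoding step described above can be sketched with the standard library alone. This is a minimal illustration of the Basic Authentication scheme, not the `gllm_inference` implementation: the helper name `build_basic_auth_headers` is hypothetical, and returning `None` for empty credentials is an assumption based only on the `dict[str, str] | None` return type.

```python
import base64
from typing import Optional


def build_basic_auth_headers(username: str, password: str) -> Optional[dict[str, str]]:
    """Hypothetical stand-in that mirrors the documented behaviour."""
    # Assumption: empty credentials yield None (the real rule is not documented here).
    if not username or not password:
        return None
    # Basic auth joins the credentials with a colon, then Base64-encodes the bytes.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


headers = build_basic_auth_headers("user", "pass")
print(headers)
```

The resulting dictionary can be passed as the `headers` argument of most HTTP clients.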
get_mime_type(content)
Determines the MIME type of the provided content.
This method determines the MIME type of the provided content by using the `magic` library.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
content | `bytes` | The content whose MIME type is to be determined. | required |
Returns:
Name | Type | Description |
---|---|---|
str | `str` | The MIME type of the content. |
get_prompt_keys(template)
Extracts keys from a template string based on a regex pattern.
This function searches the template for placeholders enclosed in single curly braces `{}` and ignores any placeholders within double curly braces `{{}}`. It returns a set of the unique keys found.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
template | `str` | The template string containing placeholders. | required |
Returns:
Type | Description |
---|---|
`set[str]` | A set of keys extracted from the template. |
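The single-brace versus double-brace rule can be illustrated with a short regex sketch. The pattern and the helper name `extract_prompt_keys` are assumptions made for illustration; the pattern actually used by `get_prompt_keys` is not shown in this reference.

```python
import re

# Assumed pattern: a single-brace placeholder that is not part of a {{...}} pair.
# The lookbehind rejects a preceding "{" and the lookahead rejects a trailing "}".
PLACEHOLDER = re.compile(r"(?<!\{)\{([^{}]+)\}(?!\})")


def extract_prompt_keys(template: str) -> set[str]:
    """Hypothetical stand-in: collect the unique single-brace placeholder names."""
    return set(PLACEHOLDER.findall(template))


keys = extract_prompt_keys("Hi {name}, keep {{literal}} as is. Topic: {topic}, again {name}.")
print(sorted(keys))
```

Returning a set means a key that appears several times in the template is reported once.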
invoke_google_multimodal_lm(client, messages, hyperparameters, event_emitter)
async
Invokes the Google multimodal language model with the provided messages and hyperparameters.
This method sends the input messages to the model using the given hyperparameters. It handles both standard and streaming invocation. Streaming mode is enabled if an event emitter is provided.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
client | `Any` | The Google client instance. This could either be: 1. A | required |
messages | `list[dict[str, Any]]` | The input messages to be sent to the model. | required |
hyperparameters | `dict[str, Any]` | A dictionary of hyperparameters for the model. | required |
event_emitter | `EventEmitter \| None` | The event emitter for streaming tokens. If provided, streaming invocation is enabled; pass None for standard invocation. | required |
Returns:
Name | Type | Description |
---|---|---|
str | `str` | The generated response from the model. |
is_local_file_path(content, valid_extensions)
Checks if the content is a local file path.
This method checks if the content is a local file path by verifying that the content:
1. Is a string.
2. Is a valid existing file path.
3. Has a valid extension defined in the `valid_extensions` set.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
content | `Any` | The content to check. | required |
valid_extensions | `set[str]` | The set of valid extensions. | required |
Returns:
Name | Type | Description |
---|---|---|
bool | `bool` | True if the content is a local file path with a valid extension, False otherwise. |
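The three checks listed above can be sketched as follows. The helper name `looks_like_local_file` is hypothetical, and the comparison rule (extensions given without the leading dot and matched case-insensitively) is an assumption, since this reference does not specify the expected format of `valid_extensions`.

```python
import os
import tempfile
from typing import Any


def looks_like_local_file(content: Any, valid_extensions: set[str]) -> bool:
    """Hypothetical stand-in for the three documented checks."""
    if not isinstance(content, str):   # 1. must be a string
        return False
    if not os.path.isfile(content):    # 2. must point to an existing file
        return False
    # 3. extension check. Assumption: no leading dot, case-insensitive.
    extension = os.path.splitext(content)[1].lstrip(".").lower()
    return extension in valid_extensions


with tempfile.TemporaryDirectory() as tmp_dir:
    image_path = os.path.join(tmp_dir, "sample.png")
    with open(image_path, "wb") as handle:
        handle.write(b"placeholder bytes")
    local_results = (
        looks_like_local_file(image_path, {"png", "jpg"}),                      # allowed extension
        looks_like_local_file(image_path, {"pdf"}),                             # wrong extension
        looks_like_local_file(os.path.join(tmp_dir, "missing.png"), {"png"}),   # no such file
        looks_like_local_file(b"sample.png", {"png"}),                          # not a string
    )
print(local_results)
```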
is_remote_file_path(content, valid_extensions)
Checks if the content is a remote file path.
This method checks if the content is a remote file path by verifying that the content:
1. Is a string.
2. Is a URL with a valid scheme of `http` or `https`.
3. Has a valid extension defined in the `valid_extensions` set.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
content | `Any` | The content to check. | required |
valid_extensions | `set[str]` | The set of valid extensions. | required |
Returns:
Name | Type | Description |
---|---|---|
bool | `bool` | True if the content is a remote file path with a valid extension, False otherwise. |
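A comparable sketch for the remote case uses `urllib.parse` to check the scheme and read the extension from the URL path. The helper name `looks_like_remote_file` is hypothetical; requiring a non-empty host and comparing extensions without the leading dot are assumptions, not documented behaviour.

```python
from pathlib import PurePosixPath
from typing import Any
from urllib.parse import urlparse


def looks_like_remote_file(content: Any, valid_extensions: set[str]) -> bool:
    """Hypothetical stand-in for the three documented checks."""
    if not isinstance(content, str):                       # 1. must be a string
        return False
    parsed = urlparse(content)
    # 2. scheme must be http or https (assumption: a host must also be present)
    if parsed.scheme not in {"http", "https"} or not parsed.netloc:
        return False
    # 3. the extension is taken from the path only, so query strings are ignored
    extension = PurePosixPath(parsed.path).suffix.lstrip(".").lower()
    return extension in valid_extensions


remote_results = (
    looks_like_remote_file("https://example.com/a/photo.JPG?size=large", {"jpg"}),
    looks_like_remote_file("ftp://example.com/photo.jpg", {"jpg"}),
    looks_like_remote_file("https://example.com/page", {"jpg"}),
    looks_like_remote_file("photo.jpg", {"jpg"}),
)
print(remote_results)
```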
is_valid_extension(content, valid_extensions)
Checks if the content has a valid extension.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
content | `str` | The content to check. | required |
valid_extensions | `set[str]` | The set of valid extensions. | required |
Returns:
Name | Type | Description |
---|---|---|
bool | `bool` | True if the content has a valid extension, False otherwise. |
load_google_vertexai_project_id(credentials_path)
Loads the Google Vertex AI project ID from the credentials file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
credentials_path | `str` | The path to the credentials file. | required |
Returns:
Name | Type | Description |
---|---|---|
str | `str` | The Google Vertex AI project ID. |
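As a rough illustration, and assuming the credentials file is a Google service account key in JSON format (such files store the project under the `project_id` key), the lookup could be as small as the sketch below. The helper name `read_project_id` is hypothetical, and the file written here is a dummy used only to make the example runnable.

```python
import json
import os
import tempfile


def read_project_id(credentials_path: str) -> str:
    """Hypothetical stand-in: return the project_id stored in a credentials JSON file."""
    with open(credentials_path, encoding="utf-8") as handle:
        credentials = json.load(handle)
    # Assumption: the file is a service account key with a "project_id" entry.
    return credentials["project_id"]


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "credentials.json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"type": "service_account", "project_id": "demo-project"}, handle)
    project_id = read_project_id(path)
print(project_id)
```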
load_langchain_model(model_class_path, model_name, model_kwargs)
Loads a LangChain model instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_class_path | `str` | The import path of the LangChain class, e.g. "langchain_openai.ChatOpenAI". | required |
model_name | `str` | The model name. | required |
model_kwargs | `dict[str, Any]` | The additional keyword arguments. | required |
Returns:
Type | Description |
---|---|
`BaseChatModel \| Embeddings` | The LangChain model instance. |
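Loading a class from a dotted path such as "langchain_openai.ChatOpenAI" is typically done with `importlib`. The sketch below shows that general technique under stated assumptions: the helper names `load_class` and `build_instance` are hypothetical, and how the real function passes `model_name` to the class is not documented here, so it is left out. A standard-library class is used so the example runs without LangChain installed.

```python
import importlib
from typing import Any


def load_class(class_path: str) -> type:
    """Hypothetical helper: resolve 'package.module.ClassName' to the class object."""
    module_path, _, class_name = class_path.rpartition(".")
    if not module_path:
        raise ValueError(f"Expected 'package.module.ClassName', got {class_path!r}")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


def build_instance(class_path: str, **kwargs: Any) -> Any:
    """Hypothetical helper: instantiate the resolved class with keyword arguments."""
    return load_class(class_path)(**kwargs)


# Stand-in for a LangChain class: fractions.Fraction from the standard library.
fraction = build_instance("fractions.Fraction", numerator=3, denominator=4)
print(fraction)
```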
parse_model_data(model)
Parses the model data from a LangChain BaseChatModel or Embeddings instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | `BaseChatModel \| Embeddings` | The LangChain BaseChatModel or Embeddings instance. | required |
Returns:
Type | Description |
---|---|
`dict[str, str]` | The dictionary containing the model name and path. |
Raises:
Type | Description |
---|---|
`ValueError` | If the model name is not found in the model data. |
preprocess_tei_input(texts, prefix)
Preprocesses TEI input texts by replacing newline characters with spaces and adding the prefix to each text.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
texts | `list[str]` | The list of texts to preprocess. | required |
prefix | `str` | The prefix to add to the text. | required |
Returns:
Type | Description |
---|---|
`list[str]` | The list of preprocessed texts. |
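The two documented steps, newline replacement and prefixing, fit in a single list comprehension. The helper name `preprocess_texts` is hypothetical, and prepending the prefix as-is with no extra separator is an assumption.

```python
def preprocess_texts(texts: list[str], prefix: str) -> list[str]:
    """Hypothetical stand-in for the documented preprocessing."""
    # Assumption: the prefix is prepended unchanged, with no separator inserted.
    return [prefix + text.replace("\n", " ") for text in texts]


processed = preprocess_texts(["first line\nsecond line", "single line"], "query: ")
print(processed)
```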
validate_prompt_builder_kwargs(prompt_key_set, ignore_extra_keys=False, **kwargs)
Validates that the provided kwargs match the expected prompt keys exactly.
This helper function checks if the provided kwargs contain all and only the keys required by the prompt templates.
If any required key is missing, or if extra keys are present and `ignore_extra_keys` is False, it raises a `ValueError`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
prompt_key_set | `set[str]` | The set of required prompt keys. | required |
ignore_extra_keys | `bool` | Whether to ignore extra keys. Defaults to False. | `False` |
**kwargs | `Any` | The keyword arguments to be validated against the required prompt keys. | `{}` |
Raises:
Type | Description |
---|---|
`ValueError` | If any required key is missing from the kwargs, or if any extra key is present in them. |
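The validation reduces to two set differences. The sketch below is a minimal illustration and not the library's code: the helper name `check_prompt_kwargs` and the error messages are hypothetical, and skipping the extra-key check when `ignore_extra_keys` is True is inferred from the parameter description.

```python
from typing import Any


def check_prompt_kwargs(prompt_key_set: set[str], ignore_extra_keys: bool = False, **kwargs: Any) -> None:
    """Hypothetical stand-in for the documented validation."""
    provided = set(kwargs)
    missing = prompt_key_set - provided   # required keys that were not supplied
    extra = provided - prompt_key_set     # supplied keys that are not required
    if missing:
        raise ValueError(f"Missing required prompt keys: {sorted(missing)}")
    if extra and not ignore_extra_keys:
        raise ValueError(f"Unexpected prompt keys: {sorted(extra)}")


check_prompt_kwargs({"name", "topic"}, name="Ada", topic="math")  # passes silently

try:
    check_prompt_kwargs({"name", "topic"}, name="Ada", tone="formal")
except ValueError as error:
    print(error)
```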
validate_string_enum(enum_type, value)
Validates that the provided value is a valid string enum value.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
enum_type | `type[StrEnum]` | The type of the string enum. | required |
value | `str` | The value to validate. | required |
Raises:
Type | Description |
---|---|
`ValueError` | If the provided value is not a valid string enum value. |
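A membership check against the enum's values is enough to illustrate this validation. In the sketch below, `OutputMode` and `check_string_enum` are hypothetical names, and the enum is declared as `class OutputMode(str, Enum)` rather than with `StrEnum` so that the example also runs on Python versions older than 3.11.

```python
from enum import Enum


class OutputMode(str, Enum):
    """Hypothetical string enum used only for this example."""

    TEXT = "text"
    JSON = "json"


def check_string_enum(enum_type: type[Enum], value: str) -> None:
    """Hypothetical stand-in: raise ValueError if value is not one of the enum's values."""
    valid_values = [member.value for member in enum_type]
    if value not in valid_values:
        raise ValueError(f"Invalid value {value!r}; expected one of {valid_values}")


check_string_enum(OutputMode, "json")  # passes silently

try:
    check_string_enum(OutputMode, "xml")
except ValueError as error:
    print(error)
```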