Multimodal EM Invoker
Modules concerning the multimodal embedding model invokers used in Gen AI applications.
GoogleVertexAIMultimodalEMInvoker(model_name, credentials_path, project_id=None, location='us-central1', embedding_dimension=1408)
Bases: BaseMultimodalEMInvoker[str | bytes]
A class to interact with multimodal embedding models hosted through Google's Vertex AI API endpoints.

The `GoogleVertexAIMultimodalEMInvoker` class is responsible for invoking a multimodal embedding model using the Google Vertex AI API. It uses the multimodal embedding model to transform a single content item or a list of content items into their vector representations.
Attributes:
| Name | Type | Description |
|---|---|---|
| `model` | `MultiModalEmbeddingModel` | The multimodal embedding model to be used for embedding the input content. |
| `embedding_dimension` | `int` | The dimension of the embedding vector. |
Notes
In order to use the `GoogleVertexAIMultimodalEMInvoker`, a credentials JSON file for a Google Cloud service account with the Vertex AI API enabled must be provided. For more information on how to create the credentials file, please refer to the following pages:
1. https://cloud.google.com/docs/authentication/application-default-credentials
2. https://developers.google.com/workspace/guides/create-credentials
The `GoogleVertexAIMultimodalEMInvoker` currently supports the following content types:
1. Text, which can be passed as plain strings.
2. Image, which can be passed as:
    1. Base64 encoded image bytes.
    2. URL pointing to an image.
    3. Local image file path.
3. Video, which can be passed as:
    1. Base64 encoded video bytes.
    2. URL pointing to a video.
    3. Local video file path.
Initializes a new instance of the GoogleVertexAIMultimodalEMInvoker class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model_name` | `str` | The name of the multimodal embedding model to be used. | *required* |
| `credentials_path` | `str` | The path to the Google Cloud service account credentials JSON file. | *required* |
| `project_id` | `str \| None` | The Google Cloud project ID. Defaults to None, in which case the project ID will be loaded from the credentials file. | `None` |
| `location` | `str` | The location of the Google Cloud project. Defaults to "us-central1". | `'us-central1'` |
| `embedding_dimension` | `int` | The dimension of the embedding vector. Defaults to 1408. | `1408` |
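Below is a minimal usage sketch. The import path and the `invoke` call are assumptions about the surrounding invoker API rather than details stated on this page, and the model name `multimodalembedding@001` is Vertex AI's standard multimodal embedding model; adjust all of these to your setup.

```python
# Minimal sketch. The import path and the `invoke` method are assumptions
# about the surrounding library API; adjust both to match your installation.
from gllm_inference.em_invoker import GoogleVertexAIMultimodalEMInvoker  # assumed path

em_invoker = GoogleVertexAIMultimodalEMInvoker(
    model_name="multimodalembedding@001",          # Vertex AI multimodal embedding model
    credentials_path="/path/to/credentials.json",  # service account with the Vertex AI API enabled
    location="us-central1",
    embedding_dimension=1408,
)

# Contents may be plain text, base64 encoded image/video bytes, a URL,
# or a local file path, per the supported content types listed above.
contents = [
    "a photo of a red bicycle",
    "https://example.com/bicycle.jpg",
]

# Assumed invocation; the method may be asynchronous and need to be awaited.
embeddings = em_invoker.invoke(contents)  # one 1408-dimensional vector per content item
```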
TwelveLabsMultimodalEMInvoker(model_name, api_key, video_status_check_interval=DEFAULT_VIDEO_STATUS_CHECK_INTERVAL)
Bases: BaseMultimodalEMInvoker[str | bytes]
A class to interact with multimodal embedding models hosted through TwelveLabs API endpoints.
The `TwelveLabsMultimodalEMInvoker` class is responsible for invoking a multimodal embedding model using the TwelveLabs API. It uses the multimodal embedding model to transform a single content item or a list of content items into their vector representations.
Attributes:
| Name | Type | Description |
|---|---|---|
| `client` | `TwelveLabs` | The client for the TwelveLabs API. |
| `model_name` | `str` | The name of the multimodal embedding model to be used. |
Notes
The `TwelveLabsMultimodalEMInvoker` currently supports the following content types:
1. Text, which can be passed as plain strings.
2. Audio, which can be passed as:
    1. Base64 encoded audio bytes.
    2. URL pointing to an audio file.
    3. Local audio file path.
3. Image, which can be passed as:
    1. Base64 encoded image bytes.
    2. URL pointing to an image.
    3. Local image file path.
4. Video, which can be passed as:
    1. URL pointing to a video.
    2. Local video file path.
Initializes a new instance of the TwelveLabsMultimodalEMInvoker class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model_name` | `str` | The name of the multimodal embedding model to be used. | *required* |
| `api_key` | `str` | The API key for the TwelveLabs API. | *required* |
| `video_status_check_interval` | `int` | The interval in seconds to check the status of the video embedding task. Defaults to `DEFAULT_VIDEO_STATUS_CHECK_INTERVAL`. | `DEFAULT_VIDEO_STATUS_CHECK_INTERVAL` |
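Below is a minimal usage sketch along the same lines. The import path, the `invoke` call, and the model name are assumptions rather than details stated on this page; video embedding on TwelveLabs runs as a task, which is why the invoker polls its status at `video_status_check_interval`.

```python
# Minimal sketch. The import path, the `invoke` method, and the model name are
# assumptions about the surrounding library API; adjust them to your setup.
from gllm_inference.em_invoker import TwelveLabsMultimodalEMInvoker  # assumed path

em_invoker = TwelveLabsMultimodalEMInvoker(
    model_name="Marengo-retrieval-2.7",  # illustrative TwelveLabs embedding model name
    api_key="tlk-...",                   # your TwelveLabs API key
    video_status_check_interval=5,       # poll the video embedding task every 5 seconds
)

# Text, audio, image, and video contents are accepted; video content triggers a
# video embedding task whose status is polled at the configured interval.
contents = [
    "a person riding a bicycle",
    "/path/to/local_video.mp4",
]

# Assumed invocation; the method may be asynchronous and need to be awaited.
embeddings = em_invoker.invoke(contents)  # one embedding vector per content item
```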