Multimodal EM Invoker

Modules concerning the multimodal embedding model invokers used in Gen AI applications.

GoogleVertexAIMultimodalEMInvoker(model_name, credentials_path, project_id=None, location='us-central1', embedding_dimension=1408)

Bases: BaseMultimodalEMInvoker[str | bytes]

A class to interact with multimodal embedding models hosted through Google's Vertex AI API endpoints.

The GoogleVertexAIMultimodalEMInvoker class invokes a multimodal embedding model through the Google Vertex AI API. It uses the model to transform a single content item, or a list of content items, into vector representations.

Attributes:

model (MultiModalEmbeddingModel): The multimodal embedding model to be used for embedding the input content.
embedding_dimension (int): The dimension of the embedding vector.

Notes

In order to use the GoogleVertexAIMultimodalEMInvoker, a credentials JSON file for a Google Cloud service account with the Vertex AI API enabled must be provided. For more information on how to create the credentials file, please refer to the following pages:

1. https://cloud.google.com/docs/authentication/application-default-credentials
2. https://developers.google.com/workspace/guides/create-credentials

The GoogleVertexAIMultimodalEMInvoker currently supports the following contents:

1. Text, which can be passed as plain strings.
2. Image, which can be passed as:
    1. Base64 encoded image bytes.
    2. URL pointing to an image.
    3. Local image file path.
3. Video, which can be passed as:
    1. Base64 encoded video bytes.
    2. URL pointing to a video.
    3. Local video file path.
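Since every content item arrives as either str or bytes, a caller may want to check which of the forms listed above a given value falls into before sending it. The following is a minimal sketch of one way to do that. The helper name classify_content and the precedence it applies (bytes, then URL, then local file, then plain text) are assumptions made for illustration and are not taken from the invoker's implementation.

```python
from __future__ import annotations

import os


def classify_content(content: str | bytes) -> str:
    """Return a coarse label for a content item.

    Hypothetical helper: the order of checks is an assumption,
    not the invoker's documented behaviour.
    """
    if isinstance(content, (bytes, bytearray)):
        # Binary payloads (for example Base64 encoded image bytes).
        return "bytes"
    if content.startswith(("http://", "https://")):
        # Strings that look like web URLs.
        return "url"
    if os.path.isfile(content):
        # Strings that name an existing local file.
        return "local_path"
    # Anything else is treated as plain text.
    return "text"
```

A string that happens to match the name of an existing file would be labelled as a local path under this ordering, so text inputs that could collide with file names deserve extra care.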

Initializes a new instance of the GoogleVertexAIMultimodalEMInvoker class.

Parameters:

model_name (str, required): The name of the multimodal embedding model to be used.
credentials_path (str, required): The path to the Google Cloud service account credentials JSON file.
project_id (str | None, default: None): The Google Cloud project ID. Defaults to None, in which case the project ID will be loaded from the credentials file.
location (str, default: 'us-central1'): The Google Cloud location (region) to use for Vertex AI requests. Defaults to "us-central1".
embedding_dimension (int, default: 1408): The dimension of the embedding vector. Defaults to 1408.
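The project_id fallback described above can be illustrated with a short sketch. It assumes the credentials file is a standard service account key, which carries a top-level "project_id" field. The helper name build_vertex_invoker_kwargs is hypothetical, and the library may resolve the project ID differently.

```python
from __future__ import annotations

import json
from pathlib import Path


def build_vertex_invoker_kwargs(
    model_name: str,
    credentials_path: str,
    project_id: str | None = None,
    location: str = "us-central1",
    embedding_dimension: int = 1408,
) -> dict:
    """Collect constructor arguments using the documented defaults.

    Hypothetical helper: mirrors the documented fallback by reading
    "project_id" from the credentials file when none is given.
    """
    if project_id is None:
        with Path(credentials_path).open(encoding="utf-8") as handle:
            project_id = json.load(handle)["project_id"]
    return {
        "model_name": model_name,
        "credentials_path": credentials_path,
        "project_id": project_id,
        "location": location,
        "embedding_dimension": embedding_dimension,
    }
```

The resulting dictionary uses the same parameter names as the constructor signature, so it could be unpacked as GoogleVertexAIMultimodalEMInvoker(**kwargs) once the class has been imported.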

TwelveLabsMultimodalEMInvoker(model_name, api_key, video_status_check_interval=DEFAULT_VIDEO_STATUS_CHECK_INTERVAL)

Bases: BaseMultimodalEMInvoker[str | bytes]

A class to interact with multimodal embedding models hosted through TwelveLabs API endpoints.

The TwelveLabsMultimodalEMInvoker class invokes a multimodal embedding model through the TwelveLabs API. It uses the model to transform a single content item, or a list of content items, into vector representations.

Attributes:

client (TwelveLabs): The client for the TwelveLabs API.
model_name (str): The name of the multimodal embedding model to be used.

Notes

The TwelveLabsMultimodalEMInvoker currently supports the following contents:

1. Text, which can be passed as plain strings.
2. Audio, which can be passed as:
    1. Base64 encoded audio bytes.
    2. URL pointing to an audio file.
    3. Local audio file path.
3. Image, which can be passed as:
    1. Base64 encoded image bytes.
    2. URL pointing to an image.
    3. Local image file path.
4. Video, which can be passed as:
    1. URL pointing to a video.
    2. Local video file path.
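The list above can be expressed as a small lookup table, which makes the one asymmetry easy to see: video is the only modality that is not accepted as Base64 encoded bytes. This is a sketch only. The names TWELVELABS_SUPPORTED_FORMS and is_supported, and the string labels used for each form, are hypothetical and not part of the library.

```python
# Hypothetical lookup table that restates the supported contents listed above.
TWELVELABS_SUPPORTED_FORMS = {
    "text": {"string"},
    "audio": {"base64_bytes", "url", "local_path"},
    "image": {"base64_bytes", "url", "local_path"},
    "video": {"url", "local_path"},
}


def is_supported(modality: str, form: str) -> bool:
    """Return True if the given modality accepts the given input form."""
    return form in TWELVELABS_SUPPORTED_FORMS.get(modality, set())
```

A check such as is_supported("video", "base64_bytes") can be run before building a request, so that unsupported inputs are rejected early rather than after an API call.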

Initializes a new instance of the TwelveLabsMultimodalEMInvoker class.

Parameters:

model_name (str, required): The name of the multimodal embedding model to be used.
api_key (str, required): The API key for the TwelveLabs API.
video_status_check_interval (int, default: DEFAULT_VIDEO_STATUS_CHECK_INTERVAL): The interval, in seconds, between status checks of the video embedding task. Defaults to DEFAULT_VIDEO_STATUS_CHECK_INTERVAL.
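The video_status_check_interval parameter implies that video embedding runs as an asynchronous task that is polled until it finishes. The sketch below shows a generic polling loop of that kind. The function wait_for_task, the get_status callback, the timeout parameter, and the status strings "processing", "ready", and "failed" are all assumptions for illustration. They do not describe how the invoker or the TwelveLabs client is implemented.

```python
import time
from typing import Callable


def wait_for_task(
    get_status: Callable[[], str],
    check_interval: float,
    timeout: float = 600.0,
) -> str:
    """Poll get_status until it reports a terminal state.

    Hypothetical helper: the status names are placeholders.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in ("ready", "failed"):
            # Terminal state reached, stop polling.
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not finish before the timeout")
        # Wait for the configured interval before checking again.
        time.sleep(check_interval)
```

A shorter interval detects completion sooner at the cost of more status requests, while a longer one reduces request volume but adds latency once the task has finished.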