OpenAI EM invoker

Defines a module to interact with OpenAI embedding models.

Authors

Henry Wicaksono (henry.wicaksono@gdplabs.id)

References

[1] https://platform.openai.com/docs/api-reference/embeddings/create

OpenAIEMInvoker(model_name, api_key=None, model_kwargs=None, default_hyperparameters=None, retry_config=None)

Bases: BaseEMInvoker

An embedding model invoker to interact with OpenAI embedding models.

Attributes:

model_id (str): The model ID of the embedding model.

model_provider (str): The provider of the embedding model.

model_name (str): The name of the embedding model.

client (AsyncOpenAI): The client for the OpenAI API.

default_hyperparameters (dict[str, Any]): Default hyperparameters for invoking the embedding model.

retry_config (RetryConfig): The retry configuration for the embedding model.

Input types

The OpenAIEMInvoker only supports text inputs.

Output format

The OpenAIEMInvoker can embed either:

1. A single content.
   1. A single content is a single text.
   2. The output will be a Vector, representing the embedding of the content.

# Example 1: Embedding a text content.
text = "This is a text"
result = await em_invoker.invoke(text)

The above example returns a Vector with a size of (embedding_size,).

2. A list of contents.
   1. A list of contents is a list of texts.
   2. The output will be a list[Vector], where each element is a Vector representing the embedding of each single content.

# Example 2: Embedding a list of contents.
text1 = "This is a text"
text2 = "This is another text"
text3 = "This is yet another text"
result = await em_invoker.invoke([text1, text2, text3])

The above example returns a list[Vector] with a size of (3, embedding_size).
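The two output shapes can be tried without network access using a small stand-in. The sketch below is not the library's implementation: StubEMInvoker is a hypothetical class, the hash-based values are placeholders rather than real embeddings, and Vector is assumed to behave like a plain list of floats.

```python
from __future__ import annotations

import asyncio
import hashlib

# Assumption: a Vector is treated here as a plain list of floats.
Vector = list[float]


class StubEMInvoker:
    """Hypothetical stand-in that mimics the single/list dispatch described above."""

    def __init__(self, embedding_size: int = 8) -> None:
        # SHA-256 yields 32 bytes, so this sketch supports sizes up to 32.
        self.embedding_size = embedding_size

    def _embed_one(self, text: str) -> Vector:
        # Derive deterministic placeholder values from a hash of the text.
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return [byte / 255.0 for byte in digest[: self.embedding_size]]

    async def invoke(self, content: str | list[str]) -> Vector | list[Vector]:
        # A single text yields one Vector; a list of texts yields a list[Vector].
        if isinstance(content, str):
            return self._embed_one(content)
        return [self._embed_one(text) for text in content]


async def main() -> tuple[Vector, list[Vector]]:
    em_invoker = StubEMInvoker(embedding_size=8)
    single = await em_invoker.invoke("This is a text")
    batch = await em_invoker.invoke(["This is a text", "This is another text"])
    return single, batch


single, batch = asyncio.run(main())
print(len(single))                # length of the single Vector
print(len(batch), len(batch[0]))  # number of Vectors, length of each
```

Because the stub is deterministic, the first element of the batch equals the single result for the same text, which makes it convenient for unit tests of code that consumes embeddings.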

Retry and timeout

The OpenAIEMInvoker supports retry and timeout configuration. By default, max_retries is 0 and the timeout is 30.0 seconds. Both can be customized by passing a RetryConfig object as the retry_config parameter.

Retry config examples:

retry_config = RetryConfig(max_retries=0, timeout=0.0)  # No retry, no timeout
retry_config = RetryConfig(max_retries=0, timeout=10.0)  # No retry, 10.0 seconds timeout
retry_config = RetryConfig(max_retries=5, timeout=0.0)  # 5 max retries, no timeout
retry_config = RetryConfig(max_retries=5, timeout=10.0)  # 5 max retries, 10.0 seconds timeout

Usage example:

em_invoker = OpenAIEMInvoker(..., retry_config=retry_config)
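To make the meaning of the two settings concrete, here is a minimal sketch of how a retry loop with a per-attempt timeout might behave. It is an illustration under stated assumptions, not the library's implementation: SketchRetryConfig and call_with_retry are hypothetical names, a timeout of 0.0 is treated as "no timeout" to match the comments in the examples above, and the actual invoker may add backoff or retry only on specific errors.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


@dataclass
class SketchRetryConfig:
    """Hypothetical stand-in for RetryConfig with only the two fields shown above."""

    max_retries: int = 0
    timeout: float = 30.0  # assumption: 0.0 means "no timeout"


async def call_with_retry(func: Callable[[], Awaitable[T]], config: SketchRetryConfig) -> T:
    timeout = config.timeout if config.timeout > 0 else None
    last_error = None
    # One initial attempt plus up to max_retries further attempts.
    for _ in range(config.max_retries + 1):
        try:
            return await asyncio.wait_for(func(), timeout=timeout)
        except Exception as error:  # also covers asyncio.TimeoutError
            last_error = error
    assert last_error is not None, "max_retries must be >= 0"
    raise last_error


attempts = 0


async def flaky_embed() -> list[float]:
    """Fails twice with a transient error, then succeeds."""
    global attempts
    attempts += 1
    if attempts < 3:
        raise ConnectionError("transient failure")
    return [0.1, 0.2, 0.3]


config = SketchRetryConfig(max_retries=5, timeout=10.0)
result = asyncio.run(call_with_retry(flaky_embed, config))
print(result, attempts)
```

With max_retries=0 the same call would surface the first ConnectionError immediately, which is the documented default behavior of no retry.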

Initializes a new instance of the OpenAIEMInvoker class.

Parameters:

model_name (str, required): The name of the OpenAI embedding model to be used.

api_key (str | None, default None): The API key for the OpenAI API. Defaults to None, in which case the OPENAI_API_KEY environment variable will be used.

model_kwargs (dict[str, Any] | None, default None): Additional keyword arguments for the OpenAI client. Defaults to None.

default_hyperparameters (dict[str, Any] | None, default None): Default hyperparameters for invoking the model. Defaults to None.

retry_config (RetryConfig | None, default None): The retry configuration for the embedding model. Defaults to None, in which case a default config with no retry and 30.0 seconds timeout is used.
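The api_key fallback described above can be pictured with a short sketch. resolve_api_key is a hypothetical helper written for illustration only; in particular, raising ValueError when neither source provides a key is an assumption, and the key values are placeholders.

```python
from __future__ import annotations

import os


def resolve_api_key(api_key: str | None = None) -> str:
    """Hypothetical helper: prefer the explicit key, else fall back to OPENAI_API_KEY."""
    resolved = api_key or os.environ.get("OPENAI_API_KEY")
    if not resolved:
        # Assumption: the sketch fails loudly when no key can be found.
        raise ValueError("No API key given and OPENAI_API_KEY is not set.")
    return resolved


# Placeholder value for demonstration; never hard-code a real key.
os.environ["OPENAI_API_KEY"] = "key-from-environment"

from_env = resolve_api_key()                     # api_key=None uses the environment
explicit = resolve_api_key("key-from-argument")  # an explicit key takes precedence
print(from_env, explicit)
```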