Multimodal LM invoker

Defines a base class for multimodal language model invokers used in Gen AI applications.

Authors

Henry Wicaksono (henry.wicaksono@gdplabs.id)

BaseMultimodalLMInvoker(default_hyperparameters=None)

Bases: ABC, Generic[InputType, OutputType]

A base class for multimodal language model invokers used in Gen AI applications.

The BaseMultimodalLMInvoker class provides a framework for invoking multimodal language models with a given prompt and hyperparameters. The prompt may contain multimodal inputs, whose type is defined by the type variable InputType; the type of the multimodal output is defined by the type variable OutputType. The class handles both standard and streaming invocation.
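To illustrate how two type variables can parameterize the input and output of such a base class, here is a minimal sketch. It is not the library's code: `SketchInvoker`, `_call_model`, and `EchoInvoker` are hypothetical names introduced only for this example, and the real class may structure its abstract hooks differently.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Generic, TypeVar, Union

InputType = TypeVar("InputType")
OutputType = TypeVar("OutputType")


class SketchInvoker(ABC, Generic[InputType, OutputType]):
    """Hypothetical stand-in for a generic multimodal invoker base class."""

    @abstractmethod
    async def _call_model(self, prompt: list[tuple[str, list[InputType]]]) -> OutputType:
        """Hypothetical hook that a concrete invoker would implement."""


class EchoInvoker(SketchInvoker[Union[str, bytes], str]):
    """Toy subclass: accepts text or raw bytes as input and returns text."""

    async def _call_model(self, prompt):
        parts = []
        for role, contents in prompt:
            # Summarize each content item by its Python type name.
            kinds = [type(item).__name__ for item in contents]
            parts.append(f"{role}: {', '.join(kinds)}")
        return "; ".join(parts)


result = asyncio.run(
    EchoInvoker()._call_model([("user", ["Describe this image.", b"\x89PNG"])])
)
print(result)
```

Binding InputType to `Union[str, bytes]` and OutputType to `str` in the subclass declaration lets a type checker verify both the prompt contents and the return value of each concrete invoker.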

Attributes:

default_hyperparameters (dict[str, Any]): Default hyperparameters for invoking the multimodal language model.

Initializes a new instance of the BaseMultimodalLMInvoker class.

Parameters:

default_hyperparameters (dict[str, Any] | None): Default hyperparameters for invoking the multimodal language model. Defaults to None, in which case an empty dictionary is used.
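The None-to-empty-dictionary behavior described for the constructor is a common Python pattern for avoiding a shared mutable default. The sketch below shows only that pattern; `HyperparameterHolder` is a hypothetical class, not part of the library.

```python
from typing import Any, Optional


class HyperparameterHolder:
    """Hypothetical class showing only the default-handling pattern."""

    def __init__(self, default_hyperparameters: Optional[dict[str, Any]] = None) -> None:
        # Using None as the sentinel gives every instance its own fresh dict.
        self.default_hyperparameters = (
            default_hyperparameters if default_hyperparameters is not None else {}
        )


first = HyperparameterHolder()
second = HyperparameterHolder()
configured = HyperparameterHolder({"temperature": 0.2})

# Mutating one instance's defaults does not leak into another instance.
first.default_hyperparameters["max_tokens"] = 64
print(second.default_hyperparameters)
print(configured.default_hyperparameters)
```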

invoke(prompt, hyperparameters=None, event_emitter=None) async

Invokes the multimodal language model with the provided prompt and hyperparameters.

This method validates the prompt, then invokes the multimodal language model with it and the given hyperparameters. The prompt may contain multimodal inputs, whose type is defined by the type variable InputType. Both standard and streaming invocation are supported; streaming mode is enabled when an event emitter is provided.
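The switch between standard and streaming invocation can be pictured with a small stand-in. The sketch below assumes an emitter with an async `emit` method; `ListEmitter`, `emit`, and `fake_invoke` are hypothetical names for illustration, and the actual EventEmitter interface may differ.

```python
import asyncio
from typing import Optional


class ListEmitter:
    """Hypothetical stand-in for an event emitter; collects streamed tokens."""

    def __init__(self) -> None:
        self.tokens: list[str] = []

    async def emit(self, token: str) -> None:
        self.tokens.append(token)


async def fake_invoke(text: str, event_emitter: Optional[ListEmitter] = None) -> str:
    """Return the full response; also stream it word by word if an emitter is given."""
    words = text.split()
    if event_emitter is not None:
        # Streaming path: forward each token as soon as it is "generated".
        for word in words:
            await event_emitter.emit(word)
    # Both paths return the complete response at the end.
    return " ".join(words)


emitter = ListEmitter()
standard = asyncio.run(fake_invoke("hello multimodal world"))
streamed = asyncio.run(fake_invoke("hello multimodal world", event_emitter=emitter))
print(standard, emitter.tokens)
```

In this sketch the caller receives the same final value in both modes; the emitter only adds incremental delivery.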

Parameters:

prompt (list[tuple[str, list[InputType]]], required): The input prompt as a list of (role, content list) tuples. The content list may contain multimodal inputs, whose type is defined by the type variable InputType.

hyperparameters (dict[str, Any] | None): A dictionary of hyperparameters for the multimodal language model. Defaults to None, in which case the default hyperparameters are used.

event_emitter (EventEmitter | None): The event emitter for streaming tokens. If provided, streaming invocation is enabled. Defaults to None.

Returns:

OutputType: The generated response from the multimodal language model.

Raises:

ValueError: If the prompt is not in the correct format.
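The exact validation rules are not stated here. As one plausible reading of the `list[tuple[str, list[InputType]]]` prompt type, the following sketch checks the outer shape and raises ValueError on a mismatch. `validate_prompt` is a hypothetical helper, and the library's own checks may be stricter or looser.

```python
from typing import Any


def validate_prompt(prompt: Any) -> None:
    """Hypothetical check that a prompt is a list of (role, content list) tuples."""
    if not isinstance(prompt, list):
        raise ValueError("Prompt must be a list of (role, contents) tuples.")
    for index, item in enumerate(prompt):
        if not (isinstance(item, tuple) and len(item) == 2):
            raise ValueError(f"Prompt item {index} must be a (role, contents) tuple.")
        role, contents = item
        if not isinstance(role, str) or not isinstance(contents, list):
            raise ValueError(
                f"Prompt item {index} must pair a str role with a list of contents."
            )


# A well-formed prompt mixing text and raw bytes passes silently.
validate_prompt(
    [("system", ["You are helpful."]), ("user", ["What is in this image?", b"\x89PNG"])]
)

# A content value that is not a list is rejected.
try:
    validate_prompt([("user", "not a list")])
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```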