Google realtime session

[BETA] Defines a realtime session module to interact with Gemini Live models.

Authors

Henry Wicaksono (henry.wicaksono@gdplabs.id)

References

[1] https://ai.google.dev/gemini-api/docs/live

GoogleIOStreamer(session, task_group, input_queue, output_queue, input_streamers, output_streamers, post_output_audio_delay, logger)

[BETA] Defines the GoogleIOStreamer.

This class manages the realtime conversation lifecycle. It handles the IO operations between the model and the input/output streamers.

Attributes:

Name Type Description
session AsyncSession

The session of the GoogleIOStreamer.

task_group TaskGroup

The task group of the GoogleIOStreamer.

input_queue Queue

The input queue of the GoogleIOStreamer.

output_queue Queue

The output queue of the GoogleIOStreamer.

input_streamers list[BaseInputStreamer]

The input streamers of the GoogleIOStreamer.

output_streamers list[BaseOutputStreamer]

The output streamers of the GoogleIOStreamer.

post_output_audio_delay float

The delay in seconds to post the output audio.

Initializes a new instance of the GoogleIOStreamer class.

Parameters:

Name Type Description Default
session AsyncSession

The session of the GoogleIOStreamer.

required
task_group TaskGroup

The task group of the GoogleIOStreamer.

required
input_queue Queue

The input queue of the GoogleIOStreamer.

required
output_queue Queue

The output queue of the GoogleIOStreamer.

required
input_streamers list[BaseInputStreamer]

The input streamers of the GoogleIOStreamer.

required
output_streamers list[BaseOutputStreamer]

The output streamers of the GoogleIOStreamer.

required
post_output_audio_delay float

The delay in seconds to post the output audio.

required
logger Logger

The logger of the GoogleIOStreamer.

required

start() async

Processes the realtime conversation.

This method is used to start the realtime conversation. It initializes the input and output streamers, creates the necessary tasks, and starts the conversation. When the conversation is terminated, it cleans up the input and output streamers.
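The lifecycle described above (set up streamers, run concurrent tasks over an input and an output queue, clean up on termination) can be sketched with plain asyncio. This is an illustration under assumptions, not the GoogleIOStreamer implementation: the helper names (`read_inputs`, `relay`, `write_outputs`, `run_conversation`) and the `END` sentinel are hypothetical, and the model round trip is replaced by a simple echo.

```python
import asyncio

END = object()  # hypothetical sentinel marking the end of the conversation


async def read_inputs(items, input_queue):
    """Stand-in for an input streamer: pushes items, then the sentinel."""
    for item in items:
        await input_queue.put(item)
    await input_queue.put(END)


async def relay(input_queue, output_queue):
    """Stand-in for the model round trip: echoes each input to the output queue."""
    while True:
        item = await input_queue.get()
        if item is END:
            await output_queue.put(END)
            return
        await output_queue.put(f"echo: {item}")


async def write_outputs(output_queue, sink):
    """Stand-in for an output streamer: drains the output queue into a list."""
    while True:
        item = await output_queue.get()
        if item is END:
            return
        sink.append(item)


async def run_conversation(items):
    input_queue = asyncio.Queue()
    output_queue = asyncio.Queue()
    sink = []
    try:
        # Run the three stages concurrently until the sentinel has passed through.
        await asyncio.gather(
            read_inputs(items, input_queue),
            relay(input_queue, output_queue),
            write_outputs(output_queue, sink),
        )
    finally:
        # A real streamer would release its resources (microphone, speaker) here.
        sink.append("closed")
    return sink
```

Calling `asyncio.run(run_conversation(["hi", "bye"]))` pushes two items through the pipeline and runs the cleanup step even if one of the stages raises.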

GoogleIOStreamerState

Bases: BaseModel

[BETA] Defines the state of the GoogleIOStreamer with thread-safe properties.

Attributes:

Name Type Description
is_streaming_output bool

Whether the output is streaming.

console_mode Literal['input', 'user', 'assistant']

The current console mode.

terminated bool

Whether the conversation is terminated.

get_console_mode() async

Thread-safe getter for console_mode.

Returns:

Type Description
Literal['input', 'user', 'assistant']

The value of console_mode.

get_streaming_output() async

Thread-safe getter for is_streaming_output.

Returns:

Name Type Description
bool bool

The value of is_streaming_output.

get_terminated() async

Thread-safe getter for terminated.

Returns:

Name Type Description
bool bool

The value of terminated.

set_console_mode(value) async

Thread-safe setter for console_mode.

Parameters:

Name Type Description Default
value Literal['input', 'user', 'assistant']

The value to set for console_mode.

required

set_streaming_output(value) async

Thread-safe setter for is_streaming_output.

Parameters:

Name Type Description Default
value bool

The value to set for is_streaming_output.

required

set_terminated(value) async

Thread-safe setter for terminated.

Parameters:

Name Type Description Default
value bool

The value to set for terminated.

required
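Because every getter and setter is a coroutine, one plausible way to build such a state object is to guard each field with an asyncio.Lock. The sketch below assumes that design; `StreamerStateSketch` and `demo` are hypothetical names, and it uses a plain class rather than the BaseModel the real class derives from. Note that asyncio.Lock serializes coroutines on a single event loop; it does not protect against access from multiple OS threads.

```python
import asyncio
from typing import Literal

ConsoleMode = Literal["input", "user", "assistant"]


class StreamerStateSketch:
    """Hypothetical stand-in for GoogleIOStreamerState (not the library class)."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self._is_streaming_output = False
        self._console_mode: ConsoleMode = "input"
        self._terminated = False

    async def get_streaming_output(self) -> bool:
        async with self._lock:
            return self._is_streaming_output

    async def set_streaming_output(self, value: bool) -> None:
        async with self._lock:
            self._is_streaming_output = value

    async def get_console_mode(self) -> ConsoleMode:
        async with self._lock:
            return self._console_mode

    async def set_console_mode(self, value: ConsoleMode) -> None:
        async with self._lock:
            self._console_mode = value

    async def get_terminated(self) -> bool:
        async with self._lock:
            return self._terminated

    async def set_terminated(self, value: bool) -> None:
        async with self._lock:
            self._terminated = value


async def demo():
    # Create the state inside the running loop so the lock belongs to it.
    state = StreamerStateSketch()
    await state.set_console_mode("assistant")
    await state.set_terminated(True)
    return await state.get_console_mode(), await state.get_terminated()
```

Running `asyncio.run(demo())` exercises a setter and getter pair for two of the three fields.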

GoogleRealtimeSession(model_name, api_key=None, credentials_path=None, project_id=None, location='us-central1')

Bases: BaseRealtimeSession

[BETA] A realtime session module to interact with Gemini Live models.

Warning

The 'GoogleRealtimeSession' class is currently in beta and may be subject to changes in the future. It is intended only for quick prototyping in local environments. Please avoid using it in production environments.

Attributes:

Name Type Description
model_name str

The name of the language model.

client_params dict[str, Any]

The Google client instance init parameters.

Basic usage

The GoogleRealtimeSession can be used as follows:

realtime_session = GoogleRealtimeSession(model_name="gemini-live-2.5-flash-preview")
await realtime_session.start()

Custom IO streamers

The GoogleRealtimeSession can be used with custom IO streamers.

input_streamers = [KeyboardInputStreamer(), LinuxMicInputStreamer()]
output_streamers = [ConsoleOutputStreamer(), LinuxSpeakerOutputStreamer()]
realtime_session = GoogleRealtimeSession(model_name="gemini-live-2.5-flash-preview")
await realtime_session.start(input_streamers=input_streamers, output_streamers=output_streamers)

The example above adds a Linux system microphone and speaker, enabling realtime audio input to and output from the model.

Authentication

The GoogleRealtimeSession can use either Google Gen AI or Google Vertex AI.

Google Gen AI is recommended for quick prototyping and development. It requires a Gemini API key for authentication.

Usage example:

realtime_session = GoogleRealtimeSession(
    model_name="gemini-2.5-flash-native-audio-preview-12-2025",
    api_key="your_api_key"
)

Google Vertex AI is recommended to build production-ready applications. It requires a service account JSON file for authentication.

Usage example:

realtime_session = GoogleRealtimeSession(
    model_name="gemini-2.5-flash-native-audio-preview-12-2025",
    credentials_path="path/to/service_account.json"
)

If neither api_key nor credentials_path is provided, Google Gen AI will be used by default. The GOOGLE_API_KEY environment variable will be used for authentication.
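The selection rules above can be summarized as a small helper. This is a sketch under stated assumptions, not the library's code: `resolve_client_params` and the keys of the returned dictionary are hypothetical, and raising ValueError when both credentials are supplied is a guess, since the source only says they cannot be used together. Reading `project_id` from the file relies on the standard layout of a service account JSON.

```python
import json
import os


def resolve_client_params(api_key=None, credentials_path=None,
                          project_id=None, location="us-central1"):
    """Hypothetical helper mirroring the authentication rules described above."""
    if api_key and credentials_path:
        # Assumption: the exception type is not stated in the source.
        raise ValueError("api_key and credentials_path cannot be used together")

    if credentials_path:
        # Vertex AI path: fall back to the project ID stored in the credentials file.
        if project_id is None:
            with open(credentials_path, encoding="utf-8") as f:
                project_id = json.load(f)["project_id"]
        return {
            "vertexai": True,
            "credentials_path": credentials_path,
            "project": project_id,
            "location": location,
        }

    # Gen AI path: explicit key first, then the GOOGLE_API_KEY environment variable.
    return {"api_key": api_key or os.environ.get("GOOGLE_API_KEY")}
```

With this shape, `location` and `project_id` only influence the result on the Vertex AI path, which matches the parameter descriptions.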

Initializes a new instance of the GoogleRealtimeSession class.

Parameters:

Name Type Description Default
model_name str

The name of the model to use.

required
api_key str | None

Required for Google Gen AI authentication. Cannot be used together with credentials_path. Defaults to None.

None
credentials_path str | None

Required for Google Vertex AI authentication. Path to the service account credentials JSON file. Cannot be used together with api_key. Defaults to None.

None
project_id str | None

The Google Cloud project ID for Vertex AI. Only used when authenticating with credentials_path. Defaults to None, in which case it will be loaded from the credentials file.

None
location str

The location of the Google Cloud project for Vertex AI. Only used when authenticating with credentials_path. Defaults to "us-central1".

'us-central1'

start(input_streamers=None, output_streamers=None, post_output_audio_delay=DEFAULT_POST_OUTPUT_AUDIO_DELAY) async

Starts the realtime conversation using the provided input and output streamers.

This method is used to start the realtime conversation using a GoogleIOStreamer. The streamers are responsible for handling the input and output of the conversation.

Parameters:

Name Type Description Default
input_streamers list[BaseInputStreamer] | None

The input streamers to use. Defaults to None, in which case a KeyboardInputStreamer will be used.

None
output_streamers list[BaseOutputStreamer] | None

The output streamers to use. Defaults to None, in which case a ConsoleOutputStreamer will be used.

None
post_output_audio_delay float

The delay in seconds to post the output audio. Defaults to 0.5 seconds.

DEFAULT_POST_OUTPUT_AUDIO_DELAY

Raises:

Type Description
ValueError

If input_streamers or output_streamers is an empty list.

ValueError

If the post_output_audio_delay is not greater than 0.

Exception

If the conversation fails to process.
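The two ValueError conditions above can be illustrated with a small pre-flight check. This is a sketch, not the library's implementation: `validate_start_args` is a hypothetical name, the error messages are invented, and the constant is set to 0.5 based on the parameter description.

```python
DEFAULT_POST_OUTPUT_AUDIO_DELAY = 0.5  # value taken from the parameter description


def validate_start_args(input_streamers=None, output_streamers=None,
                        post_output_audio_delay=DEFAULT_POST_OUTPUT_AUDIO_DELAY):
    """Hypothetical check mirroring the documented ValueError conditions."""
    # None means "use the default streamer"; only an explicit empty list is rejected.
    if input_streamers is not None and len(input_streamers) == 0:
        raise ValueError("input_streamers must not be an empty list")
    if output_streamers is not None and len(output_streamers) == 0:
        raise ValueError("output_streamers must not be an empty list")
    # The delay must be strictly positive.
    if post_output_audio_delay <= 0:
        raise ValueError("post_output_audio_delay must be greater than 0")
```

Under these rules, `validate_start_args()` passes with all defaults, while `validate_start_args(input_streamers=[])` and `validate_start_args(post_output_audio_delay=0)` both raise ValueError.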