Hybrid

Milvus implementation of hybrid search capability.

This module provides a Milvus implementation of the HybridCapability protocol, combining BM25-backed sparse retrieval with dense vector similarity search. Encryption and batching are handled by the base class.

MilvusHybridCapability(collection_name, client, uri, config, query_field=CHUNK_KEYS.CONTENT, index_type=DEFAULT_INDEX_TYPE, index_params=None, default_index_params_map=None, distance_metric=DEFAULT_DISTANCE_METRIC, id_max_length=100, content_max_length=65535, encryption=None, default_batch_size=None)

Bases: BaseHybridCapability

Milvus hybrid capability backed by BM25 sparse search and dense vector search.

Attributes:

Name Type Description
collection_name str

Name of the target Milvus collection.

client AsyncMilvusClient

Async Milvus client used for collection operations.

Initialize the Milvus hybrid capability.

Parameters:

Name Type Description Default
collection_name str

Name of the Milvus collection.

required
client AsyncMilvusClient

Async Milvus client instance.

required
uri str

Milvus URI used to create the synchronous client that prepares index parameters.

required
config list[SearchConfig]

Hybrid search configuration entries.

required
query_field str

Field used to store chunk content. Defaults to content.

CONTENT
index_type str

Dense index type for vector fields. Defaults to "IVF_FLAT".

DEFAULT_INDEX_TYPE
index_params dict[str, Any] | None

Dense index-specific parameters. Defaults to None.

None
default_index_params_map dict[str, dict[str, Any]] | None

Default index parameters keyed by Milvus index type when index_params is not provided. Defaults to None.

None
distance_metric str

Dense vector metric type. Defaults to "IP".

DEFAULT_DISTANCE_METRIC
id_max_length int

Maximum length for chunk IDs. Defaults to 100.

100
content_max_length int

Maximum length for content fields. Defaults to 65535.

65535
encryption EncryptionCapability | None

Encryption capability. Defaults to None.

None
default_batch_size int | None

Default batch size for batched operations. Defaults to None.

None

Raises:

Type Description
ValueError

If no search configuration is provided.
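
The fallback between index_params and default_index_params_map described above can be sketched as follows. This is a minimal illustration, not the library's code: the helper name resolve_index_params, the EXAMPLE_INDEX_PARAMS_MAP table, and its values are assumptions made for this example.

```python
from __future__ import annotations

from typing import Any

# Hypothetical fallback table; the library's real defaults may differ.
EXAMPLE_INDEX_PARAMS_MAP: dict[str, dict[str, Any]] = {
    "IVF_FLAT": {"nlist": 128},
    "HNSW": {"M": 16, "efConstruction": 200},
}


def resolve_index_params(
    index_type: str,
    index_params: dict[str, Any] | None = None,
    default_index_params_map: dict[str, dict[str, Any]] | None = None,
) -> dict[str, Any]:
    """Return explicit params if given, else the default for index_type, else {}."""
    if index_params is not None:
        # Explicit parameters always win over the per-type defaults.
        return dict(index_params)
    params_map = default_index_params_map or {}
    # Copy so callers cannot mutate the shared defaults table.
    return dict(params_map.get(index_type, {}))
```

Under these assumptions, passing index_params bypasses the map entirely, and an index type missing from the map resolves to an empty dict.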

fulltext_configs property

Return configured fulltext searches.

Returns:

Type Description
list[SearchConfig]

Fulltext search configurations.

vector_configs property

Return configured dense vector searches.

Returns:

Type Description
list[SearchConfig]

Dense vector search configurations.
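
One way to picture the two properties, together with the constructor's ValueError for an empty configuration, is the sketch below. ExampleSearchConfig, its field and kind attributes, and ExampleHybridConfigs are hypothetical stand-ins; the real SearchConfig may distinguish fulltext from vector entries differently.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ExampleSearchConfig:
    """Hypothetical stand-in for SearchConfig; the real fields may differ."""

    field: str
    kind: str  # "fulltext" or "vector" in this sketch


class ExampleHybridConfigs:
    """Holds search configs and exposes them split by search kind."""

    def __init__(self, config: list[ExampleSearchConfig]) -> None:
        if not config:
            # Mirrors the documented ValueError for a missing configuration.
            raise ValueError("At least one search configuration is required.")
        self._config = list(config)

    @property
    def fulltext_configs(self) -> list[ExampleSearchConfig]:
        return [c for c in self._config if c.kind == "fulltext"]

    @property
    def vector_configs(self) -> list[ExampleSearchConfig]:
        return [c for c in self._config if c.kind == "vector"]
```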

create(chunks, batch_size=None, **kwargs) async

Create chunks, enriching them with dense vectors before persistence.

Parameters:

Name Type Description Default
chunks list[Chunk]

Chunks to persist.

required
batch_size int | None

Batch size override. Defaults to None.

None
**kwargs Any

Backend-specific arguments forwarded to upsert.

{}
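
The interaction between a per-call batch_size and the constructor's default_batch_size might look like the sketch below. The helper iter_batches is hypothetical, and treating "no batch size at all" as a single batch is an assumption; the base class may behave differently.

```python
from __future__ import annotations

from collections.abc import Iterator
from typing import TypeVar

T = TypeVar("T")


def iter_batches(
    items: list[T],
    batch_size: int | None = None,
    default_batch_size: int | None = None,
) -> Iterator[list[T]]:
    """Yield slices of items, preferring batch_size over default_batch_size."""
    size = batch_size if batch_size is not None else default_batch_size
    if size is None:
        # Assumption: without any batch size, everything goes in one batch.
        if items:
            yield list(items)
        return
    if size <= 0:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

For example, five chunks with batch_size=2 would be written in three batches of sizes 2, 2, and 1.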

ensure_index() async

Ensure the hybrid collection exists with BM25 and dense indexes.

Raises:

Type Description
ValueError

If a required vector dimension cannot be inferred.
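
The documented failure mode can be illustrated with a small sketch of dimension inference. How the library actually infers the dimension is not stated here; the helper infer_dimension and both of its inputs (an explicitly configured dimension and a sample embedding) are assumptions for illustration only.

```python
from __future__ import annotations

from collections.abc import Sequence


def infer_dimension(
    explicit_dimension: int | None = None,
    sample_vector: Sequence[float] | None = None,
) -> int:
    """Return a usable vector dimension or raise ValueError."""
    if explicit_dimension is not None and explicit_dimension > 0:
        # A configured dimension takes priority in this sketch.
        return explicit_dimension
    if sample_vector:
        # Fall back to the length of a sample embedding.
        return len(sample_vector)
    raise ValueError("Vector dimension could not be inferred.")
```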

update(update_values, filters=None, batch_size=None, **kwargs) async

Update chunks and regenerate dense vectors when text changes.

Parameters:

Name Type Description Default
update_values dict[str, Any]

Values to update.

required
filters FilterClause | QueryFilter | None

Filters selecting the target rows. Defaults to None.

None
batch_size int | None

Batch size override. Defaults to None.

None
**kwargs Any

Backend-specific arguments forwarded to upsert.

{}
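
The "regenerate dense vectors when text changes" behavior can be sketched as a check on whether the update touches the content field. Everything named here is hypothetical: enrich_update_values, the vector_field name, and fake_embed stand in for whatever the library and its embedding model actually use.

```python
from __future__ import annotations

import asyncio
from collections.abc import Awaitable, Callable
from typing import Any


async def fake_embed(text: str) -> list[float]:
    """Hypothetical embedder used only to keep this sketch self-contained."""
    return [float(len(text)), 1.0]


async def enrich_update_values(
    update_values: dict[str, Any],
    query_field: str,
    vector_field: str,
    embed: Callable[[str], Awaitable[list[float]]],
) -> dict[str, Any]:
    """Add a fresh dense vector only when the text field is being updated."""
    enriched = dict(update_values)
    if query_field in update_values:
        # The stored text changes, so the old vector would be stale.
        enriched[vector_field] = await embed(str(update_values[query_field]))
    return enriched


if __name__ == "__main__":
    result = asyncio.run(
        enrich_update_values({"content": "hello"}, "content", "dense_vector", fake_embed)
    )
    print(result)
```

An update that leaves the content field untouched passes through unchanged in this sketch, so no embedding call is made for metadata-only updates.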