Hybrid
Milvus implementation of hybrid search capability.
This module provides a Milvus implementation of the HybridCapability protocol, combining BM25-backed sparse retrieval with dense vector similarity search. Encryption and batching are handled by the base class.
MilvusHybridCapability(collection_name, client, uri, config, query_field=CHUNK_KEYS.CONTENT, index_type=DEFAULT_INDEX_TYPE, index_params=None, default_index_params_map=None, distance_metric=DEFAULT_DISTANCE_METRIC, id_max_length=100, content_max_length=65535, encryption=None, default_batch_size=None)
Bases: BaseHybridCapability
Milvus hybrid capability backed by BM25 sparse search and dense vector search.
Attributes:
| Name | Type | Description |
|---|---|---|
| `collection_name` | `str` | Name of the target Milvus collection. |
| `client` | `AsyncMilvusClient` | Async Milvus client used for collection operations. |
Initialize the Milvus hybrid capability.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `collection_name` | `str` | Name of the Milvus collection. | required |
| `client` | `AsyncMilvusClient` | Async Milvus client instance. | required |
| `uri` | `str` | Milvus URI used to create the sync client for index params. | required |
| `config` | `list[SearchConfig]` | Hybrid search configuration entries. | required |
| `query_field` | `str` | Field used to store chunk content. Defaults to `CHUNK_KEYS.CONTENT`. | `CONTENT` |
| `index_type` | `str` | Dense index type for vector fields. Defaults to "IVF_FLAT". | `DEFAULT_INDEX_TYPE` |
| `index_params` | `dict[str, Any] \| None` | Dense index-specific parameters. Defaults to None. | `None` |
| `default_index_params_map` | `dict[str, dict[str, Any]] \| None` | Default index parameters keyed by Milvus index type when … | `None` |
| `distance_metric` | `str` | Dense vector metric type. Defaults to "IP". | `DEFAULT_DISTANCE_METRIC` |
| `id_max_length` | `int` | Maximum length for chunk IDs. Defaults to 100. | `100` |
| `content_max_length` | `int` | Maximum length for content fields. Defaults to 65535. | `65535` |
| `encryption` | `EncryptionCapability \| None` | Encryption capability. Defaults to None. | `None` |
| `default_batch_size` | `int \| None` | Default batch size for batched operations. Defaults to None. | `None` |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If no search configuration is provided. |
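The documented `ValueError` can be illustrated with a minimal sketch. The `SearchConfig` dataclass and the `validate_config` helper below are hypothetical stand-ins written for illustration; the real class and the constructor's internal checks are not shown in this reference and may differ.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SearchConfig:
    """Hypothetical stand-in; the real SearchConfig fields are not documented here."""

    field: str


def validate_config(config: list[SearchConfig] | None) -> list[SearchConfig]:
    """Reject a missing or empty configuration, mirroring the documented ValueError."""
    if not config:
        raise ValueError("At least one search configuration is required.")
    # Copy so later changes to the caller's list do not affect the stored config.
    return list(config)
```

Passing `[]` or `None` raises `ValueError`, while a non-empty list is returned as a copy.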
fulltext_configs
property
Return configured fulltext searches.
Returns:
| Type | Description |
|---|---|
| `list[SearchConfig]` | Fulltext search configurations. |
vector_configs
property
Return configured dense vector searches.
Returns:
| Type | Description |
|---|---|
| `list[SearchConfig]` | Dense vector search configurations. |
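One way the two properties could split a single `config` list is sketched below. The `SearchKind` enum, the `kind` attribute, and the `ConfigHolder` class are assumptions made for illustration; how the real `SearchConfig` distinguishes fulltext from vector entries is not stated in this reference.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class SearchKind(Enum):
    """Hypothetical discriminator between the two kinds of search."""

    FULLTEXT = "fulltext"
    VECTOR = "vector"


@dataclass(frozen=True)
class SearchConfig:
    """Hypothetical stand-in for the real SearchConfig."""

    field: str
    kind: SearchKind


class ConfigHolder:
    """Illustrative holder exposing the two filtered views as properties."""

    def __init__(self, config: list[SearchConfig]) -> None:
        self._config = list(config)

    @property
    def fulltext_configs(self) -> list[SearchConfig]:
        # Entries served by BM25 sparse retrieval.
        return [c for c in self._config if c.kind is SearchKind.FULLTEXT]

    @property
    def vector_configs(self) -> list[SearchConfig]:
        # Entries served by dense vector similarity search.
        return [c for c in self._config if c.kind is SearchKind.VECTOR]
```

Both properties preserve the order of the original list, so the first vector entry in `config` is also the first entry of `vector_configs`.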
create(chunks, batch_size=None, **kwargs)
async
Create chunks, enriching them with dense vectors before persistence.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `chunks` | `list[Chunk]` | Chunks to persist. | required |
| `batch_size` | `int \| None` | Batch size override. Defaults to None. | `None` |
| `**kwargs` | `Any` | Backend-specific arguments forwarded to upsert. | `{}` |
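The enrich-then-persist flow described for `create` can be sketched as follows. Everything here is a simplified stand-in: `Chunk`, `toy_embed`, `InMemoryStore`, and the free function `create` are hypothetical, and in the real class batching is handled by the base class rather than inline.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Chunk:
    """Hypothetical stand-in for the real Chunk type."""

    id: str
    content: str
    vectors: dict[str, list[float]] = field(default_factory=dict)


def toy_embed(text: str) -> list[float]:
    """Placeholder embedding; a real deployment would call an embedding model."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


class InMemoryStore:
    """Fake backend that records upsert calls instead of talking to Milvus."""

    def __init__(self) -> None:
        self.rows: dict[str, Chunk] = {}
        self.calls = 0

    async def upsert(self, rows: list[Chunk], **kwargs: Any) -> None:
        self.calls += 1
        for row in rows:
            self.rows[row.id] = row


async def create(
    store: InMemoryStore,
    chunks: list[Chunk],
    batch_size: int | None = None,
    default_batch_size: int | None = None,
    **kwargs: Any,
) -> None:
    # The per-call override wins; otherwise fall back to the instance default.
    size = batch_size if batch_size is not None else default_batch_size
    # With no batch size at all, send everything in a single batch.
    step = size if size else max(len(chunks), 1)
    for start in range(0, len(chunks), step):
        batch = chunks[start:start + step]
        for chunk in batch:
            # Enrich each chunk with a dense vector before persistence.
            chunk.vectors["dense"] = toy_embed(chunk.content)
        await store.upsert(batch, **kwargs)
```

For example, `asyncio.run(create(store, chunks, batch_size=2))` with five chunks issues three upsert calls, and every stored chunk carries a `"dense"` vector.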
ensure_index()
async
Ensure the hybrid collection exists with BM25 and dense indexes.
Raises:
| Type | Description |
|---|---|
| `ValueError` | If a required vector dimension cannot be inferred. |
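The failure mode above can be pictured with a small sketch of dimension inference. The `VectorConfig` dataclass, its `dimension` and `embed` attributes, and the probing strategy are all assumptions made for illustration; this reference does not say how the real implementation infers dimensions.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class VectorConfig:
    """Hypothetical dense-vector configuration entry."""

    field: str
    dimension: Optional[int] = None
    embed: Optional[Callable[[str], list[float]]] = None


def infer_dimension(cfg: VectorConfig) -> int:
    """Return the vector dimension, or raise ValueError if it cannot be inferred."""
    if cfg.dimension is not None:
        # An explicit dimension takes precedence.
        return cfg.dimension
    if cfg.embed is not None:
        # Assumed fallback: embed a probe string and measure the result.
        return len(cfg.embed("dimension probe"))
    raise ValueError(f"Cannot infer vector dimension for field {cfg.field!r}")
```

Failing early in this way keeps a collection from being created with a dense field of unknown size.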
update(update_values, filters=None, batch_size=None, **kwargs)
async
Update chunks and regenerate dense vectors when text changes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `update_values` | `dict[str, Any]` | Values to update. | required |
| `filters` | `FilterClause \| QueryFilter \| None` | Filters selecting the target rows. Defaults to None. | `None` |
| `batch_size` | `int \| None` | Batch size override. Defaults to None. | `None` |
| `**kwargs` | `Any` | Backend-specific arguments forwarded to upsert. | `{}` |
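The rule "regenerate dense vectors when text changes" can be sketched on a single row. The field names `content` and `dense_vector`, the dict-based row, and the `apply_update` helper are hypothetical; the real method selects rows via `filters` and forwards the rewritten rows to upsert.

```python
from __future__ import annotations

from typing import Any, Callable

CONTENT_FIELD = "content"     # assumed name of the text field
DENSE_FIELD = "dense_vector"  # hypothetical name of the dense vector field


def apply_update(
    row: dict[str, Any],
    update_values: dict[str, Any],
    embed: Callable[[str], list[float]],
) -> dict[str, Any]:
    """Merge update_values into row, re-embedding only if the text changed."""
    updated = {**row, **update_values}
    text_changed = (
        CONTENT_FIELD in update_values
        and update_values[CONTENT_FIELD] != row.get(CONTENT_FIELD)
    )
    if text_changed:
        # The stored vector no longer matches the text, so recompute it.
        updated[DENSE_FIELD] = embed(updated[CONTENT_FIELD])
    return updated
```

Updates that touch only metadata leave the existing vector in place, which avoids unnecessary embedding calls.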