Vector

Milvus implementation of vector similarity search capability.

This module provides a Milvus implementation of the VectorCapability protocol for vector-based semantic similarity search operations.

MilvusVectorCapability(collection_name, client, em_invoker, dimension, distance_metric='L2', vector_field='dense_vector', id_max_length=100, content_max_length=65535)

Milvus implementation of VectorCapability protocol.

This class provides vector similarity search operations using Milvus.

Attributes:

Name Type Description
collection_name str

The name of the Milvus collection.

client AsyncMilvusClient

Async Milvus client instance.

em_invoker BaseEMInvoker

Embedding model invoker for vectorization.

dimension int

Vector dimension.

distance_metric str

Distance metric ("L2", "IP", "COSINE").

vector_field str

Field name for dense vectors.

id_max_length int

Maximum length for ID field.

content_max_length int

Maximum length for content field.

Initialize the Milvus vector capability.

Parameters:

Name Type Description Default
collection_name str

The name of the Milvus collection.

required
client AsyncMilvusClient

The async Milvus client instance.

required
em_invoker BaseEMInvoker

The embedding model invoker.

required
dimension int

Vector dimension.

required
distance_metric str

Distance metric. Defaults to "L2". Supported: "L2", "IP", "COSINE".

'L2'
vector_field str

Field name for dense vectors. Defaults to "dense_vector".

'dense_vector'
id_max_length int

Maximum length for ID field. Defaults to 100.

100
content_max_length int

Maximum length for content field. Defaults to 65535.

65535
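To make the three `distance_metric` values concrete, here is a minimal sketch of their textbook definitions in plain Python. The helper names are hypothetical and not part of this module, and the values Milvus reports may be scaled differently (for example, for L2). It only illustrates how to read the scores: lower is closer for "L2", higher is closer for "IP" and "COSINE".

```python
import math


def l2_distance(a, b):
    """Euclidean distance: smaller values mean more similar vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Inner product: larger values mean more similar vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between the vectors: larger means more similar."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)


# Two orthogonal unit vectors, scored under each metric.
query = [1.0, 0.0]
candidate = [0.0, 1.0]
scores = {
    "L2": l2_distance(query, candidate),
    "IP": inner_product(query, candidate),
    "COSINE": cosine_similarity(query, candidate),
}
```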

em_invoker property

Return the embedding model (EM) invoker instance.

Returns:

Name Type Description
BaseEMInvoker BaseEMInvoker

The EM Invoker instance.

clear() async

Clear all records from the datastore.

create(data, **kwargs) async

Add chunks to the vector store with automatic embedding generation.

Parameters:

Name Type Description Default
data Chunk | list[Chunk]

Single chunk or list of chunks to add.

required
**kwargs Any

Backend-specific parameters (e.g., partition_name).

{}

Raises:

Type Description
ValueError

If a vector dimension mismatch occurs.
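Because `data` accepts either a single chunk or a list, an implementation has to normalize the argument before embedding. The sketch below shows one way this could be done. The `Chunk` dataclass and `as_chunk_list` helper are hypothetical stand-ins for illustration, not the library's own types.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Chunk:
    """Hypothetical stand-in for the library's Chunk type."""

    id: str
    content: str
    metadata: dict[str, Any] = field(default_factory=dict)


def as_chunk_list(data: "Chunk | list[Chunk]") -> list[Chunk]:
    """Wrap a single chunk in a list; return a list unchanged."""
    return data if isinstance(data, list) else [data]


single = as_chunk_list(Chunk(id="a", content="hello"))
many = as_chunk_list([Chunk(id="a", content="hello"), Chunk(id="b", content="world")])
```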

create_from_vector(chunk_vectors, **kwargs) async

Add pre-computed vectors directly.

Parameters:

Name Type Description Default
chunk_vectors list[tuple[Chunk, Vector]]

List of tuples containing chunks and their corresponding vectors.

required
**kwargs Any

Backend-specific parameters (e.g., partition_name).

{}

Raises:

Type Description
ValueError

If a vector dimension mismatch occurs.
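The `ValueError` above suggests that each vector's length is compared with the configured `dimension` before insertion. A minimal sketch of such a check follows. The function name and error message are assumptions, not the library's actual code.

```python
def validate_dimensions(vectors, dimension):
    """Raise ValueError if any vector's length differs from `dimension`."""
    for index, vector in enumerate(vectors):
        if len(vector) != dimension:
            raise ValueError(
                f"Vector at position {index} has dimension {len(vector)}, "
                f"expected {dimension}."
            )


# Passes silently: both vectors have three components.
validate_dimensions([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], dimension=3)
```

Running a check like this on the caller's side before `create_from_vector` can surface a mismatch earlier, before any request reaches the datastore.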

delete(filters=None, **kwargs) async

Delete records from the datastore.

Parameters:

Name Type Description Default
filters FilterClause | QueryFilter | None

Filters to select records to delete. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None, in which case no operation is performed (no-op).

None
**kwargs Any

Backend-specific parameters.

{}
Note

If filters is None, no operation is performed (no-op).
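The no-op rule guards against deleting every record by accident when no filter is passed. The sketch below illustrates that behavior with a hypothetical in-memory store. Here `filters` is a plain predicate function, an assumption made for brevity; the real method takes `FilterClause` or `QueryFilter` objects.

```python
import asyncio


class InMemoryStore:
    """Hypothetical stand-in used only to illustrate the no-op rule."""

    def __init__(self, records):
        self.records = dict(records)

    async def delete(self, filters=None):
        if filters is None:
            return  # No filter given: leave every record untouched.
        # Keep only the records that the predicate does not select.
        self.records = {
            key: value for key, value in self.records.items() if not filters(value)
        }


async def main():
    store = InMemoryStore({"a": {"lang": "en"}, "b": {"lang": "id"}})
    await store.delete()  # no-op
    count_after_noop = len(store.records)
    await store.delete(filters=lambda record: record["lang"] == "id")
    return count_after_noop, sorted(store.records)


result = asyncio.run(main())
```

To remove everything on purpose, `clear()` is the method documented for that.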

ensure_index(index_type='IVF_FLAT', index_params=None, query_field='content', **kwargs) async

Ensure collection and vector index exist, creating them if necessary.

This method is idempotent: if the collection and index already exist, it skips creation and returns early. It uses a lock to prevent race conditions when called concurrently from multiple coroutines.

Parameters:

Name Type Description Default
index_type str

Index type. Defaults to "IVF_FLAT". Supported: "IVF_FLAT", "HNSW".

'IVF_FLAT'
index_params dict[str, Any] | None

Index-specific parameters. Defaults to None, in which case default parameters are used.

None
query_field str

Field name for text content. Defaults to "content".

'content'
**kwargs Any

Additional parameters.

{}

Raises:

Type Description
RuntimeError

If collection or index creation fails.
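The combination of idempotency and locking described above is commonly implemented as a check, a lock, and a second check. The following is a minimal sketch of that pattern using `asyncio.Lock`. The `IndexManager` class and its attributes are hypothetical and do not reflect the actual implementation.

```python
import asyncio


class IndexManager:
    """Hypothetical illustration of an idempotent, lock-guarded setup step."""

    def __init__(self):
        self._lock = asyncio.Lock()
        self._ready = False
        self.creations = 0  # Counts how often the expensive step actually ran.

    async def ensure_index(self):
        if self._ready:
            return  # Fast path: already created, nothing to do.
        async with self._lock:
            if self._ready:
                return  # Another coroutine finished while we waited.
            await asyncio.sleep(0)  # Stands in for the round trip to the server.
            self.creations += 1
            self._ready = True


async def main():
    manager = IndexManager()
    # Five concurrent callers; only one should perform the creation.
    await asyncio.gather(*(manager.ensure_index() for _ in range(5)))
    return manager.creations


creations = asyncio.run(main())
```

The second check inside the lock is what prevents coroutines that were queued behind the first caller from repeating the creation.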

retrieve(query, filters=None, options=None, **kwargs) async

Read records from the datastore using text-based similarity search with optional filtering.

Parameters:

Name Type Description Default
query str

Input text to embed and search with.

required
filters FilterClause | QueryFilter | None

Query filters to apply. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
options QueryOptions | None

Query options like limit and sorting. Defaults to None.

None
**kwargs Any

Backend-specific parameters (e.g., search_params).

{}

Returns:

Type Description
list[Chunk]

Query results ordered by similarity score.
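Conceptually, text-based retrieval embeds the query and then ranks stored items by similarity to that embedding. The sketch below shows this flow end to end with a toy letter-count embedder and a brute-force cosine ranking. Both are hypothetical stand-ins for the embedding model invoker and the Milvus search, chosen so that the example runs without any external service.

```python
import asyncio
import math


async def embed(text):
    """Hypothetical toy embedder: counts of the letters a, b and c."""
    return [float(text.count(letter)) for letter in "abc"]


def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


async def retrieve(query, documents, limit=2):
    """Embed the query, score every document, return the top `limit`."""
    query_vector = await embed(query)
    scored = []
    for document in documents:
        scored.append((cosine(query_vector, await embed(document)), document))
    scored.sort(key=lambda pair: pair[0], reverse=True)  # Most similar first.
    return [document for _, document in scored[:limit]]


results = asyncio.run(retrieve("aab", ["abc", "aa", "cc"]))
```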

retrieve_by_vector(vector, filters=None, options=None, search_params=None, **kwargs) async

Direct vector similarity search.

Parameters:

Name Type Description Default
vector Vector

Query embedding vector.

required
filters FilterClause | QueryFilter | None

Query filters to apply. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
options QueryOptions | None

Query options like limit and sorting. Defaults to None.

None
search_params dict[str, Any] | None

Search parameters for Milvus. If None, default search parameters based on distance metric will be used. Defaults to None.

None
**kwargs Any

Backend-specific parameters.

{}

Returns:

Type Description
list[Chunk]

List of chunks ordered by similarity score.
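Milvus search parameters are usually passed as a dictionary with a `metric_type` and a nested `params` mapping. The sketch below shows what a metric-based default and a caller-supplied override could look like. The `default_search_params` helper is hypothetical, and the defaults this class actually applies are not documented here.

```python
SUPPORTED_METRICS = {"L2", "IP", "COSINE"}


def default_search_params(distance_metric):
    """Hypothetical default: metric type only, no index-specific tuning."""
    if distance_metric not in SUPPORTED_METRICS:
        raise ValueError(f"Unsupported distance metric: {distance_metric}")
    return {"metric_type": distance_metric, "params": {}}


# What a caller might pass instead, assuming an IVF_FLAT index:
# `nprobe` controls how many clusters are scanned per query.
custom_search_params = {"metric_type": "COSINE", "params": {"nprobe": 16}}

defaults = default_search_params("L2")
```

The `metric_type` in a custom dictionary should match the metric the index was built with; otherwise the search is likely to be rejected.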

update(update_values, filters=None, **kwargs) async

Update existing records in the datastore.

Parameters:

Name Type Description Default
update_values dict[str, Any]

Values to update. Supported keys are "content", which updates the document content, and "metadata", which updates the metadata. If the content is updated, the vectors are re-embedded.

required
filters FilterClause | QueryFilter | None

Filters to select records to update. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None, in which case no operation is performed (no-op).

None
**kwargs Any

Backend-specific parameters (e.g., partition_name).

{}
Note

If filters is None, no operation is performed (no-op).
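The update rules above (re-embed when the content changes, do nothing without filters) can be illustrated with a small in-memory sketch. Everything in it is hypothetical: the toy `embed` function, the record layout, the predicate-style `filters`, and the choice to merge metadata rather than replace it.

```python
import asyncio


async def embed(text):
    """Hypothetical toy embedder: a one-dimensional vector of the text length."""
    return [float(len(text))]


async def update(records, update_values, filters=None):
    """Apply `update_values` to matching records; return how many changed."""
    if filters is None:
        return 0  # No filter given: no-op.
    updated = 0
    for record in records:
        if not filters(record):
            continue
        if "metadata" in update_values:
            # Assumption: new metadata keys are merged into the existing ones.
            record["metadata"].update(update_values["metadata"])
        if "content" in update_values:
            record["content"] = update_values["content"]
            # New content invalidates the stored vector, so embed again.
            record["vector"] = await embed(record["content"])
        updated += 1
    return updated


records = [{"id": "a", "content": "old", "vector": [3.0], "metadata": {"lang": "en"}}]
updated = asyncio.run(
    update(records, {"content": "newer"}, filters=lambda record: record["id"] == "a")
)
```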