Fulltext

Milvus implementation of fulltext search and CRUD capability.

This module provides a Milvus implementation of the FulltextCapability protocol for basic text-based CRUD operations and filtering.

MilvusFulltextCapability(collection_name, client, query_field='content', id_max_length=100, content_max_length=65535)

Milvus implementation of FulltextCapability protocol.

This class provides document CRUD operations and filtering using Milvus.

Attributes:

Name Type Description
collection_name str

The name of the Milvus collection.

client AsyncMilvusClient

Async Milvus client instance.

query_field str

The field name to use for text content.

Initialize the Milvus fulltext capability.

Parameters:

Name Type Description Default
collection_name str

The name of the Milvus collection.

required
client AsyncMilvusClient

The async Milvus client instance.

required
query_field str

The field name to use for text content. Defaults to "content".

'content'
id_max_length int

Maximum length for ID field. Defaults to 100.

100
content_max_length int

Maximum length for content field. Defaults to 65535.

65535

clear(**kwargs) async

Clear all records from the datastore.

Parameters:

Name Type Description Default
**kwargs Any

Backend-specific parameters.

{}

create(data, **kwargs) async

Create new records in the datastore.

Parameters:

Name Type Description Default
data Chunk | list[Chunk]

Data to create (single item or collection).

required
**kwargs Any

Backend-specific parameters (e.g., partition_name).

{}

Raises:

Type Description
ValueError

If data structure is invalid.

delete(filters=None, options=None, **kwargs) async

Delete records from the datastore.

Parameters:

Name Type Description Default
filters FilterClause | QueryFilter | None

Filters to select records to delete. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None, in which case no operation is performed (no-op).

None
options QueryOptions | None

Query options for sorting and limiting deletions. Defaults to None.

None
**kwargs Any

Backend-specific parameters.

{}

Note

If filters is None, no operation is performed (no-op). When options with limit or order_by are provided, records are first retrieved and then deleted by ID. Otherwise, deletion uses filter expressions directly.
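The branching described in the note can be sketched as follows. This is a minimal illustration, not the library's implementation: `InMemoryStore`, its methods, and the use of a plain callable as the filter are hypothetical stand-ins for the Milvus collection and `QueryFilter`, and `order_by` and `limit` are passed directly rather than through `QueryOptions`.

```python
import asyncio


class InMemoryStore:
    """Hypothetical stand-in for a Milvus collection; not the pymilvus API."""

    def __init__(self, rows):
        self.rows = {row["id"]: row for row in rows}

    async def query(self, predicate, order_by=None, limit=None):
        # Select matching rows, optionally sorted and truncated.
        hits = [r for r in self.rows.values() if predicate(r)]
        if order_by is not None:
            hits.sort(key=lambda r: r[order_by])
        return hits if limit is None else hits[:limit]

    async def delete_by_ids(self, ids):
        for row_id in ids:
            self.rows.pop(row_id, None)

    async def delete_by_filter(self, predicate):
        self.rows = {k: r for k, r in self.rows.items() if not predicate(r)}


async def delete(store, filters=None, order_by=None, limit=None):
    if filters is None:
        return  # no filters: nothing is deleted
    if limit is not None or order_by is not None:
        # Two-step path: retrieve the selected records, then delete by ID.
        victims = await store.query(filters, order_by=order_by, limit=limit)
        await store.delete_by_ids([r["id"] for r in victims])
    else:
        # Direct path: delete everything the filter matches.
        await store.delete_by_filter(filters)
```

The two-step path exists because a limit or ordering has to be resolved into a concrete set of IDs before anything can be removed.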

ensure_index() async

Ensure collection exists with proper schema for fulltext capability.

This method is idempotent: if the collection already exists, it skips creation and returns early.

Raises:

Type Description
RuntimeError

If collection creation fails.
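The idempotent check-then-create pattern might look like the sketch below. `StubClient` and its `has_collection` / `create_collection` methods are hypothetical placeholders used to keep the example runnable; they are not claimed to match the `AsyncMilvusClient` interface, and the real method also builds a schema that is omitted here.

```python
import asyncio


class StubClient:
    """Hypothetical client stub; method names are illustrative only."""

    def __init__(self):
        self.collections = set()
        self.create_calls = 0

    async def has_collection(self, name):
        return name in self.collections

    async def create_collection(self, name):
        self.create_calls += 1
        self.collections.add(name)


async def ensure_index(client, collection_name):
    # Return early when the collection is already present.
    if await client.has_collection(collection_name):
        return
    try:
        await client.create_collection(collection_name)
    except Exception as exc:
        # Surface any backend failure as RuntimeError, as documented above.
        raise RuntimeError(
            f"Failed to create collection {collection_name!r}"
        ) from exc
```

Calling `ensure_index` twice on the same stub creates the collection only once, which makes it safe to call at application startup.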

retrieve(filters=None, options=None, **kwargs) async

Read records from the datastore with optional filtering.

Parameters:

Name Type Description Default
filters FilterClause | QueryFilter | None

Query filters to apply. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
options QueryOptions | None

Query options like limit and sorting. Defaults to None.

None
**kwargs Any

Backend-specific parameters.

{}

Returns:

Type Description
list[Chunk]

The chunks that match the given filters and options.

retrieve_fuzzy(query, max_distance=2, filters=None, options=None, max_candidates=1000, **kwargs) async

Find records that fuzzy-match the query within a distance threshold.

This method retrieves candidates from Milvus using metadata filters first, then performs client-side fuzzy matching using Levenshtein distance. The max_candidates parameter limits the initial query to reduce processing time, and the final limit from options is applied after sorting by distance.

Parameters:

Name Type Description Default
query str

Text to fuzzy match against.

required
max_distance int

Maximum edit distance for matches (Levenshtein distance). Defaults to 2.

2
filters FilterClause | QueryFilter | None

Optional metadata filters to apply. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
options QueryOptions | None

Query options (limit, sorting, etc.). Defaults to None. The limit is applied client-side after distance sorting.

None
max_candidates int

Maximum number of candidates to retrieve from Milvus before applying fuzzy matching. Defaults to 1000. This helps limit processing time for large datasets.

1000
**kwargs Any

Backend-specific parameters.

{}

Returns:

Type Description
list[Chunk]

Matched chunks ordered by ascending distance, or by options.order_by if specified.
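The client-side stage of this method can be approximated with the sketch below. It is an illustration under stated assumptions: `Candidate` is a hypothetical stand-in for `Chunk`, `limit` is passed directly instead of through `QueryOptions`, and the query is compared against the whole content string (the source does not say whether the real implementation compares whole strings or individual tokens).

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical stand-in for a Chunk: only an id and text content."""
    id: str
    content: str


def levenshtein(a: str, b: str) -> int:
    """Dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def fuzzy_filter(candidates, query, max_distance=2, limit=None, max_candidates=1000):
    # Cap the candidate pool first, mirroring max_candidates.
    pool = candidates[:max_candidates]
    scored = [(levenshtein(query, c.content), c) for c in pool]
    # Keep only candidates within the distance threshold.
    matches = [(d, c) for d, c in scored if d <= max_distance]
    # Sort by ascending distance; the sort is stable for ties.
    matches.sort(key=lambda pair: pair[0])
    ordered = [c for _, c in matches]
    # Apply the final limit only after sorting.
    return ordered if limit is None else ordered[:limit]
```

Because the limit is applied after sorting, a small `limit` still returns the closest matches from the whole candidate pool. Matches outside the first `max_candidates` records are never seen.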

update(update_values, filters=None, **kwargs) async

Update existing records in the datastore.

Parameters:

Name Type Description Default
update_values dict[str, Any]

Values to update. Supports "content" for updating document content and "metadata" for updating metadata. Other keys are treated as direct metadata updates.

required
filters FilterClause | QueryFilter | None

Filters to select records to update. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.

None
**kwargs Any

Backend-specific parameters (e.g., partition_name).

{}

Note

If filters is None, no operation is performed (no-op).
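How the keys of update_values are routed can be sketched on a plain dictionary record. The helper `apply_update` and the record layout are hypothetical, and merging the "metadata" value into existing metadata (rather than replacing it) is an assumption that the source does not confirm.

```python
from typing import Any


def apply_update(record: dict[str, Any], update_values: dict[str, Any]) -> dict[str, Any]:
    """Return an updated copy of the record; the input is left untouched."""
    updated = {
        "id": record["id"],
        "content": record["content"],
        "metadata": dict(record.get("metadata", {})),
    }
    for key, value in update_values.items():
        if key == "content":
            # "content" replaces the document text.
            updated["content"] = value
        elif key == "metadata":
            # Assumption: metadata is merged, not replaced wholesale.
            updated["metadata"].update(value)
        else:
            # Any other key is treated as a direct metadata update.
            updated["metadata"][key] = value
    return updated
```

For example, passing `{"content": "new", "source": "wiki"}` would replace the text and add a `source` entry to the metadata in this sketch.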