Fulltext
Milvus implementation of fulltext search and CRUD capability.
This module provides a Milvus implementation of the FulltextCapability protocol for basic text-based CRUD operations and filtering.
MilvusFulltextCapability(collection_name, client, query_field='content', id_max_length=100, content_max_length=65535)
Milvus implementation of FulltextCapability protocol.
This class provides document CRUD operations and filtering using Milvus.
Attributes:
| Name | Type | Description |
|---|---|---|
| `collection_name` | `str` | The name of the Milvus collection. |
| `client` | `AsyncMilvusClient` | Async Milvus client instance. |
| `query_field` | `str` | The field name to use for text content. |
Initialize the Milvus fulltext capability.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `collection_name` | `str` | The name of the Milvus collection. | required |
| `client` | `AsyncMilvusClient` | The async Milvus client instance. | required |
| `query_field` | `str` | The field name to use for text content. Defaults to "content". | `'content'` |
| `id_max_length` | `int` | Maximum length for ID field. Defaults to 100. | `100` |
| `content_max_length` | `int` | Maximum length for content field. Defaults to 65535. | `65535` |
clear(**kwargs)
async
Clear all records from the datastore.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `**kwargs` | `Any` | Backend-specific parameters. | `{}` |
create(data, **kwargs)
async
delete(filters=None, options=None, **kwargs)
async
Delete records from the datastore.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `filters` | `FilterClause \| QueryFilter \| None` | Filters to select records to delete. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None, in which case no operation is performed (no-op). | `None` |
| `options` | `QueryOptions \| None` | Query options for sorting and limiting deletions. Defaults to None. | `None` |
| `**kwargs` | `Any` | Backend-specific parameters. | `{}` |
Note
If filters is None, no operation is performed (no-op). When options with limit or order_by are provided, records are first retrieved and then deleted by ID. Otherwise, deletion uses filter expressions directly.
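The branching described in the note can be sketched as a small decision function. This is only an illustration of the documented behavior, not the library's code: `QueryOptionsSketch` and `choose_delete_strategy` are hypothetical names, and the stand-in options class models only the `limit` and `order_by` fields mentioned above.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class QueryOptionsSketch:
    """Hypothetical stand-in for QueryOptions (only limit and order_by)."""

    limit: Optional[int] = None
    order_by: Optional[str] = None


def choose_delete_strategy(
    filters: Optional[Any], options: Optional[QueryOptionsSketch] = None
) -> str:
    """Return which deletion path the note describes for the given arguments."""
    if filters is None:
        # Documented behavior: no filters means nothing is deleted.
        return "noop"
    if options is not None and (
        options.limit is not None or options.order_by is not None
    ):
        # limit/order_by cannot be expressed in a plain filter expression,
        # so matching records are retrieved first and then deleted by ID.
        return "retrieve_then_delete_by_id"
    # Otherwise the filter expression is used for deletion directly.
    return "delete_by_filter_expression"
```

For example, `choose_delete_strategy({"source": "a"}, QueryOptionsSketch(limit=10))` selects the retrieve-then-delete path, while the same call without options selects direct deletion by filter expression.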
ensure_index()
async
Ensure collection exists with proper schema for fulltext capability.
This method is idempotent: if the collection already exists, it skips creation and returns early.
Raises:
| Type | Description |
|---|---|
| `RuntimeError` | If collection creation fails. |
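The idempotent check-then-create pattern described for `ensure_index()` might look roughly like the sketch below. It runs against a hypothetical in-memory `FakeMilvusClient`, whose `has_collection` and `create_collection` methods are assumptions made for illustration and are not taken from the `AsyncMilvusClient` API. The real method also builds a schema, which is omitted here.

```python
import asyncio


class FakeMilvusClient:
    """Hypothetical in-memory stand-in; not the AsyncMilvusClient API."""

    def __init__(self) -> None:
        self.collections: set[str] = set()
        self.create_calls = 0

    async def has_collection(self, name: str) -> bool:
        return name in self.collections

    async def create_collection(self, name: str) -> None:
        self.create_calls += 1
        self.collections.add(name)


async def ensure_index_sketch(client: FakeMilvusClient, collection_name: str) -> None:
    """Create the collection only if it is missing."""
    if await client.has_collection(collection_name):
        return  # already exists: skip creation and return early
    try:
        await client.create_collection(collection_name)
    except Exception as exc:
        # Surface any creation failure as RuntimeError, as documented.
        raise RuntimeError(
            f"Failed to create collection {collection_name!r}"
        ) from exc


async def demo() -> int:
    client = FakeMilvusClient()
    await ensure_index_sketch(client, "docs")
    await ensure_index_sketch(client, "docs")  # second call is a no-op
    return client.create_calls
```

Running `asyncio.run(demo())` returns 1, because the second call finds the collection and does not create it again.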
retrieve(filters=None, options=None, **kwargs)
async
Read records from the datastore with optional filtering.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `filters` | `FilterClause \| QueryFilter \| None` | Query filters to apply. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None. | `None` |
| `options` | `QueryOptions \| None` | Query options like limit and sorting. Defaults to None. | `None` |
| `**kwargs` | `Any` | Backend-specific parameters. | `{}` |
Returns:
| Type | Description |
|---|---|
| `list[Chunk]` | Query results. |
retrieve_fuzzy(query, max_distance=2, filters=None, options=None, max_candidates=1000, **kwargs)
async
Find records that fuzzy match the query within distance threshold.
This method retrieves candidates from Milvus using metadata filters first, then performs client-side fuzzy matching using Levenshtein distance. The max_candidates parameter limits the initial query to reduce processing time, and the final limit from options is applied after sorting by distance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Text to fuzzy match against. | required |
| `max_distance` | `int` | Maximum edit distance for matches (Levenshtein distance). Defaults to 2. | `2` |
| `filters` | `FilterClause \| QueryFilter \| None` | Optional metadata filters to apply. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None. | `None` |
| `options` | `QueryOptions \| None` | Query options (limit, sorting, etc.). Defaults to None. The limit is applied client-side after distance sorting. | `None` |
| `max_candidates` | `int` | Maximum number of candidates to retrieve from Milvus before applying fuzzy matching. Defaults to 1000. This helps limit processing time for large datasets. | `1000` |
| `**kwargs` | `Any` | Backend-specific parameters. | `{}` |
Returns:
| Type | Description |
|---|---|
| `list[Chunk]` | Matched chunks ordered by distance (ascending) or by options.order_by if specified. |
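The client-side stage of `retrieve_fuzzy` (cap the candidates, score each by Levenshtein distance, keep those within `max_distance`, sort ascending, then apply the limit) can be sketched in plain Python. This is a minimal sketch under stated assumptions: the candidates are plain strings rather than `Chunk` objects, the query is compared against the whole text of each candidate, and `options.order_by` is not handled. The helper names `levenshtein` and `fuzzy_rank` are hypothetical, and the library may compute distances differently.

```python
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance via the classic two-row dynamic programming table."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def fuzzy_rank(
    query: str,
    candidates: list[str],
    max_distance: int = 2,
    max_candidates: int = 1000,
    limit: Optional[int] = None,
) -> list[str]:
    """Filter candidates by edit distance, sort ascending, then apply limit."""
    scored = []
    for text in candidates[:max_candidates]:  # cap work before scoring
        distance = levenshtein(query, text)
        if distance <= max_distance:
            scored.append((distance, text))
    scored.sort(key=lambda pair: pair[0])  # stable: ties keep input order
    ranked = [text for _, text in scored]
    return ranked if limit is None else ranked[:limit]
```

Because the limit is applied only after sorting, a small limit still returns the closest matches among the retrieved candidates. Matches outside the first `max_candidates` records are never scored.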
update(update_values, filters=None, **kwargs)
async
Update existing records in the datastore.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `update_values` | `dict[str, Any]` | Values to update. Supports "content" for updating document content and "metadata" for updating metadata. Other keys are treated as direct metadata updates. | required |
| `filters` | `FilterClause \| QueryFilter \| None` | Filters to select records to update. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None. | `None` |
| `**kwargs` | `Any` | Backend-specific parameters (e.g., partition_name). | `{}` |
Note
If filters is None, no operation is performed (no-op).
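One way to read the `update_values` description is as a split into a content update and a merged metadata update. The sketch below illustrates that reading only. `split_update_values` is a hypothetical helper, and letting direct keys override entries from the "metadata" dict is an assumption, since the documentation does not state a precedence.

```python
from typing import Any, Optional


def split_update_values(
    update_values: dict[str, Any],
) -> tuple[Optional[str], dict[str, Any]]:
    """Separate a content update from metadata updates."""
    content = update_values.get("content")
    # Start from the explicit "metadata" dict, if one was given.
    metadata = dict(update_values.get("metadata") or {})
    for key, value in update_values.items():
        if key not in ("content", "metadata"):
            # Any other key is treated as a direct metadata update.
            # Assumption: direct keys win over the "metadata" dict.
            metadata[key] = value
    return content, metadata
```

For example, `split_update_values({"content": "new text", "metadata": {"lang": "en"}, "source": "wiki"})` yields `"new text"` as the content and a metadata dict containing both `lang` and `source`.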