Vector

In-memory implementation of vector similarity search capability.

This module provides an in-memory implementation of the VectorCapability protocol using dictionary-based storage optimized for development and testing scenarios.

Authors

Kadek Denaya (kadek.d.r.diana@gdplabs.id)

InMemoryVectorCapability(em_invoker, store=None)

In-memory implementation of VectorCapability protocol.

This class provides vector similarity search operations using pure Python data structures optimized for development and testing.

Attributes:

Name Type Description
store dict[str, Chunk]

Dictionary storing Chunk objects with their IDs as keys.

em_invoker BaseEMInvoker

Embedding model invoker used to convert text into vectors.

Initialize the in-memory vector capability.

Parameters:

Name Type Description Default
em_invoker BaseEMInvoker

Embedding model invoker used to convert text into vectors.

required
store dict[str, Any] | None

Dictionary storing Chunk objects with their IDs as keys. Defaults to None.

None

clear() async

Clear all vectors from the store.

create(data) async

Add chunks to the vector store with automatic embedding generation.

Parameters:

Name Type Description Default
data Chunk | list[Chunk]

Single chunk or list of chunks to add.

required
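
To make the embed-on-insert behaviour concrete, here is a minimal sketch of a dictionary-backed store whose `create` accepts either one chunk or a list and embeds each chunk's content before storing it. `SketchChunk`, `ToyEmbedder`, `ToyVectorStore`, the `invoke` method, and the separate `vectors` dictionary are hypothetical stand-ins introduced for illustration. They are not the library's classes, and the actual implementation may store vectors differently.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Optional, Union


@dataclass
class SketchChunk:
    """Hypothetical stand-in for the library's Chunk type."""

    id: str
    content: str
    metadata: dict[str, Any] = field(default_factory=dict)


class ToyEmbedder:
    """Hypothetical stand-in for an embedding invoker (not a real model)."""

    async def invoke(self, text: str) -> list[float]:
        # Toy embedding: character count and vowel count as a 2-D vector.
        vowels = sum(ch in "aeiou" for ch in text.lower())
        return [float(len(text)), float(vowels)]


class ToyVectorStore:
    """Illustrative dictionary-backed store keyed by chunk ID."""

    def __init__(
        self,
        embedder: ToyEmbedder,
        store: Optional[dict[str, SketchChunk]] = None,
    ) -> None:
        self.embedder = embedder
        # Assumption: a fresh dict is used when no store is supplied.
        self.store = store if store is not None else {}
        self.vectors: dict[str, list[float]] = {}

    async def create(self, data: Union[SketchChunk, list[SketchChunk]]) -> None:
        # Normalize a single chunk into a one-element list.
        chunks = data if isinstance(data, list) else [data]
        for chunk in chunks:
            self.vectors[chunk.id] = await self.embedder.invoke(chunk.content)
            self.store[chunk.id] = chunk


async def demo() -> ToyVectorStore:
    toy = ToyVectorStore(ToyEmbedder())
    await toy.create(SketchChunk(id="a", content="Paris"))
    await toy.create(
        [SketchChunk(id="b", content="Rome"), SketchChunk(id="c", content="Oslo")]
    )
    return toy


toy_store = asyncio.run(demo())
```

Keying both dictionaries by chunk ID means that creating a chunk with an existing ID overwrites the earlier entry in this sketch. Whether the real class behaves the same way is not stated in this reference.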

create_from_vector(chunk_vectors) async

Add pre-computed vectors directly.

Parameters:

Name Type Description Default
chunk_vectors list[tuple[Chunk, Vector]]

List of tuples containing chunks and their corresponding vectors.

required
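
The difference from `create` is that no embedding call happens: each vector is supplied by the caller. The sketch below illustrates that idea with plain dictionaries. `SketchChunk`, the `Vector` alias, and the free function `create_from_vector_sketch` are hypothetical names used only for this example, not the library's API.

```python
import asyncio
from dataclasses import dataclass

# Assumption: a vector is a plain list of floats in this sketch.
Vector = list[float]


@dataclass
class SketchChunk:
    """Hypothetical stand-in for the library's Chunk type."""

    id: str
    content: str


async def create_from_vector_sketch(
    store: dict[str, SketchChunk],
    vectors: dict[str, Vector],
    chunk_vectors: list[tuple[SketchChunk, Vector]],
) -> None:
    # Each tuple pairs a chunk with its pre-computed vector, so no
    # embedding model is invoked here.
    for chunk, vector in chunk_vectors:
        store[chunk.id] = chunk
        vectors[chunk.id] = list(vector)  # copy to avoid aliasing caller data


chunk_store: dict[str, SketchChunk] = {}
vector_store: dict[str, Vector] = {}
asyncio.run(
    create_from_vector_sketch(
        chunk_store,
        vector_store,
        [
            (SketchChunk(id="a", content="Paris"), [0.1, 0.9]),
            (SketchChunk(id="b", content="Rome"), [0.8, 0.2]),
        ],
    )
)
```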

delete(filters=None) async

Delete records from the datastore.

Usage Example
await vector_capability.delete(
    filters=QueryFilter(conditions={"metadata.category": "AI"}),
)

This will delete all chunks from the vector store that match the filters.

Parameters:

Name Type Description Default
filters QueryFilter | None

Filters to select records to delete. Defaults to None, in which case no operation is performed (no-op).

None
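
As a rough illustration of filter-driven deletion, the sketch below resolves dotted keys such as `metadata.category` against nested dictionaries and removes every matching record, doing nothing when no filter is given. The helpers `resolve`, `matches`, and `delete_sketch`, the use of plain dictionaries in place of `Chunk` and `QueryFilter`, and the returned count are all assumptions made for this example. The library's own matching rules may differ.

```python
import asyncio
from typing import Any, Optional

_MISSING = object()  # sentinel for "key path not present"


def resolve(record: dict[str, Any], dotted_key: str) -> Any:
    """Follow a dotted path like 'metadata.category' through nested dicts."""
    current: Any = record
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def matches(record: dict[str, Any], conditions: dict[str, Any]) -> bool:
    """True when every condition equals the value found at its path."""
    return all(resolve(record, key) == value for key, value in conditions.items())


async def delete_sketch(
    store: dict[str, dict[str, Any]],
    conditions: Optional[dict[str, Any]] = None,
) -> int:
    # Mirrors the documented behaviour: no filter means no operation.
    if conditions is None:
        return 0
    # Collect IDs first so the dict is not mutated while iterating.
    doomed = [cid for cid, record in store.items() if matches(record, conditions)]
    for cid in doomed:
        del store[cid]
    return len(doomed)


records = {
    "1": {"id": "1", "content": "Neural nets", "metadata": {"category": "AI"}},
    "2": {"id": "2", "content": "Sourdough", "metadata": {"category": "Baking"}},
    "3": {"id": "3", "content": "Transformers", "metadata": {"category": "AI"}},
}
noop_count = asyncio.run(delete_sketch(records, None))
size_after_noop = len(records)
deleted_count = asyncio.run(delete_sketch(records, {"metadata.category": "AI"}))
```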

retrieve(query, filters=None, options=None) async

Read records from the datastore using text-based similarity search with optional filtering.

Usage Example
await vector_capability.retrieve(
    query="What is the capital of France?",
    filters=QueryFilter(conditions={"metadata.source": "wikipedia"}),
    options=QueryOptions(limit=2),
)

This will retrieve the top 2 chunks by similarity score from the vector store that match the query and the filters. The chunks will be sorted by score in descending order.

Parameters:

Name Type Description Default
query str

Input text to embed and search with.

required
filters QueryFilter | None

Query filters to apply. Defaults to None.

None
options QueryOptions | None

Query options such as limit and sorting. Defaults to None, in which case no sorting is applied and the top 10 chunks are returned.

None

Returns:

Type Description
list[Chunk]

Top-ranked chunks by similarity score.
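
The retrieval flow described above (embed the query, score stored vectors, sort by score in descending order, keep the top results) can be sketched as follows. This reference does not say which similarity metric the class uses, so cosine similarity is an assumption here, as are the helper names `toy_embed`, `cosine`, and `retrieve_sketch`. Filtering is left out to keep the example short.

```python
import asyncio
import math


def toy_embed(text: str) -> list[float]:
    """Toy embedding: counts of the letters 'a', 'e', and 'o' (not a real model)."""
    lowered = text.lower()
    return [float(lowered.count(ch)) for ch in "aeo"]


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity, returning 0.0 when either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


async def retrieve_sketch(
    query: str,
    vectors: dict[str, list[float]],
    limit: int = 10,  # matches the documented default of 10 results
) -> list[tuple[str, float]]:
    query_vector = toy_embed(query)
    scored = [(cid, cosine(query_vector, vec)) for cid, vec in vectors.items()]
    # Highest similarity first, then truncate to the requested limit.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


stored_vectors = {
    "x": [1.0, 0.0, 0.0],
    "y": [0.0, 1.0, 0.0],
    "z": [1.0, 1.0, 0.0],
}
top_two = asyncio.run(retrieve_sketch("banana", stored_vectors, limit=2))
```

With the toy embedding, "banana" maps to a vector that points along the first axis only, so `x` ranks first, `z` second, and `y` is cut off by the limit.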

retrieve_by_vector(vector, filters=None, options=None) async

Direct vector similarity search.

Parameters:

Name Type Description Default
vector Vector

Query embedding vector.

required
filters QueryFilter | None

Query filters to apply. Defaults to None.

None
options QueryOptions | None

Query options such as limit and sorting. Defaults to None, in which case no sorting is applied and the top 10 chunks are returned.

None

Returns:

Type Description
list[Chunk]

List of chunks ordered by similarity score.

update(update_values, filters=None, **kwargs) async

Update existing records in the datastore.

Example
1. Update certain metadata of chunks that match specific filters.
await vector_capability.update(
    update_values={"metadata": {"status": "published"}},
    filters=QueryFilter(conditions={"metadata.status": "draft"}),
)

2. Update the content of a chunk with a specific ID. This will also regenerate the vector of the chunk.
await vector_capability.update(
    update_values={"content": "new_content"},
    filters=QueryFilter(conditions={"id": "unique_id"}),
)

Parameters:

Name Type Description Default
update_values dict[str, Any]

Values to update.

required
filters QueryFilter | None

Filters to select records to update. Defaults to None, in which case no operation is performed (no-op).

None
**kwargs Any

Datastore-specific parameters.

{}

Raises:

Type Description
ValueError

If content is empty.
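
The update rules documented here (no filter means no-op, a content change regenerates the vector, empty content raises `ValueError`) can be illustrated with a small sketch. The names `toy_embed` and `update_sketch`, the `select` predicate standing in for `QueryFilter`, the returned count, and the choice to merge metadata rather than replace it are assumptions for this example, not confirmed library behaviour.

```python
import asyncio
from typing import Any, Callable, Optional

Record = dict[str, Any]


def toy_embed(text: str) -> list[float]:
    """Toy embedding: text length as a 1-D vector (not a real model)."""
    return [float(len(text))]


async def update_sketch(
    store: dict[str, Record],
    vectors: dict[str, list[float]],
    update_values: dict[str, Any],
    select: Optional[Callable[[Record], bool]] = None,
) -> int:
    # No selector: nothing is updated, mirroring the documented no-op.
    if select is None:
        return 0
    if "content" in update_values and not update_values["content"]:
        raise ValueError("content must not be empty")
    touched = 0
    for chunk_id, record in store.items():
        if not select(record):
            continue
        for key, value in update_values.items():
            if key == "metadata":
                record["metadata"].update(value)  # assumption: merge, not replace
            else:
                record[key] = value
        if "content" in update_values:
            # New content invalidates the old vector, so re-embed it.
            vectors[chunk_id] = toy_embed(record["content"])
        touched += 1
    return touched


docs = {"u1": {"id": "u1", "content": "old", "metadata": {"status": "draft", "lang": "en"}}}
doc_vectors = {"u1": toy_embed("old")}

published = asyncio.run(
    update_sketch(docs, doc_vectors, {"metadata": {"status": "published"}},
                  lambda r: r["metadata"].get("status") == "draft")
)
vector_after_metadata_update = list(doc_vectors["u1"])
rewritten = asyncio.run(
    update_sketch(docs, doc_vectors, {"content": "new_content"},
                  lambda r: r["id"] == "u1")
)
```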