Vector

Redis implementation of vector similarity search capability.

This module provides a Redis implementation of the VectorCapability protocol using RedisVL AsyncSearchIndex for vector storage and similarity search.

`RedisVectorCapability(index_name, client, em_invoker, encryption=None)`

Redis implementation of VectorCapability protocol.

This class provides vector similarity search operations using RedisVL AsyncSearchIndex for vector storage and retrieval.

Attributes:

Name	Type	Description
`index_name`	`str`	Name of the Redis index.
`client`	`Redis`	Redis async client instance.
`em_invoker`	`BaseEMInvoker`	Embedding model for vectorization.
`index`	`Any`	RedisVL AsyncSearchIndex instance.

Initialize the Redis vector capability.

The schema is inferred automatically from the chunks when a new index is created, or detected from the existing index when operations run against it.

Parameters:

Name	Type	Description	Default
`index_name`	`str`	Name of the Redis index.	required
`client`	`Redis`	Redis async client instance.	required
`em_invoker`	`BaseEMInvoker`	Embedding model for vectorization.	required
`encryption`	`EncryptionCapability \| None`	Encryption capability for field-level encryption. Defaults to None.	`None`

`em_invoker` `property`

Returns the EM Invoker instance.

Returns:

Name	Type	Description
`BaseEMInvoker`	`BaseEMInvoker`	The EM Invoker instance.

`clear()` `async`

Clear all records from the datastore.

`create(data)` `async`

Add chunks to the vector store with automatic embedding generation.

If encryption is enabled, this method encrypts the content and metadata of the chunks according to the encryption configuration. Embeddings are generated from the plaintext first and the chunks are encrypted afterwards, so the embeddings represent the original content rather than ciphertext.

If the index does not exist, the schema will be inferred from the chunks being created.

Parameters:

Name	Type	Description	Default
`data`	`Chunk \| list[Chunk]`	Single chunk or list of chunks to add.	required

Raises:

Type	Description
`ValueError`	If data structure is invalid or chunk content is invalid.
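The embed-then-encrypt ordering that `create` follows when encryption is enabled can be sketched as below. This is an illustration only, not the library's implementation: `toy_embed`, `toy_encrypt`, and `prepare_record` are hypothetical stand-ins for the embedding model and the encryption capability.

```python
import hashlib


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for an embedding model: maps text to a small vector."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255 for byte in digest[:4]]


def toy_encrypt(text: str) -> str:
    """Hypothetical stand-in for field-level encryption (NOT real cryptography)."""
    return text.encode("utf-8").hex()


def prepare_record(content: str, encrypt: bool) -> dict:
    """Embed the plaintext first, then encrypt the content that gets stored."""
    vector = toy_embed(content)  # step 1: the embedding sees the plaintext
    stored = toy_encrypt(content) if encrypt else content  # step 2: encrypt for storage
    return {"content": stored, "vector": vector}


record = prepare_record("hello world", encrypt=True)
plain = prepare_record("hello world", encrypt=False)
```

Because the vector is computed before encryption, the encrypted and unencrypted records carry the same embedding, which is what keeps similarity search meaningful on encrypted data.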

`create_from_vector(chunk_vectors)` `async`

Add pre-computed vectors directly.

If encryption is enabled, this method encrypts the content and metadata of the chunks according to the encryption configuration.

If the index does not exist, the schema will be inferred from the chunks being created.

Parameters:

Name	Type	Description	Default
`chunk_vectors`	`list[tuple[Chunk, Vector]]`	List of tuples containing chunks and their corresponding vectors.	required

Raises:

Type	Description
`ValueError`	If chunk content is invalid.

`delete(filters=None)` `async`

Delete records from the datastore.

Processes deletions in batches to avoid loading all matching documents into memory. If `filters` is None, the call is a no-op.

Parameters:

Name	Type	Description	Default
`filters`	`FilterClause \| QueryFilter \| None`	Filters to select records to delete. Defaults to None.	`None`
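A minimal sketch of batched deletion, including the no-op behaviour when no filter is given. It uses a plain dictionary as a stand-in for the index; `delete_in_batches` and the predicate argument are hypothetical and do not reflect how the library actually queries Redis.

```python
from itertools import islice
from typing import Callable, Optional


def delete_in_batches(
    store: dict[str, dict],
    predicate: Optional[Callable[[dict], bool]],
    batch_size: int = 2,
) -> int:
    """Delete matching documents a batch at a time; return how many were removed."""
    if predicate is None:
        return 0  # mirrors the documented no-op when filters is None

    deleted = 0
    while True:
        # Collect at most `batch_size` matching IDs instead of every match at once.
        matches = (doc_id for doc_id, doc in store.items() if predicate(doc))
        batch = list(islice(matches, batch_size))
        if not batch:
            return deleted
        for doc_id in batch:
            del store[doc_id]
        deleted += len(batch)
```

The batch size of 2 is chosen only to make the batching visible in a small example.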

`ensure_index(filterable_fields=None)` `async`

Ensure Redis vector index exists, creating it if necessary.

This method is idempotent: if the index already exists, it skips creation and returns early.

Parameters:

Name	Type	Description	Default
`filterable_fields`	`list[dict[str, Any]] \| None`	Filterable field configurations to use when creating a new index. Each entry is a dictionary with "name" and "type" keys, for example: `[{"name": "metadata.category", "type": "tag"}, {"name": "metadata.score", "type": "numeric"}]`. If omitted and the index does not exist, a default schema is created with only the basic fields (id, content, metadata, vector). Defaults to None.	`None`

Raises:

Type	Description
`RuntimeError`	If index creation fails.
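The shape of `filterable_fields` can be checked before it is passed to `ensure_index`. The sketch below builds the list from the example in the parameter description; `validate_filterable_fields` is a hypothetical helper, not part of the library, and the final call is shown only as a comment because it needs a configured capability instance.

```python
from typing import Any


def validate_filterable_fields(fields: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Hypothetical helper: ensure every entry has the "name" and "type" keys."""
    for field in fields:
        missing = {"name", "type"} - set(field)
        if missing:
            raise ValueError(
                f"filterable field {field!r} is missing keys: {sorted(missing)}"
            )
    return fields


filterable_fields = validate_filterable_fields(
    [
        {"name": "metadata.category", "type": "tag"},
        {"name": "metadata.score", "type": "numeric"},
    ]
)

# With an initialized capability (not constructed here), the call would be:
# await vector_capability.ensure_index(filterable_fields=filterable_fields)
```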

`retrieve(query, filters=None, options=None)` `async`

Read records from the datastore using text-based similarity search with optional filtering.

Parameters:

Name	Type	Description	Default
`query`	`str`	Input text to embed and search with.	required
`filters`	`FilterClause \| QueryFilter \| None`	Query filters to apply. Defaults to None.	`None`
`options`	`QueryOptions \| None`	Query options like limit and sorting. Defaults to None.	`None`

Returns:

Type	Description
`list[Chunk]`	Query results ordered by similarity score.

`retrieve_by_vector(vector, filters=None, options=None)` `async`

Direct vector similarity search.

Parameters:

Name	Type	Description	Default
`vector`	`Vector`	Query embedding vector.	required
`filters`	`FilterClause \| QueryFilter \| None`	Query filters to apply. FilterClause objects are automatically converted to QueryFilter internally. Defaults to None.	`None`
`options`	`QueryOptions \| None`	Query options like limit and sorting. Defaults to None.	`None`

Returns:

Type	Description
`list[Chunk]`	List of chunks ordered by similarity score.
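To make "ordered by similarity score" concrete, the sketch below ranks stored vectors against a query vector in pure Python. It assumes cosine similarity purely for illustration; the metric actually used depends on how the Redis index is configured, and `cosine_similarity` and `rank_by_similarity` are hypothetical names.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_similarity(
    query_vector: list[float],
    stored: dict[str, list[float]],
    limit: int,
) -> list[str]:
    """Return up to `limit` IDs, most similar to the query vector first."""
    scored = sorted(
        stored.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [chunk_id for chunk_id, _ in scored[:limit]]
```

A real index would use an approximate or indexed search rather than scoring every stored vector, but the ordering of the results follows the same idea.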

`update(update_values, filters=None)` `async`

Update existing records in the datastore.

If encryption is enabled, this method encrypts the content and metadata in `update_values` according to the encryption configuration.

Warning

Filters cannot target encrypted fields. While update_values are encrypted before being written, the filters used to identify which documents to update are NOT encrypted. If you try to update documents based on an encrypted metadata field (e.g., filters=F.eq("metadata.secret", "val")), the filter will fail to match because the filter value is not encrypted but the stored data is. Always use non-encrypted fields (like 'id') in filters when working with encrypted data.

Processes updates in batches to avoid loading all matching documents into memory:

1. Get the document IDs matching the filters.
2. In batches, get the document data for those IDs.
3. In batches, update the document data.
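The three steps above can be sketched with a dictionary standing in for the index. `update_in_batches` is a hypothetical illustration, and its shallow merge of `update_values` is an assumption; the library may merge nested fields such as metadata differently.

```python
from typing import Any, Callable


def update_in_batches(
    store: dict[str, dict[str, Any]],
    predicate: Callable[[dict[str, Any]], bool],
    update_values: dict[str, Any],
    batch_size: int = 2,
) -> int:
    """Update matching documents batch by batch; return how many were updated."""
    # Step 1: collect only the IDs of the matching documents.
    matching_ids = [doc_id for doc_id, doc in store.items() if predicate(doc)]

    updated = 0
    for start in range(0, len(matching_ids), batch_size):
        batch_ids = matching_ids[start:start + batch_size]
        # Step 2: fetch the document data for this batch of IDs.
        docs = {doc_id: store[doc_id] for doc_id in batch_ids}
        # Step 3: apply the new values and write the batch back.
        for doc_id, doc in docs.items():
            store[doc_id] = {**doc, **update_values}  # shallow merge (assumption)
        updated += len(batch_ids)
    return updated
```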

Examples:

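Update metadata for chunks matching a filter:

```python
from gllm_datastore.core.filters import filter as F

# Assumes `vector_capability` is an initialized RedisVectorCapability
# and that this call runs inside an async function.
await vector_capability.update(
    update_values={"metadata": {"status": "published"}},
    filters=F.eq("id", "chunk_id")
)
```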

Update encrypted data (encryption must be enabled):

```python
from gllm_datastore.core.filters import filter as F

# Correct: Use non-encrypted 'id' field in filter
await vector_capability.update(
    update_values={"content": "new encrypted content"},
    filters=F.eq("id", "chunk_id")
)

# Incorrect: Using encrypted field in filter will fail to match
# await vector_capability.update(
#     update_values={"metadata": {"status": "published"}},
#     filters=F.eq("metadata.secret_key", "value")  # Won't match!
# )
```

Parameters:

Name	Type	Description	Default
`update_values`	`dict[str, Any]`	Values to update.	required
`filters`	`FilterClause \| QueryFilter \| None`	Filters to select records to update. FilterClause objects are automatically converted to QueryFilter internally. Cannot use encrypted fields in filters. Defaults to None.	`None`
