Encryption capability
Unified encryption capability implementation.
This module defines the EncryptionCapability class that provides a centralized implementation for all encryption operations across datastores. It consolidates chunk-level encryption/decryption, metadata handling, and update value encryption into a single capability.
EncryptionCapability(encryptor, encrypted_fields)
Unified implementation of encryption capability.
This class provides the shared encryption and decryption logic that is identical across all backend implementations. It handles:
- Chunk content and metadata encryption/decryption
- Preparation of encrypted chunks with plaintext embeddings
- Encryption of update values
Thread Safety
This class is thread-safe only when the encryptor passed to it is thread-safe for concurrent encryption and decryption. Its methods perform no internal synchronization; thread safety is delegated to the underlying encryptor.
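One way to picture this delegation: if an encryptor is not safe for concurrent use, it can be wrapped in a lock before being handed to the capability. The sketch below is illustrative only. `ToyEncryptor`, `LockedEncryptor`, and the `encrypt`/`decrypt` method names are assumptions, since the real `BaseEncryptor` interface is not shown in this section.

```python
import threading


class ToyEncryptor:
    """Hypothetical stand-in for an encryptor; NOT real cryptography."""

    def encrypt(self, plaintext: str) -> str:
        return plaintext[::-1]

    def decrypt(self, ciphertext: str) -> str:
        return ciphertext[::-1]


class LockedEncryptor:
    """Serializes every call to the wrapped encryptor with one lock."""

    def __init__(self, inner):
        self._inner = inner
        self._lock = threading.Lock()

    def encrypt(self, plaintext: str) -> str:
        with self._lock:
            return self._inner.encrypt(plaintext)

    def decrypt(self, ciphertext: str) -> str:
        with self._lock:
            return self._inner.decrypt(ciphertext)
```

A single lock trades throughput for simplicity; an encryptor that is already thread-safe needs no such wrapper.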
Attributes:
| Name | Type | Description |
|---|---|---|
| `encryptor` | `BaseEncryptor` | The encryptor instance to use for encryption/decryption. Must be thread-safe for concurrent operations. |
| `_encrypted_fields` | `set[str]` | The set of fields to encrypt. |
Initialize the encryption capability.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `encryptor` | `BaseEncryptor` | The encryptor instance to use for encryption. | required |
| `encrypted_fields` | `set[str]` | The set of fields to encrypt. Supports: (1) the content field, `"content"`; (2) metadata fields using dot notation, such as `"metadata.secret_key"` and `"metadata.secret_value"`. | required |
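To make the field notation concrete, here is a minimal sketch of how a set of field paths could be split into a content flag and a set of metadata keys. The helper `split_encrypted_fields` is hypothetical and is not part of `EncryptionCapability`; only the `"content"` and `"metadata.<key>"` notation comes from the description above.

```python
def split_encrypted_fields(encrypted_fields: set[str]) -> tuple[bool, set[str]]:
    """Split field paths into (encrypt_content, metadata_keys)."""
    encrypt_content = "content" in encrypted_fields
    # "metadata.secret_key" -> "secret_key"
    metadata_keys = {
        field.split(".", 1)[1]
        for field in encrypted_fields
        if field.startswith("metadata.")
    }
    return encrypt_content, metadata_keys


encrypted_fields = {"content", "metadata.secret_key", "metadata.secret_value"}
encrypt_content, metadata_keys = split_encrypted_fields(encrypted_fields)
```

With the set above, `encrypt_content` is `True` and `metadata_keys` contains `secret_key` and `secret_value`.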
encryption_config
property
Get the current encryption configuration.
Returns:
| Type | Description |
|---|---|
| `set[str]` | Set of encrypted field names. |
is_enabled
property
Check if encryption is enabled (has configured fields).
Returns:
| Type | Description |
|---|---|
| `bool` | True if encryption fields are configured, False otherwise. |
decrypt_chunks(chunks)
encrypt_chunks(chunks)
encrypt_embedded_chunks(chunks, em_invoker)
async
Encrypt chunks and generate embeddings from plaintext before encryption.
Generates embeddings from plaintext content to ensure embeddings represent the original content rather than encrypted ciphertext. This is used when encryption is enabled.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `chunks` | `list[Chunk]` | List of chunks to encrypt and generate embeddings for. | required |
| `em_invoker` | `BaseEMInvoker` | Embedding model invoker to generate embeddings. | required |
Returns:
| Type | Description |
|---|---|
| `list[tuple[Chunk, Vector]]` | List of tuples containing encrypted chunks and their corresponding vectors generated from plaintext. |
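The ordering described for this method (embed the plaintext first, encrypt afterwards) can be sketched as follows. Everything here is a stand-in: `ToyChunk`, `ToyEMInvoker`, its `invoke` method, and `toy_encrypt` are assumptions made for illustration and do not reflect the real `Chunk`, `BaseEMInvoker`, or encryptor APIs.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class ToyChunk:
    """Hypothetical stand-in for Chunk."""
    content: str
    metadata: dict = field(default_factory=dict)


class ToyEMInvoker:
    """Hypothetical embedding invoker: maps each text to a tiny vector."""

    async def invoke(self, texts: list[str]) -> list[list[float]]:
        return [[float(len(t)), float(sum(map(ord, t)) % 97)] for t in texts]


def toy_encrypt(text: str) -> str:
    """Placeholder transform; NOT real cryptography."""
    return "enc:" + text[::-1]


async def encrypt_embedded(chunks, em_invoker):
    # 1. Embed the plaintext first, so vectors describe the original content.
    vectors = await em_invoker.invoke([c.content for c in chunks])
    # 2. Only then replace the content with ciphertext.
    encrypted = [ToyChunk(toy_encrypt(c.content), dict(c.metadata)) for c in chunks]
    return list(zip(encrypted, vectors))


pairs = asyncio.run(encrypt_embedded([ToyChunk("hello")], ToyEMInvoker()))
```

Reversing the two steps would produce vectors computed from ciphertext, which carry no useful similarity signal for retrieval.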
encrypt_update_values(update_values, content_field_name=CHUNK_KEYS.CONTENT)
Encrypt update values if encryption is enabled.
This method encrypts content and metadata values in update_values according to the encryption configuration. It handles type conversion for non-string values before encryption.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `update_values` | `dict[str, Any]` | Dictionary of values to encrypt. Supports "content" and "metadata" keys. | required |
| `content_field_name` | `str` | The field name to use for content in the output. Defaults to CHUNK_KEYS.CONTENT. Useful for datastores like Elasticsearch that use "text" instead of "content". | `CHUNK_KEYS.CONTENT` |
Returns:
| Type | Description |
|---|---|
| `dict[str, Any]` | Dictionary with encrypted values where applicable. The "content" key is mapped to content_field_name in the output. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If encryption fails for any field. |
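The behavior described for `encrypt_update_values` can be approximated with the sketch below. It is not the library's implementation: the function name, the `encrypt` callable, the use of `str()` for type conversion, and the choice to ignore keys other than "content" and "metadata" are all assumptions made for illustration.

```python
from typing import Any, Callable


def encrypt_update_values_sketch(
    update_values: dict[str, Any],
    encrypted_fields: set[str],
    encrypt: Callable[[str], str],
    content_field_name: str = "content",
) -> dict[str, Any]:
    """Illustrative only; mirrors the behavior described above."""
    result: dict[str, Any] = {}
    try:
        if "content" in update_values:
            value = update_values["content"]
            if "content" in encrypted_fields:
                value = encrypt(str(value))
            # Map "content" to the datastore's own field name.
            result[content_field_name] = value
        if "metadata" in update_values:
            # Assumed conversion: non-string values are stringified first.
            result["metadata"] = {
                key: encrypt(str(val)) if f"metadata.{key}" in encrypted_fields else val
                for key, val in update_values["metadata"].items()
            }
    except Exception as exc:
        raise ValueError(f"Encryption failed: {exc}") from exc
    return result


out = encrypt_update_values_sketch(
    {"content": "hi", "metadata": {"secret_key": 42, "public": "x"}},
    {"content", "metadata.secret_key"},
    encrypt=lambda s: s[::-1],  # placeholder transform, NOT real cryptography
    content_field_name="text",
)
```

Here the content lands under the `"text"` key, the integer `42` is stringified before the placeholder transform is applied, and the unlisted `public` field passes through unchanged.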