Vector
OpenSearch implementation of vector search and CRUD capability.
OpenSearchVectorCapability(index_name, em_invoker, client, opensearch_url=None, query_field='text', vector_query_field='vector', retrieval_strategy=None, distance_strategy=None, connection_params=None, encryption=None, default_batch_size=None)
Bases: BaseVectorCapability, ElasticLikeCore
OpenSearch implementation of VectorCapability protocol.
This class provides document CRUD operations and vector search using OpenSearch. Uses LangChain's OpenSearchVectorSearch for create and retrieve operations, and direct OpenSearch client for update and delete operations.
Attributes:
| Name | Type | Description |
|---|---|---|
| `index_name` | `str` | The name of the OpenSearch index. |
| `vector_store` | `OpenSearchVectorSearch` | The vector store instance. |
| `client` | `AsyncOpenSearch` | AsyncOpenSearch client for direct operations. |
| `em_invoker` | `BaseEMInvoker` | The embedding model to perform vectorization. |
Initialize the OpenSearch vector capability.
OpenSearchVectorSearch creates its own sync and async clients internally based on the provided connection parameters. The async client is used for operations like update, delete, and clear.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `index_name` | `str` | The name of the OpenSearch index. | required |
| `em_invoker` | `BaseEMInvoker` | The embedding model to perform vectorization. | required |
| `client` | `AsyncOpenSearch` | The OpenSearch client for direct operations. | required |
| `opensearch_url` | `str \| None` | The URL of the OpenSearch server. Used for LangChain's OpenSearchVectorSearch initialization. If None, it is extracted from the client's connection info. Defaults to None. | `None` |
| `query_field` | `str` | The field name for text queries. Defaults to "text". | `'text'` |
| `vector_query_field` | `str` | The field name for vector queries. Defaults to "vector". | `'vector'` |
| `retrieval_strategy` | `Any` | Not used with OpenSearchVectorSearch (kept for API compatibility). | `None` |
| `distance_strategy` | `str \| None` | The distance strategy for retrieval. For example, "l2" for Euclidean distance, "l2squared" for squared Euclidean distance, "cosine" for cosine similarity, etc. Defaults to None. | `None` |
| `connection_params` | `dict[str, Any] \| None` | Additional connection parameters to override defaults. These are merged with automatically detected parameters (authentication, SSL settings); user-provided params take precedence. Defaults to None. Available parameters include: 1. `http_auth` (`tuple[str, str] \| None`): HTTP authentication tuple (username, password). 2. `use_ssl` (`bool`): Whether to use SSL/TLS. Defaults to True for HTTPS URLs. 3. `verify_certs` (`bool`): Whether to verify SSL certificates. Defaults to True for HTTPS URLs. 4. `ssl_show_warn` (`bool`): Whether to show SSL warnings. Defaults to True for HTTPS URLs. 5. `ssl_assert_hostname` (`str \| None`): SSL hostname assertion. Defaults to None. 6. `max_retries` (`int`): Maximum number of retries for requests. Defaults to 3. 7. `retry_on_timeout` (`bool`): Whether to retry on timeouts. Defaults to True. 8. `client_cert` (`str \| None`): Path to the client certificate file. Defaults to None. 9. `client_key` (`str \| None`): Path to the client private key file. Defaults to None. 10. `root_cert` (`str \| None`): Path to the root certificate file. Defaults to None. 11. Additional kwargs: any other parameters accepted by the OpenSearch client constructor. | `None` |
| `encryption` | `EncryptionCapability \| None` | Encryption capability. Defaults to None. | `None` |
| `default_batch_size` | `int \| None` | Default batch size. Defaults to None. | `None` |
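The precedence rule for `connection_params` (detected defaults first, user-provided values last) can be illustrated without the library. This is only a sketch: `detect_defaults` and `merge_connection_params` are hypothetical names, the defaults for non-HTTPS URLs are an assumption, and the class's actual detection logic is not described in this reference.

```python
from typing import Any, Optional
from urllib.parse import urlparse


def detect_defaults(opensearch_url: str) -> dict[str, Any]:
    """Hypothetical helper: derive defaults from the URL scheme.

    The parameter table only states the HTTPS case; treating every
    other scheme as "SSL off" is an assumption made for this sketch.
    """
    is_https = urlparse(opensearch_url).scheme == "https"
    return {
        "use_ssl": is_https,
        "verify_certs": is_https,
        "ssl_show_warn": is_https,
        "max_retries": 3,
        "retry_on_timeout": True,
    }


def merge_connection_params(
    opensearch_url: str,
    connection_params: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Start from detected defaults, then let user-provided keys win."""
    merged = detect_defaults(opensearch_url)
    merged.update(connection_params or {})
    return merged


params = merge_connection_params(
    "https://localhost:9200",
    {"verify_certs": False, "http_auth": ("admin", "admin")},
)
```

Here the user-supplied `verify_certs` replaces the detected value, while keys the user did not set (such as `max_retries`) keep their defaults.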
delete_by_id(id, **kwargs)
async
Delete records from the data store based on IDs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `id` | `str \| list[str]` | ID or list of IDs to delete. | required |
| `**kwargs` | `Any` | Additional arguments passed to the delete operation. | `{}` |
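Because `id` accepts either a single string or a list, a caller-side view of the normalization may help. The sketch below is an assumption, not the class's implementation: `build_delete_body` is a hypothetical helper, and the use of an `ids` query is one plausible request shape that this reference does not confirm.

```python
from typing import Any, Union


def build_delete_body(id: Union[str, list[str]]) -> dict[str, Any]:
    """Hypothetical helper: normalize one ID or many into an `ids` query body."""
    # A bare string is wrapped so the rest of the code handles one shape only.
    ids = [id] if isinstance(id, str) else list(id)
    if not ids:
        raise ValueError("at least one ID is required")
    return {"query": {"ids": {"values": ids}}}


single = build_delete_body("doc-1")
many = build_delete_body(["doc-1", "doc-2"])
```

With the real class, both `await capability.delete_by_id("doc-1")` and `await capability.delete_by_id(["doc-1", "doc-2"])` match the documented signature; `capability` stands for an already constructed instance.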
ensure_index(mapping=None, index_settings=None, dimension=None, distance_strategy=None, **kwargs)
async
Ensure OpenSearch index exists, creating it if necessary.
This method is idempotent: if the index already exists, it skips creation and returns early.
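The check-then-create pattern behind this idempotency can be sketched with an in-memory stand-in. `FakeIndices` and the free function `ensure_index` below are hypothetical and exist only for illustration; they are not the OpenSearch client or this class's method.

```python
import asyncio


class FakeIndices:
    """In-memory stand-in for an index API (an assumption for this sketch)."""

    def __init__(self):
        self.created = {}
        self.create_calls = 0

    async def exists(self, index):
        return index in self.created

    async def create(self, index, body):
        self.create_calls += 1
        self.created[index] = body


async def ensure_index(indices, index_name, body):
    """Create the index only if it is missing; return True when it was created."""
    if await indices.exists(index_name):
        return False  # already present: skip creation and return early
    await indices.create(index_name, body)
    return True


async def demo():
    indices = FakeIndices()
    first = await ensure_index(indices, "docs", {"settings": {}})
    second = await ensure_index(indices, "docs", {"settings": {}})
    return first, second, indices.create_calls


result = asyncio.run(demo())
```

Calling the function twice triggers a single create call, which is the behavior the description above promises.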
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `mapping` | `dict[str, Any] \| None` | Custom mapping dictionary to use for index creation. If provided, this mapping is used directly. The mapping should follow the OpenSearch mapping format. Defaults to None, in which case the default mapping is used. | `None` |
| `index_settings` | `dict[str, Any] \| None` | Custom index settings. These settings are merged with any default settings. Defaults to None. | `None` |
| `dimension` | `int \| None` | Vector dimension. If neither `dimension` nor `mapping` is provided, it is inferred from `em_invoker` by generating a test embedding. | `None` |
| `distance_strategy` | `str \| None` | Distance strategy for vector similarity. Supported values: "l2", "l2squared", "cosine", "innerproduct", etc. Only used when building the default mapping. Defaults to "l2" if not specified. | `None` |
| `**kwargs` | `Any` | Additional arguments. | `{}` |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If mapping is invalid or required parameters are missing. |
| `RuntimeError` | If index creation fails due to backend errors. |
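For callers who pass their own `mapping` and `index_settings`, the following sketch builds a pair of dictionaries in OpenSearch k-NN mapping format. Several points are assumptions: `build_knn_mapping` is a hypothetical helper, the field names mirror the constructor defaults ("text" and "vector"), and this reference does not say whether `mapping` expects the bare `properties` object shown here or a full body wrapped in `mappings`.

```python
from typing import Any


def build_knn_mapping(
    dimension: int,
    text_field: str = "text",
    vector_field: str = "vector",
    space_type: str = "l2",
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Hypothetical helper: return (mapping, index_settings) for a k-NN index."""
    if dimension <= 0:
        raise ValueError("dimension must be a positive integer")
    mapping = {
        "properties": {
            text_field: {"type": "text"},
            vector_field: {
                "type": "knn_vector",
                "dimension": dimension,
                # HNSW with an explicit space type; the engine is left to the server default.
                "method": {"name": "hnsw", "space_type": space_type},
            },
        }
    }
    # k-NN search has to be enabled at the index level.
    index_settings = {"index": {"knn": True}}
    return mapping, index_settings


mapping, index_settings = build_knn_mapping(dimension=384)
```

The two results could then be passed as `await capability.ensure_index(mapping=mapping, index_settings=index_settings)`, where `capability` is an already constructed instance. Note that the strategy names in the parameter table (for example "cosine") are this library's names and may not match OpenSearch `space_type` values one to one.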