Core
Core abstractions and utilities for GLLM Datastore.
BaseFulltextCapability(encryption=None, default_batch_size=None)
Bases: DataStoreCapability
Base class for fulltext capability implementations.
Handles encryption and batching transparently. Subclasses implement internal CRUD methods that operate on plaintext data (or receive already-encrypted data when encryption is enabled).
create(data, batch_size=None, **kwargs)
async
retrieve_fuzzy(query, max_distance=2, filters=None, options=None, **kwargs)
async
Find records that fuzzy match the query within distance threshold, with automatic decryption.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| query | str | Text to fuzzy match against. | required |
| max_distance | int | Maximum edit distance for matches (e.g. Levenshtein). Defaults to 2. | 2 |
| filters | FilterClause \| QueryFilter \| None | Optional metadata filters. Defaults to None. | None |
| options | QueryOptions \| None | Query options (limit, etc.). Defaults to None. | None |
| **kwargs | Any | Passed to subclass _retrieve_fuzzy. | {} |
Returns:
| Type | Description |
|---|---|
| list[Chunk] | Matched chunks ordered by relevance/distance, decrypted when encryption is enabled. |
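To make the `max_distance` threshold concrete, here is a minimal, self-contained sketch of edit-distance filtering. It does not use GLLM Datastore at all: `levenshtein` and `fuzzy_match` are hypothetical helper names, and the backends behind `_retrieve_fuzzy` may well use a different distance or ranking.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep when equal)
            ))
        previous = current
    return previous[-1]


def fuzzy_match(query: str, candidates: list[str], max_distance: int = 2) -> list[str]:
    """Keep candidates within max_distance of the query, closest first."""
    scored = sorted((levenshtein(query, text), text) for text in candidates)
    return [text for distance, text in scored if distance <= max_distance]
```

With `max_distance=2`, a query such as "chunk" would still accept "chunks" or "chank" (one edit each) but reject an unrelated word such as "vector".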
BaseGraphCapability
Bases: ABC
Base class for graph database operations.
This base class defines the interface for datastores that support graph-based data operations. This includes node and relationship management as well as graph queries.
delete_node(label, identifier_key, identifier_value)
abstractmethod
async
Delete a node and its relationships.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| label | str | Node label/type. | required |
| identifier_key | str | Node identifier key. | required |
| identifier_value | str | Node identifier value. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| Any | Any | Deletion result information. |
Raises:
| Type | Description |
|---|---|
| NotImplementedError | This method is not implemented in the subclass. |
delete_relationship(node_source_key, node_source_value, relation, node_target_key, node_target_value)
abstractmethod
async
Delete a relationship between nodes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| node_source_key | str | Source node identifier key. | required |
| node_source_value | str | Source node identifier value. | required |
| relation | str | Relationship type. | required |
| node_target_key | str | Target node identifier key. | required |
| node_target_value | str | Target node identifier value. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| Any | Any | Deletion result information. |
Raises:
| Type | Description |
|---|---|
| NotImplementedError | This method is not implemented in the subclass. |
retrieve(query, parameters=None)
abstractmethod
async
Retrieve data from the graph using the given query.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| query | str | Query to retrieve data from the graph. | required |
| parameters | dict[str, Any] \| None | Query parameters. Defaults to None. | None |
Returns:
| Type | Description |
|---|---|
| list[dict[str, Any]] | Query results as list of dictionaries. |
Raises:
| Type | Description |
|---|---|
| NotImplementedError | This method is not implemented in the subclass. |
upsert_node(label, identifier_key, identifier_value, properties=None)
abstractmethod
async
Create or update a node in the graph.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| label | str | Node label/type. | required |
| identifier_key | str | Key field for node identification. | required |
| identifier_value | str | Value for node identification. | required |
| properties | dict[str, Any] \| None | Additional node properties. Defaults to None. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| Any | Any | Created/updated node information. |
Raises:
| Type | Description |
|---|---|
| NotImplementedError | This method is not implemented in the subclass. |
upsert_relationship(node_source_key, node_source_value, relation, node_target_key, node_target_value, properties=None)
abstractmethod
async
Create or update a relationship between nodes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| node_source_key | str | Source node identifier key. | required |
| node_source_value | str | Source node identifier value. | required |
| relation | str | Relationship type. | required |
| node_target_key | str | Target node identifier key. | required |
| node_target_value | str | Target node identifier value. | required |
| properties | dict[str, Any] \| None | Relationship properties. Defaults to None. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| Any | Any | Created/updated relationship information. |
Raises:
| Type | Description |
|---|---|
| NotImplementedError | This method is not implemented in the subclass. |
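The node and relationship methods above can be illustrated with a toy in-memory store. This is only a sketch under assumptions: `InMemoryGraph` is a hypothetical stand-alone class that mirrors the documented signatures without importing or subclassing `BaseGraphCapability`, the returned dictionaries are invented for illustration (the interface only promises `Any`), and `retrieve` is omitted because it depends on a query language.

```python
import asyncio
from typing import Any, Optional


class InMemoryGraph:
    """Toy stand-in that mirrors the documented graph method signatures."""

    def __init__(self) -> None:
        # Nodes keyed by (label, identifier_key, identifier_value).
        self.nodes: dict[tuple, dict[str, Any]] = {}
        # Relationships keyed by (source_key, source_value, relation, target_key, target_value).
        self.relationships: dict[tuple, dict[str, Any]] = {}

    async def upsert_node(self, label: str, identifier_key: str, identifier_value: str,
                          properties: Optional[dict[str, Any]] = None) -> Any:
        # Create the node if it is missing, then merge in the new properties.
        node = self.nodes.setdefault((label, identifier_key, identifier_value), {})
        node.update(properties or {})
        return dict(node)

    async def upsert_relationship(self, node_source_key: str, node_source_value: str,
                                  relation: str, node_target_key: str, node_target_value: str,
                                  properties: Optional[dict[str, Any]] = None) -> Any:
        key = (node_source_key, node_source_value, relation, node_target_key, node_target_value)
        rel = self.relationships.setdefault(key, {})
        rel.update(properties or {})
        return dict(rel)

    async def delete_relationship(self, node_source_key: str, node_source_value: str,
                                  relation: str, node_target_key: str,
                                  node_target_value: str) -> Any:
        key = (node_source_key, node_source_value, relation, node_target_key, node_target_value)
        return {"deleted": self.relationships.pop(key, None) is not None}

    async def delete_node(self, label: str, identifier_key: str, identifier_value: str) -> Any:
        removed = self.nodes.pop((label, identifier_key, identifier_value), None)
        ident = (identifier_key, identifier_value)
        # Also drop every relationship that starts or ends at the deleted node.
        dangling = [k for k in self.relationships if k[:2] == ident or k[3:] == ident]
        for key in dangling:
            del self.relationships[key]
        return {"node_deleted": removed is not None, "relationships_deleted": len(dangling)}


async def demo() -> tuple:
    graph = InMemoryGraph()
    await graph.upsert_node("Person", "name", "Ada", {"born": 1815})
    merged = await graph.upsert_node("Person", "name", "Ada", {"field": "mathematics"})
    await graph.upsert_node("Person", "name", "Charles")
    await graph.upsert_relationship("name", "Ada", "KNOWS", "name", "Charles")
    result = await graph.delete_node("Person", "name", "Ada")
    return graph, merged, result
```

Calling `asyncio.run(demo())` exercises the upsert-then-delete flow. A real backend would translate each call into its own query language rather than mutate dictionaries.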
BaseVectorCapability(em_invoker, encryption=None, default_batch_size=None)
Bases: DataStoreCapability
Base class for vector capability implementations.
Provides default batching/encryption flows for create, create_from_vector, retrieve, retrieve_by_vector, update, delete, and clear.
Initialize the base vector capability.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| em_invoker | BaseEMInvoker | Embedding model invoker (required). | required |
| encryption | EncryptionCapability \| None | Encryption capability. Defaults to None. | None |
| default_batch_size | int \| None | Default batch size. Defaults to None. | None |
em_invoker
property
Return the embedding model invoker.
Returns:
| Name | Type | Description |
|---|---|---|
| BaseEMInvoker | BaseEMInvoker | The EM invoker instance. |
create(data, batch_size=None, **kwargs)
async
create_from_vector(chunk_vectors, batch_size=None, **kwargs)
async
Create from pre-computed vectors with encryption and batching.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| chunk_vectors | list[tuple[Chunk, Vector]] | Chunks and their vectors. | required |
| batch_size | int \| None | Override batch size. Defaults to None. | None |
| **kwargs | Any | Passed to subclass. | {} |
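The `batch_size` override can be pictured with a small batching helper. This is a sketch under assumptions, not the library's code: `iter_batches` is a hypothetical name, and the fallback order shown (per-call `batch_size`, then the capability's `default_batch_size`, then a single batch) is a guess at the behaviour described above.

```python
from collections.abc import Iterator, Sequence
from typing import Optional, TypeVar

T = TypeVar("T")


def iter_batches(
    items: Sequence[T],
    batch_size: Optional[int] = None,
    default_batch_size: Optional[int] = None,
) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of items, preferring the per-call batch size."""
    size = batch_size if batch_size is not None else default_batch_size
    if size is None:
        # Assumption: with no size configured anywhere, process everything at once.
        size = max(len(items), 1)
    if size <= 0:
        raise ValueError("batch size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

For example, five `(chunk, vector)` pairs with `batch_size=2` would be written in three calls to the backend: two full batches and a final batch of one.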
ensure_index(**kwargs)
abstractmethod
async
Ensure vector index exists.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| **kwargs | Any | Datastore-specific parameters. | {} |
Raises:
| Type | Description |
|---|---|
| NotImplementedError | This method is not implemented in the subclass. |
retrieve_by_vector(vector, filters=None, options=None, **kwargs)
async
Retrieve by vector with automatic decryption.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| vector | Vector | Query vector. | required |
| filters | FilterClause \| QueryFilter \| None | Filters. Defaults to None. | None |
| options | QueryOptions \| None | Query options. Defaults to None. | None |
| **kwargs | Any | Passed to _retrieve_by_vector. | {} |
Returns:
| Type | Description |
|---|---|
| list[Chunk] | Decrypted chunks. |
update(update_values, filters=None, batch_size=None, **kwargs)
async
Update records with centralized encryption and content embedding refresh.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| update_values | dict[str, Any] | Fields to update. | required |
| filters | FilterClause \| QueryFilter \| None | Filters. Defaults to None. | None |
| batch_size | int \| None | Optional batch size override. Defaults to DefaultBatchSize.UPDATE when not configured at request/capability level. | None |
| **kwargs | Any | Passed to backend-specific _update implementation. | {} |
FilterClause
Bases: BaseModel
Single filter criterion with operator support.
Examples:
FilterClause(key="metadata.age", value=25, operator=FilterOperator.GT)
FilterClause(key="metadata.status", value=["active", "pending"], operator=FilterOperator.IN)
Attributes:
| Name | Type | Description |
|---|---|---|
| key | str | The field path to filter on (supports dot notation for nested fields). |
| value | int \| float \| str \| bool \| list[str] \| list[float] \| list[int] \| list[bool] \| None | The value to compare against. |
| operator | FilterOperator | The comparison operator. |
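The dot notation used by `key` can be read as a walk through nested dictionaries. The helper below is a hypothetical, stand-alone sketch of that idea (`resolve_path` is not part of the library, and each backend resolves field paths in its own way).

```python
from typing import Any


def resolve_path(record: dict[str, Any], path: str, default: Any = None) -> Any:
    """Follow a dotted path such as 'metadata.age' through nested dicts."""
    current: Any = record
    for part in path.split("."):
        # Stop as soon as the path leaves the nested-dict structure.
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current
```

Under this reading, `resolve_path({"metadata": {"age": 30}}, "metadata.age")` yields the nested value, while a path that does not exist falls back to the default.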
to_query_filter()
Convert FilterClause to QueryFilter.
This method enables automatic conversion of FilterClause to QueryFilter.
Example
clause = FilterClause(key="metadata.status", value="active", operator=FilterOperator.EQ)
query_filter = clause.to_query_filter()
# Results in: QueryFilter(filters=[clause], condition=FilterCondition.AND)
Returns:
| Name | Type | Description |
|---|---|---|
| QueryFilter | QueryFilter | A QueryFilter wrapping this FilterClause with AND condition. |
FilterCondition
Bases: StrEnum
Logical conditions for combining filters.
FilterOperator
Bases: StrEnum
Operators for comparing field values.
QueryFilter
Bases: BaseModel
Composite filter supporting multiple conditions and logical operators.
Attributes:
| Name | Type | Description |
|---|---|---|
| filters | list[FilterClause \| QueryFilter] | List of filters to combine. Can include nested QueryFilter for complex logic. |
| condition | FilterCondition | Logical operator to combine filters. Defaults to AND. |
Examples:
- Simple AND: age > 25 AND status == "active"
```python
QueryFilter(
    filters=[
        FilterClause(key="metadata.age", value=25, operator=FilterOperator.GT),
        FilterClause(key="metadata.status", value="active", operator=FilterOperator.EQ)
    ],
    condition=FilterCondition.AND
)
```
- Complex OR: (status == "active" OR status == "pending") AND age >= 18
```python
QueryFilter(
    filters=[
        QueryFilter(
            filters=[
                FilterClause(key="metadata.status", value="active"),
                FilterClause(key="metadata.status", value="pending")
            ],
            condition=FilterCondition.OR
        ),
        FilterClause(key="metadata.age", value=18, operator=FilterOperator.GTE)
    ],
    condition=FilterCondition.AND
)
```
- NOT: NOT (status == "deleted")
```python
QueryFilter(
    filters=[
        FilterClause(key="metadata.status", value="deleted")
    ],
    condition=FilterCondition.NOT
)
```
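How a nested filter tree of this shape evaluates against one record can be sketched in plain Python. Everything here is an illustrative assumption: `Clause` and `Group` are hypothetical stand-ins for `FilterClause` and `QueryFilter`, keys are flat rather than dotted, only three operators are modelled, and real backends translate filters into their own query language instead of evaluating them in memory.

```python
import operator
from dataclasses import dataclass
from typing import Any

# Only the operators needed for the example are modelled.
OPS = {"eq": operator.eq, "gt": operator.gt, "gte": operator.ge}


@dataclass
class Clause:
    """Stand-in for a single filter criterion."""
    key: str
    value: Any
    op: str = "eq"


@dataclass
class Group:
    """Stand-in for a composite filter; children may be Clause or Group."""
    filters: list
    condition: str = "and"


def matches(node: Any, record: dict[str, Any]) -> bool:
    """Recursively evaluate a Clause/Group tree against one record."""
    if isinstance(node, Clause):
        if node.key not in record:
            return False
        return OPS[node.op](record[node.key], node.value)
    results = [matches(child, record) for child in node.filters]
    if node.condition == "and":
        return all(results)
    if node.condition == "or":
        return any(results)
    if node.condition == "not":
        if len(results) != 1:
            raise ValueError("NOT takes exactly one filter")
        return not results[0]
    raise ValueError(f"unknown condition: {node.condition}")


# (status == "active" OR status == "pending") AND age >= 18
live_adults = Group(
    filters=[
        Group(filters=[Clause("status", "active"), Clause("status", "pending")], condition="or"),
        Clause("age", 18, "gte"),
    ],
    condition="and",
)
```

The inner group is resolved first and its boolean result is then combined with the age clause, which is the same nesting the second example above expresses.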
from_dicts(filter_dicts, condition=FilterCondition.AND)
classmethod
Create QueryFilter from list of filter dictionaries.
Example
QueryFilter.from_dicts(
[
{"key": "metadata.age", "value": 25, "operator": ">"},
{"key": "metadata.status", "value": "active"}
],
condition=FilterCondition.AND
)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filter_dicts | list[dict[str, Any]] | List of filter dictionaries. Contains the key, value, and operator. | required |
| condition | FilterCondition | Logical operator to combine filters. Defaults to AND. | AND |
Returns:
| Name | Type | Description |
|---|---|---|
| QueryFilter | QueryFilter | Composite filter instance. |
QueryOptions
Bases: BaseModel
Model for query options.
Attributes:
| Name | Type | Description |
|---|---|---|
| include_fields | Sequence[str] \| None | The fields to include in the query result. Defaults to None. |
| order_by | str \| None | The column to order the query result by. Defaults to None. |
| order_desc | bool | Whether to order the query result in descending order. Defaults to False. |
| limit | int \| None | The maximum number of rows to return. Must be >= 0. Defaults to None. |
| offset | int \| None | The number of rows to skip before returning results. Must be >= 0. Defaults to None. |
Examples:
- Basic query with limit:
```python
QueryOptions(limit=10)
```
- Pagination - first page (results 0-9):
```python
QueryOptions(limit=10, offset=0)
```
- Pagination - second page (results 10-19):
```python
QueryOptions(limit=10, offset=10)
```
- Complex query with ordering and pagination:
```python
QueryOptions(
    include_fields=["field1", "field2"],
    order_by="created_at",
    order_desc=True,
    limit=10,
    offset=20
)
```
validate_offset_limit()
Validate that offset is not used without limit.
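The pagination examples and the offset/limit rule can be sketched as two small helpers. Both names are hypothetical, and raising `ValueError` is an assumption about how such a rule might be enforced; the reference above does not say which exception `validate_offset_limit` raises.

```python
from typing import Optional


def check_offset_limit(limit: Optional[int], offset: Optional[int]) -> None:
    """Reject negative values and an offset that is given without a limit."""
    for name, value in (("limit", limit), ("offset", offset)):
        if value is not None and value < 0:
            raise ValueError(f"{name} must be >= 0")
    if offset is not None and limit is None:
        raise ValueError("offset cannot be used without limit")


def page_window(page: int, page_size: int) -> tuple[int, int]:
    """Translate a 1-based page number into a (limit, offset) pair."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    limit, offset = page_size, (page - 1) * page_size
    check_offset_limit(limit, offset)
    return limit, offset
```

Page 2 with a page size of 10 maps to `limit=10, offset=10`, matching the "second page (results 10-19)" example.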
all_(key, values)
Create an ALL filter (array field contains all of the values).
This operator checks if an array field contains all of the values in the provided list. The field must be an array/list, and every value in the values list must be present as an element in the array. The array may contain additional elements.
Example
Filter for documents where the tags array contains both "python" and "javascript". This will match only if metadata.tags contains both values. For example, if metadata.tags = ["python", "javascript", "rust"], this will match. If metadata.tags = ["python", "rust"], this will not match (missing "javascript").
from gllm_datastore.core.filters import all_
filter = all_("metadata.tags", ["python", "javascript"])
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| key | str | Field path to filter on (must be an array field). | required |
| values | list | List of values. All must be present in the array. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| FilterClause | FilterClause | ALL filter. |
and_(*filters)
Combine filters with AND condition.
This logical operator combines multiple filters such that all conditions must be satisfied. A document matches only if it satisfies every filter in the list.
Example
Filter for documents where status is "active" AND age is at least 18. This will match documents that satisfy both conditions simultaneously.
from gllm_datastore.core.filters import and_, eq, gte
filter = and_(eq("metadata.status", "active"), gte("metadata.age", 18))
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| *filters | FilterClause \| QueryFilter | Variable number of filters to combine. All filters must match for a document to be included. | () |
Returns:
| Name | Type | Description |
|---|---|---|
| QueryFilter | QueryFilter | Combined filter with AND condition. |
any_(key, values)
Create an ANY filter (array field contains any of the values).
This operator checks if an array field contains at least one of the values in the provided list. The field must be an array/list, and at least one element from the values list must be present in the array. This is similar to checking if the arrays have any intersection.
Example
Filter for documents where the tags array contains at least one of "python" or "javascript". This will match if metadata.tags contains "python", "javascript", or both. For example, if metadata.tags = ["python", "rust"], this will match (because of "python").
from gllm_datastore.core.filters import any_
filter = any_("metadata.tags", ["python", "javascript"])
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| key | str | Field path to filter on (must be an array field). | required |
| values | list | List of values. At least one must be present in the array. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| FilterClause | FilterClause | ANY filter. |
array_contains(key, value)
Create an ARRAY_CONTAINS filter (array field contains value).
This operator checks if an array field contains the specified value as an element. The field must be an array/list, and the value must be present in that array. Use this for checking array membership.
Example
Filter for documents where the tags array contains "python". This will match documents where "python" is an element in metadata.tags. For example, if metadata.tags = ["python", "javascript"], this will match.
from gllm_datastore.core.filters import array_contains
filter = array_contains("metadata.tags", "python")
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| key | str | Field path to filter on (must be an array field). | required |
| value | Any | Value to check if it exists as an element in the array. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| FilterClause | FilterClause | ARRAY_CONTAINS filter. |
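The three array operators described so far (`all_`, `any_`, `array_contains`) differ only in how many of the requested values must appear in the field. The plain-Python predicates below are a sketch of those semantics with hypothetical names; they do not call the library, and edge cases such as an empty `values` list may behave differently per backend.

```python
from typing import Any


def has_value(array_field: list[Any], value: Any) -> bool:
    """ARRAY_CONTAINS: the value is one of the array's elements."""
    return value in array_field


def has_any(array_field: list[Any], values: list[Any]) -> bool:
    """ANY: at least one requested value is an element of the array."""
    return any(value in array_field for value in values)


def has_all(array_field: list[Any], values: list[Any]) -> bool:
    """ALL: every requested value is an element; extra elements are fine."""
    return all(value in array_field for value in values)
```

For `tags = ["python", "rust"]` and the requested values `["python", "javascript"]`, the ANY check passes while the ALL check fails, as in the examples given for `any_` and `all_`.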
eq(key, value)
Create an equality filter.
This operator checks if the field value is exactly equal to the specified value. Works with strings, numbers, booleans, and other scalar types.
Example
Filter for documents where metadata.status == "active".
from gllm_datastore.core.filters import eq
filter = eq("metadata.status", "active")
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| key | str | Field path to filter on. | required |
| value | Any | Value to compare. Matches field values exactly equal to this value. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| FilterClause | FilterClause | Equality filter. |
gt(key, value)
Create a greater-than filter.
This operator checks if the field value is strictly greater than the specified value. Only works with numeric fields (int or float).
Example
Filter for documents where metadata.price > 100.
from gllm_datastore.core.filters import gt
filter = gt("metadata.price", 100)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| key | str | Field path to filter on (must be numeric). | required |
| value | int \| float | Threshold value. Matches field values greater than this. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| FilterClause | FilterClause | Greater-than filter. |
gte(key, value)
Create a greater-than-or-equal filter.
This operator checks if the field value is greater than or equal to the specified value. Only works with numeric fields (int or float).
Example
Filter for documents where metadata.price >= 100.
from gllm_datastore.core.filters import gte
filter = gte("metadata.price", 100)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| key | str | Field path to filter on (must be numeric). | required |
| value | int \| float | Threshold value. Matches field values greater than or equal to this. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| FilterClause | FilterClause | Greater-than-or-equal filter. |
in_(key, values)
Create an IN filter.
This operator checks if the field value is one of the values in the provided list. Works with scalar fields (string, number, boolean). The field value must exactly match one of the values in the list.
Example
Filter for documents where metadata.status in ["active", "pending"].
from gllm_datastore.core.filters import in_
filter = in_("metadata.status", ["active", "pending"])
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| key | str | Field path to filter on (must be a scalar field). | required |
| values | list | List of possible values. Matches field values that match one of these exactly. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| FilterClause | FilterClause | IN filter. |
lt(key, value)
Create a less-than filter.
This operator checks if the field value is strictly less than the specified value. Only works with numeric fields (int or float).
Example
Filter for documents where metadata.price < 100.
from gllm_datastore.core.filters import lt
filter = lt("metadata.price", 100)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| key | str | Field path to filter on (must be numeric). | required |
| value | int \| float | Threshold value. Matches field values less than this. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| FilterClause | FilterClause | Less-than filter. |
lte(key, value)
Create a less-than-or-equal filter.
This operator checks if the field value is less than or equal to the specified value. Only works with numeric fields (int or float).
Example
Filter for documents where metadata.price <= 100.
from gllm_datastore.core.filters import lte
filter = lte("metadata.price", 100)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| key | str | Field path to filter on (must be numeric). | required |
| value | int \| float | Threshold value. Matches field values less than or equal to this. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| FilterClause | FilterClause | Less-than-or-equal filter. |
ne(key, value)
Create a not-equal filter.
This operator checks if the field value is not equal to the specified value. Works with strings, numbers, booleans, and other scalar types.
Example
Filter for documents where metadata.status != "active".
from gllm_datastore.core.filters import ne
filter = ne("metadata.status", "active")
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| key | str | Field path to filter on. | required |
| value | Any | Value to exclude. Matches all values except this one. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| FilterClause | FilterClause | Not-equal filter. |
nin(key, values)
Create a NOT IN filter.
This operator checks if the field value is not in the provided list. Works with scalar fields (string, number, boolean). The field value must not match any of the values in the list.
Example
Filter for documents where metadata.status not in ["deleted", "archived"].
from gllm_datastore.core.filters import nin
filter = nin("metadata.status", ["deleted", "archived"])
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| key | str | Field path to filter on (must be a scalar field). | required |
| values | list | List of excluded values. Matches field values that do not match any of these. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| FilterClause | FilterClause | NOT IN filter. |
not_(filter)
Negate a filter.
This logical operator inverts the result of a filter. A document matches if it does not satisfy the specified filter condition. Useful for exclusion criteria.
This operator only supports NOT with a single filter. Multiple filters in NOT condition are not supported.
Example
Filter for documents where status is NOT "deleted". This will match all documents except those with status == "deleted". Can also be used with other operators, e.g., not_(text_contains("content", "spam")) to exclude documents containing a specific substring.
from gllm_datastore.core.filters import not_, eq
filter = not_(eq("metadata.status", "deleted"))
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filter | FilterClause \| QueryFilter | Filter to negate. Documents matching this filter will be excluded from results. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| QueryFilter | QueryFilter | Negated filter. |
or_(*filters)
Combine filters with OR condition.
This logical operator combines multiple filters such that at least one condition must be satisfied. A document matches if it satisfies any of the filters in the list.
Example
Filter for documents where status is "active" OR status is "pending". This will match documents that satisfy either condition (or both).
from gllm_datastore.core.filters import or_, eq
filter = or_(eq("metadata.status", "active"), eq("metadata.status", "pending"))
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| *filters | FilterClause \| QueryFilter | Variable number of filters to combine. At least one filter must match for a document to be included. | () |
Returns:
| Name | Type | Description |
|---|---|---|
| QueryFilter | QueryFilter | Combined filter with OR condition. |
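Several helpers in this section overlap logically: an OR of equality checks on one field selects the same records as an IN check, and negating that IN check behaves like NOT IN. The sketch below shows this with plain predicates over dictionaries. The `*_pred` names are hypothetical and do not use the library, keys are flat, and the equivalence is assumed only for records where the field is present.

```python
from typing import Any, Callable

Record = dict[str, Any]
Predicate = Callable[[Record], bool]


def eq_pred(key: str, value: Any) -> Predicate:
    """Match records whose field equals the value."""
    return lambda record: record.get(key) == value


def in_pred(key: str, values: list[Any]) -> Predicate:
    """Match records whose field equals one of the values."""
    return lambda record: record.get(key) in values


def or_pred(*predicates: Predicate) -> Predicate:
    """Match records accepted by at least one predicate."""
    return lambda record: any(predicate(record) for predicate in predicates)


def not_pred(predicate: Predicate) -> Predicate:
    """Match records rejected by the wrapped predicate."""
    return lambda record: not predicate(record)


records = [{"status": s} for s in ("active", "pending", "deleted", "archived")]
by_or = or_pred(eq_pred("status", "active"), eq_pred("status", "pending"))
by_in = in_pred("status", ["active", "pending"])
by_not_in = not_pred(by_in)
```

Choosing between the forms is then mostly a matter of readability: the IN form stays compact as the list of accepted values grows.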
text_contains(key, value)
Create a TEXT_CONTAINS filter (text field contains substring).
This operator checks if a text/string field contains the specified substring. The field must be a string, and the value must appear as a substring within that string. Use this for substring matching in text content.
Example
Filter for documents where the content field contains "machine learning". This will match documents where "machine learning" appears anywhere in the content. For example, if content = "This is about machine learning algorithms", this will match.
from gllm_datastore.core.filters import text_contains
filter = text_contains("content", "machine learning")
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| key | str | Field path to filter on (must be a string/text field). | required |
| value | str | Substring to search for in the text. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| FilterClause | FilterClause | TEXT_CONTAINS filter. |