Core

Core abstractions and utilities for GLLM Datastore.

Authors

Kadek Denaya (kadek.d.r.diana@gdplabs.id)

CacheCapability

Bases: Protocol

Protocol for caching operations with advanced matching.

This protocol defines the interface for datastores that support caching operations. This includes storage, retrieval with different matching strategies, and various eviction policies.

create(key, value, metadata=None, **kwargs) async

Insert a value into cache with optional metadata.

Parameters:

Name Type Description Default
key str

Cache key.

required
value Any

Value to cache.

required
metadata dict[str, Any] | None

Optional metadata for advanced matching. Defaults to None.

None
**kwargs

Datastore-specific parameters.

{}

delete(key, filters=None) async

Delete entries by key match with optional filters.

Parameters:

Name Type Description Default
key str | list[str]

Single key or list of keys to delete.

required
filters QueryFilter | None

Query filters to apply. Defaults to None.

None

delete_expired(now=None) async

Delete expired entries and enforce size limits.

Parameters:

Name Type Description Default
now datetime | None

Current timestamp for expiration checks. Defaults to None, in which case the current timestamp is used.

None
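The `now` default described above can be sketched as follows. This is a minimal, synchronous illustration and not the library's implementation: the standalone function, the `entries` layout (value paired with an expiry time), and the inclusive expiry comparison are all assumptions, and the size-limit enforcement mentioned in the description is not modelled.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Optional


def delete_expired(
    entries: dict[str, tuple[Any, datetime]],
    now: Optional[datetime] = None,
) -> int:
    """Remove entries whose expiry time has passed; return how many were removed.

    Hypothetical stand-in: `entries` maps key -> (value, expires_at).
    """
    if now is None:
        # Mirror the documented default: fall back to the current timestamp.
        now = datetime.now(timezone.utc)
    # Assumption: an entry expiring exactly at `now` counts as expired.
    expired = [key for key, (_, expires_at) in entries.items() if expires_at <= now]
    for key in expired:
        del entries[key]
    return len(expired)


if __name__ == "__main__":
    start = datetime(2024, 1, 1, tzinfo=timezone.utc)
    store = {
        "fresh": ("a", start + timedelta(hours=1)),
        "stale": ("b", start - timedelta(hours=1)),
    }
    print(delete_expired(store, now=start), sorted(store))
```

Passing an explicit `now` keeps expiry checks deterministic in tests.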

delete_lfu(num_entries) async

Delete least frequently used entries.

Parameters:

Name Type Description Default
num_entries int

Number of entries to delete.

required

delete_lru(num_entries) async

Delete least recently used entries.

Parameters:

Name Type Description Default
num_entries int

Number of entries to delete.

required
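The difference between the LFU and LRU eviction methods above can be illustrated with a small in-memory sketch. `TrackedCache`, `Entry`, and the logical clock are hypothetical names for illustration only; the real protocol methods are async and backend-specific, and tie-breaking here simply follows insertion order.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Entry:
    value: Any
    hits: int = 0         # access count, consulted by LFU eviction
    last_access: int = 0  # logical clock tick, consulted by LRU eviction


class TrackedCache:
    """Hypothetical synchronous cache that records usage statistics."""

    def __init__(self) -> None:
        self.entries: dict[str, Entry] = {}
        self._clock = 0

    def put(self, key: str, value: Any) -> None:
        self._clock += 1
        self.entries[key] = Entry(value, last_access=self._clock)

    def get(self, key: str) -> Any:
        entry = self.entries.get(key)
        if entry is None:
            return None
        self._clock += 1
        entry.hits += 1
        entry.last_access = self._clock
        return entry.value

    def delete_lfu(self, num_entries: int) -> None:
        # Evict the entries with the fewest recorded hits.
        victims = sorted(self.entries, key=lambda k: self.entries[k].hits)[:num_entries]
        for key in victims:
            del self.entries[key]

    def delete_lru(self, num_entries: int) -> None:
        # Evict the entries that were touched longest ago.
        victims = sorted(self.entries, key=lambda k: self.entries[k].last_access)[:num_entries]
        for key in victims:
            del self.entries[key]
```

LFU favours entries that are read often, while LRU favours entries that were read recently, so the two can evict different keys from the same cache.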

retrieve_exact(key, filters=None, **kwargs) async

Retrieve exact key match from cache with optional filters.

Parameters:

Name Type Description Default
key str

Cache key to retrieve.

required
filters QueryFilter | None

Query filters to apply. Defaults to None.

None
**kwargs

Datastore-specific parameters.

{}

Returns:

Name Type Description
Any Any

Cached value if found, None otherwise.
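Because the capability is a `Protocol`, a datastore can satisfy it structurally without inheriting from it. The sketch below shows this for `create` and `retrieve_exact` only. `ExactCache` and `InMemoryCache` are hypothetical names introduced for illustration, and filters are ignored.

```python
import asyncio
from typing import Any, Optional, Protocol, runtime_checkable


@runtime_checkable
class ExactCache(Protocol):
    """Hypothetical two-method subset of the cache protocol (illustration only)."""

    async def create(
        self, key: str, value: Any, metadata: Optional[dict[str, Any]] = None, **kwargs: Any
    ) -> None: ...

    async def retrieve_exact(self, key: str, filters: Any = None, **kwargs: Any) -> Any: ...


class InMemoryCache:
    """Dictionary-backed stand-in; note that it does not inherit from ExactCache."""

    def __init__(self) -> None:
        self._entries: dict[str, tuple[Any, dict[str, Any]]] = {}

    async def create(self, key, value, metadata=None, **kwargs):
        self._entries[key] = (value, metadata or {})

    async def retrieve_exact(self, key, filters=None, **kwargs):
        # Filters are ignored in this sketch; a miss returns None as documented.
        entry = self._entries.get(key)
        return entry[0] if entry is not None else None


async def demo() -> tuple[Any, Any]:
    cache = InMemoryCache()
    await cache.create("greeting", "hello", metadata={"lang": "en"})
    return await cache.retrieve_exact("greeting"), await cache.retrieve_exact("missing")


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

A `runtime_checkable` protocol only verifies that the methods exist, not that their signatures match, so a static type checker remains the stronger guarantee.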

retrieve_fuzzy(key, max_distance=DEFAULT_FUZZY_MATCH_MAX_DISTANCE, filters=None, **kwargs) async

Find fuzzy matches for key within distance threshold.

Parameters:

Name Type Description Default
key str

Base key for fuzzy matching.

required
max_distance int

Maximum edit distance for matches. Defaults to DEFAULT_FUZZY_MATCH_MAX_DISTANCE (2).

DEFAULT_FUZZY_MATCH_MAX_DISTANCE
filters QueryFilter | None

Query filters to apply. Defaults to None.

None
**kwargs

Datastore-specific parameters.

{}

Returns:

Name Type Description
Any Any

Best matching value or None.
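One way to picture distance-bounded matching is the sketch below. It assumes the edit distance is Levenshtein distance, which the documentation does not confirm, and both function names are hypothetical. A real backend may use a different metric or an index rather than a linear scan.

```python
from typing import Any, Optional


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost))
        previous = current
    return previous[-1]


def best_fuzzy_match(key: str, entries: dict[str, Any], max_distance: int = 2) -> Optional[Any]:
    """Return the value whose key is closest to `key`, or None if nothing is close enough."""
    best_value, best_distance = None, max_distance + 1
    for candidate, value in entries.items():
        distance = levenshtein(key, candidate)
        if distance < best_distance:
            best_value, best_distance = value, distance
    return best_value
```

With `max_distance=2`, a lookup for `"helo"` would match a stored key `"hello"` (distance 1), while a key that differs in more than two edits is treated as a miss.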

retrieve_semantic(key, min_similarity=0.8, filters=None, **kwargs) async

Find semantically similar matches using embeddings.

Parameters:

Name Type Description Default
key str

Base key for semantic matching.

required
min_similarity float

Minimum similarity threshold (0.0-1.0). Defaults to 0.8.

0.8
filters QueryFilter | None

Query filters to apply. Defaults to None.

None
**kwargs

Datastore-specific parameters.

{}

Returns:

Name Type Description
Any Any

Best matching value or None.
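The `min_similarity` threshold can be sketched as below, assuming cosine similarity over precomputed embeddings. The metric, the inclusive threshold, and both function names are assumptions; the documentation only states that embeddings are used and that the threshold ranges from 0.0 to 1.0.

```python
import math
from typing import Any, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_semantic_match(
    query_vector: Sequence[float],
    entries: list[tuple[Sequence[float], Any]],
    min_similarity: float = 0.8,
) -> Optional[Any]:
    """Return the value with the highest similarity at or above the threshold."""
    best_value, best_score = None, min_similarity
    for vector, value in entries:
        score = cosine_similarity(query_vector, vector)
        if score >= best_score:
            best_value, best_score = value, score
    return best_value
```

Raising `min_similarity` trades cache hit rate for precision: fewer lookups match, but those that do are closer in meaning to the stored key.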

update(key, value, filters=None) async

Update an entry in the cache with optional filters.

Parameters:

Name Type Description Default
key str

Cache key to update.

required
value Any

New value for the entry.

required
filters QueryFilter | None

Query filters to apply. Defaults to None.

None

FilterClause

Bases: BaseModel

Single filter criterion with operator support.

Examples:

FilterClause(key="metadata.age", value=25, operator=FilterOperator.GT)
FilterClause(key="metadata.status", value=["active", "pending"], operator=FilterOperator.IN)

Attributes:

Name Type Description
key str

The field path to filter on (supports dot notation for nested fields).

value int | float | str | list[str] | list[float] | list[int] | None

The value to compare against.

operator FilterOperator

The comparison operator.
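Dot notation in `key` implies walking nested fields before comparing. The sketch below shows one plausible way to do that on plain dictionaries. `resolve_path`, `matches_gt`, and `matches_in` are hypothetical helpers, and treating a missing field as a non-match is an assumption rather than documented behaviour.

```python
from typing import Any

_MISSING = object()  # sentinel so that a stored None is distinguishable from "absent"


def resolve_path(record: dict[str, Any], path: str) -> Any:
    """Follow a dotted path such as "metadata.age" through nested dictionaries."""
    current: Any = record
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def matches_gt(record: dict[str, Any], key: str, value: Any) -> bool:
    """Greater-than comparison; a missing field never matches (assumption)."""
    found = resolve_path(record, key)
    return found is not _MISSING and found > value


def matches_in(record: dict[str, Any], key: str, values: list) -> bool:
    """Membership comparison; a missing field never matches (assumption)."""
    found = resolve_path(record, key)
    return found is not _MISSING and found in values
```

For a record `{"metadata": {"age": 30, "status": "active"}}`, both example clauses shown above would match under these assumptions.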

FilterCondition

Bases: StrEnum

Logical conditions for combining filters.

FilterOperator

Bases: StrEnum

Operators for comparing field values.

FulltextCapability

Bases: Protocol

Protocol for full-text search and document operations.

This protocol defines the interface for datastores that support CRUD operations and flexible querying mechanisms for document data.

clear(**kwargs) async

Clear all records from the datastore.

Parameters:

Name Type Description Default
**kwargs

Datastore-specific parameters.

{}

create(data, **kwargs) async

Create new records in the datastore.

Parameters:

Name Type Description Default
data Chunk | list[Chunk]

Data to create (single item or collection).

required
**kwargs

Datastore-specific parameters.

{}

delete(filters=None, **kwargs) async

Delete records from the datastore.

Parameters:

Name Type Description Default
filters QueryFilter | None

Filters to select records to delete. Defaults to None, in which case no operation is performed (no-op).

None
**kwargs

Datastore-specific parameters.

{}

retrieve(filters=None, options=None, **kwargs) async

Read records from the datastore with optional filtering.

Parameters:

Name Type Description Default
filters QueryFilter | None

Query filters to apply. Defaults to None.

None
options QueryOptions | None

Query options like limit and sorting. Defaults to None.

None
**kwargs

Datastore-specific parameters.

{}

Returns:

Type Description
list[Chunk]

Query results.

update(update_values, filters=None, **kwargs) async

Update existing records in the datastore.

Parameters:

Name Type Description Default
update_values dict[str, Any]

Values to update.

required
filters QueryFilter | None

Filters to select records to update. Defaults to None.

None
**kwargs

Datastore-specific parameters.

{}

GraphCapability

Bases: Protocol

Protocol for graph database operations.

This protocol defines the interface for datastores that support graph-based data operations. This includes node and relationship management as well as graph queries.

delete_node(label, identifier_key, identifier_value) async

Delete a node and its relationships.

Parameters:

Name Type Description Default
label str

Node label/type.

required
identifier_key str

Node identifier key.

required
identifier_value str

Node identifier value.

required

Returns:

Name Type Description
Any Any

Deletion result information.

delete_relationship(node_source_key, node_source_value, relation, node_target_key, node_target_value) async

Delete a relationship between nodes.

Parameters:

Name Type Description Default
node_source_key str

Source node identifier key.

required
node_source_value str

Source node identifier value.

required
relation str

Relationship type.

required
node_target_key str

Target node identifier key.

required
node_target_value str

Target node identifier value.

required

Returns:

Name Type Description
Any Any

Deletion result information.

retrieve(query, parameters=None) async

Retrieve data from the graph using the given query.

Parameters:

Name Type Description Default
query str

Query to retrieve data from the graph.

required
parameters dict[str, Any] | None

Query parameters. Defaults to None.

None

Returns:

Type Description
list[dict[str, Any]]

Query results as a list of dictionaries.

upsert_node(label, identifier_key, identifier_value, properties=None) async

Create or update a node in the graph.

Parameters:

Name Type Description Default
label str

Node label/type.

required
identifier_key str

Key field for node identification.

required
identifier_value str

Value for node identification.

required
properties dict[str, Any] | None

Additional node properties. Defaults to None.

None

Returns:

Name Type Description
Any Any

Created/updated node information.

upsert_relationship(node_source_key, node_source_value, relation, node_target_key, node_target_value, properties=None) async

Create or update a relationship between nodes.

Parameters:

Name Type Description Default
node_source_key str

Source node identifier key.

required
node_source_value str

Source node identifier value.

required
relation str

Relationship type.

required
node_target_key str

Target node identifier key.

required
node_target_value str

Target node identifier value.

required
properties dict[str, Any] | None

Relationship properties. Defaults to None.

None

Returns:

Name Type Description
Any Any

Created/updated relationship information.
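The upsert and delete semantics above can be pictured with an in-memory stand-in. `InMemoryGraph` is hypothetical and synchronous, the shape of the returned dictionaries is an assumption (the protocol only promises `Any`), and edges are identified by the same key/value pairs that the relationship methods accept.

```python
from typing import Any, Optional


class InMemoryGraph:
    """Hypothetical synchronous stand-in for a graph datastore."""

    def __init__(self) -> None:
        self.nodes: dict[tuple, dict[str, Any]] = {}
        self.edges: dict[tuple, dict[str, Any]] = {}

    def upsert_node(self, label: str, identifier_key: str, identifier_value: str,
                    properties: Optional[dict[str, Any]] = None) -> dict[str, Any]:
        # Create the node on first sight, then merge properties on every call.
        node_id = (label, identifier_key, identifier_value)
        node = self.nodes.setdefault(node_id, {identifier_key: identifier_value})
        node.update(properties or {})
        return node

    def upsert_relationship(self, node_source_key: str, node_source_value: str, relation: str,
                            node_target_key: str, node_target_value: str,
                            properties: Optional[dict[str, Any]] = None) -> dict[str, Any]:
        edge_id = (node_source_key, node_source_value, relation, node_target_key, node_target_value)
        edge = self.edges.setdefault(edge_id, {})
        edge.update(properties or {})
        return edge

    def delete_node(self, label: str, identifier_key: str, identifier_value: str) -> dict[str, Any]:
        removed = self.nodes.pop((label, identifier_key, identifier_value), None)
        # Deleting a node also removes every relationship that touches it.
        endpoint = (identifier_key, identifier_value)
        doomed = [e for e in self.edges if (e[0], e[1]) == endpoint or (e[3], e[4]) == endpoint]
        for edge_id in doomed:
            del self.edges[edge_id]
        return {"node_deleted": removed is not None, "relationships_deleted": len(doomed)}
```

Calling `upsert_node` twice with the same label and identifier merges the properties into one node instead of creating a duplicate, which is the behaviour the "create or update" wording suggests.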

QueryFilter

Bases: BaseModel

Composite filter supporting multiple conditions and logical operators.

Attributes:

Name Type Description
filters list[FilterClause | QueryFilter]

List of filters to combine. Can include nested QueryFilter for complex logic.

condition FilterCondition

Logical operator to combine filters. Defaults to AND.

Examples:

  1. Simple AND: age > 25 AND status == "active"

     QueryFilter(
         filters=[
             FilterClause(key="metadata.age", value=25, operator=FilterOperator.GT),
             FilterClause(key="metadata.status", value="active", operator=FilterOperator.EQ)
         ],
         condition=FilterCondition.AND
     )

  2. Complex OR: (status == "active" OR status == "pending") AND age >= 18

     QueryFilter(
         filters=[
             QueryFilter(
                 filters=[
                     FilterClause(key="metadata.status", value="active"),
                     FilterClause(key="metadata.status", value="pending")
                 ],
                 condition=FilterCondition.OR
             ),
             FilterClause(key="metadata.age", value=18, operator=FilterOperator.GTE)
         ],
         condition=FilterCondition.AND
     )

  3. NOT: NOT (status == "deleted")

     QueryFilter(
         filters=[
             FilterClause(key="metadata.status", value="deleted")
         ],
         condition=FilterCondition.NOT
     )
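How a nested filter like the examples above might be evaluated against a record can be sketched with stand-in types. `Clause`, `Group`, and `evaluate` are hypothetical and are not the library's classes. The operator strings, the flat key lookup, and the treatment of NOT over several children (negating their conjunction) are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class Clause:
    """Stand-in for a single criterion; equality is assumed as the default operator."""
    key: str
    value: Any
    operator: str = "eq"


@dataclass
class Group:
    """Stand-in for a composite filter; children may be clauses or nested groups."""
    filters: list
    condition: str = "and"


_COMPARE = {
    "eq": lambda found, wanted: found == wanted,
    "gt": lambda found, wanted: found > wanted,
    "gte": lambda found, wanted: found >= wanted,
}


def evaluate(node: Union[Clause, Group], record: dict[str, Any]) -> bool:
    """Recursively evaluate a filter tree; raises KeyError if a field is absent."""
    if isinstance(node, Clause):
        return _COMPARE[node.operator](record[node.key], node.value)
    results = [evaluate(child, record) for child in node.filters]
    if node.condition == "and":
        return all(results)
    if node.condition == "or":
        return any(results)
    if node.condition == "not":
        # Assumption: NOT negates the conjunction of its children.
        return not all(results)
    raise ValueError(f"unknown condition: {node.condition}")
```

The recursion mirrors the nesting: inner groups are reduced to a single boolean before the outer condition combines them.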

from_dicts(filter_dicts, condition=FilterCondition.AND) classmethod

Create QueryFilter from list of filter dictionaries.

Example
QueryFilter.from_dicts(
    [
        {"key": "metadata.age", "value": 25, "operator": ">"},
        {"key": "metadata.status", "value": "active"}
    ],
    condition=FilterCondition.AND
)

Parameters:

Name Type Description Default
filter_dicts list[dict[str, Any]]

List of filter dictionaries, each containing a key, a value, and an optional operator.

required
condition FilterCondition

Logical operator to combine filters. Defaults to AND.

AND

Returns:

Name Type Description
QueryFilter QueryFilter

Composite filter instance.
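The dictionary form above can be thought of as a compact way to spell clauses. The sketch below compiles such dictionaries into a predicate over flat records. Only the `">"` symbol appears in the documented example; the other symbols in `SYMBOLS`, the equality default for a missing operator, and the `compile_filter_dicts` helper itself are assumptions for illustration.

```python
import operator
from typing import Any, Callable

# Only ">" is confirmed by the example above; the remaining symbols are assumed.
SYMBOLS: dict[str, Callable[[Any, Any], bool]] = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def compile_filter_dicts(filter_dicts: list[dict[str, Any]]) -> Callable[[dict[str, Any]], bool]:
    """Turn filter dictionaries into one predicate that ANDs every criterion."""
    compiled = []
    for item in filter_dicts:
        # Assumption: a dictionary without "operator" means equality.
        compare = SYMBOLS[item.get("operator", "==")]
        compiled.append((item["key"], compare, item["value"]))

    def predicate(record: dict[str, Any]) -> bool:
        return all(
            key in record and compare(record[key], value)
            for key, compare, value in compiled
        )

    return predicate
```

An unknown operator symbol fails fast with a `KeyError` at compile time in this sketch, rather than silently matching nothing at query time.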

QueryOptions

Bases: BaseModel

Model for query options.

Attributes:

Name Type Description
include_fields Sequence[str] | None

The fields to include in the query result. Defaults to None.

order_by str | None

The column to order the query result by. Defaults to None.

order_desc bool

Whether to order the query result in descending order. Defaults to False.

limit int | None

The maximum number of rows to return. Defaults to None.

Example
QueryOptions(include_fields=["field1", "field2"], order_by="column1", order_desc=True, limit=10)
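To make the four options concrete, the sketch below applies them to a list of dictionaries. `Options` and `apply_options` are hypothetical stand-ins, and the order of operations (sort, then limit, then field projection) is an assumption; a real backend would normally push these down into its own query.

```python
from dataclasses import dataclass
from typing import Any, Optional, Sequence


@dataclass
class Options:
    """Stand-in mirroring the documented attributes and defaults."""
    include_fields: Optional[Sequence[str]] = None
    order_by: Optional[str] = None
    order_desc: bool = False
    limit: Optional[int] = None


def apply_options(rows: list[dict[str, Any]], options: Options) -> list[dict[str, Any]]:
    """Sort, truncate, and project rows without mutating the input list."""
    result = list(rows)
    if options.order_by is not None:
        result.sort(key=lambda row: row[options.order_by], reverse=options.order_desc)
    if options.limit is not None:
        result = result[: options.limit]
    if options.include_fields is not None:
        result = [
            {name: row[name] for name in options.include_fields if name in row}
            for row in result
        ]
    return result
```

Sorting before limiting matters: limiting first would return an arbitrary subset and only then order it.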

VectorCapability

Bases: Protocol

Protocol for vector similarity search operations.

This protocol defines the interface for datastores that support vector-based retrieval operations. This includes similarity search, ID-based lookup as well as vector storage.

clear() async

Clear all records from the datastore.

create(data) async

Add chunks to the vector store with automatic embedding generation.

Parameters:

Name Type Description Default
data Chunk | list[Chunk]

Single chunk or list of chunks to add.

required

create_from_vector(chunk_vectors, **kwargs) async

Add pre-computed vectors directly.

Parameters:

Name Type Description Default
chunk_vectors list[tuple[Chunk, Vector]]

List of tuples containing chunks and their corresponding vectors.

required
**kwargs Any

Datastore-specific parameters.

{}

delete(filters=None, **kwargs) async

Delete records from the datastore.

Parameters:

Name Type Description Default
filters QueryFilter | None

Filters to select records to delete. Defaults to None.

None
**kwargs Any

Datastore-specific parameters.

{}
Note

If filters is None, no operation is performed (no-op).

retrieve(query, filters=None, options=None, **kwargs) async

Read records from the datastore using text-based similarity search with optional filtering.

Parameters:

Name Type Description Default
query str

Input text to embed and search with.

required
filters QueryFilter | None

Query filters to apply. Defaults to None.

None
options QueryOptions | None

Query options like limit and sorting. Defaults to None.

None
**kwargs Any

Datastore-specific parameters.

{}

Returns:

Type Description
list[Chunk]

Query results.

retrieve_by_vector(vector, filters=None, options=None, **kwargs) async

Direct vector similarity search.

Parameters:

Name Type Description Default
vector Vector

Query embedding vector.

required
filters QueryFilter | None

Query filters to apply. Defaults to None.

None
options QueryOptions | None

Query options like limit and sorting. Defaults to None.

None
**kwargs Any

Datastore-specific parameters.

{}

Returns:

Type Description
list[Chunk]

List of chunks ordered by similarity score.
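Ordering results by similarity can be sketched as a brute-force ranking. The example assumes cosine similarity and represents stored items as `(text, vector)` pairs; both choices, and the `rank_by_vector` name, are illustrative assumptions. Production vector stores typically use an approximate nearest-neighbour index instead of a full scan.

```python
import math
from typing import Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_vector(
    query_vector: Sequence[float],
    stored: list[tuple[str, Sequence[float]]],
    limit: Optional[int] = None,
) -> list[tuple[str, float]]:
    """Score every stored vector against the query and return the best first."""
    scored = [(text, cosine(query_vector, vector)) for text, vector in stored]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if limit is None else scored[:limit]
```

Here `limit` plays the role that the `limit` field of the query options would play in a real call.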

update(update_values, filters=None, **kwargs) async

Update existing records in the datastore.

Parameters:

Name Type Description Default
update_values dict[str, Any]

Values to update.

required
filters QueryFilter | None

Filters to select records to update. Defaults to None.

None
**kwargs Any

Datastore-specific parameters.

{}

all_(key, values)

Create an ALL filter (field contains all of the values).

Example
from gllm_datastore.core.filters import all_

filter = all_("metadata.tags", ["python", "javascript"])

Parameters:

Name Type Description Default
key str

Field path to filter on.

required
values list

Values to compare.

required

Returns:

Name Type Description
FilterClause FilterClause

ALL filter.

and_(*filters)

Combine filters with AND condition.

Example
from gllm_datastore.core.filters import and_, eq, gte

filter = and_(eq("status", "active"), gte("age", 18))

Parameters:

Name Type Description Default
*filters FilterClause | QueryFilter

Variable number of filters to combine.

()

Returns:

Name Type Description
QueryFilter QueryFilter

Combined filter with AND condition.

any_(key, values)

Create an ANY filter (field contains any of the values).

Example
from gllm_datastore.core.filters import any_

filter = any_("metadata.tags", ["python", "javascript"])

Parameters:

Name Type Description Default
key str

Field path to filter on.

required
values list

Values to compare.

required

Returns:

Name Type Description
FilterClause FilterClause

ANY filter.
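The ALL and ANY filters differ only in how many of the requested values must be present in the field. A minimal sketch of that distinction on plain lists follows; `matches_all` and `matches_any` are hypothetical names, and hashable values are assumed.

```python
from typing import Iterable


def matches_all(field_values: Iterable, wanted: Iterable) -> bool:
    """True when every wanted value appears in the field."""
    return set(wanted).issubset(field_values)


def matches_any(field_values: Iterable, wanted: Iterable) -> bool:
    """True when at least one wanted value appears in the field."""
    return not set(wanted).isdisjoint(field_values)
```

For a record tagged `["python", "rust"]`, the values `["python", "javascript"]` satisfy ANY but not ALL.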

contains(key, value)

Create a CONTAINS filter (field is an array containing the value).

Example
from gllm_datastore.core.filters import contains

filter = contains("metadata.tags", "python")

Parameters:

Name Type Description Default
key str

Field path to filter on.

required
value Any

Value to compare.

required

Returns:

Name Type Description
FilterClause FilterClause

CONTAINS filter.

eq(key, value)

Create an equality filter.

Example
from gllm_datastore.core.filters import eq

filter = eq("metadata.status", "active")

Parameters:

Name Type Description Default
key str

Field path to filter on.

required
value Any

Value to compare.

required

Returns:

Name Type Description
FilterClause FilterClause

Equality filter.

gt(key, value)

Create a greater-than filter.

Example
from gllm_datastore.core.filters import gt

filter = gt("metadata.price", 100)

Parameters:

Name Type Description Default
key str

Field path to filter on.

required
value int | float

Value to compare.

required

Returns:

Name Type Description
FilterClause FilterClause

Greater-than filter.

gte(key, value)

Create a greater-than-or-equal filter.

Example
from gllm_datastore.core.filters import gte

filter = gte("metadata.price", 100)

Parameters:

Name Type Description Default
key str

Field path to filter on.

required
value int | float

Value to compare.

required

Returns:

Name Type Description
FilterClause FilterClause

Greater-than-or-equal filter.

in_(key, values)

Create an IN filter.

Example
from gllm_datastore.core.filters import in_

filter = in_("metadata.tags", ["python", "javascript"])

Parameters:

Name Type Description Default
key str

Field path to filter on.

required
values list

Values to compare.

required

Returns:

Name Type Description
FilterClause FilterClause

IN filter.

lt(key, value)

Create a less-than filter.

Example
from gllm_datastore.core.filters import lt

filter = lt("metadata.price", 100)

Parameters:

Name Type Description Default
key str

Field path to filter on.

required
value int | float

Value to compare.

required

Returns:

Name Type Description
FilterClause FilterClause

Less-than filter.

lte(key, value)

Create a less-than-or-equal filter.

Example
from gllm_datastore.core.filters import lte

filter = lte("metadata.price", 100)

Parameters:

Name Type Description Default
key str

Field path to filter on.

required
value int | float

Value to compare.

required

Returns:

Name Type Description
FilterClause FilterClause

Less-than-or-equal filter.

ne(key, value)

Create a not-equal filter.

Example
from gllm_datastore.core.filters import ne

filter = ne("metadata.status", "active")

Parameters:

Name Type Description Default
key str

Field path to filter on.

required
value Any

Value to compare.

required

Returns:

Name Type Description
FilterClause FilterClause

Not-equal filter.

nin(key, values)

Create a NOT IN filter.

Example
from gllm_datastore.core.filters import nin

filter = nin("metadata.tags", ["python", "javascript"])

Parameters:

Name Type Description Default
key str

Field path to filter on.

required
values list

Values to compare.

required

Returns:

Name Type Description
FilterClause FilterClause

NOT IN filter.

not_(filter)

Negate a filter.

Example
from gllm_datastore.core.filters import not_, eq

filter = not_(eq("status", "deleted"))

Parameters:

Name Type Description Default
filter FilterClause | QueryFilter

Filter to negate.

required

Returns:

Name Type Description
QueryFilter QueryFilter

Negated filter.

or_(*filters)

Combine filters with OR condition.

Example
from gllm_datastore.core.filters import or_, eq

filter = or_(eq("status", "active"), eq("status", "pending"))

Parameters:

Name Type Description Default
*filters FilterClause | QueryFilter

Variable number of filters to combine.

()

Returns:

Name Type Description
QueryFilter QueryFilter

Combined filter with OR condition.
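The helper functions compose into the same nested shape as an explicit composite filter. The sketch below re-implements a few of them with plain dictionaries to show that correspondence. These are hypothetical stand-ins and not the functions in `gllm_datastore.core.filters`; the operator and condition strings are assumptions.

```python
from typing import Any


def eq(key: str, value: Any) -> dict[str, Any]:
    """Stand-in leaf: equality criterion."""
    return {"key": key, "value": value, "operator": "eq"}


def gte(key: str, value: Any) -> dict[str, Any]:
    """Stand-in leaf: greater-than-or-equal criterion."""
    return {"key": key, "value": value, "operator": "gte"}


def and_(*filters: dict[str, Any]) -> dict[str, Any]:
    """Stand-in composite: every child must match."""
    return {"filters": list(filters), "condition": "and"}


def or_(*filters: dict[str, Any]) -> dict[str, Any]:
    """Stand-in composite: at least one child must match."""
    return {"filters": list(filters), "condition": "or"}


def not_(inner: dict[str, Any]) -> dict[str, Any]:
    """Stand-in composite: wraps a single child and negates it."""
    return {"filters": [inner], "condition": "not"}


# (status == "active" OR status == "pending") AND age >= 18
expression = and_(
    or_(eq("status", "active"), eq("status", "pending")),
    gte("age", 18),
)
```

Leaf helpers produce single criteria and the combinators wrap them, so arbitrarily deep logic reads as nested function calls rather than nested constructor arguments.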