Retrieval parameter extractor

Defines the base class for retrieval parameter extractors used in Gen AI applications.

Author

Resti Febriana (resti.febriana@gdplabs.id)

BaseRetrievalParameterExtractor(validator=None)

Bases: Component, ABC

An abstract base class for retrieval parameter extractors used in Gen AI applications.

This class defines the interface for retrieval parameter extractors, which are responsible for extracting retrieval parameters from a given query string.

The class supports two types of validators:

1. JSON Schema validator using a dictionary
2. Pydantic model validator using a BaseModel class

Example usage with default JSON Schema validator

from gllm_core.rules import DEFAULT_RETRIEVAL_SCHEMA

# Initialize extractor with default schema.
# MyExtractor stands for any concrete subclass of BaseRetrievalParameterExtractor.
extractor = MyExtractor(validator=DEFAULT_RETRIEVAL_SCHEMA)

# The default schema supports:
# - Query string
# - Filters with operations (eq, neq, gt, gte, lt, lte, in, nin, like)
# - Sort conditions (asc, desc)
# Example valid parameters:
{
    "query": "search text",
    "filters": [
        {"field": "category", "operator": "eq", "value": "books"},
        {"field": "price", "operator": "lte", "value": 100}
    ],
    "sort": [
        {"field": "date", "order": "desc"}
    ]
}

Example usage with default Pydantic validator

from gllm_core.rules import DefaultRetrievalSchema

# Initialize extractor with default schema
extractor = MyExtractor(validator=DefaultRetrievalSchema)

# The default schema supports:
# - Query string (required)
# - Optional filter conditions with FilterOperator enum
# - Optional sort conditions with SortOrder enum
# Example valid parameters:
{
    "query": "search text",
    "filters": [
        {
            "field": "category",
            "operator": FilterOperator.EQUALS,
            "value": "books"
        }
    ],
    "sort": [
        {
            "field": "date",
            "order": SortOrder.DESCENDING
        }
    ]
}
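
To make the parameter shape in the examples above concrete, the following minimal sketch checks that a parameter dictionary uses only the operator and sort-order strings listed for the default JSON Schema. The function name `find_unsupported` and the two constant sets are hypothetical and not part of `gllm_core`; in practice the schema itself performs this validation.

```python
from typing import Any

# Operator and sort-order names as listed for the default JSON Schema above.
SUPPORTED_OPERATORS = {"eq", "neq", "gt", "gte", "lt", "lte", "in", "nin", "like"}
SUPPORTED_ORDERS = {"asc", "desc"}


def find_unsupported(params: dict[str, Any]) -> list[str]:
    """Return one message per filter operator or sort order outside the lists above."""
    problems = []
    for item in params.get("filters", []):
        if item.get("operator") not in SUPPORTED_OPERATORS:
            problems.append(f"unsupported operator: {item.get('operator')!r}")
    for item in params.get("sort", []):
        if item.get("order") not in SUPPORTED_ORDERS:
            problems.append(f"unsupported sort order: {item.get('order')!r}")
    return problems


valid = {
    "query": "search text",
    "filters": [{"field": "price", "operator": "lte", "value": 100}],
    "sort": [{"field": "date", "order": "desc"}],
}
invalid = {
    "query": "search text",
    "filters": [{"field": "price", "operator": "below", "value": 100}],
}

print(find_unsupported(valid))    # no problems reported
print(find_unsupported(invalid))  # reports the unknown "below" operator
```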

For custom validation requirements, you can also define your own schemas:

Example with custom JSON Schema

# Define custom JSON Schema
schema = {
    "type": "object",
    "properties": {
        "top_k": {"type": "integer", "minimum": 1},
        "threshold": {"type": "number", "minimum": 0, "maximum": 1}
    },
    "required": ["top_k", "threshold"]
}

# Initialize extractor with custom schema
extractor = MyExtractor(validator=schema)

Example with custom Pydantic model

from pydantic import BaseModel, Field

# Define custom Pydantic model
class SearchParams(BaseModel):
    top_k: int = Field(gt=0)
    threshold: float = Field(ge=0, le=1)

# Initialize extractor with custom model
extractor = MyExtractor(validator=SearchParams)

The validator will automatically run after parameter extraction to ensure the returned parameters meet the specified schema/model requirements.
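
To illustrate what "meeting the schema" means for the custom JSON Schema example, here is a dependency-free sketch that interprets only the four keywords that schema uses (`required`, `type`, `minimum`, `maximum`). The helper names `check_params` and `_matches_type` are hypothetical. This is not how the library validates and it is not a substitute for a real JSON Schema validator; it only shows which parameter dictionaries would pass or fail.

```python
from typing import Any

# The custom JSON Schema from the example above.
schema = {
    "type": "object",
    "properties": {
        "top_k": {"type": "integer", "minimum": 1},
        "threshold": {"type": "number", "minimum": 0, "maximum": 1},
    },
    "required": ["top_k", "threshold"],
}


def _matches_type(value: Any, type_name: str) -> bool:
    # bool is a subclass of int in Python, but JSON Schema treats booleans separately.
    if isinstance(value, bool):
        return False
    if type_name == "integer":
        return isinstance(value, int)
    return isinstance(value, (int, float))  # treated as "number" in this sketch


def check_params(params: dict[str, Any], schema: dict[str, Any]) -> list[str]:
    """Check only `required`, `type` (integer/number), `minimum`, and `maximum`."""
    errors = [f"missing: {key}" for key in schema.get("required", []) if key not in params]
    for key, rules in schema.get("properties", {}).items():
        if key not in params:
            continue
        value = params[key]
        if not _matches_type(value, rules["type"]):
            errors.append(f"{key}: expected {rules['type']}")
        elif "minimum" in rules and value < rules["minimum"]:
            errors.append(f"{key}: below minimum {rules['minimum']}")
        elif "maximum" in rules and value > rules["maximum"]:
            errors.append(f"{key}: above maximum {rules['maximum']}")
    return errors


print(check_params({"top_k": 5, "threshold": 0.7}, schema))  # passes
print(check_params({"top_k": 0, "threshold": 1.5}, schema))  # both values out of range
print(check_params({"top_k": 3}, schema))                    # threshold is missing
```

The custom Pydantic model in the other example expresses the same constraints declaratively through `Field(gt=0)` and `Field(ge=0, le=1)`.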

Attributes:

Name: validator
Type: dict | type[BaseModel] | None
Description: The validator to use for validating the extracted parameters. Can be either a JSON Schema dictionary or a Pydantic model class.

Initializes the BaseRetrievalParameterExtractor object.

Parameters:

Name: validator
Type: dict | type[BaseModel] | None
Description: The validator to use for validating the extracted parameters. Can be either a JSON Schema dictionary or a Pydantic model class.
Default: None

Raises:

Type: TypeError
Description: If the validator is not a dict or a Pydantic model class.
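
The type check described above could look roughly like the following sketch. The function name `check_validator` is hypothetical, and the library's actual check and error message may differ. The fallback `BaseModel` class exists only so the sketch runs when Pydantic is not installed.

```python
from typing import Any

try:
    from pydantic import BaseModel
except ImportError:  # Fallback so the sketch runs without Pydantic installed.
    class BaseModel:  # type: ignore[no-redef]
        """Stand-in for pydantic.BaseModel."""


def check_validator(validator: Any) -> None:
    """Raise TypeError unless validator is None, a dict, or a BaseModel subclass."""
    if validator is None or isinstance(validator, dict):
        return
    if isinstance(validator, type) and issubclass(validator, BaseModel):
        return
    raise TypeError(
        f"validator must be a dict or a Pydantic model class, got {type(validator).__name__}"
    )


class SearchParams(BaseModel):
    top_k: int
    threshold: float


# All three accepted forms pass silently.
check_validator(None)
check_validator({"type": "object"})
check_validator(SearchParams)

# Anything else is rejected.
try:
    check_validator("not a validator")
except TypeError as exc:
    print(exc)
```

Note that this sketch expects the model class itself, matching the `type[BaseModel]` annotation, rather than a model instance.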

extract_parameters(query, **kwargs) async

Extracts retrieval parameters from the input query.

This method is a wrapper around the _extract_parameters method, which performs the actual parameter extraction. It also includes validation of the extracted parameters using the _validate_parameters method.
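
The wrapper pattern described above can be sketched as follows. This is a stand-in written for illustration, not the library's implementation: `SketchExtractor`, `KeywordExtractor`, and the `required_keys` argument are hypothetical, `_extract_parameters` is assumed to be a coroutine, and wrapping every failure in `RuntimeError` is an assumption based on the documented exception.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any


class SketchExtractor(ABC):
    """Stand-in for the documented base class; not the library's code."""

    def __init__(self, required_keys: tuple[str, ...] = ()) -> None:
        # The real class takes a JSON Schema or Pydantic model; a tuple of
        # required keys keeps this sketch dependency-free.
        self.required_keys = required_keys

    async def extract_parameters(self, query: str, **kwargs: Any) -> dict[str, Any]:
        """Public wrapper: extract first, then validate the result."""
        try:
            params = await self._extract_parameters(query, **kwargs)
            self._validate_parameters(params)
        except Exception as exc:
            raise RuntimeError(f"Parameter extraction failed: {exc}") from exc
        return params

    @abstractmethod
    async def _extract_parameters(self, query: str, **kwargs: Any) -> dict[str, Any]:
        """Subclasses implement the actual extraction here."""

    def _validate_parameters(self, params: dict[str, Any]) -> None:
        missing = [key for key in self.required_keys if key not in params]
        if missing:
            raise ValueError(f"Missing required parameters: {missing}")


class KeywordExtractor(SketchExtractor):
    """Toy subclass: uses the stripped query and reads top_k from kwargs."""

    async def _extract_parameters(self, query: str, **kwargs: Any) -> dict[str, Any]:
        return {"query": query.strip(), "top_k": kwargs.get("top_k", 5)}


extractor = KeywordExtractor(required_keys=("query",))
result = asyncio.run(extractor.extract_parameters("  cheap books ", top_k=3))
print(result)
```

Keeping validation in the public method means each subclass only implements `_extract_parameters`, and every extractor gets the same validation step.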

Parameters:

Name: query
Type: str
Description: The input query string.
Default: required

Name: **kwargs
Type: Any
Description: Additional keyword arguments to pass to the extractor.
Default: {} (optional)

Returns:

Type: dict[str, Any]
Description: A dictionary containing the extracted parameters.

Raises:

Type: RuntimeError
Description: If an error occurs during the extraction process.