Chunk

Defines the Chunk schema, which represents a chunk of content retrieved from a vector store.

Authors

Dimitrij Ray (dimitrij.ray@gdplabs.id)

References

NONE

Chunk

Bases: BaseModel

Represents a chunk of content retrieved from a vector store.

Attributes:

id (str): A unique identifier for the chunk. Defaults to a random UUID.

content (str | bytes): The content of the chunk, either text or binary.

metadata (dict[str, Any]): Additional metadata associated with the chunk. Defaults to an empty dictionary.

score (float | None): Similarity score of the chunk, if available. Defaults to None.

__repr__()

Return a string representation of the Chunk.

Returns:

str: The string representation of the Chunk.

is_binary()

Check if the content is binary.

Returns:

bool: True if the content is binary, False otherwise.

is_text()

Check if the content is text.

Returns:

bool: True if the content is text, False otherwise.

validate_content(value) classmethod

Validate the content of the Chunk.

This is a class method because Pydantic validators require it. As such, it follows the signature and conventions of a Pydantic validator.

Parameters:

value (str | bytes): The content to validate. Required.

Returns:

str | bytes: The validated content.

Raises:

ValueError: If the content is empty, or is neither a string nor bytes.
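
The documented checks can be sketched as a plain function. This is an illustration under assumptions, not the library's code: the name `check_content` and the error messages are hypothetical, the real method is a Pydantic class-method validator, and whether whitespace-only text counts as empty is not stated in the source (here it does not).

```python
def check_content(value):
    """Hypothetical sketch of the documented validation rules."""
    # Reject anything that is neither text nor binary.
    if not isinstance(value, (str, bytes)):
        raise ValueError("Content must be a string or bytes.")
    # Reject empty text ("") and empty binary (b"").
    if not value:
        raise ValueError("Content must not be empty.")
    return value


outcomes = []
for candidate in ("some text", b"raw bytes", "", 42):
    try:
        check_content(candidate)
        outcomes.append("ok")
    except ValueError:
        outcomes.append("ValueError")
```

The loop records which candidates pass: the non-empty string and bytes values are returned unchanged, while the empty string and the integer raise ValueError.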