
Developer-facing retrieval SDK over Valkey Search: index lifecycle, upsert, and vector + filtered query.


@betterdb/retrieval (Python)

betterdb-retrieval — developer-facing retrieval SDK over Valkey Search (FT.*): typed index schema, idempotent index lifecycle, upsert/delete, and vector + filtered + hybrid query. This is the Python equivalent of the TypeScript @betterdb/retrieval package, built on betterdb-valkey-search-kit.

Installation

pip install betterdb-retrieval valkey

Requires a Valkey server with the Valkey Search module loaded.

Quick start

from valkey.asyncio import Valkey

from betterdb_retrieval import Retriever, UpsertEntry

client = Valkey.from_url("redis://localhost:6379")


async def embed(text: str) -> list[float]:
    ...  # return an embedding


retriever = Retriever(
    client=client,
    name="docs",
    schema={
        "fields": {
            "category": {"type": "tag"},
            "year": {"type": "numeric", "sortable": True},
        },
        "vector": {"algorithm": "hnsw", "metric": "cosine"},
    },
    embed_fn=embed,
)

# Create the index if it doesn't exist (idempotent; dims resolved from embed_fn).
await retriever.create_index()

await retriever.upsert([
    UpsertEntry(
        id="doc1",
        text="Valkey is a high-performance key-value store",
        fields={"category": "db", "year": 2024},
    ),
])

hits = await retriever.query(
    text="fast in-memory database",
    k=5,
    filter={"category": "db"},
)
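The quick start leaves embed as a stub and relies on create_index() to resolve the vector dimension from it. The sketch below shows one way to fill that gap for local experiments: a deterministic toy embedding plus a dimension probe. The names toy_embed, probe_dims, and DIMS are hypothetical, and probing by embedding a throwaway string is an assumption about how dimension resolution could work, not a description of the SDK's internals. The vectors carry no semantic meaning and are only useful for wiring tests.

```python
import asyncio
import hashlib
import math

DIMS = 8  # hypothetical dimension, chosen only for this sketch


async def toy_embed(text: str) -> list[float]:
    """Deterministic stand-in for a real embedding model (not semantic)."""
    vec = [0.0] * DIMS
    for token in text.lower().split():
        # Hash each token into one of DIMS buckets and count it there.
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % DIMS] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    # Normalise to unit length; an empty string stays all zeros.
    return [x / norm for x in vec] if norm else vec


async def probe_dims(embed_fn) -> int:
    """Find the vector dimension by embedding a throwaway string."""
    return len(await embed_fn("probe"))


dims = asyncio.run(probe_dims(toy_embed))
sample = asyncio.run(toy_embed("Valkey is a high-performance key-value store"))
```

A function with this shape could be passed as embed_fn in the quick start. The await calls shown there must themselves run inside a coroutine, for example one started with asyncio.run.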

Retriever API

  • create_index() — create the index if absent (idempotent). Vector dimension is taken from schema["vector"]["dims"] or resolved by probing embed_fn.
  • upsert(entries) — embed each entry's text and write it as a hash with its fields.
  • delete(ids) — delete documents by id.
  • query(*, k, text=None, vector=None, filter=None, hybrid=None) — KNN search. Provide either text (embedded for you) or a precomputed vector, plus a positive k. Optionally pass filter to restrict results on tag/numeric fields, and hybrid="rerank" to post-process hits through a rerank_fn. Returns list[QueryHit].
  • describe_index() / health() — index stats: doc count, indexing state, dimension, percent indexed, and an optional estimated recall.
  • drop_index() — drop the index (no-op if it doesn't exist).
  • register() / unregister() — publish/remove a discovery marker in the shared __betterdb:caches registry, ownership-checked so it never clobbers a foreign cache type.
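To make the filtered KNN query more concrete, the sketch below shows how a filter dictionary such as the one in the quick start could map onto an FT.SEARCH query string. This is an illustration, not the SDK's implementation: build_knn_query, the vector field name embedding, and the parameter name $vec are all hypothetical, and only exact-match tag and numeric filters are handled. Tag values are not escaped here, which a real implementation would need to do.

```python
from numbers import Real
from typing import Optional


def build_knn_query(
    k: int,
    filter: Optional[dict] = None,
    vector_field: str = "embedding",  # hypothetical field name
) -> str:
    """Build a filtered KNN query string in FT.SEARCH syntax."""
    if k <= 0:
        raise ValueError("k must be positive")
    clauses = []
    for name, value in (filter or {}).items():
        if isinstance(value, bool):
            raise TypeError("boolean filter values are not handled in this sketch")
        if isinstance(value, Real):
            # Numeric exact match is written as a closed range [v v].
            clauses.append(f"@{name}:[{value} {value}]")
        else:
            # Tag match; the value is assumed to contain no special characters.
            clauses.append(f"@{name}:{{{value}}}")
    # Space-separated clauses are ANDed; "*" means no pre-filter.
    prefilter = "(" + " ".join(clauses) + ")" if clauses else "*"
    return f"{prefilter}=>[KNN {k} @{vector_field} $vec]"


query_string = build_knn_query(5, {"category": "db"})
# query_string == "(@category:{db})=>[KNN 5 @embedding $vec]"
```

The query vector itself would be supplied separately as a binary parameter, which is why it appears only as the $vec placeholder in the string.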

QueryHit.score is the raw KNN vector distance, not a similarity: lower is closer, so rank hits in ascending order of score.
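The difference between distance and similarity is easy to get backwards, so here is a small self-contained sketch of the ranking rule. It assumes the common convention that cosine distance equals 1 minus cosine similarity; whether the index reports exactly this value is an assumption. Hit is a hypothetical stand-in for QueryHit with only the two attributes needed here.

```python
import math
from dataclasses import dataclass


@dataclass
class Hit:  # hypothetical stand-in for QueryHit
    id: str
    score: float  # distance: lower means closer


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity, so identical directions give 0.0."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


query = [1.0, 0.0]
docs = {
    "opposite": [-1.0, 0.0],
    "same": [2.0, 0.0],
    "orthogonal": [0.0, 3.0],
}
hits = [Hit(id=name, score=cosine_distance(query, vec)) for name, vec in docs.items()]

# Best match first: sort by ascending distance.
ranked = sorted(hits, key=lambda h: h.score)

# Convert to a similarity only for display, under the stated assumption.
similarities = {h.id: 1.0 - h.score for h in ranked}
```

Here ranked puts "same" first, then "orthogonal", then "opposite". Sorting in descending order of score would invert the result and return the worst match first.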

Observability

Pass metrics (a RetrievalMetrics) and/or tracer (a RetrievalTracer) to instrument every operation. create_prometheus_metrics() provides a ready-made prometheus-client implementation.

Development

uv run --extra dev pytest tests -q

License

MIT
