Developer-facing retrieval SDK over Valkey Search: index lifecycle, upsert, and vector + filtered query.

@betterdb/retrieval (Python)

betterdb-retrieval is a developer-facing retrieval SDK over Valkey Search (FT.*). It provides a typed index schema, an idempotent index lifecycle, upsert/delete, and vector, filtered, and hybrid queries. It is the Python equivalent of the TypeScript @betterdb/retrieval package and is built on betterdb-valkey-search-kit.

Installation

pip install betterdb-retrieval valkey

Requires a Valkey server with the Valkey Search module loaded.

Quick start

import asyncio

from valkey.asyncio import Valkey

from betterdb_retrieval import Retriever, UpsertEntry

client = Valkey.from_url("redis://localhost:6379")


async def embed(text: str) -> list[float]:
    ...  # return an embedding


async def main() -> None:
    retriever = Retriever(
        client=client,
        name="docs",
        schema={
            "fields": {
                "category": {"type": "tag"},
                "year": {"type": "numeric", "sortable": True},
            },
            "vector": {"algorithm": "hnsw", "metric": "cosine"},
        },
        embed_fn=embed,
    )

    # Create the index if it doesn't exist (idempotent; dims resolved from embed_fn).
    await retriever.create_index()

    # Embed the text and store it as a hash alongside its fields.
    await retriever.upsert([
        UpsertEntry(
            id="doc1",
            text="Valkey is a high-performance key-value store",
            fields={"category": "db", "year": 2024},
        ),
    ])

    # KNN search restricted to documents tagged category=db.
    hits = await retriever.query(
        text="fast in-memory database",
        k=5,
        filter={"category": "db"},
    )
    print(hits)


# The awaits must run inside a coroutine when executed as a script.
asyncio.run(main())
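
The quick start leaves embed as a stub. For local experiments without an embedding model, a deterministic stand-in may be enough to exercise index creation, upsert, and query. The sketch below is an assumption and not part of betterdb-retrieval: toy_embed and DIMS are hypothetical names, and the hashed bag-of-words vectors it produces carry no semantic meaning.

```python
import asyncio
import hashlib
import math

DIMS = 64  # hypothetical dimension; a real embedding model dictates its own


async def toy_embed(text: str) -> list[float]:
    """Hash each lowercase token into one of DIMS buckets, then L2-normalize."""
    vec = [0.0] * DIMS
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % DIMS
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    # An empty input yields an all-zero vector; return it as-is to avoid dividing by zero.
    return [x / norm for x in vec] if norm else vec


if __name__ == "__main__":
    print(len(asyncio.run(toy_embed("fast in-memory database"))))
```

Such a function has the same shape as the embed stub (async, str in, list[float] out), so it could presumably be passed as embed_fn while wiring things up, then swapped for a real model later.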

Retriever API

  • create_index() — create the index if absent (idempotent). Vector dimension is taken from schema["vector"]["dims"] or resolved by probing embed_fn.
  • upsert(entries) — embed each entry's text and write it as a hash with its fields.
  • delete(ids) — delete documents by id.
  • query(*, k, text=None, vector=None, filter=None, hybrid=None) — KNN search. Provide text (embedded for you) or a precomputed vector, a positive k, an optional filter (tag/numeric fields), and hybrid="rerank" to post-process hits through a rerank_fn. Returns list[QueryHit].
  • describe_index() / health() — index stats: doc count, indexing state, dimension, percent indexed, and an optional estimated recall.
  • drop_index() — drop the index (no-op if it doesn't exist).
  • register() / unregister() — publish/remove a discovery marker in the shared __betterdb:caches registry, ownership-checked so it never clobbers a foreign cache type.
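
The dimension rule described for create_index() (explicit schema value first, otherwise probe embed_fn) can be sketched in isolation. This is an illustration of the idea only, not the library's implementation: resolve_dims, EmbedFn, and the probe text are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Optional

EmbedFn = Callable[[str], Awaitable[list[float]]]


async def resolve_dims(schema: dict, embed_fn: Optional[EmbedFn] = None) -> int:
    """Return schema["vector"]["dims"] if set, else the length of one probe embedding."""
    dims = schema.get("vector", {}).get("dims")
    if dims is not None:
        return int(dims)
    if embed_fn is None:
        raise ValueError("set schema['vector']['dims'] or provide embed_fn")
    probe = await embed_fn("dimension probe")  # hypothetical probe text
    if not probe:
        raise ValueError("embed_fn returned an empty vector")
    return len(probe)
```

Setting dims explicitly avoids one embedding call at index creation, which may matter when embed_fn is a paid or rate-limited API.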

QueryHit.score is the raw KNN vector distance, not a similarity: lower means closer, so rank hits in ascending order of score.
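
If a similarity is easier to present than a distance, it can be derived after the query. The sketch below assumes the cosine metric from the quick start and assumes the server reports cosine distance as 1 minus cosine similarity. Hit is a hypothetical stand-in for QueryHit that models only an id and a score; the real class may differ.

```python
from dataclasses import dataclass


@dataclass
class Hit:  # hypothetical stand-in for QueryHit
    id: str
    score: float  # raw distance: lower is closer


def rank_with_similarity(hits: list[Hit]) -> list[tuple[str, float]]:
    """Sort by distance ascending and pair each id with 1 - distance."""
    ordered = sorted(hits, key=lambda h: h.score)
    # Assumption: cosine distance = 1 - cosine similarity.
    return [(h.id, 1.0 - h.score) for h in ordered]
```

For other metrics (for example L2), 1 - distance is not a meaningful similarity, so only the ascending sort would carry over.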

Observability

Pass metrics (a RetrievalMetrics) and/or tracer (a RetrievalTracer) to instrument every operation. create_prometheus_metrics() provides a ready-made prometheus-client implementation.

Development

uv run --extra dev pytest tests -q

License

MIT
