Developer-facing retrieval SDK over Valkey Search: index lifecycle, upsert, and vector + filtered query.

@betterdb/retrieval (Python)

betterdb-retrieval is a developer-facing retrieval SDK over Valkey Search (FT.*). It provides a typed index schema, an idempotent index lifecycle, upsert/delete, and vector, filtered, and hybrid queries. It is the Python equivalent of the TypeScript @betterdb/retrieval package and is built on betterdb-valkey-search-kit.

Installation

pip install betterdb-retrieval valkey

Requires a Valkey server with the Valkey Search module loaded.

Quick start

import asyncio

from valkey.asyncio import Valkey

from betterdb_retrieval import Retriever, UpsertEntry


async def embed(text: str) -> list[float]:
    ...  # return an embedding


async def main() -> None:
    # Create the client inside the running event loop.
    client = Valkey.from_url("redis://localhost:6379")

    retriever = Retriever(
        client=client,
        name="docs",
        schema={
            "fields": {
                "category": {"type": "tag"},
                "year": {"type": "numeric", "sortable": True},
            },
            "vector": {"algorithm": "hnsw", "metric": "cosine"},
        },
        embed_fn=embed,
    )

    # Create the index if it doesn't exist (idempotent; dims resolved from embed_fn).
    await retriever.create_index()

    # Embed the text and store it together with its filterable fields.
    await retriever.upsert([
        UpsertEntry(
            id="doc1",
            text="Valkey is a high-performance key-value store",
            fields={"category": "db", "year": 2024},
        ),
    ])

    # KNN search over the embedded query text, restricted to category "db".
    hits = await retriever.query(
        text="fast in-memory database",
        k=5,
        filter={"category": "db"},
    )
    print(hits)


# `await` is only valid inside a coroutine, so drive everything from asyncio.run().
asyncio.run(main())
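
The quick start leaves embed as a stub. For local smoke tests without a model, a deterministic stand-in can be enough to exercise the index lifecycle. The following is only a toy sketch and not part of betterdb-retrieval: toy_embed and the DIMS value of 64 are hypothetical names chosen for illustration, and a hashed bag-of-words vector carries no real semantic meaning.

```python
import asyncio
import hashlib
import math

DIMS = 64  # arbitrary vector size chosen for this sketch


async def toy_embed(text: str) -> list[float]:
    """Hash each lowercase token into one of DIMS buckets, then L2-normalize."""
    vec = [0.0] * DIMS
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % DIMS
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return vec  # empty input: return the zero vector unchanged
    return [x / norm for x in vec]
```

A function like this could be passed as embed_fn in place of the stub. Swap in a real embedding model before judging retrieval quality, and note that the zero vector returned for empty text may not be meaningful under a cosine metric.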

Retriever API

  • create_index() — create the index if absent (idempotent). Vector dimension is taken from schema["vector"]["dims"] or resolved by probing embed_fn.
  • upsert(entries) — embed each entry's text and write it as a hash with its fields.
  • delete(ids) — delete documents by id.
  • query(*, k, text=None, vector=None, filter=None, hybrid=None) — KNN search. Provide text (embedded for you) or a precomputed vector, a positive k, an optional filter (tag/numeric fields), and hybrid="rerank" to post-process hits through a rerank_fn. Returns list[QueryHit].
  • describe_index() / health() — index stats: doc count, indexing state, dimension, percent indexed, and an optional estimated recall.
  • drop_index() — drop the index (no-op if it doesn't exist).
  • register() / unregister() — publish/remove a discovery marker in the shared __betterdb:caches registry, ownership-checked so it never clobbers a foreign cache type.
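
The dimension rule described for create_index() can be pictured roughly as follows. This is a sketch of the idea only, not the library's actual code: resolve_dims, EmbedFn, and the probe string are hypothetical names, and the real implementation may probe differently.

```python
import asyncio
from typing import Awaitable, Callable

# Assumed shape of embed_fn, matching the quick start: async text -> vector.
EmbedFn = Callable[[str], Awaitable[list[float]]]


async def resolve_dims(schema: dict, embed_fn: EmbedFn, probe_text: str = "probe") -> int:
    """Return schema["vector"]["dims"] if set, else measure embed_fn's output length."""
    dims = schema.get("vector", {}).get("dims")
    if dims is not None:
        return int(dims)
    # No explicit dims: embed a throwaway string and count the components.
    vector = await embed_fn(probe_text)
    if not vector:
        raise ValueError("embed_fn returned an empty vector; cannot infer dims")
    return len(vector)
```

Under this reading, setting dims explicitly in the schema avoids one embedding call at startup, which may matter when embed_fn calls a paid or rate-limited API.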

QueryHit.score is the raw KNN vector distance, not a similarity: lower is closer, so rank hits in ascending order of score.
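
To make the distance convention concrete, here is a small standalone sketch. It assumes the cosine metric reports 1 minus the cosine similarity, which is the usual convention for vector search engines but should be checked against the Valkey Search documentation. Hit is a hypothetical stand-in for QueryHit, and cosine_distance and rank are illustrative helpers, not library functions.

```python
import math
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical stand-in for QueryHit; only the score attribute is from the docs."""
    id: str
    score: float  # distance: lower means closer


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance under the assumed convention: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank(hits: list[Hit]) -> list[Hit]:
    """Order hits best-first, i.e. by ascending distance."""
    return sorted(hits, key=lambda h: h.score)
```

Under that assumption, identical directions give a distance of 0 and orthogonal vectors give 1, so code that expects "higher is better" would need to convert with 1 - score before comparing against a similarity threshold.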

Observability

Pass metrics (a RetrievalMetrics) and/or tracer (a RetrievalTracer) to instrument every operation. create_prometheus_metrics() provides a ready-made prometheus-client implementation.

Development

uv run --extra dev pytest tests -q

License

MIT
