Developer-facing retrieval SDK over Valkey Search: index lifecycle, upsert, and vector + filtered query.

@betterdb/retrieval (Python)

betterdb-retrieval — developer-facing retrieval SDK over Valkey Search (FT.*): typed index schema, idempotent index lifecycle, upsert/delete, and vector + filtered + hybrid query. This is the Python equivalent of the TypeScript @betterdb/retrieval package, built on betterdb-valkey-search-kit.

Installation

pip install betterdb-retrieval valkey

Requires a Valkey server with the Valkey Search module loaded.

Quick start

import asyncio

from valkey.asyncio import Valkey

from betterdb_retrieval import Retriever, UpsertEntry


async def embed(text: str) -> list[float]:
    ...  # return an embedding for `text`


async def main() -> None:
    client = Valkey.from_url("redis://localhost:6379")

    retriever = Retriever(
        client=client,
        name="docs",
        schema={
            "fields": {
                "category": {"type": "tag"},
                "year": {"type": "numeric", "sortable": True},
            },
            "vector": {"algorithm": "hnsw", "metric": "cosine"},
        },
        embed_fn=embed,
    )

    # Create the index if it doesn't exist (idempotent; dims resolved from embed_fn).
    await retriever.create_index()

    # Embed the text and store it as a hash alongside its fields.
    await retriever.upsert([
        UpsertEntry(
            id="doc1",
            text="Valkey is a high-performance key-value store",
            fields={"category": "db", "year": 2024},
        ),
    ])

    # KNN search restricted to documents tagged category=db.
    hits = await retriever.query(
        text="fast in-memory database",
        k=5,
        filter={"category": "db"},
    )
    print(hits)


# `await` is only valid inside a coroutine, so drive everything from asyncio.run().
asyncio.run(main())
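
The quick start leaves embed unimplemented. For local smoke tests without a model, a deterministic stand-in can be enough. The sketch below is an assumption, not part of betterdb-retrieval: toy_embed, probe_dims, and DIMS are hypothetical names, and the vectors carry no semantic meaning. It also illustrates one plausible way a vector dimension can be discovered by probing an embedder; the library's own probing may work differently.

```python
import asyncio
import hashlib
import math

DIMS = 64  # hypothetical embedding size, chosen only for this sketch


async def toy_embed(text: str) -> list[float]:
    """Hash each token into one of DIMS buckets, then L2-normalize."""
    vec = [0.0] * DIMS
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % DIMS
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    # Empty input yields the zero vector, which cannot be normalized.
    return [x / norm for x in vec] if norm else vec


async def probe_dims(embed_fn) -> int:
    """Call the embedder once and measure the length of its output."""
    return len(await embed_fn("dimension probe"))


dims = asyncio.run(probe_dims(toy_embed))
sample = asyncio.run(toy_embed("Valkey is a high-performance key-value store"))
```

Passing embed_fn=toy_embed to the Retriever in the quick start should be enough to exercise index creation, upsert, and query end to end, with the caveat that the results will not be semantically meaningful.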

Retriever API

  • create_index() — create the index if absent (idempotent). Vector dimension is taken from schema["vector"]["dims"] or resolved by probing embed_fn.
  • upsert(entries) — embed each entry's text and write it as a hash with its fields.
  • delete(ids) — delete documents by id.
  • query(*, k, text=None, vector=None, filter=None, hybrid=None) — KNN search. Provide text (embedded for you) or a precomputed vector, a positive k, an optional filter (tag/numeric fields), and hybrid="rerank" to post-process hits through a rerank_fn. Returns list[QueryHit].
  • describe_index() / health() — index stats: doc count, indexing state, dimension, percent indexed, and an optional estimated recall.
  • drop_index() — drop the index (no-op if it doesn't exist).
  • register() / unregister() — publish/remove a discovery marker in the shared __betterdb:caches registry, ownership-checked so it never clobbers a foreign cache type.
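
To make the filtered KNN query above more concrete, here is a minimal sketch of how a filter dict could be turned into a Valkey Search query string. This is not the library's implementation. The helper build_knn_query, the vector field name "vector", the parameter name "vec", and the (low, high) tuple form for numeric ranges are all assumptions made for illustration; only the equality form {"category": "db"} appears in the quick start. Tag values are not escaped here.

```python
from __future__ import annotations


def build_knn_query(
    k: int,
    flt: dict | None = None,
    vector_field: str = "vector",  # assumed field name
    param: str = "vec",            # assumed query parameter name
) -> str:
    """Build a pre-filtered KNN query string in FT.SEARCH syntax."""
    if k <= 0:
        raise ValueError("k must be positive")
    clauses = []
    for field, value in (flt or {}).items():
        if isinstance(value, tuple):
            # Hypothetical numeric range form: (low, high), inclusive.
            low, high = value
            clauses.append(f"@{field}:[{low} {high}]")
        else:
            # Plain values are treated as exact tag matches.
            clauses.append(f"@{field}:{{{value}}}")
    # "*" means no pre-filter: search the whole index.
    prefilter = "(" + " ".join(clauses) + ")" if clauses else "*"
    return f"{prefilter}=>[KNN {k} @{vector_field} ${param}]"


unfiltered = build_knn_query(5)
tag_only = build_knn_query(5, {"category": "db"})
tag_and_range = build_knn_query(5, {"category": "db", "year": (2020, 2024)})
```

The resulting string would be passed to FT.SEARCH together with the query vector bound to the named parameter as packed float bytes.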

QueryHit.score is the raw KNN vector distance (lower is closer), not a similarity — rank ascending.
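
Because the score is a distance, code that expects "higher is better" needs a conversion. The sketch below uses a hypothetical Hit dataclass as a stand-in for QueryHit (only the score attribute is taken from the text above) and assumes the cosine metric reports distance as 1 minus cosine similarity, which is the usual convention; verify this against your server before relying on it.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical stand-in for QueryHit; not the library's class."""
    id: str
    score: float  # raw KNN distance, lower is closer


def rank_hits(hits: list[Hit]) -> list[Hit]:
    """Order hits from closest to farthest (ascending distance)."""
    return sorted(hits, key=lambda h: h.score)


def cosine_similarity_from_distance(distance: float) -> float:
    """Convert a cosine distance to a similarity, assuming d = 1 - cos."""
    return 1.0 - distance


hits = [Hit("a", 0.42), Hit("b", 0.05), Hit("c", 0.17)]
ranked = rank_hits(hits)
similarities = [cosine_similarity_from_distance(h.score) for h in ranked]
```

Under that assumption the closest hit, at distance 0.05, maps to a similarity of about 0.95.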

Observability

Pass metrics (a RetrievalMetrics) and/or tracer (a RetrievalTracer) to instrument every operation. create_prometheus_metrics() provides a ready-made prometheus-client implementation.

Development

uv run --extra dev pytest tests -q

License

MIT

Download files

Source Distribution

betterdb_retrieval-0.3.0.tar.gz (36.6 kB view details)

Uploaded Source

Built Distribution

betterdb_retrieval-0.3.0-py3-none-any.whl (16.5 kB view details)

Uploaded Python 3

File details

Details for the file betterdb_retrieval-0.3.0.tar.gz.

File metadata

  • Download URL: betterdb_retrieval-0.3.0.tar.gz
  • Upload date:
  • Size: 36.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for betterdb_retrieval-0.3.0.tar.gz
Algorithm Hash digest
SHA256 634a039fcba686f60121eb218c63533b5047fcc1be92f18bc34f2428219f64da
MD5 eb1519b519d6ad35414c3167492612cc
BLAKE2b-256 1088d0019fb0f46c0e3dec411be772ffc9cde54539f48c7c3105ab5dd6fb7dcf

Provenance

The following attestation bundles were made for betterdb_retrieval-0.3.0.tar.gz:

Publisher: retrieval-py-release.yml on BetterDB-inc/monitor

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file betterdb_retrieval-0.3.0-py3-none-any.whl.

File hashes

Hashes for betterdb_retrieval-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 c87625217b0d1d0f9a72513889378f6e60cf50a2eb9d4bbd18664e653aabb8d4
MD5 6c0111befc8f478b7285e7324210da4a
BLAKE2b-256 12d59960c7df26cd7fcf1af2c5c463c97217ce0c0f5f23f6805d8ab92bab7a03

Provenance

The following attestation bundles were made for betterdb_retrieval-0.3.0-py3-none-any.whl:

Publisher: retrieval-py-release.yml on BetterDB-inc/monitor
