Maintainers

kivanow

betterdb-semantic-cache

Semantic cache for AI workloads backed by Valkey vector search. Embeddings-based similarity matching with OpenTelemetry and Prometheus instrumentation.

Installation

pip install betterdb-semantic-cache
# With OpenAI embeddings (quoted so shells such as zsh do not expand the brackets):
pip install "betterdb-semantic-cache[openai]"
# All extras:
pip install "betterdb-semantic-cache[all]"

Quick start

import asyncio
import valkey.asyncio as valkey
from betterdb_semantic_cache import SemanticCache, SemanticCacheOptions
from betterdb_semantic_cache.embed.openai import create_openai_embed

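async def main():
    # Async Valkey client; adjust host and port to your deployment.
    client = valkey.Valkey(host="localhost", port=6399)
    cache = SemanticCache(SemanticCacheOptions(
        client=client,
        embed_fn=create_openai_embed(),
        default_threshold=0.12,  # cosine distance threshold (see CacheCheckOptions)
    ))
    # One-time setup before the first check() or store().
    await cache.initialize()

    result = await cache.check("What is the capital of France?")
    if not result.hit:
        # On a miss, store the prompt with its answer so similar prompts can hit later.
        await cache.store("What is the capital of France?", "Paris")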

asyncio.run(main())

LLM-as-judge

When a hit lands in the uncertainty band (threshold - uncertainty_band < score <= threshold), you can supply a judge_fn to adjudicate automatically instead of handling confidence == 'uncertain' yourself.
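To make the band concrete, here is a minimal sketch that sorts a cosine-distance score (lower means closer) into the three outcomes. The function name `classify` and the band width of 0.05 are illustrative assumptions, and treating every score at or below `threshold - uncertainty_band` as high confidence is inferred from the inequality above, not taken from the library's source.

```python
def classify(score: float, threshold: float, uncertainty_band: float) -> str:
    """Map a cosine-distance score to 'high', 'uncertain', or 'miss'.

    Hypothetical helper for illustration; not part of betterdb_semantic_cache.
    """
    if score > threshold:
        return "miss"        # too far from any cached prompt
    if score > threshold - uncertainty_band:
        return "uncertain"   # inside the band: the case a judge_fn adjudicates
    return "high"            # comfortably below the threshold


for score in (0.02, 0.10, 0.12, 0.20):
    print(score, classify(score, threshold=0.12, uncertainty_band=0.05))
```

Only the middle outcome is handed to a judge, which is configured per call as follows.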

from betterdb_semantic_cache import JudgeOptions
from betterdb_semantic_cache.types import CacheCheckOptions

result = await cache.check(user_prompt, CacheCheckOptions(
    judge=JudgeOptions(
        judge_fn=my_judge,
        on_error="accept",   # fail-open on judge errors (default)
        timeout_ms=2000,     # per-call timeout (default)
    )
))

A minimal OpenAI judge:

from openai import AsyncOpenAI

openai = AsyncOpenAI()

async def my_judge(inp: dict) -> bool:
    # Return True to accept (confidence → 'high')
    # Return False to reject (treated as miss with nearest_miss)
    verdict = await openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Reply YES or NO only."},
            {"role": "user", "content": (
                f"Does this cached response correctly answer the prompt?\n"
                f"Prompt: {inp['prompt']}\nResponse: {inp['response']}"
            )},
        ],
    )
    # Tolerate leading whitespace and lowercase replies such as "yes".
    return (verdict.choices[0].message.content or "").strip().upper().startswith("YES")

When the judge is invoked: only for confidence == 'uncertain' hits. High-confidence hits, misses, and the zero-candidates case bypass the judge entirely.

Accept path: result.hit == True, result.confidence == 'high'.

Reject path: result.hit == False, result.nearest_miss populated with delta_to_threshold <= 0 (use this to distinguish judge rejections from regular misses where delta_to_threshold > 0).
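The accept and reject paths can be told apart in application code roughly as sketched below. The helper name `miss_reason` is hypothetical, the `SimpleNamespace` objects are stand-ins shaped like the fields named above (not real library types), and the assumption that `nearest_miss` is `None` when there were no candidates is not confirmed by this page.

```python
from types import SimpleNamespace


def miss_reason(result) -> str:
    """Describe the outcome of a check() call using the fields named above."""
    if result.hit:
        return "hit"
    if result.nearest_miss is None:
        # Assumption: no candidate at all leaves nearest_miss unset.
        return "no candidates"
    if result.nearest_miss.delta_to_threshold <= 0:
        # Within the threshold but still a miss: the judge said no.
        return "judge rejected"
    return "regular miss"


# Stand-in results, only to make the sketch runnable.
rejected = SimpleNamespace(hit=False, nearest_miss=SimpleNamespace(delta_to_threshold=-0.01))
too_far = SimpleNamespace(hit=False, nearest_miss=SimpleNamespace(delta_to_threshold=0.08))
print(miss_reason(rejected), "/", miss_reason(too_far))
```

Logging this label alongside the prompt may help when tuning the threshold, since a high share of judge rejections suggests the threshold is too loose for that traffic.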

Composing with rerank: when both rerank and judge are set, the judge receives the reranked pick's response and similarity score.

check_batch() does not support judge. Call check() individually for prompts that need adjudication.
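One way to adjudicate several prompts is to fan out individual `check()` calls, as in this minimal sketch. The helper `check_each` and the `FakeCache` stand-in are hypothetical, and whether concurrent `check()` calls on a single cache instance are advisable is an assumption; a plain sequential loop is the conservative alternative.

```python
import asyncio


async def check_each(cache, prompts, options):
    """Call cache.check() once per prompt, concurrently, preserving input order."""
    return await asyncio.gather(*(cache.check(p, options) for p in prompts))


class FakeCache:
    """Stand-in for SemanticCache, used only to make the sketch runnable."""

    async def check(self, prompt, options=None):
        return {"prompt": prompt, "options": options}


results = asyncio.run(check_each(FakeCache(), ["first", "second"], "judge-options"))
print([r["prompt"] for r in results])
```

With a real cache, `options` would be a `CacheCheckOptions` carrying the `judge` settings. Each uncertain hit then costs one judge call, so large batches can become expensive.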

CacheCheckOptions reference

Option	Type	Default	Description
`threshold`	`float`	`default_threshold`	Per-request cosine distance threshold override
`category`	`str`	`""`	Category tag for per-category thresholds and metric labels
`filter`	`str`	`None`	FT.SEARCH pre-filter expression (trusted input only)
`k`	`int`	`1`	KNN neighbours to fetch (ignored when `rerank` is set)
`stale_after_model_change`	`bool`	`False`	Evict and miss when stored model differs from `current_model`
`current_model`	`str`	`None`	Model to compare against stored entries
`rerank`	`RerankOptions`	`None`	Rerank hook; see `RerankOptions`
`judge`	`JudgeOptions`	`None`	LLM-as-judge for borderline hits. Not supported by `check_batch()`; raises `SemanticCacheUsageError`
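Because `filter` is marked as trusted input only, any user-supplied value should be validated before it is interpolated into the expression. The sketch below builds a tag filter in the `@field:{value}` form used by FT.SEARCH and rejects anything outside a strict allow-list. The helper name `tag_filter` and the `tenant` field are hypothetical, and whether your index defines such a tag field is an assumption.

```python
import re

# Letters, digits, and underscores only; everything else is rejected, not escaped.
_SAFE_TOKEN = re.compile(r"[A-Za-z0-9_]+")


def tag_filter(field: str, value: str) -> str:
    """Build an FT.SEARCH tag filter such as '@tenant:{acme}' from validated parts."""
    if not _SAFE_TOKEN.fullmatch(field) or not _SAFE_TOKEN.fullmatch(value):
        raise ValueError("field and value must match [A-Za-z0-9_]+")
    return f"@{field}:{{{value}}}"


print(tag_filter("tenant", "acme"))
```

The resulting string would be passed as `CacheCheckOptions(filter=...)`. Rejecting unexpected characters outright is a deliberately blunt choice: it avoids having to reason about query-syntax escaping rules.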

Telemetry

The published wheel includes anonymous product analytics powered by PostHog. When a baked-in API key is present in the package (injected at publish time), aggregate usage statistics (hit rate, cost saved) are collected on a per-instance basis. No prompt text, responses, or personally identifiable information is ever sent.

To opt out, set the environment variable before starting your process:

export BETTERDB_TELEMETRY=false   # also accepts: 0, no, off
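A deployment script can confirm that the opt-out is in place before the process starts. This is a minimal sketch of such a check, not the library's own parsing: the helper name `telemetry_opted_out` is hypothetical, and treating the value as case-insensitive with surrounding whitespace ignored is an assumption.

```python
import os

# The values listed above as opt-out settings.
_OPT_OUT_VALUES = {"false", "0", "no", "off"}


def telemetry_opted_out(environ=os.environ) -> bool:
    """Return True when BETTERDB_TELEMETRY holds one of the documented opt-out values."""
    value = environ.get("BETTERDB_TELEMETRY", "")
    return value.strip().lower() in _OPT_OUT_VALUES


print(telemetry_opted_out({"BETTERDB_TELEMETRY": "off"}))
```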

You can also disable it programmatically:

from betterdb_semantic_cache.types import AnalyticsOptions
cache = SemanticCache(SemanticCacheOptions(
    ...,
    analytics=AnalyticsOptions(disabled=True),
))

Release history

- 0.5.0 (this version): Jun 11, 2026
- 0.4.1: May 31, 2026
- 0.4.0: May 16, 2026
- 0.3.0: May 5, 2026
- 0.1.3: Apr 25, 2026
- 0.1.2: Apr 24, 2026

Download files

Source Distribution

betterdb_semantic_cache-0.5.0.tar.gz (96.9 kB)

Uploaded Jun 11, 2026 Source

Built Distribution

betterdb_semantic_cache-0.5.0-py3-none-any.whl (73.3 kB)

Uploaded Jun 11, 2026 Python 3

File details

Details for the file betterdb_semantic_cache-0.5.0.tar.gz.

File metadata

Download URL: betterdb_semantic_cache-0.5.0.tar.gz
Upload date: Jun 11, 2026
Size: 96.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for betterdb_semantic_cache-0.5.0.tar.gz
Algorithm	Hash digest
SHA256	`f6fca77cb89dbf2d174e38a25f162348d57b606ba7ad46f10ac3da396094daa5`
MD5	`b7cd0ac26b8f947ae704f2049259b3d2`
BLAKE2b-256	`8bcfb8f2ba7045e3ade182351747f3eb4f12ec569d0d333edf38e9f79aa7892b`
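A downloaded file can be checked against the SHA256 digest listed above with the standard library alone. The following is a minimal sketch; the helper name `sha256_matches` is an illustrative choice.

```python
import hashlib
from pathlib import Path


def sha256_matches(path, expected_hex: str) -> bool:
    """Return True if the file's SHA256 digest equals the expected hex string."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        # Read in 1 MiB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

For the source distribution, call `sha256_matches("betterdb_semantic_cache-0.5.0.tar.gz", "f6fca77cb89dbf2d174e38a25f162348d57b606ba7ad46f10ac3da396094daa5")` from the download directory; a `False` result means the file should not be installed.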

Provenance

The following attestation bundles were made for betterdb_semantic_cache-0.5.0.tar.gz:

Publisher: semantic-cache-py-release.yml on BetterDB-inc/monitor

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: betterdb_semantic_cache-0.5.0.tar.gz
- Subject digest: f6fca77cb89dbf2d174e38a25f162348d57b606ba7ad46f10ac3da396094daa5
- Sigstore transparency entry: 1789879272
- Sigstore integration time: Jun 11, 2026
Source repository:
- Permalink: BetterDB-inc/monitor@461a52205ae06cc490b6b1643b99cee83e064ba0
- Branch / Tag: refs/tags/semantic-cache-py-v0.5.0
- Owner: https://github.com/BetterDB-inc
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: semantic-cache-py-release.yml@461a52205ae06cc490b6b1643b99cee83e064ba0
- Trigger Event: push

File details

Details for the file betterdb_semantic_cache-0.5.0-py3-none-any.whl.

File metadata

Download URL: betterdb_semantic_cache-0.5.0-py3-none-any.whl
Upload date: Jun 11, 2026
Size: 73.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for betterdb_semantic_cache-0.5.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`84cd07d418eb20cd646dbc51a89b5af71ae87f69402d4a635ceb672fd4135b21`
MD5	`b2c354e40a507129293ae478cc06551b`
BLAKE2b-256	`5214f1e0707924bdfa62649a09e65ae356fcfa578d429c3c8fcb0544fb37673d`

Provenance

The following attestation bundles were made for betterdb_semantic_cache-0.5.0-py3-none-any.whl:

Publisher: semantic-cache-py-release.yml on BetterDB-inc/monitor

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: betterdb_semantic_cache-0.5.0-py3-none-any.whl
- Subject digest: 84cd07d418eb20cd646dbc51a89b5af71ae87f69402d4a635ceb672fd4135b21
- Sigstore transparency entry: 1789879351
- Sigstore integration time: Jun 11, 2026
Source repository:
- Permalink: BetterDB-inc/monitor@461a52205ae06cc490b6b1643b99cee83e064ba0
- Branch / Tag: refs/tags/semantic-cache-py-v0.5.0
- Owner: https://github.com/BetterDB-inc
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: semantic-cache-py-release.yml@461a52205ae06cc490b6b1643b99cee83e064ba0
- Trigger Event: push
