Semantic memory for LLM agent calls with an equivalence-first cache architecture.

SmartMemo

SmartMemo is a semantic memory and caching layer for LLM agent calls. Its core thesis is simple: cosine similarity is a useful candidate selector, but it is not semantic equivalence. SmartMemo uses embedding search to find likely cache candidates, then uses a learned equivalence classifier to decide whether a cached response is safe to reuse.

As of 0.2.0, SmartMemo ships a pretrained classifier, so that decision works out of the box — no training required.

The package includes:

- async SmartMemo.get_or_call(...)
- a bundled pretrained equivalence classifier (classifier-v2), opt-in with one line
- SQLite persistence
- an embedding provider protocol with SentenceTransformers embeddings and FAISS vector search
- a reproducible local-LLM training-data pipeline and a hand-curated gold test set
- classifier training, evaluation, checkpoint inference, and classifier-gated cache hits
- explicit and opt-in implicit feedback capture, durable export, and gated retraining

Without a classifier, SmartMemo decides cache hits with a cosine threshold — the measured baseline. With the bundled classifier, cosine search becomes the candidate selector and the learned classifier makes the final cache-hit decision.
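
The two decision modes described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not SmartMemo's internals: `decide_hit`, `toy_classifier`, the vectors, and the cosine threshold of 0.8 are hypothetical names and values chosen for the example.

```python
import math
from typing import Callable, Optional, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def decide_hit(
    query: Vector,
    cached: Vector,
    cosine_threshold: float,
    classifier: Optional[Callable[[Vector, Vector], float]] = None,
    classifier_threshold: float = 0.95,
) -> bool:
    """Return True when the cached entry may be reused for the query."""
    if cosine(query, cached) < cosine_threshold:
        return False  # not even a candidate
    if classifier is None:
        return True  # baseline mode: the cosine threshold alone decides
    # Gated mode: cosine only nominated the candidate; the classifier decides.
    return classifier(query, cached) >= classifier_threshold


def toy_classifier(a: Vector, b: Vector) -> float:
    """Hypothetical stand-in for a learned model.

    It scores a pair low when the last component flips sign, mimicking
    an "opposite action" that cosine similarity barely notices.
    """
    return 0.99 if a[-1] * b[-1] > 0 else 0.10


query = [0.9, 0.4, 0.1]
cached = [0.9, 0.4, -0.1]  # very close in cosine terms, but the last component flips

baseline_hit = decide_hit(query, cached, cosine_threshold=0.8)
gated_hit = decide_hit(query, cached, cosine_threshold=0.8, classifier=toy_classifier)
print(baseline_hit, gated_hit)
```

In this toy case the baseline serves the cached entry while the gated path rejects it, which is the failure mode the classifier is meant to catch.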

Install

SmartMemo's embedding and classifier stack depends on PyTorch, FAISS, and SentenceTransformers, so install the ml extra:

pip install "smartmemo[ml]"

For local development:

uv sync --all-extras
uv run pytest
uv run ruff check
uv run pyright

Minimal Example

import asyncio

from smartmemo import ClassifierConfig, SmartMemo

cache = SmartMemo(
    domain="customer-support",
    classifier=ClassifierConfig.bundled(),  # opt in to the bundled classifier-v2
)

async def call_llm(prompt: str) -> str:
    # Stand-in for a real LLM call.
    return "fresh LLM response"

async def main() -> None:
    # get_or_call is async, so it must be awaited inside a coroutine.
    result = await cache.get_or_call(
        prompt="Summarize this customer's latest billing ticket",
        llm_function=call_llm,
    )

    print(result.response)
    print(result.was_cache_hit)
    print(result.classifier_score)

asyncio.run(main())

The Bundled Classifier

classifier-v2 is a generic, cross-domain equivalence classifier shipped inside the package at smartmemo/_models/classifier-v2.pt. It is a small MLP over all-MiniLM-L6-v2 embeddings, trained on 16,576 labeled prompt pairs across nine domains. Positives come from a local LLM paraphraser; hard negatives come from templated same-object/opposite-action swaps, including negation. The whole data pipeline lives in scripts/generate_training_data.py.
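
To make "templated same-object/opposite-action swaps" concrete, here is a minimal sketch of how such hard negatives could be generated. The action pairs, the template, and the `prompt_a`/`prompt_b`/`label` field names are illustrative assumptions; they are not taken from scripts/generate_training_data.py.

```python
from typing import Dict, List

# Hypothetical opposite-action pairs, including one plain negation.
OPPOSITE_ACTIONS = [
    ("Approve", "Deny"),
    ("Open", "Close"),
    ("Escalate", "Do not escalate"),
]


def hard_negatives(objects: List[str]) -> List[Dict[str, object]]:
    """Pair every object with each opposite-action template.

    Both prompts share the same object, so they look alike on the surface,
    yet the label is 0: a cached answer to one must not be served for the other.
    """
    pairs: List[Dict[str, object]] = []
    for obj in objects:
        for action, opposite in OPPOSITE_ACTIONS:
            pairs.append(
                {
                    "prompt_a": f"{action} {obj}",
                    "prompt_b": f"{opposite} {obj}",
                    "label": 0,  # 0 = not equivalent (assumed encoding)
                }
            )
    return pairs


negatives = hard_negatives(["the customer's refund request"])
for pair in negatives:
    print(pair["prompt_a"], "|", pair["prompt_b"])
```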

The results below were measured on a hand-curated gold set of 84 held-out prompt pairs (31 equivalent, 53 not). The set deliberately includes opposite-action pairs, the case a fixed cosine threshold gets wrong:

Decision method	Precision	Recall	F1
Cosine baseline (at equal recall)	0.53	0.94	0.67
`classifier-v2` (threshold 0.95)	0.83	0.94	0.88

That is +30 precision points at equal recall: on this gold set the cosine baseline makes 26 false-positive cache hits where classifier-v2 makes 6. The full, auditable model card — including the in-distribution validation metrics — is smartmemo/_models/classifier-v2.report.json.
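
The table and the false-positive counts can be cross-checked with a few lines of arithmetic. One value here is an inference rather than a figure from the source: the true-positive count of 29 is assumed from recall 0.94 on 31 equivalent pairs (29/31 ≈ 0.935), which leaves 2 false negatives for both methods.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Standard binary-classification metrics from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


EQUIVALENT_PAIRS = 31  # from the gold set description
TRUE_POSITIVES = 29    # inferred from recall 0.94, not stated in the source
FALSE_NEGATIVES = EQUIVALENT_PAIRS - TRUE_POSITIVES

# False-positive counts are the ones quoted in the text: 26 vs 6.
cosine_metrics = precision_recall_f1(TRUE_POSITIVES, 26, FALSE_NEGATIVES)
classifier_metrics = precision_recall_f1(TRUE_POSITIVES, 6, FALSE_NEGATIVES)

for name, (p, r, f1) in [("cosine", cosine_metrics), ("classifier-v2", classifier_metrics)]:
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Under that assumption the computed values round to the figures in the table, and the precision gap is 0.83 - 0.53 = 0.30.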

classifier-v2 is still a generic cold-start model. On out-of-distribution, adversarial prompts it beats the cosine baseline but is not infallible — see the high-stakes benchmark below. It is bound to the all-MiniLM-L6-v2 embedding space (384 dimensions), and per-domain accuracy improves with the feedback-driven retraining loop below.

Benchmarks

uv run python benchmarks/cosine_baseline_customer_support.py
uv run python benchmarks/classifier_vs_cosine.py
uv run python benchmarks/false_positive_eval.py

The first benchmark shows the cosine baseline's false-positive failure mode on customer-support prompts. The second scores the bundled classifier against the cosine baseline on the gold set and writes benchmarks/results/classifier_vs_cosine.json.

The third runs a small, hand-authored set of high-stakes medical/legal/finance opposite-action prompts. On that adversarial set the cosine baseline wrongly serves 8 of 16 opposite-action pairs from cache; classifier-v2 wrongly serves 6 — better than cosine, but a reminder that a generic classifier is not infallible on out-of-distribution prompts and that domain retraining still matters. GPTCache and similar semantic caches decide hits by embedding similarity, so the cosine baseline here represents that class of tool.

Training Your Own Classifier

SmartMemo includes a trainable pair classifier over prompt embeddings. To reproduce the shipped model from the committed dataset:

uv run python scripts/train_classifier.py

To train on your own JSONL prompt pairs:

uv run smartmemo train-classifier \
  --data data/fixtures/customer_support_pairs.jsonl \
  --out models/classifier-custom.pt \
  --domain customer-support \
  --epochs 5

Then point SmartMemo at the checkpoint:

from smartmemo import ClassifierConfig, SmartMemo

cache = SmartMemo(
    domain="customer-support",
    classifier=ClassifierConfig(model_path="models/classifier-custom.pt"),
)

Feedback Export

SmartMemo records cache-hit lookups so explicit feedback can become training data:

# Run inside an async function. `user_rejected_answer` is a flag your application supplies.
result = await cache.get_or_call(
    prompt="Approve the customer's refund request",
    llm_function=call_llm,
)

if result.was_cache_hit and user_rejected_answer:
    await cache.report_bad_hit(result.query_id, reason="wrong refund decision")

written = cache.export_feedback_pairs("data/feedback_pairs.jsonl")
print(written)

The exported JSONL uses the same prompt-pair shape accepted by smartmemo train-classifier.
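
Before feeding an export back into training, it can help to inspect it. The sketch below assumes only that the file is JSONL (one JSON object per line) and makes no assumption about SmartMemo's actual field names. The `summarize_jsonl` helper and the keys in the demo records are hypothetical.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path
from typing import Tuple


def summarize_jsonl(path: str) -> Tuple[int, Counter]:
    """Count records and how often each top-level key appears."""
    records = 0
    keys: Counter = Counter()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue  # tolerate blank lines
            keys.update(json.loads(line).keys())
            records += 1
    return records, keys


# Demo on a throwaway file; these field names are made up for illustration.
demo = Path(tempfile.mkdtemp()) / "feedback_pairs.jsonl"
demo.write_text(
    '{"prompt_a": "Approve the refund", "prompt_b": "Deny the refund", "label": 0}\n'
    '{"prompt_a": "Reset my password", "prompt_b": "Help me reset my password", "label": 1}\n',
    encoding="utf-8",
)

count, key_counts = summarize_jsonl(str(demo))
print(count, dict(key_counts))
```

A key that appears in fewer records than the total is a quick sign of a malformed or mixed-shape export.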

Implicit Feedback

Users rarely file explicit feedback, but they do re-ask a question when the answer was unhelpful. Implicit feedback — opt-in, off by default — treats re-issuing the same prompt shortly after a cache hit as a signal that the earlier hit was bad, and records it automatically:

from smartmemo import CacheConfig, ImplicitFeedbackConfig, SmartMemo

cache = SmartMemo(
    domain="customer-support",
    config=CacheConfig(
        implicit_feedback=ImplicitFeedbackConfig(window_seconds=30.0),
    ),
)

When a re-issue is detected, CacheResult.implicit_bad_hit_recorded is True and an auto-labeled bad-hit event is recorded (told apart from explicit feedback by its metadata). Matching is exact — a re-phrased re-issue is not detected — and explicit feedback always takes precedence. See docs/feedback.md.
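
The detection rule described above (exact match, within a time window after a cache hit) can be sketched as follows. `ReissueDetector` is a hypothetical class written for this illustration, not SmartMemo's implementation; timestamps are passed in explicitly so the example is deterministic, and treating the window boundary as inclusive is an assumption.

```python
from typing import Dict


class ReissueDetector:
    """Flag an exact re-issue of a prompt shortly after a cache hit."""

    def __init__(self, window_seconds: float = 30.0) -> None:
        self.window_seconds = window_seconds
        self._last_hit_at: Dict[str, float] = {}

    def record_cache_hit(self, prompt: str, now: float) -> None:
        """Remember when this exact prompt string was last served from cache."""
        self._last_hit_at[prompt] = now

    def is_implicit_bad_hit(self, prompt: str, now: float) -> bool:
        """True when the same prompt string was served from cache within the window."""
        hit_at = self._last_hit_at.get(prompt)
        return hit_at is not None and (now - hit_at) <= self.window_seconds


detector = ReissueDetector(window_seconds=30.0)
detector.record_cache_hit("Where is my refund?", now=100.0)

same_soon = detector.is_implicit_bad_hit("Where is my refund?", now=110.0)
rephrased = detector.is_implicit_bad_hit("Where's my refund?", now=110.0)
same_late = detector.is_implicit_bad_hit("Where is my refund?", now=200.0)
print(same_soon, rephrased, same_late)
```

Only the first check fires: the rephrased prompt is a different string, and the late re-issue falls outside the 30-second window.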

Manual Retraining

Use smartmemo retrain to turn durable feedback into a candidate classifier checkpoint:

uv run smartmemo --db-path .smartmemo/cache.db retrain \
  --out models/classifier-candidate.pt \
  --validation-data data/validation_pairs.jsonl \
  --seed-data data/fixtures/customer_support_pairs.jsonl \
  --domain customer-support \
  --min-precision 0.95 \
  --promote-to models/classifier-active.pt

The command always trains a candidate and writes an auditable <checkpoint>.report.json. Promotion only copies the candidate to --promote-to when the validation gates pass. SmartMemo does not run background retraining or automatically reload classifiers at runtime.
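
The promotion rule can be pictured as a guarded file copy. The sketch below models only a single precision gate, whereas the real command may apply further validation gates. `promote_if_passing` and the placeholder file contents are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def promote_if_passing(
    candidate: Path, promote_to: Path, precision: float, min_precision: float
) -> bool:
    """Copy the candidate checkpoint only when validation precision meets the gate."""
    if precision < min_precision:
        return False  # candidate stays on disk; the active checkpoint is untouched
    promote_to.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(candidate, promote_to)
    return True


workdir = Path(tempfile.mkdtemp())
candidate = workdir / "classifier-candidate.pt"
active = workdir / "classifier-active.pt"
candidate.write_bytes(b"candidate-weights")  # placeholder bytes, not a real checkpoint

rejected = promote_if_passing(candidate, active, precision=0.91, min_precision=0.95)
exists_after_reject = active.exists()
promoted = promote_if_passing(candidate, active, precision=0.97, min_precision=0.95)
print(rejected, exists_after_reject, promoted, active.read_bytes() == candidate.read_bytes())
```

Because nothing reloads the classifier at runtime, a process would need to be restarted or reconfigured to pick up a newly promoted checkpoint.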

Release

Version 0.2.0 is configured for PyPI as smartmemo. The repository publishes through GitHub Actions trusted publishing from .github/workflows/publish-pypi.yml with the pypi environment.

git tag v0.2.0
git push origin v0.2.0

Pushing that tag triggers the workflow, which builds the source distribution and wheel and then uploads them to PyPI.

Project details

Maintainers

abhibuilds

Release history

Version	Date
0.3.0	May 20, 2026
0.2.0 (this version)	May 20, 2026
0.1.0	May 20, 2026
0.0.4	May 19, 2026
0.0.3	May 13, 2026
0.0.2	May 13, 2026
0.0.1	May 13, 2026

Download files

Source Distribution

smartmemo-0.2.0.tar.gz (1.4 MB)

Uploaded May 20, 2026 Source

Built Distribution

smartmemo-0.2.0-py3-none-any.whl (800.3 kB)

Uploaded May 20, 2026 Python 3

File details

Details for the file smartmemo-0.2.0.tar.gz.

File metadata

Download URL: smartmemo-0.2.0.tar.gz
Upload date: May 20, 2026
Size: 1.4 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for smartmemo-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`a905f4d9ce0547ab3cbca26d5f35f47bc49864cd70749b4d8fbf338ab5a7685c`
MD5	`e59039f529f3212966fa616035a1feea`
BLAKE2b-256	`db47f1f95404832bdcb84dbbed3b18ff6e88ff21046cceaa993c4083c14c6dbe`
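
A downloaded archive can be checked against the SHA256 digest above using only the standard library. This is a minimal sketch: `sha256_of` is a hypothetical helper, and the demo hashes a throwaway file so that it runs without network access.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Digest listed on this page for smartmemo-0.2.0.tar.gz.
EXPECTED_SDIST_SHA256 = "a905f4d9ce0547ab3cbca26d5f35f47bc49864cd70749b4d8fbf338ab5a7685c"

# Demo on a throwaway file so the example is self-contained.
demo = Path(tempfile.mkdtemp()) / "demo.bin"
demo.write_bytes(b"abc")
demo_digest = sha256_of(demo)
print(demo_digest)

# For the real archive, compare against the published digest:
# sha256_of(Path("smartmemo-0.2.0.tar.gz")) == EXPECTED_SDIST_SHA256
```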

Provenance

The following attestation bundles were made for smartmemo-0.2.0.tar.gz:

Publisher: publish-pypi.yml on awesome-pro/smartmemo

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: smartmemo-0.2.0.tar.gz
- Subject digest: a905f4d9ce0547ab3cbca26d5f35f47bc49864cd70749b4d8fbf338ab5a7685c
- Sigstore transparency entry: 1581902321
- Sigstore integration time: May 20, 2026
Source repository:
- Permalink: awesome-pro/smartmemo@749b1812c27a0f0ffd8a51c34916f669cb492176
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/awesome-pro
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@749b1812c27a0f0ffd8a51c34916f669cb492176
- Trigger Event: push

File details

Details for the file smartmemo-0.2.0-py3-none-any.whl.

File metadata

Download URL: smartmemo-0.2.0-py3-none-any.whl
Upload date: May 20, 2026
Size: 800.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for smartmemo-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c60107c51e5bd02e851932960f5e8d328a1b239cc1c28837e707ce337f6d1180`
MD5	`d7c811cc2392314b1b250eb66e03524a`
BLAKE2b-256	`82109997a91806a80426965fd7f8886d27f1510ec81dda6ddc3720b6b3690a59`

Provenance

The following attestation bundles were made for smartmemo-0.2.0-py3-none-any.whl:

Publisher: publish-pypi.yml on awesome-pro/smartmemo

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: smartmemo-0.2.0-py3-none-any.whl
- Subject digest: c60107c51e5bd02e851932960f5e8d328a1b239cc1c28837e707ce337f6d1180
- Sigstore transparency entry: 1581902494
- Sigstore integration time: May 20, 2026
Source repository:
- Permalink: awesome-pro/smartmemo@749b1812c27a0f0ffd8a51c34916f669cb492176
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/awesome-pro
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@749b1812c27a0f0ffd8a51c34916f669cb492176
- Trigger Event: push
