Test: record an agent run once, replay it forever — deterministic, offline, free. The vcrpy of the agent era.
cendor-cassette
Record an agent run once; replay it forever — deterministic, offline, and free. Unlike vcrpy
(HTTP-only), it captures the whole run: every LLM call and tool call, in order.
Agent tests that run in 0.2s with no API key.
pip install cendor-cassette
from openai import OpenAI

from cendor.core import instrument
from cendor import cassette

client = instrument(OpenAI())  # the same instrumented seam used in production

@cassette.use("triage_happy_path.json")  # record first run, replay after (auto mode)
def test_triage():
    result = my_agent.run("My card was charged twice")
    assert "refund" in result.tools_called
    assert cassette.semantic_match(result.answer, "offers a refund")
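To make the "record once, replay after" idea concrete, here is a minimal standard-library sketch of auto-mode behaviour. It is not cendor-cassette's implementation: `MiniCassette`, `request_key`, and `fake_llm` are hypothetical names invented for illustration, and the on-disk layout is an assumption.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def request_key(request: dict) -> str:
    """Stable hash of a request; sorted keys make dict ordering irrelevant."""
    payload = json.dumps(request, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class MiniCassette:
    """Hypothetical stand-in for a cassette: replay on a hit, record on a miss."""

    def __init__(self, path: Path):
        self.path = path
        # Load earlier recordings if the file is already there.
        self.entries = json.loads(path.read_text()) if path.exists() else {}

    def call(self, request: dict, live_fn):
        key = request_key(request)
        if key in self.entries:
            return self.entries[key]  # replay: the live function is never touched
        response = live_fn(request)  # record: one real call, then persist it
        self.entries[key] = response
        self.path.write_text(json.dumps(self.entries, indent=2))
        return response


live_calls = []


def fake_llm(request: dict) -> dict:
    """Stands in for a paid API call; counts how often it is really invoked."""
    live_calls.append(request)
    return {"answer": "We will refund the duplicate charge."}


request = {"prompt": "My card was charged twice"}
with tempfile.TemporaryDirectory() as tmp:
    cassette_path = Path(tmp) / "triage_happy_path.json"
    first = MiniCassette(cassette_path).call(request, fake_llm)   # records
    second = MiniCassette(cassette_path).call(request, fake_llm)  # replays from disk

print(first == second, len(live_calls))
```

The second call reads its answer from the JSON file, so the live function runs only once. That is why a replayed test needs neither a network connection nor an API key.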
Highlights
- Whole-run capture: every LLM and tool call, in order (not just HTTP, like vcrpy).
- Four modes: auto (record, then replay) · record · replay (fail on an unrecorded call) · rerecord (run live and report drift() without overwriting the committed cassette).
- Decorator or context manager: @cassette.use("run.json") / with cassette.using(...) (handy in pytest fixtures).
- Meaning-based assertions: semantic_match(actual, expected) (offline lexical default; opt into a free offline local-embedding scorer, a BYO-provider embedder, or an LLM judge). semantic_drift() filters rerecord noise down to real regressions.
- Pluggable matching + redaction: a normalizer ignores volatile fields; secrets/PII are redacted on write, but matching hashes the un-redacted request, so redaction never collapses two distinct calls (redact=True|False|callable).
- Parallel-safe: recording is scoped to the active using()/use() context (a ContextVar), so concurrent blocks never capture each other's calls; cassettes are written atomically. Under pytest-xdist, give each worker its own cassette path (e.g. suffix with PYTEST_XDIST_WORKER) so workers don't race on one file.
- Faithful replay: dict-response providers (Ollama/Bedrock) replay as dicts and SDK-object providers as attribute objects; stream=True and stream=False calls match their own recordings (cassette format v2; committed v1 cassettes still replay).
- promote() turns a production JSONL trace into a replayable regression test (LLM and tool calls).
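The matching and redaction point is easier to see in code. The sketch below is an illustration only, not the library's code: the volatile field names, the `sk-...` secret pattern, and every function name are assumptions made up for the example. It shows why hashing the un-redacted request keeps two calls distinct even when their redacted bodies look identical on disk.

```python
import hashlib
import json
import re

VOLATILE_FIELDS = {"request_id", "timestamp"}  # hypothetical volatile keys
SECRET_PATTERN = re.compile(r"sk-[A-Za-z0-9]+")  # hypothetical secret shape


def normalize(request: dict) -> dict:
    """Drop fields that change on every run so they cannot break matching."""
    return {k: v for k, v in request.items() if k not in VOLATILE_FIELDS}


def match_key(request: dict) -> str:
    """Hash the normalized, un-redacted request."""
    payload = json.dumps(normalize(request), sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def redact(request: dict) -> dict:
    """Mask secrets in string values before anything is written to disk."""
    return {
        k: SECRET_PATTERN.sub("[REDACTED]", v) if isinstance(v, str) else v
        for k, v in request.items()
    }


def to_entry(request: dict) -> dict:
    """What gets stored: the key from the raw request, the body redacted."""
    return {"key": match_key(request), "request": redact(normalize(request))}


a = to_entry({"prompt": "use key sk-alpha111", "request_id": "r-1"})
b = to_entry({"prompt": "use key sk-beta222", "request_id": "r-2"})
c = to_entry({"prompt": "use key sk-alpha111", "request_id": "r-3"})

print(a["request"] == b["request"])  # same text on disk after redaction
print(a["key"] != b["key"])          # still two distinct calls when matching
print(a["key"] == c["key"])          # volatile request_id does not affect the match
```

If the key were computed after redaction, entries `a` and `b` would collide and replay could return the wrong response.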
Semantic matching (opt-in)
semantic_match defaults to lexical_score — offline, deterministic, zero-dependency. For
meaning-aware (negation-sensitive) checks, pass a scorer into the existing hook. cassette binds no
model and adds no dependency unless you ask for one. Four tiers, hermetic-and-free → meaning-aware-but-costly:
- Lexical (default): lexical_score. Hermetic, deterministic, free, zero-dep.
- Local embeddings (recommended): local_embedding_scorer(), free/offline/deterministic via model2vec static embeddings (numpy-only, no torch, ~8–30 MB). Behind pip install 'cendor-cassette[embeddings]'.
- BYO provider embeddings: embedding_scorer(embed_fn) wraps any provider (OpenAI text-embedding-3-small/large, Google gemini-embedding, Cohere embed-v3; Anthropic has no embeddings API → use Voyage). Non-hermetic: a cloud embedder calls the network at score time. openai_embedding_scorer(client, model="text-embedding-3-small") is a thin convenience over an already-built OpenAI-shaped client.
- LLM-judge: a scorer that calls your own instrumented client (a documented recipe, never a shipped dependency). Non-hermetic, non-deterministic, costs money.
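The gap between the lexical tier and the embedding tiers can be sketched without any model. The code below assumes a scorer is simply a callable taking `(actual, expected)` and returning a float; that shape, the names `token_overlap` and `cosine_scorer`, and the hand-written 2-d vectors are all illustrative assumptions, not the library's `lexical_score` or `embedding_scorer`. Real embedding models produce different numbers.

```python
import math
import re


def token_overlap(actual: str, expected: str) -> float:
    """Jaccard overlap of lowercase word sets: a lexical stand-in scorer."""
    a = set(re.findall(r"[a-z']+", actual.lower()))
    b = set(re.findall(r"[a-z']+", expected.lower()))
    return len(a & b) / len(a | b) if a | b else 1.0


def cosine_scorer(embed_fn):
    """Wrap any text -> vector function into a scorer(actual, expected)."""
    def score(actual: str, expected: str) -> float:
        u, v = embed_fn(actual), embed_fn(expected)
        dot = sum(x * y for x, y in zip(u, v))
        norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
        return dot / norm if norm else 0.0
    return score


# Hand-written 2-d vectors standing in for a real embedding model.
TOY_VECTORS = {
    "offers a refund": [1.0, 0.0],
    "we will refund you": [0.9, 0.1],
    "we will not offer a refund": [-0.8, 0.2],
}
embedding_score = cosine_scorer(TOY_VECTORS.__getitem__)

expected = "offers a refund"
paraphrase = "we will refund you"
negated = "we will not offer a refund"

# (paraphrase score, negated score) under each scorer
lexical = (token_overlap(paraphrase, expected), token_overlap(negated, expected))
embedded = (embedding_score(paraphrase, expected), embedding_score(negated, expected))
print(lexical, embedded)
```

Word overlap rates the negated sentence above the paraphrase because it shares more tokens with the expectation. A vector-based scorer can separate the two, at the cost of depending on a model.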
from cendor import cassette
score = cassette.local_embedding_scorer() # free, offline, deterministic
assert cassette.semantic_match(result.answer, "offers a refund", scorer=score)
assert not cassette.semantic_match("we will not offer a refund", "offers a refund", scorer=score)
drift() compares byte-for-byte, so at temperature > 0 it flags every run. semantic_drift(threshold=0.8, scorer=None) re-scores the recorded and live text of each divergence and keeps only those that score below the
threshold (real regressions, each with its score), so cosmetic rewording is ignored. If you want
byte-stable drift() instead, record and replay at temperature=0.
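As a rough sketch of the filtering step described above: collect every byte-level divergence, score each recorded/live pair, and keep only the pairs that fall below the threshold. Everything here is an assumption for illustration, including the `Divergence` type, the function names, and the use of `difflib.SequenceMatcher` as the scorer. That scorer measures character similarity, not meaning, so it would miss a one-word negation; it only stands in for a real semantic scorer.

```python
from difflib import SequenceMatcher
from typing import NamedTuple


class Divergence(NamedTuple):
    recorded: str
    live: str


def char_similarity(a: str, b: str) -> float:
    """Stand-in scorer in [0, 1]; character-based, not meaning-aware."""
    return SequenceMatcher(None, a, b).ratio()


def byte_exact_drift(pairs):
    """Every pair whose text differs at all."""
    return [Divergence(rec, live) for rec, live in pairs if rec != live]


def filter_semantic(divergences, threshold=0.8, scorer=char_similarity):
    """Keep only divergences scoring below the threshold, with their score."""
    scored = [(d, scorer(d.recorded, d.live)) for d in divergences]
    return [(d, s) for d, s in scored if s < threshold]


pairs = [
    ("We will refund the duplicate charge.", "We'll refund the duplicate charge."),
    ("We will refund the duplicate charge.", "Please contact your bank."),
    ("Ticket closed.", "Ticket closed."),
]
all_drift = byte_exact_drift(pairs)       # the reworded pair and the changed answer
regressions = filter_semantic(all_drift)  # only the changed answer survives

for divergence, score in regressions:
    print(f"{score:.2f}  {divergence.recorded!r} -> {divergence.live!r}")
```

The contraction "We'll" counts as byte-level drift but scores well above 0.8, so it is dropped. The changed answer stays, with its score attached for the test report.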
Wrap-around, test-time only — records via core's bus, replays via a core interceptor; no second patch, no network.
See docs/cassette.md · CHANGELOG. Part of the Cendor stack — github.com/cendorhq/Cendor. Powered by PowerAI Labs. Apache-2.0; provided "as is", without warranty — use at your own risk (LICENSE §7–8).