Receipt-native AI safety toolkit
Assay
Verifiable evidence for AI systems. Independently verifiable, offline, without server access.
Logs record what you say happened. Assay makes the record tamper-evident, completeness-checkable, and independently verifiable -- including by someone who does not trust you. Two lines of code. Four exit codes.
pip install assay-ai && assay quickstart
Not this: Assay is not a logging framework, an observability dashboard, or a monitoring tool. It produces signed evidence bundles that a third party can verify offline. If you need Datadog, this isn't it.
See It -- Then Understand It
No API key needed. Runs on synthetic data:
assay demo-incident # two-act scenario: honest PASS vs honest FAIL
Act 1: Agent uses gpt-4 with a guardian check. Integrity PASS, claims PASS. Act 2: Someone swaps the model and drops the guardian. Integrity PASS, claims FAIL.
That second result is an honest failure -- authentic evidence proving the run violated its declared standards. Not a cover-up. Exit code 1.
How that works
Assay separates two questions on purpose:
- Integrity: "Were these bytes tampered with after creation?" (signatures, hashes, required files)
- Claims: "Does this evidence satisfy our declared governance checks?" (receipt types, counts, field values)
| Integrity | Claims | Exit | Meaning |
|---|---|---|---|
| PASS | PASS | 0 | Evidence checks out, behavior meets standards |
| PASS | FAIL | 1 | Honest failure: authentic evidence of a standards violation |
| FAIL | -- | 2 | Tampered evidence |
| -- | -- | 3 | Bad input (missing files, invalid arguments) |
The split is the point. Systems that can prove they failed honestly are more trustworthy than systems that always claim to pass.
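The table above can be read as a small decision function. The following is a minimal sketch of that mapping, not Assay's implementation: the names `VerifyExit`, `classify`, and `describe` are hypothetical and exist only for illustration. The numeric codes and their meanings are taken from the table.

```python
from enum import IntEnum


class VerifyExit(IntEnum):
    """Exit codes from the table above (names are illustrative, not Assay's)."""
    PASS = 0          # integrity PASS, claims PASS
    HONEST_FAIL = 1   # integrity PASS, claims FAIL
    TAMPERED = 2      # integrity FAIL (claims are not evaluated)
    BAD_INPUT = 3     # missing files, invalid arguments


def classify(integrity_ok: bool, claims_ok: bool) -> VerifyExit:
    # Integrity comes first: claims only mean something on authentic evidence.
    if not integrity_ok:
        return VerifyExit.TAMPERED
    return VerifyExit.PASS if claims_ok else VerifyExit.HONEST_FAIL


def describe(code: int) -> str:
    messages = {
        VerifyExit.PASS: "Evidence checks out, behavior meets standards",
        VerifyExit.HONEST_FAIL: "Honest failure: authentic evidence of a standards violation",
        VerifyExit.TAMPERED: "Tampered evidence",
        VerifyExit.BAD_INPUT: "Bad input (missing files, invalid arguments)",
    }
    return messages.get(code, f"Unexpected exit code {code}")
```

A CI script could pass the return code of `assay verify-pack` to something like `describe` to print a readable reason before failing the job.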
Add to Your Project
# 1. Find uninstrumented LLM calls
assay scan . --report
# 2. Patch (one line per SDK, or auto-patch all)
assay patch .
# 3. Run + build a signed proof pack
assay run -c receipt_completeness -- python my_app.py
# 4. Verify
assay verify-pack ./proof_pack_*/
`assay scan . --report` finds every LLM call site (OpenAI, Anthropic, LangChain)
and generates a self-contained HTML gap report. `assay patch` inserts the
two-line integration. `assay run` wraps your command, collects receipts, and
produces a signed 5-file proof pack. `assay verify-pack` checks integrity +
claims and exits with one of the four codes above. Then run `assay explain`
on any pack for a plain-English summary.
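If you drive steps 3 and 4 from a Python script rather than a shell, note that `subprocess` does not expand `./proof_pack_*/` for you. The sketch below is an assumption-laden illustration, not part of Assay: `run_command`, `verify_command`, and `run_and_verify` are hypothetical helpers, and only the CLI invocations themselves come from the steps above.

```python
import glob
import subprocess


def run_command(app_cmd, card="receipt_completeness"):
    # Everything after "--" is the wrapped command, as in step 3.
    return ["assay", "run", "-c", card, "--", *app_cmd]


def verify_command(pack_glob="./proof_pack_*/"):
    # Expand the glob ourselves; the shell would normally do this in step 4.
    return ["assay", "verify-pack", *sorted(glob.glob(pack_glob))]


def run_and_verify(app_cmd):
    """Run the app under `assay run`, then verify; return the first non-zero exit code."""
    run = subprocess.run(run_command(app_cmd))
    if run.returncode != 0:
        return run.returncode
    # Build the verify command only after the run, so the new pack is on disk.
    return subprocess.run(verify_command()).returncode


# Example (requires the assay CLI on PATH):
#   exit_code = run_and_verify(["python", "my_app.py"])
```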
Why now: EU AI Act Articles 12 and 19 require logging and traceability for high-risk AI systems. SOC 2 CC7.2 requires evidence of monitoring. "We have logs on our server" is not independently verifiable evidence. Assay produces evidence that is.
CI Gate
Three commands, three exit codes, one lockfile:
assay run -c receipt_completeness -- python my_app.py
assay verify-pack ./proof_pack_*/ --lock assay.lock --require-claim-pass
assay diff ./baseline_pack/ ./proof_pack_*/ --gate-cost-pct 25 --gate-errors 0 --gate-strict
The lockfile catches config drift. Verify-pack catches tampering. Diff catches regressions and budget overruns. See Decision Escrow for the protocol model.
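To make the diff gates concrete, here is a rough sketch of the kind of check `--gate-cost-pct 25 --gate-errors 0` suggests. The semantics are an assumption: it treats the cost gate as "fail if cost rose by more than the given percentage over the baseline" and the error gate as "fail if the new pack has more than the given number of errors". The function names are hypothetical, and how `assay diff` actually computes these values is not specified here.

```python
def cost_increase_pct(baseline_cost: float, current_cost: float) -> float:
    """Percentage change in cost relative to the baseline pack."""
    if baseline_cost <= 0:
        raise ValueError("baseline cost must be positive")
    return (current_cost - baseline_cost) / baseline_cost * 100.0


def passes_gates(
    baseline_cost: float,
    current_cost: float,
    current_errors: int,
    max_cost_pct: float = 25.0,  # assumed meaning of --gate-cost-pct 25
    max_errors: int = 0,         # assumed meaning of --gate-errors 0
) -> bool:
    # Both gates must hold for the comparison to pass.
    within_budget = cost_increase_pct(baseline_cost, current_cost) <= max_cost_pct
    within_errors = current_errors <= max_errors
    return within_budget and within_errors
```

Under these assumptions, a run that costs 1.20 against a baseline of 1.00 with no errors passes, while a run that costs 1.30, or one with a single error, does not.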
# Lock your verification contract
assay lock write --cards receipt_completeness -o assay.lock
Daily use after CI is green
Regression forensics:
assay diff ./proof_pack_*/ --against-previous --why
--against-previous auto-discovers the baseline pack.
--why traces receipt chains to explain what regressed and which call sites caused it.
Cost/latency drift (from receipts):
assay analyze --history --since 7
Shows cost, latency percentiles, error rates, and per-model breakdowns from your local trace history.
Trust Model
What Assay proves, what it doesn't, and how to strengthen guarantees.
Assay detects:
- Retroactive tampering (edit one byte, verification fails)
- Selective omission under a completeness contract
- Claiming checks that were never run
- Policy drift from a locked baseline
Assay does not prevent:
- A fully fabricated false run (attacker controls the machine)
- Dishonest receipt content (receipts are self-attested)
- Timestamp fraud without an external time anchor
To strengthen guarantees:
- Transparency ledger (independent witness)
- CI-held org key + branch protection (separation of signer and committer)
- External timestamping (RFC 3161)
The cost of cheating scales with the complexity of the lie. Assay doesn't make fraud impossible -- it makes fraud expensive.
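The "edit one byte, verification fails" property can be illustrated with a hash manifest plus an authentication tag over that manifest. This is a simplified stand-in, not Assay's pack format: it uses HMAC from the standard library in place of a real public-key signature (so, unlike a signed pack, it needs a shared secret to verify), and the file name and helper names are hypothetical.

```python
import hashlib
import hmac


def build_manifest(files: dict[str, bytes]) -> dict[str, str]:
    # One SHA-256 digest per file in the bundle.
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}


def sign_manifest(manifest: dict[str, str], key: bytes) -> str:
    # Canonical ordering so the same manifest always yields the same tag.
    payload = "\n".join(f"{n}:{d}" for n, d in sorted(manifest.items())).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(files: dict[str, bytes], manifest: dict[str, str], tag: str, key: bytes) -> bool:
    if not hmac.compare_digest(sign_manifest(manifest, key), tag):
        return False  # the manifest itself was altered
    if set(files) != set(manifest):
        return False  # a required file is missing, or an extra one appeared
    return all(hashlib.sha256(files[n]).hexdigest() == manifest[n] for n in manifest)


key = b"demo-only-secret"
files = {"receipts.jsonl": b'{"model": "gpt-4"}\n'}  # hypothetical file name
manifest = build_manifest(files)
tag = sign_manifest(manifest, key)

untouched_ok = verify(files, manifest, tag, key)   # True
tampered = {"receipts.jsonl": b'{"model": "gpt-3"}\n'}  # one byte changed
tampered_ok = verify(tampered, manifest, tag, key)  # False
```

This only shows retroactive tampering being caught. As the lists above note, it does nothing against a fabricated run or dishonest receipt content, since whoever holds the key can produce a valid tag over any content.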
Commands
| Command | Purpose |
|---|---|
| `assay quickstart` | One command: demo + scan + next steps |
| `assay demo-incident` | Two-act scenario: passing run vs failing run |
| `assay demo-challenge` | CTF-style good + tampered pack pair |
| `assay demo-pack` | Generate demo packs (no config needed) |
| `assay onboard` | Guided setup: doctor -> scan -> first run plan |
| `assay scan` | Find uninstrumented LLM call sites (`--report` for HTML) |
| `assay patch` | Auto-insert SDK integration patches into your entrypoint |
| `assay run` | Wrap command, collect receipts, build signed pack |
| `assay verify-pack` | Verify a Proof Pack (integrity + claims) |
| `assay explain` | Plain-English summary of a proof pack |
| `assay analyze` | Cost, latency, error breakdown from pack or `--history` |
| `assay diff` | Compare packs: claims, cost, latency (`--against-previous`, `--why`, `--gate-*`) |
| `assay key list` | List local signing keys and active signer |
| `assay key rotate` | Generate a new signer key and switch active signer |
| `assay key set-active` | Set active signing key for future runs |
| `assay ci init github` | Generate a GitHub Actions workflow |
| `assay lock write` | Freeze verification contract to lockfile |
| `assay lock check` | Validate lockfile against current card definitions |
| `assay cards list` | List built-in run cards and their claims |
| `assay cards show` | Show card details, claims, and parameters |
| `assay mcp-proxy` | Transparent MCP proxy: intercept tool calls, emit receipts |
| `assay doctor` | Preflight check: is Assay ready here? |
Documentation
- Quickstart -- install, golden path, command reference
- Roadmap -- phases, product boundary, execution stack
- Decision Escrow -- protocol model: agent actions don't settle until verified
- For Compliance Teams -- what auditors see, evidence artifacts, framework alignment
- Repo Map -- what lives where across the Assay ecosystem
- Pilot Program -- early adopter program details
Scan Study
We scanned 30 popular open-source AI projects for tamper-evident audit trails. Found 202 high-confidence LLM SDK call sites across 21 projects. Zero had evidence emission at any call site. Full results.
Common Issues
- "No receipts emitted" after `assay run`: First, check whether your code has call sites: `assay scan .` -- if scan finds 0 sites, you may not be using a supported SDK yet. If scan finds sites, check: (1) Is `# assay:patched` in the file? Run `assay scan . --report` to see patch status per file. (2) Did you install the SDK extra (`pip install assay-ai[openai]`)? (3) Did you use `--` before your command (`assay run -- python app.py`)? Run `assay doctor` for a full diagnostic.
- LangChain projects: `assay patch` auto-instruments OpenAI and Anthropic SDKs but not LangChain (which uses callbacks, not monkey-patching). For LangChain, add `AssayCallbackHandler()` to your chain's `callbacks` parameter manually. See `src/assay/integrations/langchain.py` for the handler.
- `assay run python app.py` gives "No command provided": You need the `--` separator: `assay run -c receipt_completeness -- python app.py`. Everything after `--` is passed to the subprocess.
- Quickstart blocked on large directories: `assay quickstart` guards against scanning system directories (>10K Python files). Use `--force` to bypass: `assay quickstart --force`.
Get Involved
- Try it: `pip install assay-ai && assay quickstart`
- Questions / feedback: GitHub Discussions
- Bug reports: Issues
- Want this in your stack in 2 weeks? Pilot program -- we instrument your AI workflows, set up CI gates, and hand you a working evidence pipeline. Open a pilot inquiry.
Related Repos
| Repo | Purpose |
|---|---|
| assay | Core CLI, SDK, conformance corpus (this repo) |
| assay-verify-action | GitHub Action for CI verification |
| assay-ledger | Public transparency ledger |
License
Apache-2.0