Tamper-evident audit trails for AI systems
Project description
Assay
Tamper-evident audit trails for AI systems.
We scanned 30 popular AI projects and found 202 high-confidence LLM call sites. Zero had tamper-evident audit trails. Full results.
Assay adds independently verifiable execution evidence to AI systems: cryptographically signed receipt bundles that a third party can verify offline without trusting your server logs. Two lines of code. Four exit codes.
pip install assay-ai && assay quickstart # requires Python 3.9+
Boundary: Assay proves tamper-evident internal consistency and completeness relative to scanned call sites. It does not prevent a fully compromised machine from fabricating a consistent story. That's what trust tiers are for.
Not this: Assay is not a logging framework, an observability dashboard, or a monitoring tool. It produces signed evidence bundles that a third party can verify offline. If you need Datadog, this isn't it.
See It -- Then Understand It
Try it now (no API key needed -- demos use synthetic data; with real calls, Assay instruments OpenAI, Anthropic, Gemini, LiteLLM, LangChain, and local models):
pip install assay-ai
assay demo-challenge # tamper detection: one valid pack, one with a single byte changed
Two packs, one byte changed ("gpt-4" -> "gpt-5" in the receipts). Here's what happens (pack IDs and timestamps will differ on your machine):
$ assay verify-pack challenge_pack/good/
VERIFICATION PASSED
Pack ID: pack_20260222_ca2bb665
Integrity: PASS
Claims: PASS
Receipts: 3
Signature: Ed25519 valid
Exit code: 0
$ assay verify-pack challenge_pack/tampered/
VERIFICATION FAILED
Pack ID: pack_20260222_ca2bb665
Integrity: FAIL
Error: Hash mismatch for receipt_pack.jsonl
Exit code: 2
One byte changed. Verification fails. No server access needed. No trust required. Just math.
Now try the policy violation demo:
assay demo-incident # two-act scenario: honest PASS vs honest FAIL
Act 1: Agent uses gpt-4 with guardian check
Integrity: PASS Claims: PASS Exit code: 0
Act 2: Someone swaps model to gpt-3.5-turbo, removes guardian
Integrity: PASS Claims: FAIL Exit code: 1
Act 2 is an honest failure -- authentic evidence proving the run violated its declared standards. The evidence is real. The failure is real. Nobody can edit the history. Exit code 1.
Exit 1 is audit gold: a control failed, but the failure is detectable and retained. Auditors love that more than systems that always claim to pass.
How that works
Assay separates two questions on purpose:
- Integrity: "Were these bytes tampered with after creation?" (signatures, hashes, required files)
- Claims: "Does this evidence satisfy our declared governance checks?" (receipt types, counts, field values)
| Integrity | Claims | Exit | Meaning |
|---|---|---|---|
| PASS | PASS | 0 | Evidence checks out, behavior meets standards |
| PASS | FAIL | 1 | Honest failure: authentic evidence of a standards violation |
| FAIL | -- | 2 | Tampered evidence |
| -- | -- | 3 | Bad input (missing files, invalid arguments) |
The split is the point. Systems that can prove they failed honestly are more trustworthy than systems that always claim to pass.
With real calls: assay scan . finds your actual OpenAI / Anthropic / Gemini / LiteLLM / LangChain call sites. assay patch . instruments them. Every real LLM call emits a signed receipt. The demos above use synthetic data so you can see verification without configuring anything.
Add to Your Project
# 1. Find uninstrumented LLM calls
assay scan . --report
# 2. Patch (one line per SDK, or auto-patch all)
assay patch .
# 3. Run + build a signed evidence pack
# -c receipt_completeness runs the built-in completeness check (see `assay cards list` for all options)
# everything after -- is your normal run command
assay run -c receipt_completeness -- python my_app.py
# 4. Verify
assay verify-pack ./proof_pack_*/
assay scan . --report finds every LLM call site (OpenAI, Anthropic, Google
Gemini, LiteLLM, LangChain) and generates a self-contained HTML gap report.
assay patch inserts the two-line integration. assay run wraps your command,
collects receipts, and produces a signed 5-file evidence pack. assay verify-pack
checks integrity + claims and exits with one of the four codes above. Then run
assay explain on any pack for a plain-English summary.
Local models: Any OpenAI-compatible server (Ollama, LM Studio, vLLM,
llama.cpp) works automatically -- Assay patches the OpenAI SDK at the class
level, so OpenAI(base_url="http://localhost:11434/v1") emits receipts like
any other provider. LiteLLM users get the same coverage via the LiteLLM
integration (ollama/llama3, etc.).
Why now: EU AI Act Article 12 requires automatic logging for high-risk AI systems; Article 19 requires providers to retain automatically generated logs for at least 6 months. High-risk obligations apply from 2 Aug 2026 (Annex III) and 2 Aug 2027 (regulated products). SOC 2 CC7.2 requires monitoring of system components and analysis of anomalies as security events. "We have logs on our server" is not independently verifiable evidence. Assay produces evidence that is. See compliance citations for exact references.
CI Gate
Three commands, one lockfile:
assay run -c receipt_completeness -- python my_app.py
assay verify-pack ./proof_pack_*/ --lock assay.lock --require-claim-pass
assay diff ./baseline_pack/ ./proof_pack_*/ --gate-cost-pct 25 --gate-errors 0 --gate-strict
The lockfile catches config drift. Verify-pack catches tampering. Diff catches regressions and budget overruns. See Decision Escrow for the protocol model.
# Lock your verification contract
assay lock write --cards receipt_completeness -o assay.lock
Daily use after CI is green
Regression forensics:
assay diff ./proof_pack_*/ --against-previous --why
--against-previous auto-discovers the baseline pack.
--why traces receipt chains to explain what regressed and which call sites caused it.
Cost/latency drift (from receipts):
assay analyze --history --since 7
Shows cost, latency percentiles, error rates, and per-model breakdowns from your local trace history.
Trust Model
What Assay detects, what it doesn't, and how to strengthen guarantees.
Assay detects:
- Retroactive tampering (edit one byte, verification fails)
- Selective omission under a completeness contract
- Claiming checks that were never run
- Policy drift from a locked baseline
Assay does not prevent:
- A fully fabricated false run (attacker controls the machine)
- Dishonest receipt content (receipts are self-attested)
- Timestamp fraud without an external time anchor
Completeness is enforced relative to the call sites enumerated by the scanner and/or declared by policy. Undetected call sites are a known residual risk, reduced via multi-detector scanning and CI gating.
To strengthen guarantees:
- Transparency ledger (independent witness)
- CI-held org key + branch protection (separation of signer and committer)
- External timestamping (RFC 3161)
The cost of cheating scales with the complexity of the lie. Assay doesn't make fraud impossible -- it makes fraud expensive.
The Evidence Compiler
Assay is an evidence compiler for AI execution. If you've used a build system, you already know the mental model:
| Concept | Build System | Assay |
|---|---|---|
| Source | .c / .ts files |
Receipts (one per LLM call) |
| Artifact | Binary / bundle | Evidence pack (5 files, 1 signature) |
| Tests | Unit / integration tests | Verification (integrity + claims) |
| Lock | package-lock.json |
assay.lock |
| Gate | CI deploy check | CI evidence gate |
Commands
The core path is 6 commands:
assay quickstart # discover
assay scan / assay patch # instrument
assay run # produce evidence
assay verify-pack # verify evidence
assay diff # catch regressions
assay score # evidence readiness (0-100, A-F)
Full command reference:
Getting started
| Command | Purpose |
|---|---|
assay quickstart |
One command: demo + scan + next steps |
assay status |
One-screen operational dashboard |
assay start demo|ci|mcp |
Guided entrypoints for trying, CI setup, or MCP auditing |
assay onboard |
Guided setup: doctor -> scan -> first run plan |
assay doctor |
Preflight check: is Assay ready here? |
assay version |
Print installed version |
Instrument + produce evidence
| Command | Purpose |
|---|---|
assay scan |
Find uninstrumented LLM call sites (--report for HTML) |
assay patch |
Auto-insert SDK integration patches into your entrypoint |
assay run |
Wrap command, collect receipts, build signed evidence pack |
Verify + analyze
| Command | Purpose |
|---|---|
assay verify-pack |
Verify integrity + claims (the 4 exit codes) |
assay verify-signer |
Extract and verify signer identity from a pack manifest |
assay explain |
Plain-English summary of an evidence pack |
assay analyze |
Cost, latency, error breakdown from pack or --history |
assay diff |
Compare packs: claims, cost, latency (--against-previous, --why, --gate-*) |
assay score |
Evidence Readiness Score (0-100, A-F) with anti-gaming caps |
Workflows + CI
| Command | Purpose |
|---|---|
assay flow try|adopt|ci|mcp|audit |
Guided workflow executor (dry-run by default, --apply to execute) |
assay ci init github |
Generate a GitHub Actions workflow |
assay ci doctor |
CI-profile preflight checks |
assay audit bundle |
Create portable audit bundle (tar.gz with verify instructions) |
assay compliance report |
Generate compliance evidence report |
Pack + baseline management
| Command | Purpose |
|---|---|
assay packs list |
List local proof packs |
assay packs show |
Show pack details |
assay packs pin-baseline |
Pin a pack as the diff baseline |
assay baseline set|get |
Set or get the baseline pack for diff |
Key management
| Command | Purpose |
|---|---|
assay key generate |
Generate a new Ed25519 signing key |
assay key list |
List local signing keys and active signer |
assay key info |
Show key details (fingerprint, creation date) |
assay key set-active |
Set active signing key for future runs |
assay key rotate |
Generate a new key and switch active signer |
assay key export|import |
Export or import keys for CI or team sharing |
assay key revoke |
Revoke a signing key |
Lockfile + cards
| Command | Purpose |
|---|---|
assay lock write |
Freeze verification contract to lockfile |
assay lock check |
Validate lockfile against current card definitions |
assay lock init |
Initialize a new lockfile interactively |
assay cards list |
List built-in run cards and their claims |
assay cards show |
Show card details, claims, and parameters |
MCP + policy
| Command | Purpose |
|---|---|
assay mcp-proxy |
Transparent MCP proxy: intercept tool calls, emit receipts |
assay mcp policy init |
Generate a starter MCP policy YAML file |
assay mcp policy validate |
Validate a policy file against the schema |
assay policy impact |
Analyze policy impact on existing evidence |
Incident forensics
| Command | Purpose |
|---|---|
assay incident timeline |
Build incident timeline from receipts |
assay incident replay |
Replay an incident from receipt chain |
Demos
| Command | Purpose |
|---|---|
assay demo-incident |
Two-act scenario: passing run vs failing run |
assay demo-challenge |
CTF-style good + tampered pack pair |
assay demo-pack |
Generate demo packs (no config needed) |
Documentation
- Full Picture -- architecture, trust tiers, repo boundaries, release history
- Quickstart -- install, golden path, command reference
- For Compliance Teams -- what auditors see, evidence artifacts, framework alignment
- Compliance Citations -- exact regulatory references (EU AI Act, SOC 2, ISO 42001)
- Decision Escrow -- protocol model: agent actions don't settle until verified
- Roadmap -- phases, product boundary, execution stack
- Repo Map -- what lives where across the Assay ecosystem
- Pilot Program -- early adopter program details
Common Issues
-
"No receipts emitted" after
assay run: First, check whether your code has call sites:assay scan .-- if scan finds 0 sites, you may not be using a supported SDK yet. If scan finds sites, check: (1) Is# assay:patchedin the file? Runassay scan . --reportto see patch status per file. (2) Did you install the SDK extra (pip install assay-ai[openai])? (3) Did you use--before your command (assay run -- python app.py)? Runassay doctorfor a full diagnostic. -
LangChain projects:
assay patchauto-instruments OpenAI and Anthropic SDKs but not LangChain (which uses callbacks, not monkey-patching). For LangChain, addAssayCallbackHandler()to your chain'scallbacksparameter manually. Seesrc/assay/integrations/langchain.pyfor the handler. -
assay run python app.pygives "No command provided": You need the--separator:assay run -c receipt_completeness -- python app.py. Everything after--is passed to the subprocess. -
Quickstart blocked on large directories:
assay quickstartguards against scanning system directories (>10K Python files). Use--forceto bypass:assay quickstart --force. -
macOS:
ModuleNotFoundErrorinsideassay runbut works outside it: On macOS,python3on PATH may point to a different Python version than where assay and your SDK are installed (e.g.python3→ 3.14, but packages are in 3.11). Use a virtual environment (recommended), or specify the exact interpreter:assay run -- python3.11 app.py. Check withpython3 --versionand compare to the Python where you ranpip install.
Get Involved
- Try it:
pip install assay-ai && assay quickstart - Questions / feedback: GitHub Discussions
- Bug reports: Issues
- Want this in your stack in 2 weeks? Pilot program -- we instrument your AI workflows, set up CI gates, and hand you a working evidence pipeline. Open a pilot inquiry.
Related Repos
| Repo | Purpose |
|---|---|
| assay | Core CLI, SDK, conformance corpus (this repo) |
| assay-verify-action | GitHub Action for CI verification |
| assay-ledger | Public transparency ledger |
License
Apache-2.0
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file assay_ai-1.10.1.tar.gz.
File metadata
- Download URL: assay_ai-1.10.1.tar.gz
- Upload date:
- Size: 217.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
97b04e84cd34361d00f0b308e5d643923ea31a2bb07ccff789eab4b8d5bf9caf
|
|
| MD5 |
73bae680cf874451d91c90df14f9965e
|
|
| BLAKE2b-256 |
a1240e5d10427a7ae5a8245377a9a0a2f5eb47ab28c0db0526a11b5745b247bd
|
Provenance
The following attestation bundles were made for assay_ai-1.10.1.tar.gz:
Publisher:
publish.yml on Haserjian/assay
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
assay_ai-1.10.1.tar.gz -
Subject digest:
97b04e84cd34361d00f0b308e5d643923ea31a2bb07ccff789eab4b8d5bf9caf - Sigstore transparency entry: 990935461
- Sigstore integration time:
-
Permalink:
Haserjian/assay@395a7b14eed3160264e962d8e29cd939315f3325 -
Branch / Tag:
refs/tags/v1.10.1 - Owner: https://github.com/Haserjian
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@395a7b14eed3160264e962d8e29cd939315f3325 -
Trigger Event:
release
-
Statement type:
File details
Details for the file assay_ai-1.10.1-py3-none-any.whl.
File metadata
- Download URL: assay_ai-1.10.1-py3-none-any.whl
- Upload date:
- Size: 240.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
cbc3093019ffb4cc55eb0ecfe74a85656a7b5324aaac6933d000db7b687a7dda
|
|
| MD5 |
c54cb532312e57002f11fb0bd184e58f
|
|
| BLAKE2b-256 |
7181c9fe18b17c0f7fbd4136e4fd659da0aa9f0f1ab61bea017ba2587743c777
|
Provenance
The following attestation bundles were made for assay_ai-1.10.1-py3-none-any.whl:
Publisher:
publish.yml on Haserjian/assay
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
assay_ai-1.10.1-py3-none-any.whl -
Subject digest:
cbc3093019ffb4cc55eb0ecfe74a85656a7b5324aaac6933d000db7b687a7dda - Sigstore transparency entry: 990935467
- Sigstore integration time:
-
Permalink:
Haserjian/assay@395a7b14eed3160264e962d8e29cd939315f3325 -
Branch / Tag:
refs/tags/v1.10.1 - Owner: https://github.com/Haserjian
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@395a7b14eed3160264e962d8e29cd939315f3325 -
Trigger Event:
release
-
Statement type: