# AELITIUM

Tamper-evident evidence bundles for AI outputs. Detect when LLM behavior silently changes — verifiable, offline, no server.
## The problem
You run the same prompt in production. One week later, the output is different.
The model changed — but your logs just show two JSON blobs. There's no proof of when it changed, or which call started returning different results.
AELITIUM gives you cryptographic evidence for every LLM call — request hash, response hash, tamper-evident bundle — so you can prove exactly when behavior changed, and that your records haven't been altered.
## 30-second demo

```bash
pip install aelitium
```

```bash
# Compare two captured runs of the same request:
aelitium compare ./bundle_last_week ./bundle_today
# STATUS=CHANGED rc=2
# REQUEST_HASH=SAME
# RESPONSE_HASH=DIFFERENT
# INTERPRETATION=Same request produced a different response
```
Same request. Different output. That means the change came from the model — not your code.
```bash
# Scan your codebase for unprotected LLM calls:
aelitium scan ./src
# LLM call sites detected: 4
# Missing evidence capture:
#   ⚠ openai — worker.py:42
#   ⚠ anthropic — agent.py:17
# Coverage: 2/4 (50%)
# STATUS=INCOMPLETE rc=2
```
All commands accept --json for structured output.
## How it works

```
API call (OpenAI / Anthropic)
        ↓
capture adapter   ← records request_hash + response_hash at call time
        ↓
evidence bundle   ← canonical JSON + ai_manifest.json + binding_hash
        ↓
aelitium verify-bundle ← STATUS=VALID / STATUS=INVALID
aelitium compare       ← UNCHANGED / CHANGED / NOT_COMPARABLE
```
Each bundle contains a deterministic SHA-256 hash of the payload, a manifest with timestamp and schema, and a cryptographic binding_hash linking the exact request to the exact response. Anyone with the bundle can verify it — no network required.
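The verification idea can be sketched in plain Python. This is an illustrative model, not AELITIUM's implementation: the real canonicalization rules and binding construction are defined by the Evidence Bundle Spec, and the `canonical_hash` and `binding_hash` functions below are hypothetical stand-ins.

```python
import hashlib
import json

def canonical_hash(payload: dict) -> str:
    """Illustrative canonicalization: sorted keys, compact separators, UTF-8.
    AELITIUM's actual canonical-JSON rules are defined by its spec."""
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

def binding_hash(request_hash: str, response_hash: str) -> str:
    """Hypothetical binding: a hash over the two component hashes,
    linking this exact request to this exact response."""
    return hashlib.sha256(f"{request_hash}:{response_hash}".encode()).hexdigest()

request = {"model": "gpt-4o", "messages": [{"role": "user", "content": "hi"}]}
response = {"content": "Hello!"}
rh = canonical_hash(request)
ph = canonical_hash(response)
print(binding_hash(rh, ph))  # same inputs yield the same binding on any machine
```

Because the hash is computed over a canonical serialization, key order and whitespace in the stored JSON cannot affect verification.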
## Capture adapter (OpenAI / Anthropic)

No manual JSON. The capture adapter intercepts the API call and writes the bundle automatically.

```python
from openai import OpenAI
from aelitium import capture_openai

client = OpenAI()
result = capture_openai(
    client, "gpt-4o",
    [{"role": "user", "content": "What is the capital of France?"}],
    out_dir="./evidence",
)
print(result.ai_hash_sha256)  # deterministic proof of this exact call
```

```bash
aelitium verify-bundle ./evidence
# STATUS=VALID rc=0
# AI_HASH_SHA256=...
# BINDING_HASH=...   ← cryptographic link between request and response
```
See Capture layer for Anthropic, streaming, and signing.
## Detect when the model changed

```bash
aelitium compare ./bundle_last_week ./bundle_today
# STATUS=CHANGED rc=2
# REQUEST_HASH=SAME a=3f4a8c1d... b=3f4a8c1d...
# RESPONSE_HASH=DIFFERENT a=9b2e7f1a... b=c41d8e3b...
# INTERPRETATION=Same request produced a different response
```
If REQUEST_HASH=SAME and RESPONSE_HASH=DIFFERENT, the change came from the model — not your code.
Run the full example (requires an OpenAI API key):

```bash
python examples/model_drift_detector.py
```
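The drift rule itself is simple enough to state in a few lines of Python. This is a sketch of the decision logic only; the field names on the two manifest dicts are hypothetical, not AELITIUM's actual bundle schema.

```python
def classify(a: dict, b: dict) -> str:
    """Apply the drift rule to two hypothetical manifests that expose
    request_hash / response_hash fields."""
    if a["request_hash"] != b["request_hash"]:
        return "NOT_COMPARABLE"  # different inputs: nothing to conclude
    if a["response_hash"] == b["response_hash"]:
        return "UNCHANGED"       # same request, same response
    return "CHANGED"             # same request, different response: model drift

last_week = {"request_hash": "3f4a8c1d", "response_hash": "9b2e7f1a"}
today     = {"request_hash": "3f4a8c1d", "response_hash": "c41d8e3b"}
print(classify(last_week, today))  # CHANGED
```

The key design point is that the request hash acts as a comparability check: only when the inputs are provably identical does a differing response hash say anything about the model.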
## Scan for unprotected LLM calls

Find every LLM call in your codebase that isn't wrapped in a capture adapter:

```bash
aelitium scan ./src
# LLM call sites detected: 12
# Instrumented with capture adapter: 9
#   ✓ openai — api/worker.py:14
#   ✓ openai — api/worker.py:38
# Missing evidence capture: 3
#   ⚠ openai — jobs/batch.py:22
#   ⚠ anthropic — agents/classifier.py:11
#   ⚠ litellm — utils/fallback.py:7
# Coverage: 9/12 (75%)
# STATUS=INCOMPLETE rc=2
```
Add to CI/CD to enforce evidence coverage:

```yaml
- name: Check LLM evidence coverage
  run: aelitium scan ./src
```

For CI-friendly key=value output:

```bash
aelitium scan ./src --ci
# AELITIUM_SCAN_STATUS=INCOMPLETE
# AELITIUM_SCAN_TOTAL=12
# AELITIUM_SCAN_INSTRUMENTED=9
# AELITIUM_SCAN_MISSING=3
# AELITIUM_SCAN_COVERAGE=75
```
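The key=value format is easy to consume from a pipeline script. A minimal sketch of a coverage gate, assuming the `AELITIUM_SCAN_*` lines shown above (the 90% threshold is an example, not an AELITIUM default):

```python
def parse_ci_output(text: str) -> dict:
    """Parse key=value lines (as emitted by `aelitium scan --ci`) into a dict."""
    pairs = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            pairs[key.strip()] = value.strip()
    return pairs

# Sample output copied from the scan above; in CI this would come from
# the command's stdout.
sample = """AELITIUM_SCAN_STATUS=INCOMPLETE
AELITIUM_SCAN_TOTAL=12
AELITIUM_SCAN_INSTRUMENTED=9
AELITIUM_SCAN_MISSING=3
AELITIUM_SCAN_COVERAGE=75"""

metrics = parse_ci_output(sample)
# Example gate: fail the build below 90% evidence coverage.
if int(metrics["AELITIUM_SCAN_COVERAGE"]) < 90:
    print("coverage gate failed at", metrics["AELITIUM_SCAN_COVERAGE"], "%")
```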
## Reproducibility

The same AI output always produces the same hash, on any machine:

```bash
bash scripts/verify_repro.sh
# === RESULT: PASS ===
# AI_HASH_SHA256=8b647717...
```

Validated on two independent machines (A + B) with identical hashes.
## CLI reference

All commands are invoked as `aelitium <command>`.

| Command | Description |
|---|---|
| `scan <path>` | Scan Python files for uninstrumented LLM call sites |
| `compare <bundle_a> <bundle_b>` | Compare two bundles — detect model behavior change |
| `verify-bundle <dir>` | Verify a bundle: hash + signature + binding hash |
| `pack --input <file> --out <dir>` | Generate canonical JSON + manifest |
| `verify --out <dir>` | Verify the integrity of a pack output dir |
| `validate --input <file>` | Validate against the ai_output_v1 schema |
| `canonicalize --input <file>` | Print the deterministic hash |
| `verify-receipt --receipt <file> --pubkey <file>` | Verify an Ed25519 authority receipt offline |
| `export --bundle <dir>` | Export a bundle in compliance format (EU AI Act Art. 12) |

Exit codes: 0 = success, 2 = failure. Designed for CI/CD pipelines.
## Documentation
- Why AELITIUM — problem statement, positioning, and what this is for
- Architecture — canonicalization pipeline, evidence bundle, module map
- Security model — threats addressed, guarantees, limitations
- Trust boundary — what AELITIUM proves and what it does not
- 5-minute demo — full walkthrough with expected output
- Python integration — drop-in helper + FastAPI example
- Capture layer — OpenAI adapter, auto-packing, trust gap explanation
- Engine contract — bundle schema and guarantees
- Evidence Bundle Spec — open draft standard for verifiable AI output bundles; AELITIUM is the reference implementation
## Design principles

- Deterministic — same input always produces the same hash, on any machine
- Offline-first — verification never requires network access
- Fail-closed — any verification error returns rc=2; no silent failures
- Auditable — every pack includes a manifest with schema, timestamp, and hash
- Pipeline-friendly — all output is parseable (`STATUS=`, `AI_HASH_SHA256=`, `--json`)
## Trust boundary
AELITIUM provides tamper-evidence, not truth guarantees.
What AELITIUM proves:
- the bundle contents have not changed since packing
- the canonicalized payload matches the recorded hash
- (with capture adapter) the request hash matches the exact API payload sent
What AELITIUM does not prove:
- that the model output was correct or safe
- that the system that packed the bundle was trustworthy
- that the model actually produced the output (without capture adapter)
Integrity ≠ completeness. AELITIUM proves that captured events were not altered. It does not guarantee that all events were captured. Capture completeness depends on the integration layer — SDK wrapper, proxy, or observer. If the agent controls its own logging, an observer-based capture pattern provides stronger guarantees. See TRUST_BOUNDARY.md for the full analysis.
Stronger provenance — signing authorities, hardware-backed keys — is the direction of P3.
## Compliance alignment
AELITIUM provides tamper-evident evidence bundles that support the following regulatory and audit requirements:
| Framework | Requirement | How AELITIUM helps |
|---|---|---|
| EU AI Act — Article 12 | Logging and traceability of high-risk AI system outputs | Evidence bundles provide immutable, verifiable records of AI outputs with deterministic hashes |
| SOC 2 — CC7 | System monitoring and integrity controls | Independent offline verification confirms records have not been altered after creation |
| ISO 42001 | AI management system auditability | Canonical bundles with schema versioning support third-party audits without infrastructure access |
| NIST AI RMF — MG 2.2 | Traceability of AI decisions and outputs | Each bundle contains a complete, reproducible record: payload, hash, timestamp, and optional signature |
AELITIUM does not replace logging infrastructure. It adds cryptographic integrity on top of any existing pipeline — offline, without a server, without a blockchain.
## License

Apache-2.0. See LICENSE.
## File details

Details for the file aelitium-0.2.4.tar.gz.

- Download URL: aelitium-0.2.4.tar.gz
- Size: 40.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3

| Algorithm | Hash digest |
|---|---|
| SHA256 | `ee73286dd911206296a9e7d3ec575c01b8a8ca179e28bd4ecb29687928d94c7f` |
| MD5 | `77d930ae43dbb5a8919de06adb3d89aa` |
| BLAKE2b-256 | `8b26d0d825ae600f61bc9d4c9b0ef5774c5041ea5d624686048e843664076081` |
Details for the file aelitium-0.2.4-py3-none-any.whl.

- Download URL: aelitium-0.2.4-py3-none-any.whl
- Size: 31.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3

| Algorithm | Hash digest |
|---|---|
| SHA256 | `429dcbc77a7f99fe8abf646bacfa5839236ae9c0d81743ddd27a4d760f8f44d7` |
| MD5 | `e80d003695239e8cd644a6ae7bcf5dd7` |
| BLAKE2b-256 | `f74f4ccb76d6c035facaa00ecadd4ce051445c35792b96a58301a541290dabfc` |