Skip to main content

Evidence capture primitives for autonomous agent systems.

Project description

agent-evidence

把 AI Agent / service operation 转换为可验证、可审计、可复核的 evidence object。
Turn AI agent operations into auditable and verifiable evidence objects.

DOI: 10.5281/zenodo.19334061

CI Python 3.11+ Semantic Events Status

Why this matters

普通 AI workflow 通常只留下聊天记录、trace 页面或零散日志。它们能帮助开发者排查问题,但很难直接交给审查者、客户、治理团队或后续系统复核。

agent-evidence 关注的是一次 Agent / service operation 结束之后,能不能留下结构化证据:operation、policy、provenance、evidence、validation、hashes、verification result,以及可以被 validator 检查的 evidence object。

这个仓库的目标不是再做一个通用 Agent 平台,而是提供一个最小、可运行、可验证的 operation evidence 路径。

What it provides

  • Execution Evidence and Operation Accountability Profile v0.1
  • JSON Schema
  • profile-aware validator
  • valid / invalid examples
  • runnable single-path demo
  • LangChain-first evidence exporter and offline bundle verification
  • FDO-style mapping material for discussion, not a claim of official standard adoption
  • v0.3.1 DOI metadata patch preparation, with the primary citation DOI using the Zenodo concept DOI and exact release DOI documented separately

Quick Start

Install from source:

python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

Validate a minimal valid evidence profile:

agent-evidence validate-profile examples/minimal-valid-evidence.json

Check that an intentionally invalid example fails:

agent-evidence validate-profile examples/invalid-missing-required.json

Run the single-path demo:

python3 demo/run_operation_accountability_demo.py

Expected result:

  • the valid example returns JSON with "ok": true
  • the invalid example returns JSON with "ok": false and one primary error code
  • the demo writes artifacts under demo/artifacts/ and ends with one PASS summary line

Fastest LangChain proof

For the current LangChain-first path:

pip install -e ".[dev,langchain,sql]"
python integrations/langchain/export_evidence.py
agent-evidence verify-bundle --bundle-dir integrations/langchain/langchain-evidence-bundle

This runs the documented LangChain exporter and verifies the emitted bundle offline.

For a smaller callback/export recipe aimed at external readers, see LangChain minimal evidence cookbook.

Example Evidence Fields

The minimal profile binds these parts into one reviewable object:

  • statement_id — one accountable operation statement
  • operation — what operation happened, on which subject, with which inputs and outputs
  • policy — which rule or constraint set was referenced
  • provenance — which actor and references connect the operation to inputs and outputs
  • evidence — artifacts, references, and integrity digests
  • validation — method, validator, status, and linked evidence/provenance/policy
  • subject and actor — object and runtime identity used by the statement

Demo Screenshots

Validator proof:

Validator output showing valid pass and invalid schema violation

FDO Testbed registration proof:

FDO Testbed Type Registry entry for FDO_OPERATION_EVIDENCE_PROFILE_V0_1

Relation to FDO

agent-evidence is an experimental, minimal, discussion-oriented operation evidence profile. It explores how AI / Agent operation evidence can be expressed with an FDO-style object shape: identity, metadata, references, provenance, integrity, and validation.

It is not an official FDO standard. The current public claim is narrower: this repository provides a working profile, schema, examples, validator, demo, and FDO-facing mapping material for discussion.

FDO-facing reading path:

For hiring managers

This repository shows that I can:

  • turn LangChain / Agent workflow thinking into a concrete evidence boundary
  • design JSON Schema and validator logic for high-responsibility AI workflows
  • model audit trail, provenance, hashes, and verification results as deliverable artifacts
  • connect trustworthy AI governance ideas to runnable examples
  • package technical work as open-source documentation, examples, release artifacts, and CLI validation

Canonical package

The current canonical package is Execution Evidence and Operation Accountability Profile v0.1.

Core entry points:

What You Get

After one run, the primary outputs are intentionally narrow:

  • bundle — exported evidence package that can be handed off, verified, and retained outside the original runtime
  • receipt — machine-readable verification result returned by agent-evidence validate-profile, agent-evidence verify-bundle, or agent-evidence verify-export
  • summary — reviewer-facing summary output produced by the current demo and example surfaces

Why this is not just tracing

Tracing and logs help operators inspect a run. Agent Evidence packages runtime events into portable artifacts that another party can verify later, including offline.

Evidence path:

runtime events -> evidence bundle -> signed manifest -> detached anchor (when present) -> offline verify

External anchoring is out of scope for AEP v0.1 and is not enabled by default.

The toolkit currently supports two storage modes:

  • append-only local JSONL files
  • SQLAlchemy-backed SQLite/PostgreSQL databases

The current model treats each record as a semantic event envelope:

  • event.event_type is framework-neutral, such as chain.start or tool.end
  • event.context.source_event_type preserves the raw framework event name
  • hashes.previous_event_hash links to the prior event
  • hashes.chain_hash provides a cumulative chain tip for integrity checks

Integration Priorities

Current priority order:

  1. LangChain / LangGraph
  2. OpenAI-compatible runtimes
  3. Automaton sidecar export, marked experimental

The goal is one narrow evidence handoff surface, not many adapters at once.

Automaton Sidecar Exporter

The read-only Automaton sidecar/exporter reads state.db, git history, and persisted on-chain references, then emits an AEP bundle plus fdo-stub.json and erc8004-validation-stub.json.

agent-evidence export automaton \
  --state-db /path/to/state.db \
  --repo /path/to/state/repo \
  --runtime-root /path/to/automaton-checkout \
  --out ./automaton-aep-bundle

agent-evidence export automaton has been validated against a live isolated-home Automaton run and remains marked experimental while the live data contract is still settling.

CLI examples

agent-evidence record \
  --store ./data/evidence.jsonl \
  --actor planner \
  --event-type tool.call \
  --input '{"task":"summarize"}' \
  --output '{"status":"ok"}' \
  --context '{"source":"cli","component":"tool"}'

agent-evidence list --store ./data/evidence.jsonl
agent-evidence show --store ./data/evidence.jsonl --index 0
agent-evidence verify --store ./data/evidence.jsonl

SQL stores use a SQLAlchemy URL instead of a file path:

agent-evidence record \
  --store sqlite+pysqlite:///./data/evidence.db \
  --actor planner \
  --event-type tool.call \
  --context '{"source":"cli","component":"tool"}'

agent-evidence query \
  --store sqlite+pysqlite:///./data/evidence.db \
  --event-type tool.call \
  --source cli

Development

make install
make test
make lint
make hooks

For PostgreSQL support, install the extra driver dependencies:

pip install -e ".[dev,postgres]"

For a repeatable real-database validation path, use the bundled Docker-backed integration script:

make install-postgres
make test-postgres

Scope Boundaries

agent-evidence is the active code surface here for bundle, receipt, and summary.

It is not:

  • the full Digital Biosphere Architecture stack
  • the audit control plane
  • the walkthrough demo
  • the execution-integrity kernel
  • a generic agent governance platform
  • an official FDO standard

Related Surfaces

English Summary

agent-evidence turns AI agent operations into structured evidence objects that can be validated, reviewed, and retained outside the original runtime. The project focuses on a minimal operation evidence profile, JSON Schema, validator, examples, LangChain export, offline bundle verification, and FDO-facing discussion material. It is experimental and discussion-oriented, not an official FDO standard.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

agent_evidence-0.3.1.tar.gz (90.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

agent_evidence-0.3.1-py3-none-any.whl (70.3 kB view details)

Uploaded Python 3

File details

Details for the file agent_evidence-0.3.1.tar.gz.

File metadata

  • Download URL: agent_evidence-0.3.1.tar.gz
  • Upload date:
  • Size: 90.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for agent_evidence-0.3.1.tar.gz
Algorithm Hash digest
SHA256 787dcc8f0a5ea76fadff144ba2067e0ca2272ab7c2a5355640ead232dd84db37
MD5 6e6237ebc74ea9c2c38a1d14960515fc
BLAKE2b-256 bae0b3fdceee711feefc2846c5eeccc4af9ac788af871d53144379dd268f8482

See more details on using hashes here.

File details

Details for the file agent_evidence-0.3.1-py3-none-any.whl.

File metadata

  • Download URL: agent_evidence-0.3.1-py3-none-any.whl
  • Upload date:
  • Size: 70.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for agent_evidence-0.3.1-py3-none-any.whl
Algorithm Hash digest
SHA256 a2a4e7baf876aca0ab8c6bbe844627ee4496d272fb4f92d8ebf46b106bd56023
MD5 2ad30c230f2f3767a90c3c59d36f9be1
BLAKE2b-256 b8bb24db9dbd55fa55b098ca1bd02d2c9a4355197d00ec10192072fca523ce4c

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page