Skip to main content

Agent Action Protocol — cryptographic flight recorder for autonomous agents

Project description

◼ blackbox

The flight recorder for autonomous agents.


import { blackbox } from "@blackbox/runtime"

const { bb, record } = blackbox({ agent_id: "deploy-agent@1.0.0" })

await record("file_write", "update_config", input, () => fs.writeFile(...))
await record("exec",       "run_migrations", input, () => db.migrate())
await record("deploy",     "release_v2",     input, () => k8s.rollout())

const artifact = bb.end()
// → one portable file. every action. cryptographically chained. replayable.

The problem

We gave agents production access.

We gave them tools to read files, write databases, trigger deployments, call APIs.

We forgot to give them a flight recorder.

When an autonomous agent modifies a critical pipeline at 3am and something breaks — who is responsible? What did it do, exactly, and in what order? Can you prove the logs weren't altered after the fact?

Right now: no.

Every framework logs differently. Nothing is interoperable. Nothing is cryptographically verifiable. Nothing is replayable.


What blackbox is

A minimal runtime layer that sits between your agent and the world.

One constraint: an agent cannot act without producing a verifiable trace.

Every action becomes a structured event — hashed, chained to the previous event, and bundled into a single portable .blackbox artifact at the end of each run.

run_start  (seq=0, prev="genesis")       hash: 9f86d...
    ↓
file_read  (seq=1, prev=9f86d...)        hash: 3ac67...
    ↓
llm_call   (seq=2, prev=3ac67...)        hash: b14a8...
    ↓
deploy     (seq=3, prev=b14a8...)        hash: 7f3c2...
    ↓
run_end    (seq=4, prev=7f3c2...)        hash: 1d9e4...  ← root_hash

Tamper any event. The chain breaks. You know exactly where.


What blackbox is not

  • Not a logging framework
  • Not an orchestrator
  • Not a monitoring system
  • Not a compliance product

It is one thing: a deterministic, verifiable record of what an agent did.


Install

npm install @blackbox/runtime

No config. No infrastructure. No dependencies beyond Node's native crypto.


API

blackbox(options) — one-liner init

const { bb, record } = blackbox({
  agent_id: "my-agent@1.0.0",   // required
  tags: ["production"],          // optional
  onEvent: (e) => console.log(e) // optional — stream events live
})

record(type, action, input, fn) — wrap any action

const result = await record(
  "file_write",          // EventType
  "update_config",       // human-readable label
  { path, content },     // input — what the agent received
  () => fs.writeFile(path, content)  // the actual action
)

If the action throws — the error is recorded, the chain continues, the exception propagates normally.

bb.end() — get the artifact

const artifact = bb.end()
// BlackboxArtifact — serialize to .blackbox file, send to S3, attach to incident report

verify(artifact) — standalone verifier

import { verify } from "@blackbox/runtime"

const result = verify(artifact)
// { valid: true }
// { valid: false, reason: "hash mismatch at seq 3 — event was tampered", at_seq: 3 }

The verifier has zero dependency on the runtime that produced the artifact. Any language with SHA-256 can verify a .blackbox file.


Event types

type when
tool_call agent calls an external tool
tool_result result received from tool
llm_call prompt sent to LLM
llm_result LLM response received
decision agent makes a branching choice
file_read agent reads a file
file_write agent writes or modifies a file
exec agent executes a command
http_request outbound HTTP call
http_response HTTP response received
deploy agent triggers a deployment
memory_read agent reads from memory
memory_write agent writes to memory
run_start first event of a run
run_end last event — includes summary

The .blackbox artifact

One run. One file.

{
  "version": "0.1",
  "run_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "agent_id": "deploy-agent@1.0.0",
  "started_at": "2024-01-15T10:23:45.000Z",
  "ended_at": "2024-01-15T10:24:12.847Z",
  "event_count": 12,
  "root_hash": "sha256:1d9e4...",
  "events": [ ... ]
}

Share this file to prove what an agent did — or didn't do. Attach it to your incident report. Send it to your auditor. Store it in S3.


Integrity model

Each event is hashed over all its fields using SHA-256 with canonical JSON (keys sorted alphabetically, no extra whitespace — deterministic across languages and platforms).

Each event includes the hash of the previous event. This forms a chain: altering any event invalidates every subsequent hash.

hash = SHA256(canonicalJSON({
  id, run_id, seq, timestamp,
  type, action, input, output,
  state_before, state_after,
  prev_hash, agent_id, tags
}))

The root_hash in the artifact header is the hash of the final run_end event. Verify the artifact by recomputing the chain from seq=0.


Event format

Built on AAP (Agent Action Protocol) — the open standard for verifiable agent actions.

blackbox is the runtime that enforces AAP. AAP is the format that makes blackbox artifacts portable.


Why not just use structured logging?

Logs can be appended to. Logs can be deleted. Logs have no integrity guarantee.

A .blackbox artifact is tamper-evident. If someone modifies event seq=3 after the fact, the hashes for seq=4 through seq=N are all invalid. You know exactly what was changed and when the chain was broken.

This is the difference between a diary and a flight recorder.


Roadmap

  • TypeScript runtime
  • Chain integrity + verifier
  • Python runtime (wraps AAP natively)
  • CLI: blackbox verify run.blackbox
  • CLI: blackbox replay run.blackbox
  • LangChain integration
  • AutoGen integration
  • OpenAI Agents SDK integration
  • VS Code extension — visualize runs

Contributing

The spec lives in SPEC.md. The format is intentionally minimal and stable.

If you want to add a feature — ask first whether it belongs in the runtime or in a layer above it. blackbox should stay small.


License

MIT


We made agents powerful. We forgot to make them accountable.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

aap_python-0.1.0.tar.gz (47.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

aap_python-0.1.0-py3-none-any.whl (15.9 kB view details)

Uploaded Python 3

File details

Details for the file aap_python-0.1.0.tar.gz.

File metadata

  • Download URL: aap_python-0.1.0.tar.gz
  • Upload date:
  • Size: 47.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for aap_python-0.1.0.tar.gz
Algorithm Hash digest
SHA256 09e78945d98759423e7d22485c7784e158b15921a43dafb8f3e80b874be253e9
MD5 7eb04f735061f01979ae334bcf29c378
BLAKE2b-256 b3a9652aa954d8da884b3496b4ea1804681ddc9b1b7ac2d6a924c94968701857

See more details on using hashes here.

File details

Details for the file aap_python-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: aap_python-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 15.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for aap_python-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 85368fe8d76ca1a88de897df491e804539f56be0fe766899db8bd308af3c35a0
MD5 98063639e87d43528e4a6bd2d7a2b192
BLAKE2b-256 0dc2a4858d5fdf485e9375e61723da9a6e5b73f78cb230950004e871ba0d0c87

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page