An orchestration engine for AI agents that records every step in a ledger you can verify
Forum
Every few months there's a new framework for orchestrating AI agents. You wire one up, hand it a task, and it works. Then you try to run it for real, and you hit the question that actually matters: what happened on that run, and can you prove it? Usually all you've got is a pile of model output and a log you're supposed to trust.
Forum starts from that question. It's an orchestration engine for fleets of agents, and the idea underneath it is simple. The record of what happened isn't a side effect of the work. It is the work. Every routing decision, every task, every result goes into a ledger you can verify, replay, and trace. Think of how a bank reconciles its books instead of trusting the teller's memory.
Here's why it's built this way. A language model has no memory of its own. Each call starts from nothing. If you want to build something dependable on top of that, you have to give a forgetful mind two things it can't supply for itself: a record that outlives the conversation, and a way to check that record instead of trusting it. You also need reach, the ability to act across a lot of agents at once. That's the real project. The small zero-dependency pieces in this repo aren't the goal. They're the bricks.
Everything here is built and runs: the foundation (the ledger, the router, the
planner); the runtime that executes a plan across agents and witnesses every step;
real executors (a task can run any command, including a model CLI, or call a
model over the API); the control loop that turns a plain request into a plan and a
single verified answer; a durable ledger that survives a restart; an always-on daemon
over HTTP and MCP; and a forum command to drive it all. Every routing decision,
plan, task, result, and verdict goes into a ledger you can verify, replay, and trace.
The examples below show it, and the small zero-dependency pieces are still the bricks.
Watch it work
git clone https://github.com/HarperZ9/forum
cd forum
python examples/demo.py # no install, nothing to download
The demo routes a few requests, plans a small dependency graph, records every step, and then does the interesting part. It quietly corrupts a stored result and checks whether the ledger notices.
1. Routing (deterministic Tier-0; decides a lane or escalates)
'build the database schema and the auth endpoint' -> backend
'build the react component and css for the page' -> frontend
'write the readme docs and the guide' -> docs
'summon a unicorn' -> escalate -> needs an LLM classifier (confidence 0.00)
2. Planning (DAG -> parallel waves, capped by policy max_parallel=2)
wave 0: ['T1']
wave 1: ['T2']
wave 2: ['T3', 'T4']
4. Accountability: verify, tamper-detect, replay
verify() (chain) : True
verify(deep=True) : True
causal chain of last : request -> plan -> task -> result
...now tamper with a stored payload body (seq 2)
verify() (chain only) : True <- chain hashes still link
verify(deep=True) : False <- body tamper caught
Look at those last two lines. The chain of hashes still links, so a quick check passes. But the contents of one record no longer match what was promised, and the deeper check says so. You don't have to trust the record. You can check it.
To see the engine run a whole plan instead of just the ledger, there's a second example:
python examples/run.py
It routes a request, runs a three-step plan across agents (with a stub standing in for a real model), and verifies the entire run from the ledger at the end.
From the command line
Install it with pip install forum-engine (pure standard library, so nothing else gets pulled in), and you get a forum command:
forum route "build the auth endpoint and the database schema" # which lane, no model needed
forum submit "ship a login API" --cmd "ollama run llama3" # plan, run, answer with a local model, no account
forum serve --chat-url http://localhost:11434/v1/chat/completions --model llama3 # the HTTP daemon
forum mcp --cmd "ollama run llama3" # the MCP stdio server
forum ledger verify # check the record
forum ledger show --limit 20 # the last 20 entries
submit, serve, and mcp reach a model, and Forum is model-agnostic about which.
--cmd "<any command>" runs any model (a local CLI needs no account), --chat-url
talks to any OpenAI-compatible server (local or cloud), and --api is one specific
provider (Anthropic). Routing and the ledger commands need no model at all. See
RUNNING.md.
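To make "no model needed" concrete, here is a minimal sketch of keyword routing with an escalation fallback. It is an illustration, not Forum's router: the lane keywords, the `route` function, the confidence formula, and the 0.6 threshold are all assumptions made up for this example.

```python
import re

# Hypothetical lanes and keywords, loosely modelled on the demo output.
LANES = {
    "backend": {"database", "schema", "auth", "endpoint", "api"},
    "frontend": {"react", "component", "css", "page"},
    "docs": {"readme", "docs", "guide"},
}


def route(request: str, threshold: float = 0.6):
    """Return (lane, confidence), or ("escalate", confidence) when keywords can't decide."""
    words = set(re.findall(r"[a-z0-9]+", request.lower()))
    hits = {lane: len(words & keywords) for lane, keywords in LANES.items()}
    total = sum(hits.values())
    if total == 0:
        # No keyword matched at all: hand the request to a model classifier.
        return ("escalate", 0.0)
    best = max(hits, key=hits.get)
    confidence = hits[best] / total  # share of matched keywords owned by the best lane
    if confidence < threshold:
        return ("escalate", confidence)
    return (best, confidence)


for text in ("build the database schema and the auth endpoint",
             "write the readme docs and the guide",
             "summon a unicorn"):
    print(repr(text), "->", route(text))
```

The point of the design is that the cheap, deterministic path answers first, and a model is only consulted when the keywords match nothing or split between lanes.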
How the ledger works
A log tells you what a program says it did. A ledger lets you prove it. Two old ideas do most of the work.
The first is a hash chain. Every entry carries a fingerprint of the one before it.
Edit a past entry, drop one, or shuffle the order, and the fingerprints stop lining
up. verify() walks the chain and tells you where.
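A hash chain is small enough to sketch in a few lines. What follows is an illustration under assumptions, not Forum's ledger: the entry layout, the `append` and `verify_chain` helpers, and the all-zero genesis value are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(seq: int, prev: str, payload: dict) -> str:
    """Fingerprint an entry together with the hash of the entry before it."""
    blob = json.dumps({"seq": seq, "prev": prev, "payload": payload},
                      sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> dict:
    prev = chain[-1]["hash"] if chain else GENESIS
    seq = len(chain)
    entry = {"seq": seq, "prev": prev, "payload": payload,
             "hash": entry_hash(seq, prev, payload)}
    chain.append(entry)
    return entry


def verify_chain(chain: list):
    """Return (True, None) if every link holds, else (False, index of first bad entry)."""
    prev = GENESIS
    for i, entry in enumerate(chain):
        expected = entry_hash(entry["seq"], entry["prev"], entry["payload"])
        if entry["seq"] != i or entry["prev"] != prev or entry["hash"] != expected:
            return False, i
        prev = entry["hash"]
    return True, None


chain = []
for step in ("request", "plan", "task", "result"):
    append(chain, {"kind": step})

print(verify_chain(chain))                  # intact chain
print(verify_chain(chain[:1] + chain[2:]))  # one entry dropped
```

Because each hash covers the previous hash, an edit, a drop, or a reorder breaks the walk at the first entry that no longer lines up.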
The second is content addressing. The bulky parts, the prompts and the outputs, are
stored under a fingerprint of their own bytes rather than inline. That keeps the chain
small, and it has a useful side effect: you can redact a sensitive body down to its
fingerprint and the chain still checks out. When the bodies are there,
verify(deep=True) re-hashes each one to make sure it still matches. That's what
catches the swapped result in the demo.
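Content addressing can be sketched the same way. The `BodyStore` class and its method names below are hypothetical and only illustrate the idea: entries hold a fingerprint, bodies live elsewhere, and a deep check re-hashes whatever bodies are still present.

```python
import hashlib


def address(body: bytes) -> str:
    """The address of a body is the SHA-256 of its own bytes."""
    return hashlib.sha256(body).hexdigest()


class BodyStore:
    """Hypothetical store that keeps bulky bodies outside the chain."""

    def __init__(self):
        self._bodies = {}

    def put(self, body: bytes) -> str:
        ref = address(body)
        self._bodies[ref] = body
        return ref

    def redact(self, ref: str) -> None:
        # Drop the bytes but keep the fingerprint that the entries refer to.
        self._bodies.pop(ref, None)

    def deep_check(self, refs) -> list:
        """Return the refs whose stored body no longer hashes to its address.

        Redacted bodies are skipped: there is nothing left to re-hash.
        """
        return [r for r in refs
                if r in self._bodies and address(self._bodies[r]) != r]


store = BodyStore()
prompt_ref = store.put(b"ship a login API")
result_ref = store.put(b"def login(): ...")

# Entries carry only the fingerprints, so the chain stays small.
entries = [{"kind": "task", "body": prompt_ref},
           {"kind": "result", "body": result_ref}]
refs = [e["body"] for e in entries]

print(store.deep_check(refs))  # nothing mismatched yet
```

This also shows why a chain-only check can pass after a body is swapped: the chain hashes cover the fingerprint in the entry, not the bytes stored under it, so only the re-hash notices.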
Everything else falls out of those two. replay(until=...) rebuilds the exact state
at any past point, which works because the core is pure and entries never change.
causal_chain(seq) follows the parent links to answer the question every postmortem
comes back to: why did this happen? And checkpoint() folds the whole history into
one Merkle root. The leaves and the internal nodes are tagged differently, so a leaf
can't be passed off as an internal node, and odd nodes get carried up rather than
duplicated, so two different histories can't fold to the same root. That second
point is the duplicate-node flaw (CVE-2012-2459) that naive Merkle code runs into.
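Here is a minimal sketch of a Merkle root with tagged leaves and carried-up odd nodes. It is not Forum's checkpoint() code. The 0x00 and 0x01 prefixes follow the convention used in RFC 6962, and the function names and the empty-list behavior are assumptions for this example.

```python
import hashlib

LEAF_TAG = b"\x00"  # prefix for leaf hashes
NODE_TAG = b"\x01"  # prefix for internal-node hashes


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(LEAF_TAG + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(NODE_TAG + left + right).digest()


def merkle_root(items: list) -> bytes:
    """Fold a list of byte strings into one root hash."""
    if not items:
        return hashlib.sha256(b"").digest()  # assumed convention for an empty history
    level = [leaf_hash(item) for item in items]
    while len(level) > 1:
        nxt = [node_hash(level[i], level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            # Odd node out: carry it up unchanged instead of pairing it with itself.
            nxt.append(level[-1])
        level = nxt
    return level[0]


a, b, c = b"entry-a", b"entry-b", b"entry-c"
print(merkle_root([a, b, c]).hex())
```

With duplication, the lists [a, b, c] and [a, b, c, c] would fold to the same root. With carry-up they do not, and the tags keep a pair of child hashes from being replayed as a single leaf.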
None of this is worth much if the record dies with the process. By default the
ledger lives in memory, which is right for a test or a single run. Point it at a
FileStorage instead and every entry is appended to a file and fsynced before the
next one, so the ledger survives a restart and still verifies, replays, and
checkpoints exactly. If a crash cuts the final write short, that half-written line
is dropped on reload and the rest of the record stands. Tampering gets no such
leniency: a reordered file still loads, but verify() still says no.
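The durability rules above fit in a short sketch. `JsonlStore` is a hypothetical stand-in, not Forum's FileStorage. It assumes one JSON object per line, an fsync after every append, and that only an unterminated final line counts as a torn write.

```python
import json
import os
import tempfile


class JsonlStore:
    """Hypothetical append-only JSONL store, for illustration only."""

    def __init__(self, path: str):
        self.path = path

    def append(self, entry: dict) -> None:
        # One entry per line, flushed and fsynced before the call returns,
        # so an acknowledged entry is on disk before the next one starts.
        line = json.dumps(entry, sort_keys=True).encode("utf-8") + b"\n"
        with open(self.path, "ab") as f:
            f.write(line)
            f.flush()
            os.fsync(f.fileno())

    def load(self) -> list:
        if not os.path.exists(self.path):
            return []
        with open(self.path, "rb") as f:
            pieces = f.read().split(b"\n")
        # A cleanly written file ends in a newline, so the last piece is empty.
        # Anything else in that slot is a half-written line, and it is dropped.
        # A damaged line anywhere earlier is not forgiven: json.loads raises.
        return [json.loads(piece) for piece in pieces[:-1]]


path = os.path.join(tempfile.mkdtemp(), "ledger.jsonl")
store = JsonlStore(path)
store.append({"seq": 0, "kind": "request"})
store.append({"seq": 1, "kind": "plan"})

# Simulate a crash that cut the third write short.
with open(path, "ab") as f:
    f.write(b'{"seq": 2, "ki')

recovered = store.load()
print(recovered)
```

A real store would also truncate the torn tail before appending again, which this sketch skips. Order is not checked here either. That job belongs to the hash chain, which is why a reordered file can load and still fail verification.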
What's here
- forum.ledger: the record. Hash chain, content-addressed bodies, verify / verify(deep=True), replay, causal_chain, Merkle checkpoint.
- forum.storage: where the record lives. An in-memory store for tests and short runs, and a durable FileStorage (append-only JSONL) so a ledger survives a restart and stays verifiable.
- forum.routing: a router that reads a request, picks a lane, and only falls back to a model when the keywords genuinely can't decide.
- forum.plan and forum.dispatch: a task graph compiled into parallel waves (cycles and missing dependencies caught up front), with typed edges. A data edge feeds its upstream's witnessed output into the downstream task so it builds on real work; an order edge only sequences. Every edge and every data hand-off is witnessed.
- forum.roster: the cast of specialists, written as plain data in a TOML file and validated on load. Ships with a built-in default roster of 24 plain capability lanes (load_default()), so a fresh install has a real roster out of the box.
- forum.policy: the rules of the room. Which work can run, and how much at once.
- forum.executor / forum.chat_executor / forum.api_executor: how work actually runs, model-agnostic. A stub for tests, a SubprocessExecutor that runs any command (a local model CLI needs no account), a ChatExecutor for any OpenAI-compatible server (local or cloud), and an ApiExecutor for the Anthropic API. A failing task is witnessed, not fatal; each result records which model produced it, and a failed task can escalate up a ladder of stronger executors, witnessed.
- forum.control and Orchestrator.submit: the control loop. A Coordinator turns a plain request into a plan, a Classifier picks an agent when keywords can't, a Validator judges each result, and a Synthesizer writes one answer. Every step is witnessed.
- forum.context and forum.budget: the run contract. A ContextProvider seam so a run plans on organized context from a brain (the index flagship), witnessed as the exact context that shaped it; and a RunBudget that bounds a run and witnesses where it stopped.
- forum.verify: the verification seam. A VerifierProvider lets an external verifier (a peer flagship, a proof-checker, a test runner) check the answer Forum produced, and the verdict is witnessed. The peer of the context seam: context flows in before the run, verification comes back after it. The default abstains, so Forum stands alone.
- forum.daemon / forum.http_surface: an always-on HTTP service (stdlib asyncio, no framework) over one long-lived, durable ledger. Submit a request, read a witnessed answer, and verify or replay the record over HTTP.
- forum.mcp_surface: the same tools over MCP (JSON-RPC on stdio), the lone optional edge. It is a thin adapter over the HTTP surface, so the two can never drift.
- forum.intent and the intent-judge: did the run answer the request? After synthesis, a deterministic coverage of the request's vocabulary by the answer is witnessed (a lexical floor that flags drift, never blocks). When it flags and you opt in (IntentJudge, or forum submit --judge-intent), a model resolves whether the answer truly drifted or just paraphrased, witnessed as its own entry and bounded by the budget. Cheap floor first, the model only when the floor earns it.
- forum.report: reading the record. summarize(ledger) aggregates a witnessed run into counts, model calls, the checkpoint, and the verify result, reading only what was witnessed; compare(a, b) (and forum bench A B) is the delta between two runs, so you can prove a change helped instead of asserting it.
Pure standard library. No third-party runtime dependencies. The tests run the primitives directly, tamper detection and the Merkle property included.
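Compiling a task graph into capped parallel waves, as forum.plan is described as doing, can be sketched like this. The `plan_waves` function is hypothetical. The T1 to T4 dependency shape is also an assumption, chosen so the result matches the waves printed by the demo.

```python
def plan_waves(deps: dict, max_parallel: int = 2) -> list:
    """Compile {task: set of prerequisite tasks} into ordered waves.

    Every task in a wave has all of its prerequisites in earlier waves,
    and no wave holds more than max_parallel tasks.
    """
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    known = set(deps)
    for task, needs in deps.items():
        missing = set(needs) - known
        if missing:
            raise ValueError(f"{task} depends on unknown tasks: {sorted(missing)}")

    remaining = {task: set(needs) for task, needs in deps.items()}
    done, waves = set(), []
    while remaining:
        ready = sorted(t for t, needs in remaining.items() if needs <= done)
        if not ready:
            # Tasks are left but none can start, so they must form a cycle.
            raise ValueError(f"cycle among tasks: {sorted(remaining)}")
        wave = ready[:max_parallel]  # the policy cap on concurrency
        waves.append(wave)
        done.update(wave)
        for task in wave:
            del remaining[task]
    return waves


# Assumed graph: T2 needs T1, and T3 and T4 both need T2.
graph = {"T1": set(), "T2": {"T1"}, "T3": {"T2"}, "T4": {"T2"}}
for number, wave in enumerate(plan_waves(graph, max_parallel=2)):
    print(f"wave {number}: {wave}")
```

Missing dependencies and cycles are rejected before anything runs, which is the "caught up front" behavior the planner entry describes.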
Roadmap
- Done, the foundation. Ledger, router, roster, planner, policy. Tested and runnable.
- Done, the runtime. An asyncio dispatcher that runs a plan's waves with bounded concurrency, a mailbox actor and a restart supervisor, and an Orchestrator that ties routing, planning, and witnessed dispatch into one call. The engine runs end to end against a stub executor today.
- Done, real executors. A SubprocessExecutor that runs any command (so any CLI, including a model CLI), and an ApiExecutor that drives a model over the Anthropic API, both behind the one executor seam. A failing task is witnessed, not fatal.
- Done, the control loop. A Coordinator that turns a plain request into a plan, a Classifier, a Validator that judges each result (a failed task is witnessed, not blessed), and a Synthesizer that writes one answer. Orchestrator.submit runs the whole loop, witnessed.
- Done, durable storage. A file-backed FileStorage (append-only JSONL) so a ledger outlives the process: it recovers exactly on restart, tolerates a crash-torn final write, and stays tamper-evident.
- Done, the default roster. 24 domain-neutral capability lanes (engineering, graphics, support, research) shipped in the box and loaded with roster.load_default(). Plain capability names, every lane keyword-routable.
- Done, the daemon (HTTP). A stdlib-asyncio HTTP service over one durable ledger: route, plan, submit, and verify or replay the record over HTTP. Every request witnessed into the same record.
- Done, the MCP surface. The same tools over MCP (JSON-RPC on stdio), a thin adapter over the HTTP surface so the two cannot drift. The lone optional edge.
- Done, the CLI. A forum command: route, submit, serve, mcp, and ledger verify / show / replay / get. Pick a model with --api or --cmd.
- Done, hardened and proven. Each verdict chains to the result it judged, the routing ladder reaches the Classifier on escalation (assign / submit_one), and a gated test proves the whole loop against a real model. See RUNNING.md.
- 1.0. Durable, verifiable, daemonized, installable, documented. The functional engine is complete.
- 1.1, the run contract. A ContextProvider seam (plan on a brain's organized context, witnessed) and a RunBudget that bounds a run. Research-informed.
- 1.2, witnessed escalation. Model identity in the ledger and validator-driven escalation up a ladder of stronger executors, on a verifiable signal not model confidence. Research-informed.
- 1.3, reading the record. A run summary aggregated purely from the witnessed ledger (forum ledger summary), and a ledger A/B (forum bench) so an improvement is measured from the record, not claimed.
- 1.4, did the run answer? A witnessed intent check: how much of the request the final answer covers, recorded and surfaced in the summary and A/B. A reproducible lexical floor; a grounded model intent-judge is the next rung.
- 1.5, the intent-judge. The rung above the floor: when the lexical check flags drift, an opt-in model judge resolves whether the answer truly drifted or just paraphrased, witnessed and budget-bounded. Cheap-first, like routing and escalation.
- 1.6, the DAG flows data. Typed edges: a data edge feeds its upstream's witnessed output into the downstream task, an order edge only sequences. Existing plans now flow data downstream instead of dropping it, and every edge and hand-off is witnessed.
- 1.7, the verification seam. A VerifierProvider seam, the peer of the context seam, so an external verifier checks the answer after the run, witnessed. The default abstains; Forum stands alone.
- Beyond. Phased checkpoints and resume, opt-in batched fsync, and a ledger-reading dashboard.
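The lexical floor from 1.4 is simple enough to sketch. This is a guess at the general shape, not Forum's forum.intent: the stopword list, the `coverage` and `intent_floor` names, and the 0.5 threshold are assumptions.

```python
import re

# Assumed stopword list; a real one would likely be longer.
STOPWORDS = {"a", "an", "the", "and", "or", "of", "to", "for", "in", "on", "with", "is", "it"}


def vocabulary(text: str) -> set:
    """Lowercased content words of a text, with stopwords removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def coverage(request: str, answer: str) -> float:
    """Fraction of the request's vocabulary that also appears in the answer."""
    wanted = vocabulary(request)
    if not wanted:
        return 1.0  # nothing to cover
    return len(wanted & vocabulary(answer)) / len(wanted)


def intent_floor(request: str, answer: str, threshold: float = 0.5) -> dict:
    """Deterministic check that flags possible drift and never blocks."""
    score = coverage(request, answer)
    return {"coverage": score, "flagged": score < threshold}


print(intent_floor("ship a login API", "Here is the login API, ready to ship"))
print(intent_floor("ship a login API", "Deployed a sign-in endpoint"))
```

The second call is flagged even though the answer is a fair paraphrase. A purely lexical check cannot tell paraphrase from drift, which is the gap the opt-in model judge in 1.5 is there to close.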
Docs
- ARCHITECTURE.md: the layers, the ledger, and the surfaces.
- RUNNING.md: run it against a real model, over the API or a model CLI.
- SECURITY.md: the trust model, the no-shell guarantee, and sandboxing.
- RELEASING.md: how a release is built and published.
License
Forum is fair-source: the code is open to read, run, and build on, with commercial use reserved so the project can fund its own development. Copyright stays with the author. See LICENSE for the exact terms.
File details
Details for the file forum_engine-1.7.0.tar.gz.
File metadata
- Download URL: forum_engine-1.7.0.tar.gz
- Upload date:
- Size: 72.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 228d033b12598d0cba90456a1c90d7bafdc443b0a8b5a1d08998f69ee44803a4 |
| MD5 | 546f77c9b1d44ba735d73bd6fc9e53e9 |
| BLAKE2b-256 | ee2b11d9a2506daa4c5cf423767ee524429845e5ead1c9d179e3e5628a997204 |
Provenance
The following attestation bundles were made for forum_engine-1.7.0.tar.gz:
Publisher: release.yml on HarperZ9/forum
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: forum_engine-1.7.0.tar.gz
- Subject digest: 228d033b12598d0cba90456a1c90d7bafdc443b0a8b5a1d08998f69ee44803a4
- Sigstore transparency entry: 1949719082
- Sigstore integration time:
- Permalink: HarperZ9/forum@5c2f6ab2174231cd6bc55a57188f5ecda5f9e703
- Branch / Tag: refs/tags/v1.7.0
- Owner: https://github.com/HarperZ9
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@5c2f6ab2174231cd6bc55a57188f5ecda5f9e703
- Trigger Event: push
File details
Details for the file forum_engine-1.7.0-py3-none-any.whl.
File metadata
- Download URL: forum_engine-1.7.0-py3-none-any.whl
- Upload date:
- Size: 51.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 24f3798a78c951338a311408e1b50e37613835ef03418c0caa8c3a99c165a14d |
| MD5 | 2ebf09363711eb8a13c5d89992c24972 |
| BLAKE2b-256 | 93b12506e6b218de5a8636eb9781a457bf7cfdc2c66068de65482969ec42792f |
Provenance
The following attestation bundles were made for forum_engine-1.7.0-py3-none-any.whl:
Publisher: release.yml on HarperZ9/forum
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: forum_engine-1.7.0-py3-none-any.whl
- Subject digest: 24f3798a78c951338a311408e1b50e37613835ef03418c0caa8c3a99c165a14d
- Sigstore transparency entry: 1949719180
- Sigstore integration time:
- Permalink: HarperZ9/forum@5c2f6ab2174231cd6bc55a57188f5ecda5f9e703
- Branch / Tag: refs/tags/v1.7.0
- Owner: https://github.com/HarperZ9
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@5c2f6ab2174231cd6bc55a57188f5ecda5f9e703
- Trigger Event: push