Plumbing for calibrated AI trading agents — capture, score, instrument, anchor.
trench-core
trench-core is the open-source framework that powers TrenchSignals, an autonomous AI agent that paper-trades geopolitical conflict markets in public, with every prediction Brier-scored and every loss given a public post-mortem.
This package is the plumbing: capture, score, instrument, anchor. It is deliberately not the agent itself — TrenchSignals' specific ontology, prompts, brand voice, and operating record stay private.
If you want to:
- Capture every LLM call with full input/output for later replay through a different model
- Score predictions against actual market settlements (Brier, calibration curves, threshold backtests, P&L attribution)
- Anchor predictions into a public hash chain so anyone can verify them without trusting your server
- Instrument every loop iteration with a single structured outcome line so failure modes are observable in production
- Run a multi-variant tournament of decision policies on the same intelligence pipeline
…then trench-core gives you the building blocks.
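For readers new to the scoring vocabulary above, here is a minimal standalone sketch of a Brier score and a binned calibration curve. It uses only plain Python and does not call trench-core; the function names `brier_score` and `calibration_curve` are hypothetical and chosen for illustration, not taken from the package's API.

```python
from collections import defaultdict


def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equal-length, non-empty inputs")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


def calibration_curve(forecasts, outcomes, n_bins=5):
    """Return (mean forecast, hit rate, count) for each non-empty equal-width bin."""
    bins = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        bins[idx].append((p, o))
    curve = []
    for idx in sorted(bins):
        pairs = bins[idx]
        mean_p = sum(p for p, _ in pairs) / len(pairs)
        hit_rate = sum(o for _, o in pairs) / len(pairs)
        curve.append((mean_p, hit_rate, len(pairs)))
    return curve


# Four illustrative forecasts and how the markets actually settled (1 = YES).
forecasts = [0.9, 0.8, 0.3, 0.1]
outcomes = [1, 1, 0, 0]
score = brier_score(forecasts, outcomes)
curve = calibration_curve(forecasts, outcomes)
```

A lower Brier score is better: always forecasting 0.5 scores 0.25, and a perfect forecaster scores 0. A well-calibrated agent has a hit rate close to the mean forecast in every bin.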
Status
Alpha (0.x). All eight modules are shipped (0.8.0). The API may break between minor versions until 1.0.0. Pin exactly if you depend on it.
See the changelog for what's landed.
Install
pip install trench-core # core + most modules (stdlib only)
pip install 'trench-core[sources]' # adds feedparser for RSS polling
pip install 'trench-core[all]' # everything
Six of the eight modules need only the standard library. Only sources
(RSS polling, via feedparser) declares an optional runtime dependency.
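If your own code needs to degrade gracefully when the optional extra is missing, one possible approach is to check for feedparser before touching RSS polling. This is a sketch of a general Python pattern, not trench-core's own behaviour; `has_module` and `require_feedparser` are hypothetical helper names.

```python
import importlib.util


def has_module(name):
    """Return True if the top-level module `name` is importable (without importing it)."""
    return importlib.util.find_spec(name) is not None


def require_feedparser():
    """Raise a readable error if the optional RSS dependency is not installed."""
    if not has_module("feedparser"):
        raise RuntimeError(
            "RSS polling needs feedparser: pip install 'trench-core[sources]'"
        )
```

Calling `require_feedparser()` at the start of an RSS code path gives a clear install hint instead of a bare `ImportError` deeper in the stack.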
Quickstart
A minimal end-to-end loop: capture a Claude call, log its outcome,
register the bundle's hash on a public chain, and score the prediction
once the market settles. No requests, no Anthropic SDK — the framework
stays provider-agnostic; you wire your own model caller.
from pathlib import Path
from trench_core.replay import BundleWriter, load_bundles, replay_bundle, diff_signals
from trench_core.cycle_outcomes import OutcomeLogger
from trench_core.registry import RegistryWriter, verify_chain
from trench_core.calibration import calibration_report
# 1. Capture: at the moment of generation
writer = BundleWriter(
    path="bundles.jsonl",
    system_prompt="You are an analyst...",  # auto-hashed for grouping
)
my_prompt = "Will Iran sign a deal by June 2027?"
my_raw = your_llm_call(my_prompt, model="sonnet-4-6") # caller-supplied
my_parsed = your_parser(my_raw)
writer.write(prompt=my_prompt, response_raw=my_raw,
             parsed=my_parsed, model="sonnet-4-6")
# 2. Instrument: emit one structured outcome line per loop iteration
outcomes = OutcomeLogger("/var/log/agent.log")
outcomes.emit("traded", confidence=my_parsed["confidence"], side="NO")
# 3. Pre-register: anchor today's bundles in a public hash chain
chain = RegistryWriter(
    bundle_paths={"baseline": "bundles.jsonl"},
    registry_root="registry/",
    summary_fields=("direction", "confidence"),
)
chain.update() # appends today's record
verify_chain("registry/") # raises ChainBroken if anything tampered
# 4. Score: when the market settles, run a calibration report
trades = [...] # your closed-trade dicts
evals = [...] # your per-market evaluation dicts
report = calibration_report(trades, evaluations=evals)
print(report["brier"]["mean_brier"], report["trade_summary"]["roi_pct"])
# 5. Replay: re-run a captured cycle through a different model
bundles = load_bundles("bundles.jsonl")
res = replay_bundle(
    bundle=bundles[-1],
    model="haiku-4-5",
    model_caller=your_llm_call,
    parser=your_parser,
)
print(diff_signals(res.bundle.parsed, res.candidate_parsed))
Each module is independently usable — pick the ones you need. Per-module
quickstarts live in their __init__.py docstrings (also rendered in the
generated API docs).
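The quickstart leaves `your_llm_call` and `your_parser` to the caller. To try the flow offline, stubs along the following lines may be enough. This is a sketch under assumptions: the `(prompt, model)` call shape is copied from the quickstart's own call, the `direction` and `confidence` keys mirror its `summary_fields`, and how `replay_bundle` actually invokes `model_caller` is not confirmed here.

```python
import json


def your_llm_call(prompt, model):
    """Hypothetical stand-in for a real model call: returns a canned JSON reply."""
    return json.dumps({"direction": "NO", "confidence": 0.62, "model": model})


def your_parser(raw):
    """Parse the raw reply into a dict with 'direction' and 'confidence'."""
    data = json.loads(raw)
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return {"direction": data["direction"], "confidence": confidence}


raw = your_llm_call("Will Iran sign a deal by June 2027?", model="sonnet-4-6")
parsed = your_parser(raw)
```

Swapping the stub body for a real client call keeps the rest of the loop unchanged, since the other steps only see the raw string and the parsed dict.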
Modules
| Module | What it does | Status |
|---|---|---|
| trench_core.calibration | Brier scoring, calibration curves, threshold backtests, P&L attribution | ✅ shipped (0.3.0) |
| trench_core.replay | Capture-then-replay + diff harness for LLM agents | ✅ shipped (0.4.0) |
| trench_core.cycle_outcomes | Structured "one outcome line per loop iteration" instrumentation | ✅ shipped (0.2.0) |
| trench_core.registry | Public hash-chain pre-registration of agent outputs | ✅ shipped (0.2.0) |
| trench_core.ontology | Generic typed entity graph + alias resolver, SQLite-backed | ✅ shipped (0.5.0) |
| trench_core.sources | RSS poller + USGS seismic poller (Twitter/Telegram/financial deferred) | ✅ shipped (0.6.0) |
| trench_core.markets | Public-data clients: Kalshi (read-only), Manifold (Polymarket trading deferred) | ✅ shipped (0.7.0) |
| trench_core.tournament | Multi-variant runner pattern: same intel, different policies | ✅ shipped (0.8.0) |
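To make the idea of hash-chain pre-registration concrete, the sketch below shows the general technique with `hashlib`: each record commits to the hash of the previous one, so editing any earlier record invalidates everything after it. It is an illustration only and is not the on-disk format or API of trench_core.registry; all names here (`GENESIS`, `record_hash`, `append_record`, `chain_is_valid`) are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def record_hash(prev_hash, payload):
    """SHA-256 over the previous hash plus a canonical JSON dump of the payload."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(chain, payload):
    """Append a record that commits to the hash of the record before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"prev": prev_hash, "payload": payload, "hash": record_hash(prev_hash, payload)}
    )


def chain_is_valid(chain):
    """Recompute every hash from the start; any edited record breaks the chain."""
    prev_hash = GENESIS
    for rec in chain:
        if rec["prev"] != prev_hash or rec["hash"] != record_hash(prev_hash, rec["payload"]):
            return False
        prev_hash = rec["hash"]
    return True


chain = []
append_record(chain, {"direction": "NO", "confidence": 0.62})
append_record(chain, {"direction": "YES", "confidence": 0.55})
ok_before = chain_is_valid(chain)

chain[0]["payload"]["confidence"] = 0.99  # tamper with an already-registered prediction
ok_after = chain_is_valid(chain)
```

Publishing the latest hash somewhere outside your control lets a third party run the same check without trusting your server.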
What's deliberately not in scope
- ❌ A backtesting engine (use zipline or vectorbt for historical sim)
- ❌ A strategy library (no canned signals)
- ❌ A managed/SaaS version (you run it yourself)
- ❌ Brokerage integration (the framework provides data, not execution glue)
- ❌ A multi-provider AI abstraction layer (Anthropic-first by design; wrap your own analyzer for OpenAI)
Examples
The examples/ directory has runnable demos for every
module. All of them run offline (network calls are mocked):
python examples/01_cycle_outcomes.py
python examples/02_registry.py
python examples/03_calibration.py
python examples/04_replay.py
python examples/05_ontology.py
python examples/06_sources.py
python examples/07_markets.py
python examples/08_tournament.py
Contributing
The project is alpha — issues and PRs welcome, but expect API churn
until 1.0.0. See CONTRIBUTING.md for the
development setup and the discipline that's kept the extraction
clean (audit before code, ruff + pytest pre-flight, byte-identical
proof for refactors).
License
MIT — see LICENSE.