trench-core

Plumbing for calibrated AI trading agents. Capture, score, instrument, anchor.

trench-core is the open-source framework that powers TrenchSignals, an autonomous AI agent that paper-trades geopolitical-conflict markets in public, with every prediction Brier-scored and every loss given a public post-mortem.

This package is the plumbing: capture, score, instrument, anchor. It is deliberately not the agent itself — TrenchSignals' specific ontology, prompts, brand voice, and operating record stay private.

If you want to:

  • Capture every LLM call with full input/output for later replay through a different model
  • Score predictions against actual market settlements (Brier, calibration curves, threshold backtests, P&L attribution)
  • Anchor predictions into a public hash chain so anyone can verify them without trusting your server
  • Instrument every loop iteration with a single structured outcome line so failure modes are observable in production
  • Run a multi-variant tournament of decision policies on the same intelligence pipeline

…then trench-core gives you the building blocks.
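To make the "verify without trusting your server" idea concrete, here is a minimal sketch of a hash chain using only the standard library. It is not trench-core's registry format: the record layout, the `GENESIS` placeholder, and the helper names `record_hash`, `append_record`, and `chain_is_valid` are all assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first record


def record_hash(prev_hash: str, payload: dict) -> str:
    """Hash a record's payload together with the hash of the record before it."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{prev_hash}\n{body}".encode("utf-8")).hexdigest()


def append_record(chain: list, payload: dict) -> None:
    """Append a payload, linking it to the current tail of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev, "payload": payload, "hash": record_hash(prev, payload)})


def chain_is_valid(chain: list) -> bool:
    """Recompute every hash from the start; any edit to history breaks a link."""
    prev = GENESIS
    for rec in chain:
        if rec["prev"] != prev or rec["hash"] != record_hash(prev, rec["payload"]):
            return False
        prev = rec["hash"]
    return True


chain = []
append_record(chain, {"direction": "NO", "confidence": 0.72})
append_record(chain, {"direction": "YES", "confidence": 0.55})
print(chain_is_valid(chain))  # True

chain[0]["payload"]["confidence"] = 0.99  # rewrite a prediction after the fact
print(chain_is_valid(chain))  # False
```

Because each hash covers the previous one, publishing only the latest hash is enough for a third party to detect a later edit to any earlier record.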

Status

Alpha (0.x). All eight modules are shipped (0.8.0). API may break between minor versions until 1.0.0. Pin exactly if you depend on it.

See the changelog for what's landed.

Install

pip install trench-core              # core + most modules (stdlib only)
pip install 'trench-core[sources]'   # adds feedparser for RSS polling
pip install 'trench-core[all]'       # everything

Six of the eight modules need only the standard library. Only sources (RSS polling, via feedparser) declares an optional runtime dependency.

Quickstart

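A minimal end-to-end loop: capture a Claude call, log its outcome, register the bundle's hash on a public chain, score the prediction once the market settles, and replay the captured call through a different model. There is no requests dependency and no Anthropic SDK: the framework stays provider-agnostic, and you wire in your own model caller (your_llm_call and your_parser below are caller-supplied placeholders).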

from trench_core.replay import BundleWriter, load_bundles, replay_bundle, diff_signals
from trench_core.cycle_outcomes import OutcomeLogger
from trench_core.registry import RegistryWriter, verify_chain
from trench_core.calibration import calibration_report

# 1. Capture: at the moment of generation
writer = BundleWriter(
    path="bundles.jsonl",
    system_prompt="You are an analyst...",   # auto-hashed for grouping
)
my_prompt = "Will Iran sign a deal by June 2027?"
my_raw    = your_llm_call(my_prompt, model="sonnet-4-6")  # caller-supplied
my_parsed = your_parser(my_raw)
writer.write(prompt=my_prompt, response_raw=my_raw,
             parsed=my_parsed, model="sonnet-4-6")

# 2. Instrument: emit one structured outcome line per loop iteration
outcomes = OutcomeLogger("/var/log/agent.log")
outcomes.emit("traded", confidence=my_parsed["confidence"], side="NO")

# 3. Pre-register: anchor today's bundles in a public hash chain
chain = RegistryWriter(
    bundle_paths={"baseline": "bundles.jsonl"},
    registry_root="registry/",
    summary_fields=("direction", "confidence"),
)
chain.update()                    # appends today's record
verify_chain("registry/")         # raises ChainBroken if any record was tampered with

# 4. Score: when the market settles, run a calibration report
trades = [...]        # your closed-trade dicts
evals  = [...]        # your per-market evaluation dicts
report = calibration_report(trades, evaluations=evals)
print(report["brier"]["mean_brier"], report["trade_summary"]["roi_pct"])

# 5. Replay: re-run a captured cycle through a different model
bundles = load_bundles("bundles.jsonl")
res = replay_bundle(
    bundle=bundles[-1],
    model="haiku-4-5",
    model_caller=your_llm_call,
    parser=your_parser,
)
print(diff_signals(res.bundle.parsed, res.candidate_parsed))

Each module is independently usable — pick the ones you need. Per-module quickstarts live in their __init__.py docstrings (also rendered in the generated API docs).
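The quickstart leaves your_llm_call and your_parser to the caller. As a rough sketch, offline stand-ins could look like the following. The `(prompt, model)` signature is inferred from how the quickstart calls the function, and the `direction` and `confidence` keys are inferred from the fields it reads. The exact contract that replay_bundle expects is an assumption here, so check the module docstrings before relying on it.

```python
import json


def your_llm_call(prompt: str, model: str) -> str:
    """Hypothetical stand-in for a provider call; returns canned JSON text."""
    # A real implementation would send `prompt` to `model` and return the raw reply.
    return json.dumps({"direction": "NO", "confidence": 0.72})


def your_parser(raw: str) -> dict:
    """Hypothetical parser: raw model text -> dict with direction and confidence."""
    data = json.loads(raw)
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return {"direction": str(data["direction"]), "confidence": confidence}


raw = your_llm_call("Will Iran sign a deal by June 2027?", model="sonnet-4-6")
parsed = your_parser(raw)
print(parsed["direction"], parsed["confidence"])  # NO 0.72
```

Validating the confidence range in the parser means a malformed model reply fails at capture time rather than surfacing later as a skewed score.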

Modules

| Module | What it does | Status |
| --- | --- | --- |
| trench_core.calibration | Brier scoring, calibration curves, threshold backtests, P&L attribution | ✅ shipped (0.3.0) |
| trench_core.replay | Capture-then-replay + diff harness for LLM agents | ✅ shipped (0.4.0) |
| trench_core.cycle_outcomes | Structured "one outcome line per loop iteration" instrumentation | ✅ shipped (0.2.0) |
| trench_core.registry | Public hash-chain pre-registration of agent outputs | ✅ shipped (0.2.0) |
| trench_core.ontology | Generic typed entity graph + alias resolver, SQLite-backed | ✅ shipped (0.5.0) |
| trench_core.sources | RSS poller + USGS seismic poller (Twitter/Telegram/financial deferred) | ✅ shipped (0.6.0) |
| trench_core.markets | Public-data clients: Kalshi (read-only), Manifold (Polymarket trading deferred) | ✅ shipped (0.7.0) |
| trench_core.tournament | Multi-variant runner pattern: same intel, different policies | ✅ shipped (0.8.0) |
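For readers new to the scoring vocabulary, the sketch below shows what a Brier score and a binned calibration table compute. It is a standalone illustration of the standard definitions, not the implementation behind trench_core.calibration. The function names `brier_score` and `calibration_bins`, and the choice of equal-width bins, are assumptions.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("need equal-length, non-empty inputs")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def calibration_bins(probs, outcomes, n_bins=5):
    """Group forecasts into equal-width bins; return (count, mean forecast, hit rate)."""
    bins = [[] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, o))
    rows = []
    for members in bins:
        if members:  # skip empty bins
            mean_p = sum(p for p, _ in members) / len(members)
            hit_rate = sum(o for _, o in members) / len(members)
            rows.append((len(members), mean_p, hit_rate))
    return rows


probs = [0.9, 0.8, 0.3, 0.1]   # forecast probability that the market resolves YES
outcomes = [1, 0, 0, 0]        # 1 = resolved YES, 0 = resolved NO

print(brier_score(probs, outcomes))  # approximately 0.1875
for count, mean_p, hit_rate in calibration_bins(probs, outcomes):
    print(count, round(mean_p, 3), round(hit_rate, 3))
```

A lower Brier score is better: 0.0 is a perfect forecaster, and always answering 0.5 scores 0.25. In a well-calibrated agent, each bin's hit rate sits close to its mean forecast.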

What's deliberately not in scope

  • ❌ A backtesting engine (use zipline or vectorbt for historical sim)
  • ❌ A strategy library (no canned signals)
  • ❌ A managed/SaaS version (you run it yourself)
  • ❌ Brokerage integration (the framework provides data, not execution glue)
  • ❌ A multi-provider AI abstraction layer (Anthropic-first by design; wrap your own analyzer for OpenAI)

Examples

The examples/ directory has runnable demos for every module. All of them run offline (network calls are mocked):

python examples/01_cycle_outcomes.py
python examples/02_registry.py
python examples/03_calibration.py
python examples/04_replay.py
python examples/05_ontology.py
python examples/06_sources.py
python examples/07_markets.py
python examples/08_tournament.py
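One common way to make a network-dependent demo run offline is to patch the HTTP call with unittest.mock. The sketch below shows that general technique with the standard library only. It does not claim to match how the bundled examples do their mocking, and `fetch_json`, the URL, and the payload fields are hypothetical.

```python
import io
import json
import urllib.request
from unittest import mock


def fetch_json(url: str) -> dict:
    """Hypothetical helper: GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


# Canned payload standing in for a live API response (made-up fields).
fake_body = io.BytesIO(json.dumps({"ticker": "EXAMPLE", "yes_price": 0.41}).encode())

with mock.patch("urllib.request.urlopen") as fake_urlopen:
    # fetch_json uses urlopen() as a context manager, so stub what __enter__ returns.
    fake_urlopen.return_value.__enter__.return_value = fake_body
    data = fetch_json("https://example.invalid/markets/EXAMPLE")

print(data["yes_price"])  # 0.41
```

The patch is scoped to the `with` block, so code outside it still reaches the real urlopen.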

Contributing

The project is alpha — issues and PRs welcome, but expect API churn until 1.0.0. See CONTRIBUTING.md for the development setup and the discipline that's kept the extraction clean (audit before code, ruff + pytest pre-flight, byte-identical proof for refactors).

License

MIT — see LICENSE.
