Plumbing for calibrated AI trading agents — capture, score, instrument, anchor.

trench-core

trench-core is the open-source framework that powers TrenchSignals, an autonomous AI agent that paper-trades geopolitical conflict markets in public, with every prediction Brier-scored and every loss followed by a public post-mortem.

This package is the plumbing: capture, score, instrument, anchor. It is deliberately not the agent itself — TrenchSignals' specific ontology, prompts, brand voice, and operating record stay private.

If you want to:

  • Capture every LLM call with full input/output for later replay through a different model
  • Score predictions against actual market settlements (Brier, calibration curves, threshold backtests, P&L attribution)
  • Anchor predictions into a public hash chain so anyone can verify them without trusting your server
  • Instrument every loop iteration with a single structured outcome line so failure modes are observable in production
  • Run a multi-variant tournament of decision policies on the same intelligence pipeline

…then trench-core gives you the building blocks.
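
To make the scoring item concrete, here is a minimal standard-library sketch of a Brier score and a binned calibration table. It is an illustration of the technique only: the function names `brier_score` and `calibration_bins` and the sample numbers are assumptions made for this example, not the API of trench_core.calibration.

```python
from collections import defaultdict


def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equal-length, non-empty sequences")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


def calibration_bins(forecasts, outcomes, n_bins=5):
    """Return (mean forecast, observed hit rate, count) for each non-empty bin."""
    bins = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 lands in the top bin
        bins[idx].append((p, o))
    rows = []
    for idx in sorted(bins):
        ps, hits = zip(*bins[idx])
        rows.append((sum(ps) / len(ps), sum(hits) / len(hits), len(ps)))
    return rows


# Made-up forecasts (probability of YES) and settlements (1 = YES, 0 = NO).
forecasts = [0.9, 0.8, 0.3, 0.1]
outcomes = [1, 0, 0, 0]

score = brier_score(forecasts, outcomes)
rows = calibration_bins(forecasts, outcomes)
print(round(score, 4))
for mean_forecast, hit_rate, count in rows:
    print(f"forecast~{mean_forecast:.2f} hit_rate={hit_rate:.2f} n={count}")
```

A lower Brier score is better; always forecasting 0.5 scores 0.25, which makes a handy baseline. In the bin table, a well-calibrated agent shows hit rates close to the mean forecast in each row.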

Status

Alpha (0.x). All eight modules have shipped as of 0.8.0. The API may break between minor versions until 1.0.0, so pin an exact version if you depend on it.

See the changelog for what's landed.

Install

pip install trench-core              # core + most modules (stdlib only)
pip install 'trench-core[sources]'   # adds feedparser for RSS polling
pip install 'trench-core[all]'       # everything

Six of the eight modules need only the standard library. Only sources (RSS polling, via feedparser) declares an optional runtime dependency.
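
If you are unsure whether the optional extra is installed, a small standard-library check like the one below can tell you before you try RSS polling. The helper name `has_optional_dependency` is an assumption made for this sketch; how trench_core.sources itself behaves when feedparser is missing is not described here.

```python
import importlib.util


def has_optional_dependency(module_name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


if has_optional_dependency("feedparser"):
    print("feedparser found: the RSS extra appears to be installed")
else:
    print("feedparser missing: pip install 'trench-core[sources]' to add it")
```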

Quickstart

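A minimal end-to-end loop: capture a Claude call, log its outcome, register the bundle's hash on a public chain, score the prediction once the market settles, and replay the captured call through a different model. The framework imports neither requests nor the Anthropic SDK; it stays provider-agnostic, and you wire in your own model caller (your_llm_call and your_parser below are caller-supplied placeholders).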

from trench_core.replay import BundleWriter, load_bundles, replay_bundle, diff_signals
from trench_core.cycle_outcomes import OutcomeLogger
from trench_core.registry import RegistryWriter, verify_chain
from trench_core.calibration import calibration_report

# 1. Capture: at the moment of generation
writer = BundleWriter(
    path="bundles.jsonl",
    system_prompt="You are an analyst...",   # auto-hashed for grouping
)
my_prompt = "Will Iran sign a deal by June 2027?"
my_raw    = your_llm_call(my_prompt, model="sonnet-4-6")  # caller-supplied
my_parsed = your_parser(my_raw)
writer.write(prompt=my_prompt, response_raw=my_raw,
             parsed=my_parsed, model="sonnet-4-6")

# 2. Instrument: emit one structured outcome line per loop iteration
outcomes = OutcomeLogger("/var/log/agent.log")
outcomes.emit("traded", confidence=my_parsed["confidence"], side="NO")

# 3. Pre-register: anchor today's bundles in a public hash chain
chain = RegistryWriter(
    bundle_paths={"baseline": "bundles.jsonl"},
    registry_root="registry/",
    summary_fields=("direction", "confidence"),
)
chain.update()                    # appends today's record
verify_chain("registry/")         # raises ChainBroken if anything tampered

# 4. Score: when the market settles, run a calibration report
trades = [...]        # your closed-trade dicts
evals  = [...]        # your per-market evaluation dicts
report = calibration_report(trades, evaluations=evals)
print(report["brier"]["mean_brier"], report["trade_summary"]["roi_pct"])

# 5. Replay: re-run a captured cycle through a different model
bundles = load_bundles("bundles.jsonl")
res = replay_bundle(
    bundle=bundles[-1],
    model="haiku-4-5",
    model_caller=your_llm_call,
    parser=your_parser,
)
print(diff_signals(res.bundle.parsed, res.candidate_parsed))

Each module is independently usable — pick the ones you need. Per-module quickstarts live in their __init__.py docstrings (also rendered in the generated API docs).
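
The quickstart leaves `your_llm_call` and `your_parser` to the caller. To try the flow without a model provider, offline stand-ins along these lines may help. This is a sketch under assumptions: the canned JSON reply and its fields (`direction`, `confidence`) simply mirror the fields the quickstart reads, and the exact call signature `replay_bundle` expects from `model_caller` is not shown above, so check the module docstring before relying on it.

```python
import json


def your_llm_call(prompt: str, model: str) -> str:
    """Hypothetical stand-in for a real model call: returns a canned JSON reply."""
    return json.dumps({"direction": "NO", "confidence": 0.7, "model": model})


def your_parser(raw: str) -> dict:
    """Parse the raw reply into a dict and sanity-check the probability."""
    parsed = json.loads(raw)
    if not 0.0 <= parsed["confidence"] <= 1.0:
        raise ValueError("confidence must be a probability in [0, 1]")
    return parsed


raw = your_llm_call("Will Iran sign a deal by June 2027?", model="sonnet-4-6")
parsed = your_parser(raw)
print(parsed["direction"], parsed["confidence"])
```

Swapping the stub body for a real provider call later keeps the rest of the loop unchanged, which is the point of keeping the caller outside the framework.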

Modules

| Module | What it does | Status |
| --- | --- | --- |
| trench_core.calibration | Brier scoring, calibration curves, threshold backtests, P&L attribution | ✅ shipped (0.3.0) |
| trench_core.replay | Capture-then-replay + diff harness for LLM agents | ✅ shipped (0.4.0) |
| trench_core.cycle_outcomes | Structured "one outcome line per loop iteration" instrumentation | ✅ shipped (0.2.0) |
| trench_core.registry | Public hash-chain pre-registration of agent outputs | ✅ shipped (0.2.0) |
| trench_core.ontology | Generic typed entity graph + alias resolver, SQLite-backed | ✅ shipped (0.5.0) |
| trench_core.sources | RSS poller + USGS seismic poller (Twitter/Telegram/financial deferred) | ✅ shipped (0.6.0) |
| trench_core.markets | Public-data clients: Kalshi (read-only), Manifold (Polymarket trading deferred) | ✅ shipped (0.7.0) |
| trench_core.tournament | Multi-variant runner pattern: same intel, different policies | ✅ shipped (0.8.0) |
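
For readers new to hash-chain pre-registration, the idea behind the registry module can be sketched in a few lines of standard-library Python. This is not the on-disk format or API of trench_core.registry; the record layout, the all-zero genesis value, and the function names are assumptions chosen to show why tampering with an earlier record is detectable.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first record


def record_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous record's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_record(chain: list, payload: dict) -> None:
    """Append a record that commits to everything before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev_hash, "payload": payload,
                  "hash": record_hash(prev_hash, payload)})


def chain_is_valid(chain: list) -> bool:
    """Recompute every hash from the start; any edit breaks a link."""
    prev_hash = GENESIS
    for rec in chain:
        if rec["prev"] != prev_hash:
            return False
        if rec["hash"] != record_hash(prev_hash, rec["payload"]):
            return False
        prev_hash = rec["hash"]
    return True


chain = []
append_record(chain, {"day": 1, "direction": "NO", "confidence": 0.7})
append_record(chain, {"day": 2, "direction": "YES", "confidence": 0.6})
print(chain_is_valid(chain))   # True

chain[0]["payload"]["confidence"] = 0.99  # rewrite history after the fact
print(chain_is_valid(chain))   # False
```

Because each record's hash covers the previous hash, publishing only the latest hash is enough for an outside party to later check that no earlier prediction was altered.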

What's deliberately not in scope

  • ❌ A backtesting engine (use zipline or vectorbt for historical sim)
  • ❌ A strategy library (no canned signals)
  • ❌ A managed/SaaS version (you run it yourself)
  • ❌ Brokerage integration (the framework provides data, not execution glue)
  • ❌ A multi-provider AI abstraction layer (Anthropic-first by design; wrap your own analyzer for OpenAI)
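
Since no provider abstraction ships with the package, wrapping a different provider is left to you. One way to keep that wrapper small is to adapt every provider to a single call shape. The sketch below is hypothetical throughout: `ModelCaller`, `make_caller`, and `fake_provider` are names invented for this example, and the `(prompt, model) -> str` shape is inferred from the quickstart rather than documented.

```python
from typing import Callable, Protocol


class ModelCaller(Protocol):
    """Assumed call shape: a prompt and a model name in, raw text out."""

    def __call__(self, prompt: str, model: str) -> str: ...


def make_caller(send: Callable[[str, str], str], default_model: str) -> ModelCaller:
    """Adapt any provider function (prompt, model) -> text to one call shape."""

    def caller(prompt: str, model: str = "") -> str:
        # Fall back to the default model when the caller does not name one.
        return send(prompt, model or default_model)

    return caller


def fake_provider(prompt: str, model: str) -> str:
    """Stand-in for a real SDK call, so the sketch runs offline."""
    return f"[{model}] echo: {prompt}"


call = make_caller(fake_provider, default_model="model-a")
print(call("ping"))
print(call("ping", model="model-b"))
```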

Examples

The examples/ directory has runnable demos for every module. All of them run offline (network calls are mocked):

python examples/01_cycle_outcomes.py
python examples/02_registry.py
python examples/03_calibration.py
python examples/04_replay.py
python examples/05_ontology.py
python examples/06_sources.py
python examples/07_markets.py
python examples/08_tournament.py
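
If you want your own scripts to run offline in the same spirit, one common approach is to patch the network call with unittest.mock. The sketch below shows that general technique only; how the bundled examples mock their calls is not shown here, and `fetch_json`, the URL, and the payload fields are made up for illustration.

```python
import io
import json
import urllib.request
from unittest import mock


def fetch_json(url: str) -> dict:
    """Fetch a URL and decode its JSON body (would hit the network if unpatched)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


# A canned response body; BytesIO works as a context manager like a real response.
fake_body = io.BytesIO(
    json.dumps({"ticker": "EXAMPLE", "yes_price": 0.42}).encode("utf-8")
)

with mock.patch("urllib.request.urlopen", return_value=fake_body) as fake_open:
    data = fetch_json("https://example.invalid/markets/EXAMPLE")

print(data["yes_price"])
```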

Contributing

The project is alpha — issues and PRs welcome, but expect API churn until 1.0.0. See CONTRIBUTING.md for the development setup and the discipline that's kept the extraction clean (audit before code, ruff + pytest pre-flight, byte-identical proof for refactors).

License

MIT — see LICENSE.
