autocontext control plane for iterative strategy evolution.

autocontext

autocontext is the Python control-plane package for running scenarios, carrying forward validated knowledge, exporting artifacts, and distilling stable behavior into cheaper runtimes over time.

The intended use is to hand the harness a real task in plain language, let it solve or simulate the problem mostly hands-off, and then inspect the resulting traces, reports, playbooks, datasets, and optional distilled model.

Install

pip install autocontext

The current PyPI release is autocontext==0.3.2. The package is published on PyPI as autocontext; the CLI entrypoint remains autoctx.
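
To confirm that the install worked, a short check along these lines can help. It is a minimal sketch using only the Python standard library; the helper names installed_version and cli_path are illustrative and not part of autocontext.

```python
import shutil
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "autocontext") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def cli_path(command: str = "autoctx") -> Optional[str]:
    """Return the absolute path of `command` on PATH, or None if it is not found."""
    return shutil.which(command)


if __name__ == "__main__":
    # Both helpers are hypothetical conveniences, not autocontext APIs.
    print("autocontext version:", installed_version() or "not installed")
    print("autoctx on PATH:", cli_path() or "not found")
```

If the version prints but autoctx is not found, the virtual environment that holds the package is probably not activated in the current shell.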

Working Directory

Run the commands in this README from the autocontext/ directory. The Python package, CLI entrypoint, tests, and migrations all live here.

What It Does

  • Runs iterative generation loops against game scenarios and agent-task scenarios
  • Adds a first-class simulate surface for modeled-world exploration, replay, compare, and export
  • Persists playbooks, hints, tools, reports, and snapshots across runs
  • Supports staged validation, harness synthesis, and harness-aware routing
  • Exports training data and runs autoresearch-style local training loops
  • Exposes evaluation, validation, artifact, and discovery operations over MCP and HTTP

Surface Summary

The Python package is the full control-plane surface in this repo. It currently includes:

  • generation-loop execution via autoctx run
  • plain-language simulation via autoctx simulate
  • local training workflows via autoctx export-training-data and autoctx train
  • scenario creation and materialization via autoctx new-scenario
  • HTTP API and MCP server surfaces via autoctx serve and autoctx mcp-serve

Some newer operator-facing surfaces are currently TypeScript-first:

  • autoctx investigate
  • autoctx analyze
  • the interactive terminal UI via npx autoctx tui

Quick Start

From the repo root:

cd autocontext
uv venv
source .venv/bin/activate
uv sync --group dev

Use the repo-level .env.example as the reference for available AUTOCONTEXT_* settings.

operator-in-the-loop remains a typed scenario family for capability discovery and experimentation, but autocontext does not scaffold executable operator-loop runtimes. Use datasets, tools, or live-agent experiments instead of harness-owned escalation scripts.

Run a deterministic local scenario:

AUTOCONTEXT_AGENT_PROVIDER=deterministic \
uv run autoctx solve --description "improve customer-support replies for billing disputes" --gens 3

Run with Anthropic:

AUTOCONTEXT_AGENT_PROVIDER=anthropic \
AUTOCONTEXT_ANTHROPIC_API_KEY=... \
uv run autoctx solve --description "improve customer-support replies for billing disputes" --gens 3

Run with Pi CLI (local Pi agent runtime):

AUTOCONTEXT_AGENT_PROVIDER=pi \
AUTOCONTEXT_PI_COMMAND=pi \
uv run autoctx solve --description "improve customer-support replies for billing disputes" --gens 3

Run with Pi RPC (remote Pi agent via HTTP):

AUTOCONTEXT_AGENT_PROVIDER=pi-rpc \
AUTOCONTEXT_PI_RPC_ENDPOINT=http://localhost:3284 \
uv run autoctx solve --description "improve customer-support replies for billing disputes" --gens 3

Run with Hermes (via OpenAI-compatible gateway):

AUTOCONTEXT_AGENT_PROVIDER=openai-compatible \
AUTOCONTEXT_AGENT_BASE_URL=http://localhost:8080/v1 \
AUTOCONTEXT_AGENT_API_KEY=no-key \
AUTOCONTEXT_AGENT_DEFAULT_MODEL=hermes-3-llama-3.1-8b \
uv run autoctx solve --description "improve customer-support replies for billing disputes" --gens 3
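
The provider examples above all follow the same pattern: set AUTOCONTEXT_AGENT_PROVIDER plus any provider-specific variables, then call autoctx solve. The sketch below wraps that pattern in Python for scripting. It only uses the environment variables and flags shown in this README; the function names are illustrative, and running autoctx through subprocess is an assumption about how you might automate it, not a documented integration.

```python
import os
import shlex
import subprocess
from typing import Dict, List, Mapping, Optional

DESCRIPTION = "improve customer-support replies for billing disputes"


def build_solve_command(description: str, gens: int) -> List[str]:
    """Build the `uv run autoctx solve` argument list used in the examples above."""
    return ["uv", "run", "autoctx", "solve",
            "--description", description, "--gens", str(gens)]


def build_env(provider: str,
              extra: Optional[Mapping[str, str]] = None,
              base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Copy the base environment and layer the provider settings on top."""
    env = dict(os.environ if base is None else base)
    env["AUTOCONTEXT_AGENT_PROVIDER"] = provider
    env.update(extra or {})
    return env


def run_solve(provider: str, description: str, gens: int,
              extra: Optional[Mapping[str, str]] = None) -> int:
    """Run the command from the autocontext/ directory and return its exit code."""
    result = subprocess.run(build_solve_command(description, gens),
                            env=build_env(provider, extra), check=False)
    return result.returncode


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(build_solve_command(DESCRIPTION, 3)))
```

For example, run_solve("pi", DESCRIPTION, 3, {"AUTOCONTEXT_PI_COMMAND": "pi"}) would mirror the Pi CLI invocation shown earlier.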

Start the API server:

uv run autoctx serve --host 127.0.0.1 --port 8000

Inspect http://127.0.0.1:8000/ for the API index after the server starts. For an interactive terminal UI, use the TypeScript package: npx autoctx tui.
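
A quick way to check from a script that the server is up is to fetch the index URL. This is a minimal standard-library sketch; it makes no assumption about the shape of the response body and only reports the HTTP status and the first characters returned. The helper names are illustrative.

```python
import urllib.request
from typing import Optional, Tuple


def api_index_url(host: str = "127.0.0.1", port: int = 8000) -> str:
    """Build the index URL for a server started with `autoctx serve`."""
    return f"http://{host}:{port}/"


def fetch_index(url: str, timeout: float = 5.0) -> Optional[Tuple[int, str]]:
    """Return (status, body) for a successful response, or None otherwise.

    Connection failures, timeouts, and HTTP error statuses all surface as
    OSError subclasses in urllib, so they are reported as None here.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except OSError:
        return None


if __name__ == "__main__":
    result = fetch_index(api_index_url())
    if result is None:
        print("API server not reachable")
    else:
        status, body = result
        print(status, body[:200])
```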

Start the MCP server:

uv sync --group dev --extra mcp
uv run autoctx mcp-serve

Main CLI Commands

uv run autoctx solve --description "improve customer-support replies for billing disputes" --gens 3
uv run autoctx simulate --description "simulate deploying a web service with rollback"
uv run autoctx simulate --replay deploy_sim --variables threshold=0.9
uv run autoctx list
uv run autoctx status <run_id>
uv run autoctx replay <run_id> --generation 1
uv run autoctx run --scenario support_triage --gens 3
uv run autoctx benchmark --scenario support_triage --runs 5
uv run autoctx new-scenario --template prompt-optimization --name support_triage
uv run autoctx export-training-data --scenario support_triage --all-runs --output training/support_triage.jsonl
uv run autoctx train --scenario support_triage --data training/support_triage.jsonl --time-budget 300
uv run autoctx serve --host 127.0.0.1 --port 8000
uv run autoctx mcp-serve
uv run autoctx wait <condition_id> --json

Useful variants:

AUTOCONTEXT_AGENT_PROVIDER=anthropic AUTOCONTEXT_ANTHROPIC_API_KEY=... \
uv run autoctx solve --description "improve customer-support replies for billing disputes" --gens 3

AUTOCONTEXT_AGENT_PROVIDER=deterministic AUTOCONTEXT_RLM_ENABLED=true \
uv run autoctx solve --description "improve customer-support replies for billing disputes" --gens 3

Training Workflow

Export JSONL training data from completed runs:

uv run autoctx export-training-data \
  --scenario support_triage \
  --all-runs \
  --output training/support_triage.jsonl

Launch the autoresearch-style training loop:

uv sync --group dev --extra mlx
uv run autoctx train \
  --scenario support_triage \
  --data training/support_triage.jsonl \
  --time-budget 300

MLX training is host-only. It must run on an Apple Silicon macOS machine with Metal access. It will not run correctly inside a Docker sandbox on macOS.

If you only want to inspect generated training data first, export without training and open the JSONL directly.
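
One way to inspect the export is to read the first few records and list their top-level keys. The sketch below assumes only that the file is JSONL (one JSON value per line); it does not assume any particular record schema, and the helper names are illustrative.

```python
import json
from itertools import islice
from pathlib import Path
from typing import Any, List


def peek_jsonl(path: str, limit: int = 3) -> List[Any]:
    """Parse up to `limit` non-blank lines of a JSONL file."""
    with Path(path).open(encoding="utf-8") as fh:
        non_blank = (line for line in fh if line.strip())
        return [json.loads(line) for line in islice(non_blank, limit)]


def summarize(record: Any) -> str:
    """Describe a record by its sorted top-level keys, or by its type if not a dict."""
    if isinstance(record, dict):
        return ", ".join(sorted(map(str, record)))
    return type(record).__name__


if __name__ == "__main__":
    export = Path("training/support_triage.jsonl")  # path used in the export example
    if export.exists():
        for record in peek_jsonl(str(export)):
            print(summarize(record))
    else:
        print(f"{export} not found; run export-training-data first")
```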

For host setup details and OpenClaw automation via a file-based watcher bridge, see docs/mlx-training.md.

Configuration

Configuration is read from AUTOCONTEXT_* environment variables; the loading code lives in src/autocontext/config/settings.py.

Common settings:

  • AUTOCONTEXT_AGENT_PROVIDER
  • AUTOCONTEXT_EXECUTOR_MODE
  • AUTOCONTEXT_MODEL_COMPETITOR
  • AUTOCONTEXT_MATCHES_PER_GENERATION
  • AUTOCONTEXT_MAX_RETRIES
  • AUTOCONTEXT_JUDGE_PROVIDER
  • AUTOCONTEXT_RLM_ENABLED
  • AUTOCONTEXT_HARNESS_PREFLIGHT_ENABLED
  • AUTOCONTEXT_STAGED_VALIDATION_ENABLED

See the repo-level .env.example for a working starting point.
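
When debugging a run, it can help to see which AUTOCONTEXT_* variables are actually set in the current shell. The sketch below reads the process environment directly rather than going through autocontext's settings module. The rule for masking secrets (any name containing KEY, TOKEN, or SECRET) is an assumption made for this example.

```python
import os
from typing import Dict, Mapping, Optional

PREFIX = "AUTOCONTEXT_"
SECRET_MARKERS = ("KEY", "TOKEN", "SECRET")  # heuristic, assumed for this sketch


def autocontext_settings(environ: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the AUTOCONTEXT_* variables from `environ`, sorted by name."""
    source = os.environ if environ is None else environ
    return {name: value for name, value in sorted(source.items())
            if name.startswith(PREFIX)}


def redact(name: str, value: str) -> str:
    """Mask values whose variable name looks like it holds a credential."""
    return "***" if any(marker in name for marker in SECRET_MARKERS) else value


if __name__ == "__main__":
    settings = autocontext_settings()
    if not settings:
        print("no AUTOCONTEXT_* variables set")
    for name, value in settings.items():
        print(f"{name}={redact(name, value)}")
```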

Repository Structure

autocontext/
  src/autocontext/   Python package
  tests/             Pytest suite
  docs/              Package-specific documentation
  migrations/        SQLite migrations
ts/                  TypeScript package
infra/               Docker, Fly.io, bootstrap scripts

Validation and Development

uv run ruff check src tests
uv run mypy src
uv run pytest

If you change protocol messages, regenerate the derived protocol artifacts from the repo root:

cd ..
uv run --directory autocontext python scripts/generate_protocol.py

OpenClaw / ClawHub

autocontext exposes:

  • artifact contracts for harnesses, policies, and distilled models
  • REST and MCP operations for evaluate, validate, publish, import, and discover
  • ClawHub skill manifests and scenario discovery metadata
  • an adapter layer for running OpenClaw agents inside the harness


Download files

Source Distribution

autocontext-0.3.2.tar.gz (1.2 MB)

Uploaded Source

Built Distribution

autocontext-0.3.2-py3-none-any.whl (719.0 kB)

Uploaded Python 3

File details

Details for the file autocontext-0.3.2.tar.gz.

File metadata

  • Download URL: autocontext-0.3.2.tar.gz
  • Upload date:
  • Size: 1.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for autocontext-0.3.2.tar.gz:

Algorithm     Hash digest
SHA256        b56b00ac019a397a9008900fb8df94eaee400c2405cb6fecd0dc6c0e694d479b
MD5           8d88c81700c21c95a93ba1b0a6c21ec4
BLAKE2b-256   90829c70af94afebc1fd6b9006a2b367f81f66727543336f15c3352f8170f87a
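
A downloaded file can be checked against the published SHA256 digest with the standard library. This is a minimal sketch: the expected value is the SHA256 listed here for autocontext-0.3.2.tar.gz, while the local file path and the helper names are assumptions made for this example.

```python
import hashlib
import hmac
from pathlib import Path

# SHA256 listed on this page for autocontext-0.3.2.tar.gz.
EXPECTED_SHA256 = "b56b00ac019a397a9008900fb8df94eaee400c2405cb6fecd0dc6c0e694d479b"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives are not loaded into memory at once."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare the file's SHA256 with the expected hex digest (case-insensitive)."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())


if __name__ == "__main__":
    archive = Path("autocontext-0.3.2.tar.gz")  # hypothetical local download path
    if archive.exists():
        print("hash OK" if matches(str(archive), EXPECTED_SHA256) else "hash MISMATCH")
    else:
        print(f"{archive} not found")
```

pip can also enforce hashes at install time through its hash-checking mode, which avoids a separate manual step.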

Provenance

The following attestation bundles were made for autocontext-0.3.2.tar.gz:

Publisher: publish-python.yml on greyhaven-ai/autocontext

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file autocontext-0.3.2-py3-none-any.whl.

File metadata

  • Download URL: autocontext-0.3.2-py3-none-any.whl
  • Upload date:
  • Size: 719.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for autocontext-0.3.2-py3-none-any.whl:

Algorithm     Hash digest
SHA256        46579c36a9a5552f0118b45554e357e4c0a9df5cdb3227e8f9f02fc67d3e660a
MD5           b451bc2e086629636e7c88ba77f8ef5c
BLAKE2b-256   43c3084ae3fc9afa54324c70486768af05f1d9e824fe498658a0039b32ed3d01

Provenance

The following attestation bundles were made for autocontext-0.3.2-py3-none-any.whl:

Publisher: publish-python.yml on greyhaven-ai/autocontext
