autocontext: a control plane for iterative strategy evolution.

autocontext

autocontext is a control plane for improving agent behavior over repeated runs. It combines multi-agent candidate generation, staged validation, scenario execution, knowledge accumulation, optional local distillation, and OpenClaw-facing APIs.

Working Directory

Run the commands in this README from the autocontext/ directory. The Python package, CLI entrypoint, tests, migrations, and dashboard assets all live here.

What It Does

  • Runs iterative generation loops against game scenarios and agent-task scenarios
  • Persists playbooks, hints, tools, reports, and snapshots across runs
  • Supports staged validation, harness synthesis, and harness-aware routing
  • Exports training data and runs autoresearch-style local training loops
  • Exposes evaluation, validation, artifact, and discovery operations over MCP and HTTP

Quick Start

From the repo root:

cd autocontext
uv venv
source .venv/bin/activate
uv sync --group dev

Use the repo-level .env.example as the reference for available AUTOCONTEXT_* settings.

operator-in-the-loop remains a typed scenario family for capability discovery and experimentation, but autocontext does not scaffold executable operator-loop runtimes. For escalation paths, use datasets, tools, or live-agent experiments rather than harness-owned escalation scripts.

Run a deterministic local scenario:

AUTOCONTEXT_AGENT_PROVIDER=deterministic \
uv run autoctx run --scenario grid_ctf --gens 3 --run-id quickstart

Run with Anthropic:

AUTOCONTEXT_AGENT_PROVIDER=anthropic \
AUTOCONTEXT_ANTHROPIC_API_KEY=... \
uv run autoctx run --scenario grid_ctf --gens 3

Run with Pi CLI (local Pi agent runtime):

AUTOCONTEXT_AGENT_PROVIDER=pi \
AUTOCONTEXT_PI_COMMAND=pi \
uv run autoctx run --scenario grid_ctf --gens 3

Run with Pi RPC (remote Pi agent via HTTP):

AUTOCONTEXT_AGENT_PROVIDER=pi-rpc \
AUTOCONTEXT_PI_RPC_ENDPOINT=http://localhost:3284 \
uv run autoctx run --scenario grid_ctf --gens 3

Run with Hermes (via OpenAI-compatible gateway):

AUTOCONTEXT_AGENT_PROVIDER=openai-compatible \
AUTOCONTEXT_AGENT_BASE_URL=http://localhost:8080/v1 \
AUTOCONTEXT_AGENT_API_KEY=no-key \
AUTOCONTEXT_AGENT_DEFAULT_MODEL=hermes-3-llama-3.1-8b \
uv run autoctx run --scenario grid_ctf --gens 3
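An "OpenAI-compatible" gateway is one that accepts the standard chat-completions request shape at `{base_url}/chat/completions`. As a rough sketch of what the provider settings above translate into on the wire (the endpoint path and payload follow the OpenAI chat-completions convention; autocontext's internal client and message content may differ):

```python
import json

# Mirrors the AUTOCONTEXT_AGENT_BASE_URL / AUTOCONTEXT_AGENT_DEFAULT_MODEL
# settings from the example above.
base_url = "http://localhost:8080/v1"
model = "hermes-3-llama-3.1-8b"

# Standard chat-completions request body; the actual prompts are
# hypothetical placeholders, not autocontext's real prompts.
payload = {
    "model": model,
    "messages": [
        {"role": "system", "content": "You are the grid_ctf agent."},
        {"role": "user", "content": "Plan your next move."},
    ],
    "temperature": 0.2,
}

url = f"{base_url}/chat/completions"
body = json.dumps(payload)
```

Any gateway that answers this request shape (llama.cpp server, vLLM, LM Studio, and similar) should work as the `openai-compatible` provider.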

Start the API server and dashboard:

uv run autoctx serve --host 127.0.0.1 --port 8000

Open http://127.0.0.1:8000 after the server starts.

Start the MCP server:

uv sync --group dev --extra mcp
uv run autoctx mcp-serve

Main CLI Commands

uv run autoctx run --scenario grid_ctf --gens 3
uv run autoctx list
uv run autoctx status <run_id>
uv run autoctx replay <run_id> --generation 1
uv run autoctx benchmark --scenario grid_ctf --runs 5
uv run autoctx new-scenario --template prompt-optimization --name my-task
uv run autoctx serve --host 127.0.0.1 --port 8000
uv run autoctx mcp-serve
uv run autoctx wait <condition_id> --json

Useful variants:

AUTOCONTEXT_AGENT_PROVIDER=anthropic AUTOCONTEXT_ANTHROPIC_API_KEY=... \
uv run autoctx run --scenario grid_ctf --gens 3

AUTOCONTEXT_AGENT_PROVIDER=deterministic AUTOCONTEXT_RLM_ENABLED=true \
uv run autoctx run --scenario grid_ctf --gens 3

Training Workflow

Export JSONL training data from completed runs:

uv run autoctx export-training-data \
  --scenario grid_ctf \
  --all-runs \
  --output training/grid_ctf.jsonl

Launch the autoresearch-style training loop:

uv sync --group dev --extra mlx
uv run autoctx train \
  --scenario grid_ctf \
  --data training/grid_ctf.jsonl \
  --time-budget 300

MLX training is host-only: it must run on an Apple Silicon macOS machine with Metal access, and it will not run correctly inside a Docker sandbox on macOS.

If you only want to inspect generated training data first, export without training and open the JSONL directly.
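A quick way to sanity-check an export before spending a training budget, sketched with hypothetical record fields (the real JSONL schema is whatever export-training-data emits; read one line of your export first to see its keys):

```python
import json
from pathlib import Path

# Stand-in file for the sketch; in practice point this at
# training/grid_ctf.jsonl. Field names here are illustrative only.
sample = Path("sample.jsonl")
sample.write_text(
    '{"scenario": "grid_ctf", "generation": 1, "reward": 0.4}\n'
    '{"scenario": "grid_ctf", "generation": 2, "reward": 0.9}\n'
)

# JSONL = one JSON object per line; skip blank lines defensively.
records = [json.loads(line) for line in sample.read_text().splitlines() if line.strip()]
best = max(records, key=lambda r: r["reward"])
print(len(records), best["generation"])
```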

For host setup details and OpenClaw automation via a file-based watcher bridge, see docs/mlx-training.md.

Configuration

Configuration is loaded from AUTOCONTEXT_* environment variables in src/autocontext/config/settings.py.

Common settings:

  • AUTOCONTEXT_AGENT_PROVIDER
  • AUTOCONTEXT_EXECUTOR_MODE
  • AUTOCONTEXT_MODEL_COMPETITOR
  • AUTOCONTEXT_MATCHES_PER_GENERATION
  • AUTOCONTEXT_MAX_RETRIES
  • AUTOCONTEXT_JUDGE_PROVIDER
  • AUTOCONTEXT_RLM_ENABLED
  • AUTOCONTEXT_HARNESS_PREFLIGHT_ENABLED
  • AUTOCONTEXT_STAGED_VALIDATION_ENABLED

See the repo-level .env.example for a working starting point.
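As an illustration of how AUTOCONTEXT_* variables typically map onto typed settings (a hedged sketch with made-up defaults; the authoritative definitions live in src/autocontext/config/settings.py):

```python
import os

def load_settings(env=os.environ) -> dict:
    """Read a few AUTOCONTEXT_* variables with typed fallbacks (illustrative defaults)."""
    truthy = {"1", "true", "yes", "on"}
    return {
        "agent_provider": env.get("AUTOCONTEXT_AGENT_PROVIDER", "deterministic"),
        "max_retries": int(env.get("AUTOCONTEXT_MAX_RETRIES", "3")),
        "rlm_enabled": env.get("AUTOCONTEXT_RLM_ENABLED", "false").strip().lower() in truthy,
    }

# Passing a dict instead of os.environ makes the loader easy to test.
settings = load_settings({"AUTOCONTEXT_AGENT_PROVIDER": "anthropic", "AUTOCONTEXT_RLM_ENABLED": "true"})
```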

Repository Structure

autocontext/
  src/autocontext/   Python package
  tests/             Pytest suite
  dashboard/         Static dashboard assets
  docs/              Package-specific documentation
  migrations/        SQLite migrations
ts/                  TypeScript package
tui/                 Interactive terminal UI
infra/               Docker, Fly.io, bootstrap scripts

Validation and Development

uv run ruff check src tests
uv run mypy src
uv run pytest

If you change protocol messages, regenerate the derived protocol artifacts from the repo root:

cd ..
uv run --directory autocontext python scripts/generate_protocol.py

OpenClaw / ClawHub

autocontext exposes:

  • artifact contracts for harnesses, policies, and distilled models
  • REST and MCP operations for evaluate, validate, publish, import, and discover
  • ClawHub skill manifests and scenario discovery metadata
  • an adapter layer for running OpenClaw agents inside the harness

Additional Docs

See the docs/ directory for package-specific guides, including docs/mlx-training.md for MLX host setup.