Skip to main content

Domain-agnostic autonomous optimization framework

Project description

anneal

Let an AI agent improve your code, prompts, and configs — overnight, unattended.

anneal infographic — autonomous optimization for any measurable artifact

Point anneal at any text file in a git repo, tell it how to measure "better," and walk away. The agent generates hypotheses, runs experiments, keeps winners, discards losers, and compounds learnings — all while you sleep.

How It Works

  1. Register a target — the file to improve, how to score it, what's off-limits
  2. Run the loop — the agent mutates → evaluates → keeps or reverts → learns → repeats
  3. Review — every experiment is a git commit; check the history, scores, and cost
anneal register \
  --name my-target \
  --artifact path/to/file.py \
  --eval-mode deterministic \
  --run-cmd "python benchmark.py" \
  --parse-cmd "grep 'score' | awk '{print \$2}'" \
  --direction maximize \
  --scope scope.yaml

anneal run --target my-target --experiments 20

Install

uv tool install anneal-cli

# Or with pip
pip install anneal-cli

Requires Python 3.12+.

Examples

Prompt Optimization — stochastic eval

Improve an article summarizer prompt. The agent rewrites system_prompt.md, generates summaries from 5 test articles, and an LLM judge scores each against 4 binary criteria (key points captured? concise? plain language? factually accurate?).

anneal register \
  --name prompt-optimizer \
  --artifact examples/prompt-optimizer/system_prompt.md \
  --eval-mode stochastic \
  --criteria examples/prompt-optimizer/eval_criteria.toml \
  --direction maximize \
  --scope examples/prompt-optimizer/scope.yaml

anneal run --target prompt-optimizer --experiments 10

Test Coverage — deterministic eval, maximize

Improve pytest test coverage of a Python module. The agent adds tests to cover untested code paths. pytest --cov provides the score.

anneal register \
  --name test-coverage \
  --artifact examples/test-coverage/tests/test_calculator.py \
  --eval-mode deterministic \
  --run-cmd "bash examples/test-coverage/eval.sh" \
  --parse-cmd "cat" \
  --direction maximize \
  --scope examples/test-coverage/scope.yaml

anneal run --target test-coverage --experiments 10

Code Golf — deterministic eval, minimize

Shrink a verbose Python file while preserving byte-identical output. In a test run: 3,592 → 228 characters (93.7% reduction) in 7 experiments.

anneal register \
  --name code-golf \
  --artifact examples/code-golf/app.py \
  --eval-mode deterministic \
  --run-cmd "bash examples/code-golf/eval.sh" \
  --parse-cmd "cat" \
  --direction minimize \
  --scope examples/code-golf/scope.yaml

anneal run --target code-golf --experiments 10

Local Artifacts (no git tracking required)

Artifact files don't need to be committed to git. If they're untracked, anneal copies them into the worktree automatically during registration. For files you don't want in version control at all, use --in-place to skip worktree isolation entirely:

anneal register \
  --name local-skill \
  --artifact SKILL.md \
  --eval-mode stochastic \
  --criteria eval_criteria.toml \
  --direction maximize \
  --scope scope.yaml \
  --in-place

Two Eval Modes

Deterministic — a shell command produces a number. Run code, parse output, compare. Use for: performance benchmarks, test coverage, file size, build time.

Stochastic — an LLM judges N samples against K binary (YES/NO) criteria. Use for: prompt quality, documentation clarity, content optimization — anything where output varies between runs.

Documentation

Doc What's in it
Overview Motivation, lineage, and the core idea
Eval Guide Writing good binary evaluation criteria
Recipes Copy-paste registration commands for common targets
Use Cases Where anneal works, where it doesn't, and why
Features Search strategies, statistical methods, knowledge system
Architecture Module map and design principles
System Design Full technical design document
CI Integration GitHub Actions workflow and status JSON output

Testing

uv run pytest tests/ -x -q          # 820 tests
uv run pytest tests/ --cov=anneal    # With coverage

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

anneal_cli-0.3.0.tar.gz (226.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

anneal_cli-0.3.0-py3-none-any.whl (138.9 kB view details)

Uploaded Python 3

File details

Details for the file anneal_cli-0.3.0.tar.gz.

File metadata

  • Download URL: anneal_cli-0.3.0.tar.gz
  • Upload date:
  • Size: 226.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.14

File hashes

Hashes for anneal_cli-0.3.0.tar.gz
Algorithm Hash digest
SHA256 12de0c171cf16545a9b1c287667e6cc342263a1247e9b828bf154438ddcb833d
MD5 fa6fc2cd496c9b185b599ff283123a51
BLAKE2b-256 392a339327dd18040c28e465f3b8f9289dfacf9eb255989d1cb120b0a12162ea

See more details on using hashes here.

File details

Details for the file anneal_cli-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: anneal_cli-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 138.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.14

File hashes

Hashes for anneal_cli-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 50e0366e20b976d17cc98b1faffe365b8acfe772aefa6ca23eab3d2913ff3ca0
MD5 4df8ee14f1698ef011f41164e290a8b6
BLAKE2b-256 380b4ff0bd1d09c18665f0de2fd90a682b53c00c41e84a5161d4be74bb3fb868

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page