Domain-agnostic autonomous optimization framework

anneal

Let an AI agent improve your code, prompts, and configs — overnight, unattended.

Point anneal at any text file in a git repo, tell it how to measure "better," and walk away. The agent generates hypotheses, runs experiments, keeps winners, discards losers, and compounds learnings — all while you sleep.

How It Works

1. Register a target: the file to improve, how to score it, and what's off-limits.
2. Run the loop: the agent mutates → evaluates → keeps or reverts → learns → repeats.
3. Review: every experiment is a git commit, so you can check the history, scores, and cost.

anneal register \
  --name my-target \
  --artifact path/to/file.py \
  --eval-mode deterministic \
  --run-cmd "python benchmark.py" \
  --parse-cmd "grep 'score' | awk '{print \$2}'" \
  --direction maximize \
  --scope scope.yaml

anneal run --target my-target --experiments 20
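
The loop described above can be pictured as a greedy keep-or-revert search. The following is a minimal sketch for illustration only, not anneal's actual implementation: `optimize`, `mutate`, and `score` are hypothetical names, and the sketch leaves out the learning step and the git commit made for each experiment.

```python
from typing import Callable


def optimize(
    artifact: str,
    mutate: Callable[[str], str],
    score: Callable[[str], float],
    experiments: int,
    maximize: bool = True,
) -> tuple[str, float, list[tuple[int, float, bool]]]:
    """Greedy keep-or-revert search (hypothetical sketch, not anneal's code)."""
    best, best_score = artifact, score(artifact)
    history: list[tuple[int, float, bool]] = []
    for i in range(experiments):
        candidate = mutate(best)  # propose a change to the current best
        candidate_score = score(candidate)
        if maximize:
            improved = candidate_score > best_score
        else:
            improved = candidate_score < best_score
        if improved:
            best, best_score = candidate, candidate_score  # keep the winner
        # A candidate that did not improve is simply dropped ("reverted").
        history.append((i, candidate_score, improved))
    return best, best_score, history


# Toy usage: minimize length by dropping one trailing space per experiment.
best, best_score, history = optimize(
    "x = 1   ",
    mutate=lambda s: s[:-1] if s.endswith(" ") else s,
    score=len,
    experiments=5,
    maximize=False,
)
```

In the toy run, the first three experiments each remove a space and are kept; the last two change nothing and are discarded.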

Install

uv tool install anneal-cli

# Or with pip
pip install anneal-cli

Requires Python 3.12+.

Examples

Prompt Optimization — stochastic eval

Improve an article summarizer prompt. The agent rewrites system_prompt.md, generates summaries from 5 test articles, and an LLM judge scores each against 4 binary criteria (key points captured? concise? plain language? factually accurate?).

anneal register \
  --name prompt-optimizer \
  --artifact examples/prompt-optimizer/system_prompt.md \
  --eval-mode stochastic \
  --criteria examples/prompt-optimizer/eval_criteria.toml \
  --direction maximize \
  --scope examples/prompt-optimizer/scope.yaml

anneal run --target prompt-optimizer --experiments 10

Test Coverage — deterministic eval, maximize

Improve pytest test coverage of a Python module. The agent adds tests to cover untested code paths. pytest --cov provides the score.

anneal register \
  --name test-coverage \
  --artifact examples/test-coverage/tests/test_calculator.py \
  --eval-mode deterministic \
  --run-cmd "bash examples/test-coverage/eval.sh" \
  --parse-cmd "cat" \
  --direction maximize \
  --scope examples/test-coverage/scope.yaml

anneal run --target test-coverage --experiments 10
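
The contents of `eval.sh` are not shown here. As a rough illustration of how a coverage report could be reduced to a single number, the sketch below pulls the percentage from the `TOTAL` line of a pytest-cov terminal report. The function name `total_coverage` and the sample report are assumptions made for this example.

```python
import re


def total_coverage(report: str) -> float:
    """Return the percentage on the TOTAL line of a coverage report.

    Hypothetical helper: the real eval.sh may compute the score differently.
    """
    for line in report.splitlines():
        match = re.match(r"^TOTAL\s+.*?(\d+(?:\.\d+)?)%\s*$", line)
        if match:
            return float(match.group(1))
    raise ValueError("no TOTAL line found in coverage report")


# Made-up sample report in the layout pytest-cov prints to the terminal.
sample_report = """\
Name            Stmts   Miss  Cover
-----------------------------------
calculator.py      20      3    85%
-----------------------------------
TOTAL              20      3    85%
"""
coverage = total_coverage(sample_report)
```

A script like this would print the number on its own line, which is why a pass-through `--parse-cmd "cat"` is enough in the registration above.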

Code Golf — deterministic eval, minimize

Shrink a verbose Python file while preserving byte-identical output. In a test run: 3,592 → 228 characters (93.7% reduction) in 7 experiments.

anneal register \
  --name code-golf \
  --artifact examples/code-golf/app.py \
  --eval-mode deterministic \
  --run-cmd "bash examples/code-golf/eval.sh" \
  --parse-cmd "cat" \
  --direction minimize \
  --scope examples/code-golf/scope.yaml

anneal run --target code-golf --experiments 10
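
A scorer for this kind of target has to do two things: reject any candidate whose output changes, and otherwise report its size. The sketch below shows one way that could look. It is an assumption, not the contents of `eval.sh`: `golf_score`, `reduction_percent`, and the infinite penalty for a mismatch are all hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def golf_score(source: Path, expected_stdout: bytes,
               penalty: float = float("inf")) -> float:
    """Character count of `source`, or `penalty` if its output differs.

    Hypothetical scorer: lower is better, so use --direction minimize.
    """
    result = subprocess.run([sys.executable, str(source)], capture_output=True)
    if result.returncode != 0 or result.stdout != expected_stdout:
        return penalty  # output is no longer byte-identical
    return float(len(source.read_text(encoding="utf-8")))


def reduction_percent(before: int, after: int) -> float:
    """Percentage reduction from `before` to `after`, to one decimal place."""
    return round(100 * (before - after) / before, 1)


# The test-run figures quoted above: 3,592 characters down to 228.
reduction = reduction_percent(3592, 228)
```

Returning a penalty instead of raising keeps the score numeric, so a broken candidate simply loses the comparison and is reverted.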

Local Artifacts (no git tracking required)

Artifact files don't need to be committed to git. If they're untracked, anneal copies them into the worktree automatically during registration. For files you don't want in version control at all, use --in-place to skip worktree isolation entirely:

anneal register \
  --name local-skill \
  --artifact SKILL.md \
  --eval-mode stochastic \
  --criteria eval_criteria.toml \
  --direction maximize \
  --scope scope.yaml \
  --in-place

Two Eval Modes

Deterministic — a shell command produces a number. Run code, parse output, compare. Use for: performance benchmarks, test coverage, file size, build time.
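
A minimal sketch of the deterministic path, assuming the output of `--run-cmd` is piped into `--parse-cmd` and the result is read as a float. The examples above suggest this, but it is not confirmed here. `deterministic_score` and `is_better` are hypothetical names, not part of anneal's API.

```python
import subprocess


def deterministic_score(run_cmd: str, parse_cmd: str) -> float:
    """Run `run_cmd`, feed its stdout to `parse_cmd`, and parse a number.

    Assumed behavior for illustration; anneal's internals may differ.
    """
    run = subprocess.run(
        run_cmd, shell=True, capture_output=True, text=True, check=True
    )
    parsed = subprocess.run(
        parse_cmd, shell=True, input=run.stdout,
        capture_output=True, text=True, check=True,
    )
    return float(parsed.stdout.strip())


def is_better(new: float, old: float, direction: str) -> bool:
    """Compare two scores according to --direction (maximize or minimize)."""
    if direction == "maximize":
        return new > old
    if direction == "minimize":
        return new < old
    raise ValueError(f"unknown direction: {direction!r}")
```

For example, `deterministic_score("python benchmark.py", "cat")` would work for a benchmark that prints nothing but the score.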

Stochastic — an LLM judges N samples against K binary (YES/NO) criteria. Use for: prompt quality, documentation clarity, content optimization — anything where output varies between runs.
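
One plausible way to turn N × K binary verdicts into a single score is the overall pass rate. This is an assumption for illustration: how anneal actually aggregates the verdicts is not described here. `pass_rate` is a hypothetical helper and the verdicts below are made up.

```python
def pass_rate(verdicts: list[list[bool]]) -> float:
    """Fraction of YES verdicts over all samples and criteria.

    `verdicts[i][j]` is True when sample i passed criterion j.
    Hypothetical aggregation; anneal's scoring may differ.
    """
    total = sum(len(row) for row in verdicts)
    if total == 0:
        raise ValueError("no verdicts to aggregate")
    passed = sum(sum(row) for row in verdicts)
    return passed / total


# Five samples judged against four criteria (made-up verdicts).
verdicts = [
    [True, True, True, True],
    [True, False, True, True],
    [True, True, False, True],
    [True, True, True, False],
    [False, True, True, False],
]
score = pass_rate(verdicts)  # 15 YES out of 20 verdicts
```

Because the judge's verdicts can vary between runs, a score like this is noisy, and a small difference between two candidates may not be meaningful.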

Documentation

Doc	What's in it
Overview	Motivation, lineage, and the core idea
Eval Guide	Writing good binary evaluation criteria
Recipes	Copy-paste registration commands for common targets
Use Cases	Where anneal works, where it doesn't, and why
Features	Search strategies, statistical methods, knowledge system
Architecture	Module map and design principles
System Design	Full technical design document
CI Integration	GitHub Actions workflow and status JSON output

Testing

uv run pytest tests/ -x -q          # 820 tests
uv run pytest tests/ --cov=anneal    # With coverage

License

MIT

Project details

Release history

Version	Released
0.4.0	Apr 1, 2026
0.3.0	Apr 1, 2026 (this version)
0.2.0	Mar 24, 2026
0.1.1	Mar 22, 2026
0.1.0	Mar 22, 2026

Download files

Source Distribution

anneal_cli-0.3.0.tar.gz (226.0 kB)

Uploaded Apr 1, 2026 Source

Built Distribution

anneal_cli-0.3.0-py3-none-any.whl (138.9 kB)

Uploaded Apr 1, 2026 Python 3

File details

Details for the file anneal_cli-0.3.0.tar.gz.

File metadata

Download URL: anneal_cli-0.3.0.tar.gz
Upload date: Apr 1, 2026
Size: 226.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.6.14

File hashes

Hashes for anneal_cli-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`12de0c171cf16545a9b1c287667e6cc342263a1247e9b828bf154438ddcb833d`
MD5	`fa6fc2cd496c9b185b599ff283123a51`
BLAKE2b-256	`392a339327dd18040c28e465f3b8f9289dfacf9eb255989d1cb120b0a12162ea`

File details

Details for the file anneal_cli-0.3.0-py3-none-any.whl.

File metadata

Download URL: anneal_cli-0.3.0-py3-none-any.whl
Upload date: Apr 1, 2026
Size: 138.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.6.14

File hashes

Hashes for anneal_cli-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`50e0366e20b976d17cc98b1faffe365b8acfe772aefa6ca23eab3d2913ff3ca0`
MD5	`4df8ee14f1698ef011f41164e290a8b6`
BLAKE2b-256	`380b4ff0bd1d09c18665f0de2fd90a682b53c00c41e84a5161d4be74bb3fb868`

anneal-cli 0.3.0
