Recursive Evolutionary Program Search — LLM-driven evolutionary code search

REPS

A self-improving evolutionary code search agent that reflects, diversifies, and steers.

Python 3.12

REPS evolves programs with an LLM-driven loop that reflects between batches, balances explorer/exploiter workers, detects convergence, and steers compute by distance to a known target.

Result: Circle Packing n=26

System	sum_radii	Iterations	Model
Prior SOTA	2.634	—	—
OpenEvolve (shipped best)	2.6342924	470	gemini-2.0-flash + claude-3.7-sonnet
AlphaEvolve (paper)	2.6358628	—	Gemini 2.0 Pro
FICO Xpress Solver	2.6359155	—	—
REPS	2.6359831	100	claude-sonnet-4.6

The REPS result was verified against DeepMind's official validator.

REPS Circle Packing

What REPS does

Adaptive selection — selection_strategy="map_elites" | "pareto" | "mixed" with pareto_fraction for blending MAP-Elites bins and per-instance Pareto fronts. (reps/api/optimizer.py:64, GEPA Phase 2)
Trace reflection — trace_reflection=True: the reflection LLM sees per-instance scores + feedback from the parent's failures, not just aggregate scores. (reps/api/optimizer.py:66, GEPA Phase 3)
Ancestry-aware reflection — lineage_depth=N: extends reflection with the last N parents in a candidate's chain. (reps/api/optimizer.py:67, GEPA Phase 5)
System-aware merge — merge=True: candidates from different islands recombine via an LLM-driven merge prompt that targets disjoint instance dimensions. (reps/api/optimizer.py:68, GEPA Phase 4)
Convergence + SOTA steering — built-in convergence monitor (edit entropy + strategy divergence) and gap-aware compute steering when a target score is set. On by default.
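
As a rough illustration of what a per-instance Pareto front means for the "pareto" strategy listed above, here is a minimal sketch. It is not taken from the REPS source: the function name pareto_front and the dict-of-score-lists layout are assumptions made for this example. A candidate stays on the front when no other candidate is at least as good on every instance and strictly better on at least one.

```python
from typing import Dict, List

# Hypothetical layout: candidate id -> per-instance scores (higher is better).
Scores = Dict[str, List[float]]


def pareto_front(scores: Scores) -> List[str]:
    """Return the ids of candidates that no other candidate dominates."""

    def dominates(a: List[float], b: List[float]) -> bool:
        # a dominates b: no worse on every instance, strictly better on one.
        return all(x >= y for x, y in zip(a, b)) and any(
            x > y for x, y in zip(a, b)
        )

    return [
        cid
        for cid, s in scores.items()
        if not any(
            dominates(other, s) for oid, other in scores.items() if oid != cid
        )
    ]


candidates = {
    "a": [0.9, 0.2, 0.5],
    "b": [0.4, 0.8, 0.5],
    "c": [0.3, 0.1, 0.4],  # dominated by both a and b
}
front = pareto_front(candidates)
print(front)  # ['a', 'b']
```

Candidates "a" and "b" both survive because each wins on a different instance, which is the kind of diversity an aggregate score alone would hide.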

Status: pre-1.0

REPS is pre-1.0. The Python API (docs/python_api_spec.md) shipped recently and may still evolve. Per docs/release_spec.md, minor version bumps (0.1 → 0.2) may include breaking changes during the pre-1.0 era. Pin to a specific minor version (e.g. reps-py==0.1.*) if you need stability across upgrades. Strict semver applies once REPS reaches 1.0.0.

Install

Requires Python 3.12+ and uv. To install from source:

git clone https://github.com/zkhorozianbc/reps.git
cd reps
uv venv .venv --python 3.12
uv pip install -e .

Alternatively, install from PyPI with pip install reps-py. Optional extras: [dspy] (the dspy_react worker), [benchmarks] (scipy + matplotlib for the bundled circle-packing benchmark).

Set the API key matching your model's provider:

export ANTHROPIC_API_KEY=sk-ant-...      # provider: anthropic
export OPENROUTER_API_KEY=sk-or-...      # provider: openrouter
export OPENAI_API_KEY=sk-...             # provider: openai

A sibling .env file is auto-loaded.
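
A small pre-flight check can catch a missing key before a run starts. The sketch below is not part of REPS; the helper name missing_api_key is hypothetical, and only the provider-to-variable mapping comes from the export lines above.

```python
import os
from typing import Mapping, Optional

# Provider -> environment variable, as listed in the export lines above.
PROVIDER_ENV_VARS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def missing_api_key(
    provider: str, env: Optional[Mapping[str, str]] = None
) -> Optional[str]:
    """Return the name of the unset variable for `provider`, or None if set."""
    env = os.environ if env is None else env
    var = PROVIDER_ENV_VARS[provider]
    return None if env.get(var) else var


missing = missing_api_key("anthropic")
if missing:
    print(f"Set {missing} before running REPS")
```

This check looks only at the process environment, so a key supplied solely through a .env file would be reported as missing here.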

Quick start (Python)

REPS is a Python library. Pass a seed program string and an evaluator callable; get back the best evolved program.

from pathlib import Path

import reps

def evaluate(code: str) -> float:
    # Run the candidate in a fresh namespace and return its score. Higher is better.
    namespace = {}
    exec(code, namespace)
    return float(namespace["solve"]())

result = reps.Optimizer(
    model="anthropic/claude-sonnet-4.6",   # api_key from $ANTHROPIC_API_KEY
    max_iterations=20,
).optimize(
    initial=Path("seed.py").read_text(),   # seed program text; must define solve()
    evaluate=evaluate,
)

print(result.best_score)
print(result.best_code)
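
The quick start assumes that seed.py defines a solve() function. The sketch below shows what such a seed might contain, together with an evaluator that does not crash on broken candidates. The seed contents and the choice to score failures as 0.0 are assumptions for illustration, not REPS requirements.

```python
# Hypothetical contents of seed.py, kept inline so the example is self-contained.
SEED = '''
def solve():
    # Toy objective: return a number for the evaluator to score.
    return sum(x * x for x in range(4))
'''


def evaluate(code: str) -> float:
    """Run a candidate in a fresh namespace; score failures as 0.0 (assumption)."""
    namespace: dict = {}
    try:
        exec(code, namespace)
        return float(namespace["solve"]())
    except Exception:
        # Syntax errors, a missing solve(), or non-numeric results land here.
        return 0.0


print(evaluate(SEED))                # 14.0
print(evaluate("def solve(: pass"))  # 0.0
```

Because exec runs candidate code in the current process, an evaluator for untrusted or long-running candidates may be better off launching a subprocess with a timeout.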

What's an evaluator?

An evaluator is any Callable[[str], float | dict | reps.EvaluationResult]. REPS calls it with the candidate program text and uses the returned score to drive selection. Return a float for a quick start, a dict with combined_score and optional per_instance_scores / feedback for richer signal, or a reps.EvaluationResult to unlock the per-objective Pareto + trace-reflection paths described in docs/python_api_spec.md.

def eval_simple(code: str) -> float:
    return 1.0

def eval_dict(code: str) -> dict:
    return {"combined_score": 0.9, "feedback": "..."}

def eval_full(code: str) -> reps.EvaluationResult: ...
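
A dict-returning evaluator might look like the sketch below. The keys combined_score, per_instance_scores, and feedback come from the description above; the test cases, the solve(x) signature, and the assumption that per_instance_scores maps instance names to floats are illustrative only. Check docs/python_api_spec.md for the exact shape REPS expects.

```python
from typing import Dict, Tuple

# Hypothetical test instances: name -> (input, expected output).
CASES: Dict[str, Tuple[int, int]] = {
    "small": (2, 4),
    "medium": (5, 25),
    "negative": (-3, 9),
}


def evaluate(code: str) -> dict:
    """Score a candidate per instance and explain each failure."""
    namespace: dict = {}
    exec(code, namespace)
    solve = namespace["solve"]

    per_instance: Dict[str, float] = {}
    failures = []
    for name, (x, expected) in CASES.items():
        ok = solve(x) == expected
        per_instance[name] = 1.0 if ok else 0.0
        if not ok:
            failures.append(f"{name}: solve({x}) != {expected}")

    return {
        "combined_score": sum(per_instance.values()) / len(per_instance),
        "per_instance_scores": per_instance,
        "feedback": "; ".join(failures) or "all instances passed",
    }


# A candidate that squares positives correctly but gets negatives wrong.
result = evaluate("def solve(x):\n    return x * abs(x)")
print(result["feedback"])  # negative: solve(-3) != 9
```

The per-instance breakdown and feedback string are the kind of signal that trace reflection is described as consuming, as opposed to the single aggregate number.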

GEPA-style features (constructor knobs)

Kwarg	Effect	Default
`selection_strategy`	`"map_elites"` (REPS classic), `"pareto"` (GEPA-style frontier), or `"mixed"`	`"map_elites"`
`pareto_fraction`	Blend ratio when `selection_strategy="mixed"`	`0.0`
`trace_reflection`	Reflection sees per-instance scores + feedback, not aggregates	`False`
`lineage_depth`	How many ancestors the reflection prompt sees	`3`
`merge`	Enable LLM-driven cross-island merge	`False`
`num_islands`	Population islands for diversity	`5`
`max_iterations`	Search budget	`100`
`output_dir`	Persist run artifacts; `None` ⇒ tempdir	`None`

Full surface (escape hatches, model knobs, deferred kwargs) in docs/python_api_spec.md.

Reusing a Model

Most users pass a model-name string to Optimizer(model=...). Build a reps.Model directly when you want to call the model outside the optimizer or share one configured client across multiple runs.

import reps

model = reps.Model("anthropic/claude-sonnet-4.6", temperature=0.7)
print(model("hello"))                                    # standalone use

# Share one Model across multiple optimizers
o1 = reps.Optimizer(model=model, max_iterations=20)
o2 = reps.Optimizer(model=model, max_iterations=50, merge=True)

Power-user: CLI / YAML

For batch experiments, reproducible sweeps, or YAML-driven configuration, REPS ships a CLI: reps-run --config <yaml>. The Python API above is built on the same engine, so anything achievable via YAML is achievable via Optimizer(...) plus Optimizer.from_config(cfg).

Run

Everything lives in the YAML — point reps-run at a config and go:

reps-run --config experiment/configs/circle_sonnet_reps.yaml

Results land in experiment/results/<config-stem>/run_NNN/ (auto-versioned). The best program is saved as best_program.py, and per-iteration metrics are written under metrics/.
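
To pick up the most recent result programmatically, something like the following sketch could work. It is not a REPS utility: the helper name latest_best_program is hypothetical, and it assumes that NNN in run_NNN is a plain integer suffix.

```python
from pathlib import Path
from typing import Optional


def latest_best_program(results_root: str, config_stem: str) -> Optional[Path]:
    """Return best_program.py from the highest-numbered run_NNN directory."""
    base = Path(results_root) / config_stem
    runs = sorted(
        (p for p in base.glob("run_*") if p.is_dir() and p.name[4:].isdigit()),
        key=lambda p: int(p.name[4:]),
    )
    # Walk from newest to oldest; a run that crashed may have no best program.
    for run in reversed(runs):
        candidate = run / "best_program.py"
        if candidate.exists():
            return candidate
    return None


best = latest_best_program("experiment/results", "circle_sonnet_reps")
print(best if best else "no completed runs found")
```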

Common overrides:

reps-run --config <yaml> --iterations 50 --output my_runs/
reps-run --config <yaml> -o llm.temperature=0.9 -o reps.batch_size=10
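
Conceptually, each -o key=value override sets one leaf in the nested config. The sketch below illustrates that idea on a plain dict; it is not the parser reps-run uses, and the function name apply_override and the starting config values are assumptions.

```python
import ast
from typing import Any, Dict


def apply_override(config: Dict[str, Any], assignment: str) -> None:
    """Apply one 'a.b.c=value' override to a nested dict, in place."""
    dotted_key, _, raw_value = assignment.partition("=")
    *parents, leaf = dotted_key.split(".")
    node = config
    for part in parents:
        node = node.setdefault(part, {})
    try:
        # Interpret numbers, booleans, and similar literals.
        node[leaf] = ast.literal_eval(raw_value)
    except (ValueError, SyntaxError):
        # Anything that is not a Python literal is kept as a string.
        node[leaf] = raw_value


# Hypothetical starting values; the real defaults live in reps/config.py.
config = {"llm": {"temperature": 0.7}, "reps": {"batch_size": 5}}
for item in ["llm.temperature=0.9", "reps.batch_size=10"]:
    apply_override(config, item)
print(config)
```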

The config decides everything else — model, workers, harness (reps or openevolve), and which benchmark to evolve (via task:).

Add a benchmark

Drop two files into experiment/benchmarks/<name>/:

experiment/benchmarks/<name>/
├── initial_program.py    # seed code (wrap evolvable region in EVOLVE-BLOCK markers)
└── evaluator.py          # defines evaluate(program_path) -> {"combined_score": float, ...}

initial_program.py:

# EVOLVE-BLOCK-START
def solve():
    naive_result = 0.0  # placeholder baseline; the search is expected to rewrite this block
    return naive_result
# EVOLVE-BLOCK-END

evaluator.py:

def evaluate(program_path):
    # import program_path, run it, score it
    return {"combined_score": score}
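
One way to fill in the evaluator skeleton above is to load the candidate file with importlib and call its solve() function. This is a sketch under assumptions: the module name candidate_program, the extra "error" key, and scoring a crashing candidate as 0.0 are choices made for this example, not documented REPS behavior.

```python
import importlib.util


def evaluate(program_path):
    """Load the candidate file as a module, call solve(), report its score."""
    spec = importlib.util.spec_from_file_location("candidate_program", program_path)
    module = importlib.util.module_from_spec(spec)
    try:
        spec.loader.exec_module(module)
        score = float(module.solve())
    except Exception as exc:
        # Assumption: a crashing candidate scores 0.0 and reports why.
        return {"combined_score": 0.0, "error": repr(exc)}
    return {"combined_score": score}
```

Returning a low score instead of raising keeps a single bad candidate from aborting the whole run, at the cost of hiding evaluator bugs, so logging the error somewhere visible is worth considering.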

Optional files in the same directory:

system_prompt.md — task-specific system prompt (auto-loaded)
visualize.py — visualize_from_program(path, save_path) for best-program plots

Then point a config at it:

task: ../benchmarks/<name>     # resolved relative to this YAML
max_iterations: 100
provider: anthropic
# ... see experiment/configs/circle_sonnet_reps.yaml for a full example

Run it: reps-run --config experiment/configs/<your_config>.yaml.

For cascade evaluation, also define evaluate_stage1 / evaluate_stage2. If the primary objective metric isn't combined_score, set reps.sota.target_metric to that metric's name so SOTA steering compares the right value.
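
A cascade typically puts a cheap check in front of an expensive one. The sketch below shows one possible split. Only the names evaluate_stage1 and evaluate_stage2 come from the text above; the signatures (mirroring evaluate(program_path)), the return shape, and the choice of a parse-only first stage are assumptions. How REPS decides whether a candidate advances from stage 1 to stage 2 is not covered here.

```python
import ast
import importlib.util


def evaluate_stage1(program_path):
    """Cheap gate: does the file parse and define a top-level solve()?"""
    with open(program_path) as f:
        source = f.read()
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return {"combined_score": 0.0}
    has_solve = any(
        isinstance(node, ast.FunctionDef) and node.name == "solve"
        for node in tree.body
    )
    return {"combined_score": 1.0 if has_solve else 0.0}


def evaluate_stage2(program_path):
    """Expensive stage: import the candidate and actually run it."""
    spec = importlib.util.spec_from_file_location("candidate_program", program_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return {"combined_score": float(module.solve())}
```

The first stage never executes candidate code, so it stays fast and safe even for programs that would hang or crash in the second stage.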

Configs

Reference configs in experiment/configs/:

circle_sonnet_reps.yaml, circle_opus47_anthropic.yaml, reps_full.yaml — full REPS runs
verify_*.yaml — minimal smoke tests, one per worker impl
circle_base.yaml, circle_sonnet_base.yaml — harness: openevolve baselines (uv pip install openevolve)

reps/config.py is the source of truth for every field and default.

Tests

uv run python -m pytest tests/

Design docs

docs/python_api_spec.md — the v1 Python API contract: every public class, kwarg, return shape, with file:line references into the implementation.
docs/gepa_implementation_plan.md — phase-by-phase rollout plan for the GEPA-inspired features (Pareto selection, trace reflection, merge, ancestry-aware reflection).
docs/optimizer_engine_separation_spec.md — internal refactor splitting the public Optimizer facade from the runtime engine.

Acknowledgements

Forked from OpenEvolve; now self-contained.

Release history

0.2.0

May 12, 2026

This version

0.1.0

May 12, 2026

