
A-Evolve 🧬: The Universal Infrastructure for Self-Improving Agents

GitHub stars · License: MIT · Python 3.11+ · arXiv

The PyTorch for Agentic AI. A-Evolve is an open-source infrastructure that evolves any agent, across any domain, using any evolution algorithm, with zero human intervention.

Quick Start | News | Benchmark Highlights | Architecture & Design | Contribution

A-Evolve Teaser


What Does A-Evolve Do?

You provide a Base Agent. A-Evolve returns a SOTA Agent. 3 lines of code. 0 hours of manual harness engineering. One infra, any domain, any evolution algorithm.

import agent_evolve as ae

evolver = ae.Evolver(agent="./my_agent", benchmark="swe-verified")
results = evolver.run(cycles=10)

Benchmark Highlights

By applying our open-source reference evolution algorithms to a base Claude Opus-4.6 model with zero manual harness engineering, A-Evolve pushed agents into top-tier performance across four diverse benchmarks:

Benchmark | Rank | Evolved Score (vs. baseline)
🟢 MCP-Atlas | 🥇 #1 | 79.4% (+3.4pp)
🔵 SWE-bench Verified | ~#5 | 76.8% (+2.6pp)
🟣 Terminal-Bench 2.0 | ~#7 | 76.5% (+13.0pp)
🟡 SkillsBench | #2 | 34.9% (+15.2pp)

A-Evolve Benchmarks

All results achieved with a single Claude Opus-4.6 base model, evolved using A-Evolve's sample algorithms. 0 hours of human harness engineering. Data checked March 2026.

News

  • 03/30 Integration: A-Evolve is officially integrated into AutoResearchClaw.
  • 03/25 🚀 Open-sourced A-Evolve, the universal infrastructure for developing and testing evolution algorithms.
  • 03/25 📊 Open-sourced 4 evolution algorithms developed with A-Evolve, achieving SOTA (#1, ~#5, ~#7, #2) on MCP-Atlas, SWE-bench Verified, Terminal-Bench 2.0, and SkillsBench.
  • 02/17 📄 Released the official implementation of Position: Agentic Evolution is the Path to Evolving LLMs (arXiv 2602.00359).

We are evolving fast! Support our research by leaving a ⭐.

What Does an Evolved Agent Look Like?

A-Evolve mutates real files in the workspace. Here's a before/after from our MCP-Atlas evolution:

Before (Seed Workspace):

mcp_agent/
├── manifest.yaml
├── prompts/system.md      ← 20 lines, generic
├── skills/                ← empty
└── memory/                ← empty

After (Evolved, 79.4% on MCP-Atlas):

mcp_agent/
├── manifest.yaml
├── prompts/system.md      ← 20 lines, unchanged
├── skills/
│   ├── entity-verification/SKILL.md   ← NEW
│   ├── search-iteration/SKILL.md      ← NEW
│   ├── multi-requirement/SKILL.md     ← NEW
│   ├── code-execution/SKILL.md        ← NEW
│   └── conditional-handler/SKILL.md   ← NEW
└── memory/
    └── episodic.jsonl     ← 6 entries

5 targeted skills outperformed 10 generic ones. Every mutation is git-tagged (evo-1, evo-2, …) for full reproducibility.
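Because the audit trail is plain git, standard git commands are enough to inspect it. The sketch below builds a throwaway repo standing in for an agent workspace (the file names are illustrative; only the evo-N tag convention comes from the text):

```shell
# Stand-in workspace: a throwaway git repo with two tagged "mutations".
tmp=$(mktemp -d) && cd "$tmp"
git init -q .
echo "v1 prompt" > system.md
git add . && git -c user.email=demo@example.com -c user.name=demo commit -qm "seed"
git tag evo-1
echo "v2 prompt" > system.md
git add . && git -c user.email=demo@example.com -c user.name=demo commit -qm "mutation"
git tag evo-2

git tag -l 'evo-*'                  # list accepted mutations
git diff evo-1 evo-2 -- system.md   # inspect what a mutation changed
```

The same pattern (`git diff evo-1 evo-2`, `git checkout evo-1`) works for comparing or restoring any two accepted generations.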


Quick Start

1. Install

# PyPI (recommended)
pip install a-evolve              # core
pip install a-evolve[anthropic]   # Claude support
pip install a-evolve[mcp]         # MCP-Atlas benchmark
pip install a-evolve[swe]         # SWE-bench benchmark
pip install a-evolve[all]         # everything

# From source (for development)
git clone https://github.com/A-EVO-Lab/a-evolve.git && cd a-evolve
pip install -e ".[all,dev]"

2. Evolve โ€” 3 Lines of Code

import agent_evolve as ae

evolver = ae.Evolver(
    agent="swe-verified",           # built-in seed workspace (or path to yours)
    benchmark="swe-verified",       # built-in benchmark adapter
)
results = evolver.run(cycles=10)

print(f"Final score: {results.final_score:.3f}")
print(f"Converged:   {results.converged}")

A-Evolve ships with built-in seed workspaces (swe, mcp, terminal, skillbench) and benchmark adapters (swe-verified, mcp-atlas, terminal-bench, skill-bench). Point agent= at any of them, or at your own workspace directory.

3. Bring Your Own Agent (BYOA)

To make any agent evolvable, implement one method, solve():

from agent_evolve.protocol.base_agent import BaseAgent
from agent_evolve.types import Task, Trajectory

class MyAgent(BaseAgent):
    def solve(self, task: Task) -> Trajectory:
        # Run your agent on the task and return its trajectory.
        return Trajectory(task_id=task.id, output="result")

Then evolve it:

evolver = ae.Evolver(agent=MyAgent("./my_workspace"), benchmark="mcp-atlas")
results = evolver.run(cycles=10)

Your agent's evolvable state (prompts, skills, memory) lives as a standard directory, the Agent Workspace. A-Evolve mutates these files; your agent reloads. See Architecture & Design for the full picture.

For benchmark-specific walkthroughs, see SWE-bench Demo Guide, MCP-Atlas Demo Guide, and SkillBench Setup Guide.


Architecture & Design

A-Evolve Framework

The Agent Workspace: A File System Contract

A-Evolve's core insight: all evolvable agent state lives on the file system as a standard directory structure. This lets the evolution engine mutate any agent via LLM-driven file operations without knowing the agent's internals.

my_agent/
├── manifest.yaml          # identity, entrypoint, evolvable layers
├── prompts/system.md      # system prompt
├── skills/                # SKILL.md files (dynamic skill library)
├── tools/                 # tool configurations
└── memory/                # episodic + semantic memory (JSONL)

The evolution engine reads these files, analyzes performance logs, and writes mutations back. The agent reloads. That's the entire contract.
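For concreteness, a manifest.yaml under this contract might look like the sketch below; every field name here is an assumption for illustration, not a documented schema:

```yaml
# Hypothetical manifest; field names are illustrative only
name: my_agent
entrypoint: agent.py:MyAgent
evolvable:          # layers the engine may mutate
  - prompts/
  - skills/
  - memory/
frozen:             # files the engine must leave untouched
  - manifest.yaml
```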

The Evolution Loop

Every cycle follows five phases:

┌─────────┐    ┌─────────┐    ┌─────────┐    ┌──────┐    ┌────────┐
│  Solve  │───▶│ Observe │───▶│ Evolve  │───▶│ Gate │───▶│ Reload │
└─────────┘    └─────────┘    └─────────┘    └──────┘    └────────┘

  1. Solve: Agent processes a batch of tasks (black-box execution).
  2. Observe: Collect trajectories + benchmark feedback into structured logs.
  3. Evolve: Evolution engine analyzes observations and mutates workspace files (prompts, skills, memory).
  4. Gate: Validate mutations on holdout tasks. Regressed mutations are rolled back via git.
  5. Reload: Agent reloads from the (possibly rolled-back) workspace.

The loop converges when EGL (Evolutionary Generality Loss) stabilizes or max_cycles is reached. Every accepted mutation is git-tagged (evo-1, evo-2, …), providing a full audit trail.
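The five phases can be sketched as a plain loop. Everything below is a toy stand-in for illustration: the real engine mutates workspace files and gates on an EGL signal, which this sketch reduces to numbers.

```python
# Toy model of the Solve -> Observe -> Evolve -> Gate -> Reload cycle.
# All helpers are illustrative stand-ins, not the A-Evolve API.

def solve(state, task):
    return state + task                  # 1. Solve: run the agent on one task

def observe(trajectories):
    return sum(trajectories)             # 2. Observe: aggregate feedback

def evolve(state, observations):
    return state + 1                     # 3. Evolve: propose a mutated state

def score(state, holdout):
    return state                         # stand-in for holdout evaluation

def run_evolution(state, tasks, holdout, max_cycles=10, patience=3):
    best, stale, history = score(state, holdout), 0, []
    for _ in range(max_cycles):
        trajectories = [solve(state, t) for t in tasks]
        observations = observe(trajectories)
        candidate = evolve(state, observations)
        new = score(candidate, holdout)
        if new >= best:                  # 4. Gate: accept only non-regressions
            state, best, stale = candidate, new, 0   # 5. Reload
        else:
            stale += 1                   # regression: keep old state (rollback)
        history.append(best)
        if stale >= patience:            # crude stand-in for EGL stabilizing
            break
    return best, history
```

In this toy model, `run_evolution(0, tasks=[1, 2], holdout=None, max_cycles=3)` returns `(3, [1, 2, 3])`: each cycle's candidate is accepted, and the best score grows by one per cycle.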

Built-in Adapters

A-Evolve ships with ready-to-use benchmark adapters and seed workspaces:

Adapter | Domain | Seed Workspace | Best Result
swe-verified | Real-world GitHub issues (Python repos) | seed_workspaces/swe/ | 76.8% (~#5)
mcp-atlas | Tool-calling via MCP (16+ servers) | seed_workspaces/mcp/ | 79.4% (🥇 #1)
terminal-bench | Terminal/CLI ops in Docker | seed_workspaces/terminal/ | 76.5% (~#7)
skill-bench | Agentic skill discovery | seed_workspaces/skillbench/ | 34.9% (#2)

Pluggability: Bring Your Own Everything

A-Evolve is a framework, not a standalone agent. Every axis is pluggable:

Axis | Interface | You Provide | Built-in Examples
Agent (BYOA) | BaseAgent.solve() | Any agent architecture (ReAct, Plan-and-Solve, custom) | SweAgent, McpAgent
Benchmark (BYOE) | BenchmarkAdapter.get_tasks() / .evaluate() | Any domain with a task + evaluation signal | SWE-bench, MCP-Atlas, Terminal-Bench 2.0, SkillsBench
Algorithm (BYO-Algo) | EvolutionEngine.step() | Any evolution strategy | AEvolveEngine (LLM-driven mutation)
LLM Provider | LLMProvider.complete() | Any model API | Anthropic, OpenAI, AWS Bedrock
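As an illustration of the last row, a provider plug-in might look like the sketch below. Only the name LLMProvider.complete() comes from the table; the base-class body, the echo provider, and its return shape are assumptions, so a self-contained stand-in is used instead of the real import:

```python
# Hedged sketch of a provider plug-in; only LLMProvider.complete()
# is named in the docs. Everything else here is illustrative.
class LLMProvider:
    """Stand-in for agent_evolve's provider base class."""
    def complete(self, prompt: str) -> str:
        raise NotImplementedError

class EchoProvider(LLMProvider):
    """Toy provider that deterministically echoes, useful for dry runs."""
    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"
```

A deterministic provider like this is handy for testing an evolution algorithm's plumbing without spending model tokens.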

Built-in Evolution Algorithms

A-Evolve ships with 4 reference evolution algorithms, each targeting different domains and strategies:

Algorithm | Strategy | Best For | Docs
adaptive_evolve | Per-claim feedback analysis + meta-learning | MCP-Atlas (🥇 #1, 79.4%) | Guide
adaptive_skill | LLM-driven workspace mutation with bash tool access | Terminal-Bench 2.0 (~#7, 76.5%) | Guide
skillforge | LLM-driven workspace mutation with EGL gating | SkillsBench (#2, 34.9%) | Guide
guided_synth | Memory-first evolution + LLM-guided intervention synthesis | General-purpose; SWE-bench (~#5, 76.8%) | Guide

Plugging in a custom evolution algorithm

Each algorithm lives in its own directory under algorithms/. Implement a single method:

from agent_evolve.engine.base import EvolutionEngine
from agent_evolve.types import StepResult

class MyEvolutionEngine(EvolutionEngine):
    def step(self, workspace, observations, history, trial) -> StepResult:
        # Analyze observations, mutate workspace files, optionally run trial tasks
        ...
        return StepResult(accepted=True, score=new_score)

Then pass it to the Evolver:

evolver = ae.Evolver(
    agent="swe-verified",
    benchmark="swe-verified",
    engine=MyEvolutionEngine(config),
)

The engine has full access to shared primitives: TrialRunner (on-demand validation), EvolutionHistory (observation + version queries), and VersionControl (git-based rollback). It is never forced to use them. Minimal contract, maximum freedom.


Community & Contributing

A-Evolve is built for the research community. We welcome contributions across every axis of the framework.

For Algorithm Researchers

If you work in LLM self-optimization, reinforcement learning, or agent architectures, implement the EvolutionEngine interface and your algorithm instantly gains access to:

  • Diverse environments (SWE-bench, MCP-Atlas, Terminal-Bench 2.0, SkillsBench, and more).
  • Standardized agent workspace representations.
  • Rigorous evaluation, gating, and logging infrastructure.

Drop your algorithm into agent_evolve/algorithms/your_algo/ and open a PR.

For Benchmark Authors

Implement BenchmarkAdapter to plug any new evaluation domain into A-Evolve. The interface is two methods: get_tasks() and evaluate().
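A minimal adapter under that two-method contract might look like this; the class and method names follow the text, while the Task shape and the toy scoring rule are assumptions, so a local stand-in is used instead of the real base class:

```python
# Hedged sketch of the BenchmarkAdapter contract: get_tasks() + evaluate().
# The Task fields and scoring rule are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Task:
    id: str
    prompt: str
    expected: str

class WordCountAdapter:
    """Toy benchmark: score 1.0 when the agent's output matches expected."""

    def get_tasks(self):
        # In a real adapter, load tasks from your dataset here.
        return [Task(id="t1", prompt="count the words in 'a b c'", expected="3")]

    def evaluate(self, task, output):
        # Return a scalar score the evolution engine can optimize.
        return 1.0 if output.strip() == task.expected else 0.0
```

Any domain that can produce tasks and a scalar evaluation signal fits this shape, which is why the adapter interface stays this small.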

Get Involved

  • โญ Star this repo to support our research โ€” we are evolving fast.
  • ๐Ÿ› Open an issue to report bugs or request features.
  • ๐Ÿ”€ Submit a PR โ€” new evolution algorithms, benchmark adapters, agent implementations, and documentation improvements are all welcome.
  • ๐Ÿ’ฌ Join our Discord to discuss research directions, share results, and collaborate.

Citation

If you use A-Evolve in your research, please cite our position paper:

@article{lin2026position,
  title={Position: Agentic Evolution is the Path to Evolving LLMs},
  author={Lin, Minhua and Lu, Hanqing and Shi, Zhan and He, Bing and Mao, Rui and Zhang, Zhiwei and Wu, Zongyu and Tang, Xianfeng and Liu, Hui and Dai, Zhenwei and others},
  journal={arXiv preprint arXiv:2602.00359},
  year={2026}
}

License

MIT
