Open-source implementation of AlphaEvolve

OpenEvolve

The most advanced open-source evolutionary coding agent

Turn your LLMs into autonomous code optimizers that discover breakthrough algorithms

From random search to state-of-the-art: Watch your code evolve in real-time


Why OpenEvolve?

Autonomous Discovery

LLMs don't just optimize; they discover entirely new algorithms. No human guidance needed.

Proven Results

2-3x speedups on real hardware. State-of-the-art circle packing. Breakthrough optimizations.

Research Grade

Full reproducibility, extensive evaluation pipelines, and scientific rigor built-in.

OpenEvolve vs Manual Optimization:

| Aspect | Manual Optimization | OpenEvolve |
|---|---|---|
| Time to Solution | Days to weeks | Hours |
| Exploration Breadth | Limited by human creativity | Unlimited LLM creativity |
| Reproducibility | Hard to replicate | Fully deterministic |
| Multi-objective | Complex tradeoffs | Automatic Pareto optimization |
| Scaling | Doesn't scale | Parallel evolution across islands |

Proven Achievements

| Domain | Achievement | Example |
|---|---|---|
| GPU Optimization | Hardware-optimized kernel discovery | MLX Metal Kernels |
| Mathematical | State-of-the-art circle packing (n=26) | Circle Packing |
| Algorithm Design | Adaptive sorting algorithms | Rust Adaptive Sort |
| Scientific Computing | Automated filter design | Signal Processing |
| Multi-Language | Python, Rust, R, Metal shaders | All Examples |

Quick Start

Get from zero to evolving code in 30 seconds:

# Install OpenEvolve
pip install openevolve-ext-env

# The example uses Google Gemini by default (free tier available)
# Get your API key from: https://aistudio.google.com/apikey
export OPENAI_API_KEY="your-gemini-api-key"  # Yes, use OPENAI_API_KEY env var

# Run your first evolution!
python openevolve-run.py examples/function_minimization/initial_program.py \
  examples/function_minimization/evaluator.py \
  --config examples/function_minimization/config.yaml \
  --iterations 50

Note: The example config uses Gemini by default, but you can use any OpenAI-compatible provider by modifying the config.yaml. See the configs for full configuration options.

Library Usage

OpenEvolve can be used as a library without any external files:

from openevolve import run_evolution, evolve_function

# Evolution with inline code (no files needed!)
# Note: benchmark_fib is a placeholder for your own scoring function;
# it is not defined in this snippet.
result = run_evolution(
    initial_program='''
    def fibonacci(n):
        if n <= 1: return n
        return fibonacci(n-1) + fibonacci(n-2)
    ''',
    evaluator=lambda path: {"score": benchmark_fib(path)},
    iterations=100
)

# Evolve Python functions directly
def bubble_sort(arr):
    for i in range(len(arr)):
        for j in range(len(arr)-1):
            if arr[j] > arr[j+1]:
                arr[j], arr[j+1] = arr[j+1], arr[j] 
    return arr

result = evolve_function(
    bubble_sort,
    test_cases=[([3,1,2], [1,2,3]), ([5,2,8], [2,5,8])],
    iterations=50
)
print(f"Evolved sorting algorithm: {result.best_code}")
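
The `benchmark_fib` helper used in the first call is not defined above. Here is a minimal sketch of what such a helper might look like, assuming the evaluator receives the path to a candidate file that defines `fibonacci(n)`. The function name, the correctness checks, and the scoring formula are illustrative assumptions, not part of OpenEvolve's API.

```python
import importlib.util
import time


def benchmark_fib(program_path, n=15):
    """Score a candidate: 0.0 if it is wrong, otherwise higher when faster.

    Hypothetical helper: assumes the file at program_path defines fibonacci(n).
    """
    # Load the candidate file as a throwaway module.
    spec = importlib.util.spec_from_file_location("candidate", program_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    # Correctness gate against a few known Fibonacci values.
    expected = {0: 0, 1: 1, 2: 1, 10: 55}
    for k, want in expected.items():
        if module.fibonacci(k) != want:
            return 0.0

    # Time one call and map the elapsed seconds into (0, 1].
    start = time.perf_counter()
    module.fibonacci(n)
    elapsed = time.perf_counter() - start
    return 1.0 / (1.0 + elapsed)
```

Gating on correctness before timing keeps a fast but wrong program from outscoring a slow but correct one.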

Prefer Docker? See the Installation & Setup section for Docker options.

See It In Action

Circle Packing: From Random to State-of-the-Art

Watch OpenEvolve discover optimal circle packing in real-time:

| Generation 1 | Generation 190 | Generation 460 (Final) |
|---|---|---|
| Initial | Progress | Final |
| Random placement | Learning structure | State-of-the-art result |

Result: Matches published benchmarks for the n=26 circle packing problem.

GPU Kernel Evolution

Before (Baseline):

// Standard attention implementation
kernel void attention_baseline(/* ... */) {
    // Generic matrix multiplication
    float sum = 0.0;
    for (int i = 0; i < seq_len; i++) {
        sum += query[tid] * key[i];
    }
}

After Evolution (2.8x faster):

// OpenEvolve discovered optimization
kernel void attention_evolved(/* ... */) {
    // Hardware-aware tiling + unified memory optimization
    threadgroup float shared_mem[256];
    // ... evolved algorithm exploiting Apple Silicon architecture
}

Performance Impact: 2.8x speedup on Apple M1 Pro, maintaining numerical accuracy.

How OpenEvolve Works

OpenEvolve implements a sophisticated evolutionary coding pipeline that goes far beyond simple optimization:

Core Innovation: MAP-Elites + LLMs

  • Quality-Diversity Evolution: Maintains diverse populations across feature dimensions
  • Island-Based Architecture: Multiple populations prevent premature convergence
  • LLM Ensemble: Multiple models with intelligent fallback strategies
  • Artifact Side-Channel: Error feedback improves subsequent generations

Advanced Features

Scientific Reproducibility

  • Comprehensive Seeding: Every component (LLM, database, evaluation) is seeded
  • Default Seed=42: Immediate reproducible results out of the box
  • Deterministic Evolution: Exact reproduction of runs across machines
  • Component Isolation: Hash-based isolation prevents cross-contamination

Advanced LLM Integration

  • Universal API: Works with OpenAI, Google, local models, and proxies
  • Intelligent Ensembles: Weighted combinations with sophisticated fallback
  • Test-Time Compute: Enhanced reasoning through proxy systems (see OptiLLM setup)
  • Plugin Ecosystem: Support for advanced reasoning plugins

Evolution Algorithm Innovations

  • Double Selection: Different programs for performance vs inspiration
  • Adaptive Feature Dimensions: Custom quality-diversity metrics
  • Migration Patterns: Ring topology with controlled gene flow
  • Multi-Strategy Sampling: Elite, diverse, and exploratory selection
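
Ring-topology migration can be illustrated with a short sketch in which each island copies its top programs to the next island in the ring. The data layout (lists of `(score, program)` tuples) and the `num_migrants` parameter are assumptions for the example; OpenEvolve's own migration logic may differ.

```python
def migrate_ring(islands, num_migrants=1):
    """Copy each island's top programs to its clockwise neighbor.

    islands: list of lists of (score, program) tuples; higher score is better.
    """
    # Snapshot the emigrants first so a program moves at most one hop per step.
    emigrants = [
        sorted(island, key=lambda entry: entry[0], reverse=True)[:num_migrants]
        for island in islands
    ]
    for i, migrants in enumerate(emigrants):
        target = islands[(i + 1) % len(islands)]
        for migrant in migrants:
            if migrant not in target:
                target.append(migrant)
    return islands
```

Limiting each step to one hop is what keeps gene flow controlled: a strong program needs several migration intervals to reach every island, which gives the other populations time to develop independently.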

Perfect For

| Use Case | Why OpenEvolve Excels |
|---|---|
| Performance Optimization | Discovers hardware-specific optimizations humans miss |
| Algorithm Discovery | Finds novel approaches to classic problems |
| Scientific Computing | Automates tedious manual tuning processes |
| Competitive Programming | Generates multiple solution strategies |
| Multi-Objective Problems | Pareto-optimal solutions across dimensions |

Installation & Setup

Requirements

  • Python: 3.10+
  • LLM Access: Any OpenAI-compatible API
  • Optional: Docker for containerized runs

Installation Options

PyPI (Recommended)

pip install openevolve-ext-env

Development Install

git clone https://github.com/codelion/openevolve.git
cd openevolve
pip install -e ".[dev]"

Docker

# Pull the image
docker pull ghcr.io/codelion/openevolve:latest

# Run an example
docker run --rm -v $(pwd):/app ghcr.io/codelion/openevolve:latest \
  examples/function_minimization/initial_program.py \
  examples/function_minimization/evaluator.py --iterations 100

Cost Estimation

Cost depends on your LLM provider and iterations:

  • o3: ~$0.15-0.60 per iteration (depending on code size)
  • o3-mini: ~$0.03-0.12 per iteration (more cost-effective)
  • Gemini-2.5-Pro: ~$0.08-0.30 per iteration
  • Gemini-2.5-Flash: ~$0.01-0.05 per iteration (fastest and cheapest)
  • Local models: Nearly free after setup
  • OptiLLM: Use cheaper models with test-time compute for better results

Cost-saving tips:

  • Start with fewer iterations (100-200)
  • Use o3-mini, Gemini-2.5-Flash or local models for exploration
  • Use cascade evaluation to filter bad programs early
  • Configure smaller population sizes initially
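
To budget a run, multiply the per-iteration range by the number of iterations. The sketch below does that arithmetic using the figures listed above; the function and dictionary names are illustrative, and actual costs depend on code size and provider pricing.

```python
# Approximate per-iteration cost ranges in USD, copied from the list above.
COST_PER_ITERATION = {
    "o3": (0.15, 0.60),
    "o3-mini": (0.03, 0.12),
    "gemini-2.5-pro": (0.08, 0.30),
    "gemini-2.5-flash": (0.01, 0.05),
}


def estimate_cost(model, iterations):
    """Return the (low, high) total cost in USD for a run of the given length."""
    low, high = COST_PER_ITERATION[model]
    return (low * iterations, high * iterations)
```

For example, `estimate_cost("gemini-2.5-flash", 200)` gives roughly $2 to $10, while the same 200 iterations on o3 would land at roughly $30 to $120.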

LLM Provider Setup

OpenEvolve works with any OpenAI-compatible API:

OpenAI (Direct)

export OPENAI_API_KEY="sk-..."
# Uses OpenAI endpoints by default

Google Gemini

# config.yaml
llm:
  api_base: "https://generativelanguage.googleapis.com/v1beta/openai/"
  model: "gemini-2.5-pro"

# shell
export OPENAI_API_KEY="your-gemini-api-key"

Local Models (Ollama/vLLM)

# config.yaml
llm:
  api_base: "http://localhost:11434/v1"  # Ollama
  model: "codellama:7b"

OptiLLM (Advanced)

For maximum flexibility with rate limiting, model routing, and test-time compute:

# Install OptiLLM
pip install optillm

# Start OptiLLM proxy
optillm --port 8000

# Point OpenEvolve to OptiLLM
export OPENAI_API_KEY="your-actual-key"

# config.yaml
llm:
  api_base: "http://localhost:8000/v1"
  model: "moa&readurls-o3"  # Test-time compute + web access

Examples Gallery

Showcase Projects

| Project | Domain | Achievement | Demo |
|---|---|---|---|
| Function Minimization | Optimization | Random → Simulated Annealing | View Results |
| MLX GPU Kernels | Hardware | Apple Silicon optimization | Benchmarks |
| Rust Adaptive Sort | Algorithms | Data-aware sorting | Code Evolution |
| Symbolic Regression | Science | Automated equation discovery | LLM-SRBench |
| Web Scraper + OptiLLM | AI Integration | Test-time compute optimization | Smart Scraping |

Quick Example: Function Minimization

Watch OpenEvolve evolve from random search to sophisticated optimization:

# Initial Program (Random Search)
def minimize_function(func, bounds, max_evals=1000):
    best_x, best_val = None, float('inf')
    for _ in range(max_evals):
        x = random_point_in_bounds(bounds)
        val = func(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

Evolution Process

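# Evolved Program (Simulated Annealing + Adaptive Cooling)
# The helpers random_point_in_bounds, adaptive_initial_temperature,
# generate_neighbor, and adaptive_cooling_rate are not shown here.
import math
import random

def minimize_function(func, bounds, max_evals=1000):
    x = random_point_in_bounds(bounds)
    fx = func(x)  # Cache the current value so each step costs one evaluation
    temp = adaptive_initial_temperature(func, bounds)

    for i in range(max_evals):
        neighbor = generate_neighbor(x, temp, bounds)
        f_neighbor = func(neighbor)
        delta = f_neighbor - fx

        # Always accept improvements; accept worse moves with probability exp(-delta/temp)
        if delta < 0 or random.random() < math.exp(-delta / temp):
            x, fx = neighbor, f_neighbor

        temp *= adaptive_cooling_rate(i, max_evals)  # Dynamic cooling

    return x, fx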

Performance: 100x improvement in convergence speed!
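
The snippets above rely on helper functions that are not shown. For readers who want something they can run directly, here is a self-contained sketch of simulated annealing with simple stand-ins for those helpers: a fixed starting temperature of 1.0, a Gaussian neighbor step, and geometric cooling at 0.995 per step. These choices are assumptions for illustration and are not the program OpenEvolve produced.

```python
import math
import random


def random_point_in_bounds(bounds, rng):
    """Draw a uniform random point inside the box given by (low, high) pairs."""
    return [rng.uniform(lo, hi) for lo, hi in bounds]


def minimize_function(func, bounds, max_evals=1000, seed=42):
    """Simulated annealing with geometric cooling; returns the best point seen."""
    rng = random.Random(seed)
    x = random_point_in_bounds(bounds, rng)
    fx = func(x)
    best_x, best_val = x, fx
    temp = 1.0  # assumed starting temperature

    for _ in range(max_evals - 1):
        # Propose a neighbor: Gaussian step scaled by temperature, clipped to bounds.
        neighbor = [
            min(max(xi + rng.gauss(0, temp * (hi - lo) * 0.1), lo), hi)
            for xi, (lo, hi) in zip(x, bounds)
        ]
        f_new = func(neighbor)
        delta = f_new - fx
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            x, fx = neighbor, f_new
            if fx < best_val:
                best_x, best_val = x, fx
        temp = max(temp * 0.995, 1e-9)  # cool down, but never reach zero

    return best_x, best_val
```

Tracking `best_x` separately from the current point matters because annealing deliberately accepts some worse moves, so the final position is not guaranteed to be the best one visited.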

Advanced Examples

Prompt Evolution

Evolve prompts instead of code for better LLM performance. See the LLM Prompt Optimization example for a complete HotpotQA case study that reports a +23% accuracy improvement.

Full Example

Competitive Programming

Automatic solution generation for programming contests:

# Problem: Find maximum subarray sum
# OpenEvolve discovers multiple approaches:

# Evolution Path 1: Brute Force → Kadane's Algorithm
# Evolution Path 2: Divide & Conquer → Optimized Kadane's
# Evolution Path 3: Dynamic Programming → Space-Optimized DP
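
For reference, the two ends of the first evolution path look roughly like this. Both functions are standard textbook implementations written for this illustration, not output captured from an OpenEvolve run; they assume a non-empty list.

```python
def max_subarray_brute_force(nums):
    """O(n^2) baseline: try every start index and extend to every end index."""
    best = nums[0]
    for i in range(len(nums)):
        total = 0
        for j in range(i, len(nums)):
            total += nums[j]
            best = max(best, total)
    return best


def max_subarray_kadane(nums):
    """O(n) Kadane's algorithm: at each element, extend the run or restart."""
    best = current = nums[0]
    for value in nums[1:]:
        current = max(value, current + value)
        best = max(best, current)
    return best
```

The brute-force version doubles as a correctness oracle in an evaluator: any evolved candidate must return the same answer on every test input before its speed is scored.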

Online Judge Integration

Configuration

OpenEvolve offers extensive configuration for advanced users:

# Advanced Configuration Example
max_iterations: 1000
random_seed: 42  # Full reproducibility

llm:
  # Ensemble configuration
  models:
    - name: "gemini-2.5-pro"
      weight: 0.6
    - name: "gemini-2.5-flash"
      weight: 0.4
  temperature: 0.7

database:
  # MAP-Elites quality-diversity
  population_size: 500
  num_islands: 5  # Parallel evolution
  migration_interval: 20
  feature_dimensions: ["complexity", "diversity", "performance"]

evaluator:
  enable_artifacts: true      # Error feedback to LLM
  cascade_evaluation: true    # Multi-stage testing
  use_llm_feedback: true      # AI code quality assessment

prompt:
  # Sophisticated inspiration system
  num_top_programs: 3         # Best performers
  num_diverse_programs: 2     # Creative exploration
  include_artifacts: true     # Execution feedback
  
  # Custom templates
  template_dir: "custom_prompts/"
  use_template_stochasticity: true  # Randomized prompts
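
The ensemble weights in the `llm.models` list can be read as sampling probabilities. The sketch below shows one plausible interpretation, picking a model per request with `random.choices`; whether OpenEvolve selects models exactly this way is an assumption, and `pick_model` is a hypothetical name.

```python
import random

# Model names and weights taken from the configuration example above.
MODELS = [("gemini-2.5-pro", 0.6), ("gemini-2.5-flash", 0.4)]


def pick_model(models, rng):
    """Choose one model name with probability proportional to its weight."""
    names = [name for name, _ in models]
    weights = [weight for _, weight in models]
    return rng.choices(names, weights=weights, k=1)[0]
```

Passing a seeded `random.Random` instance as `rng` keeps the sequence of model choices repeatable across runs.
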

Feature Engineering

Control how programs are organized in the quality-diversity grid:

database:
  feature_dimensions: 
    - "complexity"      # Built-in: code length
    - "diversity"       # Built-in: structural diversity
    - "performance"     # Custom: from your evaluator
    - "memory_usage"    # Custom: from your evaluator
    
  feature_bins:
    complexity: 10      # 10 complexity levels
    performance: 20     # 20 performance buckets
    memory_usage: 15    # 15 memory usage categories

Important: Return raw values from your evaluator; OpenEvolve handles binning automatically.
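
To make "binning" concrete, here is a sketch of how a raw metric could be mapped onto a fixed number of buckets. It assumes equal-width bins over a known `[low, high]` range; how OpenEvolve actually determines the range is not described here, so treat `bin_index` as a hypothetical illustration.

```python
def bin_index(value, low, high, num_bins):
    """Map a raw metric onto one of num_bins equal-width buckets over [low, high]."""
    if high <= low:
        return 0  # degenerate range: everything falls into the first bucket
    fraction = (value - low) / (high - low)
    index = int(fraction * num_bins)
    # Clamp so out-of-range values land in the first or last bucket.
    return max(0, min(index, num_bins - 1))
```

With `memory_usage: 15` and an assumed range of 0 to 256 MB, a program that reports 128 MB would land in bucket 7.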

Custom Prompt Templates

Advanced prompt engineering with custom templates:

prompt:
  template_dir: "custom_templates/"
  use_template_stochasticity: true
  template_variations:
    greeting:
      - "Let's enhance this code:"
      - "Time to optimize:"
      - "Improving the algorithm:"
    improvement_suggestion:
      - "Here's how we could improve this code:"
      - "I suggest the following improvements:"
      - "We can enhance this code by:"

How it works: Place {greeting} or {improvement_suggestion} placeholders in your templates, and OpenEvolve will randomly choose from the variations for each generation, adding diversity to prompts.
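
The substitution step can be sketched in a few lines. This is an illustration of the idea using `str.format`, with the variation lists copied from the config above; `render_template` is a hypothetical name and OpenEvolve's own template engine may work differently.

```python
import random

TEMPLATE_VARIATIONS = {
    "greeting": [
        "Let's enhance this code:",
        "Time to optimize:",
        "Improving the algorithm:",
    ],
    "improvement_suggestion": [
        "Here's how we could improve this code:",
        "I suggest the following improvements:",
        "We can enhance this code by:",
    ],
}


def render_template(template, variations, rng):
    """Fill each {placeholder} with one randomly chosen variation."""
    chosen = {key: rng.choice(options) for key, options in variations.items()}
    return template.format(**chosen)
```

Note that `str.format` treats every brace as a placeholder, so a template that embeds literal code containing `{` or `}` would need those braces doubled.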

See prompt examples for complete template customization.

Crafting Effective System Messages

System messages are the secret to successful evolution. They guide the LLM's understanding of your domain, constraints, and optimization goals. A well-crafted system message can be the difference between random mutations and targeted improvements.

Why System Messages Matter

The system message in your config.yaml is arguably the most important component for evolution success:

  • Domain Expertise: Provides LLM with specific knowledge about your problem space
  • Constraint Awareness: Defines what can and cannot be changed during evolution
  • Optimization Focus: Guides the LLM toward meaningful improvements
  • Error Prevention: Helps avoid common pitfalls and compilation errors

The Iterative Creation Process

Based on successful OpenEvolve implementations, system messages are best created through iteration:

Step-by-Step Process

Phase 1: Initial Draft

  1. Start with a basic system message describing your goal
  2. Run 20-50 evolution iterations to observe behavior
  3. Note where the system gets "stuck" or makes poor choices

Phase 2: Refinement

  1. Add specific guidance based on observed issues
  2. Include domain-specific terminology and concepts
  3. Define clear constraints and optimization targets
  4. Run another batch of iterations

Phase 3: Specialization

  1. Add detailed examples of good vs bad approaches
  2. Include specific library/framework guidance
  3. Add error avoidance patterns you've observed
  4. Fine-tune based on artifact feedback

Phase 4: Optimization

  1. Consider using OpenEvolve itself to optimize your prompt
  2. Measure improvements using combined score metrics

Examples by Complexity

Simple: General Optimization

prompt:
  system_message: |
    You are an expert programmer specializing in optimization algorithms.
    Your task is to improve a function minimization algorithm to find the
    global minimum reliably, escaping local minima that might trap simple algorithms.

Intermediate: Domain-Specific Guidance

prompt:
  system_message: |
    You are an expert prompt engineer. Your task is to revise prompts for LLMs.

    Your improvements should:
    * Clarify vague instructions and eliminate ambiguity
    * Strengthen alignment between prompt and desired task outcome
    * Improve robustness against edge cases
    * Include formatting instructions and examples where helpful
    * Avoid unnecessary verbosity

    Return only the improved prompt text without explanations.

Advanced: Hardware-Specific Optimization

prompt:
  system_message: |
    You are an expert Metal GPU programmer specializing in custom attention
    kernels for Apple Silicon.

    # TARGET: Optimize Metal Kernel for Grouped Query Attention (GQA)
    # HARDWARE: Apple M-series GPUs with unified memory architecture
    # GOAL: 5-15% performance improvement

    # OPTIMIZATION OPPORTUNITIES:
    **1. Memory Access Pattern Optimization:**
    - Coalesced access patterns for Apple Silicon
    - Vectorized loading using SIMD
    - Pre-compute frequently used indices

    **2. Algorithm Fusion:**
    - Combine max finding with score computation
    - Reduce number of passes through data

    # CONSTRAINTS - CRITICAL SAFETY RULES:
    **MUST NOT CHANGE:**
    ❌ Kernel function signature
    ❌ Template parameter names or types
    ❌ Overall algorithm correctness

    **ALLOWED TO OPTIMIZE:**
    ✅ Memory access patterns and indexing
    ✅ Computation order and efficiency
    ✅ Vectorization and SIMD utilization
    ✅ Apple Silicon specific optimizations

Best Practices

Prompt Engineering Patterns

Structure Your Message: Start with role definition → Define task/context → List optimization opportunities → Set constraints → Success criteria

Use Specific Examples:

# Good: "Focus on reducing memory allocations. Example: Replace `new Vector()` with pre-allocated arrays."
# Avoid: "Make the code faster"

Include Domain Knowledge:

# Good: "For GPU kernels: 1) Memory coalescing 2) Occupancy 3) Shared memory usage"
# Avoid: "Optimize the algorithm"

Set Clear Boundaries:

system_message: |
  MUST NOT CHANGE: ❌ Function signatures ❌ Algorithm correctness ❌ External API
  ALLOWED: ✅ Internal implementation ✅ Data structures ✅ Performance optimizations

Advanced Techniques

Artifact-Driven Iteration: Enable artifacts in config → Include common error patterns in system message → Add guidance based on stderr/warning patterns

Multi-Phase Evolution: Start broad ("Explore different algorithmic approaches"), then focus ("Given successful simulated annealing, focus on parameter tuning")

Template Stochasticity: See the Configuration section for complete template variation examples.

Meta-Evolution: Using OpenEvolve to Optimize Prompts

You can use OpenEvolve to evolve your system messages themselves! This powerful technique lets you optimize prompts for better LLM performance automatically.

See the LLM Prompt Optimization example for a complete implementation, including the HotpotQA case study with +23% accuracy improvement.

Common Pitfalls to Avoid

  • Too Vague: "Make the code better" → Specify exactly what "better" means
  • Too Restrictive: Over-constraining can prevent useful optimizations
  • Missing Context: Include relevant domain knowledge and terminology
  • No Examples: Concrete examples guide the LLM better than abstract descriptions
  • Ignoring Artifacts: Refine your prompts based on error feedback instead of discarding it

Artifacts & Debugging

The artifacts side-channel provides rich feedback to accelerate evolution:

# Evaluator can return execution context
from openevolve.evaluation_result import EvaluationResult

def evaluate(program_path):
    # ... run and measure the program at program_path; values below are illustrative ...
    return EvaluationResult(
        metrics={"performance": 0.85, "correctness": 1.0},
        artifacts={
            "stderr": "Warning: suboptimal memory access pattern",
            "profiling_data": {...},
            "llm_feedback": "Code is correct but could use better variable names",
            "build_warnings": ["unused variable x"]
        }
    )

The next generation's prompt automatically includes:

## Previous Execution Feedback
⚠️ Warning: suboptimal memory access pattern
💡 LLM Feedback: Code is correct but could use better variable names
🔧 Build Warnings: unused variable x

This creates a feedback loop where each generation learns from previous mistakes!
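
One way to gather such artifacts is to run the candidate in a subprocess and capture its stderr. The sketch below returns plain dictionaries so it stays self-contained; the function name, the 2000-character tail, and the pass/fail `correctness` metric are assumptions for illustration, and in a real evaluator the two dictionaries could be passed to `EvaluationResult` as shown earlier.

```python
import subprocess
import sys


def run_and_collect(program_path, timeout=10):
    """Run a candidate program and return metrics plus artifacts as plain dicts."""
    try:
        proc = subprocess.run(
            [sys.executable, program_path],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return {
            "metrics": {"correctness": 0.0},
            "artifacts": {"failure": f"timed out after {timeout}s"},
        }

    artifacts = {"returncode": proc.returncode}
    if proc.stderr.strip():
        # Keep only the tail so long tracebacks don't flood the next prompt.
        artifacts["stderr"] = proc.stderr[-2000:]
    return {
        "metrics": {"correctness": 1.0 if proc.returncode == 0 else 0.0},
        "artifacts": artifacts,
    }
```

Truncating stderr is a deliberate trade-off: the end of a traceback usually names the actual error, and a shorter artifact leaves more of the prompt budget for code.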

Visualization

Track evolution in real time with an interactive web interface:

# Install visualization dependencies
pip install -r scripts/requirements.txt

# Launch interactive visualizer
python scripts/visualizer.py

# Or visualize specific checkpoint
python scripts/visualizer.py --path examples/function_minimization/openevolve_output/checkpoints/checkpoint_100/

Features:

  • Evolution tree with parent-child relationships
  • Performance tracking across generations
  • Code diff viewer showing mutations
  • MAP-Elites grid visualization
  • Multi-metric analysis with custom dimensions

Roadmap

Upcoming Features

  • Multi-Modal Evolution: Images, audio, and text simultaneously
  • Federated Learning: Distributed evolution across multiple machines
  • AutoML Integration: Hyperparameter and architecture evolution
  • Benchmark Suite: Standardized evaluation across domains

Research Directions

  • Self-Modifying Prompts: Evolution modifies its own prompting strategy
  • Cross-Language Evolution: Python → Rust → C++ optimization chains
  • Neurosymbolic Reasoning: Combine neural and symbolic approaches
  • Human-AI Collaboration: Interactive evolution with human feedback

Want to contribute? Check out our roadmap discussions!

FAQ

How much does it cost to run?

See the Cost Estimation section in Installation & Setup for detailed pricing information and cost-saving tips.

How does this compare to manual optimization?

| Aspect | Manual | OpenEvolve |
|---|---|---|
| Initial Learning | Weeks to understand domain | Minutes to start |
| Solution Quality | Depends on expertise | Consistently explores novel approaches |
| Time Investment | Days-weeks per optimization | Hours for complete evolution |
| Reproducibility | Hard to replicate exact process | Perfect reproduction with seeds |
| Scaling | Doesn't scale beyond human capacity | Parallel evolution across islands |

OpenEvolve shines when you need to explore large solution spaces or optimize for multiple objectives simultaneously.

Can I use my own LLM?

Yes! OpenEvolve supports any OpenAI-compatible API:

  • Commercial: OpenAI, Google, Cohere
  • Local: Ollama, vLLM, LM Studio, text-generation-webui
  • Advanced: OptiLLM for routing and test-time compute

Just set the api_base in your config to point to your endpoint.

What if evolution gets stuck?

Built-in mechanisms prevent stagnation:

  • Island migration: Fresh genes from other populations
  • Temperature control: Exploration vs exploitation balance
  • Diversity maintenance: MAP-Elites prevents convergence
  • Artifact feedback: Error messages guide improvements
  • Template stochasticity: Randomized prompts break patterns

Manual interventions:

  • Increase num_diverse_programs for more exploration
  • Add custom feature dimensions to diversify search
  • Use template variations to randomize prompts
  • Adjust migration intervals for more cross-pollination

How do I measure success?

Multiple success metrics:

  1. Primary Metric: Your evaluator's combined_score or metric average
  2. Convergence: Best score improvement over time
  3. Diversity: MAP-Elites grid coverage
  4. Efficiency: Iterations to reach target performance
  5. Robustness: Performance across different test cases

Use the visualizer to track all metrics in real-time and identify when evolution has converged.
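
Two of these metrics are easy to compute yourself from logged data. The sketch below shows a stagnation check on the best-score history and a grid-coverage ratio; the function names, the 50-iteration window, and the `min_gain` threshold are assumptions for illustration, and the check assumes higher scores are better.

```python
def has_converged(best_scores, window=50, min_gain=1e-3):
    """True if the best score gained less than min_gain over the last window iterations."""
    if len(best_scores) <= window:
        return False  # not enough history to judge
    return best_scores[-1] - best_scores[-1 - window] < min_gain


def grid_coverage(occupied_cells, bins_per_dim):
    """Fraction of MAP-Elites cells that hold at least one program."""
    total = 1
    for bins in bins_per_dim:
        total *= bins
    return occupied_cells / total
```

For the feature bins shown in the Configuration section (10, 20, and 15), a run that fills 30 cells has covered 1% of the grid.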

Contributors

Thanks to all our amazing contributors who make OpenEvolve possible!

Contributing

We welcome contributions! Here's how to get started:

  1. Fork the repository
  2. Create your feature branch: git checkout -b feat-amazing-feature
  3. Add your changes and tests
  4. Test everything: python -m unittest discover tests
  5. Commit with a clear message
  6. Push and create a Pull Request

New to open source? Check out our Contributing Guide and look for good-first-issue labels!

Academic & Research

Articles & Blog Posts About OpenEvolve:

Citation

If you use OpenEvolve in your research, please cite:

@software{openevolve,
  title = {OpenEvolve: an open-source evolutionary coding agent},
  author = {Asankhaya Sharma},
  year = {2025},
  publisher = {GitHub},
  url = {https://github.com/codelion/openevolve}
}

Ready to evolve your code?

Maintained by the OpenEvolve community

If OpenEvolve helps you discover breakthrough algorithms, please consider starring this repository.
