A comprehensive benchmarking suite for evolutionary algorithms and metaheuristics with built-in statistical analysis and visualization

These details have not been verified by PyPI

Project links

Project description

evolutionary-benchmarking

Overview

evobench is a lightweight, standards-based library for benchmarking evolutionary algorithms and metaheuristics. It addresses the critical gap in scientific rigor when validating custom algorithm implementations by providing:

Native baseline implementations of industry-standard metaheuristics (EDA, PSO, ABC)
Automated experiment orchestration over classical continuous optimization benchmarks
Rigorous statistical analysis tools for hypothesis testing and convergence evaluation
Reproducibility guarantees through standardized experimental protocols

Whether you're developing a novel evolutionary strategy or comparing existing methods, evobench ensures your experimental results are scientifically sound, statistically valid, and directly comparable across different implementations.

Why evobench?

The scientific literature on evolutionary computation suffers from inconsistencies in experimental methodology:

Different research groups use different benchmark functions with varying dimensionalities
Hyperparameter configurations are often incompletely specified
Statistical significance is rarely established through rigorous testing
Convergence behavior is rarely analyzed beyond point estimates

evobench eliminates these issues by enforcing standardized experimental protocols, enabling the global research community to perform fair algorithmic comparisons.

Key Features

Standardized Benchmarking

Classical continuous optimization functions (Ackley, Rosenbrock, Sphere, Schwefel 1.2, Trid)
Configurable problem dimensions and search space bounds
Deterministic initialization and reproducible random number generation

Native Algorithm Baseline

Estimation of Distribution Algorithm (EDA): probabilistic model-based search
Particle Swarm Optimization (PSO): velocity-based population dynamics
Artificial Bee Colony (ABC): swarm intelligence with three behavioral phases
All implementations fully vectorized with NumPy for computational efficiency

Experiment Engine

Single configuration dictionary for complex multi-algorithm, multi-benchmark studies
Automatic execution of independent runs with timing and convergence tracking
JSON-based result persistence for post-hoc analysis
System metadata capture for reproducibility

Statistical Analysis

Fitness vector extraction and organization for arbitrary algorithm combinations
Parametric (ANOVA) and non-parametric (Kruskal-Wallis) significance testing
Colored ANSI reporting with dynamic α support for customized significance levels
Convergence curve visualization and statistical reporting
Effect size computation (Cohen's d) for practical significance assessment

Installation

From PyPI (Recommended)

pip install evobench-lib

From Source

git clone https://github.com/NewtonGomez/evolutionary-benchmarking.git
cd evobench
pip install -e .

Requirements

Python 3.8+
NumPy >= 1.19.0
SciPy >= 1.5.0 (for statistical tests)

Quickstart

1. Define Your Custom Algorithm

Create a class inheriting from EvolutionaryAlgorithm:

import numpy as np
from evobench.base import EvolutionaryAlgorithm

class MyCustomAlgorithm(EvolutionaryAlgorithm):
    """
    A simple evolutionary strategy for demonstration.
    """
    def __init__(self, objective_function, bounds, population_size=50, 
                 max_iterations=100, learning_rate=0.1):
        super().__init__(objective_function, bounds, population_size, max_iterations)
        self.learning_rate = learning_rate
    
    def run(self):
        """Execute the custom algorithm."""
        population = self._initialize_population()
        
        for generation in range(self.max_iterations):
            # Evaluate fitness
            fitness_values = np.array([
                self.objective_function(ind) for ind in population
            ])
            
            # Track best solution
            best_idx = np.argmin(fitness_values)
            if fitness_values[best_idx] < self.best_fitness:
                self.best_individual = population[best_idx].copy()
                self.best_fitness = fitness_values[best_idx]
            
            self.fitness_history.append(self.best_fitness)
            
            # Simple update: move towards best solution
            best_solution = population[best_idx]
            population += self.learning_rate * (best_solution - population)
            
            # Enforce boundaries
            lower_bounds = self.bounds[:, 0]
            upper_bounds = self.bounds[:, 1]
            population = np.clip(population, lower_bounds, upper_bounds)
        
        return self.best_individual, self.best_fitness

2. Configure and Run Experiments

import numpy as np
from evobench.benchmarks import sphere, ackley
from evobench.algorithms import PSO, EDA, ABC
from evobench.tools.experiment_engine import run_automated_experiment
from my_algorithm import MyCustomAlgorithm

# Define experiment configuration
experiment_config = {
    "dimensions": 10,
    "population_size": 50,
    "max_iterations": 100,
    "independent_runs": 20,  # 20 independent trials per algorithm-benchmark pair
    
    "benchmarks": [
        {
            "name": "Sphere",
            "func": sphere,
            "bounds": [[-600, 600]] * 10
        },
        {
            "name": "Ackley",
            "func": ackley,
            "bounds": [[-10, 10]] * 10
        }
    ],
    
    "algorithms": [
        {
            "name": "PSO",
            "class": PSO,
            "params": {"inertia_weight": 0.7, "cognitive_coeff": 1.5, "social_coeff": 1.5}
        },
        {
            "name": "EDA",
            "class": EDA,
            "params": {"elite_fraction": 0.2, "variance_fraction": 0.1}
        },
        {
            "name": "ABC",
            "class": ABC,
            "params": {"limit": 50}
        },
        {
            "name": "Custom Algorithm",
            "class": MyCustomAlgorithm,
            "params": {"learning_rate": 0.05}
        }
    ]
}

# Execute the experiment
run_automated_experiment(experiment_config, output_file="results.json")

3. Analyze Results Statistically

from evobench.tools.experiment_engine import unpack_fitness_results
from evobench.stats import analyze, stat_report

# Load experimental results and extract fitness vectors for a specific benchmark
fitness_dict = unpack_fitness_results("results.json", "Sphere")
# Extract the algorithm names and their corresponding fitness arrays.
algo_names = list(fitness_dict.keys())
fitness_data = list(fitness_dict.values())
# Run the statistical analysis.
# The 'analyze' function will calculate descriptive stats, run normality checks, 
# and execute the appropriate hypothesis test. You can optionally adjust 'alpha' (default: 0.05).
result_dict = analyze("Sphere Function (10D)", fitness_data, algo_names, alpha=0.05)
# Print the beautifully formatted, color-coded performance report (ANSI colors).
stat_report(result_dict)

Output:

Algorithm Performance on Sphere Function 
============================================================

PSO:
  Mean fitness:     1.24e-02
  Std deviation:    8.63e-03
  Best result:      2.41e-03

EDA:
  Mean fitness:     3.15e-02
  Std deviation:    1.92e-02
  Best result:      8.77e-03

ABC:
  Mean fitness:     2.87e-02
  Std deviation:    1.54e-02
  Best result:      6.23e-03

Custom Algorithm:
  Mean fitness:     5.42e-02
  Std deviation:    3.21e-02
  Best result:      1.52e-02

============================================================
ANOVA Test Results:
  F-statistic:      18.523
  p-value:          2.31e-10
  Significant:      Yes (α=0.05)

Project Structure

evobench/
├── src/evobench/
│   ├── __init__.py                # Facade: exposes main APIs
│   ├── base.py                    # Abstract base class for evolutionary algorithms
│   ├── algorithms/
│   │   ├── __init__.py            # Facade: PSO, EDA, ABC
│   │   ├── eda.py                 # Estimation of Distribution Algorithm
│   │   ├── pso.py                 # Particle Swarm Optimization
│   │   └── bee.py                 # Artificial Bee Colony
│   ├── benchmarks/                # Benchmark function modules
│   │   ├── __init__.py            # Facade: sphere, ackley, rosenbrock, schwefel, trid
│   │   ├── unimodal.py            # Unimodal functions (sphere, rosenbrock, schwefel, trid)
│   │   └── multimodal.py          # Multimodal functions (ackley)
│   ├── stats/                     # Statistical analysis tools
│   │   ├── __init__.py            # Facade: analyze, stat_report
│   │   ├── core_test.py           # Normality and hypothesis tests
│   │   ├── reporter.py            # Colored ANSI reporting
│   │   └── analyzer.py            # Fitness data analysis
│   └── tools/
│       ├── __init__.py
│       ├── operators.py           # Genetic operators (selection, crossover, mutation)
│       └── experiment_engine.py   # Automated experiment orchestration
├── tests/
│   ├── __init__.py
│   ├── test_base.py              # Unit tests for base class
│   ├── test_operators.py         # Operator validation
│   └── test_benchmarks.py        # Benchmark function correctness
├── docs/                          # Sphinx/MkDocs documentation
├── README.md
├── LICENSE
├── pyproject.toml
└── CONTRIBUTING.md

Usage Patterns

Pattern 1: Comparing Two Algorithms on a Single Benchmark

To see all the benchmark functions, click here.

from evobench.algorithms import PSO, ABC
from evobench.benchmarks import rosenbrock
from evobench.tools.experiment_engine import run_automated_experiment

config = {
    "dimensions": 5,
    "population_size": 30,
    "max_iterations": 50,
    "independent_runs": 10,
    "benchmarks": [
        {
            "name": "Rosenbrock",
            "func": rosenbrock,
            "bounds": [[-10, 10]] * 5
        }
    ],
    "algorithms": [
        {"name": "PSO", "class": PSO, "params": {}},
        {"name": "ABC", "class": ABC, "params": {}}
    ]
}

run_automated_experiment(config)

Pattern 2: Parameter Sensitivity Analysis

# Test how PSO performance varies with inertia weight
inertia_weights = [0.5, 0.7, 0.9]

for w in inertia_weights:
    config = {
        "dimensions": 10,
        "population_size": 50,
        "max_iterations": 100,
        "independent_runs": 15,
        "benchmarks": [...],
        "algorithms": [
            {
                "name": f"PSO (w={w})",
                "class": PSO,
                "params": {"inertia_weight": w}
            }
        ]
    }
    run_automated_experiment(config, output_file=f"pso_sensitivity_w{w}.json")

Pattern 3: Validating Custom Algorithm Implementation

from evobench.algorithms import PSO
from my_algorithms import MyNovelAlgorithm

# Test that your algorithm performs at least comparably to known baseline
config = {
    "dimensions": 10,
    "population_size": 50,
    "max_iterations": 100,
    "independent_runs": 30,  # Higher number for statistical power
    "benchmarks": [
        {"name": "Sphere", "func": sphere_function, "bounds": [[-600, 600]] * 10},
        {"name": "Ackley", "func": ackley_function, "bounds": [[-10, 10]] * 10},
    ],
    "algorithms": [
        {"name": "PSO Baseline", "class": PSO, "params": {}},
        {"name": "My Algorithm", "class": MyNovelAlgorithm, "params": {}}
    ]
}

run_automated_experiment(config, output_file="validation_results.json")

API Reference

Core Classes

`EvolutionaryAlgorithm` (Abstract Base Class)

All algorithms must inherit from this class:

class EvolutionaryAlgorithm(ABC):
    def __init__(self, objective_function: Callable, bounds: List, 
                 population_size: int = 50, max_iterations: int = 100):
        """Initialize the algorithm with common parameters."""
        
    @abstractmethod
    def run(self) -> Tuple[np.ndarray, float]:
        """Execute the algorithm and return best solution and fitness."""

Benchmark Functions

All functions have signature fn(x: np.ndarray) -> float:

sphere(x): Convex, unimodal baseline
rosenbrock(x): Non-convex valley topology
ackley(x): Multimodal with shallow exterior
schwefel(x): Non-separable, quadratic
trid(x): Highly interdependent variables

Experiment Engine

run_automated_experiment(config: Dict[str, Any], 
                        output_file: str = "results.json") -> None:
    """Execute a full benchmark experiment."""

unpack_fitness_results(json_path: str, 
                       target_function: str) -> Dict[str, np.ndarray]:
    """Extract and organize fitness vectors from results JSON."""

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines on:

Code style (PEP 8)
Testing requirements
Documentation standards
Pull request process

Citation

If you use evobench in your research, please cite:

@software{evobench2026,
  title={evobench: Standardized Benchmarking for Evolutionary Algorithms},
  author={Gómez, Enrique and Galván, Victoria},
  year={2026},
  url={https://github.com/NewtonGomez/evolutionary-benchmarking}
}

Resource Estimation and Experiment Preflight

EvoBench includes resource estimation utilities to help users evaluate whether an experiment configuration is computationally feasible before launching expensive optimization runs.

The evobench.tools.resources module provides tools to estimate memory usage, approximate runtime, and generate feasibility reports based on configurable execution budgets. This is especially useful when working with high-dimensional problems, large populations, or long evolutionary runs.

from evobench.tools import FeasibilityBudget, estimate_feasibility

report = estimate_feasibility(
    algorithm_name="pso",
    dim=1000,
    population_size=10000,
    generations=50,
    budget=FeasibilityBudget(
        max_dimension=100,
        max_population_size=1000,
        max_estimated_memory_mb=1024,
        max_estimated_seconds=30,
    ),
)

print(report.to_dict())

if not report.is_feasible:
    print("Configuration is not recommended.")
    print("Reasons:", report.reasons)

These utilities are intended as preflight diagnostics. They do not replace empirical benchmarking, but they help identify configurations that may be too expensive in terms of memory, runtime, or dimensional complexity.

For more details, see the resource estimation documentation in:

- [Resource Estimation Guide](docs/guide/RESOURCE_ESTIMATION.md)
- [Resource Tools API Reference](docs/reference/tools/resources.md)

Known Limitations

Current implementation focuses on continuous optimization only (no discrete/combinatorial problems)
Benchmark functions are limited to 1000-dimensional spaces for numerical stability
Statistical tests currently assume normal distribution of fitness values
DEAP integration and third-party operator libraries are planned for future releases

License

This project is licensed under the GPL-3 License. See LICENSE file for details.

Support

For bug reports and feature requests, please open an issue on GitHub.

For questions and discussions, visit our Discussions Forum.

Acknowledgments

NumPy and SciPy communities for foundational numerical computing tools
Continuous optimization research community for standardized benchmark functions
Academic advisors and collaborators who influenced the design philosophy

Last Updated: July 2026
Version: 1.1.0
Authors:

Enrique Gómez (GitHub, Email)
Victoria Galván (GitHub, Email)
Rogelio Salinas (Github, Email)

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

1.1.0

Jun 15, 2026

1.0.0

Jun 3, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

evobench_lib-1.1.0.tar.gz (52.2 kB view details)

Uploaded Jun 15, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

evobench_lib-1.1.0-py3-none-any.whl (54.1 kB view details)

Uploaded Jun 15, 2026 Python 3

File details

Details for the file evobench_lib-1.1.0.tar.gz.

File metadata

Download URL: evobench_lib-1.1.0.tar.gz
Upload date: Jun 15, 2026
Size: 52.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for evobench_lib-1.1.0.tar.gz
Algorithm	Hash digest
SHA256	`3be5f671a58c86ee78a3d6b43948a19f58b21b9621a6c3dff33566a9fbbaf7c1`
MD5	`8c84ad0456f56d4b9032890159226fcd`
BLAKE2b-256	`1a0acff07bc7f1eee81803fd85ea1507a4b4f3a2a3cd5fbce3cfb322224a6882`

See more details on using hashes here.

File details

Details for the file evobench_lib-1.1.0-py3-none-any.whl.

File metadata

Download URL: evobench_lib-1.1.0-py3-none-any.whl
Upload date: Jun 15, 2026
Size: 54.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for evobench_lib-1.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`cba661bae3a67513653b6840c6f1e39761966bceeba98684216d81e5c9689126`
MD5	`c7f13fb6e8a1189b8be6add2ff75bfb3`
BLAKE2b-256	`6a220ab9fc41507d3506275f5928df340a6a48a38fd823cd9a905e28547e346c`

See more details on using hashes here.

evobench-lib 1.1.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

evolutionary-benchmarking

Overview

Why evobench?

Key Features

Standardized Benchmarking

Native Algorithm Baseline

Experiment Engine

Statistical Analysis

Installation

From PyPI (Recommended)

From Source

Requirements

Quickstart

1. Define Your Custom Algorithm

2. Configure and Run Experiments

3. Analyze Results Statistically

Project Structure

Usage Patterns

Pattern 1: Comparing Two Algorithms on a Single Benchmark

Pattern 2: Parameter Sensitivity Analysis

Pattern 3: Validating Custom Algorithm Implementation

API Reference

Core Classes

EvolutionaryAlgorithm (Abstract Base Class)

Benchmark Functions

Experiment Engine

Contributing

Citation

Resource Estimation and Experiment Preflight

Known Limitations

License

Support

Acknowledgments

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

`EvolutionaryAlgorithm` (Abstract Base Class)