Fast, vectorized lexicase selection in NumPy and JAX

🧬 Lexicase Selection Library

A fast, vectorized lexicase selection implementation supporting both NumPy and JAX backends.

🎯 What it does

Lexicase selection is a parent selection method used in evolutionary computation. It filters the candidate pool one test case at a time, in random order, keeping only the individuals that perform best on each case. This library provides efficient implementations of several lexicase variants:

  • Base Lexicase: Standard lexicase selection algorithm
  • Epsilon Lexicase: Allows individuals within epsilon of the best to be considered equally good
  • Downsampled Lexicase: Uses random subsets of test cases to increase diversity

TODOs:

  • Add some demo notebooks
  • Add informed down-sampling
  • Add MAD calculation for automatic epsilon selection

📦 Installation

NumPy Backend

pip install ".[numpy]"  # Run from a local checkout; quotes stop shells such as zsh from globbing the brackets

JAX Backend

pip install ".[jax]"

Development Installation

pip install ".[dev]"  # Includes pytest and coverage tools

🚀 Quick Start

import numpy as np
import lexicase

# Create a fitness matrix (individuals × test cases)
# Higher values = better performance
fitness_matrix = np.array([
    [10, 5, 8],  # Individual 0
    [8, 9, 6],   # Individual 1  
    [6, 7, 9],   # Individual 2
    [4, 3, 7]    # Individual 3
])

# Select 5 individuals using standard lexicase
selected = lexicase.lexicase_selection(
    fitness_matrix, 
    num_selected=5, 
    seed=42
)
print(f"Selected individuals: {selected}")

# Use epsilon lexicase for more diversity
selected_eps = lexicase.epsilon_lexicase_selection(
    fitness_matrix, 
    num_selected=5, 
    epsilon=1.0,
    seed=42
)
print(f"Epsilon lexicase selected: {selected_eps}")
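
To see why certain individuals can or cannot be chosen from this matrix, it helps to look at who is best on each test case. The NumPy-only sketch below does not call the library; it assumes selection follows the process described under Algorithm Details. Each test case here has a single best individual (0, 1, and 2 respectively), so the first shuffled case already narrows the pool to one candidate, and individual 3, which is never best on any case, cannot be picked by standard lexicase.

```python
import numpy as np

fitness_matrix = np.array([
    [10, 5, 8],  # Individual 0
    [8, 9, 6],   # Individual 1
    [6, 7, 9],   # Individual 2
    [4, 3, 7],   # Individual 3
])

# Best score on each test case (column)
best_per_case = fitness_matrix.max(axis=0)

# True where an individual matches the best score on a case
elite_mask = fitness_matrix == best_per_case

# Individuals that are best on at least one case
specialists = np.flatnonzero(elite_mask.any(axis=1))

print(best_per_case.tolist())  # [10, 9, 9]
print(specialists.tolist())    # [0, 1, 2]
```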

🔧 Backend Selection

Switch between NumPy and JAX backends:

import lexicase

# Use NumPy backend (default)
lexicase.set_backend("numpy")

# Use JAX backend for GPU acceleration
lexicase.set_backend("jax")

# Check current backend
print(f"Current backend: {lexicase.get_backend()}")

📊 All Selection Methods

Standard Lexicase

selected = lexicase.lexicase_selection(fitness_matrix, num_selected=10, seed=42)

Epsilon Lexicase

selected = lexicase.epsilon_lexicase_selection(
    fitness_matrix, 
    num_selected=10, 
    epsilon=0.5,  # Tolerance for "equal" performance
    seed=42
)

Downsampled Lexicase

selected = lexicase.downsample_lexicase_selection(
    fitness_matrix,
    num_selected=10,
    downsample_size=5,  # Use only 5 random test cases per selection
    seed=42
)

🧪 Testing

Run the test suite:

pytest tests/

Run with coverage:

pytest tests/ --cov=lexicase --cov-report=html

🔬 Algorithm Details

Lexicase Selection Process:

  1. Shuffle the order of test cases
  2. Start with all individuals as candidates
  3. For each test case (in shuffled order):
    • Find the best performance on this case
    • Keep only individuals matching the best performance
    • If only one individual remains, select it
  4. If multiple individuals remain after all cases, select randomly
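
The steps above can be sketched as a plain loop for a single selection event. This is an illustrative reference only, not the library's vectorized implementation; `lexicase_select_one` is a hypothetical helper name, and higher fitness is assumed to be better, as in the Quick Start.

```python
import numpy as np

def lexicase_select_one(fitness, rng):
    """Pick one parent index by standard lexicase (higher is better)."""
    n_individuals, n_cases = fitness.shape
    candidates = np.arange(n_individuals)       # step 2: everyone starts as a candidate
    for case in rng.permutation(n_cases):       # steps 1 and 3: shuffled case order
        scores = fitness[candidates, case]
        candidates = candidates[scores == scores.max()]
        if candidates.size == 1:                # early exit once one individual remains
            return int(candidates[0])
    return int(rng.choice(candidates))          # step 4: random tie-break

rng = np.random.default_rng(0)
fitness = np.array([[10, 5, 8], [8, 9, 6], [6, 7, 9], [4, 3, 7]])
parents = [lexicase_select_one(fitness, rng) for _ in range(5)]
print(parents)
```

Each call draws a fresh case ordering, so repeated calls can return different parents from the same matrix.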

Epsilon Lexicase: Considers individuals within epsilon of the best performance as equally good.
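
As a minimal sketch of the per-case filter, the snippet below keeps every candidate whose score is within `epsilon` of the best. The helper name `epsilon_filter` is hypothetical, and treating the bound as inclusive is an assumption; the library may handle the boundary differently.

```python
import numpy as np

def epsilon_filter(scores, epsilon):
    """Boolean mask of candidates within epsilon of the best score (inclusive)."""
    return scores >= scores.max() - epsilon

scores = np.array([10.0, 9.6, 9.4, 7.0])

strict = epsilon_filter(scores, 0.0)   # behaves like standard lexicase on this case
relaxed = epsilon_filter(scores, 0.5)  # threshold is 9.5, so 9.6 also survives

print(strict.tolist())   # [True, False, False, False]
print(relaxed.tolist())  # [True, True, False, False]
```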

Downsampled Lexicase: Uses only a random subset of test cases, increasing selection diversity.
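
The downsampling step amounts to choosing a subset of columns before filtering. The sketch below shows one way to draw that subset with NumPy; it is illustrative only, and how often the library redraws the subset is assumed from the "per selection" comment in the usage example rather than confirmed.

```python
import numpy as np

rng = np.random.default_rng(0)
fitness = rng.integers(0, 10, size=(6, 8))  # 6 individuals, 8 test cases

downsample_size = 3
# Sample distinct test-case indices without replacement
case_subset = rng.choice(fitness.shape[1], size=downsample_size, replace=False)

# Lexicase filtering would then run on this reduced matrix only
reduced = fitness[:, case_subset]
print(reduced.shape)  # (6, 3)
```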

📈 Performance Tips

  • Use the JAX backend for large matrices and GPU acceleration
  • Downsampled variants examine fewer test cases per selection, so they run faster and often select a more diverse set of parents
  • Set epsilon relative to the scale of your fitness values (typically 0.1-1.0 of the fitness range)
  • Use seeds for reproducible results
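
One data-driven way to choose epsilon is the median absolute deviation (MAD) of the scores on each test case. The library lists automatic MAD-based epsilon as a TODO, so the sketch below is not its implementation: `mad_epsilon` is a hypothetical helper, and whether `epsilon` accepts a per-case array rather than a scalar is not stated here.

```python
import numpy as np

def mad_epsilon(fitness):
    """Median absolute deviation per test case; returns shape (n_cases,)."""
    median = np.median(fitness, axis=0)
    return np.median(np.abs(fitness - median), axis=0)

fitness = np.array([
    [10.0, 5.0],
    [8.0, 9.0],
    [6.0, 7.0],
    [4.0, 3.0],
])

eps = mad_epsilon(fitness)
print(eps.tolist())  # [2.0, 2.0]
```

If only a scalar is supported, a summary such as the median of `eps` could be passed instead.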

📚 Citation

If you use this library in your research, please cite:

@software{lexicase_selection,
  title={Lexicase Selection Library},
  author={Ryan Bahlous-Boldi},
  year={2024},
  url={https://github.com/ryanboldi/lexicase}
}

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

🔗 References

  • Spector, L. (2012). Assessment of problem modality by differential performance of lexicase selection in genetic programming. GECCO.
  • La Cava, W., et al. (2016). Epsilon-lexicase selection for regression. GECCO.
  • Hernandez, J. G., et al. (2019). Random subsampling improves performance in lexicase selection. GECCO.
