Automated co-design of ML predictors and learning-augmented algorithms

license: agpl-3.0
tags:

  • learning-augmented-algorithms
  • automated-algorithm-design
  • crispo
  • code-generation
  • ai-code-generation

Crispo: Autonomous Co-Design of ML Predictors and Learning-Augmented Algorithms

Crispo is a production-ready, research-grade system for the automated co-design of Learning-Augmented Algorithms (LAAs). It turns a high-level objective into a complete, two-part "Solution Package": a machine learning predictor and a specialized algorithm that consumes its predictions.

🎯 System Overview

The core innovation of Crispo is its ability to bridge the gap between machine learning and classical algorithm design. For a given online problem (e.g., ski rental), it generates:

  1. A Predictor Script: An ML model (e.g., ARIMA) that learns from historical data to predict future values and quantifies its own uncertainty.
  2. An Algorithm Script: A Learning-Augmented Algorithm that takes the ML prediction as input and intelligently balances it against a robust worst-case strategy using a trust parameter (λ).
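To make the role of the trust parameter concrete, here is a minimal sketch of a well-known deterministic rule for ski rental from the learning-augmented algorithms literature. It is not Crispo's generated code: the function names are hypothetical, and the convention that a smaller λ means more trust in the prediction is an assumption that may differ from what `--trust-parameter` means in Crispo.

```python
import math


def buy_day(predicted_days: float, buy_cost: float, trust: float) -> int:
    """Return the (1-indexed) day on which to buy skis.

    Assumed convention: trust is lambda in (0, 1]; smaller values follow
    the prediction more closely, lambda = 1 is the classic break-even rule.
    """
    if not 0 < trust <= 1:
        raise ValueError("trust must be in (0, 1]")
    if predicted_days >= buy_cost:
        # Prediction says the season is long: buy early.
        return math.ceil(trust * buy_cost)
    # Prediction says the season is short: delay the purchase.
    return math.ceil(buy_cost / trust)


def algorithm_cost(actual_days: int, buy_cost: float, day_to_buy: int) -> float:
    """Cost of renting (1 per day) until day_to_buy, then buying."""
    if actual_days < day_to_buy:
        return float(actual_days)  # season ended before the purchase
    return (day_to_buy - 1) + buy_cost


def competitive_ratio(actual_days: int, buy_cost: float, day_to_buy: int) -> float:
    """Algorithm cost divided by the offline optimum min(actual_days, buy_cost)."""
    return algorithm_cost(actual_days, buy_cost, day_to_buy) / min(actual_days, buy_cost)
```

With `buy_cost=10` and `trust=0.5`, a prediction of 20 days leads to a purchase on day 5; if the season really lasts 20 days the ratio is 1.4. A wrong prediction of 3 days delays the purchase to day 20, and the ratio for the same season is 2.9, which stays within the 1 + 1/λ worst-case bound of this rule.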

The entire system is designed to be autonomous, optimizing its own components and learning from past performance to improve future solutions.

🏗️ Core Architecture

Crispo is built on a three-tier optimization stack, ensuring a clear separation of concerns:

  1. Genetic Algorithm (Strategic): The GAOptimizer evolves high-level parameters for code generation, searching for the best overall strategy. It now features adaptive population sizing for improved efficiency.
  2. Reinforcement Learning (Tactical): The RLAgent fine-tunes the parameters for a specific layer, using a Q-table to learn optimal, context-aware adjustments. The Q-table is now pruned to prevent unbounded memory growth.
  3. Attention Mechanism (Coordination): The AttentionRouter allows different layers of the generated pipeline to share information, ensuring a cohesive and well-coordinated final output.

This stack feeds into an intent-driven CodeGenerator that selects and parameterizes code templates based on the user's objective.
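As an illustration of the tactical tier, the sketch below shows tabular Q-learning with a size-capped Q-table. It is not the RLAgent's actual implementation: the class name, the hyperparameters, and the "drop the least-visited states" pruning policy are all assumptions chosen for the example.

```python
class PrunedQTable:
    """Tabular Q-learning with a cap on the number of stored states (hypothetical)."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, max_states=1000):
        self.actions = list(actions)
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor
        self.max_states = max_states  # pruning threshold
        self.q = {}                   # state -> {action: value}
        self.visits = {}              # state -> number of updates

    def _row(self, state):
        # Lazily create a zero-initialised row for unseen states.
        if state not in self.q:
            self.q[state] = {a: 0.0 for a in self.actions}
            self.visits[state] = 0
        return self.q[state]

    def update(self, state, action, reward, next_state):
        """Standard one-step Q-learning update."""
        row = self._row(state)
        next_best = max(self._row(next_state).values())
        row[action] += self.alpha * (reward + self.gamma * next_best - row[action])
        self.visits[state] += 1

    def prune(self):
        """Drop the least-visited states until the table fits max_states."""
        excess = len(self.q) - self.max_states
        if excess <= 0:
            return 0
        for state in sorted(self.visits, key=self.visits.get)[:excess]:
            del self.q[state]
            del self.visits[state]
        return excess
```

Calling `prune()` at the end of each episode keeps memory bounded at the cost of forgetting rarely seen contexts; other eviction policies (for example, by recency) would fit the same interface.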

✨ Key Features & Innovations

1. Learning-Augmented Algorithm (LAA) Co-Design

Crispo's primary feature is its end-to-end framework for generating and evaluating LAAs. The system automatically co-designs a predictor and an algorithm that work in tandem.

2. Two-Stage "Live" Evaluation

To ensure solutions are robust, the Verifier performs a rigorous, two-stage evaluation that simulates a real-world deployment:

  • Stage 1: Prediction: The generated predictor is run on historical data to produce a "live" prediction interval.
  • Stage 2: Execution: The generated algorithm is run with the live prediction, and its performance (e.g., competitive_ratio) is measured.

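Because the algorithm is scored on a prediction that the generated predictor actually produced, rather than on a hand-written mock value, this evaluation gives a more realistic picture of how the two parts behave together.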
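The two stages can be sketched for ski rental as follows. This is an illustration only, not the Verifier's code: the normal-approximation interval stands in for a real predictor such as ARIMA, and the function names, the sample history, and the "smaller λ means more trust" convention are assumptions.

```python
import math
import statistics


def predict_interval(history, z=1.96):
    """Stage 1: turn historical season lengths into (low, point, high).

    A mean +/- z * stdev interval is used here as a stand-in predictor.
    """
    point = statistics.fmean(history)
    spread = z * statistics.stdev(history)
    return point - spread, point, point + spread


def run_ski_rental(prediction, actual_days, buy_cost, trust):
    """Stage 2: run a trust-parameterised ski rental rule, return its competitive ratio."""
    if prediction >= buy_cost:
        day_to_buy = math.ceil(trust * buy_cost)   # long season predicted: buy early
    else:
        day_to_buy = math.ceil(buy_cost / trust)   # short season predicted: buy late
    if actual_days < day_to_buy:
        cost = actual_days                         # rented every day, never bought
    else:
        cost = (day_to_buy - 1) + buy_cost         # rented, then bought
    return cost / min(actual_days, buy_cost)


def evaluate(history, actual_days, buy_cost, trust):
    """Chain both stages so the algorithm sees the predictor's real output."""
    low, point, high = predict_interval(history)
    ratio = run_ski_rental(point, actual_days, buy_cost, trust)
    return {"interval": (low, high), "prediction": point, "competitive_ratio": ratio}
```

For a hypothetical history of `[12, 15, 9, 14, 10]` days, the point prediction is 12; with `buy_cost=10`, `trust=0.5`, and an actual season of 13 days, the measured competitive ratio is 1.4.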

3. Solution Registry

Verified solutions are automatically versioned and saved to the solution_registry/ directory. This creates a persistent, queryable knowledge base of high-quality solutions.

Example Query:

python3 crispo.py --query-registry "competitive_ratio:1.2"

4. Meta-Learning with UCB1

The MetaLearner allows Crispo to learn from its own performance. It has been upgraded from a simple epsilon-greedy strategy to an Upper Confidence Bound (UCB1) algorithm, which provides a more principled and efficient balance between exploring new strategies and exploiting known good ones.
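For readers unfamiliar with UCB1, the following is a minimal sketch of the selection rule. It is not the MetaLearner's code: the class name and the idea of treating each strategy as a bandit arm with rewards in [0, 1] are assumptions made for illustration.

```python
import math


class UCB1:
    """Pick the arm with the highest mean reward plus exploration bonus."""

    def __init__(self, arms):
        self.counts = {arm: 0 for arm in arms}    # times each arm was played
        self.totals = {arm: 0.0 for arm in arms}  # summed reward per arm

    def select(self):
        # Play every arm once before applying the confidence bound.
        for arm, n in self.counts.items():
            if n == 0:
                return arm
        total_plays = sum(self.counts.values())

        def score(arm):
            n = self.counts[arm]
            mean = self.totals[arm] / n
            bonus = math.sqrt(2 * math.log(total_plays) / n)
            return mean + bonus

        return max(self.counts, key=score)

    def update(self, arm, reward):
        """Record an observed reward (assumed to lie in [0, 1])."""
        self.counts[arm] += 1
        self.totals[arm] += reward
```

Unlike epsilon-greedy, the exploration bonus shrinks for an arm as it is played more often, so exploration concentrates on strategies whose value is still uncertain.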

⚙️ Component Analysis & Recent Improvements

  • GAOptimizer: Now uses adaptive population sizing to scale its search space based on problem complexity, improving performance. It also evaluates fitness in parallel using a ProcessPoolExecutor.
  • RLAgent: The Q-table is now pruned after each training episode to prevent memory exhaustion in long-running sessions.
  • Verifier: Now includes a PredictorEvaluator that calculates Uncertainty Quantification (UQ) metrics (coverage_rate and interval_sharpness) for the generated predictor, providing a more complete picture of the solution's quality.
  • Security: Subprocess execution is now sandboxed with resource limits to prevent runaway processes, and file writes are validated to prevent directory traversal attacks.
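The two UQ metrics named above can be sketched as follows. The definitions used here (fraction of actual values inside their interval, and mean interval width) are common ones and are assumed for illustration; the PredictorEvaluator may compute them differently.

```python
def coverage_rate(intervals, actuals):
    """Fraction of actual values that fall inside their (low, high) interval."""
    hits = sum(low <= y <= high for (low, high), y in zip(intervals, actuals))
    return hits / len(actuals)


def interval_sharpness(intervals):
    """Mean interval width; smaller means sharper (more informative) intervals."""
    return sum(high - low for low, high in intervals) / len(intervals)
```

The two metrics pull in opposite directions: very wide intervals achieve high coverage but poor sharpness, so they are best read together.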

🚀 Advanced Features

Bayesian Neural Architecture Search (NAS)

The NAS pipeline has been upgraded from random search to Bayesian Optimization, which uses Gaussian Processes to guide the search for neural network architectures. Compared with the previous random search, this is reported to find near-optimal architectures roughly 10x faster.

Federated Optimizer

The placeholder FederatedOptimizer has been replaced with a functional Federated Averaging (FedAvg) implementation, enabling true federated learning across multiple clients.
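The aggregation step at the heart of FedAvg can be sketched in a few lines. This is not the FederatedOptimizer's code: the function name and the flat-list representation of model parameters are assumptions for the example.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size.

    client_weights: one flat list of floats per client, all the same length.
    client_sizes:   number of training examples held by each client.
    """
    if len(client_weights) != len(client_sizes):
        raise ValueError("need one size per client")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]
```

For example, two clients with parameters `[1.0, 2.0]` and `[3.0, 4.0]` holding 1 and 3 examples respectively average to `[2.5, 3.5]`. In a full round, the server would send this result back to the clients for the next stage of local training.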

Transfer Learning

A production-ready, three-step transfer learning pipeline (load_model, apply_model, log_to_registry) is available to transfer knowledge from previously trained models.

Usage

Crispo is a command-line tool. The main entry point is crispo.py.

Basic Example

python3 crispo.py --project "MyDataPipeline" --objective "Fetch data from an API, process it with pandas, and analyze with numpy"

LAA Co-Design Example

To generate a Learning-Augmented Algorithm for the ski rental problem:

python3 crispo.py --project "SkiRentalLAA" --objective "Generate a learning-augmented algorithm for the ski rental problem" --trust-parameter 0.7

Note: This requires a ski_rental_history.csv file in the root directory.

Enabling Advanced Features

python3 crispo.py --objective "Optimize a deep learning model" \
                  --enable-nas \
                  --enable-transfer-learning \
                  --enable-federated-optimization

Saving and Loading Meta-Knowledge

You can persist the MetaLearner's state across runs:

# Save the learned state
python3 crispo.py --objective "My first run" --save-metaknowledge knowledge.pkl

# Load the state for a new run
python3 crispo.py --objective "My second run, building on the first" --load-metaknowledge knowledge.pkl

Licensing

Crispo is licensed under the GNU Affero General Public License v3.0 (AGPLv3). You are free to use, modify, and distribute this software, provided that derivative works, including modified versions made available to users over a network, are also released under the AGPLv3.

For use in a closed-source, proprietary, or commercial application, a separate commercial license is required. Please contact us at crispo.contact@gmail.com to inquire about obtaining a commercial license.

Testing

The project uses the built-in unittest framework. To run the full test suite:

python3 -m unittest test_crispo.py
