Automated co-design of ML predictors and learning-augmented algorithms

Crispo: Autonomous Co-Design of ML Predictors and Learning-Augmented Algorithms

Crispo is a production-ready, research-grade system for the automated co-design of learning-augmented algorithms (LAAs). It turns a high-level objective into a complete, two-part "Solution Package": a machine learning predictor and a specialized algorithm that consumes its predictions.

🎯 System Overview

The core innovation of Crispo is its ability to bridge the gap between machine learning and classical algorithm design. For a given online problem (e.g., ski rental), it generates:

  1. A Predictor Script: An ML model (e.g., ARIMA) that learns from historical data to predict future values and quantifies its own uncertainty.
  2. An Algorithm Script: A Learning-Augmented Algorithm that takes the ML prediction as input and intelligently balances it against a robust worst-case strategy using a trust parameter (λ).
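
To make the role of λ concrete, here is a minimal sketch of a learning-augmented ski rental rule. It assumes the well-known deterministic rule from the learning-augmented algorithms literature (Purohit, Svitkina, and Kumar, 2018), not necessarily the code Crispo generates; the function names are hypothetical. In this formulation λ lies in (0, 1], a smaller λ means more trust in the prediction, and λ = 1 recovers the classical break-even strategy. Crispo's --trust-parameter may use a different convention.

```python
import math


def buy_day(prediction: float, buy_cost: float, lam: float) -> int:
    """Return the (1-indexed) day on which to buy skis.

    Assumption: renting costs 1 per day and buying costs `buy_cost`.
    """
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must be in (0, 1]")
    if prediction >= buy_cost:
        # The predictor expects a long season: buy early.
        return math.ceil(lam * buy_cost)
    # The predictor expects a short season: delay buying as a hedge.
    return math.ceil(buy_cost / lam)


def total_cost(actual_days: int, buy_cost: float, day_to_buy: int) -> float:
    """Cost of renting until `day_to_buy`, then buying at the start of that day."""
    if actual_days < day_to_buy:
        return float(actual_days)  # the season ended before we bought
    return (day_to_buy - 1) + buy_cost


if __name__ == "__main__":
    day = buy_day(prediction=100, buy_cost=10, lam=0.5)
    print(day, total_cost(actual_days=100, buy_cost=10, day_to_buy=day))
```

Under this rule the cost is at most (1 + λ) times optimal when the prediction is exact and at most (1 + 1/λ) times optimal when it is arbitrarily wrong, which is the trade-off the trust parameter controls.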

The entire system is designed to be autonomous, optimizing its own components and learning from past performance to improve future solutions.

🏗️ Core Architecture

Crispo is built on a three-tier optimization stack, ensuring a clear separation of concerns:

  1. Genetic Algorithm (Strategic): The GAOptimizer evolves high-level parameters for code generation, searching for the best overall strategy. It now features adaptive population sizing for improved efficiency.
  2. Reinforcement Learning (Tactical): The RLAgent fine-tunes the parameters for a specific layer, using a Q-table to learn optimal, context-aware adjustments. The Q-table is now pruned to prevent unbounded memory growth.
  3. Attention Mechanism (Coordination): The AttentionRouter allows different layers of the generated pipeline to share information, ensuring a cohesive and well-coordinated final output.
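
The Q-table pruning mentioned for the RLAgent could look roughly like the sketch below. The class name, the size cap, and the pruning policy (keep the entries with the largest absolute Q-value) are all assumptions made for illustration; the source does not say how Crispo decides what to drop.

```python
from typing import Dict, Hashable, Iterable, Tuple

Key = Tuple[Hashable, Hashable]  # (state, action)


class PrunedQTable:
    """Tabular Q-learning with a cap on the number of stored entries (hypothetical)."""

    def __init__(self, alpha: float = 0.1, gamma: float = 0.9, max_entries: int = 1000):
        self.alpha = alpha
        self.gamma = gamma
        self.max_entries = max_entries
        self.q: Dict[Key, float] = {}

    def value(self, state: Hashable, action: Hashable) -> float:
        # Unseen (state, action) pairs default to 0.
        return self.q.get((state, action), 0.0)

    def update(self, state, action, reward: float, next_state,
               next_actions: Iterable[Hashable]) -> None:
        # Standard Q-learning update toward reward + gamma * max_a' Q(s', a').
        best_next = max((self.value(next_state, a) for a in next_actions), default=0.0)
        old = self.value(state, action)
        self.q[(state, action)] = old + self.alpha * (reward + self.gamma * best_next - old)

    def prune(self) -> None:
        # Assumed policy: keep the max_entries entries with the largest |Q|.
        if len(self.q) <= self.max_entries:
            return
        ranked = sorted(self.q.items(), key=lambda kv: abs(kv[1]), reverse=True)
        self.q = dict(ranked[: self.max_entries])
```

Calling prune() at the end of each training episode bounds memory at max_entries entries, at the cost of forgetting low-magnitude estimates.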

This stack feeds into an intent-driven CodeGenerator that selects and parameterizes code templates based on the user's objective.

✨ Key Features & Innovations

1. Learning-Augmented Algorithm (LAA) Co-Design

Crispo's primary feature is its end-to-end framework for generating and evaluating LAAs. The system automatically co-designs a predictor and an algorithm that work in tandem.

2. Two-Stage "Live" Evaluation

To ensure solutions are robust, the Verifier performs a rigorous, two-stage evaluation that simulates a real-world deployment:

  • Stage 1: Prediction: The generated predictor is run on historical data to produce a "live" prediction interval.
  • Stage 2: Execution: The generated algorithm is run with the live prediction, and its performance (e.g., competitive_ratio) is measured.

Because the algorithm consumes the predictor's actual output rather than a mocked prediction, this evaluation gives a more realistic picture of how the pair would behave once deployed.
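
The two stages can be sketched as follows for ski rental. This is an illustration under stated assumptions, not the Verifier's code: the predictor is a naive mean-plus-spread stand-in for a real model such as ARIMA, follow_prediction is a hypothetical algorithm, and the competitive ratio is taken to be the algorithm's cost divided by the optimal offline cost min(actual_days, buy_cost).

```python
import statistics
from typing import Callable, Dict, Sequence, Tuple


def predict_interval(history: Sequence[float], z: float = 1.96) -> Tuple[float, float, float]:
    """Stage 1: return (lower, point, upper) from historical season lengths."""
    point = statistics.fmean(history)
    spread = z * statistics.stdev(history)
    return point - spread, point, point + spread


def competitive_ratio(algorithm_cost: float, actual_days: int, buy_cost: float) -> float:
    """Algorithm cost relative to the optimal offline cost (rent is 1 per day)."""
    return algorithm_cost / min(actual_days, buy_cost)


def follow_prediction(prediction: float, actual_days: int, buy_cost: float) -> float:
    """Hypothetical algorithm: buy on day 1 if a long season is predicted, else only rent."""
    return buy_cost if prediction >= buy_cost else float(actual_days)


def evaluate(history: Sequence[float], actual_days: int, buy_cost: float,
             algorithm: Callable[[float, int, float], float]) -> Dict[str, object]:
    """Stage 2: run the algorithm on the live prediction and score it."""
    lower, point, upper = predict_interval(history)
    cost = algorithm(point, actual_days, buy_cost)
    return {
        "interval": (lower, upper),
        "competitive_ratio": competitive_ratio(cost, actual_days, buy_cost),
    }


if __name__ == "__main__":
    print(evaluate([12, 14, 16], actual_days=20, buy_cost=10, algorithm=follow_prediction))
```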

3. Solution Registry

Verified solutions are automatically versioned and saved to the solution_registry/ directory. This creates a persistent, queryable knowledge base of high-quality solutions.

Example Query:

python3 crispo.py --query-registry "competitive_ratio:1.2"

4. Meta-Learning with UCB1

The MetaLearner allows Crispo to learn from its own performance. It has been upgraded from a simple epsilon-greedy strategy to an Upper Confidence Bound (UCB1) algorithm, which provides a more principled and efficient balance between exploring new strategies and exploiting known good ones.
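
For readers unfamiliar with UCB1, the selection rule can be sketched as below. This is a generic textbook version, not the MetaLearner's implementation; the class name and the strategy labels in the usage note are hypothetical, and rewards are assumed to lie in [0, 1].

```python
import math
from typing import Dict, Hashable, Iterable


class UCB1:
    """Pick the arm maximizing mean reward + sqrt(2 * ln(t) / n_arm)."""

    def __init__(self, arms: Iterable[Hashable]):
        self.counts: Dict[Hashable, int] = {arm: 0 for arm in arms}
        self.totals: Dict[Hashable, float] = {arm: 0.0 for arm in self.counts}

    def select(self) -> Hashable:
        # Try every arm once before using the confidence bound.
        for arm, n in self.counts.items():
            if n == 0:
                return arm
        t = sum(self.counts.values())

        def score(arm: Hashable) -> float:
            n = self.counts[arm]
            mean = self.totals[arm] / n
            bonus = math.sqrt(2.0 * math.log(t) / n)  # shrinks as the arm is tried more
            return mean + bonus

        return max(self.counts, key=score)

    def update(self, arm: Hashable, reward: float) -> None:
        self.counts[arm] += 1
        self.totals[arm] += reward
```

A typical loop calls select(), runs the chosen strategy (for example "template_a" or "template_b"), and passes the observed reward to update(). Unlike epsilon-greedy, exploration here is directed at arms whose estimates are still uncertain rather than chosen uniformly at random.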

⚙️ Component Analysis & Recent Improvements

  • GAOptimizer: Now uses adaptive population sizing, scaling the population with problem complexity to make the search more efficient. It also evaluates fitness in parallel using a ProcessPoolExecutor.
  • RLAgent: The Q-table is now pruned after each training episode to prevent memory exhaustion in long-running sessions.
  • Verifier: Now includes a PredictorEvaluator that calculates Uncertainty Quantification (UQ) metrics (coverage_rate and interval_sharpness) for the generated predictor, providing a more complete picture of the solution's quality.
  • Security: Subprocess execution is now sandboxed with resource limits to prevent runaway processes, and file writes are validated to prevent directory traversal attacks.
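
The two UQ metrics named for the PredictorEvaluator can be illustrated as follows. The definitions here are the common ones and are assumptions about Crispo's behavior: coverage_rate is the fraction of actual values that fall inside their prediction interval, and interval_sharpness is the mean interval width (smaller is sharper).

```python
from typing import Sequence


def coverage_rate(actuals: Sequence[float], lowers: Sequence[float],
                  uppers: Sequence[float]) -> float:
    """Fraction of actual values inside [lower, upper]."""
    if not (len(actuals) == len(lowers) == len(uppers)) or not actuals:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(lo <= y <= hi for y, lo, hi in zip(actuals, lowers, uppers))
    return hits / len(actuals)


def interval_sharpness(lowers: Sequence[float], uppers: Sequence[float]) -> float:
    """Mean width of the prediction intervals."""
    if len(lowers) != len(uppers) or not lowers:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(hi - lo for lo, hi in zip(lowers, uppers)) / len(lowers)


if __name__ == "__main__":
    actuals = [5, 12, 20]
    lowers = [4, 8, 10]
    uppers = [6, 11, 25]
    print(coverage_rate(actuals, lowers, uppers))   # 2 of 3 values are covered
    print(interval_sharpness(lowers, uppers))       # mean of widths 2, 3, 15
```

The two metrics pull in opposite directions: very wide intervals achieve high coverage but poor sharpness, so they are best read together.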

🚀 Advanced Features

Bayesian Neural Architecture Search (NAS)

The NAS pipeline has been upgraded from random search to Bayesian optimization, using Gaussian processes to guide the search for neural network architectures. The project reports a roughly 10x speedup in finding near-optimal architectures compared with the previous random search.

Federated Optimizer

The placeholder FederatedOptimizer has been replaced with a functional Federated Averaging (FedAvg) implementation, enabling true federated learning across multiple clients.
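
The aggregation step at the heart of FedAvg is a sample-weighted average of client parameters. The sketch below shows only that step, with parameters as flat lists of floats; the function name and signature are hypothetical and do not describe the FederatedOptimizer's actual interface.

```python
from typing import List, Sequence


def fed_avg(client_weights: Sequence[Sequence[float]],
            client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local dataset size."""
    if len(client_weights) != len(client_sizes) or not client_weights:
        raise ValueError("need one dataset size per client")
    n_params = len(client_weights[0])
    if any(len(w) != n_params for w in client_weights):
        raise ValueError("all clients must share the same parameter shape")
    total = sum(client_sizes)
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


if __name__ == "__main__":
    # Client 2 holds three times as much data, so it dominates the average.
    print(fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In a full round, the server would broadcast the averaged parameters back to the clients, each client would train locally for a few steps, and the process would repeat.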

Transfer Learning

A production-ready, three-step transfer learning pipeline (load_model, apply_model, log_to_registry) is available to transfer knowledge from previously trained models.

Usage

Crispo is a command-line tool. The main entry point is crispo.py.

Basic Example

python3 crispo.py --project "MyDataPipeline" --objective "Fetch data from an API, process it with pandas, and analyze with numpy"

LAA Co-Design Example

To generate a Learning-Augmented Algorithm for the ski rental problem:

python3 crispo.py --project "SkiRentalLAA" --objective "Generate a learning-augmented algorithm for the ski rental problem" --trust-parameter 0.7

Note: This requires a ski_rental_history.csv file in the root directory.

Enabling Advanced Features

python3 crispo.py --objective "Optimize a deep learning model" \
                  --enable-nas \
                  --enable-transfer-learning \
                  --enable-federated-optimization

Saving and Loading Meta-Knowledge

You can persist the MetaLearner's state across runs:

# Save the learned state
python3 crispo.py --objective "My first run" --save-metaknowledge knowledge.pkl

# Load the state for a new run
python3 crispo.py --objective "My second run, building on the first" --load-metaknowledge knowledge.pkl

Testing

The project uses the built-in unittest framework. To run the full test suite:

python3 -m unittest test_crispo.py
