Framework for machine and deep learning, with regression, classification and time series analysis

🐸 LeCrapaud

An all-in-one machine learning framework


LeCrapaud is a high-level Python library for end-to-end machine learning on tabular and time series data. It handles feature engineering, model selection, training, and prediction in one command.

Key Features

  • 🔄 End-to-end ML pipeline — feature engineering, preprocessing, feature selection, hyperparameter optimization, and training in a single fit() call
  • 🤖 11+ models — from Linear Regression to XGBoost, LightGBM, CatBoost, and deep learning architectures (LSTM, GRU, TCN, Transformer)
  • 🎯 Automated feature selection — ensemble of 10+ methods (Chi2, ANOVA, Mutual Information, SHAP, RFE, etc.)
  • Hyperparameter optimization — HyperOpt (TPE) and Ray Tune with cross-validation support
  • 🔍 Explainability — built-in SHAP, LIME, feature importance, and tree visualization
  • 🗄️ Experiment tracking — every experiment is stored in the database (PostgreSQL or MySQL) with full reproducibility
  • 🧩 Modular — use the full pipeline or individual components (FeatureEngineer, FeaturePreprocessor, FeatureSelector) in sklearn-compatible pipelines
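
To make the idea of ensemble feature selection with voting concrete, here is a minimal standard-library sketch. It is not LeCrapaud's implementation: the three scorers (absolute Pearson correlation, a one-way ANOVA F-statistic, and plain variance), the toy data, and the helper names `top_k` and `vote` are all assumptions chosen for illustration.

```python
from collections import Counter
from statistics import mean, pvariance


def abs_pearson(xs, ys):
    """Absolute Pearson correlation between a feature and a numeric target."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    norm = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return abs(cov / norm) if norm else 0.0


def anova_f(xs, ys):
    """One-way ANOVA F-statistic of a feature grouped by class label."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    grand = mean(xs)
    means = {label: mean(values) for label, values in groups.items()}
    ss_between = sum(len(v) * (means[k] - grand) ** 2 for k, v in groups.items())
    ss_within = sum((x - means[k]) ** 2 for k, v in groups.items() for x in v)
    df_between, df_within = len(groups) - 1, len(xs) - len(groups)
    if df_between == 0 or df_within == 0:
        return 0.0
    if ss_within == 0:
        return float("inf") if ss_between else 0.0
    return (ss_between / df_between) / (ss_within / df_within)


def top_k(features, target, scorer, k):
    """Names of the k features with the highest score for one method."""
    scores = {name: scorer(values, target) for name, values in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


def vote(selections, min_votes):
    """Keep features picked by at least min_votes selection methods."""
    counts = Counter(name for selected in selections for name in selected)
    return sorted(name for name, n in counts.items() if n >= min_votes)


# Toy data (hypothetical): one informative, one weak, one noisy feature.
features = {
    "signal": [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],
    "weak": [1.0, 2.0, 3.0, 2.0, 3.0, 4.0],
    "noise": [5.0, 1.0, 3.0, 3.0, 5.0, 1.0],
}
target = [0, 0, 0, 1, 1, 1]

scorers = [abs_pearson, anova_f, lambda xs, ys: pvariance(xs)]
selections = [top_k(features, target, scorer, k=2) for scorer in scorers]
selected = vote(selections, min_votes=2)
print(selected)  # ['signal', 'weak']
```

The variance scorer alone would keep the noisy feature, but it is outvoted by the two supervised scorers, which is the point of combining several methods.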

Why LeCrapaud?

Most ML tools solve one piece of the puzzle. LeCrapaud handles the entire workflow in a single fit() call.

| Capability | LeCrapaud | MLflow | scikit-learn | Auto-sklearn / TPOT |
| --- | --- | --- | --- | --- |
| Feature engineering | ✅ Automated (Fourier dates, target encoding, imputation) | ❌ Manual | ❌ Manual | ❌ Generic only |
| Feature selection | ✅ Ensemble of 10+ methods with voting | ❌ Manual | ❌ One method at a time | ⚠️ Implicit |
| Hyperparameter optimization | ✅ HyperOpt + Ray Tune | ❌ Manual | ⚠️ GridSearchCV | ✅ Built-in |
| Multi-target support | ✅ Native (regression + classification) | | | |
| Deep learning models | ✅ LSTM, GRU, TCN, Transformer | | ⚠️ MLP only | |
| Time series support | ✅ Fourier features, temporal CV, RNNs | | ⚠️ Basic | |
| Explainability | ✅ SHAP + LIME + feature importance | | ⚠️ Feature importance only | |
| Experiment tracking | ✅ Full artifacts in PostgreSQL/MySQL | ✅ Tracking server | | |
| Reproducibility | ✅ Reload any experiment with get(id=...) | ⚠️ | | |
| sklearn compatibility | ✅ fit/transform pattern | | ✅ Native | |
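
The "Fourier dates" entry in the comparison refers to encoding calendar fields as sine/cosine pairs so that cyclic neighbours (December and January, Sunday and Monday) stay close in feature space. The sketch below shows the general technique with the standard library only. The function names and the choice of month and weekday are assumptions, not LeCrapaud's actual feature set.

```python
import math
from datetime import date


def cyclical_encode(value, period):
    """Map a value in [0, period) to a point on the unit circle."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def date_features(d):
    """Sine/cosine features for the month and weekday of a date."""
    month_sin, month_cos = cyclical_encode(d.month - 1, 12)
    dow_sin, dow_cos = cyclical_encode(d.weekday(), 7)
    return {
        "month_sin": month_sin,
        "month_cos": month_cos,
        "dow_sin": dow_sin,
        "dow_cos": dow_cos,
    }


# December and January end up as close as any two adjacent months,
# whereas the raw integers 12 and 1 would be far apart.
december = cyclical_encode(11, 12)
january = cyclical_encode(0, 12)
june = cyclical_encode(5, 12)
print(math.dist(december, january) < math.dist(january, june))  # True
print(date_features(date(2024, 1, 1)))
```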

In short:

  • MLflow tracks experiments but doesn't train models or engineer features — you still write all the ML code yourself
  • scikit-learn provides building blocks but requires manual pipeline composition, has no experiment tracking, and offers limited model support
  • AutoML tools (auto-sklearn, TPOT) automate model selection but act as black boxes with no feature engineering transparency, no explainability, and no time series support
  • LeCrapaud combines automated feature engineering, ensemble feature selection, hyperparameter optimization, multi-target training, explainability, and experiment tracking — all in one fit() call, while remaining transparent and customizable
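
Temporal cross-validation, listed under time series support above, differs from ordinary k-fold in that every training window must precede its test window. A minimal expanding-window sketch, assuming rows are already sorted by time, could look like this. The function name and parameters are hypothetical and say nothing about how LeCrapaud splits data internally.

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Each fold trains on everything before its test window, so the
    model never sees observations from the future.
    """
    if n_splits * test_size >= n_samples:
        raise ValueError("not enough samples for the requested folds")
    for i in range(n_splits):
        test_end = n_samples - (n_splits - 1 - i) * test_size
        test_start = test_end - test_size
        yield list(range(test_start)), list(range(test_start, test_end))


for train_idx, test_idx in expanding_window_splits(n_samples=10, n_splits=3, test_size=2):
    print(len(train_idx), test_idx)
# 4 [4, 5]
# 6 [6, 7]
# 8 [8, 9]
```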

Prerequisites

  • Python 3.12 (strictly required)
  • PostgreSQL or MySQL database for experiment storage
  • macOS only: libomp for LightGBM/XGBoost:
    brew install libomp
    

Installation

📦 From PyPI (recommended)

Install the latest stable release:

pip install lecrapaud

Or pin a specific version:

pip install lecrapaud==0.31.7

🔧 From source

Install the latest development version directly from GitHub:

pip install git+https://github.com/PierreGallet/lecrapaud.git

Or clone the repository and install locally:

git clone https://github.com/PierreGallet/lecrapaud.git
cd lecrapaud
pip install .

Quick Start

from lecrapaud import LeCrapaud

LeCrapaud.set_uri("mysql+pymysql://user:password@host:port/dbname")

lc = LeCrapaud(
    experiment_name="my_experiment",
    target_numbers=[1],
    target_clf=[1],
    models_idx=["lgb", "xgb"],
)

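# `data` and `new_data` are placeholders for your own training and inference datasets
lc.fit(data)
predictions = lc.predict(new_data)
# Evaluation scores (available when new_data has TARGET columns):
# lc.regression_scores / lc.classification_scores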
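
The string passed to set_uri above has the shape of a SQLAlchemy-style database URL. Passwords that contain characters such as `@` or `/` need to be percent-encoded before they go into such a URL. The helper below is a hypothetical convenience, not part of LeCrapaud, and the credentials shown are made up.

```python
from urllib.parse import quote_plus


def build_db_uri(user, password, host, port, dbname, scheme="mysql+pymysql"):
    """Assemble a database URL, percent-encoding the credentials."""
    return f"{scheme}://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{dbname}"


uri = build_db_uri("ml_user", "p@ss/word", "localhost", 3306, "lecrapaud")
print(uri)  # mysql+pymysql://ml_user:p%40ss%2Fword@localhost:3306/lecrapaud

# The result would then be handed to the call shown in the Quick Start:
# LeCrapaud.set_uri(uri)
```

In practice the credentials would usually come from environment variables or a secrets manager rather than being written in source code.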

Documentation

Full documentation is available at lecrapaud.pierregallet.com.

Contributing

Contributions are welcome! Here's how to get started.

Development Setup

git clone https://github.com/PierreGallet/lecrapaud.git
cd lecrapaud
python3.12 -m venv .venv
source .venv/bin/activate
make install

Workflow

  1. Open an issue first to discuss the change you'd like to make
  2. Fork the repo and create a branch from main:
    • feat/your-feature for new features
    • fix/your-bugfix for bug fixes
    • docs/your-change for documentation
  3. Write or update tests when changing behavior
  4. Run the test suite before submitting:
    make test
    
  5. Open a Pull Request against main with a clear description

Commit Convention

We use Conventional Commits. Every commit message and PR title must follow this format:

type: short description

| Type | Usage |
| --- | --- |
| feat: | New feature |
| fix: | Bug fix |
| docs: | Documentation only |
| refactor: | Code change that neither fixes a bug nor adds a feature |
| test: | Adding or updating tests |
| perf: | Performance improvement |
| ci: | CI/CD changes |
| chore: | Maintenance tasks |

Examples:

feat: add catboost model support
fix: handle missing target column in predict
docs: update getting started guide
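
A commit title can be checked against this format before pushing. The sketch below is a hypothetical local helper, not a tool shipped with the repository. It accepts only the `type: short description` shape and the types listed above, so optional Conventional Commits extras such as scopes are deliberately rejected.

```python
import re

COMMIT_TYPES = ("feat", "fix", "docs", "refactor", "test", "perf", "ci", "chore")
# A known type, a colon, one space, then a non-empty description.
COMMIT_PATTERN = re.compile(r"(?:%s): \S.*" % "|".join(COMMIT_TYPES))


def is_valid_commit_title(message):
    """Return True if the first line of the message follows the convention."""
    lines = message.splitlines()
    return bool(lines) and COMMIT_PATTERN.fullmatch(lines[0]) is not None


print(is_valid_commit_title("feat: add catboost model support"))  # True
print(is_valid_commit_title("feature: add catboost model support"))  # False
```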

Guidelines

  • Keep PRs focused and small — one concern per PR
  • Update documentation when APIs change
  • Follow the existing code style
  • All tests must pass before merging

License

LeCrapaud is licensed under the Apache License 2.0. You are free to use, modify, and distribute this software in compliance with the license terms.


Pierre Gallet 2025

Download files

Source Distribution

lecrapaud-2.3.5.tar.gz (234.0 kB view details)

Uploaded Source

Built Distribution

lecrapaud-2.3.5-py3-none-any.whl (278.5 kB view details)

Uploaded Python 3

File details

Details for the file lecrapaud-2.3.5.tar.gz.

File metadata

  • Download URL: lecrapaud-2.3.5.tar.gz
  • Upload date:
  • Size: 234.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.15

File hashes

Hashes for lecrapaud-2.3.5.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | c277f7579b8da59f36f0321ba979aed191755d70bc5956e45f07ee5b55955dcb |
| MD5 | c509a6aba5ea689ca1ce388ffb306591 |
| BLAKE2b-256 | f042e0b33da70693c160b8fd7504fa5dea7e03281f6880c4cbb413b9471e6e09 |

File details

Details for the file lecrapaud-2.3.5-py3-none-any.whl.

File metadata

  • Download URL: lecrapaud-2.3.5-py3-none-any.whl
  • Upload date:
  • Size: 278.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.15

File hashes

Hashes for lecrapaud-2.3.5-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 6a7eaa00bcb6bb8612cea89106506b5a182daa09fcaa3c826b06718059773d23 |
| MD5 | e1e1a63f9f0fc4382e5e4dcc59460d5c |
| BLAKE2b-256 | 54469eb1d946c83f82b04b56e726aa93f10ff5658a38a89ea14b74478ae6f130 |
