Framework for machine and deep learning, with regression, classification and time series analysis
LeCrapaud is a high-level Python library for end-to-end machine learning on tabular and time series data. It handles feature engineering, model selection, training, and prediction in one command.
Key Features
- 🔄 End-to-end ML pipeline — feature engineering, preprocessing, feature selection, hyperparameter optimization, and training in a single fit() call
- 🤖 11+ models — from Linear Regression to XGBoost, LightGBM, CatBoost, and deep learning architectures (LSTM, GRU, TCN, Transformer)
- 🎯 Automated feature selection — ensemble of 10+ methods (Chi2, ANOVA, Mutual Information, SHAP, RFE, etc.)
- ⚡ Hyperparameter optimization — HyperOpt (TPE) and Ray Tune with cross-validation support
- 🔍 Explainability — built-in SHAP, LIME, feature importance, and tree visualization
- 🗄️ Experiment tracking — every experiment is stored in the database (PostgreSQL or MySQL) with full reproducibility
- 🧩 Modular — use the full pipeline or individual components (FeatureEngineer, FeaturePreprocessor, FeatureSelector) in sklearn-compatible pipelines
Why LeCrapaud?
Most ML tools solve one piece of the puzzle. LeCrapaud handles the entire workflow in a single fit() call.
| | LeCrapaud | MLflow | scikit-learn | Auto-sklearn / TPOT |
|---|---|---|---|---|
| Feature engineering | ✅ Automated (Fourier dates, target encoding, imputation) | ❌ Manual | ❌ Manual | ❌ Generic only |
| Feature selection | ✅ Ensemble of 10+ methods with voting | ❌ Manual | ❌ One method at a time | ⚠️ Implicit |
| Hyperparameter optimization | ✅ HyperOpt + Ray Tune | ❌ Manual | ⚠️ GridSearchCV | ✅ Built-in |
| Multi-target support | ✅ Native (regression + classification) | ❌ | ❌ | ❌ |
| Deep learning models | ✅ LSTM, GRU, TCN, Transformer | ❌ | ⚠️ MLP only | ❌ |
| Time series support | ✅ Fourier features, temporal CV, RNNs | ❌ | ⚠️ Basic | ❌ |
| Explainability | ✅ SHAP + LIME + feature importance | ❌ | ⚠️ Feature importance only | ❌ |
| Experiment tracking | ✅ Full artifacts in PostgreSQL/MySQL | ✅ Tracking server | ❌ | ❌ |
| Reproducibility | ✅ Reload any experiment with get(id=...) | ✅ | ❌ | ⚠️ |
| sklearn compatibility | ✅ fit/transform pattern | ❌ | ✅ Native | ✅ |
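The table mentions Fourier date features. As a rough illustration of the general technique (not LeCrapaud's actual code; `fourier_date_features` and `month_distance` are hypothetical helpers), a cyclical date component can be mapped onto a sine/cosine pair so that December and January end up close together instead of eleven units apart:

```python
import math
from datetime import date


def fourier_date_features(d: date) -> dict[str, float]:
    """Encode month and weekday as points on the unit circle."""
    month_angle = 2 * math.pi * (d.month - 1) / 12
    weekday_angle = 2 * math.pi * d.weekday() / 7
    return {
        "month_sin": math.sin(month_angle),
        "month_cos": math.cos(month_angle),
        "weekday_sin": math.sin(weekday_angle),
        "weekday_cos": math.cos(weekday_angle),
    }


def month_distance(a: dict[str, float], b: dict[str, float]) -> float:
    """Euclidean distance between two dates in the encoded month space."""
    return math.hypot(a["month_sin"] - b["month_sin"], a["month_cos"] - b["month_cos"])


december = fourier_date_features(date(2025, 12, 15))
january = fourier_date_features(date(2026, 1, 15))
july = fourier_date_features(date(2025, 7, 15))

# December sits next to January on the circle, and far from July.
print(month_distance(december, january) < month_distance(december, july))
```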
In short:
- MLflow tracks experiments but doesn't train models or engineer features — you still write all the ML code yourself
- scikit-learn provides building blocks but requires manual pipeline composition, offers no experiment tracking, and has limited model support
- AutoML tools (auto-sklearn, TPOT) automate model selection but act as black boxes with no feature engineering transparency, no explainability, and no time series support
- LeCrapaud combines automated feature engineering, ensemble feature selection, hyperparameter optimization, multi-target training, explainability, and experiment tracking, all in one fit() call, while remaining transparent and customizable
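Time series support in the comparison above includes temporal cross-validation. The sketch below shows one common form of it, an expanding-window split in which training data always precedes test data. It is an illustration under stated assumptions, not LeCrapaud's splitting logic; `expanding_window_splits` is a hypothetical helper.

```python
def expanding_window_splits(n_samples: int, n_splits: int):
    """Yield (train_indices, test_indices) where training always precedes testing."""
    fold_size = n_samples // (n_splits + 1)
    if fold_size == 0:
        raise ValueError("not enough samples for the requested number of splits")
    for i in range(1, n_splits + 1):
        train_end = i * fold_size
        # The last fold absorbs any leftover samples.
        test_end = train_end + fold_size if i < n_splits else n_samples
        yield list(range(train_end)), list(range(train_end, test_end))


splits = list(expanding_window_splits(n_samples=10, n_splits=3))
for train_idx, test_idx in splits:
    print(train_idx, test_idx)
```

Unlike a shuffled k-fold split, no fold ever trains on observations that come after the ones it is evaluated on.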
Prerequisites
- Python 3.12 (strictly required)
- PostgreSQL or MySQL database for experiment storage
- macOS only — libomp for LightGBM/XGBoost:
brew install libomp
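Because the Python version requirement is strict, it can be worth checking the interpreter before installing. A minimal sketch (the `is_supported` helper is hypothetical and not part of LeCrapaud):

```python
import sys

REQUIRED = (3, 12)


def is_supported(version_info=sys.version_info) -> bool:
    """Return True only for Python 3.12.x."""
    return tuple(version_info[:2]) == REQUIRED


if not is_supported():
    found = ".".join(str(part) for part in sys.version_info[:3])
    print(f"Python 3.12 is required, found {found}")
```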
Installation
pip install lecrapaud
Quick Start
from lecrapaud import LeCrapaud

# Point LeCrapaud at the experiment database (PostgreSQL or MySQL)
LeCrapaud.set_uri("mysql+pymysql://user:password@host:port/dbname")

lc = LeCrapaud(
    experiment_name="my_experiment",
    target_numbers=[1],
    target_clf=[1],
    models_idx=["lgb", "xgb"],
)

# Run the full pipeline, then predict on unseen data
lc.fit(data)
predictions, scores_reg, scores_clf = lc.predict(new_data)
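The database URI in the Quick Start embeds credentials directly. One possible way to avoid hard-coding them is to assemble the string from environment variables and percent-encode the credentials, since characters such as `@` or `/` in a password would otherwise break the URI. The helper and the environment variable names below are assumptions made for this sketch, not part of LeCrapaud:

```python
import os
from urllib.parse import quote_plus


def build_db_uri(user, password, host, port, dbname, driver="mysql+pymysql"):
    """Assemble a database URI, escaping special characters in the credentials."""
    return f"{driver}://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{dbname}"


# Hypothetical environment variable names; adapt them to your deployment.
uri = build_db_uri(
    user=os.environ.get("LECRAPAUD_DB_USER", "user"),
    password=os.environ.get("LECRAPAUD_DB_PASSWORD", "p@ss/word"),
    host=os.environ.get("LECRAPAUD_DB_HOST", "localhost"),
    port=int(os.environ.get("LECRAPAUD_DB_PORT", "3306")),
    dbname=os.environ.get("LECRAPAUD_DB_NAME", "lecrapaud"),
)
# The resulting string could then be passed to LeCrapaud.set_uri(uri).
```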
Documentation
Full documentation is available at lecrapaud.pierregallet.com
Contributing
Contributions are welcome! Here's how to get started.
Development Setup
git clone https://github.com/PierreGallet/lecrapaud.git
cd lecrapaud
python3.12 -m venv .venv
source .venv/bin/activate
make install
Workflow
- Open an issue first to discuss the change you'd like to make
- Fork the repo and create a branch from main:
  - feat/your-feature for new features
  - fix/your-bugfix for bug fixes
  - docs/your-change for documentation
- Write or update tests when changing behavior
- Run the test suite before submitting:
make test
- Open a Pull Request against main with a clear description
Commit Convention
We use Conventional Commits. Every commit message and PR title must follow this format:
type: short description
| Type | Usage |
|---|---|
| feat: | New feature |
| fix: | Bug fix |
| docs: | Documentation only |
| refactor: | Code change that neither fixes a bug nor adds a feature |
| test: | Adding or updating tests |
| perf: | Performance improvement |
| ci: | CI/CD changes |
| chore: | Maintenance tasks |
Examples:
feat: add catboost model support
fix: handle missing target column in predict
docs: update getting started guide
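A commit message can be checked against this format before pushing. The sketch below is a hypothetical helper, not a tool shipped with the project. It only covers the `type: short description` form and the types listed in the table above; the wider Conventional Commits specification also allows optional scopes, which this pattern does not accept.

```python
import re

# Types taken from the table above.
COMMIT_TYPES = ("feat", "fix", "docs", "refactor", "test", "perf", "ci", "chore")
COMMIT_PATTERN = re.compile(rf"^({'|'.join(COMMIT_TYPES)}): \S.*$")


def is_valid_commit_message(message: str) -> bool:
    """Check the first line of a commit message against `type: short description`."""
    lines = message.splitlines()
    first_line = lines[0] if lines else ""
    return COMMIT_PATTERN.match(first_line) is not None


for candidate in ("feat: add catboost model support", "added some stuff"):
    print(candidate, "->", is_valid_commit_message(candidate))
```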
Guidelines
- Keep PRs focused and small — one concern per PR
- Update documentation when APIs change
- Follow the existing code style
- All tests must pass before merging
License
LeCrapaud is licensed under the Apache License 2.0. You are free to use, modify, and distribute this software in compliance with the license terms.
Pierre Gallet 2025
File details
Details for the file lecrapaud-0.31.5.tar.gz.
File metadata
- Download URL: lecrapaud-0.31.5.tar.gz
- Upload date:
- Size: 173.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a7a5f01206843ccc263ecb7d519e70891b214bf7f0eb09f517577a41cfeb217d |
| MD5 | fe27a327bfd38ea234906a271449d287 |
| BLAKE2b-256 | 6f10a2bf3721daee266703695f2baa5f388a0cc9d2c51ae44578d846ef09bd07 |
File details
Details for the file lecrapaud-0.31.5-py3-none-any.whl.
File metadata
- Download URL: lecrapaud-0.31.5-py3-none-any.whl
- Upload date:
- Size: 209.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 645b8b5f20df2b811f794ebea57bb3bd131ff436216dfd84a2a8461b7cbe25a7 |
| MD5 | 383ffe4d27f0bbf4ccf3c19b725c8e58 |
| BLAKE2b-256 | 1298e83f0984b5655197538e14cc727dd74c981e32275ff64b9a0e87c88bec3c |
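The SHA256 digests listed above can be used to verify a downloaded file. A minimal sketch using only the standard library; the helper names are hypothetical, and the local file path in the usage comment assumes the archive was saved to the current directory:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage, assuming the source distribution sits in the current directory:
# expected = "a7a5f01206843ccc263ecb7d519e70891b214bf7f0eb09f517577a41cfeb217d"
# matches_published_hash(Path("lecrapaud-0.31.5.tar.gz"), expected)
```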