
The complete ML toolkit: EDA, cleaning, training, explainability, deployment

mlpilot 🚀

Python 3.9+ · License: MIT

mlpilot is the complete Python ML toolkit: what currently takes 30–40 hours of repetitive boilerplate takes 5–10 minutes. One import. Every tool you need. Full explainability.

import mlpilot as ml

eda    = ml.analyze(df, target='churn')          # 12-section EDA report
clean  = ml.clean(df, target='churn')            # auto null/outlier/dtype fixing
feats  = ml.features(clean.df, target='churn')  # leakage-safe feature pipeline
board  = ml.baseline(X_train, y_train)           # 15+ model leaderboard in 2 min
tuned  = ml.tune('lgbm', X_train, y_train)       # Bayesian hyperparameter search
exp    = ml.explain(tuned.best_model, X_train)   # SHAP global + local explanations
api    = ml.deploy(tuned.best_model)             # FastAPI + Docker in 5 minutes
api.serve(port=8000)                              # → localhost:8000/predict
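
Once the server from the last line is running, a client could call the `/predict` route over HTTP. The sketch below uses only the standard library. The JSON payload shape (`{"instances": [...]}`), the feature names, and the `build_predict_request` helper are assumptions for illustration, not mlpilot's documented schema, so check the generated API for the real request format.

```python
import json
import urllib.request


def build_predict_request(rows, url="http://localhost:8000/predict"):
    """Build a JSON POST request for a prediction endpoint.

    The {"instances": [...]} envelope is a hypothetical payload shape.
    """
    body = json.dumps({"instances": rows}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical feature names for a churn model.
request = build_predict_request([{"tenure": 12, "monthly_charges": 70.5}])

# Sending requires the server to be running, so it is left commented out:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(json.loads(response.read()))
```

Building the request separately from sending it makes it possible to inspect the payload before anything goes over the network.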

Why mlpilot?

| Feature | mlpilot | ydata-profiling | sweetviz | PyCaret | SHAP |
|---|---|---|---|---|---|
| Smart EDA report | ✅ | ✅ | ✅ | ❌ | ❌ |
| Auto data cleaning | ✅ | ❌ | ❌ | Partial | ❌ |
| Multi-model baseline | ✅ | ❌ | ❌ | ✅ | ❌ |
| Hyperparameter tuning | ✅ | ❌ | ❌ | ✅ | ❌ |
| Model explainability | ✅ | ❌ | ❌ | ❌ | ✅ |
| Time series | ✅ | ❌ | ❌ | ✅ | ❌ |
| NLP pipeline | ✅ | ❌ | ❌ | ✅ | ❌ |
| API deployment | ✅ | ❌ | ❌ | ❌ | ❌ |
| AI data analyst | ✅ | ❌ | ❌ | ❌ | ❌ |
| Undo / diff reports | ✅ | ❌ | ❌ | ❌ | ❌ |

Installation

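# Core (EDA, cleaning, validation, features, training)
# Note: the package installs as mlplt but is imported as mlpilot
pip install mlplt

# With specific extras (quoted so shells such as zsh do not expand the brackets)
pip install "mlplt[xgb,lgbm,shap,optuna]"

# Everything
pip install "mlplt[full]"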

Available extras: xgb, lgbm, shap, optuna, prophet, nlp, imb, deploy, ai, full
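
After installing, it can be useful to confirm which optional dependencies are actually importable. The sketch below does this with the standard library. The mapping from extra name to module name is an assumption inferred from the extra names, not taken from mlpilot's packaging metadata, and `available_extras` is an illustrative helper.

```python
from importlib.util import find_spec

# Assumed mapping from extra name to the top-level module it would install.
# These pairings are guesses based on the extra names.
ASSUMED_EXTRA_MODULES = {
    "xgb": "xgboost",
    "lgbm": "lightgbm",
    "shap": "shap",
    "optuna": "optuna",
    "prophet": "prophet",
}


def available_extras(mapping=ASSUMED_EXTRA_MODULES):
    """Return {extra_name: bool} for whether each module can be imported."""
    return {
        extra: find_spec(module) is not None
        for extra, module in mapping.items()
    }


if __name__ == "__main__":
    for extra, ok in available_extras().items():
        print(f"{extra:8s} {'installed' if ok else 'missing'}")
```

`find_spec` only locates the module without importing it, so the check stays fast even for heavy libraries.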

Modules

| Module | Function | Description |
|---|---|---|
| SmartEDA | ml.analyze(df) | 12-section EDA report with plots |
| AutoCleaner | ml.clean(df) | Auto null/outlier/dtype fixing with undo |
| DataValidator | ml.validate(df) | Schema, leakage, drift detection |
| FeatureForge | ml.features(df) | Leakage-safe encoding + scaling pipeline |
| BaselineBlitz | ml.baseline(X, y) | 15+ model comparison leaderboard |
| EvalSuite | ml.evaluate(model, X, y) | All metrics + diagnostic plots |
| HyperX | ml.tune(model, X, y) | Bayesian hyperparameter optimization |
| Explainer | ml.explain(model, X) | SHAP global + local + what-if |
| BalanceKit | ml.balance(X, y) | Auto SMOTE/ADASYN/class_weight |
| TimeSense | ml.forecast(df) | Multi-model time series forecasting |
| TextML | ml.text_classify(df) | NLP classification + embeddings |
| LaunchPad | ml.deploy(model) | FastAPI + Docker generation |
| AIAnalyst | ml.analyst(df) | Ask questions in plain English |
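
The feature pipeline is described as leakage-safe. The usual meaning is that preprocessing statistics are learned from the training split only and then reused unchanged on held-out data. The sketch below illustrates that idea with a hand-written standard scaler in plain Python. `fit_scaler` and `apply_scaler` are illustrative names, and this is not a description of how FeatureForge is implemented.

```python
from statistics import fmean, pstdev


def fit_scaler(train_values):
    """Learn mean and standard deviation from the training split only."""
    mean = fmean(train_values)
    std = pstdev(train_values) or 1.0  # fall back to 1.0 for constant columns
    return mean, std


def apply_scaler(values, params):
    """Standardize values using previously fitted parameters."""
    mean, std = params
    return [(v - mean) / std for v in values]


train = [10.0, 20.0, 30.0]
test = [40.0]

params = fit_scaler(train)                 # statistics come from train only
train_scaled = apply_scaler(train, params)
test_scaled = apply_scaler(test, params)   # test is transformed, never fitted
```

Fitting on the combined data instead would let the test value shift the mean and standard deviation, which is the leak this pattern avoids.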

Quick Start: Churn Prediction

import mlpilot as ml
import pandas as pd

df = pd.read_csv('churn.csv')

# 1. Understand your data
eda = ml.analyze(df, target='Churn', report_format='html')

# 2. Clean it
df_clean = ml.clean(df, target='Churn').df

# 3. Engineer features (leakage-safe)
feats = ml.features(df_clean, target='Churn')
X_train, X_test, y_train, y_test = ml.split(feats, test_size=0.2, stratify=True)

# 4. Handle imbalance
bal = ml.balance(X_train, y_train)

# 5. Find the best model
board = ml.baseline(bal.X_resampled, bal.y_resampled, X_test=X_test, y_test=y_test)
board.leaderboard.print()

# 6. Tune + evaluate
tuned = ml.tune('lgbm', bal.X_resampled, bal.y_resampled, time_budget=300)
eval_r = ml.evaluate(tuned.best_model, X_test, y_test, optimize_threshold=True)

# 7. Explain
exp = ml.explain(tuned.best_model, X_train, X_test)
exp.feature_importance()

# 8. Deploy
ml.deploy(tuned.best_model, X_sample=X_test.iloc[:10]).serve(port=8000)
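
Step 6 passes `optimize_threshold=True`. A common way to choose a decision threshold is to scan candidate cutoffs over predicted probabilities and keep the one with the best F1 score. The sketch below shows that technique in plain Python. Whether mlpilot uses F1, another metric, or a different search is not stated here, and `best_f1_threshold` is an illustrative name.

```python
def f1_at_threshold(y_true, y_prob, threshold):
    """F1 score when predicting positive for probabilities >= threshold."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    fn = sum(1 for y, p in zip(y_true, y_prob) if p < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_f1_threshold(y_true, y_prob):
    """Try each distinct predicted probability as a cutoff; return (threshold, f1)."""
    candidates = sorted(set(y_prob))
    return max(
        ((t, f1_at_threshold(y_true, y_prob, t)) for t in candidates),
        key=lambda pair: pair[1],
    )


# Toy labels and predicted probabilities, made up for illustration.
y_true = [0, 0, 1, 0, 1, 1]
y_prob = [0.10, 0.30, 0.35, 0.40, 0.80, 0.90]

threshold, f1 = best_f1_threshold(y_true, y_prob)
```

On this toy data the scan selects a cutoff of 0.35 (F1 of 6/7), compared with an F1 of 0.8 at the default 0.5 cutoff. In practice the threshold is best chosen on a validation split, so the final test score stays an unbiased estimate.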

Documentation

Full API reference: mlpilot.readthedocs.io

Contributing

  1. Fork the repo
  2. pip install -e ".[dev]"
  3. pre-commit install
  4. Make your changes + add tests
  5. pytest tests/ --cov=mlpilot
  6. Open a pull request

License

MIT. See LICENSE.
