
The complete ML toolkit: EDA, cleaning, training, explainability, deployment


mlpilot 🚀

The Complete Python ML Toolkit: Killing the Boilerplate.

Requires Python 3.9+. License: MIT.

mlpilot is the definitive "all-in-one" library for machine learning practitioners. What currently takes 40 hours of repetitive coding, mlpilot delivers in 5 minutes. No dependency hell. No boilerplate. Just results.


📢 A Note from the Author

Hi! I'm Anannya Vyas, a student developer. 🎓

I didn't build mlpilot just to add another library to the pile. I built it because I was tired of writing the same 200 lines of null-handling, scaling, and correlation plotting for every single project. I wanted a library that "thinks" like a data scientist, making smart defaults while giving you full override control.

This project is my graduation milestone. It's a journey from learning how to build a package to implementing local-first AI intelligence. Whether you are a student like me or a professional looking to move faster, I hope mlpilot saves you from the "boilerplate tax."

Special thanks to my teacher Lovnish Verma for the inspiration and guidance.


🚀 The mlpilot Advantage

mlpilot isn't just a library; it's a replacement for half your requirements.txt.

| Feature               | mlpilot | Profiling | PyCaret | SHAP | Fairlearn |
|-----------------------|---------|-----------|---------|------|-----------|
| Smart EDA Report      | ✅      | ✅        | ❌      | ❌   | ❌        |
| Auto-Cleaning         | ✅      | ❌        | Partial | ❌   | ❌        |
| Model Leaderboard     | ✅      | ❌        | ✅      | ❌   | ❌        |
| Bias & Fairness Audit | ✅      | ❌        | ❌      | ❌   | ✅        |
| AI Data Analyst       | ✅      | ❌        | ❌      | ❌   | ❌        |
| API Deployment        | ✅      | ❌        | ❌      | ❌   | ❌        |

🛠️ Prerequisites & Installation

AI Features (Recommended)

mlpilot uses local-first AI to keep your data private.

  1. Install Ollama.
  2. Run ollama pull llama3.2 (or llama3).
  3. Keep the Ollama app running to use ml.analyst() and ml.explain_data().
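
Before calling the AI features, it can help to confirm that the local Ollama server is actually reachable. The sketch below is not part of mlpilot; it assumes Ollama's default address (http://localhost:11434) and uses its `/api/tags` endpoint, which lists the models you have pulled. The helper names `parse_model_names` and `list_ollama_models` are made up for this example.

```python
import json
import urllib.error
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434"


def parse_model_names(payload: dict) -> list[str]:
    """Extract model names from the JSON returned by Ollama's /api/tags."""
    return [m["name"] for m in payload.get("models", []) if "name" in m]


def list_ollama_models(base_url: str = OLLAMA_URL, timeout: float = 2.0):
    """Return the pulled model names, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return parse_model_names(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON response.
        return None


if __name__ == "__main__":
    models = list_ollama_models()
    if models is None:
        print("Ollama is not reachable; start the app and try again.")
    elif not any(name.startswith("llama3") for name in models):
        print("Ollama is running, but no llama3 model has been pulled yet.")
    else:
        print("Ready:", ", ".join(models))
```

If the script reports that Ollama is not reachable, start the Ollama app and run it again before using ml.analyst() or ml.explain_data().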

Installation

# Core installation
pip install mlplt

# Full suite (includes AI, NLP, and Deployment tools)
# (quoted so shells such as zsh do not treat the brackets as a glob)
pip install "mlplt[full]"

⚡ Quick Start: The "Instant Analyst" Workflow

Solve a churn prediction problem from scratch in under 60 seconds of coding.

import mlpilot as ml
import pandas as pd

# 1. Load your data
df = pd.read_csv("customer_data.csv")

# 2. Let the AI explain it in 3 sentences
ml.explain_data(df)

# 3. Auto-generate a premium dashboard
viz = ml.visualize(df, target="churn")
viz.to_html("insights.html")

# 4. Clean and Engineer (leakage-safe)
clean = ml.clean(df, target="churn")
feats = ml.features(clean.df, target="churn")

# 5. Split and train a baseline leaderboard
X_train, X_test, y_train, y_test = ml.split(feats, test_size=0.2)
base = ml.baseline(X_train, y_train)

# 6. Audit for Bias & Stability
audit = ml.audit(base.best_model, X_test, y_test)
audit.print_summary()

# 7. Generate your final Project Story
ml.story([df, base, audit])
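
Step 4 is labelled "leakage-safe". This page does not describe how mlpilot achieves that internally, so the following is only a general illustration of the idea in plain Python: any statistic used for preprocessing (here a mean and standard deviation) is learned from the training rows and then reused, unchanged, on the test rows. The functions `fit_scaler` and `apply_scaler` and the toy data are hypothetical.

```python
import statistics


def fit_scaler(train_values):
    """Learn the mean and standard deviation from training data only."""
    mean = statistics.fmean(train_values)
    std = statistics.pstdev(train_values) or 1.0  # avoid dividing by zero
    return mean, std


def apply_scaler(values, mean, std):
    """Standardize values with statistics that were learned elsewhere."""
    return [(v - mean) / std for v in values]


# Hypothetical toy feature, already split into train and test rows.
train_revenue = [0.0, 2.0]
test_revenue = [3.0]

mean, std = fit_scaler(train_revenue)  # sees training rows only
train_scaled = apply_scaler(train_revenue, mean, std)
test_scaled = apply_scaler(test_revenue, mean, std)  # reuses train statistics

print(train_scaled)  # [-1.0, 1.0]
print(test_scaled)   # [2.0]
```

Fitting the scaler on all rows before splitting would let information about the test set influence training, which is the kind of leakage this ordering avoids.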

🧠 The AI Brain: Intelligent ML

🔍 AI Analyst (ml.analyst)

Ask questions about your data in plain English. No more searching for pandas syntax.

Example question: "Which city has the highest average churn, and is it correlated with revenue?"

mlpilot generates the code, shows it to you for review, and runs it instantly.
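
The generate, review, run loop described above can be pictured with a small sketch. This is not mlpilot's implementation: `run_after_review`, the `approve` callback, and the hard-coded "generated" snippet are all hypothetical stand-ins, and the snippet uses plain Python lists instead of a DataFrame to stay self-contained.

```python
def run_after_review(generated_code, approve, context=None):
    """Show generated code to a reviewer and execute it only if approved.

    `approve` is any callable that receives the code string and returns
    True or False, for example a function that prompts the user.
    Returns the resulting namespace, or None if the code was rejected.
    """
    if not approve(generated_code):
        return None
    namespace = dict(context or {})
    exec(generated_code, namespace)  # only run code a human has read
    return namespace


# Hypothetical stand-in for code an LLM might produce for the question
# "Which city has the highest average churn?"
generated = (
    "totals = {}\n"
    "for city, churned in rows:\n"
    "    totals.setdefault(city, []).append(churned)\n"
    "best_city = max(totals, key=lambda c: sum(totals[c]) / len(totals[c]))\n"
)
rows = [("Pune", 1), ("Pune", 0), ("Delhi", 1), ("Delhi", 1)]

result = run_after_review(generated, approve=lambda code: True, context={"rows": rows})
print(result["best_city"])  # Delhi
```

Keeping the approval step as a separate callable means generated code never runs unless the reviewer returns True.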

⚖️ MLAudit (ml.audit)

Go beyond accuracy. Audit your models for:

  • Technical Stability: How does the model react to noise?
  • Social Bias: Check for Demographic Parity and Equalized Odds to ensure your model is fair to all groups.
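
To make the two fairness terms concrete, here is a minimal pure-Python sketch of how they are commonly defined. It is an illustration, not mlpilot's audit code; the function names mirror the usual metric names, and the labels, predictions, and group column are invented toy data.

```python
from collections import defaultdict


def _rate(flags):
    """Share of True values; 0.0 for an empty list."""
    return sum(flags) / len(flags) if flags else 0.0


def _spread(rates):
    """Gap between the highest and lowest rate across groups."""
    return max(rates) - min(rates) if rates else 0.0


def demographic_parity_difference(y_pred, groups):
    """Largest gap in positive-prediction rate between any two groups."""
    by_group = defaultdict(list)
    for pred, group in zip(y_pred, groups):
        by_group[group].append(pred == 1)
    return _spread([_rate(flags) for flags in by_group.values()])


def equalized_odds_difference(y_true, y_pred, groups):
    """Larger of the between-group gaps in true- and false-positive rate."""
    tpr, fpr = defaultdict(list), defaultdict(list)
    for truth, pred, group in zip(y_true, y_pred, groups):
        # Actual positives feed the TPR; actual negatives feed the FPR.
        (tpr if truth == 1 else fpr)[group].append(pred == 1)
    return max(
        _spread([_rate(v) for v in tpr.values()]),
        _spread([_rate(v) for v in fpr.values()]),
    )


# Invented toy data: four customers in group A, four in group B.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

print(demographic_parity_difference(y_pred, groups))      # 0.5
print(equalized_odds_difference(y_true, y_pred, groups))  # 0.5
```

A value of 0.0 means the groups are treated identically by that measure; here group A is flagged as churn far more often than group B (75% versus 25%), so both gaps are 0.5.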

📝 DataStory (ml.story)

Automatically synthesizes your EDA, Cleaning, Training, and Auditing results into a narrative report. Perfect for presentations!


📖 Complete API Reference

| Module       | Function             | Description                                  |
|--------------|----------------------|----------------------------------------------|
| EDA          | ml.analyze(df)       | 12-section comprehensive report.             |
| Visualizer   | ml.visualize(df)     | Intelligent auto-plotting dashboard.         |
| Cleaner      | ml.clean(df)         | Auto-fix nulls, outliers, and dtypes.        |
| Validator    | ml.validate(df)      | Check for leakage, drift, and schema errors. |
| FeatureForge | ml.features(df)      | One-line encoding and scaling pipeline.      |
| Baseline     | ml.baseline(X, y)    | 15+ model comparison leaderboard.            |
| Explainer    | ml.explain(model, X) | Global and local SHAP explanations.          |
| LaunchPad    | ml.deploy(model)     | FastAPI + Docker in 5 minutes.               |

📁 Project Structure

mlpilot/
├── ai/          # Analyst, Audit, Story (The Brain)
├── clean/       # Cleaner, Strategies, Diff
├── eda/         # Analyzer, Visualizer, Plots
├── train/       # Baseline, HyperX, Evaluate
├── validate/    # Schema, Drift, Checks
└── deploy/      # LaunchPad, FastAPI

🤝 Contributing

Contributions are welcome! This is a student project, and I'd love to see how you improve it.

  1. Fork the repo.
  2. Install dev dependencies: pip install -e ".[dev]"
  3. Run tests: pytest tests/

📄 License

Licensed under the MIT License. Use it, fork it, make it yours.


⭐ Show Your Support

If mlpilot helped you crush your project boilerplate:

  • ⭐ Star the repo on GitHub.
  • 📢 Share it with your classmates.
  • 🐛 Report bugs to help me learn and improve.

Built with ❤️ for the ML community.

