
The complete ML toolkit: EDA, cleaning, training, explainability, deployment


mlpilot 🚀

The Complete Python ML Toolkit: Killing the Boilerplate.

PyPI version Python 3.9+ License: MIT Tests

mlpilot is the definitive "all-in-one" library for machine learning practitioners. What currently takes 40 hours of repetitive coding, mlpilot delivers in 5 minutes. No dependency hell. No boilerplate. Just results.


📢 A Note from the Author

Hi! I'm Anannya Vyas, a student developer. 🎓

I didn't build mlpilot just to add another library to the pile. I built it because I was tired of writing the same 200 lines of null-handling, scaling, and correlation plotting for every single project. I wanted a library that "thinks" like a data scientist, making smart defaults while giving you full override control.

This project is my graduation milestone. It's a journey from learning how to build a package to implementing local-first AI. Whether you are a student like me or a professional looking to move faster, I hope mlpilot saves you from the "boilerplate tax."

Special thanks to my teacher Lovnish Verma for the inspiration and guidance.


🚀 The mlpilot Advantage

mlpilot isn't just a library; it's a replacement for half your requirements.txt.

| Feature | mlpilot | Profiling | PyCaret | SHAP | Fairlearn |
| --- | --- | --- | --- | --- | --- |
| Smart EDA Report | ✅ | ✅ | ❌ | ❌ | ❌ |
| Auto-Cleaning | ✅ | ❌ | Partial | ❌ | ❌ |
| Model Leaderboard | ✅ | ❌ | ✅ | ❌ | ❌ |
| Bias & Fairness Audit | ✅ | ❌ | ❌ | ❌ | ✅ |
| AI Data Analyst | ✅ | ❌ | ❌ | ❌ | ❌ |
| API Deployment | ✅ | ❌ | ❌ | ❌ | ❌ |

🛠️ Prerequisites & Installation

AI Features (Recommended)

mlpilot uses local-first AI to keep your data private.

  1. Install Ollama.
  2. Run ollama pull llama3.2 (or llama3).
  3. Keep the Ollama app running to use ml.analyst() and ml.explain_data().
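Before calling the AI features, it can help to confirm that the Ollama server is reachable and that a model has been pulled. The following is a minimal sketch and not part of mlpilot: it assumes Ollama's default local address (`http://localhost:11434`) and its `/api/tags` endpoint, which lists locally available models. The helper names `ollama_models` and `has_model` are hypothetical.

```python
import json
import urllib.error
import urllib.request


def ollama_models(base_url="http://localhost:11434", timeout=2.0):
    """Return the names of locally pulled models, or None if Ollama is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Server not running, connection timed out, or response was not JSON.
        return None
    if not isinstance(payload, dict):
        return None
    return [m.get("name", "") for m in payload.get("models", [])]


def has_model(names, wanted):
    """True if `wanted` matches a model name, ignoring a tag such as ':latest'."""
    return any(n == wanted or n.split(":", 1)[0] == wanted for n in names)


if __name__ == "__main__":
    names = ollama_models()
    if names is None:
        print("Ollama is not reachable; start the app and try again.")
    elif not has_model(names, "llama3.2"):
        print("Model missing; run: ollama pull llama3.2")
    else:
        print("Ollama is ready.")
```

Running the script prints one of three messages, so a failed `ml.analyst()` call can be traced to either a stopped server or a missing model.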

Installation

# Core installation
pip install mlplt

# Full suite (includes AI, NLP, and Deployment tools)
pip install "mlplt[full]"

⚡ Quick Start: The "Instant Analyst" Workflow

Solve a churn prediction problem from scratch in under 60 seconds of coding.

import mlpilot as ml
import pandas as pd

# 1. Load your data
df = pd.read_csv("customer_data.csv")

# 2. Let the AI explain it in 3 sentences
ml.explain_data(df)

# 3. Auto-generate a premium dashboard
viz = ml.visualize(df, target="churn")
viz.to_html("insights.html")

# 4. Clean and Engineer (leakage-safe)
clean = ml.clean(df, target="churn")
feats = ml.features(clean.df, target="churn")

# 5. Training & Auditing
X_train, X_test, y_train, y_test = ml.split(feats, test_size=0.2)
base = ml.baseline(X_train, y_train)

# 6. Audit for Bias & Stability
audit = ml.audit(base.best_model, X_test, y_test)
audit.print_summary()

# 7. Generate your final Project Story
ml.story([df, base, audit])
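Step 4 describes cleaning and feature engineering as leakage-safe. How mlpilot implements this is not documented here. As a general illustration of the idea, the sketch below learns imputation and scaling statistics from the training rows only and then applies them unchanged to the test rows. All names are hypothetical, and only the standard library is used.

```python
from statistics import mean, median, pstdev


def fit_numeric(train_values):
    """Learn imputation and scaling statistics from training values only."""
    observed = [v for v in train_values if v is not None]
    spread = pstdev(observed) or 1.0  # avoid division by zero for constant columns
    return {"fill": median(observed), "mean": mean(observed), "std": spread}


def transform_numeric(values, params):
    """Impute missing values, then standardize, using previously fitted statistics."""
    filled = [params["fill"] if v is None else v for v in values]
    return [(v - params["mean"]) / params["std"] for v in filled]


train = [10.0, 12.0, None, 14.0]
test = [None, 100.0]

params = fit_numeric(train)                     # statistics come from train only
train_scaled = transform_numeric(train, params)
test_scaled = transform_numeric(test, params)   # test rows never influence params
```

The extreme test value (100.0) is scaled with the training mean and standard deviation, so nothing about the test set flows back into the fitted parameters.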

🧠 The AI Brain: Intelligent ML

🔍 AI Analyst (ml.analyst)

Ask questions about your data in plain English. No more searching for pandas syntax.

"Which city has the highest average churn, and is it correlated with revenue?"

mlpilot generates the code, shows it to you for review, and then runs it.
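The review-before-run workflow described above can be sketched as follows. This is an illustration under assumptions, not mlpilot's implementation: the function name `review_and_run`, the `approve` callback, and the convention that generated code stores its answer in a variable called `result` are all hypothetical. The generated code is a hard-coded string standing in for model output.

```python
def review_and_run(code, namespace, approve):
    """Show generated code and run it only if `approve(code)` returns True.

    Returns the value the code assigned to `result`, or None if it was rejected.
    """
    print("Generated code:\n" + code)
    if not approve(code):
        return None
    scope = dict(namespace)  # copy so the caller's namespace is not mutated
    exec(code, scope)        # only run code you have read and trust
    return scope.get("result")


# Stand-in for code a language model might produce for
# "Which city has the highest average churn?"
generated = (
    "totals = {}\n"
    "for city, churned in rows:\n"
    "    totals.setdefault(city, []).append(churned)\n"
    "result = max(totals, key=lambda c: sum(totals[c]) / len(totals[c]))\n"
)

rows = [("Pune", 1), ("Pune", 0), ("Delhi", 1), ("Delhi", 1)]
answer = review_and_run(generated, {"rows": rows}, approve=lambda code: True)
```

In an interactive session, `approve` could prompt the user with `input()` instead of always returning `True`.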

⚖️ MLAudit (ml.audit)

Go beyond accuracy. Audit your models for:

  • Technical Stability: How does the model react to noise?
  • Social Bias: Check for Demographic Parity and Equalized Odds to ensure your model is fair to all groups.
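The two fairness criteria named above have simple definitions that can be computed by hand. The sketch below is not mlpilot's implementation. It uses only the standard library, assumes binary labels and predictions (0 or 1), and all function names are hypothetical. Demographic parity compares the rate of positive predictions across groups; equalized odds compares true positive and false positive rates across groups.

```python
from collections import defaultdict


def _rate(numerator, denominator):
    return numerator / denominator if denominator else 0.0


def group_rates(y_true, y_pred, groups):
    """Per-group selection rate, true positive rate, and false positive rate."""
    counts = defaultdict(lambda: {"n": 0, "sel": 0, "pos": 0, "tp": 0, "neg": 0, "fp": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        c = counts[g]
        c["n"] += 1
        c["sel"] += p
        if t == 1:
            c["pos"] += 1
            c["tp"] += p
        else:
            c["neg"] += 1
            c["fp"] += p
    return {
        g: {
            "selection_rate": _rate(c["sel"], c["n"]),
            "tpr": _rate(c["tp"], c["pos"]),
            "fpr": _rate(c["fp"], c["neg"]),
        }
        for g, c in counts.items()
    }


def _gap(rates, key):
    values = [r[key] for r in rates.values()]
    return max(values) - min(values)


def demographic_parity_difference(y_true, y_pred, groups):
    """Largest gap in positive-prediction rate between any two groups."""
    return _gap(group_rates(y_true, y_pred, groups), "selection_rate")


def equalized_odds_difference(y_true, y_pred, groups):
    """Larger of the TPR gap and the FPR gap between groups."""
    rates = group_rates(y_true, y_pred, groups)
    return max(_gap(rates, "tpr"), _gap(rates, "fpr"))


y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

dp_gap = demographic_parity_difference(y_true, y_pred, groups)  # 0.75 - 0.25 = 0.5
eo_gap = equalized_odds_difference(y_true, y_pred, groups)      # max(0.5, 0.5) = 0.5
```

A value of 0 means the groups are treated identically on that criterion; larger values indicate a larger disparity.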

📝 DataStory (ml.story)

Automatically synthesizes your EDA, Cleaning, Training, and Auditing results into a narrative report. Perfect for presentations!


📖 Complete API Reference

| Module | Function | Description |
| --- | --- | --- |
| EDA | ml.analyze(df) | 12-section comprehensive report. |
| Visualizer | ml.visualize(df) | Intelligent auto-plotting dashboard. |
| Cleaner | ml.clean(df) | Auto-fix nulls, outliers, and dtypes. |
| Validator | ml.validate(df) | Check for leakage, drift, and schema errors. |
| FeatureForge | ml.features(df) | One-line encoding and scaling pipeline. |
| Baseline | ml.baseline(X,y) | 15+ model comparison leaderboard. |
| Explainer | ml.explain(model,X) | Global and local SHAP explanations. |
| LaunchPad | ml.deploy(model) | FastAPI + Docker in 5 minutes. |

📁 Project Structure

mlpilot/
├── ai/          # Analyst, Audit, Story (The Brain)
├── clean/       # Cleaner, Strategies, Diff
├── eda/         # Analyzer, Visualizer, Plots
├── train/       # Baseline, HyperX, Evaluate
├── validate/    # Schema, Drift, Checks
└── deploy/      # LaunchPad, FastAPI

🤝 Contributing

Contributions are welcome! This is a student project, and I'd love to see how you improve it.

  1. Fork the repo.
  2. Install dev dependencies: pip install -e ".[dev]"
  3. Run tests: pytest tests/

📄 License

Licensed under the MIT License. Use it, fork it, make it yours.


⭐ Show Your Support

If mlpilot helped you crush your project boilerplate:

  • ⭐ Star the repo on GitHub.
  • 📢 Share it with your classmates.
  • 🐛 Report bugs to help me learn and improve.

Built with ❤️ for the ML community.
