The complete ML toolkit: EDA, cleaning, training, explainability, deployment

mlpilot 🚀

The Complete Python ML Toolkit: Killing the Boilerplate.

Python 3.9+ | License: MIT

mlpilot is an all-in-one library for machine learning practitioners. It aims to turn what can be 40 hours of repetitive coding into about 5 minutes of work. No dependency hell. No boilerplate. Just results.


📢 A Note from the Author

Hi! I'm Anannya Vyas, a student developer. 🎓

I didn't build mlpilot just to add another library to the pile. I built it because I was tired of writing the same 200 lines of null-handling, scaling, and correlation plotting for every single project. I wanted a library that "thinks" like a data scientist, making smart defaults while giving you full override control.

This project is my graduation milestone. It's a journey from learning how to build a package to implementing local-first AI intelligence. Whether you are a student like me or a professional looking to move faster, I hope mlpilot saves you from the "boilerplate tax."

Special thanks to my teacher Lovnish Verma for the inspiration and guidance.


🚀 The mlpilot Advantage

mlpilot isn't just a library; it's a replacement for half your requirements.txt.

| Feature | mlpilot | Profiling | PyCaret | SHAP | Fairlearn |
|---|---|---|---|---|---|
| Smart EDA Report | ✅ | ✅ | ❌ | ❌ | ❌ |
| Auto-Cleaning | ✅ | ❌ | Partial | ❌ | ❌ |
| Model Leaderboard | ✅ | ❌ | ✅ | ❌ | ❌ |
| Bias & Fairness Audit | ✅ | ❌ | ❌ | ❌ | ✅ |
| AI Data Analyst | ✅ | ❌ | ❌ | ❌ | ❌ |
| API Deployment | ✅ | ❌ | ❌ | ❌ | ❌ |

🛠️ Prerequisites & Installation

AI Features (Recommended)

mlpilot uses local-first AI to keep your data private.

  1. Install Ollama.
  2. Run ollama pull llama3.2 (or llama3).
  3. Keep the Ollama app running to use ml.analyst() and ml.explain_data().
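Before calling the AI features, it can help to confirm that the local Ollama server is actually up and that a llama3 model has been pulled. The sketch below is not part of mlpilot; it assumes Ollama is listening on its default local address (`http://localhost:11434`) and uses its `/api/tags` endpoint, which lists locally available models. The function name `ollama_models` is hypothetical.

```python
import json
import urllib.error
import urllib.request

# Assumption: Ollama's default local address; adjust if yours differs.
OLLAMA_URL = "http://localhost:11434"


def ollama_models(base_url=OLLAMA_URL, timeout=2.0):
    """Return the names of locally pulled models, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a response that is not valid JSON.
        return None
    return [m.get("name", "") for m in payload.get("models", [])]


if __name__ == "__main__":
    models = ollama_models()
    if models is None:
        print("Ollama is not reachable; start the app before calling ml.analyst().")
    elif not any(name.startswith("llama3") for name in models):
        print("Ollama is running, but no llama3 model is pulled yet.")
    else:
        print("Ollama is ready:", models)
```

If the check reports that Ollama is not reachable, start the app and run it again before using ml.analyst() or ml.explain_data().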

Installation

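# Core installation (the PyPI package is named mlplt; the import name is mlpilot)
pip install mlplt

# Full suite (includes AI, NLP, and Deployment tools)
# Quote the extra so shells such as zsh do not expand the brackets
pip install "mlplt[full]"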

⚡ Quick Start: The "Instant Analyst" Workflow

Solve a churn prediction problem from scratch in under 60 seconds of coding.

import mlpilot as ml
import pandas as pd

# 1. Load your data
df = pd.read_csv("customer_data.csv")

# 2. Let the AI explain it in 3 sentences
ml.explain_data(df)

# 3. Auto-generate a premium dashboard
viz = ml.visualize(df, target="churn")
viz.to_html("insights.html")

# 4. Clean and Engineer (leakage-safe)
clean = ml.clean(df, target="churn")
feats = ml.features(clean.df, target="churn")

# 5. Split the data and train a baseline leaderboard
X_train, X_test, y_train, y_test = ml.split(feats, test_size=0.2)
base = ml.baseline(X_train, y_train)

# 6. Audit for Bias & Stability
audit = ml.audit(base.best_model, X_test, y_test)
audit.print_summary()

# 7. Generate your final Project Story
ml.story([df, base, audit])
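Step 4 above describes cleaning and feature engineering as leakage-safe. This README does not show how mlpilot achieves that, so the following is only a general illustration of the idea: preprocessing statistics are learned from the training rows alone and then reused unchanged on the test rows. The helpers `fit_scaler` and `apply_scaler` are hypothetical names, not mlpilot functions.

```python
from statistics import mean, pstdev


def fit_scaler(train_values):
    """Learn the mean and standard deviation from training values only."""
    mu = mean(train_values)
    sigma = pstdev(train_values) or 1.0  # avoid dividing by zero on constant columns
    return mu, sigma


def apply_scaler(values, mu, sigma):
    """Standardize values with statistics learned elsewhere (the training split)."""
    return [(v - mu) / sigma for v in values]


train = [10.0, 20.0, 30.0]
test = [40.0]

mu, sigma = fit_scaler(train)  # the test rows never influence mu or sigma
train_scaled = apply_scaler(train, mu, sigma)
test_scaled = apply_scaler(test, mu, sigma)

print(train_scaled, test_scaled)
```

Fitting the scaler on train and test together would let information about the test rows shape the features the model is trained on, which is the leakage this pattern avoids.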

🧠 The AI Brain: Intelligent ML

🔍 AI Analyst (ml.analyst)

Ask questions about your data in plain English. No more searching for pandas syntax.

For example, ask: "Which city has the highest average churn, and is it correlated with revenue?" mlpilot generates the code, shows it to you for review, and then runs it.

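The generate, review, run flow described for the AI Analyst can be pictured with a small sketch. This is not mlpilot's implementation: `review_and_run` is a hypothetical helper, the "generated" code is a hard-coded stand-in for model output, and the approval rule is a toy policy chosen only to keep the example self-contained.

```python
def review_and_run(generated_code, approve, namespace=None):
    """Show generated code to a reviewer and execute it only if approved.

    `approve` is any callable that receives the code string and returns True/False.
    Returns the namespace after execution, or None if the reviewer declined.
    """
    print("Proposed code:\n" + generated_code)
    if not approve(generated_code):
        return None
    scope = dict(namespace or {})
    exec(generated_code, scope)  # only reached after explicit approval
    return scope


# Stand-in for model output; a real assistant would produce this from the question.
rows = [
    {"city": "Pune", "churn": 1},
    {"city": "Pune", "churn": 0},
    {"city": "Delhi", "churn": 1},
]
code = (
    "totals = {}\n"
    "for r in rows:\n"
    "    totals.setdefault(r['city'], []).append(r['churn'])\n"
    "answer = max(totals, key=lambda c: sum(totals[c]) / len(totals[c]))\n"
)

# Toy approval policy: reject anything that tries to import a module.
result = review_and_run(code, approve=lambda c: "import" not in c, namespace={"rows": rows})
print(result["answer"])
```

In an interactive session the `approve` callable would typically prompt the user instead of applying a fixed rule.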
⚖️ MLAudit (ml.audit)

Go beyond accuracy. Audit your models for:

  • Technical Stability: How does the model react to noise?
  • Social Bias: Check for Demographic Parity and Equalized Odds to ensure your model is fair to all groups.
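To make the two audit dimensions above concrete, here is a minimal pure-Python sketch of the underlying measurements, using the usual definitions: demographic parity difference is the largest gap in positive-prediction rate between groups, and equalized odds difference is the larger of the true-positive-rate gap and the false-positive-rate gap. The noise check is one simple way to probe stability. All function names are hypothetical, and this README does not say that ml.audit computes its numbers this way.

```python
import random


def selection_rate(preds):
    """Share of positive (1) predictions."""
    return sum(preds) / len(preds) if preds else 0.0


def demographic_parity_difference(y_pred, groups):
    """Largest gap in positive-prediction rate between any two groups."""
    rates = [
        selection_rate([p for p, g in zip(y_pred, groups) if g == name])
        for name in set(groups)
    ]
    return max(rates) - min(rates)


def equalized_odds_difference(y_true, y_pred, groups):
    """Largest gap across groups in true-positive or false-positive rate."""
    gaps = []
    for label in (1, 0):  # label 1 gives TPR, label 0 gives FPR
        rates = []
        for name in set(groups):
            preds = [p for t, p, g in zip(y_true, y_pred, groups) if g == name and t == label]
            rates.append(selection_rate(preds))
        gaps.append(max(rates) - min(rates))
    return max(gaps)


def noise_flip_rate(predict, X, sigma=0.1, seed=0):
    """Share of rows whose prediction changes after adding Gaussian noise."""
    rng = random.Random(seed)
    flips = 0
    for row in X:
        noisy = [v + rng.gauss(0.0, sigma) for v in row]
        flips += predict(row) != predict(noisy)
    return flips / len(X)


y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

dp_gap = demographic_parity_difference(y_pred, groups)
eo_gap = equalized_odds_difference(y_true, y_pred, groups)
print(dp_gap, eo_gap)
```

A gap of 0 means the groups are treated identically on that metric; larger values indicate a bigger disparity worth investigating.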

๐Ÿ“ DataStory (ml.story)

Automatically synthesizes your EDA, Cleaning, Training, and Auditing results into a narrative report. Perfect for presentations!


📖 Complete API Reference

| Module | Function | Description |
|---|---|---|
| EDA | ml.analyze(df) | 12-section comprehensive report. |
| Visualizer | ml.visualize(df) | Intelligent auto-plotting dashboard. |
| Cleaner | ml.clean(df) | Auto-fix nulls, outliers, and dtypes. |
| Validator | ml.validate(df) | Check for leakage, drift, and schema errors. |
| FeatureForge | ml.features(df) | One-line encoding and scaling pipeline. |
| Baseline | ml.baseline(X,y) | 15+ model comparison leaderboard. |
| Explainer | ml.explain(model,X) | Global and local SHAP explanations. |
| LaunchPad | ml.deploy(model) | FastAPI + Docker in 5 minutes. |
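For readers new to the idea of a model comparison leaderboard, the mechanics are simple: score every candidate on the same held-out data and sort by the metric. The sketch below shows only that mechanism with toy stand-in "models"; the names `accuracy` and `leaderboard` are hypothetical, and nothing here reflects how ml.baseline is implemented or which models it compares.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def leaderboard(models, X_test, y_test):
    """Score each model on held-out data and sort from best to worst."""
    scores = [
        (name, accuracy(y_test, [predict(row) for row in X_test]))
        for name, predict in models.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy held-out data: one feature (monthly support tickets), label 1 = churned.
X_test = [[0], [1], [4], [5], [6], [2]]
y_test = [0, 0, 1, 1, 1, 0]

# Hypothetical stand-ins for fitted models: each maps a row to a 0/1 prediction.
models = {
    "always_stay": lambda row: 0,
    "always_churn": lambda row: 1,
    "tickets_over_3": lambda row: int(row[0] > 3),
}

board = leaderboard(models, X_test, y_test)
for name, score in board:
    print(f"{name:<16} {score:.2f}")
```

Including trivial baselines such as "always predict the majority class" gives the real models a floor to beat.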

๐Ÿ“ Project Structure

mlpilot/
├── ai/          # Analyst, Audit, Story (The Brain)
├── clean/       # Cleaner, Strategies, Diff
├── eda/         # Analyzer, Visualizer, Plots
├── train/       # Baseline, HyperX, Evaluate
├── validate/    # Schema, Drift, Checks
└── deploy/      # LaunchPad, FastAPI

๐Ÿค Contributing

Contributions are welcome! This is a student project, and I'd love to see how you improve it.

  1. Fork the repo.
  2. Install dev dependencies: pip install -e ".[dev]"
  3. Run tests: pytest tests/

📄 License

Licensed under the MIT License. Use it, fork it, make it yours.


โญ Show Your Support

If mlpilot helped you crush your project boilerplate:

  • โญ Star the repo on GitHub.
  • ๐Ÿ“ข Share it with your classmates.
  • ๐Ÿ› Report bugs to help me learn and improve.

Built with โค๏ธ for the ML community.
